
Association Rule Mining is an unsupervised machine learning method. It has two data requirements: the data must be in transaction format (no column names or row names), and labels should be removed (an exception can be made if the goal is to find associations with the labels). The goal of Association Rule Mining is to find associations between items, expressed as rules. Association is measured with support, confidence, and lift. A rule has a left hand side and a right hand side; essentially, the rule asks: what is the association between the left hand side and the right hand side? Both sides of a rule are sets, which means frequency within a single 'transaction' does not matter. Association Rule Mining only cares whether a value appears in a 'transaction' at least once, not how many times it appears. The order of values on the left hand side or right hand side of a rule also does not matter. Either side of a rule can contain anywhere from zero values up to every unique value in the dataset. Some applications of Association Rule Mining are analysis of shopping transactions to determine product placement, image identification, text analytics, and various biological questions (binding sites, amino acids in proteins, etc.).
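To make the transaction format concrete, here is a minimal Python sketch (the data and variable names are made up for illustration) showing that each transaction behaves as a set, so duplicate items and item order are ignored:

```python
# Hypothetical rows before conversion to transaction format.
raw_rows = [
    ["Bread", "Milk", "Milk"],  # the duplicate "Milk" collapses to one item
    ["Milk", "Bread"],          # order does not matter
]

# In transaction format, each row is just the set of items it contains.
transactions = [set(row) for row in raw_rows]
print(transactions[0] == transactions[1])  # True: both are {"Bread", "Milk"}
```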
Association Measures
Let A and B be sets of zero or more values, and assume the rule {A} -> {B}.
Support
Support measures how often the items in A and the items in B occur together across all transactions. Support is always low for rare values, so if a value is rare, support should not be used to measure the strength of association; confidence should be used instead. This is because confidence is computed relative to the number of transactions the rare value appears in, rather than the total number of transactions.
Sup(A, B) = P(A, B) = (Count of A and B together) / (total # of transactions)
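As a minimal sketch (assuming transactions are stored as Python sets, with a helper name of our choosing), support can be computed directly from this definition:

```python
def support(transactions, A, B):
    """Fraction of all transactions containing every item in A and in B."""
    items = set(A) | set(B)
    hits = sum(1 for t in transactions if items <= t)  # "<=" is the subset test
    return hits / len(transactions)
```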
Confidence
Confidence measures how often the items in A and the items in B occur together, relative to the number of transactions that contain A. Confidence is always greater than or equal to support: the numerator in the equation below is the same as the numerator for support, but the denominator (the count of transactions containing A) can never exceed the total number of transactions, so it is always an equal or smaller value. If B appears in every transaction that contains A, then confidence equals 1.
Conf(A, B) = P(B|A) = P(A,B) / P(A) = (Count of A and B together) / (Count of A)
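A matching sketch for confidence, under the same assumptions as the `support` helper above; note that only the denominator changes:

```python
def confidence(transactions, A, B):
    """Count of A and B together, relative to the count of A alone."""
    count_a = sum(1 for t in transactions if set(A) <= t)
    count_ab = sum(1 for t in transactions if (set(A) | set(B)) <= t)
    return count_ab / count_a
```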
Lift
Lift measures how much more often the items in A and B occur together than would be expected if A and B were independent. If a value appears in every transaction, then lift for any rule involving it equals 1, since its probability is 1 and it carries no information about the rest of the transaction.
- If Lift(A, B) = 1, then A and B are independent
- If Lift(A, B) < 1, then A and B are negatively correlated
- If Lift(A, B) > 1, then A and B are positively correlated
Typically, only rules with Lift > 1 are kept, because lift above 1 indicates a positive association.
Lift(A, B) = P(A, B) / (P(A) * P(B)) = Sup(A, B) / (P(A) * P(B)) = Sup(A, B) / ((Count of A / total # of transactions) * (Count of B / total # of transactions))
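And a sketch for lift under the same assumptions, comparing the observed co-occurrence of A and B against what independence would predict:

```python
def lift(transactions, A, B):
    """Support of A and B together, divided by P(A) * P(B)."""
    n = len(transactions)
    p_a = sum(1 for t in transactions if set(A) <= t) / n
    p_b = sum(1 for t in transactions if set(B) <= t) / n
    sup_ab = sum(1 for t in transactions if (set(A) | set(B)) <= t) / n
    return sup_ab / (p_a * p_b)
```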
Apriori Algorithm

The apriori algorithm makes Association Rule Mining computationally feasible for large datasets. The algorithm takes in a minimum support threshold. Starting with a base set, it calculates the support value. If the support is above the threshold, the algorithm moves on to a superset (the base set plus one more value) and calculates its support. If that support is also above the threshold, it moves on to a larger superset (the previous superset plus one more value). The algorithm iterates until it reaches a set that does not meet the minimum support threshold, and then moves on to the next base set. The apriori algorithm is helpful because it prunes the number of sets that need to be examined: adding a value to a set can only lower the probability of the set occurring in a transaction, so if a set does not meet the minimum support threshold, neither will any of its supersets. Those supersets never have to be looked at, reducing the overall number of calculations needed for Association Rule Mining.
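The sketch below illustrates the pruning idea with a simple level-wise implementation (a simplified apriori, not a production one): candidate sets of size k are only built from sets of size k-1 that met the support threshold, so infrequent sets are never extended.

```python
def apriori_frequent_itemsets(transactions, min_support):
    """Return all itemsets whose support meets min_support."""
    n = len(transactions)

    def sup(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    items = {item for t in transactions for item in t}
    # Level 1: frequent single-item sets.
    frequent = [frozenset([i]) for i in items
                if sup(frozenset([i])) >= min_support]
    all_frequent = list(frequent)
    k = 2
    while frequent:
        # Candidates of size k come only from frequent sets of size k-1;
        # supersets of infrequent sets are pruned without ever being scored.
        candidates = {a | b for a in frequent for b in frequent
                      if len(a | b) == k}
        frequent = [c for c in candidates if sup(c) >= min_support]
        all_frequent.extend(frequent)
        k += 1
    return all_frequent
```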
Example
The following example shows some rules drawn from a small dataset of grocery store purchases and calculates support, confidence, and lift for one of them.
1. Bread, Soda, Milk
2. Bread, Cheese
3. Cheese, Soda, Napkins, Milk
4. Cheese, Bread, Napkins, Milk
5. Soda, Napkins, Milk
Rules
The following are some example rules from the above dataset.
- {Napkins} -> {Cheese}
- {Milk, Bread} -> {Soda}
- {Milk, Bread} -> {Soda, Napkins}
- {Napkins} -> {Cheese, Bread}
Looking at the first rule, it asks: what is the probability that a shopper who already has 'Napkins' also adds 'Cheese'?
Measures
Sup({Napkins}, {Cheese}) = 2/5 = 0.40 (Napkins and Cheese appear together in 2 of the 5 transactions)
Conf({Napkins}, {Cheese}) = 2/3 = 0.667 (Napkins appears in 3 transactions, 2 of which also contain Cheese)
Lift({Napkins}, {Cheese}) = (2/5) / ((3/5) * (3/5)) = 10/9 = 1.11
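Plugging the toy dataset into the hypothetical helper functions sketched earlier reproduces these numbers:

```python
transactions = [
    {"Bread", "Soda", "Milk"},
    {"Bread", "Cheese"},
    {"Cheese", "Soda", "Napkins", "Milk"},
    {"Cheese", "Bread", "Napkins", "Milk"},
    {"Soda", "Napkins", "Milk"},
]
print(support(transactions, {"Napkins"}, {"Cheese"}))     # 0.4
print(confidence(transactions, {"Napkins"}, {"Cheese"}))  # 0.666...
print(lift(transactions, {"Napkins"}, {"Cheese"}))        # 1.111...
```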
ARM in this Project
This project aims to determine whether the total result can be predicted with at least 52.4% accuracy based on pre-game information and statistics. So, ARM will find out whether there are associations between the pre-game information and statistics and the over or under occurring. If there are associations, ARM will also reveal how strong those associations are and with what confidence they hold.