During search, Magnum Opus can automatically filter out rules and itemsets that are likely to be of little interest. The filter mode controls which of these are filtered out. The four options are Filter-out None, Filter-out Trivial, Filter-out Unproductive, and Filter-out Insignificant.
A rule is trivial if there is another rule with the same Right-Hand-Side (RHS) and a subset of the Left-Hand-Side (LHS) that covers exactly the same cases from the data set. For example, the first of the two rules below is trivial because it has the same coverage as the second. Adding Tomatoes to the LHS of the second rule does not change the set of cases it covers.
Lettuce & Tomatoes -> Cucumber [Coverage=0.250 (250); Support=0.239 (239); Strength=0.956; Lift=2.91; Leverage=0.1568 (156)]
Lettuce -> Cucumber [Coverage=0.250 (250); Support=0.239 (239); Strength=0.956; Lift=2.91; Leverage=0.1568 (156)]
If a rule is trivial then it will have the same support, strength, lift, and leverage as the rule with respect to which it is trivial.
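The following Python sketch illustrates why a trivial rule repeats the statistics of its more general rule: every measure is computed from the same counts. The formulas are the usual definitions of these measures and are assumed here rather than quoted from this page. The total of 1000 cases and the count of 329 cases containing Cucumber are likewise assumptions, inferred from the figures shown above. The names `RuleStats` and `rule_stats` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RuleStats:
    coverage: float   # proportion of cases matching the LHS
    support: float    # proportion of cases matching both LHS and RHS
    strength: float   # support / coverage
    lift: float       # strength / proportion of cases matching the RHS
    leverage: float   # support - coverage * proportion matching the RHS


def rule_stats(n_cases: int, lhs_count: int, both_count: int, rhs_count: int) -> RuleStats:
    """Compute rule measures from raw counts (assumed standard definitions)."""
    coverage = lhs_count / n_cases
    support = both_count / n_cases
    rhs_freq = rhs_count / n_cases
    strength = both_count / lhs_count
    lift = strength / rhs_freq
    leverage = support - coverage * rhs_freq
    return RuleStats(coverage, support, strength, lift, leverage)


# Assumed counts: 1000 cases in total, 329 of them containing Cucumber.
# Both rules cover the same 250 cases, 239 of which contain Cucumber.
lettuce_tomatoes = rule_stats(1000, 250, 239, 329)
lettuce = rule_stats(1000, 250, 239, 329)

print(lettuce_tomatoes)
print(lettuce_tomatoes == lettuce)  # identical counts give identical measures
```

Because the two rules cover exactly the same cases, their counts are the same, so no measure derived from those counts can distinguish them.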
A rule is unproductive if there is another rule with the same Right-Hand-Side and a subset of the Left-Hand-Side that has equal or higher strength. For example, the first of the rules below is unproductive because it has lower strength than the second. Adding Promotion1=f to the LHS of the second rule lowers its strength from 0.907 to 0.905.
Profitability99 < 419 & Promotion1=f -> Spend99 < 2030 [Coverage=0.274 (274); Support=0.248 (248); Strength=0.905; Lift=2.72; Leverage=0.1568 (156)]
Profitability99 < 419 -> Spend99 < 2030 [Coverage=0.333 (333); Support=0.302 (302); Strength=0.907; Lift=2.72; Leverage=0.1911 (191)]
If a rule is unproductive then it will have the same or worse support, strength, lift, and leverage as the rule with respect to which it is unproductive.
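The unproductive test can be sketched in Python as a comparison of a rule's strength against the strength of every rule whose LHS is a proper subset of its own. This is an illustration only, not Magnum Opus's implementation. The function name `is_unproductive` and the dictionary layout are hypothetical. Strengths are computed from the support and coverage counts shown in the example above, and all rules are assumed to share the RHS Spend99 < 2030.

```python
from typing import Dict, FrozenSet

Lhs = FrozenSet[str]


def is_unproductive(lhs: Lhs, strengths: Dict[Lhs, float]) -> bool:
    """True if some rule with a proper subset of `lhs` has equal or higher strength.

    `strengths` maps each known LHS to its strength for one fixed RHS.
    """
    own = strengths[lhs]
    # `other < lhs` is the proper-subset test for frozensets.
    return any(other < lhs and s >= own for other, s in strengths.items())


specific = frozenset({"Profitability99 < 419", "Promotion1=f"})
general = frozenset({"Profitability99 < 419"})

# Strength = support count / coverage count, taken from the listing above.
strengths = {
    specific: 248 / 274,
    general: 302 / 333,
}

print(is_unproductive(specific, strengths))
print(is_unproductive(general, strengths))
```

Only the rules present in `strengths` are compared, so the sketch judges a rule against the generalizations it has been given rather than against every possible subset of the LHS.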
A rule is insignificant if its strength is not significantly greater than that of every one of its immediate generalizations and of the default rule; failing the test against any one of them is enough. An immediate generalization is formed by deleting a single condition from the LHS of a rule. The default rule is formed by deleting all conditions from the LHS of a rule. A Fisher exact test is used to test for significance. The critical value for the significance test can be chosen by the user and defaults to 0.01. For example, the first of the rules below is insignificant using the default critical value of 0.01 because adding NoVisits99 < 35 to the LHS of the second rule does not significantly increase its strength.
Spend99 < 2030 & NoVisits99 < 35 -> Profitability99 < 419 [Coverage=0.272 (272); Support=0.255 (255); Strength=0.938; Lift=2.82; Leverage=0.1644 (164)]
Spend99 < 2030 -> Profitability99 < 419 [Coverage=0.333 (333); Support=0.302 (302); Strength=0.907; Lift=2.72; Leverage=0.1911 (191)]
Filtering out insignificant rules will remove many rules that result from adding a further condition to the Left-Hand-Side of an existing rule without significantly increasing its strength.
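The set of rules that a candidate is compared against can be sketched in Python as follows. This shows only how the immediate generalizations and the default rule are formed from an LHS; the significance test itself is not included. The function names and the representation of an LHS as a tuple of condition strings are hypothetical.

```python
from typing import List, Tuple

Lhs = Tuple[str, ...]


def immediate_generalizations(lhs: Lhs) -> List[Lhs]:
    """Return every LHS obtained by deleting exactly one condition from `lhs`."""
    return [lhs[:i] + lhs[i + 1:] for i in range(len(lhs))]


def comparison_rules(lhs: Lhs) -> List[Lhs]:
    """Immediate generalizations plus the default rule (the empty LHS)."""
    candidates = immediate_generalizations(lhs)
    default: Lhs = ()
    # For a single-condition LHS the only immediate generalization
    # is already the default rule, so avoid listing it twice.
    if default not in candidates:
        candidates.append(default)
    return candidates


rule_lhs = ("Spend99 < 2030", "NoVisits99 < 35")
for candidate in comparison_rules(rule_lhs):
    print(candidate if candidate else "(default rule: empty LHS)")
```

For the example rule this gives three comparisons: the rule with only NoVisits99 < 35 on the LHS, the rule with only Spend99 < 2030 on the LHS, and the default rule.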
If a rule is trivial then it will also be unproductive. Hence, Filter-out Unproductive Mode filters out all rules filtered out by Filter-out Trivial Mode. If a rule is unproductive then it will also be insignificant. Hence, Filter-out Insignificant Mode filters out all rules filtered out by Filter-out Trivial and Filter-out Unproductive Modes.
The trivial and unproductive filters are identical for itemsets. They both remove any itemsets with leverage ≤ 0.0. The insignificant filter removes any itemsets that fail a Fisher exact test for the null hypothesis that leverage ≤ 0.0.
Each filter mode detects the respective type of spurious rule or itemset and removes it from the list of rules or itemsets that is returned to the user.
The current filter mode is displayed in the Filter Mode ComboBox on the Search Settings Page.
| © G I WEBB & ASSOCIATES 1999-2005 | Last updated September 2005 |