December 18, 2013

Finding Strong Associations

A substantial part of the KDD literature deals with finding strong statistical associations (or correlations) between data elements in a collection (e.g., Toivonen et al., 1995; Kloesgen, 1995a; Feldman et al., 1996). Such associations have been used in various applications, including:
• Supermarket shopping lists: finding correlations between customers' purchase preferences.
• Telecommunications alarm rules: identifying associations between system attributes and faults.
In the KDT context, we are interested in finding statistical associations between keywords. For example, we may identify the economic topics that are highly associated with a certain country. The comparative approach of the previous section enables us to focus on associations that are likely to be interesting, i.e., those that deviate from a baseline model. For example, we give a higher rank to an association between a country and a topic only if that association is not also typical of other countries.
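
To make the ranking idea concrete, here is a minimal Python sketch. It assumes each document is a set of keywords, that country and topic keywords are known in advance, and that deviation from the baseline is scored as p * log(p / q), where p is the share of a country's documents that mention the topic and q is the same share pooled over all other countries. The function name rank_associations, the smoothing parameter, the scoring formula, and the toy documents are all illustrative assumptions, not taken from the cited work.

```python
from collections import Counter
from math import log


def rank_associations(docs, countries, topics, smoothing=0.5):
    """Rank (country, topic) pairs by how far P(topic | country)
    deviates from P(topic | all other countries).

    The score p * log(p / q) is an assumed deviation measure.
    """
    country_docs = Counter()  # documents mentioning each country
    pair_docs = Counter()     # documents mentioning each (country, topic)
    for keywords in docs:
        doc_topics = keywords & topics
        for country in keywords & countries:
            country_docs[country] += 1
            for topic in doc_topics:
                pair_docs[(country, topic)] += 1

    # A document naming two countries is counted once per country.
    total = sum(country_docs.values())

    scores = []
    for (country, topic), n_pair in pair_docs.items():
        # Smoothed share of this country's documents that mention the topic.
        p = (n_pair + smoothing) / (country_docs[country] + 2 * smoothing)
        # Baseline: the same share, pooled over every other country.
        n_other_topic = sum(
            n for (c, t), n in pair_docs.items() if t == topic and c != country
        )
        n_other = total - country_docs[country]
        q = (n_other_topic + smoothing) / (n_other + 2 * smoothing)
        scores.append((p * log(p / q), country, topic))
    return sorted(scores, reverse=True)


# Toy collection (hypothetical data, for illustration only).
docs = [
    {"brazil", "coffee"},
    {"brazil", "coffee"},
    {"brazil", "trade"},
    {"canada", "trade"},
    {"canada", "trade"},
    {"japan", "trade"},
    {"japan", "trade"},
]
ranked = rank_associations(
    docs, countries={"brazil", "canada", "japan"}, topics={"coffee", "trade"}
)
for score, country, topic in ranked:
    print(f"{country:8s} {topic:8s} {score:+.3f}")
```

In this toy collection, "trade" appears with every country, so its associations score low, while "coffee" appears only with Brazil and therefore ranks first. Other deviation measures could be substituted for the score without changing the overall structure.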

