Sunday, February 12, 2012

Basket analysis & Association Mining

I'm looking for suggestions on the right design approach to a problem that resembles basket analysis. The data to be analyzed is a dimension, Attribute_DIM, that contains an ID, an Attribute, and an Attribute_Value. Some sample rows:

ID  Attribute  Attribute_Value
1   Color      Black
1   Movie      Men in Black
1   Book       Of Human Bondage
2   Color      White
2   Movie      Men in Black
2   Book       Grapes of Wrath

We need to be able to analyze multiple selections of the dimension. For example,

                        Men In Black
                        Grapes Of Wrath   Of Human Bondage
Men In Black   Black    1                 1
               White    1                 0
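For reference, the kind of grid shown above can be computed descriptively, without a mining model. The following is a minimal Python sketch, not a cube design. It assumes the six sample rows from this post are available as (ID, Attribute, Attribute_Value) tuples. The names `build_baskets` and `cross_tab` are hypothetical helpers introduced only for illustration. The counts are derived from the six sample rows alone, so they will not necessarily match the illustrative grid.

```python
from collections import defaultdict

# Sample rows copied from the post: (ID, Attribute, Attribute_Value).
ROWS = [
    (1, "Color", "Black"),
    (1, "Movie", "Men in Black"),
    (1, "Book", "Of Human Bondage"),
    (2, "Color", "White"),
    (2, "Movie", "Men in Black"),
    (2, "Book", "Grapes of Wrath"),
]


def build_baskets(rows):
    """Group rows into one basket per ID: {id: {attribute: {values}}}."""
    baskets = defaultdict(lambda: defaultdict(set))
    for row_id, attribute, value in rows:
        baskets[row_id][attribute].add(value)
    return baskets


def cross_tab(baskets, row_attr, col_attr):
    """Count distinct IDs for every (row_attr value, col_attr value) pair."""
    counts = defaultdict(int)
    for attrs in baskets.values():
        for r in attrs.get(row_attr, ()):
            for c in attrs.get(col_attr, ()):
                counts[(r, c)] += 1
    return dict(counts)


baskets = build_baskets(ROWS)
color_by_book = cross_tab(baskets, "Color", "Book")
```

Each ID is counted once per pair of values it holds, which is the distinct-count behavior a basket cross-tab needs. Pairs that never occur together are simply absent from the result rather than stored as zero.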

I have had some success using the Association Algorithm Mining Model, but I think it is overkill, since I only need descriptive analysis, not predictive analysis.

I'm looking for some ideas on the right approach to this problem. Ideally, we need to present the data in a cube and have the possibility to perform member analysis of the dimension.

I have looked at several articles (including http://msdn2.microsoft.com/en-us/library/aa902637(sql.80).aspx and http://www.aspnetpro.net/newsletterarticle/2004/10/asp200410ri_l/asp200410ri_l.asp). I'm not convinced those are the solutions and would appreciate any insight into this problem.

Thank you,

Anna.

You might try an OLAP cube with a many-to-many dimension, described here: http://msdn2.microsoft.com/en-us/library/ms170463.aspx. There's also a short book dedicated to the feature by Marco Russo (http://www.lulu.com/content/812235).

|||

So, if I apply your solution, I would have two dimensions from the same source and establish a many-to-many relationship between the two, right? Wouldn't this limit the analysis to only two dimensions? For example, what if I needed to analyze three attributes and see how the basket looks in that case? I need to be able to analyze an open-ended number of attributes on separate axes.
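To make the "open-ended number of attributes" requirement concrete, here is a small Python sketch of the selection logic itself, independent of any cube or many-to-many design. It assumes the sample data from the original post has already been grouped into one basket per ID. `matching_ids` is a hypothetical helper name, not part of any Analysis Services API.

```python
# Baskets keyed by ID, built from the sample rows in the original post.
BASKETS = {
    1: {"Color": {"Black"}, "Movie": {"Men in Black"}, "Book": {"Of Human Bondage"}},
    2: {"Color": {"White"}, "Movie": {"Men in Black"}, "Book": {"Grapes of Wrath"}},
}


def matching_ids(baskets, selections):
    """Return IDs whose basket satisfies every selected attribute.

    `selections` maps an attribute name to a set of accepted values.
    Values within one attribute are OR-ed; attributes are AND-ed.
    """
    return {
        basket_id
        for basket_id, attrs in baskets.items()
        if all(attrs.get(attr, set()) & wanted for attr, wanted in selections.items())
    }


two_way = matching_ids(BASKETS, {"Movie": {"Men in Black"}, "Color": {"Black", "White"}})
three_way = matching_ids(
    BASKETS,
    {"Movie": {"Men in Black"}, "Color": {"Black"}, "Book": {"Of Human Bondage"}},
)
```

Because `selections` is an ordinary mapping, a third or fourth attribute is just another key. Nothing in the logic is tied to exactly two dimensions, which is the property the question asks a cube design to preserve.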

This article comes closest to describing my problem: http://msdn2.microsoft.com/en-us/library/aa902637(sql.80).aspx (Analysis Services: DISTINCT COUNT, Basket Analysis, and Solving the Multiple Selection of Members Problem). The article is based on SQL Server 2000. I would like to know if there is a simpler and different approach with SQL Server 2005.

|||

Actually, in response to your original question, Association Rules is generally used for descriptive analysis, not predictive analysis. It is a rather recent innovation that has allowed AR to be used for predictive purposes. I think your best bet is to use AR.

|||

Thank you for the suggestion. I will go ahead with the idea of using a Mining model with Association rules.
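As an illustration of the descriptive side of association rules, the sketch below computes the two standard rule measures, support and confidence, for item pairs in the sample data from this thread. It is plain Python and makes no claim about how the SQL Server Association Rules algorithm is implemented. The "Attribute=Value" item encoding and the `support` and `pair_rules` helpers are assumptions made for the example.

```python
from itertools import combinations

# One basket per ID, each item written as "Attribute=Value" (sample data from the post).
BASKETS = [
    {"Color=Black", "Movie=Men in Black", "Book=Of Human Bondage"},
    {"Color=White", "Movie=Men in Black", "Book=Grapes of Wrath"},
]


def support(itemset, baskets):
    """Fraction of baskets that contain every item in `itemset`."""
    hits = sum(1 for basket in baskets if itemset <= basket)
    return hits / len(baskets)


def pair_rules(baskets):
    """Yield (antecedent, consequent, support, confidence) for all item pairs."""
    items = sorted(set().union(*baskets))
    for a, b in combinations(items, 2):
        both = support({a, b}, baskets)
        if both == 0:
            continue  # the pair never occurs together
        for lhs, rhs in ((a, b), (b, a)):
            yield lhs, rhs, both, both / support({lhs}, baskets)


rules = {(lhs, rhs): (sup, conf) for lhs, rhs, sup, conf in pair_rules(BASKETS)}
```

With this data, the rule from "Color=Black" to "Movie=Men in Black" has support 0.5 and confidence 1.0, while the reverse rule has confidence 0.5, since only one of the two baskets containing the movie also contains that color. Both numbers describe the existing baskets and involve no prediction step.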
