The Result Data Newsletter   
Volume 801 - January 2008   
© Copyright 2007 Result Data Consulting, Ltd.  614-505-0770  www.resultdata.com   



SQL Server Analysis Services: Analysis vs. Data Mining

by: Charles Tournear, Sr. Consultant, MCT, MCSE, MCSD, MCDBA, CRCP

Microsoft SQL Server Analysis Services provides tools both to analyze data and to mine it.  This article begins to answer the question: “What is the difference between simple analysis and data mining?”

Analytics (or data analysis) looks at existing data to see where we are now or what we’ve done in the past.  It lets you compare data over time and build trends, for example to estimate where you might be next month relative to this month or to the same month last year.  You can use Key Performance Indicators (KPIs) to check the status of current data and drill down to the detail to see what is driving an indicator and why.  You can retrieve the data with Transact-SQL (T-SQL) queries or Multidimensional Expressions (MDX).
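A comparison of this kind is ordinary analysis.  The sketch below is only an illustration: it uses Python's built-in sqlite3 module as a stand-in for a T-SQL query against SQL Server, and the monthly_sales table, its columns, and the figures are hypothetical.

```python
import sqlite3

# Hypothetical monthly sales figures; table and column names are illustrative only.
rows = [
    ("2007-01", 120.0),
    ("2007-12", 150.0),
    ("2008-01", 138.0),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE monthly_sales (month TEXT PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO monthly_sales VALUES (?, ?)", rows)


def compare(conn, current, baseline):
    """Return the percentage change from the baseline month to the current month."""
    query = """
        SELECT cur.amount, base.amount
        FROM monthly_sales AS cur
        JOIN monthly_sales AS base ON base.month = ?
        WHERE cur.month = ?
    """
    cur_amount, base_amount = conn.execute(query, (baseline, current)).fetchone()
    return round((cur_amount - base_amount) / base_amount * 100, 1)


# This month against last month, and against the same month last year.
vs_last_month = compare(conn, "2008-01", "2007-12")
vs_last_year = compare(conn, "2008-01", "2007-01")
print(vs_last_month, vs_last_year)
```

Both numbers come straight from data that already exists; nothing here is learned or predicted by an algorithm.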

Data mining is predictive analytics.  It looks for previously unknown patterns in existing data and helps answer what-if questions such as “If we do this or that, what might happen next?”  Data mining uses the patterns it finds to define a set of rules, derived by a mathematical algorithm, that help predict future outcomes.

It is very easy to pose questions about your existing data that sound like data mining but are in reality just analysis questions.  So do you need data mining?  If most of the questions you would ask, or the data you would want to retrieve “that meets a particular condition,” can be answered by a T-SQL query, no matter how complex, or by an MDX query against a cube, then the answer is no.
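For example, “Which of our bigger customers have gone quiet?” sounds predictive, but it is just a condition on existing data.  The sketch below again uses Python's built-in sqlite3 module as a stand-in for a T-SQL query; the orders table, its columns, the sample rows, and the two thresholds are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer TEXT, order_date TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("Acme", "2007-03-10", 400.0),
        ("Acme", "2007-05-02", 300.0),
        ("Birch", "2007-04-18", 900.0),
        ("Birch", "2007-12-05", 250.0),
        ("Cedar", "2007-02-01", 100.0),
    ],
)

# Customers whose total spend exceeds a threshold and whose most recent
# order is older than a cutoff date: a filter, not a prediction.
query = """
    SELECT customer
    FROM orders
    GROUP BY customer
    HAVING SUM(amount) > ? AND MAX(order_date) < ?
    ORDER BY customer
"""
lapsed = [row[0] for row in conn.execute(query, (500.0, "2007-07-01"))]
print(lapsed)
```

The query only filters and aggregates rows that are already stored, so a question like this is answered by analysis and does not call for data mining.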

To determine whether you need data mining, you need a thorough understanding of T-SQL and of the analysis tools that work with SQL Server Analysis Services (SSAS): tools such as ProClarity or PerformancePoint Server that provide trend analysis through cubes and KPIs, and T-SQL and MDX for exploring data directly.  Without that knowledge you won’t be able to decide whether the questions you need to ask about your data require data mining.

In most cases, analysis is enough.  In many cases you will also find that you do not yet have all the data needed for true data mining.  You may need to send out surveys to collect more detail about people, places, or products before new patterns become visible.

In SSAS, data mining starts with selecting a set of data that fits the model you are trying to analyze.  The next step is to select an appropriate algorithm and apply it to that data to create a set of rules.  The last step is to apply those rules to similar data to get a predictive result.
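The three steps can be sketched in a few lines of Python.  This is not how SSAS implements them: the “one-rule” learner below is a deliberately simple stand-in for the algorithms SSAS ships with, and the customer attributes, values, and function names are hypothetical.

```python
from collections import Counter, defaultdict

# Step 1: select the cases to learn from (hypothetical customer data).
training = [
    {"age_group": "young", "owns_home": "no", "bought": "yes"},
    {"age_group": "young", "owns_home": "yes", "bought": "yes"},
    {"age_group": "middle", "owns_home": "yes", "bought": "no"},
    {"age_group": "middle", "owns_home": "no", "bought": "no"},
    {"age_group": "senior", "owns_home": "yes", "bought": "no"},
    {"age_group": "senior", "owns_home": "no", "bought": "no"},
]


def learn_rules(cases, inputs, target):
    """Step 2: derive rules with a simple one-rule algorithm.

    For each input attribute, map every value to its most common outcome,
    then keep the attribute whose rules misclassify the fewest cases.
    """
    best = None
    for attr in inputs:
        counts = defaultdict(Counter)
        for case in cases:
            counts[case[attr]][case[target]] += 1
        rules = {value: c.most_common(1)[0][0] for value, c in counts.items()}
        errors = sum(1 for case in cases if rules[case[attr]] != case[target])
        if best is None or errors < best[2]:
            best = (attr, rules, errors)
    return best[0], best[1]


def predict(model, case):
    """Step 3: apply the learned rules to a new, similar case."""
    attr, rules = model
    return rules.get(case[attr])  # None if the value was never seen in training


model = learn_rules(training, ["age_group", "owns_home"], "bought")
prediction = predict(model, {"age_group": "young", "owns_home": "yes"})
print(model[0], prediction)
```

Unlike a query, the rules here are not written by the person asking the question; they are derived from the selected data and then applied to cases the algorithm has not seen.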

The advantage of the data mining tools provided by SSAS is that you don’t have to hire a Ph.D. in statistical analysis to apply a complex set of mathematical equations and hand you different collections of data, leaving you to decide whether any of the information is really useful.

The hardest part, especially after someone has demonstrated data mining and shown you how spectacular the results look, is to determine whether you have the necessary data, whether you are willing to pay the extra cost to get it, and whether you have the time to evaluate the results and decide if any of the output from a particular algorithm is useful.

So the easiest way to begin is to build a list of questions you would like answered, or results you would like to get, from your data.  Then see whether standard analysis processes can answer those questions.  In most cases you will find that data mining just isn’t what you really need right now.

Most of us ask questions to find out where we are now and what we should be doing next to survive; we don’t really have the data or the time to invest in looking at where we will be 5 or 10 years from now based on trends or patterns in our data.  We tend to focus first on improving how we do things now and on making what we do more efficient.  Only once those questions have been answered (once we’re sure we can survive through tomorrow) can we use the additional tools of data mining to help us predict the future, or to learn what data we need to collect to find hidden patterns in our processes.  If you really want to look to the future now, though, consider reviewing some of the algorithms available for data mining to find out what additional data you could be collecting, so that you are prepared to perform the analysis later.


The Result Data Newsletter is published approximately once a month to share the latest information on business intelligence, data management and CRM. There should be a link below to allow you to change or remove yourself from our list. We take your requests very seriously. If you have any difficulty please contact us at 614-505-0770 and we will make sure that your request is handled properly. This is not intended to be an unsolicited message and you can reach us in person if needed.

© Copyright 2007 Result Data Consulting, Ltd. - All Rights Reserved
All trademarks and copyrights are the property of their respective owners. This information is provided without warranty.
Announcements
Quarter 1 Training Special
Schedule and attend any public training class now through March 31st and receive 10% off the normal class price OR opt for a gift certificate to the Apple Store for that same dollar amount. Restrictions apply and you must mention the promotional code Apple08 at the time of registration to receive the promotion.  Call 614-505-0770 for further details and restrictions.
Next Mid-Ohio BusinessObjects User Group Meeting
The next MOBOUG is Feb. 6, 2008.  Call 614-505-0770 or click here to reserve your seat.
Attend the First Microsoft SharePoint Seminar
The first free Microsoft SharePoint Seminar is on Feb. 20, 2008.  Call 614-505-0770 or click here to reserve your seat.
Looking for a Few Good Men and Women
Join our award-winning team of Business Intelligence consultants and .NET software developers.
Send your resume and salary requirements to:
jobs@resultdata.com