Data Mining Workbenches: an overview & comparison focusing on open-source packages. CS240A notes by C. Zaniolo.

1 Data Mining Workbenches: an overview & comparison focusing on open-source packages
CS240A notes by C. Zaniolo

2 Most Popular Data Mining Software
Rexer Analytics Survey (early 2007) asked about the tools used often and occasionally.
Clearly more popular than the rest were:
SPSS or SPSS Clementine
"Own Code"
SAS or SAS Enterprise Miner
Followed by:
R
Weka
C4.5 / C5.0

3 Critical Mass and Popularity
Top ten most used packages by KDnuggets Survey (May 2007):
SPSS / SPSS Clementine
Salford Systems CART/MARS/TreeNet/RF
Yale (now RapidMiner)
SAS / SAS Enterprise Miner
Angoss Knowledge Studio / Knowledge Seeker
KXEN
Weka
R
Microsoft SQL Server?
MATLAB?
Note: Microsoft Excel is omitted as it's not really "data mining" software, and the tools offered by a single vendor (SPSS and SAS) have been merged.
You can see the full survey results

4 Comments Gregory Piatetsky-Shapiro, KDnuggets Editor:
Votes from tool vendors were removed. Compared with the 2008 KDnuggets Poll on data mining tools/software used, the big changes are growth in SPSS, RapidMiner, and R.

5 Popular Data Mining Software (cont.)
The Rexer Analytics Survey is taken every year, and the summary report can be obtained free.
2009 SURVEY HIGHLIGHTS:
Open-source tools Weka and R made substantial movement up data miners' tool rankings this year, and are now used by large numbers of both academic and for-profit data miners.
SAS Enterprise Miner dropped in data miners' tool rankings.
2010 SURVEY HIGHLIGHTS:
R: After a steady rise across the past few years, R overtook other tools to become the tool used by more data miners (43%) than any other.
STATISTICA has also been climbing in the rankings.
STATISTICA, IBM SPSS Modeler, and R received the strongest satisfaction ratings in both 2010 and 2009.
