Will Data Mining Change the Functions of DBMS? Jiawei Han DAIS (Data And Information Systems) Lab University of Illinois at Urbana-Champaign.


1 Will Data Mining Change the Functions of DBMS? Jiawei Han DAIS (Data And Information Systems) Lab University of Illinois at Urbana-Champaign

2 Will DM Be Integrated with DB Functions?
DM: Already a functional component of DBMS
- Microsoft/SQLServer: Analysis Manager
- IBM/DB2 & IntelligentMiner
- Oracle: Data Mining Package
But will DM be “intruding” into DBMS, i.e., be integrated with essential DBMS functions?
- Indexing
- Data integration
- Data cleaning
- Query processing

3 Indexing by Data Mining
Indexing graphs? (# of subgraphs: exponential!)
- Chemical informatics/bioinformatics …
- Discriminative frequent graph patterns (SIGMOD’04)
Indexing subsequences?
- Shopping sequence, DNA/protein sequence (SDM’05)
When is discriminative frequent pattern indexing useful?
- Complex objects, big (object) queries
[Figure: sample database of graphs (a), (b), (c) and a query graph]
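The filter-then-verify idea behind pattern-based indexing can be sketched on plain strings. This is only an illustration under simplifying assumptions, not the method of the cited papers: it indexes fixed-length contiguous fragments instead of mined graph or subsequence patterns, it keeps every frequent fragment rather than selecting discriminative ones, and the names `build_index`, `search`, `k`, and `min_sup` are hypothetical.

```python
from collections import defaultdict

def build_index(db, k=2, min_sup=2):
    """Map each length-k fragment to the ids of sequences containing it.

    Only fragments occurring in at least min_sup sequences are kept
    (a stand-in for "frequent patterns"; no discriminative selection here).
    """
    postings = defaultdict(set)
    for sid, seq in enumerate(db):
        for i in range(len(seq) - k + 1):
            postings[seq[i:i + k]].add(sid)
    return {frag: ids for frag, ids in postings.items() if len(ids) >= min_sup}

def search(db, index, query, k=2):
    """Return ids of sequences that contain query as a contiguous substring."""
    # Filter: a sequence containing the query must contain every fragment
    # of the query, so intersect the posting lists of the indexed fragments.
    candidates = set(range(len(db)))
    for i in range(len(query) - k + 1):
        frag = query[i:i + k]
        if frag in index:
            candidates &= index[frag]
    # Verify: run the expensive containment test on the survivors only.
    return sorted(sid for sid in candidates if query in db[sid])

db = ["ACGTAC", "TTACGA", "GGGTTT", "ACGACG"]
index = build_index(db)
matches = search(db, index, "GTT")
```

Fragments missing from the index are simply skipped during filtering, so the result stays correct and only some pruning power is lost. The payoff grows with the cost of verification, which is why the slide singles out complex objects and big queries.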

4 Data Cleaning by Data Mining
Load messy data into a structured database?
- Inconsistent data: age = “1946”?
- Field misalignments
- Glitches of data: completely messed up inputs
- Missing/mismatched delimiters: XML, HTML data
- Big field: BLOB, CLOB, multimedia and text
Data mining
- Data cleaning by distribution/outlier analysis
- Dependency/correlation analysis
- Schema-directed or schema “discovery”
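As one possible reading of "data cleaning by distribution/outlier analysis", the sketch below flags values that sit far from the bulk of a column, such as a birth year stored in an age field. The median/MAD score, the 3.5 cutoff, and the name `flag_outliers` are assumptions chosen for illustration; the slides do not prescribe a particular test.

```python
from statistics import median

def flag_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds threshold.

    Uses the median and the median absolute deviation (MAD), which are
    not distorted by the very outliers being searched for.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half the values equal the median; flag anything that differs.
        return [i for i, v in enumerate(values) if v != med]
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

ages = [34, 41, 29, 1946, 52, 38, 45]
suspect = flag_outliers(ages)  # index 3, the value 1946
```

A loader could route flagged rows to a review queue rather than reject them outright, since an extreme value is only evidence of an error, not proof.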

5 Data Integration by Data Mining
Linking and mining across multiple data relations
- CrossMine (classification across multiple data relations: ICDE’04)
Search across heterogeneous databases
- Object identification/merge, reference reconciliation (Alon’s group)
- Mining across heterogeneous DBs
- Personalizing data from heterogeneous sources

6 Query Processing by Data Mining
Query plan refinement based on query execution history
Better query planning by investigating additional data statistics
- Current optimizer: key/foreign key, cardinality, # distinct values
- Additional information:
  - Strong dependency/correlation
  - Histogram, dense vs. sparse regions, etc.
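To see why dependency/correlation statistics could help a planner, the toy example below compares a cardinality estimate made under the usual independence assumption with the true count on two correlated columns. The table contents, the column meanings (make, model), and the helper `estimate_conjunction` are made up for illustration and do not describe any particular optimizer.

```python
def estimate_conjunction(rows, pred_a, pred_b):
    """Return (independence-based estimate, actual count) for pred_a AND pred_b."""
    n = len(rows)
    sel_a = sum(1 for r in rows if pred_a(r)) / n
    sel_b = sum(1 for r in rows if pred_b(r)) / n
    # Independence assumption: multiply the per-column selectivities.
    independent = sel_a * sel_b * n
    actual = sum(1 for r in rows if pred_a(r) and pred_b(r))
    return independent, actual

# Hypothetical table: the model fully determines the make (strong dependency).
rows = [("Honda", "Accord"), ("Honda", "Civic"),
        ("Toyota", "Camry"), ("Toyota", "Corolla")] * 25

estimate, actual = estimate_conjunction(
    rows,
    lambda r: r[0] == "Honda",
    lambda r: r[1] == "Accord",
)
# estimate is 12.5 rows, actual is 25 rows
```

Here the independence assumption underestimates by a factor of two; a mined rule such as "model implies make" would let the planner drop the redundant predicate from the estimate.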

7 Conclusions
DBers have been “invading” DM and have made great contributions
It is time to consider that DM may invade DBMS to enhance its functionality
General philosophy
- Invisible data mining
  - Google is doing this for page ranking successfully
  - Can we do it to enhance DBMS?
- You can do better if you know your data better!

