Data Mining with Big Data
IEEE TRANSACTIONS ON KNOWLEDGE AND DATA ENGINEERING, 2014
Xiangyu Cai
Why choose this paper?
  Gives a general introduction to big data characteristics
  Identifies the challenges of big data projects
  Recommends references and related work for overcoming those challenges
OUTLINE
  Introduction
  Big data characteristics: HACE Theorem
  Data Mining Challenges with Big Data
  Research Initiatives and Projects
  Related Work
  Conclusion
I. Big Data Characteristics
HACE Theorem
  Huge Data with Heterogeneous and Diverse Dimensionalities
  Autonomous Sources with Distributed and Decentralized Control
  Complex and Evolving Relationships
II. Data Mining Challenges with Big Data
Big Data Mining Platform
Big Data Semantics and Application Knowledge
  Information Sharing and Data Privacy
  Domain and Application Knowledge
Big Data Mining Algorithms
  Local Learning and Model Fusion for Multiple Information Sources
  Mining from Sparse, Uncertain, and Incomplete Data
  Mining Complex and Dynamic Data
III. Related Research & Work
Data Mining Challenges with Big Data
  Big Data Mining Platform
    MapReduce
    Integration of R and Hadoop
  Big Data Semantics and Application Knowledge
    “Anonymizing Classification Data Using Rough Set Theory”
    User privacy restrictions may include:
      No local data copies or downloading
      All analysis must be deployed on the existing data storage systems without violating existing privacy settings
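
MapReduce, listed above under Big Data Mining Platform, can be illustrated with a small single-process sketch. This is not Hadoop code and is not taken from the paper: the function names (map_phase, shuffle, reduce_phase, word_count) and the word-count task are assumptions chosen for illustration. It only shows the three steps of the programming model: map each record to key/value pairs, group the pairs by key, and reduce each group to one result.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    for word in record.lower().split():
        yield word, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key.

    In a real cluster the framework does this between map and reduce;
    here it is a plain in-memory dictionary (illustrative assumption).
    """
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: combine all counts collected for one key."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle, and reduce in sequence over an iterable of strings."""
    pairs = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    docs = ["big data mining", "mining big data streams"]
    print(word_count(docs))
```

In a distributed setting the map and reduce calls would run on different machines over partitions of the data, which is why each function here depends only on its own arguments.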