Data Mining with Big Data
IEEE Transactions on Knowledge and Data Engineering, 2014
Xiangyu Cai (5120309687)



Why choose this paper?
- It gives a general introduction to big data characteristics
- It identifies the challenges facing big data projects
- It recommends references and related work for overcoming those challenges

OUTLINE
- Introduction
- Big data characteristics: HACE Theorem
- Data Mining Challenges with Big Data
- Research Initiatives and Projects
- Related Work
- Conclusion

I. Big Data Characteristics

HACE Theorem
- Huge Data with Heterogeneous and Diverse Dimensionalities
- Autonomous Sources with Distributed and Decentralized Control
- Complex and Evolving Relationships

II. Data Mining Challenges with Big Data

Data Mining Challenges with Big Data
- Big Data Mining Platform
- Big Data Semantics and Application Knowledge
  - Information Sharing and Data Privacy
  - Domain and Application Knowledge
- Big Data Mining Algorithms
  - Local Learning and Model Fusion for Multiple Information Sources
  - Mining from Sparse, Uncertain and Incomplete Data
  - Mining Complex and Dynamic Data
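One simple way to picture "Local Learning and Model Fusion for Multiple Information Sources" is that each autonomous source summarizes its own data and only those summaries are combined centrally. The Python sketch below illustrates that idea with basic statistics; the `LocalSummary` schema, the function names, and the sample values are assumptions made for illustration and are not taken from the paper or the slides.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class LocalSummary:
    """Statistics one source agrees to share (hypothetical schema)."""
    count: int
    total: float
    total_sq: float


def learn_locally(values: Iterable[float]) -> LocalSummary:
    """Runs at each source; only the summary leaves, not the raw values."""
    count, total, total_sq = 0, 0.0, 0.0
    for v in values:
        count += 1
        total += v
        total_sq += v * v
    return LocalSummary(count, total, total_sq)


def fuse(summaries: List[LocalSummary]) -> dict:
    """Runs at a coordinating site; merges summaries into a global view."""
    n = sum(s.count for s in summaries)
    if n == 0:
        raise ValueError("no data in any source")
    mean = sum(s.total for s in summaries) / n
    # Population variance from the pooled sums: E[x^2] - (E[x])^2.
    variance = sum(s.total_sq for s in summaries) / n - mean * mean
    return {"count": n, "mean": mean, "variance": max(variance, 0.0)}


# Three made-up sources that never exchange raw records.
sources = [[1.0, 2.0, 3.0], [4.0, 5.0], [6.0]]
global_view = fuse([learn_locally(s) for s in sources])
```

Real model fusion would typically combine richer local models than these sums, but the data flow (learn locally, share a compact summary, fuse globally) has the same shape.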

III. Related Research & Work

Data Mining Challenges with Big Data
- Big Data Mining Platform
  - MapReduce
  - Integration of R and Hadoop
- Big Data Semantics and Application Knowledge
  - “Anonymizing Classification Data Using Rough Set Theory”
  - User privacy restrictions may include:
    - No local data copies or downloading
    - All analysis must be deployed based on the existing data storage systems without violating existing privacy settings
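Since MapReduce is listed as the platform-level answer, a small single-process sketch of the programming model may help. It only imitates the map, shuffle, and reduce phases in plain Python and does not use Hadoop or any cluster API; the function names and the sample records are assumptions for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def map_words(record: str) -> Iterator[Tuple[str, int]]:
    """Map phase: emit a (word, 1) pair for every word in one record."""
    for word in record.lower().split():
        yield word, 1


def reduce_counts(key: str, values: List[int]) -> int:
    """Reduce phase: combine all values emitted for one key."""
    return sum(values)


def run_mapreduce(
    records: Iterable[str],
    mapper: Callable[[str], Iterator[Tuple[str, int]]],
    reducer: Callable[[str, List[int]], int],
) -> Dict[str, int]:
    """Single-process stand-in for the framework: map, shuffle, reduce."""
    groups: Dict[str, List[int]] = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)  # shuffle: group values by key
    return {key: reducer(key, values) for key, values in groups.items()}


# Made-up input records, one string per "line" of a data set.
records = ["big data mining", "mining big data with big clusters"]
word_counts = run_mapreduce(records, map_words, reduce_counts)
```

On a real cluster the mapper and reducer would run on many nodes and the framework would handle the grouping step, but the two user-supplied functions keep this same form.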