1
Web Data Management
Dr. Daniel Deutch
2
Web Data
The web has revolutionized our world
Data is everywhere
It holds great potential
But also a lot of challenges
– Web data is huge, unstructured, heterogeneous, partially incorrect...
Just the ingredients of a fun topic!
3
Challenges
Bringing structure to the Web
Utilizing the structure for various tasks
Searching for relevant web pages
– Given keywords, social profile…
Ranking the results
Combining results from different sources
– E.g. social networks + search history
– Combining rankings
Recommendations
All with huge and uncertain databases
4
Ingredients
Modeling & Storage
– XML representation
– XML Typing
– XPath, XQuery
– Efficient XML querying and manipulation
Search and Retrieval
– Crawling
– Querying
– Information Retrieval and Extraction (basics)
5
Ranking
– HITS algorithm
– Google PageRank
– Rank Aggregation and Top-K algorithms
Semantic Web
– Ontologies
– Data Integration
– Deriving semantic information
– Wikipedia as an example
6
Web Services and Business Processes
– BPEL, WSDL standards
– Orchestration
– Mashups
– Analysis
Recommendations
– Collaborative Filtering
– The Netflix Million Dollar Challenge
7
Querying the deep web
Online advertisements
– Models
– Algorithms
Building a large-scale application
– Distributed data management
– MapReduce and Pig Latin
8
Resources
Book
– http://webdam.inria.fr/Jorge/index.php
– Free full version available online
Papers
– Links will be available when relevant
Web site
– Accessible from http://cs.bgu.ac.il/~deutchd/teaching
– All slides will be available online
9
Your Duties
20% Quiz
40% Project
40% Exercises
– Including programming tasks