Key Observation Theorem:

Slides:



Advertisements
Similar presentations
Data Mining and the Web Susan Dumais Microsoft Research KDD97 Panel - Aug 17, 1997.
Advertisements

E ducational T echnology I ntegration & Implementation P rinciples to guide teachers in their instructional decision making Sara Dexter University of Virginia.
Haystack: Per-User Information Environment 1999 Conference on Information and Knowledge Management Eytan Adar et al Presented by Xiao Hu CS491CXZ.
1 Distributed Agents for User-Friendly Access of Digital Libraries DAFFODIL Effective Support for Using Digital Libraries Norbert Fuhr University of Duisburg-Essen,
Lukas Blunschi Claudio Jossen Donald Kossmann Magdalini Mori Kurt Stockinger.
COMP423 Intelligent Agents. Recommender systems Two approaches – Collaborative Filtering Based on feedback from other users who have rated a similar set.
OntoBlog: Informal Knowledge Management by Semantic Blogging Aman Shakya 1, Vilas Wuwongse 2, Hideaki Takeda 1, Ikki Ohmukai 1 1 National Institute of.
Semantic Web and Web Mining: Networking with Industry and Academia İsmail Hakkı Toroslu IST EVENT 2006.
Mastering the Internet, XHTML, and JavaScript Chapter 7 Searching the Internet.
Towards Semantic Web Mining Bettina Berndt Andreas Hotho Gerd Stumme.
WebMiningResearch ASurvey Web Mining Research: A Survey By Raymond Kosala & Hendrik Blockeel, Katholieke Universitat Leuven, July 2000 Presented 4/18/2002.
Recall: Query Reformulation Approaches 1. Relevance feedback based vector model (Rocchio …) probabilistic model (Robertson & Sparck Jones, Croft…) 2. Cluster.
Chapter 14 The Second Component: The Database.
Information Retrieval
Overview of Web Data Mining and Applications Part I
Semantic Web Technologies Lecture # 2 Faculty of Computer Science, IBA.
Crowdsourcing Predictors of Behavioral Outcomes. Abstract Generating models from large data sets—and deter¬mining which subsets of data to mine—is becoming.
Enterprise & Intranet Search How Enterprise is different from Web search What to think about when evaluating Enterprise Search How Intranet use is different.
1 The BT Digital Library A case study in intelligent content management Paul Warren
What is the Internet? Internet: The Internet, in simplest terms, is the large group of millions of computers around the world that are all connected to.
Artificial Intelligence Techniques Internet Applications 1.
PERSONALIZED SEARCH Ram Nithin Baalay. Personalized Search? Search Engine: A Vital Need Next level of Intelligent Information Retrieval. Retrieval of.
Xiaoying Gao Computer Science Victoria University of Wellington Intelligent Agents COMP 423.
Ontology-Driven Automatic Entity Disambiguation in Unstructured Text Jed Hassell.
WebMining Web Mining By- Pawan Singh Piyush Arora Pooja Mansharamani Pramod Singh Praveen Kumar 1.
Markup and Validation Agents in Vijjana – A Pragmatic model for Self- Organizing, Collaborative, Domain- Centric Knowledge Networks S. Devalapalli, R.
29-30 October, 2006, Estonia 1 IST4Balt Information analysis using social bookmarking and other tools IST4Balt Information analysis using social bookmarking.
Grid Workload Management Massimo Sgaravatto INFN Padova.
Data Mining By Dave Maung.
2007. Software Engineering Laboratory, School of Computer Science S E Web-Harvest Web-Harvest: Open Source Web Data Extraction tool 이재정 Software Engineering.
Data Mining for Web Intelligence Presentation by Julia Erdman.
Jun Li, Peng Zhang, Yanan Cao, Ping Liu, Li Guo Chinese Academy of Sciences State Grid Energy Institute, China Efficient Behavior Targeting Using SVM Ensemble.
Of 33 lecture 1: introduction. of 33 the semantic web vision today’s web (1) web content – for human consumption (no structural information) people search.
Iana Atanassova Research: – Information retrieval in scientific publications exploiting semantic annotations and linguistic knowledge bases – Ranking algorithms.
Augmenting (personal) IR Readings Review Evaluation Papers returned & discussed Papers and Projects checkin time.
Identifying “Best Bet” Web Search Results by Mining Past User Behavior Author: Eugene Agichtein, Zijian Zheng (Microsoft Research) Source: KDD2006 Reporter:
Chapter 8: Web Analytics, Web Mining, and Social Analytics
Lecture-6 Bscshelp.com. Todays Lecture  Which Kinds of Applications Are Targeted?  Business intelligence  Search engines.
© 2017 by McGraw-Hill Education. This proprietary material solely for authorized instructor use. Not authorized for sale or distribution in any manner.
Harnessing the Deep Web : Present and Future -Tushar Mhaskar Jayant Madhavan, Loredana Afanasiev, Lyublena Antova, Alon Halevy January 7,
Personalized Ontology for Web Search Personalization S. Sendhilkumar, T.V. Geetha Anna University, Chennai India 1st ACM Bangalore annual Compute conference,
SharePoint 101 – An Overview of SharePoint 2010, 2013 and Office 365
Recommender Systems & Collaborative Filtering
Evaluation Anisio Lacerda.
IST 220 – Intro to Databases
User Characterization in Search Personalization
Database Architectures and the Web
For Beginners Mike Buhmann Reference Librarian.
Themes in Geosciences.
External Services & Frameworks
Discussion of Challenges & Opportunities Brainstorming Stacy Kowalczyk
Database Architectures and the Web
Overview.
Datamining : Refers to extracting or mining knowledge from large amounts of data Applications : Market Analysis Fraud Detection Customer Retention Production.
Greater Arizona eLearning Association (GAZEL)
INPE, São José dos Campos (SP), Brazil
Information Retrieval
Defining Data-intensive computing
ece 627 intelligent web: ontology and beyond
The scientific Method.
Database Systems Instructor Name: Lecture-3.
Unit# 5: Internet and Worldwide Web
Agenda What is SEO ? How Do Search Engines Work? Measuring SEO success ? On Page SEO – Basic Practices? Technical SEO - Source Code. Off Page SEO – Social.
Observations on DB & IR and Conclusions
Semantic Web Towards a Web of Knowledge - Projects
Performance And Scalability In Oracle9i And SQL Server 2000
Cabrillo College’s Ellucian Portal Project
Information Retrieval and Web Design
Who is Using your webSite?
Data Warehouse and OLAP Technology
Presentation transcript:

Key Observation Theorem: the one resource where demand always exceeds supply is human intelligence (smart people) Proof: by looking around and induction Corollary: must give top priority to eliminating human bottleneck or at least alleviating human bottleneck autonomic computing intelligent information organization & search

Problem 1: Self-tuning DBMS - From Visionary Buzzwords to Practical Tools - Common practice: KIWI (kill it with iron) Problem: When is KIWI applicable, and how much will it cost? Problem (technical variant): given: perfect info about DB and previous week‘s multi-class workload throughput & response time goals (QoS / SLA / WS Policies)  predict accurately performance improvement by adding x GB memory adding y MB/s disk bandwidth adding or replacing processors with all software tuning knobs left invariant or with tuning knobs automatically adjusted Challenge: How to automate introspective bottleneck analysis and asap alerting about (modest) HW upgrades?

Problem 2: Exploit Collective Human Input for Collaborative Web Search - Beyond Relevance Feedback and Beyond Google - href links are human endorsements  PageRank, etc. Opportunity: online analysis of human input & behavior may compensate deficiencies of search engine Typical scenario for 3-keyword user query: a & b & c  top 10 results: user clicks on ranks 2, 5, 7  top 10 results: user modifies query into a & b & c & d user modifies query into a & b & e user modifies query into a & b & NOT c query logs, bookmarks, etc. provide human assessments & endorsements correlations among words & concepts and among documents  top 10 results: user selects URL from bookmarks user jumps to portal user asks friend for tips Challenge: How can we use knowledge about the collective input of all users in a large community?

Problem 2‘: Exploit Collective Human Input for Automated Data Integration - Inspired by Alon Halevy - „semantic“ data integration is hoping for ontologies Opportunity: all existing DBs & apps already provide a large set of subjective mini-ontologies Typical scenario for analyzing if A and B mean the same entity  compare their attributes, relationships, etc. consider attributes and relationships of all similar tables/docs in all known DBs DB schemas, instances & data usage in apps provide human annotations correlations among tables, attr‘s, etc. consider instances of A and B in comparison to instances of similar tables/docs in all known DBs compare usage patterns of A and B in queries & apps in comparison to similar tables/docs of all known DBs Challenge: How can we use knowledge about the collective designs of all DB apps in a large community?