Research at Open Systems Lab IIIT Bangalore

Research at Open Systems Lab IIIT Bangalore http://osl.iiitb.ac.in/

Broad Areas: co-occurrence analyses; multi-agent approaches for (database-related) optimization.

Co-occurrence Analyses: Using models of semantic memory from cognitive psychology to extract latent semantics in document collections. [Slide diagram labels: document corpus; co-occurrence (labeled) graph; graphs depicting higher-order inferences.]

Co-occurrence graph: Captures pairwise co-occurrences across entities of different types. Entity types: nouns (Person, Institution, Place, Country, etc.); tags; URLs; phrases.
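A minimal sketch of how such a graph might be built, assuming each document has already been reduced to a list of (entity type, value) pairs. The slides do not describe the extraction step or the edge weighting; the function name `build_cooccurrence_graph`, the sample data, and the choice of "number of documents containing both entities" as the edge weight are all illustrative assumptions.

```python
from collections import Counter
from itertools import combinations


def build_cooccurrence_graph(documents):
    """Count pairwise co-occurrences of typed entities.

    `documents` is an iterable of entity lists; each entity is a
    (entity_type, value) tuple. Returns a Counter keyed by an ordered
    pair of entities, valued by the number of documents containing both.
    This weighting is an assumption, not taken from the slides.
    """
    edges = Counter()
    for entities in documents:
        # Deduplicate within a document and sort so each pair has one key.
        unique = sorted(set(entities))
        for a, b in combinations(unique, 2):
            edges[(a, b)] += 1
    return edges


# Hypothetical sample data mixing the entity types named on the slide.
docs = [
    [("Person", "Alice"), ("Place", "Bangalore"), ("Tag", "research")],
    [("Person", "Alice"), ("Place", "Bangalore")],
    [("Place", "Bangalore"), ("URL", "http://example.org/")],
]
graph = build_cooccurrence_graph(docs)
```

Keeping the entity type inside the node key lets one graph hold nouns, tags, URLs, and phrases together without name clashes between types.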

Higher-order inferences: topic anchors; topic markers; synonymy; semantic siblings; topic induction.

Higher-order inferences: co-citations as URL co-occurrences; contrasting co-citation patterns between Web pages and Wikipedia; co-citation as hyperlink endorsements; co-citation as knowledge aggregation; co-citation as conditional probability of topical relevance.
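One possible reading of "co-citation as URL co-occurrence" and "co-citation as conditional probability" can be sketched as follows. Two URLs are treated as co-cited when the same page links to both, and the conditional score is the fraction of pages citing `a` that also cite `b`. The slides give no formula, so this estimate, the function names, and the sample link data are assumptions for illustration only.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(outlinks):
    """Count citations and co-citations from a page -> outgoing-links map.

    Returns (cited, cocited): how many pages link to each URL, and how
    many pages link to both URLs of each ordered pair.
    """
    cited = Counter()
    cocited = Counter()
    for targets in outlinks.values():
        unique = sorted(set(targets))  # ignore repeated links on one page
        cited.update(unique)
        for a, b in combinations(unique, 2):
            cocited[(a, b)] += 1
    return cited, cocited


def conditional_cocitation(a, b, cited, cocited):
    """Estimate P(page cites b | page cites a) from the counts (assumed form)."""
    if cited[a] == 0:
        return 0.0
    key = (a, b) if a <= b else (b, a)
    return cocited[key] / cited[a]


# Hypothetical crawl fragment: three pages and the URLs they link to.
pages = {
    "p1": ["u1", "u2", "u3"],
    "p2": ["u1", "u2"],
    "p3": ["u1"],
}
cited, cocited = cocitation_counts(pages)
```

Under this reading the score is asymmetric: in the sample data every page that cites `u2` also cites `u1`, but only two of the three pages citing `u1` cite `u2`.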

Dataset: Co-occurrence graph built from a complete Wikipedia dump. Co-citation graph built from a crawl of over 10 million pages and over 85 million hyperlinks.

Some results: topical anchor experiments (http://tinyurl.com/topicalanchors)

Some results: web co-citation graph

Some results: endorsed hyperlink graph

Some questions: innate macro characteristics of co-occurrence graphs; concept formation from instances of co-occurrences; multipartite clustering.

Multi-agent optimization: query optimization in stream grids; distributed index design under arbitrary constraints (churn, load, symmetry, etc.).

Thank you