Computing & Information Sciences Kansas State University Boulder, Colorado First International Conference on Weblogs And Social Media (ICWSM-2007) Structural.

Presentation transcript:

Structural Link Analysis from User Profiles and Friends Networks: A Feature Construction Approach
William H. Hsu, Joseph Lancaster, Martin S. R. Paradesi, Tim Weninger
Laboratory for Knowledge Discovery in Databases, Computing & Information Sciences, Kansas State University
First International Conference on Weblogs and Social Media (ICWSM-2007), Boulder, Colorado
Monday, 26 March 2007

Link Analysis in Social Networks: The K-State Corpus

Outline
- Background, Related Work and Rationale
- Technical Objective: Link Mining in Social Networks
- Methodology: Graph Feature Extraction
- Experimental Results: K-State LJMiner Corpus
- Continuing Work: Statistical Relational Models

Problem Statement: Link Mining in Social Networks
Problem Definition
- Given: records of users of a weblog or social network service
- Discover:
  - Features of entities: users, communities
  - Relationships: friendship, membership, moderatorship
  - Explanations and predictions for relationships
Goals
- Boost precision and recall of link existence prediction
- Find relevant features
Significance: Recommendations (Friendship, Membership)
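
The stated goal of boosting precision and recall for link existence prediction can be made concrete with a small sketch. This is only an illustration under assumptions: the function name `precision_recall` and the toy user names are hypothetical and do not come from the LJMiner system or corpus. A link is treated as a directed pair (u, v).

```python
def precision_recall(predicted_links, actual_links):
    """Precision and recall for predicted directed links (u, v).

    precision = correctly predicted links / all predicted links
    recall    = correctly predicted links / all links that really exist
    """
    predicted = set(predicted_links)
    actual = set(actual_links)
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall


# Hypothetical toy data, not taken from the LJMiner corpus.
predicted = [("alice", "bob"), ("alice", "carol"), ("bob", "dave")]
actual = [("alice", "bob"), ("bob", "dave"), ("carol", "dave"), ("dave", "alice")]
p, r = precision_recall(predicted, actual)
```

Here two of the three predicted links exist, and two of the four existing links were predicted, so a predictor can trade one measure against the other by predicting more or fewer links.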

Related Work: Link Mining
- Getoor and Diehl (2005): Graphical model representations of link structure
- Ketkar et al. (2005): Data mining techniques vs graph-based representation
- Sarkar & Moore (2005): Change in link structure across discrete time steps
- Popescul & Ungar (2003): ER model to predict links
- Hill (2003), Bhattacharya & Getoor (2004): Statistical Relational Learning to resolve identity uncertainty
- Resig et al. (2004): Predicting IM online times using friends graph degree
- McCallum et al. (2005): Inferring roles and topic categories based on link analysis

Rationale
Limitations of Current State of the Art
- Do not take graph features into account
- Limited ability to select, extract features
Novel Contribution: Link Mining System
- Extracts, computes features of network model
- Towards dependent types for relational link mining
Rationale
- Desired functionality: infer new links from old
- Evaluation: precision, recall for link existence

K-State Test Bed: LJMiner Corpus
- User Contact Info
- User Interest, Schools, Friends
- Community Membership Info

LiveJournal Topology [1]: Tools and Security Model
[Figure credits: LJMindMap.com; © 2004 mcfnord; © 2007 Denga, Inc.]

LiveJournal Topology [2]: Definitions

Graph Features [1]: Node, Pair, Link-Dependent
Node-Dependent Features: specific to one node (vertex) within candidate pair (u, v)
- Indegree(u): "Source popularity"
- Outdegree(u): "Source fertility"
- Indegree(v): "Target popularity"
- Outdegree(v): "Target fertility"
Pair-Dependent Features: specific to one candidate pair of nodes (vertices)
- Common entities: interests, friends, schools, etc.
- Attributes of common entities
- Computed from relational query on entities u, v
Link-Dependent Features: specific to one link (edge) in directed graph
- Past, predicted duration
- Diagnosed cause
- Computed and stored with relationship set
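
A minimal sketch of how node-dependent and pair-dependent features might be computed for one candidate pair (u, v) in a directed friends graph. The function names, the dictionary keys, and the toy data are assumptions made for illustration, not the LJMiner implementation. "Common friends" is taken here to mean users that both u and v list as friends.

```python
from collections import defaultdict


def node_degrees(edges):
    """Count incoming and outgoing links for every node in a directed graph."""
    indegree = defaultdict(int)
    outdegree = defaultdict(int)
    for source, target in edges:
        outdegree[source] += 1
        indegree[target] += 1
    return indegree, outdegree


def pair_features(u, v, edges, interests):
    """Node- and pair-dependent features for the candidate pair (u, v)."""
    indegree, outdegree = node_degrees(edges)
    friends = defaultdict(set)  # friends[a] = users that a lists as friends
    for source, target in edges:
        friends[source].add(target)
    return {
        "indegree_u": indegree[u],    # "source popularity"
        "outdegree_u": outdegree[u],  # "source fertility"
        "indegree_v": indegree[v],    # "target popularity"
        "outdegree_v": outdegree[v],  # "target fertility"
        "common_friends": len(friends[u] & friends[v]),
        "common_interests": len(interests.get(u, set()) & interests.get(v, set())),
    }


# Hypothetical toy graph: each edge (a, b) means "a lists b as a friend".
edges = [("u", "a"), ("u", "b"), ("v", "a"), ("a", "v"), ("b", "u")]
interests = {"u": {"chess", "python"}, "v": {"python", "jazz"}}
features = pair_features("u", "v", edges, interests)
```

The resulting dictionary is one feature vector for one candidate pair; a classifier for link existence would be trained on many such vectors. Link-dependent features such as duration are not covered, since they require data stored with each relationship.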

Graph Features [2]: Node and Pair Features in LJMiner
[Tables: Graph Features; Interest-Related Features]

LJCrawler
System Design
- Data acquisition: client, injector, parser
- Ancillary issues
  - Multi-threading
  - Distribution
  - Storage
- Analytical postprocessing: LJClipper, LJStats
Distinguishing features of LJCrawler
Results
- 200 users/second maximum, 5 users/second allowed
- Approximately 2 million pages crawled
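
The gap between what the crawler can do (200 users/second) and what is allowed (5 users/second) implies some form of throttling. The sketch below shows one simple way to space out requests; it is not LJCrawler's actual design. `Throttle`, `crawl`, and the `fetch` callback are hypothetical names, and the clock and sleep functions are injectable only so the behaviour can be checked without waiting.

```python
import time


class Throttle:
    """Spaces out calls so that at most `rate` of them start per second."""

    def __init__(self, rate, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rate
        self.clock = clock
        self.sleep = sleep
        self.next_allowed = None  # earliest time the next call may start

    def wait(self):
        now = self.clock()
        if self.next_allowed is not None and now < self.next_allowed:
            # Too early: sleep until the scheduled slot.
            self.sleep(self.next_allowed - now)
            now = self.next_allowed
        self.next_allowed = now + self.min_interval


def crawl(user_ids, fetch, throttle):
    """Fetch one page per user, never faster than the throttle allows."""
    pages = []
    for user_id in user_ids:
        throttle.wait()
        pages.append(fetch(user_id))
    return pages
```

With `Throttle(5)`, consecutive fetches start at least 0.2 seconds apart, which matches the allowed rate quoted above. A multi-threaded crawler would also need a lock around `wait`, which this sketch omits.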

Network Statistics: Graph Distance
[Plots: 1000 nodes; 4000 nodes]
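
Graph distance between two users is commonly taken to be the length of the shortest directed path between them. The slide does not say how it was computed, so the breadth-first search below is an assumed, generic approach, and `graph_distance` and the toy graph are hypothetical.

```python
from collections import deque


def graph_distance(adjacency, source, target):
    """Shortest number of directed links from source to target, or None."""
    if source == target:
        return 0
    visited = {source}
    queue = deque([(source, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor == target:
                return dist + 1
            if neighbor not in visited:
                visited.add(neighbor)
                queue.append((neighbor, dist + 1))
    return None  # target is unreachable from source


# Hypothetical toy friends graph: adjacency[a] lists the friends of a.
friends = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": ["e"], "e": []}
forward = graph_distance(friends, "a", "e")
backward = graph_distance(friends, "e", "a")
```

Because friendship links are directed, the distance from one user to another can differ from the distance in the opposite direction, and may not exist at all.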

Interpretation of Results
- 941-node graph (Hsu et al., 2006): LJCrawler v1 output
- node graphs: LJCrawler v2 output

Results
Establishing an Interdisciplinary Research Initiative
- K-State / KU / UNL collaboration
- Resources: Linguistic Data Consortium
- NIST evaluations
Involving End Users of Machine Translation
- Document users
- Machine learning, data mining, info extraction researchers
Novel Applications
- Social networks and collaborative recommendation
- Gisting and beyond

Continuing Work
Information Extraction and Intelligent IR
- Learning models for IE: ontologies
- Latent semantic analysis
Machine Learning
- Natural language learning
- Time series learning and understanding
- Relational and first-order models
Automated Reasoning
- Probabilistic
- Case-based and analogical
Data Mining and Warehousing
Grid Computing

Acknowledgements
K-State Lab for Knowledge Discovery in Databases
- Vikas Bahirwani
- Tejaswi Pydimarri
- Andrew King
Social Networks, Graph Theory, Graph Algorithms
- Kirsten Hildrum (IBM T. J. Watson Labs)
- Todd Easton (K-State, Industrial and Manufacturing Systems Engineering)
Machine Learning
- Dan Roth, Cinda Heeren, Jiawei Han (University of Illinois at Urbana-Champaign)
- AnHai Doan (University of Wisconsin-Madison)

Questions and Discussion