1
Dynamic Topic Modeling for Spatiotemporal Event Extraction: Probabilistic Approaches and The Dim Sum Process
University of Kansas Center for Business Analytics Research (CBAR) Seminar
Wednesday, 04 September 2013
William H. Hsu
Laboratory for Knowledge Discovery in Databases, Computing & Information Sciences, Kansas State University
http://www.kddresearch.org
Acknowledgements
Kansas State: Wesam Elshamy, Ming Yang, Surya Teja Kallumadi, Majed Alsadhan
Illinois: Chengxiang Zhai, Jiawei Han, Kevin Chang, Dan Roth
iQGateway: Praveen Koduru, Krishna Kumar Vallyatodi
2
Motivation: Thematic Mapping [1]: Summarizing News from the Web
http://fingolfin.user.cis.ksu.edu/timemap2gs
Based on: NLP Group NER Toolkit © 2005-2010 Stanford University; Simile © 2003-2010 Massachusetts Institute of Technology; Google Maps © 2007-2010 Tele Atlas, Inc. and Google, Inc.
3
Motivation: Thematic Mapping [2]: HealthMap
http://healthmap.org © 2006 – 2013 Brownstein, J. & Freifeld, C.
5
Motivation: Thematic Mapping [4]: TextMap & Topic Models
© 2011 – 2012 TextMap.org
6
Motivation: Thematic Mapping [5]: Existing Systems & Limitations
Volkova, S., Caragea, D., Hsu, W. H., Drouhard, J., & Fowles, L. (2010). Boosting Biomedical Entity Extraction by using Syntactic Patterns for Semantic Relation Discovery. Proceedings of the 2010 IEEE/WIC/ACM International Conference on Web Intelligence (WI 2010).
See also: Volkova, S. (2010). Entity Extraction, Animal Disease-related Event Recognition and Classification from Web. M.S. thesis, Kansas State University.
7
Outline
Simultaneous Topic Enumeration & Formation
Topic Modeling: Static (Atemporal) to Dynamic
Continuous Time vs. Variable Number of Topics
Dim Sum Process for Hybrid STEF
Dynamic Topic Modeling Test Bed
News Monitoring: Geotagging & Timelines
Recent Results
STEF & Heterogeneous Info Network Analysis
8
Timeline Formation: General Task Illustrated
Elshamy (2012)
9
Simultaneous Topic Enumeration & Formation (STEF)
Adapted from Elshamy (2012)
Time t: 3 extant topics
Time t + k: 2 extant topics
11
Topic Modeling [1]: Basic Task (Static)
Elshamy (2012)
12
Topic Modeling [2]: Understanding Plate Notation
Adapted from Elshamy (2012)
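Plate notation can be read as nested loops: each plate stands for a loop over a repeated set of variables. The sketch below assumes a latent Dirichlet allocation (LDA) style model, which the slide itself does not name. The function names, the symmetric hyperparameters `alpha` and `beta`, and the toy vocabulary are illustrative assumptions, not the model from Elshamy (2012).

```python
import random


def sample_dirichlet(alphas, rng):
    """Draw one Dirichlet sample by normalizing independent Gamma draws."""
    draws = [rng.gammavariate(a, 1.0) for a in alphas]
    total = sum(draws)
    return [d / total for d in draws]


def sample_categorical(probs, rng):
    """Return an index drawn according to the given probabilities."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def generate_corpus(num_docs, doc_len, vocab, num_topics, alpha, beta, seed=0):
    """Generate a toy corpus; each loop corresponds to one plate."""
    rng = random.Random(seed)
    # Plate over K topics: one word distribution per topic.
    topic_word = [
        sample_dirichlet([beta] * len(vocab), rng) for _ in range(num_topics)
    ]
    corpus = []
    # Plate over D documents: one topic mixture per document.
    for _ in range(num_docs):
        doc_topic = sample_dirichlet([alpha] * num_topics, rng)
        words = []
        # Plate over N word positions, nested inside the document plate.
        for _ in range(doc_len):
            z = sample_categorical(doc_topic, rng)       # topic assignment
            w = sample_categorical(topic_word[z], rng)   # observed word
            words.append(vocab[w])
        corpus.append(words)
    return corpus
```

For example, `generate_corpus(3, 5, ["virus", "cattle", "map", "topic"], 2, 0.5, 0.5)` returns three documents of five words each. Only the words would be observed in practice; the topic mixtures and assignments are the latent variables that inference has to recover.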
13
Topic Modeling [3]: Hyperparameters (Another Model)
Adapted from Elshamy (2012)
15
Continuous Time vs. Variable Number of Topics
Elshamy (2012)
State of the Field
Goal
16
Events from Text: Markov Model for Topic Detection & Tracking
Adapted from Elshamy (2012)
18
Continuous-time Dynamic Topic Model (cDTM)
Elshamy (2012)
19
Discrete Time Online Hierarchical Dirichlet Process (oHDP)
Elshamy (2012)
20
Continuous-time Infinite Dynamic Topic Model (CIDTM)
Elshamy (2012)
22
HealthMap Redux: Thematic Mapping, Health Informatics, & Epidemiology
http://healthmap.org © 2006 – 2013 Brownstein, J. & Freifeld, C.
24
Thematic Mapping Tasks [1]: Entities
Example: CNN, 2007 Foot-and-Mouth Disease (http://bit.ly/3gof6o)
Tests have confirmed a second foot-and-mouth outbreak in southern England, the government announced, raising fears that the highly contagious animal virus is spreading. Chief Veterinary Officer Debby Reynolds said Tuesday that tests showed a herd of cattle had been infected. The animals were culled Monday evening after showing signs of the disease.
Update Summarization
A second foot-and-mouth disease infection in a herd of cattle in southern England was responded to by culling on Monday evening and announced by Debby Reynolds on Tuesday. (Second since earlier report, hence “update”.)
Compare: Recognizing Textual Entailment
A foot-and-mouth disease infection was reported the day after culling. (True.)
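The entity task on this slide can be illustrated with a minimal sketch that tags disease, location, person, and time mentions in the CNN passage. It uses small hand-made gazetteers and regular expressions, which are assumptions made purely for illustration; it is not the NER toolkit or the extraction pipeline used in the talk, and a real system would rely on a trained recognizer.

```python
import re

# Passage quoted on the slide (CNN, 2007).
PASSAGE = (
    "Tests have confirmed a second foot-and-mouth outbreak in southern England, "
    "the government announced, raising fears that the highly contagious animal "
    "virus is spreading. Chief Veterinary Officer Debby Reynolds said Tuesday "
    "that tests showed a herd of cattle had been infected. The animals were "
    "culled Monday evening after showing signs of the disease."
)

# Hypothetical toy gazetteers, one per entity type.
GAZETTEERS = {
    "DISEASE": ["foot-and-mouth"],
    "LOCATION": ["southern England"],
    "PERSON": ["Debby Reynolds"],
    "TIME": ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"],
}


def extract_entities(text):
    """Return (surface form, label) pairs in order of appearance."""
    mentions = []
    for label, terms in GAZETTEERS.items():
        for term in terms:
            pattern = r"\b" + re.escape(term) + r"\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                mentions.append((match.start(), match.group(0), label))
    # Sort by character offset so output follows the text order.
    return [(surface, label) for _, surface, label in sorted(mentions)]


for surface, label in extract_entities(PASSAGE):
    print(f"{label:8s} {surface}")
```

The tagged mentions are the raw material for the update summary: the disease, the place, the announcing official, and the two weekdays are exactly the slots that the summary sentence fills. Resolving "Monday" and "Tuesday" to calendar dates, and deciding which event each one belongs to, is left out of this sketch.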
25
Thematic Mapping Tasks [2]: Aspects
© 2008 C. Zhai, University of Illinois
http://sifaka.cs.uiuc.edu/ir/
26
Thematic Mapping Tasks [3]: Location & Disambiguation
Current off-the-shelf applications run into ambiguity problems.
© 2008 W. Elshamy
27
Thematic Mapping Tasks [4]: Time & Timelines
Search phrase: “smallpox”
© 2007 – 2009 Google, Inc.
28
Thematic Mapping Tasks [5]: Timeline Reconstruction
Murphy, Hsu, Elshamy, Kallumadi, & Volkova (2012)
30
Recent Results [1]: Meth Lab Mapping
Hsu, Abduljabbar, Osuga, Lu, & Elshamy (2012)
31
Recent Results [2]: Visual Analytics
Hsu, Abduljabbar, Osuga, Lu, & Elshamy (2012)
32
Recent Results [3]: Topic Proportions
Hsu, Abduljabbar, Osuga, Lu, & Elshamy (2012)
34
Sentiment Analysis Tasks: Polarity
http://dslreports.com © 1999 – 2012 dslreports.com
35
Aggregation & OLAP: Wikipedia Infobox as Fact Table
Infobox: Albert Einstein © 2001 – 2010 Wikimedia Foundation
Q: Where can this information be found?
A: It depends…
How much formatting does source page have?
Marked up? (Machine-readable?)
Semantically rich markup?
Albert Einstein © 2001 – 2010 Wikimedia Foundation
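When the source page is marked up, turning an infobox into a fact-table row can be close to mechanical. The following sketch parses `| key = value` lines from infobox-style wikitext into a dictionary. The sample text is hand-written for illustration rather than copied from the live article, and the helper name `infobox_to_row` is hypothetical; real infoboxes contain nested templates and links that this simple pattern does not handle.

```python
import re

# Hand-written sample in the style of infobox wikitext (illustrative only).
SAMPLE_INFOBOX = """{{Infobox scientist
| name        = Albert Einstein
| birth_date  = 14 March 1879
| birth_place = Ulm
| fields      = Physics
}}"""

# One "| key = value" field per line; spaces and tabs around the parts are ignored.
FIELD_PATTERN = re.compile(
    r"^\|[ \t]*(\w+)[ \t]*=[ \t]*(.*?)[ \t]*$", flags=re.MULTILINE
)


def infobox_to_row(wikitext):
    """Map each infobox field name to its raw string value."""
    return {key: value for key, value in FIELD_PATTERN.findall(wikitext)}


row = infobox_to_row(SAMPLE_INFOBOX)
for key, value in row.items():
    print(f"{key}: {value}")
```

Each key plays the role of a column and each article the role of a row, which is what makes aggregation across many infoboxes possible. Pages without such markup need the information extraction steps covered earlier in the talk before the same rows can be built.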
36
Opinion Mapping Example [1]: Health Blogs on Chronic Disease
37
Opinion Mapping Example [2]: New Entities & Relationships
38
Opinion Mapping Example [3]: Polarity
http://twitrratr.com/search/EuroHCIR © 2012 Twitrratr
39
Opinion Mapping Example [4]: Aims & Approach
Aim 1 – Extend Algorithms to Detect New:
Entities: Diseases, Treatments, Complications
Relationships: Adverse Reactions, Controversies
Aim 2 – Domain-Specific Ontology
Symptoms, Disease Attributes
Treatments, Complications
Comparisons
Aim 3 – Better Recognition of Scope, Polarity
40
User Groups: Goals & Primary Use Cases
Goal: Thematic Opinion Map (Choropleth, etc.)
User Groups
Experienced: policymakers, health professionals
Individual stakeholders: patients, activists, voters
Primary Use Case: Infographics as IE Views
http://bit.ly/fu04zf © 2011 Mediabistro
Are Germans really the happiest Twitter users by country, and Tennesseans by U.S. state?