Marko Grobelnik Jasna Škrbec Jozef Stefan Institute Social Context as a part of News-Archive-Explorer Web application for exploratory browsing of news.

Presentation transcript:

1 Marko Grobelnik, Jasna Škrbec (Jozef Stefan Institute)
Social Context as a part of News-Archive-Explorer
Web application for exploratory browsing of news streams and archives

2 Introduction
- News publishers generate content archives
- The goal is to build a system that makes such archives usable through text mining & visualization
- Archive characteristics:
  - Large corpora (up to a few million articles)
  - Rich metadata (specific to each archive)
  - Different input formats (XML structure)
  - Poor search interfaces (not specialized for archives)

3 What do we want?
An application to…
- help the user search and browse through archives
- help the user read more about topics related to the search
- visualize how things are connected in time, place, stories, etc.
- draw the user's attention to other related issues
- tell more about the searched content

4 Architecture
[Diagram labels: Archive, Preprocessing (Enrycher), SQL Server, Server side, Client side]

5 Database model

6 Already done
- Import archive XML files
  - New York Times archive (15M articles)
  - NYTimes LDC (1.7M articles)
  - Nature (300k articles), Reuters (830k articles)
- Server side
  - Import to database (PostgreSQL)
  - Preprocessed with Enrycher
- Client side
  - Faceted search interface (author, entity, keyword, publish date, category)
  - Showing context around searched content/article
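The import and faceted-search steps on this slide can be sketched roughly as follows. This is a minimal in-memory illustration, not the system's actual code: the XML element names (`article`, `author`, `category`, `keyword`, `published`) are hypothetical, since each archive has its own schema, and a plain Python list stands in for the PostgreSQL database.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical archive snippet; real archives each use their own XML schema.
SAMPLE_XML = """
<archive>
  <article id="1">
    <title>Budget talks resume</title>
    <author>A. Smith</author>
    <category>Politics</category>
    <published>1996-03-01</published>
    <keyword>budget</keyword>
    <keyword>congress</keyword>
  </article>
  <article id="2">
    <title>Election campaign opens</title>
    <author>B. Jones</author>
    <category>Politics</category>
    <published>1996-09-15</published>
    <keyword>election</keyword>
  </article>
  <article id="3">
    <title>New comet observed</title>
    <author>A. Smith</author>
    <category>Science</category>
    <published>1997-01-20</published>
    <keyword>astronomy</keyword>
  </article>
</archive>
"""


def parse_articles(xml_text):
    """Turn archive XML into a list of flat article records."""
    root = ET.fromstring(xml_text.strip())
    articles = []
    for node in root.iter("article"):
        articles.append({
            "id": node.get("id"),
            "title": node.findtext("title", default=""),
            "author": node.findtext("author", default=""),
            "category": node.findtext("category", default=""),
            "published": node.findtext("published", default=""),
            "keywords": [k.text for k in node.findall("keyword")],
        })
    return articles


def _values(article, facet):
    """Return the facet value(s) of an article as a list."""
    value = article[facet]
    return value if isinstance(value, list) else [value]


def facet_counts(articles, facet):
    """Count how many articles fall under each value of a facet."""
    counts = Counter()
    for article in articles:
        counts.update(_values(article, facet))
    return counts


def filter_by(articles, **selected):
    """Keep only articles matching every selected facet value."""
    return [
        a for a in articles
        if all(wanted in _values(a, facet) for facet, wanted in selected.items())
    ]


articles = parse_articles(SAMPLE_XML)
print(facet_counts(articles, "category"))
print([a["title"] for a in filter_by(articles, author="A. Smith", category="Science")])
```

In a database-backed version, the facet counts would more likely come from grouped queries than from Python loops; the sketch only shows the shape of the data and the narrowing logic.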

7 Current version of the GUI

8 Showing relationships between entities

9 Plans for the future
- Improve search (narrowing criteria, suggestions)
- Add visualizations that show content in time, space, and other contexts
- Add links to similar content (stories)
- Add links to outside resources (such as DBpedia), or bring these resources into the application
- Integrate with tools developed in AILab to improve search and presentation of articles (SearchPoint, DocAtlas, …)
- Improve usability and appearance of the user interface

10 Topic landscape of the query “Clinton” from Reuters news 1996-1997
[Screenshot labels: Query, Search Results, Topic Map, Selected group of news, Selected story]

11 Visualization of social relationships between “Clinton” and other entities
[Screenshot labels: Query, Named entities in relation]
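The slides do not say how relationships between entities are derived. One common and simple approach, assumed here purely for illustration, is to count how often two named entities appear in the same article. The entity lists below are made-up sample data, and `cooccurrence` and `related_to` are hypothetical helper names.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(articles_entities):
    """Count, for each unordered entity pair, the articles mentioning both."""
    pairs = Counter()
    for entities in articles_entities:
        # Deduplicate and sort so each pair has one canonical ordering.
        for a, b in combinations(sorted(set(entities)), 2):
            pairs[(a, b)] += 1
    return pairs


def related_to(pairs, entity):
    """Rank other entities by how often they co-occur with `entity`."""
    related = Counter()
    for (a, b), n in pairs.items():
        if a == entity:
            related[b] += n
        elif b == entity:
            related[a] += n
    return related.most_common()


# Made-up entity annotations, one list per article.
sample = [
    ["Clinton", "Dole"],
    ["Clinton", "Yeltsin", "NATO"],
    ["Clinton", "Dole", "Congress"],
]
pairs = cooccurrence(sample)
print(related_to(pairs, "Clinton"))
```

The ranked list could then drive a graph view in which edge weight reflects the co-occurrence count.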

12 Topic Trends Tracking of the documents including “Clinton”
[Screenshot labels: Query, Topic Trends Visualization, Topics description (US Elections, US Budget, Mid-East conflict, NATO-Russia), Result set]
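A trend view like this one boils down to counting matching documents per topic and time bucket. The sketch below assumes each document already carries a topic label (how the real system assigns topics is not stated in the slides) and uses made-up sample records. `topic_trends` is a hypothetical helper name.

```python
from collections import Counter, defaultdict


def topic_trends(docs, query):
    """For documents containing `query`, count documents per topic and month.

    Each doc is a dict with 'date' (YYYY-MM-DD), 'topic', and 'text' keys.
    Returns {topic: [(YYYY-MM, count), ...]} with months in ascending order.
    """
    trends = defaultdict(Counter)
    needle = query.lower()
    for doc in docs:
        if needle in doc["text"].lower():
            month = doc["date"][:7]  # keep only the YYYY-MM prefix
            trends[doc["topic"]][month] += 1
    return {topic: sorted(counts.items()) for topic, counts in trends.items()}


# Made-up sample documents; topic labels are assumed to be precomputed.
docs = [
    {"date": "1996-10-02", "topic": "US Elections", "text": "Clinton leads in polls"},
    {"date": "1996-10-20", "topic": "US Elections", "text": "Clinton and Dole debate"},
    {"date": "1996-11-06", "topic": "US Elections", "text": "Clinton wins second term"},
    {"date": "1997-02-10", "topic": "US Budget", "text": "Clinton submits budget plan"},
    {"date": "1997-02-11", "topic": "US Budget", "text": "Congress reviews spending"},
]
print(topic_trends(docs, "Clinton"))
```

Each topic's list of monthly counts can be plotted as one line or band in a trend chart.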

13 WW2 query “Pearl Harbor” in the NYTimes archive: Dec 7th 1941

14 WW2 query “Belgrade” in the NYTimes archive: Apr 6th 1941

15 WW2 query “Normandy” in the NYTimes archive: June 1944

