JCDL 2011 Report Kazunari Sugiyama WING meeting 19th August, 2011.

Presentation transcript:

1 JCDL 2011 Report Kazunari Sugiyama WING meeting 19th August, 2011

2 Outline of JCDL11 Venue – Ottawa, Ontario, Canada

4 Outline of JCDL11
- Acceptance ratio: 23.5% [57 / 243]
  - Submitted papers: 243
  - Accepted papers: 57 (28 full papers and 29 short papers)
- Future JCDL
  - 2012: Washington DC, US
  - 2013: Indianapolis, US
  - 2014: ??? (candidates: Argentina, Italy, UK)
- Joint with TPDL (Theory and Practice of Digital Libraries)
  - “ECDL” -> “TPDL” (since 2011)
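As a quick check of the acceptance figures on this slide, the ratio can be recomputed from the paper counts. This is only a sketch; the variable names are illustrative and not from the slides.

```python
# Counts taken from the slide; variable names are illustrative.
submitted = 243
accepted_full = 28
accepted_short = 29

accepted = accepted_full + accepted_short  # 57 accepted papers in total
ratio = accepted / submitted               # fraction of submissions accepted

print(f"{accepted} / {submitted} = {ratio:.1%}")  # prints "57 / 243 = 23.5%"
```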

5 Research Topics in JCDL
- Collaborative and participatory information environments
- Cyberinfrastructure architectures, applications, and deployments
- Data mining/extraction of structure from networked information
- Digital library and Web Science curriculum development
- Distributed information systems
- Evaluation of online information environments
- Impact and evaluation of digital libraries and information in education
- Information and knowledge systems
- Information policy and copyright law
- Information visualization
- Interfaces to information for novices and experts
- Personal digital information management
- Retrieval and browsing
- Scientific data curation, citation and scholarly publication
- Social networks, virtual organizations and networked information
- Social-technical perspectives of digital information
- Studies of human factors in networked information
- Systems, algorithms, and models for data preservation
- Theoretical models of information interaction and organization
- User behavior and modeling
- Visualization of large-scale information environments
(Cited from “

6 Presented Research Topics
- Content analysis (18 papers): information extraction, plagiarism detection, topic coherence, etc.
- Education
- Information policy, rights
- Infrastructure
- Interfaces
- Metadata, Annotation (8 papers)
- Mobile applications
- Preservation, Archive
- User’s information needs (8 papers)
- Visualization
- WWW
“Measuring Historical Word Sense Variation”

7 Content Analysis “Measuring Historical Word Sense Variation” David Bamman and Gregory Crane (Tufts University, US)

8 Outline
- Automatically classify Latin word senses
- Track the historical variation of these senses over a span of more than 2,000 years
  - Example: “radical” in the “Oxford English Dictionary”
    (1) Political meaning ( ): “advocating thorough or far-reaching political or social reform”
    (2) Slang term ( ): “Excellent, fantastic”
- Dataset
  - 83,892 words from the aligned parallel corpus
  - Manually annotated sample of 525 words

9 Proposed Approach
- Constructing Latin corpus
- Inducing Latin senses in English
- Word sense disambiguation
- Tracking sense variation over 2,000 years

10 Constructing Latin Corpus
- Collect Latin books from Internet Archive
  - 7,055 books
  - 389 million words
[Figure labels: Invention of printing press, Renaissance, “Industrial Revolution”]
(Cited from D. Bamman and G. Crane: “Measuring Historical Word Sense Variation,” JCDL2011)

11 Inducing Latin Senses
- Latin-English parallel texts: 129 translation book pairs, 40,323 sentence pairs
- MGIZA / GIZA++: alignments at the level of individual words
- Clean alignments for 504,857 words
- Aggregate English translations for each Latin lemma
- 109,432 Latin-English translation pairs (training set for word sense disambiguation)
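The aggregation step can be sketched as follows: given word-level alignments, count the English translations seen for each Latin lemma. This is a minimal illustration, not the authors' implementation; the function name `aggregate_translations` and the toy alignment pairs are assumptions made for the example.

```python
from collections import Counter, defaultdict


def aggregate_translations(aligned_pairs):
    """Count English translations per Latin lemma.

    aligned_pairs: iterable of (latin_lemma, english_word) tuples, as would
    come out of a word aligner (hypothetical input format).
    """
    senses = defaultdict(Counter)
    for latin_lemma, english_word in aligned_pairs:
        senses[latin_lemma][english_word] += 1
    return senses


# Toy alignments, invented for illustration only.
toy_pairs = [
    ("oratio", "speech"),
    ("oratio", "prayer"),
    ("oratio", "speech"),
    ("rex", "king"),
]

inventory = aggregate_translations(toy_pairs)
for lemma, translations in inventory.items():
    print(lemma, dict(translations))
```

Each lemma's set of English translations then serves as its candidate sense inventory.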

12 Word Sense Disambiguation
- Classifiers
  - Language model classifier: trained on uni-gram, bi-gram, 5-gram, 6-gram
  - Naïve Bayes: trained on uni-gram
  - TF-IDF: uni-gram
  - K-nearest neighbor
- Features
  - 20 words around each target word
- Baseline
  - Simply select the most frequent sense from the lexicon
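The sketch below illustrates three pieces of this setup under stated assumptions: a context window around the target word, the most-frequent-sense baseline, and a unigram Naïve Bayes classifier with add-one smoothing. The slides do not say how the 20 context words are split, how smoothing is done, or how the lexicon is stored, so the 10-words-per-side window, the smoothing, the use of labelled examples in place of a lexicon, and all names and toy data here are assumptions.

```python
import math
from collections import Counter, defaultdict


def context_window(tokens, index, half_width=10):
    """Up to half_width tokens on each side of tokens[index] (assumed split)."""
    left = tokens[max(0, index - half_width):index]
    right = tokens[index + 1:index + 1 + half_width]
    return left + right


def most_frequent_sense(labelled_examples):
    """Baseline: always predict the sense seen most often."""
    counts = Counter(sense for _, sense in labelled_examples)
    return counts.most_common(1)[0][0]


class UnigramNaiveBayes:
    """Unigram Naive Bayes with add-one smoothing (smoothing is an assumption)."""

    def fit(self, labelled_examples):
        self.sense_counts = Counter()
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for context, sense in labelled_examples:
            self.sense_counts[sense] += 1
            self.word_counts[sense].update(context)
            self.vocab.update(context)
        return self

    def predict(self, context):
        total = sum(self.sense_counts.values())
        vocab_size = len(self.vocab)
        best_sense, best_score = None, -math.inf
        for sense, n in self.sense_counts.items():
            denom = sum(self.word_counts[sense].values()) + vocab_size
            score = math.log(n / total)  # log prior
            for word in context:
                score += math.log((self.word_counts[sense][word] + 1) / denom)
            if score > best_score:
                best_sense, best_score = sense, score
        return best_sense


# Toy training contexts for "oratio", invented for illustration.
train = [
    (["senate", "rhetoric", "orator"], "speech"),
    (["forum", "orator", "audience"], "speech"),
    (["god", "church", "altar"], "prayer"),
]
model = UnigramNaiveBayes().fit(train)
print(most_frequent_sense(train))
print(model.predict(["church", "god"]))
```

On this toy data the baseline always answers with the majority sense, while the classifier can switch senses when the context words point the other way.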

13 Tracking Sense Variation over 2,000 Years
- Apply 6-gram language model classifier to 389 million words
- Latin word “oratio” (“prayer,” “speech” in English)
[Figure labels: “prayer” (16.7%), “speech” (83.3%); “prayer” (80.0%), “speech” (20.0%); “speech”]
(Cited from D. Bamman and G. Crane: “Measuring Historical Word Sense Variation,” JCDL2011)
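Once every occurrence has a predicted sense, tracking variation amounts to computing the share of each sense per time period. The sketch below shows that calculation only. The period labels "period A" and "period B" are placeholders, since the slide text does not say which periods the percentages belong to, and the counts are toy values chosen to reproduce the same proportions.

```python
from collections import Counter, defaultdict


def sense_shares(predictions):
    """Share of each predicted sense within each period.

    predictions: iterable of (period, sense) tuples (hypothetical format).
    """
    counts = defaultdict(Counter)
    for period, sense in predictions:
        counts[period][sense] += 1
    shares = {}
    for period, sense_counts in counts.items():
        total = sum(sense_counts.values())
        shares[period] = {s: n / total for s, n in sense_counts.items()}
    return shares


# Toy predictions; period names are placeholders, not from the slides.
toy_predictions = (
    [("period A", "prayer")] * 1 + [("period A", "speech")] * 5
    + [("period B", "prayer")] * 4 + [("period B", "speech")] * 1
)

shares = sense_shares(toy_predictions)
for period, dist in shares.items():
    print(period, {s: f"{p:.1%}" for s, p in dist.items()})
```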

14 Doctoral Consortium
- 10 students: Germany (1), Norway (2), Portugal (1), UK (1), US (5)
- Topics
  - Metadata
  - IR and NLP in semi-structured data
  - Applications
  - Information discovery
- Travel support for students