Which Log for which Information? Gathering Multilinguality Data from Different Log File Types Maria Gäde, Vivien Petras, and Juliane Stiller Humboldt-Universität.

Presentation transcript:

Which Log for which Information? Gathering Multilinguality Data from Different Log File Types Maria Gäde, Vivien Petras, and Juliane Stiller Humboldt-Universität zu Berlin CLEF 2010 Padova, 21 September 2010

2 / 16 Premise
Assume you are building a multilingual digital library and could log every user action with particular consideration for multilingual activities.
- Which questions could one ask?
- (Which questions cannot be answered by logging?)
Outline:
- Europeana
- Log file types
- Logging multilingual information
- Europeana ClickStreamLogger

3 / 16 Europeana
- 1,000+ content providers
- Portal + APIs Services
September 2010:
- 7.8 million images
- 4.6 million texts
- 127,000 videos
- 68,000 sounds
“A digital library that is a single, direct and multilingual access point to the European cultural heritage.” (European Parliament, 27 September 2007)

4 / 16 Multilingual Europeana
- Interface
- Search
- Browse
- Results

5 / 16 Multilingual Europeana

6 / 16 Log File Types
Example Apache web server log:
[11/Mar/2010:09:42: ] "GET /cache/image/?uri= thumb/0098/ jpg&size=BRIEF_DOC&type=IMAGE HTTP/1.0" " doc.html?start=1&view=table&query=italy" "Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.9.2) Gecko/ Firefox/3.6 (.NET CLR )"
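The Apache entry above already carries two pieces of multilingual evidence: the query in the referrer URL (query=italy) and the locale token in the user agent (it). The following is a minimal sketch of how such a line could be parsed, assuming the standard Apache "combined" log format. Because the entry on the slide is truncated, SAMPLE is a made-up complete line (host, timestamp, status, size, and referrer domain are all invented for illustration), and parse_line is a hypothetical helper, not part of any Europeana tooling.

```python
import re
from urllib.parse import urlparse, parse_qs

# Apache "combined" log format: host, identity, user, time, request,
# status, bytes, referrer, user agent.
COMBINED = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]+)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Older browsers put a locale token such as "it" or "en-US" in the
# parenthesised part of the user agent string.
UA_LOCALE = re.compile(r';\s*([a-z]{2}(?:-[A-Za-z]{2})?)\s*[;)]')


def parse_line(line):
    """Return a dict of log fields, or None if the line does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # The search query is only visible indirectly, via the referrer URL.
    params = parse_qs(urlparse(record["referrer"]).query)
    record["referrer_query"] = params.get("query", [None])[0]
    # The browser locale hints at the system language, not the interface language.
    locale = UA_LOCALE.search(record["agent"])
    record["browser_language"] = locale.group(1) if locale else None
    return record


# Hypothetical complete line, modelled on the truncated slide example.
SAMPLE = (
    '192.0.2.10 - - [11/Mar/2010:09:42:07 +0100] '
    '"GET /cache/image/?size=BRIEF_DOC&type=IMAGE HTTP/1.0" 200 5120 '
    '"http://www.example.org/brief-doc.html?start=1&view=table&query=italy" '
    '"Mozilla/5.0 (Windows; U; Windows NT 5.1; it; rv:1.9.2) '
    'Gecko/20100115 Firefox/3.6"'
)

if __name__ == "__main__":
    parsed = parse_line(SAMPLE)
    print(parsed["referrer_query"], parsed["browser_language"])
```

Note that nothing in such a line states which interface language the user had selected, which is the gap the following slides address.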

7 / 16 Log File Types
Example Google Analytics:
- Map overlay (IP address)
- Languages (system language)

8 / 16 Log File Types – Missing Information
Web server log (Apache):
- Interface language missing
- Certain actions cannot be distinguished (browse = search)
- Ajax / Flash actions (saved searches, tags, filter)
- Sessions must be reconstructed
Search engine log (Solr):
- Only queries
Google Analytics:
- Queries missing
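Session reconstruction from a web server log is usually done heuristically. A minimal sketch, assuming requests are keyed by a client identifier (for example IP address plus user agent) and that a gap of more than 30 minutes starts a new session. Both the key and the 30-minute threshold are common conventions chosen here for illustration, not something the slides specify, and reconstruct_sessions is a hypothetical helper.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def reconstruct_sessions(events, timeout=timedelta(minutes=30)):
    """Group (client, timestamp, url) events into sessions.

    A new session starts whenever the gap between two consecutive
    requests of the same client exceeds `timeout` (assumed heuristic).
    Returns a list of (client, [(timestamp, url), ...]) tuples.
    """
    by_client = defaultdict(list)
    for client, timestamp, url in events:
        by_client[client].append((timestamp, url))

    sessions = []
    for client, hits in by_client.items():
        hits.sort(key=lambda hit: hit[0])
        current = [hits[0]]
        for previous, hit in zip(hits, hits[1:]):
            if hit[0] - previous[0] > timeout:
                # Gap too large: close the running session, start a new one.
                sessions.append((client, current))
                current = []
            current.append(hit)
        sessions.append((client, current))
    return sessions


if __name__ == "__main__":
    day = datetime(2010, 3, 11)
    events = [
        ("client-a", day.replace(hour=9, minute=0), "/search?query=italy"),
        ("client-b", day.replace(hour=9, minute=5), "/browse"),
        ("client-a", day.replace(hour=9, minute=10), "/record/1"),
        ("client-a", day.replace(hour=10, minute=0), "/search?query=roma"),
    ]
    for client, hits in reconstruct_sessions(events):
        print(client, len(hits))
```

Clients behind a shared proxy or with changing IP addresses will be merged or split incorrectly by this kind of heuristic, which is one reason a dedicated application-level logger is attractive.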

9 / 16 Logging Multilingual Information
Stages of the interaction:
- Approaching the system / background information
- Launching queries / browsing
- Viewing results
- Interacting with the results (filter, save, tag, repeat)
Categories: User background, Interface language, Query language, Query type, Query content, Query translation, Search results, Result set views, Result translation, Query reformulation, User-generated content, Saved searches / docs

10 / 16 Logging Multilingual Information - Background
User background information:
- Country of access, system language, referrer site
Interface language:
- Change → stronger intervention

11 / 16 Logging Multilingual Information - Query
Query language:
- Query processing
- Adapting languages to system
Query type:
- Simple, advanced, fielded (e.g. language restriction)
- Pre-selected categories for browsing
Query content:
- Named entities, dates, numbers (language ambiguous)
Query translation

12 / 16 Logging Multilingual Information - Results
Search results:
- Document languages
Result set views:
- Detailed view, external click → stronger intervention
Result translation

13 / 16 Logging Multilingual Information – User Activities
Query reformulation / refinement:
- Language switch
- Filtering (language), related-item search
User-generated content:
- Language of tags
- Language of documents being tagged
Saved searches / documents:
- ???

14 / 16 Europeana ClickStreamLogger
Interface language:
- State + change for every activity
Search:
- Result numbers, distribution of results by language / country
- Filtering and related searches
Browse:
- Browsing activities + starting points
Navigation:
- Move outside Europeana
Ajax:
- Save / remove searches / tags
User management:
- Account creation etc.
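To make the logged fields concrete, here is a rough sketch of what one application-level log record for a search action might contain. It is not the actual Europeana ClickStreamLogger: the ClickEvent class, its field names, and the JSON-lines output are all assumptions made for illustration, covering only the interface language state and change, the result number, and the distribution of results by language.

```python
import json
from collections import Counter
from dataclasses import dataclass, field, asdict
from typing import Optional


@dataclass
class ClickEvent:
    """One logged user action (hypothetical record layout)."""
    action: str                       # e.g. "search", "browse", "save_search"
    interface_language: str           # interface language state for this action
    previous_interface_language: Optional[str] = None
    query: Optional[str] = None
    result_count: Optional[int] = None
    results_by_language: dict = field(default_factory=dict)

    @property
    def language_changed(self):
        """True if the user switched the interface language before this action."""
        return (self.previous_interface_language is not None
                and self.previous_interface_language != self.interface_language)


def search_event(query, interface_language, result_languages,
                 previous_language=None):
    """Build a search event from the languages of the returned documents."""
    return ClickEvent(
        action="search",
        interface_language=interface_language,
        previous_interface_language=previous_language,
        query=query,
        result_count=len(result_languages),
        results_by_language=dict(Counter(result_languages)),
    )


def to_log_line(event):
    """Serialise an event as one JSON line."""
    return json.dumps(asdict(event), ensure_ascii=False, sort_keys=True)


if __name__ == "__main__":
    event = search_event("italy", "it", ["it", "it", "de"],
                         previous_language="en")
    print(to_log_line(event))
```

Recording the interface language with every action, rather than only when it changes, means each record can be interpreted without replaying the whole session.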

15 / 16 What happens now…
- Soft roll-outs of new releases change site
- Analysis of log data
- Interpretation
- Re-iteration of “useful information” categories
- Re-design user interaction?
