

Computing and Information Technology Interactive Digital Educational Library
- Technical Development, Content Collection: Edward Fox (director), John A. N. Lee, Manuel Pérez-Quiñones
- Community Development: John Impagliazzo
- Assessment: Lillian Cassel
- Search Engines: C. Lee Giles
- CSTC: Deborah Knox

Searching
CITIDEL searching is driven by the ESSEX search engine, which computes relevance with fast, in-RAM processing and checkpoints. Alongside the results, it also provides a list of relevant categories within the classification schemes.

Browsing and Searching with Filters
- Users are placed in the sub-communities they choose and can filter results based on those sub-communities, with further customization available.
- Alternatively, users may view all results.
- Users may set up multiple filters, simple or complex, based on factors such as education level, role, resource type, language, and source.
- Filters let users include exactly what they are looking for in the digital library and exclude what they are not.
- At any time, users are free to disable these filters or see the results excluded by them.
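The filtering behavior described above can be sketched in a few lines. This is only an illustration, not CITIDEL's implementation: the `Resource` and `FilterSet` classes, the field names, and the sample titles are all hypothetical. It shows the three behaviors the slide names: combining several filters, viewing the results a filter excluded, and disabling filters to view everything.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass(frozen=True)
class Resource:
    # Hypothetical metadata fields, named after the factors listed on the slide.
    title: str
    education_level: str
    resource_type: str
    language: str
    source: str


Predicate = Callable[[Resource], bool]


@dataclass
class FilterSet:
    """A named set of filters that a user can switch off at any time."""
    filters: Dict[str, Predicate] = field(default_factory=dict)
    enabled: bool = True

    def add(self, name: str, predicate: Predicate) -> None:
        self.filters[name] = predicate

    def apply(self, results: List[Resource]) -> Tuple[List[Resource], List[Resource]]:
        """Return (kept, excluded) so the excluded results stay viewable."""
        if not self.enabled:
            return list(results), []
        kept, excluded = [], []
        for r in results:
            # A result is kept only if every active filter accepts it.
            target = kept if all(p(r) for p in self.filters.values()) else excluded
            target.append(r)
        return kept, excluded


# Made-up sample results for illustration.
results = [
    Resource("Intro to Sorting", "undergraduate", "tutorial", "en", "CSTC"),
    Resource("Graph Algorithms Thesis", "graduate", "thesis", "en", "NDLTD-Computing"),
    Resource("Algoritmos de Busca", "undergraduate", "tutorial", "pt", "CSTC"),
]

user_filters = FilterSet()
user_filters.add("level", lambda r: r.education_level == "undergraduate")
user_filters.add("language", lambda r: r.language == "en")

kept, excluded = user_filters.apply(results)
print([r.title for r in kept])
print([r.title for r in excluded])

# Disabling the filters returns every result.
user_filters.enabled = False
all_results, _ = user_filters.apply(results)
```

Returning the excluded list next to the kept list is one simple way to let users "see results excluded by them" without running the query again.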

Enjoy in GrapeZone
- Derived from the Carrot2 project ( ex.xml?lang=en)
- Online Grape: cluster search results from CITIDEL
- Offline Grape: cluster a static collection

Cluster search results from CITIDEL

Cluster a member collection in CITIDEL
- The Computer Science Teaching Center (CSTC)
- NDLTD-Computing
- ACM Digital Library

Cluster CSTC

Cluster NDLTD-Computing

Cluster ACM

PIPE: Personalization by Partial Evaluation
- Interactions at existing web sites are predefined by the site designer.
- Personalization is achieved by the designer's anticipation of users' expectations.
- PIPE allows automatic personalization of a web site without designer anticipation.
  - Recognized with the 2001 New Century Technology Council Innovation award.

PIPE provides Mixed-Initiative Interaction
- Involves an extra specification window (e.g., a toolbar).
- Combines system-initiated and user-initiated modes of interaction.
- Traditional browser: the user merely clicks on available hyperlinks.
- PIPE window: the user can type in any information out-of-turn.
- The two modes can also be mixed and matched.
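Out-of-turn interaction can be illustrated with a toy hierarchy. The sketch below is not PIPE's implementation. It assumes a site can be modeled as a nested dictionary of labels, and the `flatten` and `personalize` helpers and all sample labels and paths are hypothetical. When the user supplies a label that normally appears deeper in the hierarchy, the site is specialized: only branches passing through that label remain, and the level that the label answered is removed.

```python
def flatten(site, prefix=()):
    """Yield (path, page) pairs for every leaf page in a nested-dict site."""
    for label, child in site.items():
        if isinstance(child, dict):
            yield from flatten(child, prefix + (label,))
        else:
            yield prefix + (label,), child


def personalize(site, term):
    """Specialize the site with respect to a term typed out-of-turn.

    Keeps only the paths that pass through `term` and drops `term` from
    them, so the user is never asked that question again.
    """
    pruned = {}
    for path, page in flatten(site):
        if term not in path:
            continue
        remaining = [label for label in path if label != term] or [term]
        node = pruned
        for label in remaining[:-1]:
            node = node.setdefault(label, {})
        node[remaining[-1]] = page
    return pruned


# Hypothetical site: resource type -> education level -> topic page.
site = {
    "Tutorials": {
        "Undergraduate": {"Sorting": "/t/u/sorting", "Graphs": "/t/u/graphs"},
        "Graduate": {"Complexity": "/t/g/complexity"},
    },
    "Theses": {
        "Graduate": {"Digital Libraries": "/th/g/dl"},
    },
}

# The user types "Graduate" before choosing a resource type.
personalized = personalize(site, "Graduate")
print(personalized)
```

A traditional browser would force the user to pick "Tutorials" or "Theses" first. Here the education level is supplied first and the remaining choices shrink accordingly. The sketch assumes labels are unique along each path and that a label is never both a page and a branch at the same level.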

Features of PIPE
- Applicable to many information system technologies:
  - web sites (even third-party)
  - digital libraries (currently working on CITIDEL integration)
  - voice-activated systems (e.g., pizza ordering, movie information, and flight reservation services)
- PIPE is available for licensing and is ready for commercialization, through VTIP.
- PIPE has been featured in IEEE Internet Computing, IEEE IT Professional, and the Appian Web Personalization Report.

PIPE system architecture

CITIDEL + PIPE
- Adds interaction personalization to CITIDEL.
- Automatically handles multi-modal conversion to cell phone, PDA, etc.
- Can be adapted to any digital data set; it requires only an XML file of the content with its hierarchy maintained.

Programming Team DL Project
Logan Hanks, Mike Scarborough, Stafford Fuller

Problem Description: VT has multiple programming teams, and has sent a team to the ACM world finals every year for the past decade. Each week during the semester, the teams practice using a problem set from a past regional or international contest. Each practice generates multiple solutions for each problem. What is needed is a digital library to collect these solutions and serve as a reference.

Programming Team DL Project

Deliverables:
- Problem statement and solution importer/archiver.
- Classification framework for problems and solutions.
- Search engine for the DL to locate problems and solutions by their relevance to a set of classes given as input.
- Web interface for browsing problems and solutions as well as accessing all of the above deliverables.
- Integration with CITIDEL.

Requirements:
- Importing and classifying problem statements and solutions. Solutions should be classified based on what algorithms and methods they use and what problems they solve.
- Interface for browsing problem statements and their solutions.
- A search engine for finding problem statements or solutions based on their classifications.
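One way to read "relevance to a set of classes given as input" is as set overlap between the query classes and each item's classes. The sketch below makes that assumption and uses Jaccard similarity as the score. The project's actual ranking method is not stated on the slides, and the item names and class labels here are invented for illustration.

```python
def rank_by_classes(items, query_classes):
    """Rank items by Jaccard similarity between their classes and the query.

    `items` maps an item name to the set of classes assigned to it.
    Items sharing no class with the query are left out.
    """
    query = set(query_classes)
    scored = []
    for name, classes in items.items():
        classes = set(classes)
        union = query | classes
        score = len(query & classes) / len(union) if union else 0.0
        if score > 0:
            scored.append((name, score))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored


# Hypothetical archive of classified problems and solutions.
archive = {
    "problem-A": {"graphs", "shortest-path"},
    "problem-B": {"dynamic-programming"},
    "solution-A1": {"graphs", "shortest-path", "dijkstra"},
}

ranking = rank_by_classes(archive, {"graphs", "shortest-path"})
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Jaccard similarity penalizes items that carry many classes unrelated to the query. A plain overlap count would be an equally reasonable choice if broader matches are preferred.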

The XML Log Standard for Digital Libraries: Analysis, Evolution, and Deployment
Marcos André Gonçalves, Ganesh Panchanathan, Unnikrishnan Ravindranathan, Aaron Krowne, Edward A. Fox (Virginia Tech)
Filip Jagodzinski, Lillian Cassel (Villanova University)

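Evolution of the Log Tool
[Figure: log tool architectures across the 1st generation, 2nd generation, and next generation. Diagram labels: monolithic log tool inside the DL; hierarchical log tool connected to the DL via a socket (e.g., MARIAN); componentized log tool in which DL components (Cx, Cy, Cz) write to an XML log repository via a standard protocol (XOAI) (e.g., CITIDEL).]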

The XML Log Format
[Figure: element hierarchy of the XML log format. Element names shown: Log; Transaction; SessionId; MachineInfo; Timestamp; Statement; SessionInfo; RegisterInfo; Event; Action; Search; Browse; Store; SysInfo; Update; SearchBy; QueryString; Catalog; Collection; PresentationInfo; StatusInfo; Timeout.]

Evolution of the Log Format

Log Analysis Tools
- Standardization of log processing and analysis
  - Different DLs can be compared, monitored, and analyzed
- Support of clickstream analysis
  - Provides detailed information on user activity and overall user trends

Log Analysis Tools: Parsing the XML Log
- Step 1: Extract log data from the XML log (sample log data: module usr T20:10: … low back pain … 5114).
- Step 2: The parser/error checker routine parses the XML log data and sends each log line to the appropriate module.
- Step 3: Individual modules populate intermediate stat files, increment global variables, etc. The intermediate files are: User Activity Files, Query String Stats, User ID Stats, Domain Stats, Error Stats, and Document ID Stats.
- Step 4: At the conclusion of the XML log, the final report/statistics module can combine the different stats and/or use them to derive additional statistics.
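The four steps above can be sketched with Python's standard `xml.etree.ElementTree` parser. The sample log below is hypothetical: the element names are taken from the XML Log Format slide, but their nesting is an assumption, and the `MODULES` table and `analyze` function are illustrative rather than the actual tool.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical log; element names come from the slides, nesting is assumed.
SAMPLE_LOG = """<Log>
  <Transaction><SessionId>s1</SessionId>
    <Search><QueryString>low back pain</QueryString></Search></Transaction>
  <Transaction><SessionId>s1</SessionId><Browse/></Transaction>
  <Transaction><SessionId>s2</SessionId>
    <Search><QueryString>low back pain</QueryString></Search></Transaction>
  <Transaction><Browse/></Transaction>
</Log>"""


def handle_search(element, stats):
    # Query string stats module.
    query = element.findtext("QueryString", default="").strip()
    stats["queries"][query] += 1


def handle_browse(element, stats):
    # Browse events need no extra bookkeeping in this sketch.
    pass


# Step 2: each action element is routed to the module that handles it.
MODULES = {"Search": handle_search, "Browse": handle_browse}


def analyze(xml_text):
    stats = {name: Counter() for name in ("queries", "sessions", "actions", "errors")}
    root = ET.fromstring(xml_text)                # Step 1: extract log data
    for transaction in root.iter("Transaction"):
        session = transaction.findtext("SessionId")
        if session is None:                       # error checker
            stats["errors"]["missing SessionId"] += 1
            continue
        stats["sessions"][session] += 1
        for child in transaction:
            if child.tag in MODULES:
                stats["actions"][child.tag] += 1
                MODULES[child.tag](child, stats)  # Step 3: modules update stats
    # Step 4: combine intermediate stats into a derived statistic.
    searches = sum(stats["queries"].values())
    stats["searches_per_session"] = searches / max(len(stats["sessions"]), 1)
    return stats


report = analyze(SAMPLE_LOG)
print(report["queries"].most_common(1), report["searches_per_session"])
```

Keeping the dispatch table separate from the parsing loop means a new statistics module can be added by registering one more handler, which mirrors the modular design shown on the slide.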

Log Analysis Tools: Creating the Clickstream
- Step 1: Intermediate statistics files (user activity files) are used as input to the clickstream stat generator.
- Step 2: Clickstream data (clickstream stats) is produced.
- Step 3: The stats visualizer (GUI) is used to produce usage statistics, clickstream stat visual aids, and other visualizations.

Sample user activity file:
User 4532; 25 Logons, 22 Logoffs, 3 accesses from .edu, 2 hits from .gov, history: Logon page [13 may 2003, 16:00] -> Browse page [13 may 2003, 16:02] -> search page [13 may 2003, 16:04] -> results page [13 may 2003, 16:04] -> view document 254 page [13 may 2003, 16:07] -> download page [13 may 2003, 16:10] -> logoff page [13 may 2003, 16:11]
User ; 3 Logons, 0 Logoffs, 1 accesses from .mil, 2 hits from .com, history: Logon page [12 may 2003, 12:00] -> Browse page [12 may 2003, 12:02] -> logoff page [12 may 2003, 12:03]
Etc.
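A clickstream generator along these lines groups page visits by user, orders them by time, and renders a history string like the one in the sample user activity file. The sketch below assumes events are already available as `(user, timestamp, page)` tuples. That input format, the `build_clickstreams` function, and the sample events are hypothetical.

```python
from collections import defaultdict
from datetime import datetime


def build_clickstreams(events):
    """Group (user, timestamp, page) events into per-user clickstreams."""
    by_user = defaultdict(list)
    for user, when, page in events:
        by_user[user].append((when, page))

    report = {}
    for user, visits in by_user.items():
        visits.sort()  # chronological order; tuples sort by timestamp first
        pages = [page for _, page in visits]
        report[user] = {
            "logons": pages.count("logon"),
            "logoffs": pages.count("logoff"),
            "history": " -> ".join(
                f"{page} [{when:%Y-%m-%d %H:%M}]" for when, page in visits
            ),
        }
    return report


# Hypothetical events, deliberately out of order to show the sorting step.
events = [
    ("4532", datetime(2003, 5, 13, 16, 4), "search"),
    ("4532", datetime(2003, 5, 13, 16, 0), "logon"),
    ("4532", datetime(2003, 5, 13, 16, 2), "browse"),
    ("4532", datetime(2003, 5, 13, 16, 11), "logoff"),
    ("7001", datetime(2003, 5, 12, 12, 0), "logon"),
    ("7001", datetime(2003, 5, 12, 12, 2), "browse"),
]

clickstreams = build_clickstreams(events)
for user, info in clickstreams.items():
    print(user, info["logons"], info["logoffs"], info["history"])
```

A user with a logon but no logoff, like the second one here, shows up directly in the counts, which is the kind of pattern the per-user summaries in the sample file make visible.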