An Overview of Literature Management Systems Qiaozhu Mei April 12, 2007
Example Systems Digital Libraries: –ACM Digital Library –CiteSeer –JSTOR –PubMed Domain-specific Search Engines: –Google Scholar; Live academic search –Libra; Rexa; SCOPUS –DBLP; Bibliography Search Integrated Systems: –DBLife, bibSonomy, BeeSpace?
Outline Characteristics –What’s unique with scientific literature? Functionalities –Possible utilities based on the characteristics Prototypes –http://en.wikipedia.org/wiki/List_of_academic_databases_and_search_engines
Scientific Literature: What’s Unique? Structured content: –Title, author, abstract, conclusion, reference, ... A latent network structure: –Reference; citations; co-authorship Contexts: –Time, conference/area, authorship, ... –Citation context What about the content & language? –Topic dense? Terminology rich? Definitive language? –Controlled vocabulary? Low noise?
Managing Literature: What can We Do? Structured content: –Full text search, structured search, citation search –Similarity search Latent Network structure: –Citation navigation, co-author navigation –Co-citation Analysis, etc –Community analysis Context: –Trend analysis, author comparison, interdisciplinary research identification, … Language: –Topic extraction & categorization, filtering, concept analysis, ontology, summarization, …
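As one illustration of the network-based functions listed above, co-citation analysis counts how often two papers appear together in the same reference list. The following is a minimal sketch, assuming a hypothetical `citations` mapping from each citing paper ID to the set of paper IDs it references; the function name and toy data are illustrative and are not taken from any of the systems discussed.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citations):
    """Count how often each unordered pair of papers is cited together.

    `citations` maps a citing paper ID to the collection of paper IDs
    it references (a hypothetical input format, assumed for illustration).
    """
    counts = Counter()
    for cited in citations.values():
        # Sort so that each unordered pair has a single canonical key.
        for a, b in combinations(sorted(set(cited)), 2):
            counts[(a, b)] += 1
    return counts


# Toy data, made up for illustration.
example_citations = {
    "p1": {"a", "b", "c"},
    "p2": {"a", "b"},
    "p3": {"b", "c"},
}
example_counts = cocitation_counts(example_citations)
```

Pairs with high counts are candidates for "related paper" links; a real system would likely normalize the counts, since heavily cited papers co-occur with almost everything.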
Existing Systems: What do They Provide?
Columns (Systems/Function): EN, CS, BC, CN, AN, FT, TA, CC, TC, SS, Bib
Rows (EN value, then X marks; the column position of each X is not recoverable here):
ACM DL: 1, XXXXXX
CiteSeer: 1, XXXXXX
Google Sch.: 1, XXX
Live Acade.: 1, XX
DBLP: 2, XXX
Libra: 2, XXXX
Rexa: 2, XXXXXX
DBLife: 2, XXXX
bibSonomy: 1, XXX
SCOPUS: 1, XXX
Legend: EN: Entity object (1: Paper; 2: Author & Paper); CS: content search; BC: browsing by context; CN: citation navigation; AN: author/co-author navigation; FT: full text download; TA: tagging & annotation; CC: citation context; TC: topic categorization; SS: Similarity Search; Bib: BibTex
Example Systems CiteSeer, Libra, Rexa, SCOPUS, DBLife Who did it? What’s their unique feature? What’s not covered?
CiteSeer Penn State Similarity Search Citation Context (partially) Citation Trend Statistics
CiteSeer: Similarity Search Content similarity Sentence-level similarity Co-citation similarity
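To make the idea of content similarity concrete, the sketch below scores two texts by the cosine of their term-count vectors. This is a generic textbook technique shown under stated assumptions, not a description of how CiteSeer computes similarity; the tokenizer and function names are hypothetical.

```python
import math
import re
from collections import Counter


def term_counts(text):
    """Lowercase the text and count alphanumeric tokens (a simple, assumed tokenizer)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the two term-count vectors (0.0 if either is empty)."""
    a, b = term_counts(text_a), term_counts(text_b)
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)
```

The same function could be applied to whole abstracts or to individual sentences; weighting terms (for example by inverse document frequency) is a common refinement that is omitted here.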
CiteSeer: Citation Context
CiteSeer: Citation Trend
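A citation trend can be thought of as a per-year count of the papers citing a given work. The following is a minimal sketch, assuming the input is simply a list of publication years of the citing papers (a hypothetical format; it is not how CiteSeer stores its data).

```python
from collections import Counter


def citation_trend(citing_years):
    """Return (year, citation count) pairs for every year from the first
    to the last citing year, including years with zero citations."""
    counts = Counter(citing_years)
    if not counts:
        return []
    first, last = min(counts), max(counts)
    # A Counter returns 0 for missing keys, which fills gap years.
    return [(year, counts[year]) for year in range(first, last + 1)]
```

Filling in the zero-citation years keeps the series evenly spaced, which matters if the result is plotted as a trend line.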
Libra Zaiqing Nie, Microsoft Research Asia Community search (topic extraction?)
Libra: Community Search
Libra: Author and Community
Rexa UMass, Andrew McCallum Search for grants, Tagging Topic extraction & categorization
Rexa: Tags
Rexa: Topic Extraction & Categorization Topic Relation: Citing, cited, co-occurring Trends
SCOPUS Subject Area: not CS –Life/health/physics/social science Alerts (Simple Filtering) –Search (content) alerts –Document Citation alerts Citation trends
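The two alert types above can be read as simple filters run against each newly indexed document. The sketch below is one hedged interpretation: the function names, the keyword-matching rule, and the input formats are all assumptions for illustration and do not describe SCOPUS internals.

```python
def search_alert_matches(keywords, document_text):
    """Search (content) alert: fire when every keyword occurs as a word in the text.

    An empty keyword list never fires (an assumed design choice).
    """
    words = set(document_text.lower().split())
    return bool(keywords) and all(k.lower() in words for k in keywords)


def citation_alert_matches(watched_paper_id, reference_ids):
    """Document citation alert: fire when a new document cites the watched paper."""
    return watched_paper_id in reference_ids
```

For example, a stored alert with keywords `["topic", "model"]` would fire on a new paper whose text contains both words, and a citation alert on paper `"p1"` would fire on any new paper listing `"p1"` among its references.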
DBLife Anhai Doan, U of Wisc. Integrating heterogeneous academic information Change monitoring Not much on literature Many prototype functionalities
DBLife: Integration
Discussion: What is not Covered? Features: –Citation Context Functionalities: –Filtering; Summarization Opportunities: –Comparison across contexts –In-depth community analysis –More useful classification –…
Thanks!