Exploring Scholarly Data with Rexplore. Francesco Osborne (1, 2), Enrico Motta (1), Paul Mulholland (1). (1) Knowledge Media Institute, The Open University; (2) Dept. of Computer Science, University of Torino
Outline Introduction State of the art Overview of Rexplore Empirical Evaluation Conclusions
Introduction Understanding what goes on in a research area is no easy task A variety of entities, such as publications, publication venues, researchers, research groups, events Relationships which exist between them Different categories of users Rexplore, which integrates statistical analysis, semantic technologies, and visual analytics To investigate research trends effectively at different levels of granularity To relate authors ‘semantically’ To perform fine-grained academic expert search along multiple dimensions
State of the art Providing an interface to a specific repository of bibliographic data Integrating multiple data sources to provide access to a richer set of data Some widely used academic search engines: Google Scholar (GS), FacetedDBLP, Microsoft Academic Search (MAS), CiteSeer, Saffron
Gap Analysis No semantic characterization of research areas Systems tend to use keywords as proxies for research areas Lack of granular analysis E.g.: MAS can visualize publication trends in “World Wide Web” and “Databases”, but cannot provide this feature for “Semantic Web” Digital library bias
Overview of Rexplore Detect and make sense of the important trends in one or more research areas Identify researchers and analyze their academic trajectory and performance in one or multiple areas, according to a variety of fine-grained requirements Discover and explore a variety of dynamic relations between researchers, between topics, and between researchers and topics Support ranking of specific sets of authors, generated through multi-dimensional filters, according to various metrics
Rexplore Architecture Using a combination of statistical methods and background knowledge
Means for Achieving Ontology population with Klink characterizes research areas and their relationships skos:broaderGeneric: “Semantic Web Service” - “Semantic Web” and “Web Service” contributesTo: “Ontology Engineering” - “Semantic Web” relatedEquivalent: “Ontology Matching” - “Ontology Alignment” Geographic Enrichment Universities, Research Labs, Hospitals Maps the affiliation to GeoNames, e.g., “University of Turin” and “University of Torino”
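The three relations above can be read as a small graph over topics. The following is a minimal sketch, not Rexplore's or Klink's actual implementation: it stores the slide's examples as triples and expands a topic into itself, its equivalents, and its sub-topics. The function names, the triple layout, and the reading of skos:broaderGeneric as "the first topic is a sub-topic of the second" are assumptions made for illustration.

```python
from collections import defaultdict

# Relation triples built from the slide's examples: (subject, relation, object).
# Assumption: for skos:broaderGeneric the subject is the narrower topic.
TRIPLES = [
    ("Semantic Web Service", "skos:broaderGeneric", "Semantic Web"),
    ("Semantic Web Service", "skos:broaderGeneric", "Web Service"),
    ("Ontology Engineering", "contributesTo", "Semantic Web"),
    ("Ontology Matching", "relatedEquivalent", "Ontology Alignment"),
]


def build_index(triples):
    """Index sub-topic and equivalence links (hypothetical helper)."""
    narrower = defaultdict(set)    # broader topic -> direct sub-topics
    equivalent = defaultdict(set)  # topic -> topics treated as the same area
    for subj, rel, obj in triples:
        if rel == "skos:broaderGeneric":
            narrower[obj].add(subj)
        elif rel == "relatedEquivalent":
            equivalent[subj].add(obj)
            equivalent[obj].add(subj)
        # contributesTo is kept out of the expansion on purpose: a
        # contributing topic is not a sub-topic of the target area.
    return narrower, equivalent


def expand_topic(topic, narrower, equivalent):
    """Return the topic plus every equivalent topic and transitive sub-topic."""
    seen = set()
    stack = [topic]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(narrower.get(current, ()))
        stack.extend(equivalent.get(current, ()))
    return seen


NARROWER, EQUIVALENT = build_index(TRIPLES)
print(sorted(expand_topic("Semantic Web", NARROWER, EQUIVALENT)))
```

With such an expansion, a query for “Semantic Web” could also match publications tagged only with “Semantic Web Service”, which is one way a semantic characterization differs from plain keyword matching.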
Means for Achieving Topic Analysis General information about the topic Access to relevant authors and publications The topic navigator Visual analytics on broaderGeneric and contributesTo topics Visual analytics on authors’ migration patterns to and from the topic in question
Means for Achieving Author Analysis General bio information Authors’ scores according to different bibliometric measures Topic analysis Co-author analysis Pattern analysis Graph view
Means for Achieving Faceted Search and Data Browsing Filter: name or a part of it, career range, topics of interest, venues in which they published Rank: number of publications, number of citations, H-Index, G-Index, HT-Index, GT-Index, number of publications/citations in a topic or set of topics, number of publications/citations in a venue or set of venues
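The filter-then-rank flow above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not Rexplore's code: the `Author` record, the `filter_and_rank` helper, and the sample authors are hypothetical. The H-Index and G-Index follow their standard definitions (the G-Index here is capped at the number of publications); HT-Index and GT-Index are not defined on the slide and are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Author:
    """Hypothetical author record: name, topics, citations per publication."""
    name: str
    topics: set
    citations: list = field(default_factory=list)


def h_index(citations):
    """Largest h such that h publications have at least h citations each."""
    h = 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        if cites < rank:
            break
        h = rank
    return h


def g_index(citations):
    """Largest g such that the top g publications total at least g*g citations."""
    total, g = 0, 0
    for rank, cites in enumerate(sorted(citations, reverse=True), start=1):
        total += cites
        if total >= rank * rank:
            g = rank
    return g


def filter_and_rank(authors, name_part="", topic=None, metric=h_index):
    """Keep authors matching the facets, then rank them by the chosen metric."""
    selected = [
        a for a in authors
        if name_part.lower() in a.name.lower()
        and (topic is None or topic in a.topics)
    ]
    return sorted(selected, key=lambda a: metric(a.citations), reverse=True)


# Made-up sample data, for illustration only.
AUTHORS = [
    Author("Alice Example", {"Semantic Web"}, [10, 8, 5, 4, 3]),
    Author("Bob Example", {"Semantic Web", "Databases"}, [25, 1, 1]),
    Author("Carol Sample", {"Databases"}, [3, 3, 3]),
]

for author in filter_and_rank(AUTHORS, topic="Databases"):
    print(author.name, h_index(author.citations), g_index(author.citations))
```

Passing the metric as a function keeps the filter step independent of the ranking step, so further facets (career range, venues) or metrics could be added without changing the other half.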
Means for Achieving The Graph View
Experimental Setup
Results 17 PhD students and researchers 50 of the 51 tasks were completed using Rexplore, a 98% success rate (success means completing the task within 15 min) 8/9 subjects were asked to work with GS/MAS Only 3 people completed a task with MAS
Results No domain-specific expertise is needed to use Rexplore to make sense of a particular research area Experts in Bibliometrics and Learning Analytics would do better SUS: 75/100, ≥72% of the 500 tested systems 94%: the system is well integrated 82%: would be happy to use Rexplore for their work
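The 75/100 figure is a System Usability Scale (SUS) score. As a reminder of how such a score is derived, here is a minimal sketch of the standard SUS scoring rule for one questionnaire: ten items answered on a 1 to 5 scale, with odd-numbered items contributing (answer - 1), even-numbered items contributing (5 - answer), and the sum multiplied by 2.5. The function name and the sample answers are hypothetical and are not data from the study.

```python
def sus_score(responses):
    """Compute the SUS score (0-100) for one respondent's ten answers."""
    if len(responses) != 10:
        raise ValueError("SUS needs exactly 10 responses")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")
    total = 0
    for item, answer in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (answer - 1) if item % 2 == 1 else (5 - answer)
    return total * 2.5


# Hypothetical respondent: agrees (4) with positive items, disagrees (2) with negative ones.
print(sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2]))
```

A study-level score is typically the mean of the per-respondent scores.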
Feedback 94%: “very effective” 18%: “easy/natural/intuitive” The most useful features: Faceted filter (59%) The visualization/charts (47%) The graph view (47%) The semantic characterization of topics (41%) The main weaknesses: Visual complexity (41%) Navigation context not always clearly shown (35%)
Feedback Suggested new features: “minor interface change” (23%) A natural language interface for formulating complex searches The ability to retrieve and search the full text of a publication from within Rexplore Did not need any additional features (23%)
Conclusions Rexplore arguably affords a major advantage over other tools in its ability to support: The visualization of trends at a very fine level of granularity Methods to identify ‘semantic’ relations between authors Fine-grained multi-dimensional academic expert search Future work: Make minor interface improvements Add more navigation filters Release a version of the tool with comprehensive data coverage for use by the scientific community
Thank you