Detailed search stats from DSpace Solr

Slides:



Advertisements
Similar presentations
Usage Statistics in Context: related standards and tools Oliver Pesch Chief Strategist, E-Resources EBSCO Information Services Usage Statistics and Publishers:
Advertisements

IRRA DSpace April 2006 Claire Knowles University of Edinburgh.
EPrints 3 Technical Overview EPrints 3 Briefing 8 th December 2006, London.
CONTENTdm vs DSpace vs Fedora
Cognos Web Services Business Intelligence. SOA SOA (Service Oriented Architecture) The SOA approach involves seven key principles: -- Coarse -grained.
Retrieval of Information from Distributed Databases By Ananth Anandhakrishnan.
DSpace 4: TDL upgrades and new features SEPTEMBER 30, 2014.
The Knowledge Bank Project at the Ohio State University Presented at the American Accounting Association Meeting – Chicago 8/6/07 Charles J. Popovich Head.
A. Grigorov, A. Georgiev, M. Petrov, S. Varbanov, K. Stefanov Building a Knowledge Repository for Life-long Competence Development.
DSpace Devika P. Madalli DRTC, ISI Bangalore.
A Framework for Distributed Preservation Workflows Rainer Schmidt AIT Austrian Institute of Technology iPres 2009, Oct. 5, San.
ACAT 2008 Erice, Sicily WebDat: Bridging the Gap between Unstructured and Structured Data Jerzy M. Nogiec, Kelley Trombly-Freytag, Ruben Carcagno Fermilab,
Implementing search with free software An introduction to Solr By Mick England.
Xpantrac connection with IDEAL Sloane Neidig, Samantha Johnson, David Cabrera, Erika Hoffman CS /6/2014.
ETD Repositories Using DSpace Software Andrew Penman The Robert Gordon University 27 th September 2004.
Publishing Digital Content to a LOR Publishing Digital Content to a LOR 1.
Strategies for improving Web site performance Google Webmaster Tools + Google Analytics Marshall Breeding Director for Innovative Technologies and Research.
Web Site Performance An analytical approach for benchmarking and tuning.
1. 2 introductions Nicholas Fischio Development Manager Kelvin Smith Library of Case Western Reserve University Benjamin Bykowski Tech Lead and Senior.
Making Grey Literature Available through Institutional Repositories LeRoy J. LaFleur, Social Sciences Bibliographer Nathan A. Rupp, Metadata Librarian.
Online Autonomous Citation Management for CiteSeer CSE598B Course Project By Huajing Li.
HTRC API Overview Yiming Sun. HTRC Architecture Data API Portal access Direct programmatic access (by programs running on HTRC machines) Security (OAuth2)
Improving user engagement in a data repository with web analytics LITA Forum November 7, 2013 Heather CoatesSummer Durrant Digital Scholarship & Data Management.
Page 1 CSISS Center for Spatial Information Science and Systems Design and Implementation of CWIC Metrics Weiguo Han, Liping Di, Yuanzheng Shao, Lingjun.
© Geodise Project, University of Southampton, Data Management in Geodise Zhuoan Jiao, Jasmin Wason and Marc Molinari
Part 1 – PubMed Interface, Display options, Saving, Printing, and ing results. Instructions This part of the course is a PowerPoint demonstration.
University of North Texas Libraries Building Search Systems for Digital Library Collections Mark E. Phillips Texas Conference on Digital Libraries May.
Using JUSP to help with the SCONUL return Angela Conyers Evidence Base, Birmingham City University.
"How much?": Aggregating usage data from Repositories in the UK Jo Lambert, Ross Macintyre, Paul Needham, Jo Alcock OR2015.
IPC Working Group 30 - Updates on IT support for the IPC Geneva November 6, 2013 Patrick Fiévet Head of IT Systems Section.
Uganda Scholarly Digital Library (USDL) Makerere University’s Institutional Repository By Margaret Nakiganda URL:
GBIF Data Access and Database Interoperability 2003 Work Programme Overview Donald Hobern, GBIF Programme Officer for Data Access and Database Interoperability.
The DSpace Course Module – Items in DSpace. Module objectives  By the end of this module you will:  Understand what an item in DSpace is, and what it.
Data, data, data In-depth session on data integration.
Mercury – A Service Oriented Web-based system for finding and retrieving Biogeochemical, Ecological and other land- based data National Aeronautics and.
Mercury. One single online platform: Mercury Highlights – USP’s Web-based platform: accessible from any computer in any location without installing any.
TopCAT Use Cases Priorities User Interface 1 ICAT developer workshop, August 2009 Laurent Lerusse – STFC
DSpace - Digital Library Software
DSpace System Architecture 11 July 2002 DSpace System Architecture.
Web Design Terminology Unit 2 STEM. 1. Accessibility – a web page or site that address the users limitations or disabilities 2. Active server page (ASP)
DSpace Statistics Graham Triggs Head of Repository Systems, Symplectic.
EBay Searcher Brian Payton, Jason Nowakoski, Justin Szeluga, Salvatore Siragusa, David Wolkiser.
Google Analytics Graham Triggs Head of Repository Systems, Symplectic.
General Architecture of Retrieval Systems 1Adrienn Skrop.
SmartCode Brad Argue INLS /19/2001.
Introducing SQL Server 2000 Reporting Services
User Characterization in Search Personalization
The Client-Server Model
Dspace Statistics: Google Analytics, Solr
Jon Wheeler | Kenning Arlitsch
VI-SEEM Data Discovery Service
Strategies for improving Web site performance
Introduction, Features & Technology
Web Engineering.
VI-SEEM Data Repository
Building Search Systems for Digital Library Collections
VI-SEEM Data Repository
Institutional Repository at NIO: Inspiration to Implementation
CDISC SHARE API v1.0 CAC Update 22 February 2018
Embargo Support for DSpace
PX-Web 2019 and more… Mikael Nordberg Developer Statistics Sweden.
Network Controllable MP3 Player
Getting Started With Solr
Item 2.2 of the Agenda Remote access to confidential data for researchers: possible actions under the 7th Framework Programme Pascal JACQUES Unit B 5 15.
Chapter 42 Web Services.
Programmatic interaction with the Invenio-based NADRE Repository
Programmatic interaction with the Invenio-based NADRE Repository
Advanced hands-on on programmatic access to an Open Access Repository
WHERE TO FIND IT – Accessing the Inventory
Intro to Web Services Consuming the Web.
Presentation transcript:

Detailed search stats from DSpace Solr Ohio IR day– 2016 October 5 Eric Johnson Data Librarian, Miami University

Repository Statistics How often is a resource being viewed? Where are the views coming from? (who is viewing the resource?) What is the trend? (increasing, decreasing) Statistics are used for funding requests. Statistics help us improve our selection of resources. (What types of items are people looking for?)

Dspace statistics

Half year history

Popular in Japan!

Over the last 6 months

Client’s need Wanted detailed statistics not just an “overview”. Statistics for particular time periods Statistics for particular publications Needed for grant application submissions

Solr DSpace uses Solr to index search and download actions. Solr is a search platform developed by Apache and integrated into the Dspace repository software. Uses Apache Lucene

Solr dashboard Comes with Dspace Allows advanced queries http://localhost:1234/solr/#/statistics/schema-browser

Solr API If you want to programmatically access the Solr data, you create URLs and parse the results. http://MyDSpace/solr/search/select/?q=location.comm :25+AND+search.resourcetype:2&start=0&rows=3 Data is returned in XML format <doc> <arr name="dc.title"> <str> Implementation of the 2006 Ohio nursing home family satisfaction survey: Final report </str>

The Request – list all metadata for items in a particular community http://myDSpace/solr/search/select/? – The URL of your repository (Solr server) q=location.comm:25 – Community ID# +AND+search.resourcetype:2 – Type = item (vs. collection, community, bitstream, etc.) &start=0 – Which rows of the response should be sent with this request. Useful for long responses. &rows=3

Each <doc> is a separate item The response Each <doc> is a separate item Likewise with search stats, each search is a separate search activity. NumFound – number of results found. Use “Start” and “rows” to get them in batches. XML format Search the tags to retrieve information.

Use “start” and “rows” to download the responses in batches “numFound” is the total number of responses Likewise with search stats, each search is a separate search activity. NumFound – number of results found. Use “Start” and “rows” to get them in batches.

Tracking of each search action Total # of searches during that time Address of each searcher Location of searcher: Continent, Country, city. Latitude & Longitude. Was the searcher a robot or human? ID # of this search Item # that was returned

Workflow Get parameters (date range, collection) Identify month periods Get a list of items in that collection Find the number of search requests for each item in each time period Output a TSV (Tab separated variables) file (CSV won’t work because some titles contain commas)

Interface

Result

Much more informative now

Questions? Eric Johnson Miami University johnsoeo@miamioh.edu