Shibboleth: EBSCOhost implementation Lech Wojtowicz Director of Software Development EBSCO Publishing Access 2003 October 3, 2003
Overview About EBSCO Publishing and EBSCOhost EBSCO’s involvement in Internet2 Current authentication methods Why Shibboleth Shibboleth implementation time-line EBSCOhost configuration Outstanding issues and future
About EBSCO Publishing Part of EBSCO Information Services Provide information and tools to access information online Primarily institutional market International customer base Began in 1986 with CD-ROMs and evolved to Web EBSCOhost at version 6.4, version 7.0 will release in Fall
About EBSCOhost Web-based search and retrieval system Supporting: 50 full text databases 65 secondary databases Links to 12,000 e-journals Native interface and Z39.50 access Internet network access from: UUnet Genuity Internet2 (Abilene Network)
About EBSCOhost, cont’d Multi-tiered system: Windows 2000 with IIS on front lines EBSCOhost is an ASP Web application, XML is an internal data format and protocol Several supporting services: , Transaction Logging, Content Enhancements, Article Matching/Rights Checking Solaris and Linux back end tier for performing searches Multiple NFS servers used for data storage
About EBSCOhost, cont’d Peak load: 25,000 simultaneous ASP sessions during peak time 200,000 searches during peak hours, over 2 million searches a day 600,000 user logins per day 25 million ‘transactions’ per day 50% of outbound bandwidth is Internet2
EBSCO and Internet2 Most Internet2 members are EBSCO customers Many customers on affiliated networks Recognized need for reliable high-speed connectivity Became Corporate Member in Fall 2000 Initial connection via vBNS+ Spring 2002 became Collaborating site Current connection to Abilene is two T3s
Current authentication methods IP Address Username and password Referring URL Customer coordinated patron ID (library bar code) Pattern matching (patron ID) Athens Introducing Shibboleth...
IP Address Mechanism IP address ranges recorded in EBSCOadmin Associated with customer and group Shortcomings Multiple campuses with shared dynamic IPs may be a problem Remote access requires use of proxy server
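The IP address mechanism above can be sketched as a range lookup. This is a minimal illustration, not EBSCOadmin's actual implementation: the table name `IP_RANGES`, the function `lookup_by_ip`, and the customer and group values are all hypothetical, and the addresses come from the reserved documentation ranges.

```python
import ipaddress

# Hypothetical stand-in for the ranges recorded in EBSCOadmin:
# (network, customer ID, group ID). All values are illustrative only.
IP_RANGES = [
    (ipaddress.ip_network("192.0.2.0/24"), "custA", "main"),
    (ipaddress.ip_network("198.51.100.0/25"), "custB", "staff"),
]


def lookup_by_ip(client_ip):
    """Return (customer, group) for the first range containing client_ip, else None."""
    addr = ipaddress.ip_address(client_ip)
    for network, customer, group in IP_RANGES:
        if addr in network:
            return customer, group
    return None
```

A remote user arriving from an address outside every recorded range gets `None` here, which is why the slide notes that remote access needs a proxy server inside a registered range.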
Username/password Mechanism In EBSCOadmin a given user group is associated with a username and password User is prompted for username and password Shortcomings Communication of usernames and passwords Not very secure as usernames tend to be “advertised” No incentive for a patron to not share
Referring URL Mechanism Customer performs authentication Access to EBSCOhost is from secure page URL of secure page recorded in EBSCOadmin HTTP Referer header of request looked up Shortcomings Assumes customer’s page is secure End user must access through library authentication system
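A rough sketch of the referring URL check, assuming the recorded secure pages are stored as host and path pairs. The names `REGISTERED_REFERRERS` and `lookup_by_referrer` and the example URL are hypothetical; how EBSCOadmin actually stores and compares URLs is not described in these slides.

```python
from urllib.parse import urlsplit

# Hypothetical registry of secure pages: (host, path) -> customer ID.
REGISTERED_REFERRERS = {
    ("library.example.edu", "/secure/databases"): "custA",
}


def lookup_by_referrer(referer_header):
    """Map the value of the HTTP Referer header to a customer ID, or None."""
    if not referer_header:
        return None
    parts = urlsplit(referer_header)
    if parts.scheme not in ("http", "https") or parts.hostname is None:
        return None
    # Ignore the query string and any trailing slash when comparing.
    key = (parts.hostname.lower(), parts.path.rstrip("/"))
    return REGISTERED_REFERRERS.get(key)
```

The header is supplied by the client, so a check like this is only as trustworthy as the customer's page and the browser sending it, which matches the shortcoming listed on the slide.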
Customer coordinated Mechanism Customer uploads patron IDs (library bar code) to EBSCOadmin Patron IDs can be associated with a specific user group User must enter valid patron ID to access Shortcomings Link to EBSCOhost must include CustID Maintenance of patron ID
Pattern matching Mechanism Customer enters pattern of patron ID Associates pattern with user group User prompted for patron ID to access Length and significant characters must match Shortcomings Patron ID must follow a pattern Not very secure Maintenance: no easy way to “remove” a patron
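The length and significant-character rule can be illustrated as follows. The pattern syntax is an assumption made for this sketch: `#` stands for any character and every other character is significant. The slides do not say how EBSCOadmin writes patterns, and `PATTERNS`, `match_pattern`, and `group_for_patron` are hypothetical names.

```python
# Hypothetical patterns: "#" matches any character, anything else must match exactly.
PATTERNS = [
    ("2104#########", "main"),   # 13 characters, must start with 2104
    ("STAFF####", "staff"),      # 9 characters, must start with STAFF
]


def match_pattern(patron_id, pattern):
    """True when the lengths agree and every significant character matches."""
    if len(patron_id) != len(pattern):
        return False
    return all(p == "#" or p == c for p, c in zip(pattern, patron_id))


def group_for_patron(patron_id):
    """Return the user group of the first matching pattern, or None."""
    for pattern, group in PATTERNS:
        if match_pattern(patron_id, pattern):
            return group
    return None
```

Any ID of the right shape is accepted, whether or not it was ever issued, which illustrates both the "not very secure" and the "no easy way to remove a patron" shortcomings.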
Athens Mechanism Access rights managed centrally in UK by Athens group Prompt for user’s Athens User ID and password Call to Athens server to validate and get institution code Institution code matched to account in EBSCOadmin Shortcomings Management of users and rights in separate system from institution
Why Shibboleth EBSCO offers multiple services from different locations: EBSCOhost databases EBSCOhost Electronic Journals Service (EJS) A-Z journal locator service LinkSource OpenURL resolver Redirect customers to publisher sites
Why Shibboleth, cont’d Currently supporting multiple (independent) authentication options Customers want seamless access between services Users want single login EBSCO needs to provide secure authentication to meet expectations of data providers
Shibboleth project timeline
Mar 14/02 – initial contact by Steven Carmody
Apr 4/02 – development initiated
Apr 29/02 – DLF/CNI meeting: proof of concept in place and demonstration of Shibboleth in action
--- port of Shibboleth package to Win
Sep 12/02 – Win32 Shib Package available (Version 0.7)
Sep 26/02 – EBSCO Pilot project completed; Scott Cantor performs first real world test from Ohio State University to EBSCOhost
July 2003 – Shibboleth version 1.1 released with Win32 support
Aug 2003 – EBSCOhost Shibboleth Pilot project upgraded to use version 1.1
EBSCOhost configuration
Outstanding issues Handling multiple ‘sites’ for an institution Example: OSU has 14 EBSCOhost accounts Associate originSiteID with customer account(s) in EBSCOadmin If one originSiteID is associated with multiple customer accounts, use entitlement for finer resolution Allow self administration EBSCO specific eduPerson entitlement: urn:mace:ebsco.com:
Future proposal… use of attributes
Columns: type, originSiteID, affiliation, entitlement, custID, groupID
1. ubc n/a ubc main
2. ubc n/a ubc staff student main
3. {ubc} n/a ubcmed:main ubc:staff ubcmed ubc main staff
Types:
1. originSiteID – single custID and groupID (majority of cases)
2. affiliation – single custID and multiple groupID (includes walk-ins)
3. entitlement – multiple custID
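One way to read the three types is as a resolution order: use entitlements when present, then affiliation, then fall back to the origin site's single default account. The sketch below assumes that order and assumes entitlement values take the short `custID:groupID` form shown in the example. The tables `SITE_DEFAULTS`, `AFFILIATION_GROUPS`, and `ALLOWED_CUSTS`, the function `resolve`, and the affiliation-to-group mapping are all hypothetical.

```python
# Hypothetical mappings an administrator might maintain. Values are illustrative.
SITE_DEFAULTS = {"ubc": ("ubc", "main")}            # originSiteID -> (custID, groupID)
AFFILIATION_GROUPS = {                              # (originSiteID, affiliation) -> groupID
    ("ubc", "staff"): "staff",
    ("ubc", "student"): "main",
}
ALLOWED_CUSTS = {"ubc": {"ubc", "ubcmed"}}          # custIDs each origin site may assert


def resolve(origin_site_id, affiliations=(), entitlements=()):
    """Return a list of (custID, groupID) pairs for the asserted attributes."""
    if origin_site_id not in SITE_DEFAULTS:
        return []
    allowed = ALLOWED_CUSTS.get(origin_site_id, set())

    # Type 3: entitlements of the assumed form "custID:groupID" may name several accounts.
    pairs = [tuple(e.split(":", 1)) for e in entitlements if ":" in e]
    pairs = [(cust, group) for cust, group in pairs if cust in allowed]
    if pairs:
        return pairs

    cust, default_group = SITE_DEFAULTS[origin_site_id]

    # Type 2: affiliations select one or more groups within the single account.
    groups = [AFFILIATION_GROUPS[(origin_site_id, a)]
              for a in affiliations if (origin_site_id, a) in AFFILIATION_GROUPS]
    if groups:
        return [(cust, g) for g in groups]

    # Type 1: the origin site alone identifies one account and group.
    return [(cust, default_group)]
```

Entitlements naming an account the origin site is not registered for are ignored in this sketch, so an unrecognized value falls back to the site default instead of granting access to another customer's account.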
Observations Development effort Implement ISAPI filter Supporting infrastructure inside EBSCOadmin Administration effort Find appropriate contacts at institution Determine customer account to use and domains and affiliation Set up mapping or allow customers to establish this Meets goal of single login for multi-site sessions
Future Expand test to other EBSCO sites EBSCOhost Electronic Journals Service LinkSource MetaPress Work with major publishers to extend reach of seamless access Handling multiple federations by accessing multiple WAYF servers, based on information from user