Where are we? Where are we going? A survey of the electronic publishing landscape. Stephen Rhind-Tutt, President. SSP, November 2007
Overview: 1. About us 2. Aggregation and discovery 3. Adding value 4. Web 2.0
About us
About Alexander Street Press: Founded in July 2000. Scholarly electronic publisher in the humanities. Based in Alexandria, Virginia. 70 employees. Licensing partners include Warner Bros., Penguin Putnam, Faber & Faber, Macmillan, university presses, etc. Special collection partners include AAS, Library of Congress, NYPL, Wisconsin State Historical Society, South Hadley Historical Society, and many more. Sells to libraries around the world.
The Portfolio: Performing Arts, Drama, and Film; World Literature; Women's History; Religion; Psychotherapy; Music; Social and Cultural History; Sociology; Black Studies; American Civil War
Our Mission: To craft electronic products of exceptional utility and quality, using the skills of librarianship and traditional publishing. To give voice to those who would otherwise be silent.
Surrendering control: Loss of proprietary gateways to content. Expensive new technologies. Most content not created by publishers. Large new players with enormous network advantages. Mission statements that are the same as those of publishers and librarians.
Aggregation and Discovery
What stage is the web at?
Car in 1904: Quadricycles, Phaetons, Horseless Carriages, Autocars, Motor cars? Horsepower-to-weight ratio: electric, hydrogen, or gasoline? Materials: wood, steel, or a combination? Production line: custom or mass produced? Starting systems: manual or electric? Legal: UK law restricting speed to 5 mph. Education: would the population be able to master the machines? Costs: typically in excess of $2,000. (Source: various articles in The Living Age, 1904)
Car in 1920: Motor cars. Horsepower-to-weight ratio: gasoline, and clearly going to improve in future. Materials: steel. Production line: mass produced. Starting systems: electric. Legal: building of highways. Education: no longer an issue. Costs: Model T cost $400.
Electronic Journals vs. books…
Attribute | Electronic Journals | Books | E-Books
Cost/item/person | > ¢ | > $20 | -
Size | Unlimited | Pages | -
Accessibility | Site | Person at a time | (+)
Organization | Integrated | Isolated | (+)
Searchability | 20 entry points | 2 entry points | (+)
Division | Atomic + Linear | Linear | -
Currency | Daily updates | > Quarterly | -
Delivery speed | Instant | > Day ? | Instant
Interaction | ? | None | -
Process integration | ? | None | -
Portals make most use of the medium...
Attribute | Electronic Journals | Portals
Cost/item/person | 0.002 ¢ | Free
Size | Unlimited |
Accessibility | Site | Universal
Organization | Integrated |
Searchability | 20 entry points | Multiple entry points
Division | Atomic + Linear | Atomic
Currency | Daily updates |
Delivery speed | Instant |
Interaction | ? | Multiple
Process integration | ? | Multiple
Understanding electronic products: Value in the electronic world is about... "The manner in which or the efficiency with which something reacts or fulfills its intended purpose" (Webster's Unabridged)
Aggregation and Discovery
Remixability: No one site will contain all information. Effective publication is a function of delivering the right content, in the right way, to the right people. To do this we will need high-quality access to content across different publishers, libraries, and websites. Discoverability – Aggregation – Value added
Nature of electronic publications: Atomic; Interconnected; Interdependent; Connection vs. the object; Pliable; Constantly evolving; Without place; Practically unlimited in size; Page
It's about the links… Links document intellectual pathways through data. Indexing links adds substantial value. Links prevent duplication of indexing, content, and commentary. Links are expensive to create and maintain. Versioning is critical to scholarship. Some links confer authority. Links are intrinsically bidirectional (Ted Nelson).
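The point that links are intrinsically bidirectional can be illustrated with a small index that records each link from both ends. This is only a sketch under stated assumptions: the class name `LinkIndex`, its methods, and the document identifiers are all hypothetical and do not describe any Alexander Street Press system.

```python
from collections import defaultdict


class LinkIndex:
    """Toy index that records every link in both directions (hypothetical)."""

    def __init__(self):
        self._outgoing = defaultdict(set)  # source -> targets it points to
        self._incoming = defaultdict(set)  # target -> sources pointing at it

    def add_link(self, source, target):
        # Writing both directions together keeps the two views consistent.
        self._outgoing[source].add(target)
        self._incoming[target].add(source)

    def links_from(self, source):
        return sorted(self._outgoing.get(source, ()))

    def links_to(self, target):
        return sorted(self._incoming.get(target, ()))


# Invented identifiers, for illustration only.
index = LinkIndex()
index.add_link("essay-a", "diary-1862")
index.add_link("essay-b", "diary-1862")
index.add_link("essay-a", "letter-17")

print(index.links_from("essay-a"))     # what essay-a cites
print(index.links_to("diary-1862"))    # who cites diary-1862
```

Keeping the reverse view is what lets a reader start from a primary source and find the commentary that points at it, rather than only following citations forward.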
Option A: Link to resource. Low cost; lower utility; changing URLs prevent access; loss-of-control issues. Option B: License resource. High cost (royalties); loading cost; high functionality; permanence in collection. "We won't license… it's on our site already and we don't want to lose the usage…"
Integration is unavoidable. Loosely held / Tightly held. Free websites / Loosely integrated / Tightly integrated. Refuse to license / License widely / License widely and be a licensor.
Building the network… Unhelpful: legal warnings not to link; changing links constantly; disabling links; no permanent URLs; no crawling; randomly changing URLs; insisting on one interface and one access point; unattached pages. Helpful: citations visible to the outside world; permanent URLs; RSS feeds; OpenURL; design for multiple interfaces; open to crawling; published APIs; welcome linking; ask others to do the same.
No silos! Search via ASP interface. Search via library catalog (direct or federated). OpenURL links directly to ASP content. OpenURL links directly to other databases. Search the web. Search the library catalog. E.g. RILM, Grove Music Online.
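As a rough illustration of how an OpenURL carries a citation to a link resolver, the sketch below builds a key/value query in the style of OpenURL 1.0 (Z39.88-2004) for a journal article. The resolver address `resolver.example.edu` and the helper name `build_openurl` are assumptions made for this example; a real library supplies its own resolver base URL.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical resolver address; not a real service.
RESOLVER_BASE = "https://resolver.example.edu/openurl"


def build_openurl(base, citation):
    """Return an OpenURL 1.0 key/value query string for a journal article."""
    params = {
        "url_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
    }
    # Each citation field is prefixed with "rft." (the referent described).
    for key, value in citation.items():
        params["rft." + key] = value
    return base + "?" + urlencode(params)


# Sample citation fields, for illustration only.
citation = {
    "atitle": "Are microbiological mechanisms relevant for the "
              "development of atherosclerosis?",
    "jtitle": "Clin Immunol",
    "volume": "90",
    "spage": "153",
    "epage": "156",
    "date": "1999",
}

url = build_openurl(RESOLVER_BASE, citation)
print(url)

# Round trip: the resolver side can recover the same fields from the query.
fields = parse_qs(urlsplit(url).query)
print(fields["rft.jtitle"], fields["rft.volume"])
```

Because the link describes the item rather than naming one fixed location, the same URL pattern can be pointed at whichever database a given library licenses.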
Result of an external search
Unlock the value… Books; Search Tool; Discipline Tool; Personal Work Bench; Music; Newspapers; Films; Images; Grants; Memoirs. Place: Alexandria, Virginia
Free index to first-person narratives: The most comprehensive archive of social history yet created. Perform in-depth field and keyword searches across scholarly materials that are freely available on the Web. Also indexes for-fee letters, diaries, oral histories, memoirs, and autobiographies within Alexander Street Press databases. Access thousands of personal narratives from the English-speaking world in a single search.
OPAC or Google. Students of the Sixties; Hogan Jazz Archive. Free: public Web collections, semantically indexed by Alexander Street. ($): Alexander Street collections containing first-person materials. Free: materials submitted by the user community.
Context and selection
Search Power
Organized Results
Adding value
Electronic publishing… Not just reproducing paper… Electronic versions are often deliberately changed to improve performance: reduced color and clarity for faster display; changed size for improved screen display; transcriptions for increased searchability; citations for improved searching; mark-up interspersed with the text itself. The connection is more important than the object.
Digital Surrogates… Black & white; grayscale; 24-bit color; 48-bit color; 100 dpi; 600 dpi; JPG; TIFF; citation; MARC record; dirty OCR; % rekeying; semantic indexing; thumbnails; 100 dpi; page; collection; letter; facsimiles; transcriptions; EAD finding aid; repository; Mobile Web; TCP/IP
Mixing text and mark-up… Printed reference: Xitztum JL, Galinski W: Are microbiological mechanisms relevant for the development of atherosclerosis? Clin Immunol, 1999. Example 1: Xitztum JL, Galinski W: Are microbiological mechanisms relevant for the development of atherosclerosis? Clin Immunol 90:153–156, 1999. Example 2: Xitztum JL Galinski W Are microbiological mechanisms relevant for the development of atherosclerosis? Clin Immunol
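The idea of mark-up interspersed with the text can be sketched with a citation in which each part is wrapped in a labeled element. The element names below (`citation`, `author`, `title`, `journal`, `volume`, `pages`, `year`) are invented for this sketch; the tags on the original slide did not survive, so this is not a reconstruction of them.

```python
import xml.etree.ElementTree as ET

# Hypothetical element names; not taken from the slide or from any real DTD.
MARKED_UP = (
    "<citation>"
    "<author>Xitztum JL</author>, <author>Galinski W</author>: "
    "<title>Are microbiological mechanisms relevant for the development "
    "of atherosclerosis?</title> "
    "<journal>Clin Immunol</journal> <volume>90</volume>:"
    "<pages>153-156</pages>, <year>1999</year>"
    "</citation>"
)

root = ET.fromstring(MARKED_UP)

# Dropping the tags and keeping the text recovers the printed form.
printed = "".join(root.itertext())

# The same string also answers field-level questions directly,
# because every part of the citation is labeled.
authors = [a.text for a in root.findall("author")]
year = int(root.findtext("year"))

print(printed)
print(authors, year)
```

The punctuation lives between the elements, so one source serves both display (read straight through) and searching (read field by field).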
Evaluating different formats (table). Formats compared: XML, SGML, HTML, PDF, TIFF (page images), native format. Criteria: relative cost per page, interchange, enforcing standards, component reuse, searching, re-purposing, distributing. Rating scale: Excellent, Very Good, Good, Limited, None. Source: Don Bridges, presentation, 48th Annual STC Conference, May
Functionality vs. Preservation (chart). Axis labels: Functionality; Level of Mark-up (Low to High). Formats plotted: TIFF, VRML, Flash, JPG, ASCII, XML. Cost labels: 8 cents/page, $1.50/page, $10/image.
Semantic Indexing: Collection; Series; Book or Volume; Chapter; Page; Word. Where? When? What? Who? Traditional indexing > Semantic indexing >
The real world: Play; Author; Production Stills; Playbills; Production; Venue; Director; Lighting; Set Designers; Theater; Performance; Location; Production Company; Producer; Texts; Criticism; Cast List; Performers; Posters; Ephemera; Scenes; Acts; Characters; Dramatis Personae
The virtual world… Author: Birth date, Death date, Birth Place, Death Place, Nationality, Occupation, Awards (38 fields). Theater: District, Location, Capacity, Style, etc. (18 fields). Company: Name, Productions, Performers, etc. (14 fields). Production: Director, Theater, Cast, # of Perfs., Lighting, Costumes, etc. (47 fields). Characters: Plays, Age, Author, Performer, etc. (30 fields). Scenes: Where, When, Setting, Subject, etc. (41 fields). Resources: Play, Director, Theater, Production Co., Character, Scene, etc. (45 fields). Texts: Keyword, Author, Date Written, Date Published, Production (67 fields).
Traditional vs. Semantic Indexing
Question | Traditional Indexing (General) | Semantic Indexing: History | Semantic Indexing: Drama | Semantic Indexing: Religion
What? | Article, Book | Event | Scene | Passage
Who? | Author | Participants | Characters | Author
Where? | Where published | Where occurred | Where set | -
When? | When published | When it happened | When set | When written
Example query | Give me articles from journal xxx prior to 1990. | Give me documents that discuss battles where more than 100 people were killed. | Give me all scenes set before 1850 that portray lynching. | Which authors cite Genesis most frequently?
Semantic Indexing: Identify and divide texts into content elements (e.g. letter, diary entry…). Identify key concepts for these elements (e.g. authors, sources, battles, encounters…). Index both elements and associated concepts. Integrate to form a cohesive whole. Unique ways of browsing through concepts. Unique ways to ask questions.
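The semantic indexing steps (divide texts into elements, attach concepts, index both) can be sketched in a few lines, using the deck's own sample question about scenes set before 1850 that portray lynching. Everything here is an assumption for illustration: the `Element` fields, the function names, and the three sample records are invented and do not reflect the actual schema or data.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """One content element (here a scene) with its semantic fields."""
    element_id: str
    kind: str       # what: "scene", "letter", "diary entry", ...
    year_set: int   # when the element is set
    place: str      # where it is set
    subjects: set = field(default_factory=set)


# Invented sample records, for illustration only.
ELEMENTS = [
    Element("play1-act2-sc1", "scene", 1845, "Georgia", {"lynching", "trial"}),
    Element("play2-act1-sc3", "scene", 1920, "Chicago", {"lynching"}),
    Element("play3-act1-sc1", "scene", 1830, "Boston", {"courtship"}),
]


def build_subject_index(elements):
    """Map each subject term to the ids of the elements tagged with it."""
    index = {}
    for el in elements:
        for subject in el.subjects:
            index.setdefault(subject, set()).add(el.element_id)
    return index


def scenes_before(elements, index, subject, year):
    """Scenes tagged with `subject` that are set before `year`."""
    ids = index.get(subject, set())
    return sorted(
        el.element_id
        for el in elements
        if el.element_id in ids and el.kind == "scene" and el.year_set < year
    )


subject_index = build_subject_index(ELEMENTS)
print(scenes_before(ELEMENTS, subject_index, "lynching", 1850))
```

The query works only because the unit indexed is the scene, with its own "when set" and subject fields, rather than the book or volume that contains it.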
Tables of Contents
Search functionality
Web 2.0 – what relevance?
Participation: Playlists on ASP's music products: 19,000 users. Over 100,000 playlists created so far: 800 created by ASP; 38,000 user-created; 70,000 derivative playlists.
Playlists
Never ending value… Fading / Growing: Typesetting; Printing; Print monograph; Print directory; Public domain reprints; Simple, one database search; Rare and unpublished material; Linking; Licensing; Free materials; Semantic indexing; Process integration; Unified search software; Workflow tools; Warehousing; Community building; Asset management. Commissioning? Editorial? Quality? Selection? Interactions?