Content Model Workstream phase 1
Business transformation goals
  Efficient content updating and publishing
  New business opportunities enabled
  Efficient location of internal content
  Fine-grained business analytics
Phases of the content modelling workstream
  phase 1: creating the model
  phase 2: testing and refining the model
  phase 3: handing over the model to DK
  phase 4: migrating content to the new model
Phase 1 deliverables
  A model of how DK content can be structured as content objects
  Ontologies, which support content search and reuse
  A prototype to demonstrate content objects and ontologies
What are content objects?
  Reusable "chunks" of meaningful content, which can be combined to create products
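A minimal sketch of the idea, not the DK model itself: content objects are held once in a library and products are assembled by referring to them. The class name, the ids (co-1, co-2), the product titles and the placeholder text are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContentObject:
    """A reusable chunk of content (all field names are assumptions)."""
    object_id: str
    title: str
    body: str


def assemble(product_title, object_ids, library):
    """Build a product as an ordered list of content objects looked up by id."""
    return {"title": product_title, "parts": [library[i] for i in object_ids]}


# Hypothetical library: each chunk is stored exactly once.
library = {
    "co-1": ContentObject("co-1", "National Gallery", "Placeholder text."),
    "co-2": ContentObject("co-2", "Trafalgar Square", "Placeholder text."),
}

# Two hypothetical products that reuse the same chunk.
guide = assemble("City guide", ["co-1", "co-2"], library)
walk = assemble("Walking tour", ["co-2"], library)
```

Both products point at the same stored object, so an update to a chunk is made in one place.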
Book structure
XML structure
Selecting content
DocBook Assembly
  DocBook’s method for topic-based authoring
  Can be chunked at any level (a whole chapter or a paragraph)
  Can be transformed from or to a traditional book structure
DocBook Toolkit
  Out-of-the-box production of PDF, HTML, EPUB, slides etc. (good for proofing)
  Excellent tools support (Oxygen XML)
  Fully customizable: can layer own XSL on top
  Disassemble existing DK-Schema books using standard XSL
Introduction to Linked data
  Linked data is about facts
    fact: The National Gallery is an art gallery
    fact: The National Gallery is in Trafalgar Square
    fact: Trafalgar Square's nearest tube is Charing Cross
  Very simple facts
    Thing ... has some property ... value
    subject ... predicate ... object
  Facts are stored in a database called a triplestore
    Ask questions like: which cultural buildings have a nearest tube station of Charing Cross?
  Linked data uses the web
    http://sparql.kode1100.com/id/geonames/6944334
    http://sparql.kode1100.com/id/station/westminster
  Linked data stores are schema-less
    Got a new fact? Just dump it in the store
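The three facts on this slide can be sketched as subject/predicate/object tuples in plain Python. This is an illustration only, not how a real triplestore is implemented or queried; the ex: identifiers and predicate names are hypothetical stand-ins for full web URIs.

```python
# Each fact is a (subject, predicate, object) tuple.
# All "ex:" and "rdf:type" names here are illustrative assumptions.
triples = {
    ("ex:national-gallery", "ex:locatedIn", "ex:trafalgar-square"),
    ("ex:trafalgar-square", "ex:nearestTube", "ex:charing-cross"),
}

# Schema-less: a new kind of fact is simply added, no table to alter.
triples.add(("ex:national-gallery", "rdf:type", "ex:ArtGallery"))


def match(store, s=None, p=None, o=None):
    """Return facts matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


def galleries_near_tube(store, station):
    """Art galleries located in a place whose nearest tube is `station`."""
    served = {s for s, _, _ in match(store, p="ex:nearestTube", o=station)}
    galleries = {s for s, _, _ in match(store, p="rdf:type", o="ex:ArtGallery")}
    return sorted(
        s for s, _, o in match(store, p="ex:locatedIn")
        if s in galleries and o in served
    )


result = galleries_near_tube(triples, "ex:charing-cross")
```

The question is answered by chaining two simple pattern matches, which is roughly what a triplestore query language does on a larger scale.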
What facts might be useful to DK?
  Facts about real world things, e.g. National Gallery
    type, nearest tube, lat/long, wheelchair access, opening times, tours, events, ...
    usually called reference data
    Store them once: accessible to all titles and products
  Facts about content
    this content object is about the National Gallery, was written by ..., uses this image ..., which was taken by ..., which is rights cleared for the UK, is on page ... of the 2012 edition of the RG to London
Facts need a vocabulary
  That's what the ontology is: a vocabulary for writing down facts
  The ontology is divided into a number of separate modules
  Current ontology modules: assembly, asset, attraction, book, content, location, product, transport, travel-product, travel, web
  Extensible controlled vocabularies: book-categories, brands, series, travel-content, travel-themes
The Attractions Module
  Built Location
    Cultural Location
      Gallery
      Museum
  Natural Location
    Beach
    Waterfall
  Event Attraction
  Activity Attraction
    Horse Riding
    Hiking
  Journey Attraction
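A small sketch of why a class hierarchy helps search: a query for a broad class can also find things tagged with a narrower one. The class names come from the slide; the parent-child grouping is an assumption read from the order in which the classes appear on the slide builds, not a confirmed part of the ontology.

```python
# child class -> parent class (grouping is an assumption, see above)
SUBCLASS_OF = {
    "Gallery": "Cultural Location",
    "Museum": "Cultural Location",
    "Cultural Location": "Built Location",
    "Beach": "Natural Location",
    "Waterfall": "Natural Location",
    "Horse Riding": "Activity Attraction",
    "Hiking": "Activity Attraction",
}


def ancestors(cls):
    """Walk up the hierarchy and return every broader class, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_a(cls, target):
    """True if `cls` is `target` or a narrower class beneath it."""
    return cls == target or target in ancestors(cls)


gallery_chain = ancestors("Gallery")
```

With this, something typed only as Gallery still answers a search for Built Location.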
Not limited to one vocabulary
  Different people use different vocabularies to describe the same thing
    to the geek: AggregateContentObject
    to the API developer: an article
    to the editor: Breakout Box
    to the web publisher: a web page
  [Diagram labels: Core Content Assembly Web Book RG EWG Content]
How is this used in the prototype?
  [Diagram: content metadata linked to reference data. Node labels: Content, National Gallery, Gallery, Charing Cross, Leicester Square, Embankment. Property labels: type, subject, hasPart, nearestMetro, representativePoint, title (National Gallery), lat 51.50872, long -0.12841.]
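One possible reading of this slide, sketched in plain Python: content metadata points at a reference-data entry through a subject property, and the reference data carries type, title, nearest metro and coordinates. The property names and coordinate values are taken from the slide; which node each property connects, and the identifiers content:1, place:..., station:... and point:1, are assumptions made for illustration.

```python
# (subject, property, value) facts; the wiring between nodes is assumed.
facts = [
    # content metadata
    ("content:1", "subject", "place:national-gallery"),
    # reference data
    ("place:national-gallery", "type", "Gallery"),
    ("place:national-gallery", "title", "National Gallery"),
    ("place:national-gallery", "nearestMetro", "station:charing-cross"),
    ("place:national-gallery", "representativePoint", "point:1"),
    ("point:1", "lat", 51.50872),
    ("point:1", "long", -0.12841),
]


def objects(subject, prop):
    """Values of `prop` for `subject`."""
    return [o for s, p, o in facts if s == subject and p == prop]


def subjects(prop, value):
    """Things whose `prop` equals `value`."""
    return [s for s, p, o in facts if p == prop and o == value]


def content_near(station):
    """Content whose subject has `station` as its nearest metro."""
    places = subjects("nearestMetro", station)
    return sorted(c for place in places for c in subjects("subject", place))


def coordinates(place):
    """(lat, long) of the place's representative point."""
    point = objects(place, "representativePoint")[0]
    return objects(point, "lat")[0], objects(point, "long")[0]


nearby = content_near("station:charing-cross")
```

The content itself stores only the link to the place; everything else is looked up in the shared reference data.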
Prototype
  Wanted an environment for trying things out before the CMS system is ready
  Purely experimental
  [Architecture diagram labels: Web Server, Developer's UI, Web API, Query, content, facts, XML Database, a triple store, munge, original]
Demo
Questions
Next steps
  Requirements from February 18th
    What does the content model need to do?
  Phase 2 throughout March
    Creating products from content objects
    Searching for content objects
    Modelling more domains
  Phase 3 beginning in April?
    Handover to DK
  Migration beginning in April?