Published by Byron Francis. Modified over 9 years ago.
1
Harpers.org: a Semantic Web(ish) site for Harper’s Magazine Paul Ford Associate Web Editor, Harpers.org ford@harpers.org
2
Harper’s is… A magazine of literature, politics, culture, and the arts published continuously from 1850 A small non-profit
3
Available content: The Weekly Review, an emailed summary of world events, from 2000; The Harper’s Index, a statistical portrait of the world, from 1998; public domain, scanned-in archives from 1850-1982; Readings; occasional features
4
And that’s it. Maybe full text of issues will be offered someday, but not soon. So… How do we get more value out of limited content?
5
Solution: Hack up what we have into bits by content type, then… Reassemble it according to link targets… Which are arranged in a taxonomy… Creating a very small “Semantic Web” for Harpers.org
6
A quick demo…
7
How it works: a simple set of ontological relationships (partOf, supervisorOf); a taxonomy of content; and narrative content that is split into smaller pieces and linked into the taxonomy
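As a rough illustration of the idea on this slide, a taxonomy built from partOf relationships can be held as plain (subject, predicate, object) triples and walked upward to find every broader topic a link target belongs to. This is a minimal sketch, not the site's actual data model: the place names and the `ancestors` helper are assumptions made for the example.

```python
# Hypothetical taxonomy entries, stored as (subject, predicate, object) triples.
# The names are illustrative only; they are not taken from the Harpers.org data.
TRIPLES = [
    ("Baghdad", "partOf", "Iraq"),
    ("Iraq", "partOf", "Middle East"),
    ("Middle East", "partOf", "World"),
]


def ancestors(node, triples, predicate="partOf"):
    """Return every topic reachable from `node` by following `predicate` upward."""
    # Index child -> list of parents for the chosen predicate.
    parents = {}
    for subject, pred, obj in triples:
        if pred == predicate:
            parents.setdefault(subject, []).append(obj)

    found = []
    stack = [node]
    while stack:
        current = stack.pop()
        for parent in parents.get(current, []):
            if parent not in found:  # also guards against cycles
                found.append(parent)
                stack.append(parent)
    return found


print(ancestors("Baghdad", TRIPLES))  # ['Iraq', 'Middle East', 'World']
```

With a structure like this, an event linked to the narrowest topic can also be listed on the pages of each broader topic.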
8
Markup Text: “Country Y announced that it had cut off relations with country Z. On Wednesday, something happened to persons X and Y.”
9
Markup Country Y announced that it had cut off relations with country Z. On Wednesday, something happened to persons W and X.
10
Markup Country Y announced that it had cut off relations with country Z.
11
Markup Country Y announced that it had cut off relations with country Z.
12
Conditionals: Some text required conditional markup. Text: “Country Y announced that it had cut off relations with country Z, and on Wednesday, something happened to persons X and Y.”
13
Conditionals: ugly, but simple Country Y announced that it had cut off relations with country Z, and. on On o n Wednesday, something happened to persons X and Y.
14
Conditionals: ugly, but simple. Narrative version: “Country Y announced that it had cut off relations with country Z, and on Wednesday, something happened to persons X and Y.” Timeline-friendly version: “Country Y announced that it had cut off relations with country Z. On Wednesday, something happened to persons X and Y.”
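The original XML on these slides did not survive in this transcript, so the following is only a sketch of how conditional markup of this kind could produce both versions from one source. The element name `if` and the attribute `view` are hypothetical, and Python's standard `xml.etree.ElementTree` stands in for the XSLT the site used.

```python
import xml.etree.ElementTree as ET

# One source sentence with two mutually exclusive spans.
# <if view="..."> is an invented convention for this example.
SOURCE = (
    "<events>Country Y announced that it had cut off relations with country Z"
    '<if view="narrative">, and on</if>'
    '<if view="timeline">. On</if>'
    " Wednesday, something happened to persons X and Y.</events>"
)


def render(xml_text, view):
    """Flatten the markup, keeping only the conditional spans that match `view`."""
    root = ET.fromstring(xml_text)
    parts = [root.text or ""]
    for child in root:
        if child.tag == "if" and child.get("view") == view:
            parts.append(child.text or "")
        # Text following a child element belongs to every version.
        parts.append(child.tail or "")
    return "".join(parts)


print(render(SOURCE, "narrative"))
print(render(SOURCE, "timeline"))
```

The sketch handles one level of nesting only, which is enough for the sentence shown here.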
15
All of it gets slurped up And turned into a set of triples Then processed in-memory With HTML pages spit out as a result
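The pipeline on this slide can be sketched in a few lines: marked-up events become triples, the triples are grouped in memory by link target, and one HTML page is emitted per target. The event records, the predicate names `text` and `mentions`, and both function names are assumptions for illustration. The real site did this with XSLT 2.0.

```python
from collections import defaultdict
from html import escape

# Hypothetical parsed events: each has an id, its text, and its link targets.
EVENTS = [
    {
        "id": "e1",
        "text": "Country Y announced that it had cut off relations with country Z.",
        "topics": ["Country Y", "Country Z"],
    },
    {
        "id": "e2",
        "text": "On Wednesday, something happened to persons X and Y.",
        "topics": ["Person X", "Person Y"],
    },
]


def to_triples(events):
    """Turn event records into (subject, predicate, object) triples."""
    triples = []
    for event in events:
        triples.append((event["id"], "text", event["text"]))
        for topic in event["topics"]:
            triples.append((event["id"], "mentions", topic))
    return triples


def build_pages(triples):
    """Group events by topic in memory and return {topic: html_fragment}."""
    texts = {s: o for s, p, o in triples if p == "text"}
    by_topic = defaultdict(list)
    for s, p, o in triples:
        if p == "mentions":
            by_topic[o].append(s)

    pages = {}
    for topic, event_ids in by_topic.items():
        items = "".join(f"<li>{escape(texts[i])}</li>" for i in event_ids)
        pages[topic] = f"<h1>{escape(topic)}</h1><ul>{items}</ul>"
    return pages


pages = build_pages(to_triples(EVENTS))
print(sorted(pages))
```

Because each event appears on the page of every topic it mentions, a small number of source documents fans out into a larger number of generated pages.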
16
Hard, then easy Hard to get started (lots of events, facts, and links) Easy to keep going, if you don’t mind the markup and use a good text editor
17
Tools used: Emacs, vi, BBEdit; XSLT 2.0 (Saxon); CVS
18
Why not RDF? Not right for redundant content and conditionals. Easy enough to transform arbitrary structured XML into RDF with XSLT, as needed (or into RSS 1.0, RSS 2.0, Atom, etc.)
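To show how small the step from internal triples to RDF can be, here is a sketch that serializes (subject, predicate, object) tuples as N-Triples. The slide describes doing this with XSLT, so the Python here is a stand-in. The base URI `http://example.org/`, the choice of which predicates carry literal values, and the function names are all assumptions.

```python
from urllib.parse import quote

BASE = "http://example.org/"  # placeholder namespace, not a real Harpers.org URI


def uri(name):
    """Wrap a local name as an absolute URI reference in N-Triples syntax."""
    return f"<{BASE}{quote(name)}>"


def literal(text):
    """Quote a plain string literal, escaping backslashes, quotes and newlines."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
    return f'"{escaped}"'


def to_ntriples(triples, literal_predicates=("text",)):
    """Serialize tuples as N-Triples, one statement per line."""
    lines = []
    for subject, predicate, obj in triples:
        value = literal(obj) if predicate in literal_predicates else uri(obj)
        lines.append(f"{uri(subject)} {uri(predicate)} {value} .")
    return "\n".join(lines)


print(to_ntriples([("Iraq", "partOf", "Middle East")]))
```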
19
For free… From 300 individual pages… To 1100 pages of “remixed” content – all unique and relevant And Google-friendly
20
And also for free… Semantically relevant in-site advertising, if we want it Topic-sorted, reusable content Permanent, readable URIs
21
Do people get it? Some do, and others just navigate the site as usual Harper’s was fine with the learning curve “Odd but useful” – Gawker
22
Results Uptick in traffic and subscription revenues Low cost of maintenance Ever-increasing database of facts and events – adding one Weekly Review adds value to 50 different pages Happy client
23
Why the SemWeb(ish) framework? Leaves plenty of room to grow: web-only content, full text of issues, subscriber services, etc. Take advantage of new SemWeb tools. Incorporate RDF sources into the taxonomy. Anticipate Semantic Web browsers.
24
Next?
25
Make it pretty Redesign Hide some of the navigation Turn links on and off
26
Make it scale: Currently maxes out at about 20-30 MB of content, due to limits of the in-memory DOM representation (10-12x the XML document size). Use a publicly available storage layer (Kowari, Jena, etc.). Go triple-crazy.
27
Make it easy to query and navigate: “Show me everything related to George Bush and Iraq.” or “Show me everything related to politicians and the Middle East.” New navigation?
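Both example queries can be answered by the same mechanism if each event's link targets are expanded through the taxonomy before matching. The sketch below assumes a simple child-to-parent taxonomy with no cycles. The event ids, the taxonomy entries, and the function names are invented for illustration.

```python
# Hypothetical taxonomy: each topic maps to its broader topic (assumed acyclic).
TAXONOMY = {
    "George Bush": "Politicians",
    "Iraq": "Middle East",
}

# Hypothetical events and the topics they link to directly.
EVENTS = {
    "event-1": {"George Bush", "Iraq"},
    "event-2": {"Iraq"},
    "event-3": {"George Bush"},
}


def expand(topic):
    """Return the topic together with all of its broader topics."""
    chain = [topic]
    while chain[-1] in TAXONOMY:
        chain.append(TAXONOMY[chain[-1]])
    return set(chain)


def related_to(*wanted):
    """Return ids of events that touch every requested topic, directly or via the taxonomy."""
    hits = []
    for event_id, topics in sorted(EVENTS.items()):
        covered = set().union(*(expand(t) for t in topics))
        if all(w in covered for w in wanted):
            hits.append(event_id)
    return hits


print(related_to("George Bush", "Iraq"))          # ['event-1']
print(related_to("Politicians", "Middle East"))   # ['event-1']
```

The second query matches the same event as the first only because the taxonomy links each specific topic to its broader one.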