
Template-based Authoring Knowledge Systems Laboratory Stanford.


1 Template-based Authoring Knowledge Systems Laboratory Stanford

2 Project Goals
  Assist analyst in everyday work
  Knowledge Authoring Tools to assist in:
    Research for reports
    Produce reports
    Consume reports
    Share reports
  Our solution: Semantic Web Templates

3 Semantic Web Templates
  Knowledge Representation, Semantics are key for information exchange
  Creation, maintenance of knowledge must be transparent
  Automate extraction of knowledge
  Enhance knowledge retrieval methods

4 Semantic Web Templates
  Similar to MS Word Templates
    Different templates for different tasks
    Word templates can have restrictions on text
      Very primitive, such as length of text
      Simplistic patterns such as “phone number”
      No concepts such as “color” or “country”
    One template, many documents
  HTML templates are very common today
    Many web sites use SQL database as back end, template + SQL → HTML
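
The "template + SQL → HTML" pattern mentioned on this slide can be sketched as follows. This is a minimal illustration, not the system described in the talk: the `country` table, its columns, and the two template strings are assumptions made up for the example.

```python
import sqlite3
from html import escape
from string import Template

# Hypothetical templates; the $placeholders are filled from SQL rows.
ROW_TEMPLATE = Template("<li>$name: $capital</li>")
PAGE_TEMPLATE = Template("<html><body><ul>$rows</ul></body></html>")


def render_page(conn):
    """Render one HTML page from every row of the (hypothetical) country table."""
    cursor = conn.execute("SELECT name, capital FROM country ORDER BY name")
    rows = "".join(
        ROW_TEMPLATE.substitute(name=escape(name), capital=escape(capital))
        for name, capital in cursor
    )
    return PAGE_TEMPLATE.substitute(rows=rows)


def demo_connection():
    """Build an in-memory database with two illustrative rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE country (name TEXT, capital TEXT)")
    conn.executemany(
        "INSERT INTO country VALUES (?, ?)",
        [("France", "Paris"), ("Japan", "Tokyo")],
    )
    return conn
```

One template serves any number of rows, which is the "one template, many documents" idea in miniature.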

5 Semantic Web Templates
  An HTML file with additional tags
  Tags specify:
    Where particular knowledge is stated
    What kind of knowledge it is
    Where it came from, if applicable
    References to an entity or relation
    Repetitive regions of text
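
The slides do not show the actual tag syntax, so the sketch below assumes one: a `span` element with hypothetical `data-kind` and `data-source` attributes marking where a piece of knowledge is stated, what kind it is, and where it came from. It only illustrates how such annotations could be read back out of an HTML file.

```python
from html.parser import HTMLParser


class AnnotationParser(HTMLParser):
    """Collect text inside <span data-kind=...> annotations (assumed markup)."""

    def __init__(self):
        super().__init__()
        self.facts = []
        self._open = None  # annotation currently being read, if any

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "span" and "data-kind" in attributes:
            self._open = {
                "kind": attributes["data-kind"],
                "source": attributes.get("data-source") or "",
                "text": "",
            }

    def handle_data(self, data):
        if self._open is not None:
            self._open["text"] += data

    def handle_endtag(self, tag):
        if tag == "span" and self._open is not None:
            self.facts.append(self._open)
            self._open = None


def extract_facts(html_text):
    """Return one dict per annotated span found in the HTML text."""
    parser = AnnotationParser()
    parser.feed(html_text)
    parser.close()
    return parser.facts
```

The sketch ignores nested annotations and repeating regions, both of which a real template format would need to handle.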

6 Goal: Assist Research
  Unstructured Extraction
    Sort through buckets of data to find gold
    Entity recognition
    Relation recognition
  Semistructured Extraction
    Utilize repetitive patterns within a page
    Use similar pages to extract more data
    Robust despite changing pages, data

7 Unstructured Extraction
  Natural language processing
    News feeds
    Indexing, storage, retrieval
    Plugin architecture
    Web Services
  Our system, collaboration with IBM via NIMD
    Rover news crawler
    Political news articles from Yahoo!
    22,000 articles, ~8500 concepts, ~1000 relations
    Used in authoring tools

8 Unstructured Extraction
  Pattern based system
  Leverage “hints” for the reader in news articles
    British Prime Minister Tony Blair
    “Tony Blair” is a Prime Minister who represents the Country “England”.
  System runs daily on Yahoo political news
    Highlights known terms in green
    Highlights new terms in red
    Used to create search index, maintain KB
  Demo
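
A pattern of the kind this slide describes ("nationality, title, name") can be approximated with a regular expression. The sketch below is an assumption about how such a pattern might look, not the project's extractor; the lookup tables and the relation names `hasTitle` and `represents` are hypothetical.

```python
import re

# Hypothetical lookup tables; the British -> England entry mirrors the slide's example.
NATIONALITY_TO_COUNTRY = {"British": "England"}
TITLES = ["Prime Minister", "President"]

PATTERN = re.compile(
    r"\b(?P<nationality>{nat}) (?P<title>{titles}) "
    r"(?P<name>[A-Z][a-z]+(?: [A-Z][a-z]+)+)".format(
        nat="|".join(map(re.escape, NATIONALITY_TO_COUNTRY)),
        titles="|".join(map(re.escape, TITLES)),
    )
)


def extract_relations(text):
    """Return (subject, relation, object) triples found by the pattern."""
    triples = []
    for match in PATTERN.finditer(text):
        name = match.group("name")
        country = NATIONALITY_TO_COUNTRY[match.group("nationality")]
        triples.append((name, "hasTitle", match.group("title")))
        triples.append((name, "represents", country))
    return triples
```

A single phrase yields both an entity (the person) and two relations, which is why such "hints" are useful for maintaining a knowledge base.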

9 Semi-structured Extraction
  Extract, produce knowledge
  Initial model is Domain Authorities
    Enhance KB with ground facts
    Strong for relations and breadth of data
    Leverages work of others
    Makes use of SQL databases
  Future work is wide-scale web of trust

10 Semi-structured Extraction Site Registry By description and property CIA World Fact Book has data about items which are of type CIA World Fact Book has properties,,, etc. Demo

11 Semi-structured Extraction
  Publishing
    Human editing good for high-level concepts
    Automated techniques good for relations, ground level facts, and massive repetition
  Rover web crawler
    Template construction is currently manual
    With critical mass of data, templates could be discovered.

12 Enhanced Document Retrieval
  Enhanced document retrieval
    Search based on concept
    Find articles about…
      Membership: Scottie Pippen → Trailblazers
      Membership: Osama bin Laden → al-Qaeda
      Subgroups: Ramadan Shallah → Islamic Jihad → al-Qaeda
  Semantic search
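
Concept-based search over membership and subgroup relations can be sketched as a walk over stored facts. Everything below is illustrative: the relation names, the article index, and the `subgroupOf` fact are assumptions added for the example, and only the Scottie Pippen membership comes from the slide.

```python
# Hypothetical knowledge base of (subject, relation, object) facts.
FACTS = [
    ("Scottie Pippen", "memberOf", "Trailblazers"),  # from the slide
    ("Trailblazers", "subgroupOf", "NBA"),           # assumed for illustration
]

# Hypothetical index from article id to the entities it mentions.
ARTICLE_ENTITIES = {
    "article-1": {"Scottie Pippen"},
    "article-2": {"Browns Ferry"},
}


def groups_of(entity, facts):
    """All groups an entity belongs to, following memberOf/subgroupOf links."""
    found, frontier = set(), [entity]
    while frontier:
        current = frontier.pop()
        for subject, relation, obj in facts:
            if (subject == current
                    and relation in ("memberOf", "subgroupOf")
                    and obj not in found):
                found.add(obj)
                frontier.append(obj)
    return found


def articles_about_members(group, facts, index):
    """Ids of articles mentioning any entity that belongs to the group."""
    return sorted(
        article
        for article, entities in index.items()
        if any(group in groups_of(entity, facts) for entity in entities)
    )
```

Because subgroup links are followed transitively, a query for the larger group also finds articles that mention only a member of one of its subgroups.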

13 Enhanced Document Retrieval
  Document Augmentation
    Sidebar acts as glossary as you read
    Pre-fetch data user is likely to want
    Adapt to user preferences, activities
    Deeper understanding for user, gets answers to questions raised while reading

14 Enhanced Document Retrieval

15 Search Augmentation
  Google assumes users only want documents
  Provide answers along with documents
  Use query term denotation to more closely target results
    “Browns Ferry” is a garden park
    “Browns Ferry” is a nuclear power plant
  Automates what people do with IR systems
    Append hints about the type of term being sought
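
Appending a type hint to a query, as the last bullet describes, might look like the sketch below. The `DENOTATIONS` table and the function name are hypothetical; the two senses of "Browns Ferry" are the ones given on the slide.

```python
# Hypothetical knowledge base: each query term maps to its possible denotations.
DENOTATIONS = {
    "Browns Ferry": ["garden park", "nuclear power plant"],
}


def augment_query(term, chosen_type=None):
    """Return one augmented query string per candidate denotation of the term."""
    types = DENOTATIONS.get(term, [])
    if chosen_type is not None:
        # The user (or the current activity) has already picked a sense.
        types = [t for t in types if t == chosen_type]
    if not types:
        return [term]  # nothing known about the term, so search as typed
    return [f'"{term}" {t}' for t in types]
```

With no sense chosen, the caller gets one query per denotation and can present the results grouped by sense; once a sense is chosen, only that query is issued.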

16 Search Augmentation

17
  Demo: Basic Search
  Demo: Followup Data
  Demo: Disambiguation
  Demo: Relations

18 Basic Question Answering
  Automated techniques for ground facts
  Use reasoners for higher-level facts
  Tie in with KSL AQUAINT work
  Feedback, direction from user
  Structure of knowledge allows simple form of question answering

19 Basic Question Answering
  Multiple views into data
    Browse interface
      Ugly, but complete view
    Activity-based knowledge presentation
      Search, document augmentation
  Future work: accept user feedback, customization, preferred sources

20 Basic Question Answering
  Query by example
    Users create many similar documents
    These are targeted to an activity
    Use past work to speed present work
    User creates templates that present the data they find interesting in a way they find convenient

21 Query by Example


24 Goal: Produce Reports
  Most reports are made with Office
    Word processor, spreadsheet
  Enhance with semantic awareness
    Provide seamless access to knowledge
    Transparent maintenance, creation
    Low overhead of operation
  Avoid centralized approach
    Contrast with relational database

25 Word Processing
  Creation of new data
  Semantic scan
    Like spell check or grammar check
    Automatically identifies referenced entities
    Learns new entities, relations between entities
  Annotation of text
    User manually adjusts system
    User adds new data
    System gets smarter over time
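
A semantic scan in the spirit of a spell check could split the names found in a text into those already in the knowledge base and those that are new. The sketch below uses a deliberately crude assumption, that runs of capitalized words are entity candidates, so it will also flag ordinary sentence-initial words; the function name and regular expression are illustrative only.

```python
import re

# Crude candidate detector: one or more consecutive capitalized words.
CANDIDATE = re.compile(r"\b[A-Z][a-z]+(?: [A-Z][a-z]+)*\b")


def semantic_scan(text, known_entities):
    """Split candidate entity mentions into (known, new) lists, in text order."""
    known, new = [], []
    for match in CANDIDATE.finditer(text):
        phrase = match.group(0)
        if phrase in known_entities:
            known.append(phrase)
        else:
            new.append(phrase)
    return known, new
```

The `new` list corresponds to the terms a user would be asked to confirm or annotate, after which they can be added to `known_entities` so the next scan recognizes them.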

26 Word Processing
  Create data via entry into templates
  Create new templates
    For others
    For personal use
  Extend templates with new entry areas
  Enhance analyst’s view
    Semantic Search, Document Augmentation
    Sidebar boxes are templates too

27 Word Processing
  Demo: Semantic Scan
  Demo: Annotation
  Demo: Knowledge Creation

28 Spreadsheets
  Spreadsheets are key tools in analysis
    Tabular format, UI are both intuitive
    Sorting, basic math functions
  We add semantics:
    New formula type: “Get Data”
    New formula type: “Put Data”
    Summarization, new views

29 Spreadsheets
  Example scenario
    Suppose SARS was found to affect Asian-Americans more than others?
    Analyst wants to determine, based on that, which states are most at risk
    Knowledge from Census tells us Asian-American population as a percentage
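
The “Get Data” and “Put Data” formula types could behave like lookups and writes against a knowledge base keyed by entity and property. The sketch below works through the scenario under loud assumptions: the function names, the property name, the state names, and every percentage are placeholders invented for illustration, not Census figures and not the project's formula syntax.

```python
# Hypothetical knowledge base; states and percentages are placeholders,
# not Census data.
KNOWLEDGE_BASE = {
    ("State A", "asianAmericanPopulationPercent"): 4.0,
    ("State B", "asianAmericanPopulationPercent"): 12.5,
    ("State C", "asianAmericanPopulationPercent"): 1.5,
}


def get_data(entity, prop, kb=KNOWLEDGE_BASE):
    """Stand-in for a "Get Data" cell formula: look up one property value."""
    return kb.get((entity, prop))


def put_data(entity, prop, value, kb=KNOWLEDGE_BASE):
    """Stand-in for a "Put Data" cell formula: write a value back to the KB."""
    kb[(entity, prop)] = value


def rank_by_property(entities, prop, kb=KNOWLEDGE_BASE):
    """Fill a column via get_data, then sort rows from highest to lowest."""
    rows = [(entity, get_data(entity, prop, kb)) for entity in entities]
    rows = [row for row in rows if row[1] is not None]  # skip unknown cells
    return sorted(rows, key=lambda row: row[1], reverse=True)
```

Ranking the states by the looked-up percentage is the spreadsheet equivalent of filling a column with “Get Data” formulas and sorting on it.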

30 Spreadsheets


36 Goal: Consume Reports
  Verify others’ data against yours
  Incorporate others’ results into your knowledge base, track sources
  Maintain data
    Change notification
    Document updates with new data
    Versioning of documents, data
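
Verifying incoming data against your own while tracking sources can be sketched as a merge that records provenance and reports disagreements instead of overwriting. The data layout (facts keyed by entity and property, each stored with its source) and the example values are assumptions for illustration.

```python
def merge_report(own, incoming, source):
    """Merge incoming facts into a copy of own; return (merged, conflicts).

    own maps (entity, property) -> (value, source).
    incoming maps (entity, property) -> value, all attributed to `source`.
    """
    merged = dict(own)
    conflicts = []
    for key, value in incoming.items():
        if key not in merged:
            merged[key] = (value, source)  # new fact, remember where it came from
        elif merged[key][0] != value:
            # Keep our own value and report the disagreement for review.
            conflicts.append((key, merged[key], (value, source)))
    return merged, conflicts
```

Keeping the existing value on conflict is one possible policy; a system with trust levels could instead prefer whichever source is rated more reliable.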

37 Goal: Share Reports
  Easily exchangeable via e-mail
  Truth maintenance techniques
  Multiple views into data
  Leverage domain expertise
    The missile guy has a KB, …
  Collaboration, trust levels
    Colleagues disagree, sources are unreliable

38 Conclusion
  KD-D effort is focused on authoring, analysis tasks
  Leverage automated techniques to complement manual techniques
  System gets smarter as it’s used
  Tie in with commonly used applications

