Published by Nicholas McDowell. Modified over 9 years ago.
1
Template-based Authoring
Knowledge Systems Laboratory, Stanford
2
Project Goals
- Assist analyst in everyday work
- Knowledge authoring tools to assist in:
  - Research for reports
  - Produce reports
  - Consume reports
  - Share reports
- Our solution: Semantic Web Templates
3
Semantic Web Templates
- Knowledge representation and semantics are key for information exchange
- Creation and maintenance of knowledge must be transparent
- Automate extraction of knowledge
- Enhance knowledge retrieval methods
4
Semantic Web Templates
- Similar to MS Word templates
  - Different templates for different tasks
  - Word templates can have restrictions on text
    - Very primitive, such as length of text
    - Simplistic patterns such as “phone number”
    - No concepts such as “color” or “country”
  - One template, many documents
- HTML templates are very common today
  - Many web sites use a SQL database as back end: template + SQL → HTML
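The "template + SQL" pattern mentioned on this slide can be sketched in a few lines of Python. This is only an illustration of the general idea, not the system described in the talk: the `country` table, its columns, and the page template are all made up for the example.

```python
import sqlite3
from string import Template

# Hypothetical page template: one template, many documents.
PAGE = Template("<h1>$name</h1><p>Capital: $capital</p>")


def render_pages(conn):
    """Fill the template once per row returned by the SQL query."""
    rows = conn.execute("SELECT name, capital FROM country ORDER BY name")
    return [PAGE.substitute(name=name, capital=capital) for name, capital in rows]


# In-memory database standing in for a web site's SQL back end.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE country (name TEXT, capital TEXT)")
conn.executemany(
    "INSERT INTO country VALUES (?, ?)",
    [("France", "Paris"), ("Japan", "Tokyo")],
)

pages = render_pages(conn)
for page in pages:
    print(page)
```

Each row yields one HTML document from the same template, which is the "one template, many documents" property the slide attributes to Word templates as well.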
5
Semantic Web Templates
- An HTML file with additional tags
- Tags specify:
  - Where particular knowledge is stated
  - What kind of knowledge it is
  - Where it came from, if applicable
  - References to an entity or relation
  - Repetitive regions of text
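The slides do not show the actual tag vocabulary, so the following Python sketch uses invented `data-kind` and `data-source` attributes on `<span>` elements purely to illustrate how tags can mark where a piece of knowledge is stated, what kind it is, and where it came from. It relies only on the standard library's `html.parser` and does not handle nested annotated spans.

```python
from html.parser import HTMLParser


class KnowledgeTagParser(HTMLParser):
    """Collect text inside <span> elements carrying a hypothetical data-kind attribute."""

    def __init__(self):
        super().__init__()
        self.facts = []
        self._current = None  # annotation currently being read, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "span" and "data-kind" in attrs:
            self._current = {
                "kind": attrs["data-kind"],          # what kind of knowledge it is
                "source": attrs.get("data-source"),  # where it came from, if given
                "text": "",
            }

    def handle_data(self, data):
        if self._current is not None:
            self._current["text"] += data

    def handle_endtag(self, tag):
        if tag == "span" and self._current is not None:
            self.facts.append(self._current)
            self._current = None


# Hypothetical annotated template fragment.
template = (
    '<p>The capital of <span data-kind="country" data-source="example-source">France</span> '
    'is <span data-kind="city">Paris</span>.</p>'
)

parser = KnowledgeTagParser()
parser.feed(template)
parser.close()
facts = parser.facts
print(facts)
```

Because the annotations live inside ordinary HTML, the same file still renders as a normal page while also exposing its statements to a program.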
6
Goal: Assist Research
- Unstructured Extraction
  - Sort through buckets of data to find gold
  - Entity recognition
  - Relation recognition
- Semistructured Extraction
  - Utilize repetitive patterns within a page
  - Use similar pages to extract more data
  - Robust despite changing pages, data
7
Unstructured Extraction
- Natural language processing
- News feeds
- Indexing, storage, retrieval
- Plugin architecture
- Web Services
- Our system, collaboration with IBM via NIMD
- Rover news crawler
- Political news articles from Yahoo!
- 22,000 articles, ~8500 concepts, ~1000 relations
- Used in authoring tools
8
Unstructured Extraction
- Pattern-based system
- Leverage “hints” for the reader in news articles
  - British Prime Minister Tony Blair
  - “Tony Blair” is a Prime Minister who represents the Country “England”.
- System runs daily on Yahoo political news
- Highlights known terms in green
- Highlights new terms in red
- Used to create search index, maintain KB
- Demo
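As a rough illustration of the "hints" idea, the phrase "British Prime Minister Tony Blair" follows a nationality + title + name pattern that a regular expression can pick up. The sketch below is an assumption about how such a pattern might look, not the pattern set used by the system in the talk; the title list and the two-word name shape are deliberately simplistic, and mapping the nationality adjective to a country entity would need a lookup table that is not shown.

```python
import re

# Hypothetical appositive pattern: <Nationality> <Title> <First Last>.
APPOSITIVE = re.compile(
    r"\b(?P<nationality>[A-Z][a-z]+) "
    r"(?P<title>Prime Minister|President) "
    r"(?P<name>[A-Z][a-z]+ [A-Z][a-z]+)"
)


def extract_roles(text):
    """Return (name, title, nationality) tuples for every match in the text."""
    return [
        (m["name"], m["title"], m["nationality"])
        for m in APPOSITIVE.finditer(text)
    ]


roles = extract_roles("British Prime Minister Tony Blair spoke on Tuesday.")
print(roles)
```

Each tuple gives an entity (the name), a concept it belongs to (the title), and a hint about the related country, which is the kind of fact the slide derives from the same phrase.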
9
Semi-structured Extraction
- Extract, produce knowledge
- Initial model is Domain Authorities
- Enhance KB with ground facts
- Strong for relations and breadth of data
- Leverages work of others
- Makes use of SQL databases
- Future work is wide-scale web of trust
10
Semi-structured Extraction
- Site Registry
  - By description and property
  - CIA World Fact Book has data about items which are of type
  - CIA World Fact Book has properties,,, etc.
- Demo
11
Semi-structured Extraction
- Publishing
- Human editing good for high-level concepts
- Automated techniques good for relations, ground-level facts, and massive repetition
- Rover web crawler
- Template construction is currently manual
- With critical mass of data, templates could be discovered.
12
Enhanced Document Retrieval
- Enhanced document retrieval
- Search based on concept
- Find articles about…
  - Membership: Scottie Pippen Trailblazers
  - Membership: Osama bin Laden al-Qaeda
  - Subgroups: Ramadan Shallah Islamic Jihad al-Qaeda
- Semantic search
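Searching by concept rather than by keyword can be pictured as indexing documents under the relations they mention. The following Python sketch is a minimal, assumed design: the `ConceptIndex` class and the document ids are hypothetical, and the relation instances reuse the Membership examples from the slide.

```python
from collections import defaultdict


class ConceptIndex:
    """Map (relation, subject, object) triples to the documents that state them."""

    def __init__(self):
        self._docs = defaultdict(set)

    def add(self, doc_id, relation, subject, obj):
        self._docs[(relation, subject, obj)].add(doc_id)

    def search(self, relation, subject=None, obj=None):
        """Find documents about a relation; None acts as a wildcard."""
        hits = set()
        for (rel, subj, ob), docs in self._docs.items():
            if rel == relation and subject in (None, subj) and obj in (None, ob):
                hits |= docs
        return sorted(hits)


index = ConceptIndex()
# Hypothetical document ids; the relation instances come from the slide.
index.add("doc-1", "Membership", "Scottie Pippen", "Trailblazers")
index.add("doc-2", "Membership", "Osama bin Laden", "al-Qaeda")
index.add("doc-3", "Membership", "Scottie Pippen", "Trailblazers")

print(index.search("Membership", subject="Scottie Pippen"))
print(index.search("Membership", obj="al-Qaeda"))
```

A query such as "find articles about Membership involving al-Qaeda" then matches on the relation itself, regardless of how each article words it.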
13
Enhanced Document Retrieval
- Document Augmentation
- Sidebar acts as glossary as you read
- Pre-fetch data user is likely to want
- Adapt to user preferences, activities
- Deeper understanding for user, gets answers to questions raised while reading
14
Enhanced Document Retrieval
15
Search Augmentation
- Google assumes users only want documents
- Provide answers along with documents
- Use query term denotation to more closely target results
  - “Browns Ferry” is a garden park
  - “Browns Ferry” is a nuclear power plant
- Automates what people do with IR systems
  - Append hints about the type of term being sought
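Appending a type hint to an ambiguous query term can be sketched as a simple query rewrite. The `DENOTATIONS` table and `augment_query` function below are hypothetical; the two senses of "Browns Ferry" are the ones named on the slide.

```python
# Hypothetical table of known denotations for ambiguous terms.
DENOTATIONS = {
    "Browns Ferry": ["garden park", "nuclear power plant"],
}


def augment_query(term, denotations=DENOTATIONS):
    """Return one query per known sense, with the type appended as a hint."""
    senses = denotations.get(term)
    if not senses:
        # Unknown term: fall back to the plain quoted phrase.
        return [f'"{term}"']
    return [f'"{term}" {sense}' for sense in senses]


queries = augment_query("Browns Ferry")
for query in queries:
    print(query)
```

This mirrors what people already do by hand with IR systems: a user who means the power plant types extra words to steer the results, and here the knowledge base supplies those words.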
16
Search Augmentation
17
Demo: Basic Search
Demo: Followup Data
Demo: Disambiguation
Demo: Relations
18
Basic Question Answering
- Automated techniques for ground facts
- Use reasoners for higher-level facts
- Tie in with KSL AQUAINT work
- Feedback, direction from user
- Structure of knowledge allows simple form of question answering
19
Basic Question Answering
- Multiple views into data
- Browse interface
  - Ugly, but complete view
- Activity-based knowledge presentation
  - Search, document augmentation
- Future work: accept user feedback, customization, preferred sources
20
Basic Question Answering
- Query by example
- Users create many similar documents
- These are targeted to an activity
- Use past work to speed present work
- User creates templates which present data they find interesting in a way they find convenient
21
Query by Example
24
Goal: Produce Reports
- Most reports are made with Office
  - Word processor, spreadsheet
- Enhance with semantic awareness
- Provide seamless access to knowledge
- Transparent maintenance, creation
- Low overhead of operation
- Avoid centralized approach
  - Contrast with relational database
25
Word Processing
- Creation of new data
- Semantic scan
  - Like spell check or grammar check
  - Automatically identifies referenced entities
  - Learns new entities, relations between entities
- Annotation of text
  - User manually adjusts system
  - User adds new data
  - System gets smarter over time
26
Word Processing
- Create data via entry into templates
- Create new templates
  - For others
  - For personal use
- Extend templates with new entry areas
- Enhance analyst’s view
  - Semantic Search, Document Augmentation
  - Sidebar boxes are templates too
27
Word Processing
Demo: Semantic Scan
Demo: Annotation
Demo: Knowledge Creation
28
Spreadsheets
- Spreadsheets are key tools in analysis
- Tabular format, UI are both intuitive
- Sorting, basic math functions
- We add semantics:
  - New formula type: “Get Data”
  - New formula type: “Put Data”
  - Summarization, new views
29
Spreadsheets
- Example scenario
  - Suppose SARS was found to affect Asian-Americans more than others?
  - Analyst wants to determine, based on that, which states are most at risk
  - Knowledge from Census tells us Asian-American population as a percentage
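One way to picture the "Get Data" and "Put Data" formula types in this scenario is as lookups and writes against a knowledge base keyed by entity and property. The sketch below is an assumption about the behaviour, not the actual spreadsheet add-in: the `KnowledgeBase` class, the `pct_population` property name, the state names, and all figures are placeholders, not real census data.

```python
class KnowledgeBase:
    """Tiny stand-in for the knowledge base behind the spreadsheet formulas."""

    def __init__(self):
        self._facts = {}

    def put_data(self, entity, prop, value, source=None):
        """'Put Data': store a value for an entity's property, with its source."""
        self._facts[(entity, prop)] = (value, source)

    def get_data(self, entity, prop):
        """'Get Data': fetch the stored value for an entity's property."""
        value, _source = self._facts[(entity, prop)]
        return value


kb = KnowledgeBase()
# Placeholder percentages for made-up states; NOT real census figures.
kb.put_data("State A", "pct_population", 2.0, source="placeholder")
kb.put_data("State B", "pct_population", 9.5, source="placeholder")
kb.put_data("State C", "pct_population", 4.1, source="placeholder")

# The spreadsheet column is filled via get_data, then sorted as usual.
states = ["State A", "State B", "State C"]
ranked = sorted(states, key=lambda s: kb.get_data(s, "pct_population"), reverse=True)
print(ranked)
```

The analyst keeps the familiar sort-a-column workflow; only the source of the cell values changes from hand entry to the knowledge base.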
30
Spreadsheets
36
Goal: Consume Reports
- Verify others’ data against yours
- Incorporate others’ results into your knowledge base, track sources
- Maintain data
  - Change notification
  - Document updates with new data
  - Versioning of documents, data
37
Goal: Share Reports
- Easily exchangeable via e-mail
- Truth maintenance techniques
- Multiple views into data
- Leverage domain expertise
  - The missile guy has a KB, …
- Collaboration, trust levels
  - Colleagues disagree, sources are unreliable
38
Conclusion
- KD-D effort is focused on authoring, analysis tasks
- Leverage automated techniques to complement manual techniques
- System gets smarter as it’s used
- Tie in with commonly used applications