Semantic Infrastructure Workshop Applications Tom Reamy Chief Knowledge Architect KAPS Group Knowledge Architecture Professional Services


1 Semantic Infrastructure Workshop Applications
Tom Reamy, Chief Knowledge Architect
KAPS Group, Knowledge Architecture Professional Services
http://www.kapsgroup.com

2 Agenda
• Search and Semantic Infrastructure
  – Elements / Rich Dynamic Results
  – Different Environments
  – Design Issues
• Platform for Information Applications
  – Multiple Applications
  – Case Study – Categorization & Sentiment
  – Case Study – Taxonomy Development
  – Case Study – Expertise & Sentiment
• Conclusions

3 A Semantic Infrastructure Approach to Search: Elements
• Multiple Knowledge Structures
  – Facet – orthogonal dimension of metadata
  – Taxonomy – Subject matter / aboutness
  – Ontology – Relationships / Facts: Subject – Verb – Object
• Software – Search, ECM, auto-categorization, entity extraction, Text Analytics and Text Mining
• People – tagging, evaluating tags, fine-tuning rules and taxonomy
• People – Users, social tagging, suggestions
• Rich Search Results – context and conversation

4 A Semantic Infrastructure Approach to Search: Rich Results
• Elements
  – Faceted Navigation
  – Categorization – metadata and/or dynamic
  – Tag Clouds – clustering
  – User Tags, personalization
  – Related topics – discovery
• Supports all manner of search behaviors and needs
  – Find known items – zero in with facets
  – Discovery – Tag clouds, user tags, related topics
  – Deep dive – categorization


8 A Semantic Infrastructure Approach to Search: Three Environments
• E-Commerce
  – Catalogs, small uniform collections of entities
  – Conflict of information and Selling
  – Uniform behavior – buy this
• Enterprise
  – More content, more types of content
  – Enterprise Tools – Search, ECM
  – Publishing Process – tagging, metadata standards
• Internet
  – Wildly different amount and type of content, no taggers
  – General Purpose – Flickr, Yahoo
  – Vertical Portal – selected content, no taggers

9 A Semantic Infrastructure Approach to Search: Enterprise Environment – Taxonomy, 7 facets
• Taxonomy of Subjects / Disciplines:
  – Science > Marine Science > Marine microbiology > Marine toxins
• Facets:
  – Organization > Division > Group
  – Clients > Federal > EPA
  – Instruments > Environmental Testing > Ocean Analysis > Vehicle
  – Facilities > Division > Location > Building X
  – Methods > Social > Population Study
  – Materials > Compounds > Chemicals
  – Content Type – Knowledge Asset > Proposals
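The combination of one subject taxonomy with several independent facets can be sketched as a small data model. This is a minimal illustration, not the workshop's implementation: the `Document` class, the `has_prefix` and `filter_docs` helpers, and the two sample documents are hypothetical. Only the taxonomy and facet paths come from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """A document tagged with one subject path and any number of facet paths."""
    title: str
    subject: tuple                               # taxonomy path, broad to narrow
    facets: dict = field(default_factory=dict)   # facet name -> path tuple


def has_prefix(path, prefix):
    """True when `path` is the node named by `prefix` or one of its descendants."""
    return tuple(path[:len(prefix)]) == tuple(prefix)


def filter_docs(docs, subject=(), facets=None):
    """Keep documents under `subject` that also match every requested facet prefix."""
    facets = facets or {}
    return [
        doc for doc in docs
        if has_prefix(doc.subject, subject)
        and all(has_prefix(doc.facets.get(name, ()), prefix)
                for name, prefix in facets.items())
    ]


# Hypothetical sample documents, tagged with paths taken from the slide.
docs = [
    Document("Toxin survey proposal",
             ("Science", "Marine Science", "Marine microbiology", "Marine toxins"),
             {"Clients": ("Federal", "EPA"),
              "Content Type": ("Knowledge Asset", "Proposals")}),
    Document("Population study notes",
             ("Science", "Marine Science"),
             {"Methods": ("Social", "Population Study")}),
]

# Browse by subject, then narrow by a facet.
marine = filter_docs(docs, subject=("Science", "Marine Science"))
epa = filter_docs(docs, subject=("Science",), facets={"Clients": ("Federal",)})
print([d.title for d in marine])
print([d.title for d in epa])
```

Treating each facet as its own path keeps the dimensions orthogonal: a filter on Clients never depends on where the document sits in the subject taxonomy.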

10 A Semantic Infrastructure Approach to Search: Internet Design
• Subject Matter taxonomy – Business Topics
  – Finance > Currency > Exchange Rates
• Facets
  – Location > Western World > United States
  – People – Alphabetical and/or Topical – Organization
  – Organization > Corporation > Car Manufacturing > Ford
  – Date – Absolute or range (1-1-01 to 1-1-08, last 30 days)
  – Publisher – Alphabetical and/or Topical – Organization
  – Content Type – list – newspapers, financial reports, etc.


12 Rich Search Results Design Issues – General
• What is the right combination of elements?
  – Faceted navigation, metadata, browse, search, categorized search results, file plan
• What is the right balance of elements?
  – Dominant dimension or equal facets
  – Browse topics and filter by facet
• When to combine search, topics, and facets?
  – Search first and then filter by topics / facet
  – Browse/facet front end with a search box

13 Rich Search Results Design Issues – General
• Homogeneity of Audience and Content
• Model of the Domain – broad
  – How many facets do you need?
  – More facets and let users decide
  – Allow for customization – can’t define a single set
• User Analysis – tasks, labeling, communities
  – Issue – the labels people use to describe their business and the labels they use to find information
• Match the structure to domain and task
  – Users can understand different structures

14 Rich Search Results Automatic Facets – Special Issues
• Scale requires more automated solutions
  – More sophisticated rules
• Rules to find and populate existing metadata
  – Variety of types of existing metadata – Publisher, title, date
  – Multiple implementation Standards – Last Name, First / First Name, Last
• Issue of disambiguation:
  – Same person, different name – Henry Ford, Mr. Ford, Henry X. Ford
  – Same word, different entity – Ford and Ford
• Number of entities and thresholds per results set / document
  – Usability, audience needs
• Relevance Ranking – number of entities, rank of facets
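The two metadata problems named above (mixed "Last Name, First" / "First Name, Last" conventions and one person appearing under several name forms) can be sketched with a simple normalization rule. This is an illustrative sketch only: `normalize_name`, `could_be_same`, and the `HONORIFICS` set are hypothetical helpers, not part of any product mentioned in the workshop, and real entity extraction would need far richer rules.

```python
# Hypothetical list of titles to ignore; a real system would need a longer one.
HONORIFICS = {"mr", "mrs", "ms", "dr"}


def normalize_name(raw):
    """Return (first, last) in lowercase; first is '' when only a surname is given."""
    raw = raw.strip()
    if "," in raw:
        # "Last, First" convention: move the surname to the end.
        last, first = [part.strip() for part in raw.split(",", 1)]
        tokens = first.split() + [last]
    else:
        tokens = raw.split()
    cleaned = [t.strip(".").lower() for t in tokens]
    cleaned = [t for t in cleaned if t and t not in HONORIFICS]
    if not cleaned:
        return ("", "")
    # Anything between the first and last token (e.g. a middle initial) is ignored.
    first = cleaned[0] if len(cleaned) > 1 else ""
    return (first, cleaned[-1])


def could_be_same(a, b):
    """True when two mentions share a surname and their first names do not conflict."""
    first_a, last_a = normalize_name(a)
    first_b, last_b = normalize_name(b)
    if not last_a or last_a != last_b:
        return False
    return not first_a or not first_b or first_a == first_b


for mention in ["Henry Ford", "Mr. Ford", "Henry X. Ford", "Ford, Henry"]:
    print(mention, "->", normalize_name(mention))
```

String rules like these only address "same person, different name". The "Ford and Ford" case (a person versus a company) cannot be settled from the name alone and needs surrounding context, which is why the slide treats it as a separate issue.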

15 Semantic Infrastructure for Search Based Apps Multiple Applications
• Platform for Information Applications
  – Content Aggregation
  – Duplicate Documents – save millions!
  – Text Mining – BI, CI – sentiment analysis
  – Combine with Data Mining – disease symptoms, new
  – Social – Hybrid folksonomy / taxonomy / auto-metadata
  – Social – expertise, categorize tweets and blogs, reputation
  – Ontology – travel assistant – SIRI
• Use your Imagination!

16 Semantic Infrastructure for Search Apps Multiple Applications
• SIRI – Travel Assistant

17 Semantic Infrastructure for Search Apps Case Study – Categorization & Sentiment
• Call Motivation – Categorization
  – Motivation Taxonomy
  – Purpose of previous calls to understand current call
  – Issues of scale, small size of documents, jargon, spelling
• Customer Sentiment – Telecom Forums
  – Feature level – not just products
  – Issue of context – sarcasm, jargon
• Knowledge Base
  – Categorization, Product extraction, expertise-sentiment analysis
  – Social Media as source for solutions

18–21 Case Study – Categorization & Sentiment (image-only slides)

22 Semantic Infrastructure for Search Apps Case Study – Taxonomy Development
• Problem – 200,000 new uncategorized documents
• Old taxonomy – need one that reflects change in corpus
• Text mining, entity extraction, categorization
• Content – 250,000 large documents, search logs, etc.
• Bottom Up – terms in documents – frequency, date
• Clustering – suggested categories
• Clustering – chunking for editors
• Entity Extraction – people, organizations, programming languages
• Time savings – only feasible way to scan documents
• Quality – important terms, co-occurring terms
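The bottom-up step (term frequency plus co-occurring terms) can be illustrated with a few lines of counting code. This is a toy sketch under stated assumptions, not the tooling used in the case study: the `tokenize` and `term_stats` functions, the tiny `STOPWORDS` set, and the three sample documents are all hypothetical.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stopword list; a real project would use a fuller one.
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "for", "is", "on", "with"}


def tokenize(text):
    """Lowercase the text and return alphabetic tokens that are not stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def term_stats(documents):
    """Count, per term and per term pair, how many documents contain them."""
    term_df = Counter()
    pair_df = Counter()
    for text in documents:
        terms = sorted(set(tokenize(text)))        # each term counted once per document
        term_df.update(terms)
        pair_df.update(combinations(terms, 2))     # pairs come out in alphabetical order
    return term_df, pair_df


# Hypothetical sample corpus.
corpus = [
    "Java threads and memory model",
    "Python memory profiling",
    "Java memory tuning for servers",
]
term_df, pair_df = term_stats(corpus)
print(term_df.most_common(3))
print(pair_df.most_common(3))
```

Frequent terms suggest candidate taxonomy nodes, and frequent pairs hint at which terms belong under the same node; editors would still review both lists before changing the taxonomy.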

23–25 Case Study – Taxonomy Development (image-only slides)

26 Semantic Infrastructure Applications Expertise Analysis
• Sentiment Analysis to Expertise Analysis (KnowHow)
  – Know How, skills, “tacit” knowledge
• No single correct categorization
  – Women, Fire, and Dangerous Things
  – Types of Animals
    – Those that belong to the Emperor
    – Embalmed Ones
    – Suckling Pigs
    – Fabulous Ones
    – Those that are included in this classification
    – Those that tremble as if they were mad
    – Other

27 Semantic Infrastructure Applications Expertise Analysis – Basic Level Categories
• Mid-level in a taxonomy / hierarchy
• Short and easy words
• Maximum distinctness and expressiveness
• First level named and understood by children
• Level at which most of our knowledge is organized
• Levels: Superordinate – Basic – Subordinate
  – Mammal – Dog – Golden Retriever
  – Furniture – chair – kitchen chair

28 Semantic Infrastructure Applications Expertise Analysis
• Experts prefer lower, subordinate levels
  – In their domain, (almost) never use superordinate
• Novices prefer higher, superordinate levels
• General Populace prefers basic level
• Not just individuals but whole societies / communities differ in their preferred levels
• Issue – artificial languages – e.g. Science discipline
• Issue – difference of child and adult learning – adults start with high level

29 Semantic Infrastructure Applications Expertise Analysis
• What is basic level is context(s) dependent
  – Document/author expert in news health care, not research
• Hybrid – simple high level taxonomy (superordinate), short words – basic, longer words – expert
Plus
• Develop expertise rules – similar to categorization rules
  – Use basic level for subject
  – Superordinate for general, subordinate for expert
• Also contextual rules
  – “Tests” is general, high level
  – “Predictive value of tests” is lower, more expert
  – If terms appear in same sentence – expert
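The expertise rules described above can be sketched as simple term matching plus one contextual rule. Everything here is an assumption made for illustration: the function names, the scoring scheme, and the reading of the contextual rule (a general term such as "tests" counts as an expert signal only when its qualifier, "predictive value", appears in the same sentence). The sample terms are taken from the Healthcare Terms slide.

```python
import re

# Sample terms from the Healthcare Terms slide; the scoring scheme is an assumption.
EXPERT_TERMS = {"toxicity", "mammography", "edema", "metabolite", "etiology"}
GENERAL_TERMS = {"cancer", "cigarette", "smoking", "medicine", "hospital"}
# Contextual rule (assumed reading): general term -> qualifier that makes it expert.
CONTEXT_RULES = {"tests": "predictive value"}


def score_sentence(sentence):
    """Return (expert_hits, general_hits) for a single sentence."""
    text = sentence.lower()
    words = set(re.findall(r"[a-z]+", text))
    expert = len(words & EXPERT_TERMS)
    general = len(words & GENERAL_TERMS)
    for term, qualifier in CONTEXT_RULES.items():
        if term in words:
            if qualifier in text:
                expert += 1      # qualifier in the same sentence: expert usage
            else:
                general += 1     # bare term: general usage
    return expert, general


def expertise_level(document):
    """Label a document 'expert', 'general', or 'mixed' from summed sentence scores."""
    expert = general = 0
    for sentence in re.split(r"(?<=[.!?])\s+", document.strip()):
        e, g = score_sentence(sentence)
        expert += e
        general += g
    if expert > general:
        return "expert"
    if general > expert:
        return "general"
    return "mixed"


print(expertise_level("The predictive value of tests depends on etiology."))
print(expertise_level("Smoking causes cancer. The hospital ran tests."))
```

Because basic level is context dependent, term lists like these would have to be built per domain and per audience; a list tuned on research papers would mislabel news coverage of health care.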

30 Education Terms
Expert                                       General
Research (context dependent)                 Kid
Statistical                                  Pay
Program performance                          Classroom
Protocol                                     Fail
Adolescent Attitudes                         Attendance
Key academic outcomes                        School year
Job training program                         Closing
American Educational Research Association    Counselor
Graduate management education                Discipline

31 Healthcare Terms
Expert                    General
Mouse                     Cancer
Dose                      Scientific
Toxicity                  Physical
Diagnostic                Consumer
Mammography               Cigarette
Sampling                  Smoking
Inhibitor                 Weight gain
Edema                     Correct
Neoplasms                 Empirical
Isotretinoin              Drinking
Ethylene                  Testing
Significantly             Lesson
Population-base           Knowledge
Pharmacokinetic           Medicine
Metabolite                Sociology
Polymorphism              Theory
Subsyndromic              Experience
Radionuclide              Services
Etiology                  Hospital
Oxidase                   Social
Captopril                 Domestic
Pharmacological agents
Dermatotoxicity
Mammary cancer model
Biosynthesis

32 Semantic Infrastructure Applications Expertise Analysis – application areas
• Taxonomy / Ontology development / design – use basic level
• User contribution
  – Card sorting – non-experts use superficial similarities
  – Survey for attributes instead of card sorting, general structure
• Develop expert and general versions/sections/synonyms
• Info presentation – combine superordinate and basic
  – Similar to scientific – Genus – Species is official name
• Text Mining – Expertise characterization of writer

33 Semantic Infrastructure Applications Expertise Analysis – application areas
• Business & Customer intelligence
  – General – characterize people’s expertise to add to evaluation of their comments
  – Combine with sentiment analysis – finer evaluation – what are experts saying, what are novices saying
  – Deeper research into communities, customers
• Enterprise Content Management
  – At publish time, software automatically gives an expertise level – present to author for validation
  – Combine with categorization – offer tags that are suitable level of expertise

34 Semantic Infrastructure Applications Expertise Analysis – application areas
• Social Media – Community of Practice
  – Characterize the level of expertise in the community
  – Evaluate other communities’ expertise level
  – Personalize information presentation by expertise
• Expertise location
  – Generate automatic expertise characterization based on authored documents
• Expertise of people in a social network
  – Terrorists and bomb-making

35 Semantic Infrastructure Applications Expertise Analysis – application areas – CoP
• Basic Level
  – Blog
  – Software (Design)
  – Web (Design)
  – Linux
  – Javascript
  – Web2.0
  – Google
  – Css
  – Flash
• Superordinate
  – Music
  – Photography
  – News
  – Education
  – Business
  – Technology
  – Politics
  – Science
  – Culture

36 Semantic Infrastructure Applications Expertise Analysis – application areas – Tags
• CSS
  – Web Design
  – Design
  – Css3
  – Tutorial
  – Webdev
  – Javascript
  – Web
  – Development
  – Html
  – Jquery
  – html5
• Education
  – Technology
  – Resources
  – Teaching
  – Learning
  – Science
  – Web20
  – Games
  – Interactive
  – Research
  – Tools
  – reference

37 Semantic Infrastructure Approach to Search Conclusions
• Semantic Infrastructure solution (people, policy, technology, semantics) and feedback is best approach
• Foundation – Hybrid ECM model with text analytics, Search
• Integrated Search design is essential – rich results
  – Subject, facets, tag clouds, etc.
• Semantic Infrastructure as a platform for multiple applications
  – Build on infrastructure for economy and quality
• Text Analytics (Entity extraction and auto-categorization) are essential
• Future – new kinds of applications:
  – Text Mining and Data mining, research tools, sentiment
  – Beyond Sentiment – expertise applications
  – NeuroAnalytics – cognitive science meets search and more
    – Watson is just the start

38 Questions?
Tom Reamy
tomr@kapsgroup.com
KAPS Group, Knowledge Architecture Professional Services
http://www.kapsgroup.com

39 Resources
• Books
  – Women, Fire, and Dangerous Things, George Lakoff
  – Knowledge, Concepts, and Categories, Koen Lamberts and David Shanks
• Web Sites
  – Text Analytics News – http://social.textanalyticsnews.com/index.php
  – Text Analytics Wiki – http://textanalytics.wikidot.com/

40 Resources
• Blogs
  – SAS – http://blogs.sas.com/text-mining/
• Web Sites
  – Taxonomy Community of Practice: http://finance.groups.yahoo.com/group/TaxoCoP/
  – LinkedIn – Text Analytics Summit Group – http://www.LinkedIn.com
  – Whitepaper – CM and Text Analytics – http://www.textanalyticsnews.com/usa/contentmanagementmeetstextanalytics.pdf
  – Whitepaper – Enterprise Content Categorization strategy and development – http://www.kapsgroup.com

41 Resources
• Articles
  – Malt, B. C. 1995. Category coherence in cross-cultural perspective. Cognitive Psychology 29, 85-148
  – Rifkin, A. 1985. Evidence for a basic level in event taxonomies. Memory & Cognition 13, 538-56
  – Shaver, P., J. Schwarz, D. Kirson, D. O’Conner 1987. Emotion Knowledge: further explorations of prototype approach. Journal of Personality and Social Psychology 52, 1061-1086
  – Tanaka, J. W. & M. E. Taylor 1991. Object categories and expertise: is the basic level in the eye of the beholder? Cognitive Psychology 23, 457-82

