INF 384 C, Spring 2009 Faceted Classification Complex subjects from simpler components.



Outline
Goals of faceted classification.
Basic design of faceted classifications.
Facet analysis of complex subjects (factoring).
Determination of facet structure.
Faceted browsing on the Web.

Motivations for faceted classification
The sheer number of documents keeps growing.
The subjects of the documents are both more specific and more complex.
Knowledge itself is rapidly expanding: new subjects are constantly being created.
It’s not helpful to put huge numbers of documents in general subject categories (British History, Nuclear Physics).
And yet we can’t possibly enumerate all the possible subjects that either currently exist or may soon exist.
What to do?

Goals of faceted classification
If we can create a classification scheme that lists subject components, then we can build complex subjects out of the components as necessary.
We facilitate the construction of complex subjects by organizing the component concepts that make up our classification into facets, or potential aspects of the subject.

From compound to components
Example of complex subject: The history of Japanese tea drinking etiquette
Components (or isolates, or factors): History + Japan + Tea + Drinking + Etiquette
Potential fundamental categories (facets) for the components: Disciplines (History); Locations (Japan); Beverages (Tea); Activities (Drinking); Values (Etiquette)
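As a rough illustration of the decomposition on this slide, the sketch below stores each component with the facet it was assigned to and groups a subject's components by facet. The facet names and assignments come from the slide; the dictionary layout and the function name `describe` are assumptions made only for this example.

```python
# Facet assignments taken from the slide; the data layout is an assumption.
FACET_OF = {
    "History": "Disciplines",
    "Japan": "Locations",
    "Tea": "Beverages",
    "Drinking": "Activities",
    "Etiquette": "Values",
}


def describe(components):
    """Group a subject's components by the facet each one belongs to."""
    by_facet = {}
    for component in components:
        by_facet.setdefault(FACET_OF[component], []).append(component)
    return by_facet


# "The history of Japanese tea drinking etiquette", already factored by hand.
subject = ["History", "Japan", "Tea", "Drinking", "Etiquette"]
print(describe(subject))
```

The factoring itself is still an intellectual step; the code only records its result.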

Building subjects from components
A traditional faceted classification for libraries includes both the facet structure of components and syntax rules for combining the components into complex subjects.
These rules are necessary to ensure that documents are filed consistently on shelves. (In an online environment, these rules become superfluous.)
To “mechanize” the subject-building process and simplify filing, components are given a notation (such as “soil acidity – sag”) that clarifies the component’s position within a facet.
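To make the role of syntax rules concrete, here is a minimal sketch that combines components in a fixed facet order, so the same subject always yields the same filing string regardless of the order in which the components were chosen. The facet order, the two-letter notations, and the hyphen separator are all invented for illustration; they are not taken from any real scheme.

```python
# Hypothetical facet order and notations, invented for this example.
CITATION_ORDER = ["Crops", "Processes", "Materials"]
NOTATION = {
    ("Crops", "Berries"): "cb",
    ("Processes", "Fertilizing"): "pf",
    ("Materials", "Compost"): "mc",
}


def class_mark(components):
    """Join the notations of (facet, term) pairs in the fixed facet order."""
    ordered = sorted(components, key=lambda pair: CITATION_ORDER.index(pair[0]))
    return "-".join(NOTATION[pair] for pair in ordered)


# The same two components, supplied in different orders, file identically.
a = class_mark([("Materials", "Compost"), ("Crops", "Berries")])
b = class_mark([("Crops", "Berries"), ("Materials", "Compost")])
print(a, b)
```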

Structure of faceted classifications
While a facet may be a simple list, components within a facet are typically arranged hierarchically (using a stricter or looser sense of hierarchy as appropriate).

Organic farming classification

Crops                Processes           Materials
Fruits (by origin)   Planting            Natural soil amendments
Vines
Berries              Controlling pests   Compost
Bushes               Fertilizing         Mulch
Trees                                    Natural pesticides
Vegetables
Herbs

Designing faceted classifications
1. Decompose complex concepts (which you have gathered via your research into the subject literature) into component parts, via syntactic or semantic factoring.
2. Group the simple components into fundamental categories.
3. Organize the components in each facet (with hierarchical relationships, subfacets that indicate multiple principles of division, order within arrays, and so on).

Understanding complex concepts
There are two kinds of compounds:
A multi-word unit (which may be a simple concept, such as stained glass, or a complex concept, such as glass cutting).
A multi-concept unit (which may be a single word, such as sourdough).

Syntactic and semantic factoring
Syntactic factoring: A term with multiple words is divided into smaller components.
Example: rye bread into rye + bread; Irish emigration into emigration + Irish
Semantic factoring: A term is divided into multiple elementary concepts.
Example: apartment into dwelling + rental + shared building.

Semantic factoring
Most standards/authorities don’t recommend semantic factoring, and there aren’t rules you can use to help with it.
But semantic factoring can sometimes help you discover missing concepts in your subject language.
It might be extreme to describe Passover as “holiday + Jewish + commemoration + Exodus,” but doing so might make us consider both religion and commemoration of events as aspects common to many holidays.

Parsing compounds
A compound term consists of a focus (the class of things or events) and a difference, which modifies the class and makes a subclass.
Examples:
Car tires: Focus is tires, difference is cars.
Opera singing: Focus is singing, difference is opera.
Mushroom hunter: Focus is hunter, difference is mushroom.
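In the three examples above the focus happens to be the last word of the compound. The sketch below relies on that observation as a deliberately naive heuristic: it assumes a head-final English compound and will misparse terms that do not follow that pattern. The `Compound` type and `parse_compound` function are hypothetical names used only here.

```python
from typing import NamedTuple


class Compound(NamedTuple):
    focus: str        # the class of things or events
    difference: str   # the modifier that narrows the class


def parse_compound(term: str) -> Compound:
    """Naive split: last word is the focus, everything before it the difference."""
    *modifiers, head = term.split()
    return Compound(focus=head, difference=" ".join(modifiers))


for term in ["car tires", "opera singing", "mushroom hunter"]:
    print(term, "->", parse_compound(term))
```

A human indexer would still need to check each result, since word order alone does not determine which part names the class.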

Action/patient factoring
If the term contains an action (focus) modified by the recipient of the action (difference), factor.
But if the term refers to a material (focus) as modified by an action (difference), don’t factor.
Examples:
Hair dyeing: hair + dyeing
Bronze engraving: bronze + engraving
But don’t factor: dyed hair, engraved bronzes

Part/whole factoring
If the focus refers to a part or property, and the difference refers to the whole or the possessor of the part or property, factor.
But if the focus is the whole, and the difference is the part or property, don’t factor.
Examples:
Soil acidity: soil + acidity
Library shelves: libraries + shelves
But don’t factor: acid soils, spare tires

Action/performer factoring
If the term contains an intransitive action (focus) modified by the performer (difference), factor.
If the performer (focus) is modified by its performance of an intransitive action (difference), don’t factor.
Examples:
Student meeting: students + meetings
Lemur migration: lemurs + migrations
But don’t factor: migratory birds
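The three factoring rules (action/patient, part/whole, action/performer) share one shape: factor only when the focus and the difference play a particular pair of roles. The sketch below encodes that as a lookup. It assumes a human has already identified the focus, the difference, and their roles; the role labels and the `factor` function are made up for this example.

```python
# (focus role, difference role) pairs for which the slides say to factor.
# Role labels are assumptions made for this sketch, not a standard vocabulary.
FACTOR_WHEN = {
    ("action", "patient"),                  # hair dyeing
    ("part_or_property", "whole"),          # soil acidity
    ("intransitive_action", "performer"),   # lemur migration
}


def factor(term, focus, focus_role, difference, difference_role):
    """Return the separate components if a rule applies, else the whole term."""
    if (focus_role, difference_role) in FACTOR_WHEN:
        return [difference, focus]
    return [term]


print(factor("hair dyeing", "dyeing", "action", "hair", "patient"))
print(factor("dyed hair", "hair", "material", "dyed", "action"))
print(factor("soil acidity", "acidity", "part_or_property", "soil", "whole"))
```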

Determination of facet structure
Ranganathan started from the top down: describing fundamental categories (PMEST) for all subjects and organizing components into those universal facets.
The Classification Research Group (CRG), as described by Vickery, advocates beginning from the bottom up: reviewing components and assigning preliminary fundamental categories based on the concept’s definition within the classification’s domain, then looking for commonalities in these preliminary choices. Facets are specific to each classification.

Principles for creating facets
Spiteri, 1998, synthesized the following facet design principles from Ranganathan and the CRG:
Differentiation.
Relevance.
Ascertainability.
Permanence.
Homogeneity.
Mutual exclusivity.

Differentiation principle
When creating facets that split a group of entities, choose a principle of division that cleanly splits the group into component parts.
For example, dividing people by gender creates two generally unambiguous categories. However, dividing socks according to color can cause problems when considering socks with multiple colors; color does not provide the same level of differentiation for socks as gender does for people.

More facet design principles
Relevance: Choose facets based on the purpose of the classification. A classification of gardening might divide terrain by sun exposure, but a classification of cycling might divide terrain by elevation.
Ascertainability: When possible, choose facets that can be reliably measured.
Permanence: When possible, choose facets that will not change over time.

Final facet design principles
Homogeneity: Each facet (or subfacet) should represent a single principle of division. For example, if we are classifying socks, we should not see colors and patterns in the same array. We would need to separate patterns and colors.
Mutual Exclusivity: The contents of any two facets (or subfacets) should not overlap (that is, they should be mutually exclusive). If we are dividing shoes by heel height and by form, we should not find any mixing of values for either facet (for example, we should not see "high-heeled pumps" in the form facet, but merely "pumps").
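A draft classification can be screened for the kind of mixing described in the shoe example. The sketch below flags any value in one facet whose wording contains a term from another facet. It is a crude word-matching check that only handles single-word terms in the other facet, and the specific heel-height values are assumptions; only "pumps" and "high-heeled pumps" come from the slide.

```python
import re


def mixed_values(facet_values, other_facet_values):
    """Return values of one facet that contain a term from another facet."""
    other_terms = {value.lower() for value in other_facet_values}
    flagged = []
    for value in facet_values:
        words = set(re.findall(r"[a-z]+", value.lower()))
        if words & other_terms:
            flagged.append(value)
    return flagged


heel_height = ["flat", "low", "high"]             # assumed values
form = ["pumps", "boots", "high-heeled pumps"]    # last entry mixes two facets
print(mixed_values(form, heel_height))
```

A flagged value is a prompt for review, not proof of a violation; the decision remains with the classification designer.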

Faceted browsing on the Web
Hearst’s Flamenco is an interface to support browsing of faceted structures on the Web.
The Hearst article that you read describes how users preferred the faceted browsing interface to a search engine when exploring the collection.
(Note that the facets that Hearst used in the Flamenco system are semi-automatically generated and not, perhaps, the best that one might create...)
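The core mechanic of faceted browsing can be sketched in a few lines: narrow a collection by the facet values selected so far, then show how many remaining items fall under each value of another facet. This is a generic illustration and not Flamenco's implementation; the three-item collection, the facet names, and the function names are all invented for the example.

```python
from collections import Counter

# A tiny made-up collection; each item carries one value per facet.
ITEMS = [
    {"title": "Tea bowl", "Locations": "Japan", "Materials": "Ceramic"},
    {"title": "Tea caddy", "Locations": "Japan", "Materials": "Lacquer"},
    {"title": "Teapot", "Locations": "China", "Materials": "Ceramic"},
]


def browse(items, selections):
    """Keep only the items that match every selected facet value."""
    return [
        item for item in items
        if all(item.get(facet) == value for facet, value in selections.items())
    ]


def facet_counts(items, facet):
    """Count how many items fall under each value of one facet."""
    return Counter(item[facet] for item in items if facet in item)


hits = browse(ITEMS, {"Locations": "Japan"})
print([item["title"] for item in hits])
print(facet_counts(hits, "Materials"))
```

Showing the counts next to each remaining value is what lets a user see where a further selection would lead before making it.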

Your continuing mission
Begin narrowing down your list of potential concepts for your classification: use your sense of the classification’s audience and purpose, as well as your subject knowledge, to more clearly define the scope of your classification, its boundaries and its central and peripheral areas.
Begin defining each concept’s meaning in the context of your classification.
Begin drafting a classified structure (one or multiple hierarchies) for your concepts.

Next week
Sign up for a ten-minute meeting with me next week. Ten slots are available during class time. Eight slots are available on Thursday after class. Four slots are on Friday around noon.
At the meeting, be prepared to tell me your subject, audience, and purpose in a few sentences. Bring a draft of your classified concepts if you have one.
Next week I will also reiterate your assignment deliverables in more detail.