Subject Indexing 384C – Organizing Information Week 6 Spring 2016

Presentation transcript:

Subject Indexing 384C – Organizing Information Week 6 Spring 2016 Karen Wickett School of Information University of Texas at Austin

Subjects
When we say things like
  "I want some information about gardening"
  "I read a great book about Andrew Jackson's presidency"
we are referring to gardening and Andrew Jackson's presidency as subjects.
These are concepts that describe what a document is about: its topic or its major themes.

How do we determine this?
Suppose you say "I read that Andrew Jackson book. It uses Jackson's presidency as an analytical focus, but I would say it's really about democracy and federalism in the early United States."
Is the subject Andrew Jackson?
Is it democracy and federalism in the US?
How do we know?

Is there an absolutely correct answer?
The idea of a "subject" seems intuitive and easy to define, but is difficult to pin down precisely.
In casual conversation, we can repair any misunderstandings easily.
In an information system, we aim for predictability to support our objectives for information organization.

Document-oriented definitions
As described by Fidel, subject is often defined as a document attribute: the answer to the question "What is it about?"
Two ideas emerge here:
  Subjects exist as ideal forms, and they can be accurately identified in documents via some rational decision process.
  There is no ideal form of subject:
    a subject depends on the way that people interpret the document
    it is hard to decide which interpretation is the best

Terminological interlude
"Indexing" is the assignment of subject terms to a document to represent its contents
  often associated with info retrieval and search
  involves choosing subject terms from a controlled vocabulary
  some subject controlled vocabularies, notably thesauri, have relationships that structure the vocabulary
"Subject cataloging" is similar, but the terms are known as "headings"
  associated with library catalogs and the subject access point
  subject headings are a controlled vocabulary for subjects, typically with less structure than a thesaurus
"Classification" is the assignment of a subject class, or category, to a document
  classification of physical books involves choosing a single category, which is designated by a notation, e.g. QA76.9

Terminological interlude
Indexing, subject cataloging, and classification are closely related activities with different histories.
In a digital environment, many of the distinctions between them become imperceptible.
Any of them can be used for browsing, navigation, sorting, and demonstrating relationships between items in a collection.

Terminology: coordination
Precoordinate indexing
  complex terms have been enumerated in advance
  the indexer assigns the most specific appropriate term
  more precise, as it allows specification of the relationship between concepts
Postcoordinate indexing
  multiple component terms may be assigned to indicate a complex term
  the coordination (or combination) occurs when the document is indexed or searched for
  more flexible

Examples: postcoordinate and precoordinate indexing
Precoordinate terms:
  Design of hypertext literature
  Mushroom foraging in the Pacific Northwest
Postcoordinate terms:
  Design
  Hypertext
  Literature
  Mushrooms
  Foraging
  Pacific Northwest
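The difference in where coordination happens can be sketched in a few lines of Python. This is a minimal illustration, not a description of any real catalog or retrieval system: the document ids (`doc1`, `doc2`, `doc3`), the term assignments, and the function names `build_postings`, `search_all`, and `search_heading` are all assumptions made up for the example. Only the term strings come from the slide above.

```python
from collections import defaultdict

# Hypothetical term assignments; document ids are invented for illustration.
POSTCOORDINATE = {
    "doc1": {"Design", "Hypertext", "Literature"},
    "doc2": {"Mushrooms", "Foraging", "Pacific Northwest"},
    "doc3": {"Mushrooms", "Design"},  # invented extra document
}

PRECOORDINATE = {
    "doc1": {"Design of hypertext literature"},
    "doc2": {"Mushroom foraging in the Pacific Northwest"},
}


def build_postings(assignments):
    """Invert document -> terms into term -> set of document ids."""
    postings = defaultdict(set)
    for doc_id, terms in assignments.items():
        for term in terms:
            postings[term].add(doc_id)
    return postings


def search_all(postings, *terms):
    """Postcoordinate search: combine component terms at search time
    by intersecting the document sets for each term."""
    if not terms:
        return set()
    result = set(postings.get(terms[0], set()))
    for term in terms[1:]:
        result &= postings.get(term, set())
    return result


def search_heading(postings, heading):
    """Precoordinate search: the complex term was enumerated in advance,
    so the searcher must supply the whole heading exactly."""
    return set(postings.get(heading, set()))


post_postings = build_postings(POSTCOORDINATE)
pre_postings = build_postings(PRECOORDINATE)

print(sorted(search_all(post_postings, "Mushrooms", "Foraging")))
print(sorted(search_heading(pre_postings, "Mushroom foraging in the Pacific Northwest")))
```

In this sketch the postcoordinate search is more flexible (any combination of component terms can be tried), while the precoordinate heading is more precise (it states how the concepts relate) but matches only when the full heading is used.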

Theories of subject
Subject as inherent property
Subject as interpretive construct
Use-oriented definitions

Subject as inherent property
In this view, ideal subjects exist independently of people.
Therefore, we can talk about the subject as an objective property that is inherent in a document.
And we can accurately and objectively identify subjects in documents by following logical processes of deduction.
The subject is an objective property, like mass or volume, and if we just knew how to assess the document properly, we could find it.

Example
Elusive Equality: Gender, Citizenship, and the Limits of Democracy in Czechoslovakia, 1918-1950
In the UT catalog, this book is described as being about:
  Women -- Czechoslovakia -- Social conditions.
  Women’s rights -- Czechoslovakia.
But the author has stated that it’s about civil rights and democracy.
According to the “objective idealist” view of the subject, neither of these might be right, but the right description IS out there. On this view, if we just knew how to weigh the document properly, we could find it.
But it’s here that we run into a problem: the insufficiency of the Aristotelian notion of categories as a set of necessary and sufficient conditions for membership. In this idea of what a category is, we define a set of conditions, and if an instance fulfils those conditions, it is part of the set: a bird has wings and feathers. (See Lakoff, for those who have read it.) How do we devise necessary and sufficient conditions to specify the concept of women’s rights, let alone determine whether women’s rights is the subject of the book?
This is where Patrick Wilson gets uncomfortable. His point is that there’s no way he can think of that unambiguously says what the subject is:
  The author doesn’t necessarily know best.
  Different people might have different ideas as to what the most important or central idea is.
  The document is not necessarily about the item that is explicitly mentioned the most.
  And we can’t objectively determine a unifying purpose, either.
Oh well, we just have to deal with it.

Subject as interpretive construct
The topical content of a work is determined according to an individual's interpretation.
The book on the previous slide is about whatever each of us thinks it is about.
Taking the extreme version of this point of view will diminish the value of subject description: recall, we want predictability and consistency to achieve our goals.

Use-oriented definitions
In these approaches, the subject is determined with an eye on how the document is used, or might be used.
"Request-oriented" ideas of subject focus on what users need from documents.
Also considered in terms of what the document's ultimate contribution to knowledge is.

Request-oriented indexing
The subject of a document is based on the document's relevance to user needs.
If you are writing a paper on democracy, and the book on slide 12 is useful, then in that scenario, the book is about democracy.
Subject is relative to the context of use.

Genre, form, and subject
Do artistic works have subjects?
Does it matter if the work is textual (fiction, poetry) or not (images, music, film)?
Is the "subject" of Uncle Tom's Cabin that it is a novel? It isn't about novels, though.
Is Uncle Tom's Cabin about slavery in the same way that a history book is about slavery?

Of-ness and aboutness
This is a photo of a rose.
  It contains a rose (content-wise).
  Is it about roses?
It was tagged with "happiness" on Flickr.
  Is it about happiness?

Subject analysis
Subject analysis involves the systematic determination of a document's subject for the purpose of placing it in an organized collection.
Concepts that designate the subject are assigned, often using a controlled vocabulary or classification scheme.
We want to be consistent in how subjects are assigned so that the subject has a consistent meaning in the context of our collection.

Subject analysis process (ISO)
Examining the document and identifying the subject
  not "reading" due to time constraints
  look at titles, abstracts, headings, figures, introduction
Identifying primary concepts in the subject
  standardized via checklists
Determining how to express those concepts in the indexing vocabulary

Example: subject analysis
An article is about possible economic effects on US agriculture as the result of a policy decision by the EU to enable member countries to ban genetically engineered crops.
Concepts in this subject description might be:
  Bans on genetically engineered crops
  EU agricultural import policies
  US agriculture industry
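The last step of subject analysis, expressing the identified concepts in the indexing vocabulary, could be sketched as a lookup from concept phrases to preferred terms. Everything in the vocabulary below (the preferred terms, the entry terms, and the names `VOCABULARY`, `build_lookup`, and `express`) is hypothetical and chosen only to illustrate the idea. It is not taken from any real subject heading list, and simple substring matching is an assumption standing in for an indexer's judgment.

```python
# Hypothetical controlled vocabulary: preferred term -> entry terms (variants).
VOCABULARY = {
    "Transgenic plants": {"genetically engineered crops", "genetically modified crops"},
    "Agricultural policy": {"agricultural import policies", "farm policy"},
    "Agriculture": {"agriculture industry", "farming"},
}


def build_lookup(vocabulary):
    """Map every lowercased preferred term and entry term to its preferred term."""
    lookup = {}
    for preferred, entry_terms in vocabulary.items():
        lookup[preferred.lower()] = preferred
        for entry in entry_terms:
            lookup[entry.lower()] = preferred
    return lookup


def express(concepts, lookup):
    """Return (assigned preferred terms, concepts with no match).

    A concept matches when it contains a known term as a substring;
    longer terms are tried first so the most specific wording wins.
    """
    keys_longest_first = sorted(lookup, key=len, reverse=True)
    assigned, unmatched = [], []
    for concept in concepts:
        text = concept.lower()
        match = next((lookup[k] for k in keys_longest_first if k in text), None)
        if match is None:
            unmatched.append(concept)
        elif match not in assigned:
            assigned.append(match)
    return assigned, unmatched


LOOKUP = build_lookup(VOCABULARY)
concepts = [
    "Bans on genetically engineered crops",
    "EU agricultural import policies",
    "US agriculture industry",
]
print(express(concepts, LOOKUP))
```

Note what this sketch leaves out: the qualifiers "bans", "EU", and "US" are dropped because the made-up vocabulary has no terms for them. Deciding whether such aspects deserve their own terms is part of the indexer's work.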

Exhaustivity and specificity
If we want consistency in indexing, we need to determine:
Exhaustivity
  What makes a theme or topic important enough to be indexed
  "Exhaustivity refers to the number of factors which are represented by the terms assigned to a document"
Specificity
  The level of detail at which the index terms are assigned
  "Specificity refers to the extent to which a particular concept which occurs in a document is specified exactly in the indexing language."
  "Loss of specificity occurs when a particular concept is represented by a term with more general meaning"
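One way to make these two ideas concrete is to treat the indexing language as a hierarchy of broader terms and count against it. The sketch below is only an illustration under stated assumptions: the `BROADER` hierarchy is invented (loosely drawn from baking), and the functions `ancestors`, `specificity_loss`, and `exhaustivity` are hypothetical measures written for this example, not standard definitions.

```python
# Hypothetical hierarchy: each term maps to its broader term.
# Top-level terms (here "Baking") have no entry.
BROADER = {
    "Sourdough bread": "Bread",
    "Bread": "Baking",
    "Croissants": "Pastry",
    "Pastry": "Baking",
}


def ancestors(term, broader):
    """Return the chain of broader terms above a term, nearest first."""
    chain = []
    while term in broader:
        term = broader[term]
        chain.append(term)
    return chain


def specificity_loss(concept, assigned, broader):
    """Number of levels by which the assigned term is more general than
    the concept: 0 for an exact match, None if the assigned term is
    neither the concept nor one of its broader terms."""
    if assigned == concept:
        return 0
    chain = ancestors(concept, broader)
    if assigned in chain:
        return chain.index(assigned) + 1
    return None


def exhaustivity(document_concepts, assigned_terms, broader):
    """Count the document's concepts that are represented, exactly or by
    a broader term, by at least one assigned term."""
    return sum(
        1
        for concept in document_concepts
        if any(
            specificity_loss(concept, term, broader) is not None
            for term in assigned_terms
        )
    )


doc_concepts = ["Sourdough bread", "Croissants"]
print(specificity_loss("Sourdough bread", "Bread", BROADER))
print(exhaustivity(doc_concepts, ["Bread"], BROADER))
print(exhaustivity(doc_concepts, ["Baking"], BROADER))
```

The sketch shows the trade-off: assigning the single general term "Baking" covers both concepts (higher exhaustivity) but represents each one two levels above where it sits in the hierarchy (loss of specificity).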

Summary
The subject, or what a document is about, is a complex concept that is difficult to define precisely.
Some approaches are based on what a document says, others on the context of use.
Subject analysis involves identifying what a document is about, expressing that subject as a set of concepts, and then selecting the index terms that best represent those concepts.

On to the next project
You will be developing a small subject language (in the form of a taxonomy) to describe the subjects of documents in a particular domain.
Your first step will be to decide on a subject area to represent, by next week's class.
  More specific and technical subject areas work best: baking, woodworking, photography, tattooing.
  Look at some basic resources to get a sense of the domain.
Your approach to the subject will be mediated through a specific audience and purpose for your subject language.
You are developing a structure to describe the subjects of documents.
  The components of your subject language will be subject concepts, not genre or form terms.