Published by Anabel Carson. Modified over 7 years ago.
1
Subject Indexing 384C – Organizing Information Week 6 Spring 2016
Karen Wickett School of Information University of Texas at Austin
2
Subjects
When we say things like
"I want some information about gardening"
"I read a great book about Andrew Jackson's presidency"
we are referring to gardening and Andrew Jackson's presidency as subjects. These are concepts that describe what a document is about, its topic or its major themes.
3
How do we determine this?
Suppose you say "I read that Andrew Jackson book. It uses Jackson's presidency as an analytical focus, but I would say it's really about democracy and federalism in the early United States." Is the subject Andrew Jackson? Is it democracy and federalism in the US? How do we know?
4
Is there an absolutely correct answer?
The idea of a "subject" seems intuitive and easy to define, but it is difficult to pin down precisely.
In casual conversation, we can repair any misunderstandings easily.
In an information system, we aim for predictability to support our objectives for information organization.
5
Document-oriented definitions
As described by Fidel, subject is often defined as a document attribute: the answer to the question "What is it about?"
Two ideas emerge here:
  Subjects exist as ideal forms, and they can be accurately identified in documents via some rational decision process.
  There is no ideal form of subject:
    a subject depends on the way that people interpret the document
    it is hard to decide which interpretation is the best
6
Terminological interlude
"Indexing" is the assignment of subject terms to a document to represent its contents
  often associated with info retrieval and search
  involves choosing subject terms from a controlled vocabulary
  some subject controlled vocabularies, notably thesauri, have relationships that structure the vocabulary
"Subject cataloging" is similar, but the terms are known as "headings"
  associated with library catalogs and the subject access point
  subject headings are a controlled vocabulary for subjects, typically with less structure than a thesaurus
"Classification" is the assignment of a subject class, or category, to a document
  classification of physical books involves choosing a single category, which is designated by a notation, e.g. QA76.9
7
Terminological interlude
Indexing, subject cataloging, and classification are closely related activities with different histories.
In a digital environment, many of the distinctions between them become imperceptible.
Any of them can be used for browsing, navigation, sorting, and demonstrating relationships between items in a collection.
8
Terminology: coordination
Precoordinate indexing
  complex terms have been enumerated in advance
  the indexer assigns the most specific appropriate term
  more precise, as it allows specification of the relationship between concepts
Postcoordinate indexing
  multiple component terms may be assigned to indicate a complex term
  the coordination (or combination) occurs when the document is indexed or searched for
  more flexible
9
Examples: postcoordinate and precoordinate indexing
Precoordinate terms:
  Design of hypertext literature
  Mushroom foraging in the Pacific Northwest
Postcoordinate terms:
  Design
  Hypertext
  Literature
  Mushrooms
  Foraging
  Pacific Northwest
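To make the contrast concrete, here is a minimal Python sketch that uses the example terms from this slide. The document IDs (`doc1` to `doc4`), the term assignments, and the function names are invented for illustration; they do not come from any real catalog. Precoordinate lookup matches one enumerated heading, while postcoordinate search combines single terms at query time by intersecting their document sets.

```python
# Hypothetical mini-collection: IDs and term assignments are invented.
precoordinate_index = {
    "Design of hypertext literature": {"doc1"},
    "Mushroom foraging in the Pacific Northwest": {"doc2"},
}

postcoordinate_index = {
    "Design": {"doc1", "doc3"},
    "Hypertext": {"doc1"},
    "Literature": {"doc1", "doc4"},
    "Mushrooms": {"doc2"},
    "Foraging": {"doc2"},
    "Pacific Northwest": {"doc2", "doc3"},
}


def precoordinate_lookup(index, heading):
    """Return documents filed under one complex, pre-enumerated heading."""
    return set(index.get(heading, set()))


def postcoordinate_search(index, terms):
    """Coordinate simple terms at search time by intersecting their postings."""
    postings = [index.get(term, set()) for term in terms]
    if not postings:
        return set()
    return set.intersection(*postings)


# The searcher combines terms; no heading had to anticipate the combination.
hypertext_design = postcoordinate_search(
    postcoordinate_index, ["Design", "Hypertext", "Literature"]
)
regional_design = postcoordinate_search(
    postcoordinate_index, ["Design", "Pacific Northwest"]
)
```

In this sketch, `regional_design` finds a document through a combination that no precoordinate heading enumerated, which is the flexibility the slide mentions. The cost is that the intersection says nothing about how the two concepts relate within the document, which is the precision a precoordinate heading preserves.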
10
Theories of subject
Subject as inherent property
Subject as interpretive construct
Use-oriented definitions
11
Subject as inherent property
In this view, ideal subjects exist independently of people. Therefore, we can talk about the subject as an objective property that is inherent in a document. And we can accurately and objectively identify subjects in documents by following logical processes of deduction. The subject is an objective property, like mass or volume, and if we just knew how to assess the document properly, we could find it.
12
Example
In the UT catalog, this book is described as being about:
  Women -- Czechoslovakia -- Social conditions.
  Women’s rights -- Czechoslovakia.
But the author has stated that it’s about civil rights and democracy.
According to the “objective idealist” view of the subject, neither of these might be right, but the right description IS out there.
Elusive Equality: Gender, Citizenship, and the Limits of Democracy in Czechoslovakia,
The subject is an objective property, like mass or volume, and if we just knew how to weigh the document properly, we could find it.
But it’s here that we run into a problem: the insufficiency of the Aristotelian notion of categories as a set of necessary and sufficient conditions for membership. In this idea of what a category is, we define a set of conditions, and if an instance fulfils those conditions, it is part of the set. A bird has wings and feathers. (Lakoff, for those who have read it.) How do we devise necessary and sufficient conditions to specify the concept of women’s rights, let alone determine whether women’s rights is the subject of the book?
This is where Patrick Wilson gets uncomfortable. He’s like, there’s no way that I can think of that unambiguously says what the subject is.
  The author doesn’t necessarily know best.
  Different people might have different ideas as to what the most important or central idea is.
  The document is not necessarily about the item that is explicitly mentioned the most.
  And we can’t objectively determine a unifying purpose, either.
Oh well, we just have to deal with it.
13
Subject as interpretive construct
The topical content of a work is determined according to an individual's interpretation.
The book on the previous slide is about whatever each of us thinks it is about.
Taking the extreme version of this point of view will diminish the value of subject description. Recall that we want predictability and consistency to achieve our goals.
14
Use-oriented definitions
In these approaches, the subject is determined with an eye on how the document is used, or might be used.
"Request-oriented" ideas of subject focus on what users need from documents.
Subject is also considered in terms of the document's ultimate contribution to knowledge.
15
Request-oriented indexing
The subject of a document is based on the document's relevance to user needs.
If you are writing a paper on democracy, and the book on slide 12 is useful, then in that scenario, the book is about democracy.
Subject is relative to the context of use.
16
Genre, form, and subject
Do artistic works have subjects?
Does it matter if the work is textual (fiction, poetry) or not (images, music, film)?
Is the "subject" of Uncle Tom's Cabin that it is a novel? It isn't about novels, though.
Is Uncle Tom's Cabin about slavery in the same way that a history book is about slavery?
17
of-ness and aboutness
This is a photo of a rose
  it contains a rose (content-wise)
  is it about roses?
It was tagged with "happiness" on Flickr.
  is it about happiness?
18
Subject analysis
Subject analysis involves the systematic determination of a document's subject for the purpose of placing it in an organized collection.
Concepts that designate the subject are assigned, often using a controlled vocabulary or classification scheme.
We want to be consistent in how subjects are assigned so that the subject has a consistent meaning in the context of our collection.
19
Subject analysis process (ISO)
Examining the document and identifying the subject
  not "reading" due to time constraints
  look at titles, abstracts, headings, figures, introduction
Identifying primary concepts in the subject
  standardized via checklists
Determining how to express those concepts in the indexing vocabulary
20
Example: subject analysis
An article is about possible economic effects on US agriculture as the result of a policy decision by the EU to enable member countries to ban genetically engineered crops.
Concepts in this subject description might be:
  Bans on genetically engineered crops
  EU agricultural import policies
  US agriculture industry
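The last step of the process, expressing the identified concepts in the indexing vocabulary, might be sketched in Python as follows. This is only an illustration under stated assumptions: the vocabulary entries, the preferred-term strings, and the function names are all invented here and are not taken from any real thesaurus or subject heading list. The extra concept "Consumer attitudes" is also added purely to show what happens when the vocabulary has no matching term.

```python
# Hypothetical controlled vocabulary: every entry is invented for illustration.
# Keys are normalized entry terms; values are the preferred indexing terms.
ENTRY_TO_PREFERRED = {
    "bans on genetically engineered crops":
        "Genetically engineered crops -- Law and legislation",
    "eu agricultural import policies":
        "Agricultural trade policy -- European Union",
    "us agriculture industry":
        "Agriculture -- United States",
}


def normalize(concept):
    """Lowercase and collapse whitespace so lookups are consistent."""
    return " ".join(concept.lower().split())


def translate_concepts(concepts, vocabulary):
    """Map free-text concepts to preferred terms; report what has no match."""
    assigned, unmatched = [], []
    for concept in concepts:
        preferred = vocabulary.get(normalize(concept))
        if preferred is None:
            unmatched.append(concept)  # needs an indexer's judgment
        elif preferred not in assigned:
            assigned.append(preferred)
    return assigned, unmatched


concepts = [
    "Bans on genetically engineered crops",
    "EU agricultural import policies",
    "US agriculture industry",
    "Consumer attitudes",  # not on the slide; added to show an unmatched concept
]
assigned, unmatched = translate_concepts(concepts, ENTRY_TO_PREFERRED)
```

Routing every concept through one lookup table is what gives the predictability discussed earlier: two indexers who identify the same concept end up assigning the same term. Unmatched concepts are returned separately rather than guessed at, since choosing a broader or related term is a judgment call.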
21
Exhaustivity and specificity
If we want consistency in indexing, we need to determine:
Exhaustivity
  What makes a theme or topic important enough to be indexed
  "Exhaustivity refers to the number of factors which are represented by the terms assigned to a document"
Specificity
  The level of detail at which the index terms are assigned
  "Specificity refers to the extent to which a particular concept which occurs in a document is specified exactly in the indexing language."
  "Loss of specificity occurs when a particular concept is represented by a term with more general meaning"
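One rough way to put numbers on these two ideas is sketched below in Python. It assumes a small hierarchy of broader terms, which is invented for illustration, and it treats exhaustivity as a simple count of distinct assigned terms and loss of specificity as the number of steps up the hierarchy from the concept in the document to the term actually assigned. These measures and names are assumptions of this sketch, not definitions from the quoted source.

```python
# Hypothetical hierarchy: each term maps to its broader term (None at the top).
BROADER = {
    "Chanterelles": "Mushrooms",
    "Mushrooms": "Fungi",
    "Fungi": None,
}


def exhaustivity(assigned_terms):
    """Count the distinct terms assigned to a document."""
    return len(set(assigned_terms))


def specificity_loss(concept, assigned_term, broader):
    """Steps from the concept up to the assigned term.

    Returns 0 when the concept is specified exactly, a positive number when
    a more general term was used, and None when the assigned term is not the
    concept or one of its broader terms.
    """
    steps = 0
    current = concept
    while current is not None:
        if current == assigned_term:
            return steps
        current = broader.get(current)
        steps += 1
    return None


exact = specificity_loss("Chanterelles", "Chanterelles", BROADER)
general = specificity_loss("Chanterelles", "Fungi", BROADER)
```

Here `exact` is 0 and `general` is 2: indexing a document about chanterelles under "Fungi" loses two levels of detail. A policy for consistent indexing would set both how many terms to assign and how far up the hierarchy an indexer may go.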
22
Summary
The subject, or what a document is about, is a complex concept that is difficult to define precisely.
Some approaches are based on what a document says, others on the context of use.
Subject analysis involves identifying what a document is about, expressing that subject as a set of concepts, and then selecting the index terms that best represent those concepts.
23
On to the next project
You will be developing a small subject language (in the form of a taxonomy) to describe the subjects of documents in a particular domain.
Your first step, due by next week's class, will be to decide on a subject area to represent.
More specific and technical subject areas work best: baking, woodworking, photography, tattooing.
Look at some basic resources to get a sense of the domain.
Your approach to the subject will be mediated through a specific audience and purpose for your subject language.
You are developing a structure to describe the subjects of documents. The components of your subject language will be subject concepts, not genre or form terms.