Indexing, or assigning subject terms to documents


Subjects: Indexing, or assigning subject terms to documents

Subjects When we say “I want some information about gardening” or “I read a great book about Andrew Jackson’s presidency” we all know what those things mean, right? We are referring to gardening and Andrew Jackson’s presidency as subjects. These are concepts that describe what the document is about, its topic or its major themes.

But how do we determine a document’s subject? What if I say, “Oh, I read that Andrew Jackson book. I don’t think it’s really about Andrew Jackson, though. It’s really about democracy and federalism in the early United States.” Is the subject Andrew Jackson? Is it democracy and federalism in the United States? How do we know?

Oh no, it’s like the “work” Similar to the idea of the “work” that we talked about earlier in the course, the idea of a “subject” seems intuitive and easy to define, but it’s actually difficult to pin down precisely. Remember the catalog records for the Protocols of the Elders of Zion? Is that book “about” the Jewish conspiracy to create a world government? Is it “about” anti-semitism? Is it “about” hoaxes?

Why should we care? So it’s hard to say for certain what the subject of a document is. What’s the problem? Like the idea of the work, we use the idea of the subject constantly when we are seeking, using, evaluating, and otherwise needing to describe documents. The subject is a key attribute in facilitating effective information retrieval. So we kind of have to deal with it.

Document-oriented definitions of the subject As described by Fidel, subject is often defined as a document attribute. (The answer to the question “what is it about?”) Two ideas of the subject emerge from this line of thinking: (1) Subjects exist as ideal forms, and they can be accurately identified in documents via some rational decision process (Hjorland’s “objective idealism”). (2) There is no ideal form of subject; a subject depends on the way that people interpret the document, and it is hard to decide which interpretation is best (Hjorland’s “subjective idealism”).

Interlude: indexing “Indexing” is the assignment of subject terms to a document to represent its contents. “Subject cataloging” is similar, but the terms are known as “headings.” “Classification” is the assignment of a subject class, or category, to a document. Yeah, they’re kind of the same thing. Except they have slightly different histories.

Interlude: postcoordinate and precoordinate indexing Precoordinate indexing means that complex terms have been enumerated in advance, and the indexer typically assigns the most specific appropriate term. Postcoordinate indexing means that multiple component terms may be assigned to indicate a complex term. The “coordination,” or combination, of terms occurs when the document is indexed (precoordinate) or when it is searched for (postcoordinate).

Examples: postcoordinate and precoordinate indexing Precoordinate indexing terms might include: Design of hypertext literature. Mushroom foraging in the Pacific Northwest. Postcoordinate indexing terms might include: Design. Hypertext. Literature. Mushrooms. Foraging. Pacific Northwest.
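
The difference can be sketched in a few lines of Python. This is only an illustration: the two-document collection, the document ids, and the function names are invented for the example, and real systems are far more elaborate. In the precoordinate index the searcher must match a whole enumerated heading; in the postcoordinate index the searcher combines component terms at search time.

```python
# Hypothetical toy collection: document id -> assigned index terms.
precoordinate_index = {
    "doc1": {"Design of hypertext literature"},
    "doc2": {"Mushroom foraging in the Pacific Northwest"},
}
postcoordinate_index = {
    "doc1": {"Design", "Hypertext", "Literature"},
    "doc2": {"Mushrooms", "Foraging", "Pacific Northwest"},
}


def search_precoordinate(index, heading):
    """Return documents that were assigned exactly this enumerated heading."""
    return sorted(doc for doc, terms in index.items() if heading in terms)


def search_postcoordinate(index, *terms):
    """Return documents assigned ALL of the given component terms.

    The combination (coordination) of terms happens here, at search time.
    """
    wanted = set(terms)
    return sorted(doc for doc, assigned in index.items() if wanted <= assigned)


print(search_precoordinate(precoordinate_index, "Design of hypertext literature"))
print(search_postcoordinate(postcoordinate_index, "Mushrooms", "Foraging"))
```

Note that the postcoordinate search can combine terms the indexer never thought of putting together, while the precoordinate search only succeeds on headings that were enumerated in advance.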

Subject as inherent property In this view, ideal subjects exist in some Platonic dimension. Because subjects do, in fact, exist independently of people, we can talk about the subject as an objective property that is inherent in documents. We can accurately and objectively identify subjects in documents by following logical processes of deduction.

Example: My sister’s book In the UT catalog, my sister’s book is about: Women -- Czechoslovakia -- Social conditions. Women's rights -- Czechoslovakia. But she thinks it’s about civil rights and democracy. According to the “objective idealist” view of the subject, neither of these might be right, but the right description IS out there. Elusive Equality: Gender, Citizenship, and the Limits of Democracy in Czechoslovakia, 1918-1950

Subject as interpretive construct The “subject” is subjective! Anyone’s interpretation is ok! Maybe my sister’s book is about women’s rights in Czechoslovakia, maybe it’s about civil rights and democracy. Can’t we just all get along? But can I say the book is about sea slugs or Andrew Jackson’s presidency or how bad things happen when you let those pesky wimmin get the vote and stuff?

Use-oriented definitions of the subject In these definitions, the subject depends on how a document is or might be used. Two variations of this idea: “Request-oriented” ideas of the subject, or what users need from documents (Hjorland’s “pragmatic” view). The ultimate contribution of the document to knowledge (Hjorland’s “realist/materialist” view).

Subject as what you need In “request-oriented indexing” as described by Fidel, or in Hjorland’s discussion of pragmatic views of the subject, the subject of a document is based on its relevance to user needs. If you are writing a paper on democracy and my sister’s book can help you, then her book is about democracy. For you. In that situation.

Subject as prediction of a document’s contribution In Hjorland’s “realist/materialist” view, the subject describes a document’s contribution to its discipline, or to human knowledge. A subject determination is thus a kind of prediction about what a document’s importance will be to the field. This is “realist” because eventually there will be a sort of answer. The Protocols of the Elders of Zion is really about anti-semitism and hoaxes—that’s its ultimate contribution to knowledge—not the Jewish conspiracy to rule the world. In the meantime, a subject determination is an argument for what might or should happen.

Genre, form, and subject Do artistic works have subjects? Does it matter if the work is text (fiction, poetry) or not (images, music, film)? Is the “subject” of Uncle Tom’s Cabin that it is a novel? It’s not about novels! Is the subject “slavery?” Is Uncle Tom’s Cabin about slavery in the same way that a history book is about slavery?

Of-ness and aboutness This photo is of a rose (it contains a rose). Is it about happiness? (It was tagged with “happiness” in Flickr.)

Subject analysis Subject analysis involves the systematic determination of a document’s subject for the purpose of placing the document in an organized collection. Concepts that designate the subject are often assigned using some type of controlled vocabulary or classification scheme. In this case, we want to be consistent in how subjects are assigned, so that the subject has a consistent meaning in the context of the collection, at least.

Subject analysis process There are a number of models, but the ISO standard for subject analysis involves three activities: Examining the document and identifying the subject. Identifying the primary concepts in the subject. Determining how to express those concepts in the vocabulary that is being used for indexing.

Example: subject analysis An article is about possible economic effects on U.S. agriculture as the result of a policy decision by the EU to enable member countries to ban genetically engineered crops. Concepts in this subject description might be: Bans on genetically engineered crops. European Union agricultural import policies. United States agriculture industry.
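
The third activity, expressing concepts in the indexing vocabulary, can be sketched as a simple lookup. This is a rough illustration under stated assumptions: the vocabulary entries below are invented for the example and are not headings from any real scheme, and the function name is hypothetical. The first two activities (examining the document and picking out concepts) are human judgment and are taken as given here.

```python
# Invented, illustrative vocabulary: concept (lowercased) -> preferred index term.
# These are NOT headings from any real controlled vocabulary.
VOCABULARY = {
    "bans on genetically engineered crops": "Genetically engineered crops -- Bans",
    "european union agricultural import policies": "Agricultural policy -- European Union",
    "united states agriculture industry": "Agriculture -- United States",
}

# Concepts identified by the indexer for the article in the example above.
concepts = [
    "Bans on genetically engineered crops",
    "European Union agricultural import policies",
    "United States agriculture industry",
]


def express_concepts(concepts, vocabulary):
    """Map each concept to its preferred term; collect concepts with no match."""
    terms, unmatched = [], []
    for concept in concepts:
        key = concept.strip().lower()
        if key in vocabulary:
            terms.append(vocabulary[key])
        else:
            unmatched.append(concept)
    return terms, unmatched


terms, unmatched = express_concepts(concepts, VOCABULARY)
print(terms)
print(unmatched)
```

Concepts that land in the unmatched list are the ones where the indexer has to choose a broader or related term, or accept that the vocabulary cannot express the concept.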

Exhaustivity and specificity If we want to attempt consistency in indexing, we need to determine: What makes a theme or topic important enough to be indexed (exhaustivity). The level of detail at which index terms are assigned (specificity).
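
One way to picture these two decisions is as two settings on an indexing policy. The sketch below is an assumption-laden toy: the themes, the "share" numbers (a made-up estimate of how much of the document each theme occupies), and the parameter names are all invented. A threshold on share stands in for exhaustivity, and a limit on how far down a broad-to-narrow path the indexer may go stands in for specificity.

```python
# Hypothetical themes for one document: (broad-to-narrow path, share of the document).
themes = [
    (("Agriculture", "Crops", "Genetically engineered crops"), 0.60),
    (("Economics", "Trade policy"), 0.30),
    (("Law",), 0.05),
]


def select_terms(themes, min_share, max_depth):
    """Pick index terms under a given exhaustivity and specificity policy.

    min_share: themes below this share are not indexed (lower = more exhaustive).
    max_depth: how many levels down each path a term may be taken from
               (higher = more specific). Must be at least 1.
    """
    selected = []
    for path, share in themes:
        if share >= min_share:                      # exhaustivity decision
            selected.append(path[:max_depth][-1])   # specificity decision
    return selected


# Less exhaustive, less specific policy.
print(select_terms(themes, min_share=0.25, max_depth=2))
# More exhaustive, more specific policy.
print(select_terms(themes, min_share=0.01, max_depth=3))
```

Whatever the settings, the point is that they are fixed for the collection, so that two indexers looking at the same document make the same kind of choices.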

Summary The subject, or what a document is about, is a complex concept that is difficult to define precisely. Some ideas of the subject are based on what a document says, others on the context of use. Subject analysis involves identifying what a document is “about,” expressing that subject as a set of concepts, and then selecting the index terms that best represent those concepts.