Self-organizing Conceptual Map and Taxonomy of Adjectives. Noriko Tomuro (DePaul University), Kyoko Kanzaki (NICT Japan), Hitoshi Isahara (NICT Japan). April 20, 2007
Overview In natural languages, adjectives are polysemous, e.g. “warm soup” (temperature) vs. “warm person” (personality). Conversely, a given adjectival concept includes adjectives extended from various domains, e.g. adjectives which express feeling: “happy” (emotion), “cold” (temperature), “painful” (sensation). We use the Kohonen Self-Organizing Map (SOM) to (1) visualize the adjectival concept space and (2) create a taxonomy of adjectival concepts. This is a comprehensive, corpus-based study of adjectives.
Outline 1. Adjectival Concepts 2. Kohonen Self-Organizing Map (SOM) 3. Conceptual Map of Adjectival Concepts 4. Taxonomy of Adjectival Concepts 5. Conclusions 6. Future Work
1. Adjectival Concepts (1) An adjectival concept is a semantic class of adjectives (adjectives which express XX). Some adjectival concepts are more closely related than others, e.g. perception (“warm”, “painful”), personality (“warm”, “gentle”), degree (“high”, “wide”). We wish to visualize the adjectival concept space automatically, from corpus data, on a 2-dimensional map.
1. Adjectival Concepts (2) We use abstract nouns to represent adjectival concepts, and extract examples from a corpus (Japanese newspaper articles) where adjectives modify abstract nouns, e.g. “warm personality”, “warm feeling”. Dataset: 361 Japanese abstract nouns, defined by 2374 adjectives. Frequency counts are converted to Mutual Information (MI) values (for feature weighting).
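The conversion from frequency counts to MI values can be illustrated with a minimal sketch. It assumes that MI here means pointwise mutual information with a base-2 logarithm; the slides do not state the exact formula. The function name `pmi_weights` and the toy pairs are hypothetical and not taken from the corpus.

```python
import math
from collections import Counter

def pmi_weights(pairs):
    """Turn (noun, adjective) co-occurrence pairs into pointwise MI weights.

    Assumption: MI = log2( P(noun, adj) / (P(noun) * P(adj)) ).
    The original study may have used a different variant.
    """
    pair_counts = Counter(pairs)
    noun_counts = Counter(noun for noun, _ in pairs)
    adj_counts = Counter(adj for _, adj in pairs)
    total = len(pairs)

    weights = {}
    for (noun, adj), count in pair_counts.items():
        p_joint = count / total
        p_noun = noun_counts[noun] / total
        p_adj = adj_counts[adj] / total
        weights[(noun, adj)] = math.log2(p_joint / (p_noun * p_adj))
    return weights

# Toy co-occurrence data (hypothetical, for illustration only).
pairs = [
    ("personality", "warm"), ("personality", "warm"),
    ("personality", "gentle"), ("feeling", "warm"),
]
weights = pmi_weights(pairs)
```

In this toy data, “gentle” occurs only with “personality”, so that pair receives a positive weight, while the frequent but unspecific pair (“personality”, “warm”) receives a slightly negative one.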
2. Kohonen Self-Organizing Map (SOM) (1) SOM is an unsupervised learning method, originally developed by T. Kohonen in the 1980s. It is used for: neuroscience (mapping sensory stimuli to the brain); clustering; and visualizing high-dimensional data in a low-dimensional space (usually a 2-dimensional grid).
2. SOM (2) Each node in a SOM map is associated with a reference vector. During learning, weights on the reference vectors are adjusted so that similar input instances are mapped to nearby nodes in the map.
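The learning step described on this slide can be sketched as follows. This is a generic, minimal SOM in plain Python, not the authors' implementation: the Gaussian neighbourhood, the linear decay schedules, and all names and parameter values (`train_som`, `lr0`, the 3 × 3 toy grid) are assumptions made for illustration.

```python
import math
import random

def best_matching_unit(grid, x):
    """Return the grid coordinate whose reference vector is closest to x."""
    return min(grid, key=lambda node: sum((w - v) ** 2 for w, v in zip(grid[node], x)))

def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a rows x cols SOM; returns {(row, col): reference_vector}."""
    rng = random.Random(seed)
    dim = len(data[0])
    sigma0 = max(rows, cols) / 2
    # One randomly initialised reference vector per map node.
    grid = {(r, c): [rng.random() for _ in range(dim)]
            for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1 - frac)                   # decaying learning rate
        sigma = max(sigma0 * (1 - frac), 0.5)   # shrinking neighbourhood radius
        for x in rng.sample(data, len(data)):   # visit inputs in random order
            bmu = best_matching_unit(grid, x)
            for node, w in grid.items():
                # Nodes near the best-matching unit on the grid move more.
                d2 = (node[0] - bmu[0]) ** 2 + (node[1] - bmu[1]) ** 2
                h = math.exp(-d2 / (2 * sigma ** 2))
                for i in range(dim):
                    w[i] += lr * h * (x[i] - w[i])
    return grid

# Toy input vectors (hypothetical); the study used MI-weighted noun vectors.
data = [[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0]]
som = train_som(data, rows=3, cols=3)
```

After training, `best_matching_unit(som, x)` gives the map node for an input `x`. Because each update pulls the best-matching node and its grid neighbours toward the input, similar inputs tend to end up on nearby nodes.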
3. Conceptual Map of Adjectival Concepts Map size: 45 * 45
[Figure: SOM map with tight clusters overlaid; the “TOP” node is marked.]
Conceptual Area Map of Adjectives (concept areas with example adjectives)
Image, Figure, Atmosphere: tender, brilliant, quiet, …
Characteristics of humans or things, Relation: brave, monopolistic, privileged, intimate, …
Sense, Perception: soft, bitter, red, white, …
Shape, Appearance: round, square, flat, three-dimensional, …
Degree: fast, high, cheap, wide, …
Viewpoint, Domain, Attitude: traditional, conservative, historical, …
Effect, Influence: powerful, extreme, …
State, Status: unhappy, dangerous, difficult, …
Talent, Ability: excellent, creative, …
TOP, Thing, Feeling, Aspect (no example adjectives shown)
4. Taxonomy of Adjectival Concepts (1) We also wish to extract a taxonomy of adjectival concepts, which indicates the breadth of the concepts. The hierarchical relation is based on subsumption. Example: Feeling (happy, sad, hot, cold, painful, tickling) subsumes Emotion (happy, sad) and Sense (hot, cold, painful, tickling); Sense in turn subsumes Temperature (hot, cold) and Tactual sensation (painful, tickling).
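The subsumption example on this slide can be reproduced with a small sketch that treats each concept as a plain set of adjectives and uses strict set inclusion. This is a simplifying assumption: the actual data are weighted vectors, and the helper name `direct_parents` is hypothetical.

```python
# Adjective sets taken from the example on this slide.
concepts = {
    "Feeling": {"happy", "sad", "hot", "cold", "painful", "tickling"},
    "Emotion": {"happy", "sad"},
    "Sense": {"hot", "cold", "painful", "tickling"},
    "Temperature": {"hot", "cold"},
    "Tactual sensation": {"painful", "tickling"},
}

def direct_parents(concepts):
    """Map each concept to the smallest concepts that strictly contain it."""
    parents = {}
    for child, child_adjs in concepts.items():
        # All concepts whose adjective set is a strict superset of the child's.
        supers = [p for p, adjs in concepts.items() if child_adjs < adjs]
        # Drop any superset that itself contains another superset.
        parents[child] = sorted(
            p for p in supers
            if not any(concepts[q] < concepts[p] for q in supers)
        )
    return parents

taxonomy = direct_parents(concepts)
```

Here `taxonomy["Temperature"]` is `["Sense"]` and `taxonomy["Sense"]` is `["Feeling"]`, matching the hierarchy in the example. A concept may have several direct parents, so the result is a graph in general.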
4. Taxonomy of Adjectival Concepts (2) The derived SOM map also readily formed a taxonomy, because highly abstract nouns take on many adjectives, while less abstract nouns take on specific adjectives. Map nodes are connected in parent-child relations (using cosine & entropy), giving a taxonomy overlaid on the SOM map. [Figure: map regions labeled highly abstract, moderately abstract, and less abstract.]
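The slide does not say how cosine and entropy are combined, so the following is only one plausible reading, offered as a sketch. It assumes that entropy over a noun's adjective distribution measures abstractness (many adjectives give high entropy) and that a node's parent is the most cosine-similar node with higher entropy. The function names and toy vectors are hypothetical, and weights are assumed non-negative.

```python
import math

def entropy(vec):
    """Shannon entropy (bits) of a non-negative weight vector."""
    total = sum(vec)
    if total == 0:
        return 0.0
    return -sum((v / total) * math.log2(v / total) for v in vec if v > 0)

def cosine(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def link_parents(vectors):
    """Assumed rule: parent = most similar concept with strictly higher entropy."""
    ent = {name: entropy(vec) for name, vec in vectors.items()}
    parents = {}
    for child, vec in vectors.items():
        candidates = [name for name in vectors if ent[name] > ent[child]]
        parents[child] = max(
            candidates, key=lambda name: cosine(vec, vectors[name]), default=None
        )
    return parents

# Toy vectors over the adjectives [happy, sad, hot, cold] (hypothetical).
vectors = {
    "Feeling": [1, 1, 1, 1],
    "Emotion": [1, 1, 0, 0],
    "Temperature": [0, 0, 1, 1],
}
links = link_parents(vectors)
```

This sketch picks a single parent per node for brevity. Keeping every candidate above a similarity threshold instead would allow multiple parents per node.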
[Figure: taxonomy overlaid on the conceptual area map. Areas: Image, Figure, Atmosphere; Characteristics of humans or things, Relation …; Sense, Perception; Shape, Appearance; Degree; Viewpoint, Domain, Attitude; Effect, Influence; State, Status; Talent, Ability; TOP, Matter, Feeling.] Branches descend to specific concept areas. Concepts near “TOP” are densely overlapping: extremely abstract concepts are vague and indistinguishable.
Taxonomy of “kibishii (tough/hard/strict)” [Figure: taxonomy graph with nodes Feeling, Aspect, Image, Direction, Atmosphere, Perception, Nature, Circumstance, Personality, Shade, Situation, Degree, Order.] The taxonomy is a graph (not a tree), forming a complex hierarchical structure. We can observe the breadth of various concepts, e.g. many kinds of “image”: atmospheric (“quiet image”), perceptual (“soft image”), personality (“brave image”). We can also observe the relative closeness between concepts.
5. Conclusions We used SOM to visualize the adjectival concept space and derive a taxonomy. Our results will be useful in various areas: studying how adjectives extend meanings, i.e. meaning shift (linguistics); studying how adjectives are acquired or cognitively modeled (cognitive science and psychology); (automatically) deriving the meaning of a sentence at a deeper level (NLP); and serving as meta-tags to describe data instances (semantic web).
6. Future Work Apply the method to other data: other languages (e.g. English) and other genres (e.g. web texts). Conduct psychological experiments to see how the results correlate with human cognition of adjectives. Develop a lexical representation for adjectives.