Large-Scale Organization of Semantic Networks Mark Steyvers Josh Tenenbaum Stanford University.


1 Large-Scale Organization of Semantic Networks Mark Steyvers Josh Tenenbaum Stanford University

2 Graph theoretic analyses. Collaboration networks of film actors and scientists: Watts & Strogatz (1998); Newman (2001). Neural network of the worm C. elegans: Watts & Strogatz (1998). WWW: Barabasi & Albert (1999).

3 Overview. Link structure of semantic networks: small-world & scale-free. What produces such link structures? Semantic growth. Relation to age-of-acquisition effects. Behavioral effects of link structure.

4 Word Association (Nelson et al., 1999): number of words = 5,000+

5 Roget’s Thesaurus: categories 1,000; word forms 29,000+

6 WordNet (George Miller): word forms 122,000+; word senses 99,000+

7 One class of small-world networks:
1. Short path lengths: L = average length of the shortest path between two nodes
2. Local clustering: C = 3 x (number of triangles) / (number of connected triples of vertices), ranging from C = 0 to C = 1
3. Power-law degree distribution: γ = exponent in the power-law degree distribution
                        Word Association   WordNet     Roget
n = number of nodes     5018               200,000+    30,000+
L                       3.04               10.6        5.6
C                       .186               .029        .875
L (random graphs)       3.03               10.61       5.43
C (random graphs)       .004               .000        .613
γ (random graphs)       -                  -           -
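The two small-world statistics defined on this slide can be computed directly from an adjacency structure. The following is a minimal sketch, assuming an undirected, connected graph stored as a dict that maps each node to the set of its neighbors; the function names and the two toy graphs are illustrative and not from the presentation.

```python
from collections import deque
from itertools import combinations


def average_shortest_path_length(adj):
    """L: mean shortest-path length over all pairs of distinct, reachable nodes."""
    total, pairs = 0, 0
    for source in adj:
        # Breadth-first search gives shortest paths in an unweighted graph.
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1  # exclude the source itself
    return total / pairs


def clustering_coefficient(adj):
    """C = 3 x (number of triangles) / (number of connected triples)."""
    closed = 0   # closed triples; every triangle is counted once per corner, i.e. 3 times
    triples = 0  # connected triples, counted at their centre node
    for centre, neighbors in adj.items():
        for a, b in combinations(neighbors, 2):
            triples += 1
            if b in adj[a]:
                closed += 1
    return closed / triples if triples else 0.0


# Toy graphs (illustrative): a triangle and a star with three leaves.
triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}

print(clustering_coefficient(triangle))    # 1.0
print(clustering_coefficient(star))        # 0.0
print(average_shortest_path_length(star))  # 1.5
```

The triangle and the star reproduce the two extremes C = 1 and C = 0 mentioned on the slide.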

8 Exponential tail: e.g., random graphs (Erdős-Rényi) or the Watts & Strogatz (1998) model. Power-law tail: hubs.
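The two tail shapes contrasted on this slide can be written in their standard textbook forms. This is a sketch under stated assumptions: P(k) denotes the probability that a node has degree k, γ is the power-law exponent used elsewhere in the talk, and the scale k_0 is an illustrative symbol that does not appear in the presentation.

```latex
% Exponential tail: very high degrees are exponentially unlikely
P(k) \sim e^{-k/k_0}

% Power-law tail: decays only polynomially, so hubs occur
P(k) \sim k^{-\gamma}
```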

9 γ = 3.01; γ = 3.19; γ = 3.11

10 Zipf’s (1949) “Law of Meaning”: word frequency rank vs. number of meanings. Slope in rank plot: a = .466. Slope in distribution plot: γ = 3.15. Adamic (2000): γ = 1 + 1/a
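As a quick arithmetic check, plugging the rank-plot slope into the Adamic (2000) relation gives a value that agrees with the distribution-plot exponent on the slide after rounding:

```latex
\gamma = 1 + \frac{1}{a} = 1 + \frac{1}{0.466} \approx 3.146 \approx 3.15
```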

11 Overview. Link structure of semantic networks: small-world & scale-free. What produces such link structures? Semantic growth. Relation to age-of-acquisition effects. Behavioral effects of link structure.

12 H.A. Simon (1955). Power laws in distributions: scientists by number of papers published; cities by population; income by size -> “rich get richer” growth-like stochastic process. Barabasi et al. (1999). Power laws in WWW: in-degree & out-degree -> growth processes.

13 Proposal: Power-law degree distributions in semantic networks are a signature of semantic growth, both within an individual (lexical development) and across speakers (language evolution). Disclaimer: We will not describe in detail any specific psychological mechanism.

14 Growing Network Model. Representation: nodes represent words or concepts; edges represent semantic relations or associations. Variables: k_i = degree of node i; u_i = utility of node i, based on word frequency

15 Start with a small, fully connected network of M nodes. A new node is inserted in two steps:
1) Choose a local neighborhood i (the neighborhood i of a node is formed by node i and its neighbors)
2) Make M connections into that neighborhood
Repeat n times, until the network is large enough.

16 Step 1: Preferentially choose large neighborhoods. Step 2: Preferentially make M connections to nodes with high utility. [Figure: example network with a new node, shown for each step]
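The growth procedure on slides 15 and 16 can be sketched in a few lines. This is a minimal illustration and not the authors' implementation. It assumes that a neighborhood is chosen with probability proportional to the degree k_i of its centre node, that the M targets inside it are drawn without replacement with weight u_i, and that utilities are uniform unless the caller supplies a function. The names `grow_network`, `n_total` and `utility` are hypothetical.

```python
import random


def grow_network(n_total, M, utility=None, seed=None):
    """Grow an undirected network to n_total nodes; returns {node: set_of_neighbors}."""
    if M < 2 or n_total < M:
        raise ValueError("need M >= 2 and n_total >= M")
    rng = random.Random(seed)
    if utility is None:
        utility = lambda node: 1.0  # assumption: uniform utility for every node

    # Start with a small, fully connected network of M nodes.
    adj = {i: set(range(M)) - {i} for i in range(M)}

    for new in range(M, n_total):
        nodes = list(adj)
        # Step 1: choose a neighborhood, favoring large ones
        # (assumption: probability proportional to the centre's degree k_i).
        degrees = [len(adj[v]) for v in nodes]
        centre = rng.choices(nodes, weights=degrees, k=1)[0]
        pool = sorted(adj[centre] | {centre})  # node i together with its neighbors

        # Step 2: make M connections into that neighborhood, favoring high utility
        # (assumption: weighted sampling without replacement, weight u_i > 0).
        targets = []
        for _ in range(M):
            weights = [utility(v) for v in pool]
            pick = rng.choices(pool, weights=weights, k=1)[0]
            targets.append(pick)
            pool.remove(pick)

        adj[new] = set(targets)
        for t in targets:
            adj[t].add(new)
    return adj


net = grow_network(n_total=200, M=3, seed=0)
print(len(net))  # 200
```

Every neighborhood contains at least M nodes, because the seed network is fully connected and each later node arrives with M links, so step 2 can always find M distinct targets.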

17

18

19

20

21

22

23

24

25
                           Word Association   Growing Network Model   Barabasi & Albert (1999) Model
n                          5018               5018                    5018
<k>                        22                 22                      22
Path length L              3.04               2.84 (.04)              2.85
Clustering coefficient C   .186               .185 (.007)             .020
Power-law coefficient γ                                               2.83

26 Power-laws in non-growing semantic representations?

27 LSA: Latent Semantic Analysis, e.g., Landauer & Dumais (1997). Analyzed co-occurrence statistics in a large corpus. Placed 60,000+ words in a 300-dimensional space. Good semantic neighbors (example words: volcano, Hawaii, ache, relax, soothe, lava). Convert the LSA space to a graph by variable thresholding on a similarity measure.
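Turning a vector space into a graph by thresholding can be sketched as follows. This is an assumption-laden illustration: it uses cosine similarity and one fixed threshold, which is simpler than the variable thresholding the slide mentions, and the two-dimensional vectors are made-up toy values, not LSA output.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def threshold_graph(vectors, threshold):
    """Link two words whenever their cosine similarity reaches the threshold."""
    adj = {word: set() for word in vectors}
    for a, b in combinations(vectors, 2):
        if cosine(vectors[a], vectors[b]) >= threshold:
            adj[a].add(b)
            adj[b].add(a)
    return adj


# Toy 2-D vectors (hypothetical values chosen for illustration only).
toy_vectors = {
    "volcano": (1.0, 0.1),
    "lava": (0.9, 0.2),
    "ache": (0.1, 1.0),
    "soothe": (0.2, 0.9),
}

graph = threshold_graph(toy_vectors, threshold=0.8)
print(sorted(graph["volcano"]))  # ['lava']
```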

28 Tversky & Hutchinson (1986): Low-dimensional geometric models are not suitable for representing conceptual similarity relations, because they impose an upper bound on the number of points that can share the same nearest neighbor.

29 Ferrer & Solé (submitted): Connect two words if they co-occur within a small contextual window; slide the window over a large corpus. No good semantic neighborhoods: volcano -> was -> head -> ache (word association: volcano -> Hawaii -> relax -> soothe -> ache), or tick -> tock -> made -> wonderful -> universe (word association: tick -> dog -> master -> universe).

30 Overview. Link structure of semantic networks: small-world & scale-free. What produces such link structures? Semantic growth. Relation to age-of-acquisition effects. Behavioral effects of link structure.

31 Age of acquisition (AoA) effects. Naming and lexical decision tasks: Carroll & White (1973); Brysbaert et al. (2000). Locus of AoA effects? Brown & Watson (1987); Lambon Ralph et al. (1998). Is AoA really a cumulative frequency effect? Lewis, Gerhand & Ellis (1999). Need a framework to understand AoA effects.

32 Prediction of model: early acquired nodes have more connections. Do words acquired early in life have more connections? [Figure panels: t = 1…15, t = 16…50, t = 51…150]

33

34 Language Evolution: Words acquired early in the English language are words with high degree (work in progress)

35 Overview. Link structure of semantic networks: small-world & scale-free. What produces such link structures? Semantic growth. Relation to age-of-acquisition effects. Behavioral effects of link structure.

36 Behavioral effects of structural variables. Centrality: degree centrality; authority (eigenvector centrality). Proposal: In a cognitive system, search is biased toward facts, concepts, or words with high centrality. Naming and lexical decision latencies.
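The two centrality measures named on this slide can be computed as follows. This is a minimal sketch, assuming an undirected, connected graph stored as a dict of neighbor sets. Eigenvector centrality is approximated by power iteration on A + I, a shift that leaves the eigenvectors unchanged and lets the iteration converge on bipartite graphs such as a star. The function names and the toy graph are illustrative.

```python
import math


def degree_centrality(adj):
    """Degree centrality: the number of links attached to each node."""
    return {node: len(neighbors) for node, neighbors in adj.items()}


def eigenvector_centrality(adj, iterations=100):
    """Approximate the principal eigenvector of the adjacency matrix A."""
    score = {node: 1.0 for node in adj}
    for _ in range(iterations):
        # One multiplication by (A + I): own score plus the neighbors' scores.
        updated = {
            node: score[node] + sum(score[nbr] for nbr in adj[node])
            for node in adj
        }
        norm = math.sqrt(sum(x * x for x in updated.values()))
        score = {node: x / norm for node, x in updated.items()}
    return score


# Toy star graph (illustrative): node 0 is the hub.
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}

print(degree_centrality(star)[0])  # 3
scores = eigenvector_centrality(star)
print(max(scores, key=scores.get))  # 0
```

Under the slide's proposal, a search process biased toward high centrality would tend to reach the hub of this toy graph before any leaf.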

37

38

39 Semantic Dementia Hodges & Patterson (1995)

40 Conclusion. Link structure of semantic networks: a) shows non-trivial patterns; b) shows the signature of growth processes (“rich get richer”, respecting local neighborhoods); c) is relevant for search strategies (central words might be searched first). Paper will be available at www-psych.stanford.edu/~msteyver

41 But… Early acquired words become more central in your model, but maybe words that are more central are acquired earlier.

42 Earliest year of quotation (in OED) vs. k (connectivity)

