1
A probabilistic approach to semantic representation
Paper by Thomas L. Griffiths and Mark Steyvers
2
Approaches to semantic representation
  Spatial approaches
  Latent Semantic Analysis (LSA)
  Semantic networks
  Probabilistic approach
3
A probabilistic approach
  The purpose of semantic networks
  Representation of a topic
  Dimensionality reduction
  Co-occurrence matrix
  Possible improvement over LSA
  Sampling technique
4
Simulation 1: Learning topics with Gibbs sampling
  Establish the statistical properties of the sampling procedure
  Demonstrate that complexities of the language are being captured
  Parameters: 4544 words, random set of 5000 documents, 150 topics
  Result
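The slide names Gibbs sampling but does not show the update itself. As a rough illustration, the sketch below uses the standard collapsed Gibbs update for a topic model, in which a token's topic is resampled in proportion to (word-topic count + beta) / (topic total + W * beta) * (document-topic count + alpha). The function name `gibbs_lda`, the hyperparameter defaults, and the toy corpus are assumptions for illustration, not values from the presentation.

```python
import random

def gibbs_lda(docs, n_words, n_topics, alpha=0.1, beta=0.01, n_iter=50, seed=0):
    """Collapsed Gibbs sampler sketch; docs is a list of lists of word ids.

    alpha, beta, n_iter and seed are illustrative defaults (assumptions).
    """
    rng = random.Random(seed)
    # Count tables: word-topic, document-topic, and per-topic totals.
    wt = [[0] * n_topics for _ in range(n_words)]
    dt = [[0] * n_topics for _ in docs]
    tt = [0] * n_topics
    z = []  # topic assignment of every token, per document
    for d, doc in enumerate(docs):
        zd = []
        for w in doc:
            t = rng.randrange(n_topics)  # random initial assignment
            zd.append(t)
            wt[w][t] += 1
            dt[d][t] += 1
            tt[t] += 1
        z.append(zd)
    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t = z[d][i]
                # Remove the token's current assignment from the counts.
                wt[w][t] -= 1
                dt[d][t] -= 1
                tt[t] -= 1
                # Unnormalized conditional probability of each topic.
                weights = [
                    (wt[w][k] + beta) / (tt[k] + n_words * beta) * (dt[d][k] + alpha)
                    for k in range(n_topics)
                ]
                t = rng.choices(range(n_topics), weights=weights)[0]
                z[d][i] = t
                wt[w][t] += 1
                dt[d][t] += 1
                tt[t] += 1
    return z, wt, dt

# Toy corpus (hypothetical): three short documents over a 4-word vocabulary.
toy_docs = [[0, 1, 0, 1], [2, 3, 2, 3], [0, 1, 2, 3]]
assignments, word_topic, doc_topic = gibbs_lda(toy_docs, n_words=4, n_topics=2)
```

At the scale quoted on the slide (4544 words, 5000 documents, 150 topics) a pure-Python loop like this would be slow; it is meant only to make the sampling step concrete.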
5
Bipartite semantic networks
  Standard conception of a semantic network
  Bipartite semantic network: topic nodes and word nodes
  Power law distribution
6
Simulation 2: Power law degree distribution
  Distribution over bipartite graphs
  Input:
    4544 words from the word association norms used in Sim. 1
    4544 words drawn at random from the TASA corpus
    5000 random documents
    Number of topics: 50, 150, 250
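To make the idea of a degree distribution on a word-topic bipartite graph concrete, here is a minimal sketch. It assumes a word is linked to a topic when it has been assigned to that topic at least `min_count` times; that linking rule, the helper names, and the least-squares fit on log-log axes are illustrative assumptions, not necessarily the procedure used in the paper.

```python
import math
from collections import Counter

def word_degrees(word_topic_counts, min_count=1):
    """Degree of each word: number of topics it is linked to (assumed rule)."""
    return [sum(1 for c in row if c >= min_count) for row in word_topic_counts]

def degree_distribution(degrees):
    """Empirical P(k) over the positive degrees, keyed by degree k."""
    counts = Counter(d for d in degrees if d > 0)
    total = sum(counts.values())
    return {k: counts[k] / total for k in sorted(counts)}

def loglog_slope(dist):
    """Least-squares slope of log P(k) against log k.

    A roughly straight line with negative slope on log-log axes is the
    usual informal signature of a power law. Needs two or more distinct k.
    """
    xs = [math.log(k) for k in dist]
    ys = [math.log(p) for p in dist.values()]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Hypothetical word-topic count table: 4 words, 3 topics.
toy_counts = [[3, 0, 0], [1, 2, 0], [1, 1, 1], [0, 0, 5]]
toy_dist = degree_distribution(word_degrees(toy_counts))
```

A log-log regression is only a quick diagnostic; with real data the fit is sensitive to the sparse high-degree tail, so it should be read as a rough check and not as an estimate of the exponent.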
7
Simulation 3: Origins of the power law
  Initialization procedure test
  Variants:
    Keep word frequencies constant: force the frequencies of all words to the median
    Hold the number of documents constant: use the median number of documents in which words occurred, with documents picked at random for words below the median