Fractal Composition of Meaning: Toward a Collage Theorem for Language
Simon D. Levy
Department of Computer Science, Washington and Lee University, Lexington, VA
Part I: Self-Similarity
... But I know, too,
That the blackbird is involved
In what I know.
– Wallace Stevens
Part II: A Little Math
Iterated Function Systems
An IFS is another way to make a fractal:
- Start with an arbitrary initial image
- Apply a set of contractive affine transforms
- Repeat until the image no longer changes
E.g., the Sierpinski Triangle...
Sierpinski Triangle
I.e., three half-size copies, one in each of the three corners of [0,1]²
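To make the recipe concrete, here is a minimal Python sketch of the Sierpinski IFS. It uses the "chaos game" (iterating one randomly chosen map at a time) rather than the whole-image iteration shown in the following slides; both converge to the same attractor. The point count and plotting details are arbitrary choices for this sketch.

```python
import numpy as np
import matplotlib.pyplot as plt

# The three contractive affine maps of the Sierpinski IFS: each is a
# half-size copy translated toward one corner of [0,1]^2.
corners = np.array([[0.0, 0.0], [1.0, 0.0], [0.5, 1.0]])

def chaos_game(n_points=50_000, seed=0):
    """Approximate the attractor by iterating randomly chosen maps."""
    rng = np.random.default_rng(seed)
    pts = np.empty((n_points, 2))
    p = rng.random(2)                 # arbitrary initial point
    for i in range(n_points):
        c = corners[rng.integers(3)]  # pick one of the three maps
        p = 0.5 * (p + c)             # w_i(p) = p/2 + c/2
        pts[i] = p
    return pts

pts = chaos_game()
plt.scatter(pts[:, 0], pts[:, 1], s=0.1, color='k')
plt.gca().set_aspect('equal')
plt.show()
```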
Sierpinski Triangle: iterations 0 through 7 (figure sequence showing convergence to the attractor)
Sierpinski Triangle: a new initial image converges to the same attractor (figures)
IFS Fractals in Nature
Fractal Image Compression
- It doesn't matter what image we start with
- All information needed to represent the final "target" image is contained in the transforms
- Instead of storing millions of pixels, determine the transforms for the target image and store them
- How to determine the transforms?
The Collage Theorem
Let $A$ be the attractor or "fixed point" of the IFS $W = \{w_1, \dots, w_n\}$, i.e. $A = W(A) = \bigcup_i w_i(A)$.
Collage Theorem (Barnsley 1988): given an arbitrary target image $T$, the transforms encoding $T$ are a $W$ such that $T \approx W(T)$; then $d(T, A) \le d(T, W(T)) / (1 - s)$, where $s < 1$ is the contractivity of $W$.
Use various practical methods to find $W$.
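A small numeric illustration of the theorem (not from the talk): for a point-set sample T of the Sierpinski triangle, the collage error d(T, W(T)) under the three corner maps is near zero, so the attractor of W must be close to T. The brute-force Hausdorff distance and the sampling scheme are assumptions made for this sketch.

```python
import numpy as np

def apply_ifs(points, maps):
    """W(T): the union of w_i(T) over affine maps w_i(p) = A @ p + b."""
    return np.vstack([points @ A.T + b for A, b in maps])

def hausdorff(X, Y):
    """Symmetric Hausdorff distance between two point sets (brute force)."""
    d = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1)
    return max(d.min(axis=1).max(), d.min(axis=0).max())

# Candidate transforms: three half-size copies at the corners of [0,1]^2.
maps = [(0.5 * np.eye(2), 0.5 * np.array(c, float))
        for c in [(0, 0), (1, 0), (0.5, 1)]]

# Target T: a chaos-game sample of the Sierpinski triangle (20 burn-in steps).
rng = np.random.default_rng(0)
p, T = rng.random(2), []
for i in range(1020):
    A, b = maps[rng.integers(3)]
    p = A @ p + b
    if i >= 20:
        T.append(p)
T = np.array(T)

err = hausdorff(T, apply_ifs(T, maps))  # collage error d(T, W(T))
print(f"collage error: {err:.4f}")      # small: limited only by sampling
# Collage Theorem bound: d(T, attractor of W) <= err / (1 - s), with s = 0.5.
```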
Practical Fractal Image Compression
- Most real-world images are only partially self-similar
- Arbitrary images can be partitioned into "tiles", each associated with a transform
- The compression algorithm computes and stores the locations and transforms of the tiles (a sketch follows the figure below)
Practical Fractal Image Compression
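One standard way to fill in "computes and stores locations and transforms of tiles" is Jacquin-style partitioned-IFS coding; the sketch below is a minimal version of that scheme, not necessarily the exact algorithm behind the figure. Block sizes (4×4 ranges, 8×8 domains), the non-overlapping domain grid, and the least-squares greyscale fit are all assumptions of this sketch.

```python
import numpy as np

def downsample(block):
    """Average 2x2 cells: shrink an 8x8 domain block to 4x4."""
    return block.reshape(4, 2, 4, 2).mean(axis=(1, 3))

def encode(img, rsize=4, dsize=8):
    """For each range block, store the best-matching domain block plus
    the affine greyscale map v -> s*v + o (a partitioned-IFS code).
    Assumes img dimensions are multiples of dsize and domain blocks
    are not perfectly flat (that would make the fit degenerate)."""
    H, W = img.shape
    domains = [(i, j, downsample(img[i:i+dsize, j:j+dsize]).ravel())
               for i in range(0, H - dsize + 1, dsize)
               for j in range(0, W - dsize + 1, dsize)]
    codes = []
    for ri in range(0, H, rsize):
        for rj in range(0, W, rsize):
            R = img[ri:ri+rsize, rj:rj+rsize].ravel()
            best = None
            for di, dj, Dv in domains:
                s, o = np.polyfit(Dv, R, 1)   # least-squares contrast s, brightness o
                s = float(np.clip(s, -1.0, 1.0))  # keep the map contractive
                err = np.sum((s * Dv + o - R) ** 2)
                if best is None or err < best[0]:
                    best = (err, di, dj, s, o)
            codes.append((ri, rj) + best[1:])
    return codes

def decode(codes, shape, n_iter=8, rsize=4, dsize=8):
    """Iterate the stored transforms; per the earlier slide, the
    all-zero starting image is arbitrary."""
    img = np.zeros(shape)
    for _ in range(n_iter):
        out = np.empty(shape)
        for ri, rj, di, dj, s, o in codes:
            D = downsample(img[di:di+dsize, dj:dj+dsize])
            out[ri:ri+rsize, rj:rj+rsize] = s * D + o
        img = out
    return img
```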
Part III: Unification
The “Two Cultures”

Discrete (Linguistics, AI, Logic)    Continuous (Dynamical Systems, Chaos, Electrical Engineering)
Symbols                              Vectors
Semantic Relations                   Metric Spaces
Grammar Rules                        Continuous Transforms
Graph Structures                     Images
Meanings as Vectors
“You shall know a word by the company it keeps” – J. R. Firth
- The vector representation of a word encodes its co-occurrence with other words
- Latent Semantic Analysis (Indexing): Singular Value Decomposition of a word co-occurrence matrix over text; 300-dimensional vectors [Landauer, Dumais, Kintsch]
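A minimal sketch of the LSA recipe on a toy corpus. The corpus, the raw-count weighting, and the dimensionality here are placeholders; classic LSA uses large text collections, log-entropy weighting, and ~300 retained dimensions.

```python
import numpy as np

# Toy word-by-document count matrix (LSA also uses word-by-context variants).
docs = ["fred loves the woman", "the woman arrived", "fred arrived"]
vocab = sorted({w for d in docs for w in d.split()})
M = np.array([[d.split().count(w) for d in docs] for w in vocab], float)

# Truncated SVD: keep the top k singular vectors (k=2 only because the
# toy matrix is tiny; classic LSA keeps ~300).
U, S, Vt = np.linalg.svd(M, full_matrices=False)
k = 2
word_vectors = U[:, :k] * S[:k]   # one k-dimensional vector per word

for w, v in zip(vocab, word_vectors):
    print(f"{w:8s} {v.round(2)}")
```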
Meanings as Vectors
- Self-Organizing Maps: collapse high-dimensional descriptions (binary features or real-valued vectors) into 2-D [Kohonen]
- Simple Recurrent Networks: hidden-variable temporal model predicting the next word from the current one; 150-D vectors [Elman]
Meanings as Vectors
- Fred says the woman arrived.
- The woman says Fred arrived.
- Fred loves the woman.
- The woman loves Fred.
- The woman arrived.
- Fred arrived.
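A minimal Elman-style SRN, sketched in numpy and trained to predict the next word on the toy sentences above (lowercased for tokenization). Following Elman's original setup, the copied context is treated as a fixed input, so backpropagation runs one step only; the hidden size, learning rate, and epoch count are arbitrary choices for this sketch.

```python
import numpy as np

sentences = ["fred says the woman arrived", "the woman says fred arrived",
             "fred loves the woman", "the woman loves fred",
             "the woman arrived", "fred arrived"]
vocab = sorted({w for s in sentences for w in s.split()})
V, H = len(vocab), 10                 # Elman used ~150 hidden units
idx = {w: i for i, w in enumerate(vocab)}

rng = np.random.default_rng(0)
Wxh = rng.normal(0, 0.1, (H, V))      # input -> hidden
Whh = rng.normal(0, 0.1, (H, H))      # context (previous hidden) -> hidden
Why = rng.normal(0, 0.1, (V, H))      # hidden -> next-word prediction

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

lr = 0.1
for epoch in range(500):
    for s in sentences:
        words = s.split()
        h = np.zeros(H)               # reset context at sentence start
        for t in range(len(words) - 1):
            x = np.zeros(V); x[idx[words[t]]] = 1.0
            h_prev = h
            h = np.tanh(Wxh @ x + Whh @ h_prev)
            y = softmax(Why @ h)
            # cross-entropy gradient; one-step backprop, with the copied
            # context treated as a fixed input (as in Elman's setup)
            dy = y.copy(); dy[idx[words[t + 1]]] -= 1.0
            dh = (Why.T @ dy) * (1 - h * h)
            Why -= lr * np.outer(dy, h)
            Wxh -= lr * np.outer(dh, x)
            Whh -= lr * np.outer(dh, h_prev)

# The hidden state reached on reading a phrase serves as its vector,
# e.g. the state after "the woman":
h = np.zeros(H)
for w in ["the", "woman"]:
    x = np.zeros(V); x[idx[w]] = 1.0
    h = np.tanh(Wxh @ x + Whh @ h)
print(h.round(2))
```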
Composing Meaningful Vectors (figure sequence): “the woman” → “loves the woman” → “Fred loves the woman”
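One way to read these figures, anticipating Part IV: each word acts as a contractive affine transform, and composing a phrase means applying one transform per word, so each longer phrase extends the previous trajectory by one step. The sketch below uses random placeholder transforms (the talk leaves open how they would be learned from co-occurrence data) and applies them from the last word inward, so that the three phrases above share one growing trajectory; both choices are assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
D = 2                                  # 2-D only for easy visualization

def random_contraction(scale=0.5):
    """A hypothetical contractive affine map standing in for a word."""
    A = scale * rng.uniform(-1, 1, (D, D))
    A /= max(1.0, np.linalg.norm(A, 2) / scale)  # spectral norm <= scale
    return A, rng.uniform(0, 1, D)

words = {w: random_contraction() for w in ["fred", "loves", "the", "woman"]}

def compose(sentence, v0=None):
    """Phrase meaning = the transient v_t = A_w v_{t-1} + b_w, applying
    transforms from the last word inward to match the figures' build order."""
    v = np.zeros(D) if v0 is None else v0
    trajectory = [v.copy()]
    for w in reversed(sentence.split()):
        A, b = words[w]
        v = A @ v + b
        trajectory.append(v)
    return np.array(trajectory)

# Each phrase's trajectory extends the previous one by one transform step.
print(compose("the woman").round(2))
print(compose("loves the woman").round(2))
print(compose("fred loves the woman").round(2))
```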
Part IV: Conclusions
Advantages of Vector Representations
- Meaning as a gradient phenomenon (semantic spaces = vector spaces)
- All transforms can be represented with a single hidden-variable non-linear equation, the “grammar” (see the equation below)
- Gradient-descent methods as a learning model
- A principled, biologically plausible alternative to the “Words and Rules” approach [Chomsky, Pinker, Fodor]
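For an Elman-style network, that single hidden-variable non-linear equation is the state update below (the notation is assumed for this note, not taken from the slides):

```latex
h_t = \tanh\left( W_{xh}\, x_t + W_{hh}\, h_{t-1} \right),
\qquad
\hat{y}_t = \operatorname{softmax}\left( W_{hy}\, h_t \right)
```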
A Collage Theorem for Language
A Collage Conjecture for Language
A Collage Hypothesis for Language
A Collage S.W.A.G. for Language
- Words/meanings are co-occurrence vectors.
- Compositions of meanings are transients to words.
- The “correct” set of transients is one for which the word vectors form a subset of the attractor.
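A sketch of how this S.W.A.G. could be checked numerically (my construction, not from the talk): sample the attractor of a candidate set of word transforms with the chaos game, then measure how close each word's co-occurrence vector lies to that sample. The transforms and test vectors below are random stand-ins; only the membership test itself is the point.

```python
import numpy as np

rng = np.random.default_rng(2)

def attractor_sample(maps, n=5000, burn=50):
    """Chaos-game sample of the attractor of an IFS of (A, b) pairs."""
    p = rng.random(maps[0][1].shape[0])
    pts = []
    for i in range(n + burn):
        A, b = maps[rng.integers(len(maps))]
        p = A @ p + b
        if i >= burn:
            pts.append(p)
    return np.array(pts)

def dist_to_attractor(v, pts):
    """Distance from vector v to the nearest attractor sample point."""
    return np.linalg.norm(pts - v, axis=1).min()

# Hypothetical stand-ins: one contractive map per word (entries in
# [-0.4, 0.4] keep the 2x2 maps contractive). In the real proposal these
# would be fit so that the observed co-occurrence vectors land on (a
# subset of) the resulting attractor.
D = 2
word_maps = {w: (0.4 * rng.uniform(-1, 1, (D, D)), rng.uniform(0, 1, D))
             for w in ["fred", "loves", "the", "woman"]}
pts = attractor_sample(list(word_maps.values()))

# Score candidate "word vectors": a small distance is consistent with the
# collage hypothesis; a large one means the transforms don't encode it.
for v in [pts[123], np.array([5.0, 5.0])]:
    print(dist_to_attractor(v, pts))
```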