Published by Winfred Conley. Modified over 7 years ago
1
Where are the pictures? Linking photographic records across collections using fuzzy logic
Stephen Brown
Museums and the Web Asia, 9-12 December 2013, Hong Kong
2
Research question
Can fuzzy logic based data mining algorithms be used to identify matches between different online collections?
Answer: yes
13
From the Library of Congress: [Royal Museum, the court (i.e. Bargello Museum, the courtyard), Florence, Italy]. No person listed. 1 photomechanical print : photochrom, color. [between ca and ca. 1900].
From the ERPS collection (erps16578): The Courtyard of the Bargello, Florence. Henry Little. Bromide (Print). 1895.
17
Method
Data preparation: correcting typographical errors, standardizing data such as dates, removing duplicate entries and mapping data to a common metadata schema.
Data aggregation: combining standardized records in a single XML database where they can be mined for similarities.
Query expansion: extending the range of keywords that are searched for.
Field comparison: comparing the contents of individual fields and combining these to produce an overall similarity metric.
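The data preparation step could look something like the sketch below. It is only an illustration: the source field names in FIELD_MAPS, the four-field common schema and the helper names are all assumptions made for this example, not details given in the talk, and correcting typographical errors is not attempted.

```python
import re

# Hypothetical mappings from each collection's field names to a shared schema.
FIELD_MAPS = {
    "loc": {"title": "title", "creator": "person", "created": "date", "medium": "process"},
    "erps": {"Title": "title", "Exhibitor": "person", "Year": "date", "Process": "process"},
}


def year_range(date_text):
    """Return (earliest, latest) four-digit year found in a free-text date, or None."""
    years = [int(y) for y in re.findall(r"\b(1[5-9]\d\d|20\d\d)\b", date_text)]
    return (min(years), max(years)) if years else None


def to_common_schema(record, source):
    """Map one source record to the shared schema and tidy its values."""
    common = {"title": "", "person": "", "date": "", "process": ""}
    for src_field, value in record.items():
        target = FIELD_MAPS[source].get(src_field)
        if target:
            common[target] = " ".join(str(value).split())  # collapse stray whitespace
    common["years"] = year_range(common["date"])
    return common


def deduplicate(records):
    """Keep the first record for each (title, person, years) key, ignoring case."""
    seen, unique = set(), []
    for rec in records:
        key = (rec["title"].lower(), rec["person"].lower(), rec["years"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


# Example using the ERPS record quoted earlier (source field names are invented).
erps = to_common_schema(
    {"Title": "The Courtyard of the  Bargello, Florence", "Exhibitor": "Henry Little",
     "Year": "1895", "Process": "Bromide (Print)"},
    "erps",
)
records = deduplicate([erps, dict(erps)])
```

Reducing dates to a year range is one simple way to make free-text dates comparable across collections; a real pipeline would likely need to handle more date formats.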
18
Alternative computing logics
Classic logic: binary (true/false, zero/one); set theory
Fuzzy logic: degrees of truth; fuzzy set theory
19
The concept of tall people
Ben Youngs 5’10”, Toby Flood 6’2”, Geoff Parling 6’6”
20
The concept of tall people
Classical approach: anyone over 6’ is tall
Ben Youngs 5’10”, Toby Flood 6’2”, Geoff Parling 6’6”
22
Classical computing
The membership function of the set of tall people
[Figure: membership (0 to 1) plotted against height (5’ to 7’), with Ben Youngs 5’10”, Toby Flood 6’2” and Geoff Parling 6’6” marked]
23
The concept of tall people
Fuzzy approach: everyone is tall to some degree (as measured by the membership function)
Ben Youngs 5’10”, Toby Flood 6’2”, Geoff Parling 6’6”
24
Fuzzy computing
The membership function of the set of tall people
[Figure: membership (0 to 1) plotted against height (5’ to 7’), with Ben Youngs 5’10”, Toby Flood 6’2” and Geoff Parling 6’6” marked]
25
Soft computing
The membership function of the set of tall people
[Figure: membership (0 to 1) plotted against height (5’ to 7’), with Ben Youngs 5’10”, Toby Flood 6’2” and Geoff Parling 6’6” marked and membership values 0.95, 0.7 and 0.45 shown]
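The contrast between the two membership functions can be sketched in a few lines. The slides do not give the fuzzy function itself; the linear ramp below, with breakpoints at 62.8 and 78.8 inches, is an assumption chosen so that it yields 0.45, 0.7 and 0.95 if those values belong to the three players in ascending order of height. The function names are invented for this example.

```python
def feet_inches(feet, inches=0):
    """Convert a height in feet and inches to inches."""
    return feet * 12 + inches


def crisp_tall(height_in, threshold_in=72):
    """Classical set: 1 if taller than the threshold (6 ft), otherwise 0."""
    return 1.0 if height_in > threshold_in else 0.0


def fuzzy_tall(height_in, low_in=62.8, high_in=78.8):
    """Fuzzy set: linear ramp from 0 at low_in to 1 at high_in (assumed breakpoints)."""
    degree = (height_in - low_in) / (high_in - low_in)
    return max(0.0, min(1.0, degree))  # clamp to the range [0, 1]


players = {
    "Ben Youngs": feet_inches(5, 10),
    "Toby Flood": feet_inches(6, 2),
    "Geoff Parling": feet_inches(6, 6),
}
for name, height in players.items():
    print(f"{name}: crisp={crisp_tall(height):.0f} fuzzy={fuzzy_tall(height):.2f}")
```

Under the classical function Ben Youngs is simply not tall, while under the fuzzy one he is tall to a lesser degree than the other two.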
26
Fuzzy computing
Allows for vagueness in concepts
Soft boundaries
Partial degrees of truth
27
Lightweight Semantic Similarity
A. Chrysanthemum
B. Flower
[Figure: A and B drawn as vectors on axes labelled Chrysanthemum and Flower]
28
Lightweight Semantic Similarity
Cosine of the angle between A and B = 0
Therefore, no similarity between A and B
[Figure: A and B drawn as vectors on axes labelled Chrysanthemum and Flower]
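As a worked example, assume each term is a unit vector along its own axis, so that A = (1, 0) for Chrysanthemum and B = (0, 1) for Flower (the coordinates are an assumption read off the figure labels). The cosine is then:

```latex
\cos\theta
  = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert}
  = \frac{(1)(0) + (0)(1)}{\sqrt{1^2 + 0^2}\,\sqrt{0^2 + 1^2}}
  = \frac{0}{1}
  = 0
```

The two vectors share no non-zero component, so any pair of distinct terms scores zero under this representation, however closely related their meanings.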
29
Lightweight Semantic Similarity
[Figure: axes labelled Chrysanthemum and Flower]
30
Lightweight Semantic Similarity
Fuzzy term vectors using synset similarity values from WordNet
Cosine of the angle between A and B > 0
Therefore, some similarity between A and B
[Figure: A and B drawn as vectors on axes labelled Chrysanthemum and Flower]
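A minimal sketch of the idea follows. The similarity value of 0.8 between "chrysanthemum" and "flower" is a placeholder, not a real WordNet score, and the function names are invented for this example; the talk does not say which WordNet measure was used.

```python
import math

TERMS = ["chrysanthemum", "flower"]

# Placeholder similarity between the two terms; NOT an actual WordNet value.
SIMILARITY = {("chrysanthemum", "flower"): 0.8}


def term_similarity(t1, t2):
    """Look up a symmetric similarity; identical terms score 1, unknown pairs 0."""
    if t1 == t2:
        return 1.0
    return SIMILARITY.get((t1, t2), SIMILARITY.get((t2, t1), 0.0))


def crisp_vector(term):
    """One-hot vector: 1 on the term's own axis, 0 elsewhere."""
    return [1.0 if t == term else 0.0 for t in TERMS]


def fuzzy_vector(term):
    """Fuzzy vector: each component is the term's similarity to that axis term."""
    return [term_similarity(term, t) for t in TERMS]


def cosine(a, b):
    """Cosine of the angle between two vectors (0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


crisp_score = cosine(crisp_vector("chrysanthemum"), crisp_vector("flower"))
fuzzy_score = cosine(fuzzy_vector("chrysanthemum"), fuzzy_vector("flower"))
print(crisp_score, fuzzy_score)
```

With one-hot vectors the score is zero; once each vector carries a partial weight on the other term's axis, the cosine becomes positive.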
31
Combined similarity metric
IF title is good AND person is good THEN match is good.
IF title is good AND (date is good OR process is good) THEN match is ok.
IF person is good AND title is bad THEN match is ok.
IF title is bad AND person is bad THEN match is bad.
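The four rules above can be evaluated along these lines. This is a sketch under common fuzzy-logic conventions that the slides do not confirm: AND is taken as minimum, OR as maximum, "bad" as one minus "good", the two "ok" rules are combined with maximum, and the final label is simply the outcome with the highest rule strength rather than the result of a defuzzification step. The function name is invented for this example.

```python
def evaluate_match(title, person, date, process):
    """Fire the four rules; each input is a degree of 'good' in the range [0, 1]."""
    title_bad = 1.0 - title    # assumption: 'bad' is the complement of 'good'
    person_bad = 1.0 - person
    strengths = {
        # IF title is good AND person is good THEN match is good.
        "good": min(title, person),
        # IF title is good AND (date is good OR process is good) THEN match is ok.
        # IF person is good AND title is bad THEN match is ok.
        "ok": max(min(title, max(date, process)), min(person, title_bad)),
        # IF title is bad AND person is bad THEN match is bad.
        "bad": min(title_bad, person_bad),
    }
    label = max(strengths, key=strengths.get)  # strongest outcome wins
    return label, strengths


strong_label, strong = evaluate_match(title=0.9, person=0.8, date=0.5, process=0.2)
weak_label, weak = evaluate_match(title=0.2, person=0.1, date=0.3, process=0.0)
print(strong_label, weak_label)
```

Each input would come from the per-field comparison, for example the title similarity produced by the fuzzy term vectors.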
32
Conclusion
Large numbers of small amounts of text are common in collections records.
Text volumes too small for corpus linguistics analysis.
Need for query expansion.
Text volumes too small for established semantic similarity analysis.
Lightweight semantic