Stephen Brown Museums and the Web Asia 9-12 December 2013 Hong Kong

1 Where are the pictures? Linking photographic records across collections using fuzzy logic
Stephen Brown
Museums and the Web Asia, 9-12 December 2013, Hong Kong

2 Research question
Can fuzzy-logic-based data mining algorithms be used to identify matches between different online collections?
Answer: yes

13 erps16578
From the Library of Congress: [Royal Museum, the court (i.e. Bargello Museum, the courtyard), Florence, Italy]. No person listed. 1 photomechanical print : photochrom color. [between ca and ca. 1900].
From the ERPS collection: The Courtyard of the Bargello, Florence. Henry Little. Bromide (Print). 1895.

17 Method
Data preparation: correcting typographical errors, standardizing data such as dates, removing duplicate entries and mapping data to a common metadata schema.
Data aggregation: combining standardized records in a single XML database where they can be mined for similarities.
Query expansion: extending the range of keywords that are searched for.
Field comparison: comparing the contents of individual fields and combining these to produce an overall similarity metric.
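The data preparation step could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `normalise_date` and `prepare`, the field names `title`, `person` and `date`, and the rule of taking the first four-digit year are all hypothetical and are not taken from the presentation.

```python
import re

def normalise_date(raw):
    """Return the first four-digit year found in a free-text date, or None."""
    match = re.search(r"\b(\d{4})\b", raw)
    return int(match.group(1)) if match else None

def prepare(records, field_map):
    """Map source fields to a common schema, standardise dates, drop duplicates.

    field_map is a hypothetical mapping from source field names to the
    common field names "title", "person" and "date".
    """
    seen, cleaned = set(), []
    for record in records:
        # Rename fields and collapse runs of whitespace in every value.
        common = {field_map.get(k, k): " ".join(str(v).split())
                  for k, v in record.items()}
        common["year"] = normalise_date(common.get("date", ""))
        # Two records count as duplicates if title, person and year agree.
        key = (common.get("title", "").lower(),
               common.get("person", "").lower(),
               common["year"])
        if key not in seen:
            seen.add(key)
            cleaned.append(common)
    return cleaned

field_map = {"Title": "title", "Creator": "person", "Date": "date"}
records = [
    {"Title": "The Courtyard  of the Bargello, Florence",
     "Creator": "Henry Little", "Date": "1895"},
    {"Title": "the courtyard of the Bargello, Florence",
     "Creator": "Henry Little", "Date": "1895 "},
]
cleaned = prepare(records, field_map)
```

Here the second record differs from the first only in case and whitespace, so it is dropped as a duplicate.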

18 Alternative computing logics
Classic logic: binary (true/false, zero/one); set theory
Fuzzy logic: degrees of truth; fuzzy set theory

19 The concept of tall people
Ben Youngs 5’10”
Toby Flood 6’2”
Geoff Parling 6’6”

20 The concept of tall people
Classical approach: anyone over 6’ is tall
Ben Youngs 5’10”
Toby Flood 6’2”
Geoff Parling 6’6”

22 Classical computing
The membership function of the set tall people
[Plot: degree of membership (0 to 1) against height (5’, 6’, 7’), with Toby Flood 6’2”, Ben Youngs 5’10” and Geoff Parling 6’6” marked]
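The classical membership function can be illustrated with a short sketch. The 6’ threshold comes from the "anyone over 6’ is tall" rule on the earlier slide; the helper names `to_inches` and `is_tall_crisp` are hypothetical.

```python
def to_inches(feet, inches):
    """Convert a height in feet and inches to inches."""
    return 12 * feet + inches

def is_tall_crisp(height_in, threshold_in=72):
    """Classical set membership: 1 if strictly over the threshold, else 0."""
    return 1 if height_in > threshold_in else 0

players = {
    "Ben Youngs": to_inches(5, 10),
    "Toby Flood": to_inches(6, 2),
    "Geoff Parling": to_inches(6, 6),
}
crisp = {name: is_tall_crisp(height) for name, height in players.items()}
```

Under this rule Ben Youngs is simply "not tall", even though he is only four inches shorter than Toby Flood, who is "tall".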

23 The concept of tall people
Fuzzy approach: everyone is tall to some degree (as measured by the membership function)
Ben Youngs 5’10”
Toby Flood 6’2”
Geoff Parling 6’6”

24 Fuzzy computing
The membership function of the set tall people
[Plot: degree of membership (0 to 1) against height (5’, 6’, 7’), with Toby Flood 6’2”, Ben Youngs 5’10” and Geoff Parling 6’6” marked]

25 Soft computing
The membership function of the set tall people
[Plot: degree of membership (0 to 1) against height (5’, 6’, 7’), with Toby Flood 6’2”, Ben Youngs 5’10” and Geoff Parling 6’6” marked; membership values shown: 0.95, 0.7, 0.45]
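A fuzzy membership function for "tall" might look like the sketch below. Two assumptions are made here. First, the values 0.95, 0.7 and 0.45 are assigned to the players in order of height (6’6”, 6’2”, 5’10”), which holds for any membership function that increases with height. Second, the function is taken to be a straight ramp; the breakpoints 62.8 and 78.8 inches are not from the slides and were chosen only because such a ramp reproduces the three values.

```python
def tall_membership(height_in, low=62.8, high=78.8):
    """Degree (0..1) to which a height in inches belongs to the set 'tall'.

    low and high are assumed breakpoints of a linear ramp, fitted to the
    three values on the slide; they are not stated in the presentation.
    """
    if height_in <= low:
        return 0.0
    if height_in >= high:
        return 1.0
    return (height_in - low) / (high - low)

heights = {"Ben Youngs": 70, "Toby Flood": 74, "Geoff Parling": 78}
degrees = {name: round(tall_membership(h), 2) for name, h in heights.items()}
```

Every player is now tall to some degree, and the four-inch gaps between the players each change the degree by the same amount.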

26 Fuzzy computing
Allows for vagueness in concepts
Soft boundaries
Partial degrees of truth

27 Lightweight Semantic Similarity
A. Chrysanthemum
B. Flower
[Diagram: term vectors A and B plotted against the axes Chrysanthemum and Flower]

28 Lightweight Semantic Similarity
[Diagram: term vectors A and B plotted against the axes Chrysanthemum and Flower]
Cosine of the angle between A and B = 0
Therefore, no similarity between A and B
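The zero result can be reproduced with a minimal sketch of cosine similarity over crisp term vectors. The vector layout (one axis per term, in the order chrysanthemum, flower) and the function name `cosine` are assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Axes: (chrysanthemum, flower). Each record contains exactly one term.
record_a = [1.0, 0.0]  # "Chrysanthemum"
record_b = [0.0, 1.0]  # "Flower"
similarity = cosine(record_a, record_b)
```

Because the two records share no term, the vectors are orthogonal and the similarity is 0, although a chrysanthemum is a kind of flower.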

29 Lightweight Semantic Similarity
[Diagram: axes Chrysanthemum and Flower]

30 Lightweight Semantic Similarity
Fuzzy term vectors using synset similarity values from WordNet
[Diagram: term vectors A and B plotted against the axes Chrysanthemum and Flower]
Cosine of the angle between A and B > 0
Therefore, some similarity between A and B
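One plausible way to build such fuzzy term vectors is sketched below; the presentation does not give the exact construction, so this is an assumption. Each record gets weight 1 on its own term and a partial weight on the related term. The value 0.5 is a made-up placeholder standing in for a WordNet synset similarity, not a real WordNet figure, and the names `cosine` and `ASSUMED_SIMILARITY` are hypothetical.

```python
import math

def cosine(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

# Placeholder for a synset similarity between "chrysanthemum" and "flower".
ASSUMED_SIMILARITY = 0.5

# Axes: (chrysanthemum, flower). Each term also activates its related term.
fuzzy_a = [1.0, ASSUMED_SIMILARITY]  # "Chrysanthemum"
fuzzy_b = [ASSUMED_SIMILARITY, 1.0]  # "Flower"
fuzzy_similarity = cosine(fuzzy_a, fuzzy_b)
```

With any similarity value above zero the two vectors are no longer orthogonal, so the cosine is greater than 0.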

31 Combined similarity metric
IF title is good AND person is good THEN match is good.
IF title is good AND (date is good OR process is good) THEN match is ok.
IF person is good AND title is bad THEN match is ok.
IF title is bad AND person is bad THEN match is bad.
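The four rules can be evaluated with a small sketch that uses the common fuzzy operators min for AND and max for OR. Those operator choices, the definition of "bad" as one minus "good", and the function name `evaluate_rules` are assumptions; the presentation does not say which operators or membership functions were used.

```python
def bad(good_degree):
    """Assumed complement: a field is 'bad' to the degree it is not 'good'."""
    return 1.0 - good_degree

def evaluate_rules(title, person, date, process):
    """Return the strength (0..1) of each match outcome.

    Each argument is the degree to which that field comparison is 'good'.
    AND is taken as min and OR as max; rules with the same outcome are
    combined with max.
    """
    good_match = min(title, person)
    ok_match = max(
        min(title, max(date, process)),  # title good AND (date OR process good)
        min(person, bad(title)),         # person good AND title bad
    )
    bad_match = min(bad(title), bad(person))
    return {"good": good_match, "ok": ok_match, "bad": bad_match}

strengths = evaluate_rules(title=0.9, person=0.8, date=0.3, process=0.6)
```

In this example the title and person fields both agree strongly, so "good" is the strongest outcome; a full system would still need a defuzzification step to turn the three strengths into a single score.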

32 Conclusion
Large numbers of small amounts of text are common in collections records.
Text volumes too small for corpus linguistics analysis.
Need for query expansion
Text volumes too small for established Semantic Similarity analysis
Lightweight semantic