Applications of Topic Models. Daphna Weinshall, B-530. Slides credit: Joseph Sivic, Li Fei-Fei, Brian Russel and others.


2 Object; Bag of ‘words’

3 Bag of Words Representation: independent local features extraction

4 Definition of “BoW”: independent local features extraction; histogram representation

5 Types of Features: Regular grid

6 Types of Features: Regular grid; Interest point detector

7 Example: interest point + SIFT. Detect patches; Normalize patch; Compute SIFT descriptor [Lowe’99]

8 Codewords dictionary formation

9 Vector quantization

10 Codewords dictionary formation
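A later slide in this deck names K-means clustering as the way the dictionary is generated. The sketch below shows one plain version of that step in Python. The function name `kmeans`, the deterministic initialization from the first k points, and the toy 2-D descriptors are assumptions made for illustration; real SIFT descriptors are 128-dimensional and the slides do not say how the clustering was initialized.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm; returns k cluster centers (the codewords)."""
    # Assumption: initialize from the first k points to keep the sketch deterministic.
    centers = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        # Assignment step: each descriptor goes to its nearest center.
        for p in points:
            nearest = min(range(k), key=lambda j: squared_distance(p, centers[j]))
            clusters[nearest].append(p)
        # Update step: each center moves to the mean of its members.
        for j, members in enumerate(clusters):
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return centers


# Toy descriptors forming two well-separated groups.
descriptors = [(0, 0), (10, 10), (0, 1), (10, 11), (1, 0), (11, 10)]
codebook = kmeans(descriptors, k=2)
print(codebook)
```

Each returned center plays the role of one codeword; the number of centers k is the dictionary size.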

11 Image patch examples of codewords

12 Image representation [Figure: histogram; axes: codewords, frequency]
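The image representation on this slide is a histogram of codeword frequencies. A minimal Python sketch of that step follows: each local descriptor is assigned to its nearest codeword (vector quantization) and the assignments are counted. The names `quantize` and `bow_histogram`, and the 2-D toy values, are hypothetical and chosen only for illustration.

```python
from collections import Counter


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(descriptor, codebook):
    """Index of the codeword nearest to the descriptor."""
    return min(range(len(codebook)),
               key=lambda k: squared_distance(descriptor, codebook[k]))


def bow_histogram(descriptors, codebook):
    """Count how many descriptors fall on each codeword."""
    counts = Counter(quantize(d, codebook) for d in descriptors)
    return [counts.get(k, 0) for k in range(len(codebook))]


# Toy dictionary of three codewords and four local descriptors from one image.
codebook = [(0.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
descriptors = [(0.1, -0.2), (0.9, 1.2), (1.1, 0.8), (4.7, 5.3)]
histogram = bow_histogram(descriptors, codebook)
print(histogram)  # [1, 2, 1]
```

The histogram discards where each patch was found, which is what makes the representation a bag of words.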

13 Another example [Figure: interest regions, visual words, dictionary, histogram]

14 First try: pLSA [Figure: graphical model with variables d, z, w and plate sizes N, D; example topic “face”]

15 Analyzing images with topic models
Given a corpus:
● Extract features, then generate a dictionary with K-means clustering
● Determine the topic vectors which are common to all images using pLSA or LDA
● Learn the mixture coefficients (of topics) for each document (pLSA) or estimate the hyperparameters α and β (LDA)
● Classification of unseen images with pLSA: topics on one set of images are used to determine the topics in the novel set
● An image is classified according to the topic with maximal probability (or weight)
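The pLSA part of this pipeline can be sketched with the standard EM updates, where P(w|d) = Σ_z P(w|z) P(z|d). The Python below is a minimal illustration under stated assumptions, not the code used for the experiments in these slides: the function name `plsa`, the random initialization, the iteration count and the toy count matrix are all hypothetical.

```python
import random


def normalize(values):
    """Scale non-negative values to sum to 1 (uniform if they sum to 0)."""
    total = sum(values)
    if total <= 0:
        return [1.0 / len(values)] * len(values)
    return [v / total for v in values]


def plsa(counts, num_topics, iterations=50, seed=0):
    """EM for pLSA. counts[d][w] is the count of visual word w in image d.

    Returns (p_w_z, p_z_d): topic-word and image-topic distributions.
    """
    rng = random.Random(seed)
    num_docs, vocab = len(counts), len(counts[0])
    p_w_z = [normalize([rng.random() + 0.1 for _ in range(vocab)])
             for _ in range(num_topics)]
    p_z_d = [normalize([rng.random() + 0.1 for _ in range(num_topics)])
             for _ in range(num_docs)]
    for _ in range(iterations):
        new_w_z = [[0.0] * vocab for _ in range(num_topics)]
        new_z_d = [[0.0] * num_topics for _ in range(num_docs)]
        for d in range(num_docs):
            for w in range(vocab):
                if counts[d][w] == 0:
                    continue
                # E-step: posterior over topics for this (image, word) pair.
                post = normalize([p_w_z[z][w] * p_z_d[d][z]
                                  for z in range(num_topics)])
                # Accumulate expected counts for the M-step.
                for z in range(num_topics):
                    new_w_z[z][w] += counts[d][w] * post[z]
                    new_z_d[d][z] += counts[d][w] * post[z]
        # M-step: renormalize the expected counts.
        p_w_z = [normalize(row) for row in new_w_z]
        p_z_d = [normalize(row) for row in new_z_d]
    return p_w_z, p_z_d


# Toy corpus: images 0-1 mostly use words 0-1, images 2-3 mostly use words 2-3.
counts = [[8, 7, 0, 0], [9, 6, 1, 0], [0, 0, 7, 9], [0, 1, 8, 6]]
p_w_z, p_z_d = plsa(counts, num_topics=2)
# Classify each image by its topic with maximal probability.
labels = [max(range(2), key=lambda z: row[z]) for row in p_z_d]
print(labels)
```

For unseen images, the same E and M steps can be run with `p_w_z` held fixed so that only the mixture coefficients of the new image are learned.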

16 Example 1: scene recognition with pLSA (Sivic et al., 2005). 4 categories: faces, motorbikes, airplanes, cars. Most common visual words

17 Word and topic distribution

18
Categories | pLSA | LDA | K-means
UB: Motorbikes(349) + Airplanes(263) | 100% | 99% | 91%
UB: Faces(435) + Motorbikes(349) + Airplanes(263) | 100% | 96% | 94%
Faces(435) + Motorbikes(800) + Airplanes(800) | 97% | 96% | 91%
Faces(435) + Motorbikes(800) + Airplanes(800) + Car Rears(1155) | 98% | 87% | 72%
Faces(435) + Motorbikes(800) + Airplanes(800) + Car Rears(1155) + Background(1370) | 78% | 77% | 73%
Faces(435) + Motorbikes(800) + Airplanes(800) + Car Rears(1155) + Background(1370), K = 6 | 76% | - | -
Faces(435) + Motorbikes(800) + Airplanes(800) + Car Rears(1155) + Background(1370), K = 7 | 83% | - | -
Faces + Motorbikes + Airplanes + Car Rears + Leopards + Watch(241) + Ketch(114) + Background(1370) | 59% | 64% | 47%

19 Example 2: scene recognition with LDA (Fei-Fei & Perona, 2005)

20 Method and results
● Compute an LDA model for each category of images
● Label new images based on maximum likelihood
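The labelling rule on this slide (one model per category, then pick the category whose model gives the new image the highest likelihood) can be sketched as follows. To keep the example short and runnable, each category model here is a smoothed unigram distribution over visual words, which is a deliberate stand-in for a full LDA model and not what the cited work used. The category names, histograms and function names are hypothetical.

```python
import math


def fit_unigram(histograms, smoothing=1.0):
    """Stand-in category model: smoothed visual-word frequencies.

    Assumption: a real system would fit an LDA model per category instead.
    """
    vocab = len(histograms[0])
    totals = [smoothing + sum(h[w] for h in histograms) for w in range(vocab)]
    norm = sum(totals)
    return [t / norm for t in totals]


def log_likelihood(histogram, word_probs):
    """Log-likelihood of a bag-of-words histogram under one category model."""
    return sum(n * math.log(p) for n, p in zip(histogram, word_probs))


def classify(histogram, models):
    """Maximum-likelihood label over the per-category models."""
    return max(models, key=lambda name: log_likelihood(histogram, models[name]))


# Hypothetical training histograms over a 3-word dictionary.
training = {
    "coast": [[9, 1, 0], [8, 2, 1]],
    "forest": [[1, 2, 9], [0, 1, 8]],
}
models = {name: fit_unigram(hists) for name, hists in training.items()}
label = classify([7, 1, 1], models)
print(label)  # coast
```

Swapping `fit_unigram` and `log_likelihood` for LDA training and LDA likelihood evaluation leaves the `classify` rule unchanged.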

21 Dynamic topic model (Shalit et al., 2013)

22 Detecting unusual events
Video surveillance: millions of cameras, endless streaming, but who’s watching?

23 Method
● Track high-level activities (cars, people)
● Represent activities as ‘bags of words’
● Model typical behaviours using LDA
● Identify atypical events with low probability
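The last step of this method, flagging events that the learned model finds improbable, can be sketched in Python as below. This is a simplified illustration under several assumptions: the topics are taken as already learned and fixed, the per-event topic mixture is fitted by a few EM steps (a maximum-likelihood shortcut standing in for LDA's posterior inference), and the topic values, event names and threshold are invented for the example.

```python
import math


def fit_mixture(histogram, topics, iterations=30):
    """Fit topic proportions for one event with the topics held fixed."""
    k = len(topics)
    theta = [1.0 / k] * k
    for _ in range(iterations):
        expected = [0.0] * k
        for w, n in enumerate(histogram):
            if n == 0:
                continue
            joint = [theta[z] * topics[z][w] for z in range(k)]
            total = sum(joint)
            if total == 0:
                continue
            for z in range(k):
                expected[z] += n * joint[z] / total
        norm = sum(expected)
        if norm == 0:
            break
        theta = [e / norm for e in expected]
    return theta


def per_word_log_likelihood(histogram, topics, floor=1e-12):
    """Average log-probability per word; low values mean atypical events."""
    theta = fit_mixture(histogram, topics)
    log_lik = 0.0
    for w, n in enumerate(histogram):
        if n:
            p = sum(theta[z] * topics[z][w] for z in range(len(topics)))
            log_lik += n * math.log(max(p, floor))
    return log_lik / sum(histogram)


# Hypothetical learned topics over a 5-word vocabulary; word 4 is rare in both.
topics = [[0.48, 0.48, 0.01, 0.01, 0.02],
          [0.01, 0.01, 0.48, 0.48, 0.02]]
events = {
    "typical_a": [10, 9, 0, 1, 0],
    "typical_b": [0, 1, 9, 10, 0],
    "unusual": [1, 0, 1, 0, 8],
}
THRESHOLD = -2.0  # assumption: in practice chosen from held-out typical data
scores = {name: per_word_log_likelihood(h, topics) for name, h in events.items()}
flagged = [name for name, s in scores.items() if s < THRESHOLD]
print(flagged)  # ['unusual']
```

Normalizing by the number of words keeps long and short events comparable, which matters when tracks differ widely in length.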

24 Tube trajectory representation
● Compute displacement vector
● Bin into one of 25 quantization bins
● Consider the transition from one bin to another as a word (25 * 25 = 625 vocabulary words)
● ‘Bag of words’ representation
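The trajectory encoding above can be sketched in Python. The slide does not say how the 25 quantization bins are laid out, so this example assumes a 5 x 5 grid over the clipped (dx, dy) displacement; the clip range `MAX_STEP`, the function names and the sample track are likewise hypothetical.

```python
from collections import Counter

GRID = 5                 # assumption: 5 x 5 grid gives the 25 bins
NUM_BINS = GRID * GRID   # 25 displacement bins
MAX_STEP = 10.0          # assumption: displacements are clipped to [-10, 10]


def axis_bin(value):
    """Quantize one displacement component into GRID equal-width cells."""
    clipped = max(-MAX_STEP, min(MAX_STEP, value))
    index = int((clipped + MAX_STEP) / (2 * MAX_STEP) * GRID)
    return min(index, GRID - 1)


def displacement_bin(p, q):
    """Bin index in [0, 24] for the displacement from point p to point q."""
    dx, dy = q[0] - p[0], q[1] - p[1]
    return axis_bin(dy) * GRID + axis_bin(dx)


def trajectory_words(points):
    """One word per transition between consecutive bins: 25 * 25 = 625 words."""
    bins = [displacement_bin(p, q) for p, q in zip(points, points[1:])]
    return [a * NUM_BINS + b for a, b in zip(bins, bins[1:])]


# A short track that moves right twice and then changes direction.
track = [(0, 0), (5, 0), (10, 0), (10, 5)]
words = trajectory_words(track)
bag = Counter(words)
print(words)  # [338, 342]
```

A track of n points yields n - 1 displacements and n - 2 words, and `bag` is the bag-of-words histogram that would be passed to the topic model.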

25

26 Tracking results

27 Experimental results
● Training and test videos are each an hour long, of an urban street intersection
● Each hour contributed ~1000 tubes
● We set k, the number of latent topics, to be 8

28 Experimental results
Data:
● Training and test videos are each an hour long, of an urban street intersection
● Each hour contributed ~1000 tubes
● We set k, the number of latent topics, to be 8
Learned topics:
● cars going left to right
● cars going right to left
● people going left to right
● Complex dynamics: turning into top street

29 Results: learned topics. Cars going left to right, or right to left

30 Results: learned topics. People walking left to right, or right to left

31 Results: low probability events

32 Next step: beyond bag of words
● Tracking is not always so simple…
● Representation with a bag of discrete words is not always appropriate for images

33 Bag of Words in Computer Vision: general approach

34 Bag of Words in Computer Vision: general approach

35 Bag of Words in Computer Vision: general approach [Figure: example histogram with counts 1 2 1 1 0]

36 Experiment: UCSD Ped 2, hard assignment
● 16 training videos containing only pedestrians, and 12 test videos that also contain abnormal events.
● Each event’s support is of size 24x24x21, composed of 4x4x3 patches of size 9x9x10
● Events and patches with no movement are filtered out.
● Dictionary contains 100 words.

37 Generative dictionary of Dynamic Texture
G. Doretto, A. Chiuso, Y. N. Wu, and S. Soatto. Dynamic textures. International Journal of Computer Vision, 2003

38 Generative dictionary of Dynamic Texture

39 Experiment: UCSD Ped 2, hard assignment
● 16 training videos containing only pedestrians, and 12 test videos that also contain abnormal events.
● Each event’s support is of size 24x24x21, containing 4x4x3 patches of size 9x9x10
● Events and patches with no movement are filtered out.
● Dictionary contains 100 words.

40 LDA with soft assignment of descriptors

41 LDA with soft assignment of descriptors

42 LDA with soft assignment of descriptors

43 LDA with soft assignment of descriptors

44 LDA with soft assignment of descriptors

45 LDA with soft assignment of descriptors

46 Variational inference under LDA with soft assignment

47 Parameter estimation under LDA with soft assignment. Obtained while maximizing a lower bound over (itself a lower bound)

48 Variational inference and parameter estimation under LDA
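The equations of this slide did not survive in the transcript. For orientation only, the block below gives the standard mean-field updates for plain LDA with hard word assignments, as they appear in the general LDA literature; it is not a reconstruction of what the slide showed, and it does not cover the soft-assignment variant. The symbols are assumptions: φ and γ are the variational parameters of one document, Ψ is the digamma function, k is the number of topics, and w_n is the n-th word of the document.

```latex
\begin{align}
\phi_{ni} &\propto \beta_{i w_n}
  \exp\!\Big(\Psi(\gamma_i) - \Psi\big(\textstyle\sum_{j=1}^{k} \gamma_j\big)\Big), \\
\gamma_i &= \alpha_i + \sum_{n=1}^{N} \phi_{ni}, \\
\beta_{ij} &\propto \sum_{d=1}^{D} \sum_{n=1}^{N_d} \phi_{dni}\, \mathbb{1}[w_{dn} = j].
\end{align}
```

The first two updates are iterated per document (variational inference), and the third re-estimates the topic-word parameters across the corpus (parameter estimation).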

49 Variational inference and parameter estimation under LDA with soft assignments

50 Experiment: UCSD Ped 2, soft assignment
● 16 training videos containing only pedestrians, and 12 test videos that also contain abnormal events.
● Each event’s support is of size 24x24x21, containing 4x4x3 patches of size 9x9x10
● Events and patches with no movement are filtered out.
● Dictionary contains 100 generative words (Dynamic texture).

51 Experiment: UCSD Ped 2

52 Experiment: UCSD Ped 2

53 Summary
● We discussed a few methods to organize data in an unsupervised manner
● This is used to analyze and obtain insights about all kinds of data, from text to images to collections of music
● Many extensions to the basic LDA model:
  ● Hierarchical LDA (topics are organized in a hierarchy)
  ● Hierarchical Dirichlet Process mixture model, where the number of topics is not pre-determined
  ● Dynamic topic model, which allows topics to change with time
  ● and more…

