Can we build recommender system for artwork evaluation?


1 Can we build recommender system for artwork evaluation?
Zbigniew W. Ras & Anna Gelich
KDD Laboratory, Computer Science Department, University of North Carolina, Charlotte
Research Sponsored by

2 Assigning Price Tags for Art - BIG DATA Problem
Semantic WEB: a very rich source of data and knowledge about artists and art.
Websites with information about artists: ArtPrice (649,129 artists), National Gallery, Artists Signatures.
Websites with information about art: ArtPrice (detailed auction results, market trends), Saatchiart, Artlog, DeviantArt (26 million members, 1.5 million comments daily), ...
Recommender System: partitioning artists into semantically similar buckets and treating the buckets as a WEB crowd.
(1) Data Collection: extracting information from the WEB using crawlers and spiders [Apache Nutch based on Apache Hadoop]
(2) Data Processing: extracting features, including Price (discretization cuts: 480, 1020, 2335, 5100, 10650, 19500, ...)
(3) Data Mining /Map Reduce/

3 Sample Attributes: Price (has to be discretized)
Price intervals: [0,480), [480,1020), [1020,2335), [2335,5100), [5100,10650), [10650,19500), [19500, -)
[Figure: Price Distribution. 20,543 paintings; price range 0 - 40,000; cut points at 480, 1020, 2335, 5100, 10650, 19500; regions marked "Dense area" and "Sparse area".]
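The discretization above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: the function name `price_interval` is hypothetical, and the class letters f, a, b, d, e, c, g are taken from the interval labels used in the confusion matrices of this deck. It assumes left-closed, right-open intervals, as the bracket notation suggests.

```python
from bisect import bisect_right

# Cut points from the slide; the last interval [19500, -) has no upper bound.
CUTS = [480, 1020, 2335, 5100, 10650, 19500]
# Class letters in increasing price order, as labeled in the confusion matrices.
LABELS = ["f", "a", "b", "d", "e", "c", "g"]


def price_interval(price):
    """Return (class letter, interval string) for a non-negative price."""
    if price < 0:
        raise ValueError("price must be non-negative")
    # bisect_right puts a price equal to a cut into the interval that starts there,
    # which matches left-closed, right-open intervals.
    i = bisect_right(CUTS, price)
    low = 0 if i == 0 else CUTS[i - 1]
    high = "-" if i == len(CUTS) else CUTS[i]
    return LABELS[i], f"[{low},{high})"


if __name__ == "__main__":
    for p in (100, 480, 5099.99, 25000):
        print(p, price_interval(p))
```

A price exactly on a cut point, such as 480, falls into the interval that begins at that cut.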

4 Confusion Matrix
Decision Tree Classifier. Rows are actual price ranges; each row lists the counts classified as f, a, b, d, e, c, g, in that order, as far as the transcript preserves them.

Actual range          Number   Classified as (f a b d e c g)
f = [0,480)             2933   1863 731 264 71
a = [480,1020)          4543   646 2401 1054 384 53 3
b = [1020,2335)         5268   250 1060 2568 1236 134 12 8
d = [2335,5100)         5019   109 374 1135 2915 430 41 15
e = [5100,10650)        1939   26 88 229 822 676 73 25
c = [10650,19500)        453   6 14 57 166 123 78 9
g = [19500,-)            388   4 18 85 44 11 214

Decision Tree Classifier (Precision)

Price Range       Correctness   Left Ext. Correctness   Right Ext. Correctness   Overpriced (?)
[0,480)           64%           -                       88%
[480,1020)        53%           67%                     76%                      NO
[1020,2335)       49%           69%                     72%
[2335,5100)       58%           81%                                              YES
[5100,10650)      35%           77%                     39%
[10650,19500)     17%           44%                     19%
[19500,-)         55%

Average Precision: 52%; Left Ext. Average Precision: 71%; Right Ext. Average Precision: 69%
Paintings in the $5,100 to $19,500 range appear to be overpriced.
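The three "Correctness" columns can be reproduced from the matrix rows. The sketch below is an interpretation rather than the authors' stated method: it assumes "Correctness" is the diagonal cell divided by the row total, and that the left (right) extension also counts paintings classified into the adjacent cheaper (more expensive) interval. Under those assumptions it reproduces the reported figures for the four Decision Tree rows whose seven cells survive in full. The function name `correctness` is hypothetical.

```python
# Column order of the confusion matrix, cheapest interval first.
ORDER = ["f", "a", "b", "d", "e", "c", "g"]

# Decision Tree rows copied from the slide; only rows with all seven cells
# present (they sum to the stated row totals) are included.
ROWS = {
    "b": [250, 1060, 2568, 1236, 134, 12, 8],   # total 5268
    "d": [109, 374, 1135, 2915, 430, 41, 15],   # total 5019
    "e": [26, 88, 229, 822, 676, 73, 25],       # total 1939
    "c": [6, 14, 57, 166, 123, 78, 9],          # total 453
}


def correctness(label):
    """Return (exact, left-extended, right-extended) correctness in whole percent.

    Assumption: extension means also accepting the adjacent interval on that
    side. A missing neighbor yields None, shown as "-" on the slide.
    """
    row = ROWS[label]
    i = ORDER.index(label)
    total = sum(row)

    def pct(count):
        return round(100 * count / total)

    exact = pct(row[i])
    left = pct(row[i] + row[i - 1]) if i > 0 else None
    right = pct(row[i] + row[i + 1]) if i < len(row) - 1 else None
    return exact, left, right


if __name__ == "__main__":
    for label in ROWS:
        print(label, correctness(label))
```

Under this reading, a class whose left-extended value is well above its right-extended value is one whose paintings are mostly misclassified into cheaper intervals, which is how the "Overpriced (?)" column appears to be decided.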

6 Confusion Matrix
Random Forest. Rows are actual price ranges; each row lists the counts classified as f, a, b, d, e, c, g, in that order, as far as the transcript preserves them.

Actual range          Number   Classified as (f a b d e c g)
f = [0,480)             2933   1857 676 286 99 11 3 1
a = [480,1020)          4543   576 2467 1026 400 61 6 7
b = [1020,2335)         5268   229 982 2668 1175 184 21 9
d = [2335,5100)         5019   77 336 986 3063 493 43
e = [5100,10650)        1939   19 76 227 709 812 70 26
c = [10650,19500)        453   12 49 151 118 106
g = [19500,-)            388   5 18 36 13 239

Random Forest (Precision)

Price Range       Correctness   Left Ext. Correctness   Right Ext. Correctness   Overpriced (?)
[0,480)           63%           -                       86%
[480,1020)        54%           67%                     77%                      NO
[1020,2335)       51%           69%                     73%
[2335,5100)       61%           81%                     71%                      YES
[5100,10650)      42%           78%                     45%
[10650,19500)     23%           49%                     26%
[19500,-)         62%           65%

Average Precision: 55%; Left Ext. Average Precision: 71%; Right Ext. Average Precision: 71%
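As a quick check on the 55% average for Random Forest, the sketch below compares two ways of averaging. It assumes the correctly classified count for each class is the diagonal cell of the matrix (the cell that reproduces the reported per-class percentage); the variable names are illustrative. The reported 55% matches the count-weighted figure (all correct predictions over all paintings) rather than the plain mean of the seven class percentages.

```python
# Row totals ("Number") and diagonal cells read from the Random Forest matrix.
TOTALS = {"f": 2933, "a": 4543, "b": 5268, "d": 5019, "e": 1939, "c": 453, "g": 388}
CORRECT = {"f": 1857, "a": 2467, "b": 2668, "d": 3063, "e": 812, "c": 106, "g": 239}

# Count-weighted average: every painting counts once.
weighted = 100 * sum(CORRECT.values()) / sum(TOTALS.values())

# Plain mean of the per-class percentages: every class counts once.
per_class = [100 * CORRECT[k] / TOTALS[k] for k in TOTALS]
unweighted = sum(per_class) / len(per_class)

if __name__ == "__main__":
    print(f"paintings: {sum(TOTALS.values())}")
    print(f"weighted: {weighted:.1f}%  unweighted: {unweighted:.1f}%")
```

The gap between the two numbers comes from the small, poorly predicted high-price classes, which pull the unweighted mean down.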

8 WEB (Source of Data & Knowledge)
Step 1: Extracting artworks (paintings), relevant features, comments about artworks, artists' biographies. Output: Initial Dataset of Artists, Initial Dataset of Artworks.
Step 2: Converting data to a format ready for mining. New feature construction using text mining, sentiment mining, guided folksonomy. Output: Dataset of Artists, Dataset of Artworks.
Step 3: Clustering based on semantic distance. Output: Disjoint Buckets of Semantically Similar Artists.
Step 4: For each bucket, identifying artworks done by artists listed in that bucket. Output: Collection of Datasets of Artworks.
Step 5: Data Mining and Crowd Construction. Output: Recommender Systems for Artwork Evaluation and Pricing.
System Demo (One RS), System Interface
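Step 4 of the pipeline above (collecting, for each bucket, the artworks by artists in that bucket) might look roughly like the following. Everything here is hypothetical: the function name, the record layout, and the placeholder artists, titles and prices are invented for illustration. The clustering that produces the buckets in Step 3 is not shown.

```python
from collections import defaultdict


def artworks_by_bucket(buckets, artworks):
    """Group artwork records by the bucket their artist belongs to.

    buckets:  dict mapping bucket name -> set of artist names (disjoint sets).
    artworks: list of dicts, each with at least an "artist" key.
    Artworks by artists that appear in no bucket are skipped.
    """
    # Buckets are disjoint, so each artist maps to exactly one bucket.
    artist_to_bucket = {
        artist: name for name, artists in buckets.items() for artist in artists
    }
    datasets = defaultdict(list)
    for art in artworks:
        bucket = artist_to_bucket.get(art["artist"])
        if bucket is not None:
            datasets[bucket].append(art)
    return dict(datasets)


# Placeholder data, not taken from the presentation.
example_buckets = {"bucket_1": {"artist_A", "artist_B"}, "bucket_2": {"artist_C"}}
example_artworks = [
    {"artist": "artist_A", "title": "untitled 1", "price": 900},
    {"artist": "artist_C", "title": "untitled 2", "price": 6000},
    {"artist": "artist_B", "title": "untitled 3", "price": 300},
    {"artist": "artist_X", "title": "untitled 4", "price": 100},
]
example_datasets = artworks_by_bucket(example_buckets, example_artworks)
```

Each resulting per-bucket dataset would then be mined separately in Step 5, yielding one recommender system per bucket.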

9 Q & A

