Data Mining and Its Applications to Image Processing

Presentation transcript:

Data Mining and Its Applications to Image Processing. Advisor: Chang, Chin-Chen (張真誠). Graduate student: Lin, Chih-Yang (林智揚). Department of Computer Science and Information Engineering, National Chung Cheng University

The Fields of Data Mining: mining association rules, sequential mining, clustering (declustering), classification, …

Outline. Part I: Design and Analysis of Data Mining Algorithms. Part II: Data Mining Applications to Image Processing.

Part I: Design and Analysis of Data Mining Algorithms. 1. Perfect Hashing Schemes for Mining Association Rules (or for Mining Traversal Patterns)

Mining Association Rules. Support: used to obtain the large itemsets. Confidence: used to generate the association rules.
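The two measures can be illustrated with a minimal Python sketch. It assumes the usual definitions (support as the number of transactions containing an itemset, confidence as the support of the whole rule divided by the support of its antecedent) and uses the four-transaction database from the Apriori example in this deck. The names `TRANSACTIONS`, `support_count` and `confidence` are hypothetical and do not come from the slides.

```python
from typing import FrozenSet, Iterable, List

# The four transactions (TID 100, 200, 300, 400) of the Apriori example.
TRANSACTIONS: List[FrozenSet[str]] = [
    frozenset("ACD"), frozenset("BCE"), frozenset("ABCE"), frozenset("BE"),
]


def support_count(itemset: Iterable[str], transactions=TRANSACTIONS) -> int:
    """Number of transactions that contain every item of the itemset."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def confidence(antecedent: Iterable[str], consequent: Iterable[str],
               transactions=TRANSACTIONS) -> float:
    """Confidence of the rule antecedent -> consequent."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support_count(both, transactions) / support_count(antecedent, transactions)
```

For example, `support_count("BE")` counts the transactions containing both B and E, and `confidence("B", "E")` measures how often E appears in the transactions that contain B.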

Apriori example (minimum support = 2)
Database D:
  TID 100: A C D
  TID 200: B C E
  TID 300: A B C E
  TID 400: B E
Scan D for C1: {A} 2, {B} 3, {C} 3, {D} 1, {E} 3
L1: {A} 2, {B} 3, {C} 3, {E} 3
C2 (generated from L1): {A B}, {A C}, {A E}, {B C}, {B E}, {C E}
Scan D for C2: {A B} 1, {A C} 2, {A E} 1, {B C} 2, {B E} 3, {C E} 2
L2: {A C} 2, {B C} 2, {B E} 3, {C E} 2
C3 (generated from L2): {B C E}
Scan D for C3: {B C E} 2
L3: {B C E} 2
Note on terminology: finding the initial large itemsets is the most time-consuming part of the whole algorithm, and Apriori performs best in this initial stage, so it is used as the baseline for comparison with DHP.
Note on C3: {B C} and {B E} share the same first item, so {C E} is tested to see whether it is also a large itemset. It is, so {B C E} becomes a candidate.
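The level-wise procedure in this example can be sketched in Python as follows. This is a minimal illustration under the standard Apriori formulation (join, prune, then count by scanning the database), not the code used for the slides. The function name `apriori` and its signature are hypothetical.

```python
from itertools import combinations


def apriori(transactions, min_sup):
    """Return {itemset (frozenset): support count} for all large itemsets."""
    transactions = [frozenset(t) for t in transactions]

    # First scan: count the 1-itemsets (C1) and keep the large ones (L1).
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    large = {s: c for s, c in counts.items() if c >= min_sup}

    result = dict(large)
    k = 2
    while large:
        prev = list(large)
        # Join: unions of two (k-1)-itemsets that contain exactly k items.
        candidates = {a | b for a, b in combinations(prev, 2) if len(a | b) == k}
        # Prune: every (k-1)-subset of a candidate must itself be large.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in large for s in combinations(c, k - 1))
        }
        # Scan the database to count the surviving candidates.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        large = {s: n for s, n in counts.items() if n >= min_sup}
        result.update(large)
        k += 1
    return result
```

Calling `apriori(["ACD", "BCE", "ABCE", "BE"], 2)` walks through the same levels as the example: four large 1-itemsets, four large 2-itemsets, and the single large 3-itemset {B C E}.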

Apriori (cont.) Disadvantages: it is inefficient, and it produces many useless candidates.

DHP: prunes useless candidates in advance and reduces the database size at each iteration.

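DHP example (min sup = 2)
Database D:
  TID 100: A C D
  TID 200: B C E
  TID 300: A B C E
  TID 400: B E
C1 counts: {A} 2, {B} 3, {C} 3, {D} 1, {E} 3
L1: {A}, {B}, {C}, {E}
Making a hash table from the 2-itemsets of each transaction:
  100: {A C}
  200: {B C}, {B E}, {C E}
  300: {A B}, {A C}, {A E}, {B C}, {B E}, {C E}
  400: {B E}
Hash function: H({x y}) = ((order of x) * 10 + (order of y)) mod 7
Hash table H2: for each hash address (0 to 6), the slide shows the itemsets hashed to that bucket ({A E}, {B E}, {C E}, {B C}, {A C}, {A B}), the number of itemsets hashed to the bucket, and the resulting bit vector.
Notes: this slide introduces the hash method, including the hash function. Once the scan of database D for the 1-itemset supports is finished, the hash table for 2-itemsets is finished as well: each transaction in D is split into 2-itemsets, which are fed in order into the hash function and placed in the hash table, and the count of each bucket is computed. Comparing the bucket counts with the minimum support (s = 2) gives a bit vector, which is then used to filter L1*L1 and obtain a smaller C2.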
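The hash filtering step can be sketched in Python as below. The sketch makes two assumptions that the slide does not state explicitly: the "order" of an item is its 1-based alphabetical position (A = 1, …, E = 5), and, following the slide's listing, only pairs of items that are in L1 are hashed. The names `ORDER`, `h2` and `dhp_c2` are hypothetical.

```python
from itertools import combinations

# Assumption: the order of an item is its 1-based alphabetical position.
ORDER = {"A": 1, "B": 2, "C": 3, "D": 4, "E": 5}


def h2(x, y, buckets=7):
    """Hash function of the slide: ((order of x) * 10 + (order of y)) mod 7."""
    return (ORDER[x] * 10 + ORDER[y]) % buckets


def dhp_c2(transactions, l1, min_sup, buckets=7):
    """Return (bucket counts, bit vector, filtered C2) for the 2-itemsets."""
    counts = [0] * buckets
    for t in transactions:
        # As in the slide's listing, keep only the items that are in L1.
        items = sorted(i for i in t if i in l1)
        for x, y in combinations(items, 2):
            counts[h2(x, y, buckets)] += 1
    # A bucket passes when its count reaches the minimum support.
    bit_vector = [1 if c >= min_sup else 0 for c in counts]
    # Filter L1 * L1 with the bit vector to obtain a smaller C2.
    c2 = [(x, y) for x, y in combinations(sorted(l1), 2)
          if bit_vector[h2(x, y, buckets)]]
    return counts, bit_vector, c2
```

With the database above, `dhp_c2(["ACD", "BCE", "ABCE", "BE"], {"A", "B", "C", "E"}, 2)` drops {A B} and {A E} before any counting scan for C2 takes place. In this small example no two of the hashed pairs collide, which is not guaranteed in general.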

Perfect Hashing Schemes (PHS) for Mining Association Rules

Motivation: Apriori and DHP produce C_i from L_{i-1}, which may be the bottleneck. DHP suffers from collisions. Designing a perfect hashing function for every transaction database is a thorny problem.

Definition. A join operation joins two different (k-1)-itemsets, $p_1 p_2 \dots p_{k-1}$ and $q_1 q_2 \dots q_{k-1}$, to produce a k-itemset, where $p_2 = q_1$, $p_3 = q_2$, …, $p_{k-2} = q_{k-3}$, $p_{k-1} = q_{k-2}$. Example: ABC and BCD join to give ABCD. Among the 3-itemsets of ABCD (ABC, ABD, ACD, BCD), only one pair satisfies the join definition.
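A minimal Python sketch of this join condition follows. It represents an itemset as an ordered string or tuple of items and assumes that the joined k-itemset is the first itemset followed by the last item of the second one, as in the ABC, BCD example. The function name `join` is hypothetical.

```python
from itertools import permutations


def join(p, q):
    """Join two different (k-1)-itemsets; return the k-itemset or None.

    The join succeeds when p without its first item equals q without its
    last item (p2 = q1, p3 = q2, ..., p_{k-1} = q_{k-2}).
    """
    p, q = tuple(p), tuple(q)
    if p != q and len(p) == len(q) and p[1:] == q[:-1]:
        return p + q[-1:]
    return None


# Check the example: which ordered pairs of 3-itemsets of ABCD can be joined?
three_itemsets = ["ABC", "ABD", "ACD", "BCD"]
joinable_pairs = [(p, q) for p, q in permutations(three_itemsets, 2)
                  if join(p, q) is not None]
```

Here `joinable_pairs` contains only the pair (ABC, BCD), so each k-itemset is produced by exactly one join. Note that for 1-itemsets the condition is vacuous, so this sketch would join any two different single items in either order.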

Algorithm PHS (Perfect Hashing and Data Shrinking)

Example 1 (sup = 2)
Database:
  TID 100: ACD
  TID 200: BCE
  TID 300: BCDE
  TID 400: BE
L1: {B} 3, {C} 3, {D} 2, {E} 3
Transactions rewritten as 2-itemsets:
  100: (CD)
  200: (BC) (BE) (CE)
  300: (BC) (BD) (BE) (CD) (CE) (DE)
  400: (BE)
Supports: (BC) 2, (BD) 1, (BE) 3, (CD) 2, (CE) 2, (DE) 1
Encoding of the large 2-itemsets: A = (BC), B = (BE), C = (CD), D = (CE)

Example 2 (sup = 2)
Transactions after encoding and joining:
  100: Null
  200: (AD)
  300: (AC) (AD)
  400: Null
Supports over the itemsets (AB), (AC), (AD), (BC), (BD), (CD): (AC) 1, (AD) 2
Encoding: A = (AD)
Decode: AD -> (BC)(CE) = BCE
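The step from Example 1 to Example 2 can be traced with a short Python sketch. It is only an illustration of the encode, join, count and decode steps on this example, not the PHS algorithm itself (no hash table or data shrinking is modelled). The names `ENCODING`, `ENCODED_DB`, `joinable`, `next_level` and `decode` are hypothetical.

```python
from collections import Counter
from itertools import permutations

# Encoding of the large 2-itemsets from Example 1.
ENCODING = {"A": "BC", "B": "BE", "C": "CD", "D": "CE"}

# Each transaction of Example 1, keeping only its large 2-itemsets as codes.
ENCODED_DB = {100: {"C"}, 200: {"A", "B", "D"}, 300: {"A", "B", "C", "D"}, 400: {"B"}}


def joinable(p, q):
    """Join condition: p without its first item equals q without its last."""
    return p != q and p[1:] == q[:-1]


def next_level(encoded_db):
    """For each transaction, list the code pairs whose originals can be joined."""
    return {
        tid: [a + b for a, b in permutations(sorted(codes), 2)
              if joinable(ENCODING[a], ENCODING[b])]
        for tid, codes in encoded_db.items()
    }


def decode(pair):
    """Decode a joined code pair back to items, e.g. AD -> (BC)(CE) = BCE."""
    p, q = ENCODING[pair[0]], ENCODING[pair[1]]
    return p + q[-1]


level3 = next_level(ENCODED_DB)
support = Counter(pair for pairs in level3.values() for pair in pairs)
large = [pair for pair, count in support.items() if count >= 2]
```

Running the sketch gives an empty list for TID 100 and TID 400, (AD) for TID 200 and (AC)(AD) for TID 300, so only AD reaches the support of 2 and decodes to BCE.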

Problem on Hash Table. Consider a database containing p transactions, each made up of unique items and of equal length N, with a minimum support of 1. At iteration k, the number of candidate k-itemsets is [formula not preserved in the transcript]. The number of buckets required in the next pass is [formula not preserved], where m = [formula not preserved], while the actual number of next candidates is [formula not preserved]. Loading density: [formula not preserved].

How to Improve the Loading Density: a two-level perfect hash scheme (partial hash). Itemsets: (AB) (AC) (AD) (BC) (BD) (CD), with the supports 1 and 2 shown. [Figure: first-level entries A, B, C; second-level hash table with entries D and Null and counts 1 and 2]

Experiments

Part II: Data Mining Applications to Image Processing 1. A Prediction Scheme for Image Vector Quantization based on Mining Association Rules 2. Reversible Steganography for VQ-compressed Images Using Clustering and Relocation 3. A Reversible Steganographic Method Using SMVQ Approach based on Declustering

A Prediction Scheme for Image Vector Quantization Based on Mining Association Rules

Vector Quantization (VQ): image encoding and decoding techniques

SMVQ (cont.) [Figure: codebook and state codebook]

Framework of the Proposed Method v/10 (Quantized)

Association Rules: horizontal, vertical, and diagonal. Condition: if X → y is a rule, there is no rule X' → y' such that X' ⊃ X and y' = y.

The Prediction Strategy

Example. A query against the rules DB returns the matched set of rules:
Matched vertical rules: (4, 2, 3, 3 → 5) conf_v = 90%; (4, 2, 3 → 1) conf_v = 85%
Matched horizontal rules: (12, 12, 1, 3 → 5) conf_h = 90%; (12, 12 → 1) conf_h = 95%
Matched diagonal rules: (6, 4, 2, 2, 3 → 5) conf_d = 100%; (6, 4, 2, 2 → 8) conf_d = 70%; (6, 4, 2 → 10) conf_d = 75%
? may be 5, 1, 8, or 10. How to decide?

Example (cont.) Each matched rule votes for its consequent with a weight equal to the number of items in its antecedent times its confidence:
The weight of 5: 4*90% + 4*90% + 5*100% = 12.2
The weight of 1: 3*85% + 2*95% = 4.45
The weight of 8: 4*70% = 2.8
The weight of 10: 3*75% = 2.25
{5, 1} is called the consequence list; its size is determined by the user.
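The weighting above can be reproduced with a small Python sketch. It assumes, as the arithmetic on the slide suggests, that a rule's weight is the length of its antecedent multiplied by its confidence, and that the consequence list holds the highest-weighted consequents. The names `RULES` and `consequence_list` are hypothetical.

```python
from collections import defaultdict

# Matched rules from the example: (antecedent, consequent, confidence).
RULES = [
    ((4, 2, 3, 3), 5, 0.90),      # vertical
    ((4, 2, 3), 1, 0.85),         # vertical
    ((12, 12, 1, 3), 5, 0.90),    # horizontal
    ((12, 12), 1, 0.95),          # horizontal
    ((6, 4, 2, 2, 3), 5, 1.00),   # diagonal
    ((6, 4, 2, 2), 8, 0.70),      # diagonal
    ((6, 4, 2), 10, 0.75),        # diagonal
]


def consequence_list(rules, size):
    """Return (the `size` best consequents, weight of every consequent)."""
    weights = defaultdict(float)
    for antecedent, consequent, conf in rules:
        # Assumed weight: antecedent length times confidence.
        weights[consequent] += len(antecedent) * conf
    ranked = sorted(weights, key=weights.get, reverse=True)
    return ranked[:size], dict(weights)


top, weights = consequence_list(RULES, size=2)
```

With `size=2` the sketch ranks 5 first and 1 second, matching the consequence list {5, 1} on the slide.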

Experiments. [Figure: original image; image reconstructed by full-search VQ; image reconstructed by the proposed method]

Experiments (cont.) Performance comparison of the methods. PSNR values are listed in the image order Lena, Pepper, F16; bit-rates are given as they appear in the transcript:
Full-search VQ: PSNR (dB) 32.25, 31.41, 31.58; bit-rate (bpp) 0.5
SMVQ: PSNR (dB) 28.57, 28.04, 27.94; bit-rate (bpp) 0.33, 0.32
Our scheme: PSNR (dB) 30.64, 30.05, 29.74; bit-rate (bpp) 0.34

Experiments cont. Overfitting problem

Advantages: Mining association rules can be applied successfully to image prediction. A broader spatial correlation is considered than in SMVQ. The scheme is more efficient than SMVQ because no Euclidean distances need to be calculated.

Reversible Steganography for VQ-compressed Images Using Clustering and Relocation

Flowchart of the Proposed Method

Construction of the Hit Map. [Figure: example values 13 1 13 7 13 4 6 7 1 1 4 4 2 7 3 11 …, shown with the sorted codebook and the hit map]

Clustering the Codebook. Assume that the size of the codebook is 15: cw0, cw1, …, cw14. Clustering: C1: cw0, cw1, cw3, cw6, cw8, cw10; C2: cw4, cw14; C3: cw2, cw5, cw9; C4: cw12; C5: cw7, cw11, cw13.

Relocation. Assume that the size of the state codebook is 4. Clusters as shown: (cw0, cw1, cw3, cw6, cw8, cw10), (cw2, cw5, cw9), (cw4, cw14), (cw7, cw11, cw13), (cw12). The figure also carries the labels L and cw14.

Embedding. Secret bits: 1011. Only the codewords in G0 can embed the secret bits. The codewords in G1 should be replaced with the codewords in G2. Indices before embedding: cw14 cw12 cw1 cw2 cw6 cw3 cw10 cw8. Indices after embedding: cw4 cw12 cw0 cw2 cw6 cw5 cw3 cw8 cw1.

Extraction & Reversibility. Indices carrying the secret bits: cw4 cw12 cw0 cw2 cw6 cw5 cw3 cw8 cw1. Recovered indices: cw14 cw12 cw1 cw2 cw6 cw3 cw10 cw8. Secret bits:

Experiments (12 hit maps (600 bits), 250 clusters). Values are listed in the image order Lena, Pepper, Sailboat, Baboon:
Modified Tian's method: PSNR (dB) 26.92, 26.45, 25.05, 22.70; payload (bits) 2777, 3375, 3283, 2339
MFCVQ: PSNR (dB) 28.03, 26.43, 26.60, 24.04; payload (bits) 5892, 5712, 5176, 1798
Proposed method: PSNR (dB) 30.23, 29.15, 28.00 (only three values appear in the transcript); payload (bits) 8707, 8421, 7601, 3400

Experiments. [Figure: results of Tian's method and MFCVQ]

[Figure: single hit map; multiple hit maps without clustering; clustering with multiple hit maps]

Experiments, using Lena as the cover image

A Reversible Steganographic Method Using SMVQ Approach based on Declustering

Find the most dissimilar pairs (de-clustering): (CW1, CW8), (CW2, CW9), (CW3, CW10), (CW4, CW11), (CW5, CW12), (CW6, CW13), (CW7, CW14), …

Embedding Using Side-Match
(CW1, CW8): dissimilar pair. Assume X = CW1.
V0 = ((U13 + L4)/2, U14, U15, U16, L8, L12, L16)
V1 = (X1, X2, X3, X4, X5, X9, X13) of CW1
V8 = (X1, X2, X3, X4, X5, X9, X13) of CW8
d1 = Euclidean_Distance(V0, V1)
d8 = Euclidean_Distance(V0, V8)
If (d1 < d8), then block X is replaceable; otherwise, block X is non-replaceable.
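The replaceability test can be sketched in Python as follows. The sketch assumes 4×4 blocks stored as flat lists of 16 pixel values in row-major order with 1-based positions as on the slide, so that U13 to U16 are the bottom row of the upper neighbour U and L4, L8, L12, L16 are the right column of the left neighbour L. The names `BORDER`, `side_prediction`, `border` and `is_replaceable` are hypothetical.

```python
import math

# 1-based positions of the top row and left column of a 4x4 block:
# X1, X2, X3, X4, X5, X9, X13.
BORDER = (1, 2, 3, 4, 5, 9, 13)


def side_prediction(upper, left):
    """V0 = ((U13 + L4)/2, U14, U15, U16, L8, L12, L16)."""
    def u(i):
        return upper[i - 1]

    def lf(i):
        return left[i - 1]

    return ((u(13) + lf(4)) / 2, u(14), u(15), u(16), lf(8), lf(12), lf(16))


def border(codeword):
    """Border pixels (X1, X2, X3, X4, X5, X9, X13) of a codeword."""
    return tuple(codeword[i - 1] for i in BORDER)


def is_replaceable(upper, left, codeword, paired_codeword):
    """True when the block's codeword is closer to V0 than its dissimilar pair."""
    v0 = side_prediction(upper, left)
    d_own = math.dist(v0, border(codeword))          # e.g. d1 for CW1
    d_pair = math.dist(v0, border(paired_codeword))  # e.g. d8 for CW8
    return d_own < d_pair
```

For instance, with smooth neighbours `upper = left = [10] * 16`, a codeword `[12] * 16` paired with `[200] * 16` is replaceable, while the reverse assignment is not. `math.dist` requires Python 3.8 or later.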

Embedding example. Secret message: 1 0 1 0 1 0 0 1 0 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1. Codeword groups: {CW1, CW2, CW3, CW4, CW5, CW6, CW7, CW15} and {CW8, CW9, CW10, CW11, CW12, CW13, CW14, CW0}. The index table is processed one index at a time:
Step 1: d6 < d13. Embedding result so far: 6
Step 2: d2 < d9. Embedding result so far: 6 9
Step 3: d12 >= d5. Embedding result so far: 6 9 15||12 (CW15: embed 1)
Step 4: d9 >= d2. Embedding result so far: 6 9 15||12 0||9 (CW0: embed 0)

Extraction and Recovery. Steganographic index table: 6 9 15||12 0||9. Codeword groups: {CW1, CW2, CW3, CW4, CW5, CW6, CW7, CW15} and {CW8, CW9, CW10, CW11, CW12, CW13, CW14, CW0}. The secret bits are extracted and the original indices recovered one entry at a time:
Step 1: d6 < d13. Recovered indices so far: 6
Step 2: d9 >= d2. Recovered indices so far: 6 2
Step 3: entry 15||12. Recovered indices so far: 6 2 12
Step 4: entry 0||9. Recovered indices so far: 6 2 12 9

Find Dissimilar Pairs PCA projection

Improve Embedding Capacity Partition into more groups

Experiments Codebook size: 512 Codeword size: 16 The number of original image blocks:128*128=16384 The number of non-replaceable blocks: 139

Experiments Codebook size: 512 Codeword size: 16 The number of original image blocks:128*128=16384 The number of non-replaceable blocks: 458

Experiments: embedding capacity
Lena: Tian's method 2,777; MFCVQ 5,892; Chang et al.'s method 10,111; proposed method 16,129 (3 groups), 45,075 (9 groups), 55,186 (17 groups)
Baboon: Tian's method 2,339; MFCVQ 1,798; Chang et al.'s method 4,588; proposed method 36,609, 39,014 (only two values appear in the transcript)

Time comparison (image: Lena), time in seconds
Tian's method: 0.55
MFCVQ: 1.36
Chang et al.'s method, by size of the state codebook (4, 8, 16, 32): 14.59, 29.80, 58.8, 161.2
Proposed method, by number of groups (3, 5, 9, 17): 0.11, 0.13, 0.14, 0.19

Future Research Directions: extend the proposed reversible steganographic methods to other image formats, and apply perfect hashing schemes to other applications.

Thanks all