Presentation is loading. Please wait.

Presentation is loading. Please wait.

Privacy Preserving Similarity Evaluation of Time Series Data

Similar presentations


Presentation on theme: "Privacy Preserving Similarity Evaluation of Time Series Data"— Presentation transcript:

1 Privacy Preserving Similarity Evaluation of Time Series Data
Haohan Zhu, Xianrui Meng, George Kollios Boston University

2 Time Series Data are Everywhere
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 Time Series Data are Everywhere Sensitive Time-series Data Physiological Medical Data Signature Data Trajectory Data Financial Data

3 Privacy Concerns for Time Series Data
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 Privacy Concerns for Time Series Data Time series data include sensitive information Physiological Medical Data Signature Data Trajectory Data Financial Data Need to preserve privacy when using time series data Existing work on this topic limited (Ex. Differential Privacy ) Focused mostly on aggregation/summarization queries. Here we consider Similarity Queries Exact Distance between two specific time series

4 Privacy Preserving Similarity Evaluation
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 Privacy Preserving Similarity Evaluation Two-Party Computation Communication PROTOCOL Input Input Output Distance

5 Similarity Evaluation of Time Series Data
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 Similarity Evaluation of Time Series Data Y Distance Functions Euclidean Distance LCSS Distance ERP Distance Fréchet Distance Dynamic Time Warping (DTW) Dynamic Time Warping 𝐷 𝑖, 𝑗 =𝑑 𝑖, 𝑗 + min {𝐷 𝑖−1, 𝑗 , 𝐷 𝑖, 𝑗−1 , 𝐷(𝑖−1, 𝑗−1)} Where 𝑑 𝑖,𝑗 = 𝑥 𝑖 − 𝑦 𝑗 7 11 4 5 3 2 6 1 8 12 17 X

6 Computational Model Settings
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 Computational Model Settings Two-Party Computation Two Parties: Server and Client (Different computational powers) Inputs: Each Party has its own Time Series Data Output: One Party / Both Parties get the Distance Each party learns only the distance from the protocol Semi-Honest Threat Model (also called Honest-But-Curious Model) Each party executes the protocol correctly as specified (no deviations, malicious or not) However, they may try to learn as much as possible about the input of the other party from their views of the protocol

7 What we need to hide? Y The Original Time Series The Matrix
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 What we need to hide? Y The Original Time Series The Matrix Entries Optimal Path Potential Risk during Computation Filling the Matrix 7 11 4 5 3 2 6 1 8 12 17 X

8 Secret Key Protect the Matrix Cipher Matrix and Secret Key
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 Protect the Matrix Cipher Matrix and Secret Key Secret Key Enc(7) Enc(11) Enc(4) Enc(5) Enc(2) Enc(3) Enc(6) Enc(1) Enc(8) 3 4 5 Owner of Cipher Matrix Owner of Secret Key Client Server

9 Secret Key Protection when Filling the Matrix 1/3
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 Protection when Filling the Matrix 1/3 Cipher Matrix and Secret Key Secret Key Enc(7) Enc(5) Enc(6) Enc(4) Enc(2) 3 4 5 Owner of Cipher Matrix Owner of Secret Key Client Server

10 Secret Key Protection when Filling the Matrix 2/3
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 Protection when Filling the Matrix 2/3 Cipher Matrix and Secret Key Secret Key Enc(7) Enc(11) Enc(5) Enc(6) Enc(4) Enc(2) Enc(1) Enc(3) Enc(8) 3 4 5 Owner of Cipher Matrix Owner of Secret Key Client Server

11 Secret Key Protection when Filling the Matrix 3/3
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 Protection when Filling the Matrix 3/3 Cipher Matrix and Secret Key Secret Key Enc(7) Enc(11) Enc(4) Enc(5) Enc(2) Enc(3) Enc(6) Enc(1) Enc(8) 3 4 5 Owner of Cipher Matrix Owner of Secret Key Client Server

12 How to Fill One Entry of the Matrix
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 How to Fill One Entry of the Matrix Dynamic Time Warping Secure Euclidean Distance Computation (Part 1) Privacy Preserving Comparison of Ciphertexts (Part 2) Y2 Enc(B) Enc( d(X2, Y2) + Min(A, B, C) ) Part Part2 Y1 Enc(A) Enc(C) X1 X2

13 Secure Euclidean Distance Computation
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 Secure Euclidean Distance Computation Part 1 of the Protocol PaillierEncryption (Partial Homomorphic Encryption) 𝐸𝑛𝑐 𝑚 1 + 𝑚 2 𝑚𝑜𝑑 𝑛 =𝐸𝑛𝑐 𝑚 1 ∙𝐸𝑛𝑐 𝑚 2 𝑚𝑜𝑑 𝑛 2 𝐸𝑛𝑐 𝑚 1 ∙𝑘 𝑚𝑜𝑑 𝑛 = 𝐸𝑛𝑐 𝑚 1 𝑘 𝑚𝑜𝑑 𝑛 2 Square of Euclidean Distance Computation 𝐸𝑛𝑐 𝛿 𝐸𝑢 2 ( 𝑝 , 𝑞 ) =𝐸𝑛𝑐( 𝑖=1 𝑑 𝑝 𝑖 2 + 𝑖=1 𝑑 𝑞 𝑖 2 − 2 𝑖=1 𝑑 𝑝 𝑖 𝑞 𝑖 ) 𝐸𝑛𝑐( 𝑖=1 𝑑 𝑝 𝑖 2 ) can be computed by the owner of 𝑝 𝑖 (The Server) 𝐸𝑛𝑐( 𝑖=1 𝑑 𝑞 𝑖 2 ) can be computed by the owner of 𝑞 𝑖 (The Client) 𝐸𝑛𝑐(−2 𝑖=1 𝑑 𝑝 𝑖 𝑞 𝑖 ) = 𝑖 𝑑 𝐸𝑛𝑐( 𝑝 𝑖 ) −2 𝑞 𝑖 can be evaluated by the owner of 𝑞 𝑖 (The Client) with information of 𝐸𝑛𝑐( 𝑝 𝑖 ) the owner of 𝑞 𝑖 (The Client) is the owner of the matrix who cannot decrypt ciphertexts. The Client is also the evaluator of part 1 of the protocol

14 Privacy Preserving Comparison
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 Privacy Preserving Comparison Part 2 of the Protocol We need a protocol to compute the minimum of the three values in 𝐷 𝑖−1, 𝑗 , 𝐷 𝑖, 𝑗−1 and 𝐷(𝑖−1, 𝑗−1) Inputs & Outputs: 3 Ciphertexts inputs from the client: Enc(A), Enc(B), Enc(C) 1 Ciphertext output to the client: Enc(min(A, B, C)) Solution Using Random Offsets One round communication protocol

15 Privacy Preserving Comparison cont.
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 Privacy Preserving Comparison cont. The Client Generates Candidates: The Client generates k Random Values: r1, r2, r3, …., rk (r1 is the minimum). The Client creates k+2 candidates: Enc(A+r1), Enc(B+r1), Enc(C+r1), Enc(X2+r2), Enc(X3+r3), …, Enc(Xk+rk), (Xi is one of A, B and C) by using homomorphic addition of Paillier encryption. The Client permutes and sends the k+2 candidates to the Server. The Server Decrypts and Compares Candidates The server decrypts k+2 candidates with the secret key and finds the minimal plaintext M among the k+2 plaintexts. The server re-encrypts the minimal plaintext M and returns Enc(M) to the client. The Client Compute the Result The client computes Enc(min(A, B, C)) = Enc(M - r1) by using homomorphic addition of Paillier encryption as the result.

16 Compute 𝐸𝑛𝑐 𝐷 𝑋 𝑖 , 𝑌 𝑗 + min (𝐴, 𝐵, 𝐶) 𝐷 𝑋 𝑖 , 𝑌 𝑗 + min (𝐴, 𝐵, 𝐶)
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 Full Protocol for Each Entry Secure Euclidean Distance Computation (Part 1) One-way (Half Round) Communication Privacy Preserving Comparison of Ciphertexts (Part 2) One Round Communication Server Client Compute 𝐸𝑛𝑐(𝐷( 𝑋 𝑖 , 𝑌 𝑗 )) Encrypt 𝑌 𝑗 and 𝑌 𝑗 2 Create k+2 Candidates 1. Decrypt Candidates 2. Compare Plaintexts 3. Re-Encrypt 4. Return 𝐸𝑛𝑐(𝑀) Compute 𝐸𝑛𝑐 𝐷 𝑋 𝑖 , 𝑌 𝑗 + min (𝐴, 𝐵, 𝐶) 𝐷 𝑋 𝑖 , 𝑌 𝑗 + min (𝐴, 𝐵, 𝐶)

17 Performance Analysis Communication Cost : mn(d + k + 4)
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 Performance Analysis Communication Cost : mn(d + k + 4) Computation Cost Server: mn(d + 2) encryptions and mn(k + 2) decryptions Client: mn(k+1) encryptions m: length of time series data of server n: length of time series data of client d: dimensionality of single data point k: size of random set For low dimensions, the server has more computational cost than the client, because encryptions and decryptions cost more than other operations and decryptions cost more than encryptions

18 Privacy Preserving Similarity Evaluation of Time Series Data
7/22/2018 Security Analysis The client learns nothing during the computation since the client cannot decrypt any ciphertext. The server learns nothing during the first phase of the protocol, however…. The server decrypts k+2 candidates. The server may create a permuted linear equation system. To solve this system is NP-hard. The Client needs to select the range for random values close to the actual values and needs to make sure that there is no wrap around. We show that our protocol can preserve at least half of the information entropy of the plaintexts.

19 Enc( Max ( d(X2, Y2) , Min(A, B, C) ) )
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 How to Fill One Entry of the Matrix Fréchet Distance Three-Phase Protocol Two Comparison Protocols (Max and Min) Y2 Enc(B) Enc( Max ( d(X2, Y2) , Min(A, B, C) ) ) Y1 Enc(A) Enc(C) X1 X2

20 Experiments Datasets: Distance Functions Parameters Real world data
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 Experiments Datasets: Real world data Synthetic data Distance Functions Dynamic Time Warping Discrete Fréchet Distance Parameters n: length of time series data d: dimensionality of single data point k: size of random set

21 Experiments Performance vs Sequence Size (Different Phases)
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 Experiments Performance vs Sequence Size (Different Phases)

22 Experiments Performance vs Sequence Size (Different Parties)
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 Experiments Performance vs Sequence Size (Different Parties)

23 Experiments Performance vs Dimensionality (Different Phases)
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 Experiments Performance vs Dimensionality (Different Phases)

24 Experiments Performance vs Size of Random Set (Different Phases)
Privacy Preserving Similarity Evaluation of Time Series Data 7/22/2018 Experiments Performance vs Size of Random Set (Different Phases)

25 Thanks


Download ppt "Privacy Preserving Similarity Evaluation of Time Series Data"

Similar presentations


Ads by Google