1
Architectural Impact of SSL Processing Jingnan Yao
2
References
“Architectural Impact of Secure Socket Layer on Internet Servers”, Krishna Kant, Ravishankar Iyer and Prasant Mohapatra.
“Anatomy and Performance of SSL Processing”, Li Zhao, Ravi Iyer, Srihari Makineni and Laxmi Bhuyan.
3
Two Major Approaches
IPsec (Internet Protocol Security): IP level; implemented in NICs (network interface cards).
SSL (Secure Socket Layer): transport level; secures an individual communication session.
Secure HTTP (called HTTPS) uses SSL for security and is widely used in e-commerce environments.
4
Performance Impact
Server: the number of simultaneous connections drops significantly.
Client: unduly long response times (10-25% of e-commerce transactions are aborted).
5
Simultaneous Connections for SPECweb99 and SPECweb99_SSL
SPECweb99 achieves much higher throughput than SPECweb99_SSL.
6
Overview of SSL
Privacy, Integrity & Authentication
Session Negotiation Phase: authentication of the server and client at the beginning of the session.
Bulk Data Transfer Phase: encryption/decryption of data exchanged between the two parties during the session.
8
Execution Time Breakdown in Web Server (1KB webpage)
SSL processing (libcrypto & libssl) takes 71.6% of the execution time.
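To put the 71.6% figure in perspective, a rough Amdahl's-law estimate bounds what speeding up only the SSL portion could yield. This is a sketch that assumes the remaining 28.4% of execution time is unaffected; the function name and the acceleration factors are illustrative and do not come from the slides.

```python
def overall_speedup(ssl_fraction: float, ssl_speedup: float) -> float:
    """Amdahl's law: whole-request speedup when only the SSL part gets faster."""
    return 1.0 / ((1.0 - ssl_fraction) + ssl_fraction / ssl_speedup)

SSL_FRACTION = 0.716  # share of execution time in libcrypto + libssl (from the slide)

for factor in (2, 10, 100):  # hypothetical acceleration factors
    print(f"SSL {factor}x faster -> overall {overall_speedup(SSL_FRACTION, factor):.2f}x")

# Upper bound if SSL processing cost nothing at all
print(f"limit: {1.0 / (1.0 - SSL_FRACTION):.2f}x")
```

Under these assumptions the overall gain saturates well below the acceleration factor, because the non-SSL share of the request stays fixed.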
9
Further Breakdown of Crypto Operations
Public key encryption, private key encryption, hashing, other operations
10
Configurations
Number of processors in the SMP server: uniprocessor, dual processor, quad processor
Three different L2 cache sizes: 512 KB, 1 MB, 2 MB
Three different file sizes: 30 bytes (handshake performance), 1 MB (bulk data encryption performance), 36 KB (average web-page transfer)
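The three dimensions above can be enumerated as a grid. The sketch below assumes every combination was measured, which the slide does not state explicitly; the variable names are illustrative.

```python
from itertools import product

processors = ["1P", "2P", "4P"]          # uniprocessor, dual, quad
l2_sizes = ["512KB", "1MB", "2MB"]
file_sizes = {                           # file size -> what it is meant to stress
    "30 bytes": "handshake performance",
    "1 MB": "bulk data encryption performance",
    "36 KB": "average web-page transfer",
}

# Full cross product of the three dimensions (assumed, see lead-in)
configs = list(product(processors, l2_sizes, file_sizes))
print(len(configs), "configurations")
for cpu, l2, size in configs[:3]:
    print(cpu, l2, size, "->", file_sizes[size])
```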
11
Overall Performance
12
Observation 1:
“SSL increases path length 10-15 fold over non-SSL case” + “CPI drops by more than a factor of 2”
“The use of SSL increases computational cost of the transactions by a factor of 5-7.”
“As the number of processors increases, the ratio goes down.”
“More processors mean more coherency traffic in both SSL and non-SSL cases.”
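The figures on this slide fit together if the cost of a transaction in cycles is taken as path length (instructions per transaction) times CPI. A rough consistency check follows, assuming a CPI ratio of exactly 0.5; the slide only says "more than a factor of 2", so the true ratio is somewhat lower. `cost_ratio` is an illustrative name.

```python
def cost_ratio(path_length_ratio: float, cpi_ratio: float) -> float:
    """Cycles per transaction scale as (instructions per transaction) x (cycles per instruction)."""
    return path_length_ratio * cpi_ratio

CPI_RATIO = 0.5  # assumed boundary value; the slide gives no exact number

for path_ratio in (10, 15):  # range quoted on the slide
    print(f"path length x{path_ratio}: cost x{cost_ratio(path_ratio, CPI_RATIO):.1f}")
```

With a CPI ratio a little under 0.5, the product lands in roughly the 5-7 range quoted for computational cost.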
13
Observation 2:
“Small CPI for SSL”
A faster CPU core would not be very helpful in improving SSL performance so long as L1 is large enough to supply much of the code and data needed.
“Bulk data encryption/decryption algorithms highly sequential in nature”
A wider issue width would not help, but a longer pipeline would.
14
L1 Cache Characteristics
Separate instruction and data L1 caches: 16KB
Single unified L2 cache
15
Observation 1:
“L1 instruction miss ratios are very low in all cases. L1 data miss ratios are more significant.”
“The instruction miss ratio generally decreases with number of processors, but the data miss ratio goes up.”
“More processors allow a better sharing of code, but the coherency misses in data cache increase.”
16
Observation 2:
“30 byte file sizes: the miss ratios for both instructions and data are much lower in the SSL case than in the non-SSL case.”
“The data miss ratio retains the same behavior for all file sizes and processor configurations.”
“The frequent reuse of the data during the encryption and decryption process.”
“The instruction locality relating to handshaking process is very high.”
17
Observation 3:
“1 MB file sizes: the instruction miss ratio becomes very poor with the SSL traffic for bulk data transfers.”
“Low instruction locality in the bulk data transfer case.”
“Working set of instructions in the bulk transfer case does not fit within L1 cache.”
“Larger instruction L1 cache would help to improve bulk data encryption performance.”
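Why a working set that does not fit in L1 hurts so much can be illustrated with a toy cache model. The sketch below is not a model of the measured hardware: it assumes a fully associative LRU cache, 32-byte lines, and a loop that sweeps its working set sequentially. Only the 16 KB capacity comes from the slides, and `miss_ratio` is an illustrative name.

```python
from collections import OrderedDict

def miss_ratio(working_set_bytes, cache_bytes=16 * 1024, line_bytes=32, passes=50):
    """Sweep a working set repeatedly through a fully associative LRU cache."""
    capacity = cache_bytes // line_bytes
    cache = OrderedDict()            # line address -> None, ordered LRU to MRU
    misses = accesses = 0
    for _ in range(passes):
        for line in range(working_set_bytes // line_bytes):
            accesses += 1
            if line in cache:
                cache.move_to_end(line)        # hit: mark as most recently used
            else:
                misses += 1
                cache[line] = None
                if len(cache) > capacity:
                    cache.popitem(last=False)  # evict the least recently used line
    return misses / accesses

print("8 KB loop :", miss_ratio(8 * 1024))
print("24 KB loop:", miss_ratio(24 * 1024))
```

In this model a loop that fits in the cache misses only on its first pass, while a loop larger than the cache evicts each line before it is reused. Real set-associative caches and real instruction streams behave less extremely, but the direction of the effect is the same.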
18
L2 Cache Characteristics
19
Observation 1:
“High L2 miss ratios, especially for large size webpages (1MB sizes)”
“High degree of locking/contention in TCP processing.”
“Cache pollution because of TCP checksum.”
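The cache-pollution point follows from how a checksum works: it has to read every byte of the payload once, and that data is not reused afterwards. A minimal sketch of the 16-bit ones'-complement Internet checksum in the style of RFC 1071 is shown for illustration; it omits the TCP pseudo-header, and the function name is an assumption.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones'-complement checksum in the style of RFC 1071 (no TCP pseudo-header)."""
    if len(data) % 2:
        data += b"\x00"                      # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):         # every payload byte is read exactly once
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:                       # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF

payload = bytes.fromhex("0001f203f4f5f6f7")
print(hex(internet_checksum(payload)))
```

For a 1 MB response, this single pass streams far more data through the cache hierarchy than the 512 KB to 2 MB L2 sizes studied can retain alongside other working data.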
20
Encryption Dominated (1MB files) & SSL Handshake Dominated (30 byte files)
21
Observation 1:
“1MB case: SSL bulk data transfer shows very good L2 miss ratios.”
“The heavy computational workload of SSL helps in reducing the L2 cache miss ratio.”
“SSL processing itself has certain features that would lead to high L2 cache miss ratios.”
“30 byte case: SSL Handshake shows very high L2 miss ratios.”
22
Branch and Prediction Behavior
23
Observation 1:
“Branch frequency with SSL is about 30%-50% of that without SSL.”
“There are fewer control dependencies in the SSL-based transactions.”
“Low branch frequency in SSL encourages high degree of pipelining in the processor architecture.”
“Lower control dependency is another reason for high hit rate in L1 and low CPI in case of SSL.”
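The link between branch frequency and pipeline depth can be made concrete with the usual first-order estimate: stall cycles per instruction are branch frequency times misprediction rate times flush penalty. In the sketch below only the 30%-50% ratio comes from the slide; the baseline branch frequency, the misprediction rate, and the penalties are assumed values chosen for illustration.

```python
def branch_stall_cpi(branch_freq: float, mispredict_rate: float, penalty_cycles: int) -> float:
    """Average stall cycles per instruction caused by mispredicted branches."""
    return branch_freq * mispredict_rate * penalty_cycles

NON_SSL_BRANCH_FREQ = 0.20  # assumed: one branch per five instructions
SSL_RELATIVE_FREQ = 0.4     # within the 30%-50% range quoted on the slide
MISPREDICT_RATE = 0.05      # assumed, held equal for both workloads

for penalty in (10, 20):    # assumed flush cost; a deeper pipeline means a larger penalty
    non_ssl = branch_stall_cpi(NON_SSL_BRANCH_FREQ, MISPREDICT_RATE, penalty)
    ssl = branch_stall_cpi(NON_SSL_BRANCH_FREQ * SSL_RELATIVE_FREQ, MISPREDICT_RATE, penalty)
    print(f"penalty {penalty}: non-SSL +{non_ssl:.3f} CPI, SSL +{ssl:.3f} CPI")
```

Under these assumptions, doubling the flush penalty costs the SSL workload well under half the extra CPI it costs the non-SSL workload, which is the sense in which a low branch frequency tolerates a deeper pipeline.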
24
Observation 2:
“For 1P/2P configuration: the misprediction rate with SSL is lower.”
“For 4P configuration: the misprediction rate with SSL is always higher.”
“For 4P configuration: BTB is highly inefficient.”
“Better branch prediction algorithms can be investigated.”
“Avoid overly complex branch predictor for SSL transactions since the branch frequency is very low.”
25
Conclusion
SSL overhead increases the computational cost of transactions by a factor of 5-7.
SSL transactions do not benefit much from a larger L2 cache, but a larger L1 cache would be helpful.
Complex logic for handling control dependencies is not useful for SSL transactions, as the frequency of branches is very low.