Parallelizing Security Checks on Commodity Hardware E.B. Nightingale, D. Peek, P.M. Chen and J. Flinn U Michigan.

Presentation transcript:

Parallelizing Security Checks on Commodity Hardware
E.B. Nightingale, D. Peek, P.M. Chen and J. Flinn
University of Michigan

Overview
Introduction
Speculator
Design
Parallel lifeguards
Evaluation
Conclusion

Introduction
Security checkers (lifeguards) are too slow (~30X slowdown with TaintCheck)
Multicore systems are increasingly popular
Can we exploit idle cores to improve lifeguard performance?
Speck (Speculative Error ChecKing) parallelizes lifeguards to improve performance

Introduction (2)
Security checks are decoupled from application execution
Security checks are executed in parallel on separate cores
Speculator provides speculative execution and rollback

Speculator
OS-level support for speculative execution and rollback
Checkpoint process state before system call execution
Use buffering to hide side effects (e.g., I/O) of speculative execution
Block the process if its side effects cannot be hidden
Roll back to the checkpointed state if necessary
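
The checkpoint, buffer, and rollback steps on this slide can be illustrated with a toy model. This is a minimal sketch, not the Speculator kernel interface: the class `SpeculativeProcess` and all of its method names are hypothetical, and process state is reduced to a Python dict.

```python
import copy


class SpeculativeProcess:
    """Toy model of checkpoint / buffer / rollback (hypothetical, not Speculator's API)."""

    def __init__(self, state):
        self.state = state        # mutable "process state" (a dict in this sketch)
        self._checkpoint = None   # saved copy of the state while speculating
        self._buffered = []       # output withheld during speculation
        self.released = []        # output that has become externally visible

    def begin_speculation(self):
        # Checkpoint the state before a speculative operation starts.
        self._checkpoint = copy.deepcopy(self.state)

    def write(self, data):
        # While speculative, hide the side effect by buffering it.
        if self._checkpoint is not None:
            self._buffered.append(data)
        else:
            self.released.append(data)

    def commit(self):
        # Speculation succeeded: release buffered output and drop the checkpoint.
        self.released.extend(self._buffered)
        self._buffered.clear()
        self._checkpoint = None

    def rollback(self):
        # Speculation failed: restore the checkpoint and discard buffered output.
        if self._checkpoint is not None:
            self.state = self._checkpoint
        self._buffered.clear()
        self._checkpoint = None
```

In this model a failed check calls `rollback()`, so nothing written during the speculative interval ever reaches `released`; a passed check calls `commit()`.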

Speck Design
Fork instrumented clones of the monitored application to run on other cores
Security checks run on the instrumented clones
OS logging handles non-deterministic execution (e.g., signal delivery, system call results)
Speculator provides speculative execution and rollback of system calls
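
The idea of running checks on clones in parallel with the application can be sketched as follows. This is only an illustration under stated assumptions: the execution is modeled as a list of segments (called epochs here, a name not used on the slide), each segment is a list of event strings, `check_epoch` is a hypothetical lifeguard, and Python threads stand in for the forked clones on other cores.

```python
from concurrent.futures import ThreadPoolExecutor


def check_epoch(epoch):
    """Hypothetical lifeguard: an epoch passes if it contains no 'violation' event."""
    return all(event != "violation" for event in epoch)


def first_failing_epoch(epochs, workers=4):
    """Check all epochs concurrently; return the index of the earliest failure, or None."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in input order, regardless of completion order.
        results = list(pool.map(check_epoch, epochs))
    for index, ok in enumerate(results):
        if not ok:
            return index
    return None
```

In a design like the one on the slide, the index of the earliest failing segment would identify the checkpoint to roll back to. CPython threads do not run CPU-bound checks truly in parallel, so this shows the structure only, not the speedup.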

Design

Parallel Lifeguards
Process Memory Analysis
System Call Analysis
Taint Analysis

Parallel Process Memory Analysis
Security violations can be detected in memory
  – Decrypted virus image
  – Leaked data
Check each store location for a pattern
All checks are independent
Easy to parallelize
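
A per-store pattern check might look like the sketch below. The data layout is an assumption: each store is an `(address, data)` pair with `data` as `bytes`, and `find_pattern_in_stores` is a hypothetical name. A real lifeguard would also have to handle patterns that span several stores, which this sketch ignores.

```python
def find_pattern_in_stores(stores, pattern):
    """Return the addresses of stores whose written bytes contain `pattern`.

    stores  -- iterable of (address, data) pairs, data being a bytes object
    pattern -- bytes signature to look for (e.g. a virus signature or a secret)
    """
    # Each store is examined on its own, which is why the checks are independent.
    return [address for address, data in stores if pattern in data]
```

Because no store depends on the result for another store, the list can be split across workers in any way and the per-worker results concatenated.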

Parallel System Call Analysis
Analyze program behavior using system calls
  – Check system call parameters
  – Check system call history
Checks are independent
Easy to parallelize
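
The two kinds of check named on this slide, on parameters and on history, can be sketched over a logged trace. The policy here is invented for illustration: opening a path under `/etc/` is flagged (parameter check), and an `execve` within a few calls after a `recv` is flagged (history check). The trace format and the function name are assumptions as well.

```python
def check_syscalls(trace, forbidden_prefix="/etc/", window=3):
    """Return the indices of flagged calls in `trace`, a list of (name, args) pairs."""
    flagged = []
    for i, (name, args) in enumerate(trace):
        # Parameter check: inspect the arguments of this call alone.
        if name == "open" and args and str(args[0]).startswith(forbidden_prefix):
            flagged.append(i)
        # History check: inspect the names of the preceding `window` calls.
        recent = [prev_name for prev_name, _ in trace[max(0, i - window):i]]
        if name == "execve" and "recv" in recent:
            flagged.append(i)
    return flagged
```

Each call is judged from the recorded trace only, so different portions of the trace can be handed to different workers as long as each worker also sees the preceding `window` entries.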

Parallel Taint Analysis
Detect critical use of malicious input
  – Track propagation of input
Pin-based sequential TaintCheck incurs an 18X slowdown
Checking is inherently sequential and hard to parallelize
Log-based approach to parallelization
  – Parallel log generation by instrumented clones (workers)
  – Sequential log processing by the master
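
Why taint checking is inherently sequential is easier to see in a small model. The sketch below assumes a simplified record format, `(op, dst, srcs)`, with three invented operations: `input` taints `dst`, `move` sets the taint of `dst` from its sources, and `jump` is a critical use of `dst`. It is not TaintCheck's actual instruction model.

```python
def run_taint_analysis(records):
    """Return the indices of 'jump' records whose target location is tainted."""
    tainted = set()      # locations currently holding input-derived data
    violations = []
    for i, (op, dst, srcs) in enumerate(records):
        if op == "input":
            tainted.add(dst)
        elif op == "move":
            # The taint of dst depends on the current taint of every source,
            # which in turn depends on all earlier records.
            if any(src in tainted for src in srcs):
                tainted.add(dst)
            else:
                tainted.discard(dst)
        elif op == "jump" and dst in tainted:
            violations.append(i)
    return violations
```

The `tainted` set carries state from every record to the next, so a record cannot be judged without first processing everything before it. That dependency is what the log-based split works around: log generation is parallel, while the pass over the log stays sequential.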

Parallel Taint Analysis: Workers
Generate log segments from replayed execution
Eliminate redundant log records using a mark-and-sweep algorithm (6X compression ratio)
Send compressed segments to the master for processing
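
The slide does not spell out the mark-and-sweep rule, so the following is one plausible reading rather than the authors' algorithm. It assumes records of the form `(op, dst, srcs)` with ops `input`, `move`, and `jump`, and treats a write as redundant when the same location is overwritten later in the segment without being read in between. All names are hypothetical.

```python
def reduce_log(records):
    """Drop writes that are overwritten before any later record reads them."""
    keep = [False] * len(records)
    overwritten = set()   # locations rewritten later with no read in between

    # Mark phase: walk the segment from newest to oldest record.
    for i in range(len(records) - 1, -1, -1):
        op, dst, srcs = records[i]
        if op == "jump":
            keep[i] = True              # checks are always kept
            overwritten.discard(dst)    # the check reads dst
        elif dst not in overwritten:
            keep[i] = True              # this write is still observable
            overwritten.add(dst)
            for src in srcs:
                overwritten.discard(src)  # a kept record reads its sources

    # Sweep phase: emit only the marked records, in original order.
    return [record for record, kept in zip(records, keep) if kept]
```

The last write to every location survives, so the state at the end of the segment is unchanged for whoever processes the next segment.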

Parallel Taint Analysis: Master
Maintains metadata
Processes segments in log order
  – Detects violations
  – Updates metadata
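
Workers may finish out of order, yet the master has to consume segments in log order. One common way to arrange that is a reorder buffer keyed by sequence number, sketched below. The sequence numbering, the function name, and the dict buffer are assumptions of this sketch, not details from the slides.

```python
def process_in_log_order(arrivals):
    """Consume (sequence_number, segment) pairs in sequence order.

    arrivals -- iterable of pairs in the order the workers delivered them
    Returns the sequence numbers in the order they were processed.
    """
    pending = {}      # segments that arrived early, keyed by sequence number
    next_seq = 0      # the next sequence number the master may process
    processed = []
    for seq, segment in arrivals:
        pending[seq] = segment
        # Drain every segment that is now contiguous with what was processed.
        while next_seq in pending:
            pending.pop(next_seq)   # a real master would check and update metadata here
            processed.append(next_seq)
            next_seq += 1
    return processed
```

A segment that arrives early simply waits in `pending` until all of its predecessors have been handled.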

Evaluation
8-core (quad dual-core) Intel Xeon
  – 2.66 GHz, 4 GB RAM, 8 MB L2, 1.33 GHz bus
  – Linux 2.6 (64-bit) kernel
4-core (2 dual-core) Intel Xeon
  – 2.8 GHz, 3 GB RAM, 4 MB L2, 800 MHz bus
  – Linux 2.4 (32-bit) kernel

Benchmarks
Process memory analysis
  – Frames per second of mplayer playing Harry Potter trailer
System call analysis
  – Transactions per second (TPS) of Postmark benchmark
Taint analysis
  – Frames per second of mplayer playing Harry Potter trailer

Process Memory Analysis

System Call Analysis

Taint Analysis

Conclusion
Speck parallelizes security checks on commodity hardware
Pin-based lifeguards
OS-level support (Speculator) for speculative execution of system calls
Speedups with (4 workers, 8 workers)
  – Process memory analysis (4X, 7.5X)
  – System call analysis (3.3X, 2.8X)
  – Taint analysis (1.6X, 2X)
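
The speedups above can be compared more easily as parallel efficiency, defined as speedup divided by the number of workers. The helper below is a small worked calculation over the figures on this slide; the dictionary layout and the function name are illustrative choices.

```python
# Speedups from the Conclusion slide, keyed by lifeguard and worker count.
SPEEDUPS = {
    "process memory analysis": {4: 4.0, 8: 7.5},
    "system call analysis": {4: 3.3, 8: 2.8},
    "taint analysis": {4: 1.6, 8: 2.0},
}


def efficiency(speedup, workers):
    """Parallel efficiency: the fraction of ideal linear speedup achieved."""
    return speedup / workers


if __name__ == "__main__":
    for lifeguard, by_workers in SPEEDUPS.items():
        for workers, speedup in sorted(by_workers.items()):
            print(f"{lifeguard}, {workers} workers: {efficiency(speedup, workers):.2f}")
```

Process memory analysis stays close to linear scaling in both configurations, while taint analysis gains comparatively little from the extra workers, in line with the sequential log processing done by the master.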