Published by Albert Conley. Modified over 9 years ago.
RAMP Jan’08
Raksha & Atlas: Prototyping & Emulation at Stanford
Christos Kozyrakis
Work done by S. Wee, N. Njoroge, M. Dalton, H. Kannan
Computer Systems Laboratory, Stanford University
Outline
- Raksha: prototyping security architectures
  - Raksha goals
  - Generations of Raksha prototypes
  - Experience & lessons
- Atlas: emulating transactional memory architectures
  - Atlas goals
  - Architecture overview
  - New programmability features
  - Experience & lessons
Raksha Goals
Architectural support for software security
1. Protect existing software from attacks
   - Prevent buffer overflows, SQL injections, …
   - Based on dynamic information flow tracking (DIFT)
2. Reduce trusted code base (TCB) for new software
   - Simplify design & verification of security guarantees
   - Using word-granularity protection on physical memory
Robust, flexible, practical, end-to-end, fast
Raksha Architecture, Version 1
[Figure: processor pipeline (PC, I-Cache, Decode, RegFile, ALU, D-Cache, Traps, WB) extended with Policy Decode, Tag ALU, and Tag Check]
- Modified Sparc V8 processor (Leon)
- 4 programmable security policies using 4 bits/word
- User-level handling of security exceptions
- +7% logic, +0% clock cycle time over base design
- Full Linux distribution with > 120 software packages
- 1st DIFT architecture to detect high-level attacks on binaries
- Have shared this design with 3 other institutions so far…
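As a rough illustration of the DIFT idea behind this design, the sketch below models a register file in which every word carries a 4-bit tag, tags propagate through an ALU operation, and a check fires before an indirect jump. The names (`TaggedMachine`, `TAINT`, `SecurityException`) and the single policy shown are assumptions made for illustration; they are not Raksha's actual policies or interfaces.

```python
TAINT = 0x1      # assumption: bit 0 of the 4-bit tag marks untrusted data
TAG_MASK = 0xF   # 4 tag bits per word, as stated on the slide


class SecurityException(Exception):
    """Raised when a tag check fails (stands in for a security trap)."""


class TaggedMachine:
    """Toy register file where each word is stored as (value, tag)."""

    def __init__(self):
        self.regs = {}

    def load_const(self, reg, value):
        # Constants from the program itself are trusted: tag = 0.
        self.regs[reg] = (value & 0xFFFFFFFF, 0)

    def load_input(self, reg, value):
        # Data arriving from an untrusted source is tagged as tainted.
        self.regs[reg] = (value & 0xFFFFFFFF, TAINT)

    def add(self, dst, src_a, src_b):
        # Propagation rule (hypothetical): result tag is the OR of source tags.
        val_a, tag_a = self.regs[src_a]
        val_b, tag_b = self.regs[src_b]
        self.regs[dst] = ((val_a + val_b) & 0xFFFFFFFF, (tag_a | tag_b) & TAG_MASK)

    def jump_indirect(self, reg):
        # Check rule (hypothetical): never jump to a tainted address.
        value, tag = self.regs[reg]
        if tag & TAINT:
            raise SecurityException(f"tainted jump target in {reg}")
        return value


machine = TaggedMachine()
machine.load_const("r1", 0x1000)   # trusted base address
machine.load_input("r2", 8)        # attacker-controlled offset
machine.add("r3", "r1", "r2")      # r3 inherits the taint bit from r2
```

Calling `machine.jump_indirect("r3")` after this sequence raises `SecurityException`, while `machine.jump_indirect("r1")` returns the untainted address.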
Raksha Architecture, Version 2
- Small off-core coprocessor for all DIFT functionality + state
  - Can be reused across multiple chips
- Requires minimal changes to main processor core
  - <1% for our Sparc V8 processor
- Same security features as original architecture
- 8% performance overhead for SpecInt2000
[Figure: Processor Core (I-Cache, D-Cache, ROB, WB) alongside a DIFT Coprocessor (Policy Decode, Tag ALU, Tag Check, Tag Cache, Tag RF) and an L2 Cache; labeled signals: PC, Inst, Address; Security exception]
Raksha Architecture, Version 3 (Loki)
- Supports fine-grain permission check on physical memory
  - All words associated with a 32-bit tag
  - Permission table provides access rights for different tags
- Trusted SW specifies permissions; HW enforces them
  - Independently from OS; checks on device accesses as well
- Reduces TCB of a full OS down to 5KLOC
  - Invariant: malicious user/kernel code cannot access data without permission
  - Virtual memory & all device drivers outside of the TCB
[Figure: pipeline (PC, I-TLB, I-Cache, Decode, RegFile, ALU, D-TLB, D-Cache, Traps, WB) with P-caches and a permission Check]
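The tag-plus-permission-table scheme can be sketched in a few lines. This is a minimal model, assuming a rights bitmask (`READ`, `WRITE`, `EXEC`), a default tag of 0 for untagged words, and a class name `TaggedMemory`; none of these details come from the slides, which state only that each word has a 32-bit tag and that a table maps tags to access rights.

```python
READ, WRITE, EXEC = 1, 2, 4   # assumption: rights encoded as a bitmask


class TaggedMemory:
    """Toy model of word-granularity permission checks on physical memory."""

    def __init__(self):
        self.word_tag = {}     # physical word address -> 32-bit tag
        self.permissions = {}  # tag -> rights bitmask for the running code

    def set_tag(self, addr, tag):
        # Done by trusted software only.
        self.word_tag[addr] = tag & 0xFFFFFFFF

    def set_permissions(self, tag, rights):
        # Done by trusted software only.
        self.permissions[tag] = rights

    def allowed(self, addr, right):
        # What the hardware check would do on every access:
        # look up the word's tag, then the rights granted for that tag.
        tag = self.word_tag.get(addr, 0)   # assumption: untagged words use tag 0
        return bool(self.permissions.get(tag, 0) & right)


mem = TaggedMemory()
mem.set_tag(0x8000, 42)            # hypothetical tag for a sensitive word
mem.set_permissions(42, READ)      # current code may read but not write it
```

With this setup `mem.allowed(0x8000, READ)` is true and `mem.allowed(0x8000, WRITE)` is false, regardless of what the OS page tables would say, which is the point of enforcing the check on physical memory.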
Experience & Lessons
- HW: a stable starting point is critical
  - Despite deficiencies, Leon has been a reasonable base
  - Good compromise of size, performance, flexibility, support
  - Even for ISA-level research
  - Can we match this with upcoming RAMP models?
- SW: full system is important (full OS + devices)
  - Enables experimentation with wide range of apps
  - Increases credibility of results
  - What is the OS story for RAMP models?
- System: need low-cost board option
  - Makes it easier to attract collaborators & disseminate design
  - What is the replacement plan for XUPv5?
Atlas Goals
- Fast: at-speed experiments with hardware TM
  - ~100x faster than simulator
- Comfortable: full-system environment
  - Full Linux OS
  - Integration with standard debugging tools
- Easy-to-use: rich support for programmability
  - Automatic detection of performance bottlenecks
  - Deterministic replay
  - Automatic detection of atomicity bugs
ATLAS Hardware Architecture
- 9-way CMP with hardware support for TM
  - TM support builds upon private caches & coherence protocol
- One processor dedicated for system code
- Uses hard-core PowerPC cores in user & control FPGAs in BEE2
[Figure: eight TCC PPC processors (TCC PPC 0 to TCC PPC 7) and one I/O Linux PPC, connected through a User Switch and a Control Switch to Main Memory]
ATLAS Software Architecture
[Figure: software stack, top to bottom: Application (OpenMP+TM); TM API and ATLAS Profiler; ATLAS Runtime System; Linux OS; ATLAS HW on BEE2]
- High-level application development
  - OpenMP + TM, (Java + TM), …
- High-level application debugging
  - GDB-based for common & new features (e.g., infinite watchpoints)
Deterministic Replay with ReplayT
- A critical tool for multiprocessor debugging
  - Small system variations can mask bugs
- ReplayT: record & replay transaction commit order
  - Sufficient for TCC’s “all transactions, all the time” execution model
  - Serializable commit order captures all thread interactions
  - Minimal runtime & space overhead (1 byte/transaction)
[Figure: timelines of transactions T0, T1, T2 showing computation, arbitration, commit, and abort. Logging phase: at commit time the commit order (T0, T1, T2) is written to LOG alongside the write-set. Replay phase: the commit protocol replays the logged commit order.]
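The record-and-replay idea can be sketched as a toy commit arbiter. This is only a model under stated assumptions: the log holds one byte per committed transaction (the committing thread's id), and the function names `record` and `replay` and the `arrivals` input are hypothetical, not the ATLAS runtime's interface.

```python
def record(arrivals):
    """Logging phase: grant commits in arrival order and log one byte each.

    `arrivals` is the sequence of thread ids whose commit requests reach
    the arbiter, in the order they happen to arrive on this run.
    """
    log = bytearray()
    for thread_id in arrivals:
        log.append(thread_id)   # 1 byte per committed transaction
    return bytes(log)


def replay(arrivals, log):
    """Replay phase: grant commits only in the logged order.

    Requests that arrive early wait until it is their turn, so the
    resulting commit order matches the log even if timing differs.
    """
    waiting = []     # threads that requested a commit but must stall
    committed = []   # commit order actually produced during replay
    position = 0     # index of the next log entry to honor
    for thread_id in arrivals:
        waiting.append(thread_id)
        # Grant every waiting request that matches the next log entries.
        while position < len(log) and log[position] in waiting:
            waiting.remove(log[position])
            committed.append(log[position])
            position += 1
    return committed


log = record([0, 2, 1, 0])               # order observed on the original run
replayed = replay([1, 0, 0, 2], log)     # same requests, different timing
```

Here `replayed` equals `[0, 2, 1, 0]`: thread 1's early request stalls until the logged position for thread 1 is reached, which is why the commit order alone is enough to reproduce thread interactions under the execution model described above.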
ReplayT Runtime Overhead (logging phase)
- Average slowdown is 1.05%
- Can continuously log on production runs
ReplayT Extensions
- Unique replay
  - Problem: maximize usefulness of test runs
  - Approach: shuffle commit order to generate unique scenarios
- Replay with monitoring code
  - Problem: replay accuracy after recompilation
  - Approach: faithfully repeat commit order if binary changes
  - E.g., printf statements inserted for monitoring purposes
- Cross-platform replay
  - Problem: debugging on multiple platforms
  - Approach: support for replaying log across platforms & ISAs
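One way to picture unique replay is as a generator of distinct commit orders derived from a recorded log. The sketch below assumes the log is a sequence of thread ids, so any permutation still commits each thread's transactions in program order; the function name `unique_orders` and its parameters are hypothetical, and the slides do not say how ATLAS actually picks its shuffles.

```python
import random


def unique_orders(log, count, seed=0, max_attempts=1000):
    """Return up to `count` distinct commit orders that differ from `log`.

    Each candidate is a permutation of the logged thread ids, so every
    thread still commits the same number of transactions.
    """
    rng = random.Random(seed)     # seeded so test scenarios are repeatable
    seen = {tuple(log)}           # never hand back the original order
    orders = []
    attempts = 0
    while len(orders) < count and attempts < max_attempts:
        attempts += 1
        candidate = list(log)
        rng.shuffle(candidate)
        key = tuple(candidate)
        if key not in seen:       # skip scenarios that were already generated
            seen.add(key)
            orders.append(candidate)
    return orders


scenarios = unique_orders([0, 0, 1, 2], count=3)
```

Each entry of `scenarios` can then be fed to the replay mechanism as if it were a recorded log, so every test run exercises a commit order not seen before.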
Atomicity Bug Detection
- Problem: user breaks an atomic task into two transactions
  - Hard to pinpoint problem even with replay
- The AVIO proposal [Lu et al. @ ASPLOS’06]
  - Unserializable access interleavings are likely bugs
  - Whitelist unserializable interleavings from correct runs
  - Performed during application testing
- AVIO challenges
  - Long & intrusive data collection phase
  - Long analysis phase
  - Corner cases (false positives & false negatives)
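The notion of an unserializable interleaving can be made concrete with a small checker. This sketch follows the general AVIO idea of examining two consecutive accesses by one thread with a remote access in between; the function name `find_violations` and the input format are assumptions, and the whitelisting step from correct runs is left out.

```python
# (local access, interleaved remote access, next local access) patterns
# that cannot be reordered into an equivalent serial execution.
UNSERIALIZABLE = {
    ("R", "W", "R"),   # two local reads see different values
    ("W", "W", "R"),   # local read does not see the local write
    ("W", "R", "W"),   # remote read sees an intermediate value
    ("R", "W", "W"),   # local write is based on a stale read
}


def find_violations(accesses):
    """Scan accesses to ONE address, given in global order.

    `accesses` is a list of (thread_id, op) with op in {"R", "W"}.
    Returns (prev_index, remote_index, curr_index) triples that match
    an unserializable pattern.
    """
    violations = []
    last_access = {}   # thread id -> index of its most recent access
    for index, (thread_id, op) in enumerate(accesses):
        if thread_id in last_access:
            prev = last_access[thread_id]
            prev_op = accesses[prev][1]
            # Everything strictly between prev and index is remote,
            # because prev is this thread's most recent access.
            for remote in range(prev + 1, index):
                remote_op = accesses[remote][1]
                if (prev_op, remote_op, op) in UNSERIALIZABLE:
                    violations.append((prev, remote, index))
        last_access[thread_id] = index
    return violations


# Thread 0 reads a counter in one transaction and writes it back in a
# second one, while thread 1 updates the counter in between.
broken = find_violations([(0, "R"), (1, "W"), (0, "W")])
harmless = find_violations([(0, "R"), (1, "R"), (0, "W")])
```

`broken` contains the triple `(0, 1, 2)`, the lost-update pattern that appears when one atomic task is split into two transactions, while `harmless` is empty because a remote read between a local read and write can be serialized.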
Atomicity Bug Detection on ATLAS
Based on the general approach of AVIO, but:
- Fast & non-intrusive data collection
  - Single log for each address accessed in transaction
  - Log collected during deterministic replay
- Fast analysis
  - Interleavings examined at transaction granularity
- More accurate analysis
  - Eliminated false negatives due to intermediate writes
Experience & Lessons
- HW: need multiple grades of hardware modeling
  - Enable fast prototyping of new ISA & HW features
  - Even if timing or other details are not exactly accurate
  - Atlas experience: 40+ tutorial participants enjoyed using new features in a timing-“inaccurate” system
- SW: full system is important (full OS + devices)
  - Enables experimentation with wide range of apps
- System: need low-cost board option
  - Makes it easier to attract collaborators & disseminate design
- Scalability: need access to multiple boards
  - Students will not scale design until 2nd board arrives
- ISA: unfortunately, the key to more sharing of HW & SW models
  - Difficult to share across ISAs due to differences in specification, interfaces, etc.
  - Should RAMP simply adopt Sparc?
Questions?