EM64T ‘fast’ system-calls A look at the requirements for using Intel’s ‘syscall’ and ‘sysret’ instructions in 64-bit mode.

Slides:



Advertisements
Similar presentations
Intermediate x86 Part 3 Xeno Kovah – 2010 xkovah at gmail.
Advertisements

Processor Privilege-Levels
Using VMX within Linux We explore the feasibility of executing ROM-BIOS code within the Linux x86_64 kernel.
Computer Organization and Assembly Languages Yung-Yu Chuang
Intel MP.
Ring-transitions for EM64T How the CPU can accomplish transitions among its differing privilege-levels in 64-bit mode.
CSC 221 Computer Organization and Assembly Language
IA32 Paging Scheme Introduction to the Pentium’s support for “virtual” memory.
Task-Switching How the x86 processor assists with context-switching among multiple program-threads.
Interrupts in Protected-Mode Writing a protected-mode interrupt-service routine for the timer-tick interrupt.
Interrupts in Protected-Mode Writing a protected-mode interrupt-service routine for the timer-tick interrupt.
IA32 Paging Scheme Introduction to the Pentium’s support for “virtual” memory.
IA-32 Processor Architecture
Interrupts in Protected-Mode Writing a protected-mode interrupt-service routine for the timer-tick interrupt.
The various x86 ‘modes’ On understanding key differences among the processor’s several execution-architectures.
Processor Privilege-Levels How the x86 processor accomplishes transitions among its four distinct privilege-levels.
Venturing into protected-mode A first look at the CPU registers and instructions which provide the essential supporting infrastructure.
1 Hardware and Software Architecture Chapter 2 n The Intel Processor Architecture n History of PC Memory Usage (Real Mode)
X86 segmentation, page tables, and interrupts 3/17/08 Frans Kaashoek MIT
Model-Specific Registers
Memory Management (II)
Task-Switching How the x86 processor assists with context-switching among multiple program-threads.
CE6105 Linux 作業系統 Linux Operating System 許 富 皓. Chapter 2 Memory Addressing.
IA-32 Architecture COE 205 Computer Organization and Assembly Language Dr. Aiman El-Maleh College of Computer Sciences and Engineering King Fahd University.
Setup for VM launch Using ‘vmxwrite’ and ‘vmxread’ for access to state-information in a Virtual Machine Control Structure.
Venturing into 64-bit mode Examining the steps needed to take the processor into IA-32e mode -- and then back out again.
Venturing into protected-mode
Model-Specific Registers A look at Intel’s scheme for introducing new CPU features.
CS2422 Assembly Language & System Programming September 22, 2005.
Special segment-registers
A ‘protected-mode’ exploration A look at the steps needed to build segment-descriptors for displaying a message while in protected-mode.
Microprocessor Systems Design I Instructor: Dr. Michael Geiger Fall 2012 Lecture 15: Protected mode intro.
Interrupts in Protected-Mode Writing a protected-mode interrupt-service routine for the timer-tick interrupt.
UNIT 2 Memory Management Unit and Segment Description and Paging
Lecture 2. General-Purpose (GP) Computer Systems Prof. Taeweon Suh Computer Science Education Korea University ECM586 Special Topics in Embedded Systems.
An Introduction to 8086 Microprocessor.
Assembly Language for Intel-Based Computers, 4 th Edition Chapter 2: IA-32 Processor Architecture (c) Pearson Education, All rights reserved. You.
80386DX.
Intel MP (32-bit microprocessor) Designed to overcome the limits of its predecessor while maintaining the software compatibility with the.
6.828: PC hardware and x86 Frans Kaashoek
Memory Addressing in Linux  Logical Address machine language instruction location  Linear address (virtual address) a single 32 but unsigned integer.
Microprocessor system architectures – IA32 segmentation Jakub Yaghob.
The Pentium Processor.
The Pentium Processor Chapter 3 S. Dandamudi To be used with S. Dandamudi, “Introduction to Assembly Language Programming,” Second Edition, Springer,
The Pentium Processor Chapter 3 S. Dandamudi.
The Intel Microprocessors. Real Mode Memory Addressing Real mode, also called real address mode, is an operating mode of and later x86-compatible.
Fall 2012 Chapter 2: x86 Processor Architecture. Irvine, Kip R. Assembly Language for x86 Processors 6/e, Chapter Overview General Concepts IA-32.
1 Linux Operating System 許 富 皓. 2 Memory Addressing.
System Address Registers/Memory Management Registers Four memory management registers are used to specify the locations of data structures which control.
80386DX.
1 i386 Memory Management Professor Ching-Chi Hsu 1998 年 4 月.
Virtual 8086 Mode  The supports execution of one or more 8086, 8088, 80186, or programs in an protected-mode environment.  An 8086.
The Microprocessor and Its Architecture A Course in Microprocessor Electrical Engineering Department University of Indonesia.
Segment Descriptor Segments are areas of memory defined by a programmer and can be a code, data or stack segment. In segments need not be all the.
AMD K-6 Processor Evaluation. Registers AMD-K6 Registers General purpose registers Segment registers Floating point registers MMX registers EFLAGS register.
Introduction to Intel IA-32 and IA-64 Instruction Set Architectures.
Information Security - 2. Descriptor Tables Descriptors are stored in three tables: – Global descriptor table (GDT) Maintains a list of most segments.
Information Security - 2. CISC Vs RISC X86 is CISC while ARM is RISC CISC is Compiler’s heaven while RISC is Architecture’s heaven Orthogonal ISA in RISC.
Lecture 2. General-Purpose Computer Systems Prof. Taeweon Suh Computer Science Education Korea University ECM586 Special Topics in Embedded Systems.
Chapter Overview General Concepts IA-32 Processor Architecture
Microprocessor Architecture
Part of the Assembler Language Programmers Toolbox
143A: Principles of Operating Systems Lecture 7: System boot
Anton Burtsev February, 2017
Protection UQ: Explain the protection mechanism of X86 Intel family microprocessor(10 Marks)
x86 segmentation, page tables, and interrupts
Operating Modes UQ: State and explain the operating modes of X86 family of processors. Show the mode transition diagram highlighting important features.(10.
CS 301 Fall 2002 Computer Organization
Lecture 36 Syed Mansoor Sarwar
CS444/544 Operating Systems II Virtual Memory
Presentation transcript:

EM64T ‘fast’ system-calls A look at the requirements for using Intel’s ‘syscall’ and ‘sysret’ instructions in 64-bit mode

Privilege-levels Although the x86 processor supports four distinct privilege-levels in protected-mode, only two are actually used in the popular Windows and Linux operating-systems Ring3 (for applications) Ring0 (for the OS kernel) Ring2 (not used) Ring1 (not used)

Opportunity for optimization Just as the suppression of ‘segmentation’ in 64-bit memory-addressing has offered extra execution-speed and programming simplicity, there is opportunity for faster privilege-level transitions by eliminating references to system-tables in memory application program kernel services system-call ring3ring0 return

Sacrifice ‘flexibility’ for ‘speed’ The ‘syscall’ and ‘sysret’ instructions allow much faster ring-transitions during normal system-calls – by keeping all the required information in special CPU registers – but accepting some limitations: –Transitions are only between ring3 and ring0 –Only one ‘entry-point’ to all kernel-services –Some formerly ‘general-purpose’ registers would have to acquire a dedicated function

Layout of GDT descriptors Use of ‘syscall’ requires a pair of global descriptors to be adjacently placed: Use of ‘sysret’ requires a triple of global descriptors to be adjacently placed: GDTR 64-bit code DPL=0 ring0 data DPL=0 GDTR 16/32-bit code DPL=3 ring3 data DPL=3 64-bit code DPL=3

Model-Specific Registers Some MSRs must be suitably initialized: 0xC : IA32_MSR_EFER 0xC : IA32_MSR_STAR 0xC : IA32_MSR_LSTAR 0xC : IA32_MSR_CSTAR 0xC : IA32_MSR_FMASK The Intel processor must be executing in 64-bit mode to use ‘syscall’ and ‘sysret’

Extended Feature Enable Register This Model-Specific Register (MSR) was introduced in the AMD64 architecture and perpetuated by EM64T (for compatibility) SCESCE LMELME LMALMA NXENXE Legend: SCE = SysCall/sysret is Enabled (1=yes, 0=no) LME = Long-Mode is Enabled (1=yes, 0=no) LMA = Long-Mode is Active (1=yes, 0=no) NXE = Non-eXecutable pages Enabled (1=yes, 0=no) NOTE: The MSR address-index for EFER = 0xC , and this register is accessed using RDMSR or WRMSR instructions

The MSR_STAR register unused selector for 64-bit ring0 code-descriptor selector for ‘compatibility’ ring3 code-descriptor This selector is for the first in a pair of adjacently-placed GDT descriptors for the CS and SS registers, respectively, upon the transition from ring3 to ring0 This selector is for the first in a triple of adjacently-placed GDT descriptors for the CS and SS (or SS and CS) registers, respectively, upon the transition from ring0 to ring3 (depending on whether ‘sysret’ or ‘sysretq’ is used) for ‘syscall’for ‘sysret’

The MSR_LSTAR register Linear-address of the system-call entry-point 63 0 This is the 64-bit address which will go into the RIP register when the ‘syscall’ instruction is ececuted by ring3 code The former value from the RIP register (i.e., the ‘return-address’) will be saved in the RCX general-purpose register, to be used later by the ‘sysret’ instruction (so therefore it must be preserved)

The MSR_CSTAR register The function of this register is unknown 63 0 This register is observed to exist – Linux x86_64 writes a value into this register in fact – although current Intel documentation omits mention or explanation of this Model-Specific Register (We did find an obsolete Intel document online which referred to this register, but did not make clear its past purpose or function) It’s a mystery …

The MSR_FMASK register Reserved (unused) This register can be programmed by an Operating System with a bitmask that will be used by the processor to automatically ‘clear’ a specified selection of bits in the RFLAGS register when ‘syscall’ is executed (the former value of RFLAGS is saved in the general-purpose R11 register) Flags Mask

‘fastcall.s’ We created a demo-program that shows the use of ‘syscall’ and ‘sysret’, indicating what setup-steps are needed: Page-mapping tables (user-accessible frames) Global Descriptor Table layout Task-State Segment (needs ESP0 value) EFER (needs LME=1 and SCE=1) CR4 needs PAE=1, CR3 needs physical address of page-map level4 CR0 needs PE=1 and PG=1

Transitions in ‘fastcall.s’ real mode Ia-32e compatibility mode 64-bit mode (Ring3) 64-bit mode (Ring0) 64-bit mode (Ring3) real mode Ia-32e compatibility mode 64-bit mode (Ring0) syscall sysret lcall ljmp lretmov The main purpose of this demo-program is to illustrate the use of ‘syscall’ and ‘sysret’ (but we have to get to 64-bit mode to do it)

In-class exercise In our ‘fastcall.s’ demo-program there are two transitions from ring3 to ring0 (in one case via ‘syscall’ and in the other via ‘lcall’ through a call-gate Can you measure which of these is faster? (for example, by using the processor’s TimeStamp Counter, accessible with the ‘rdtsc’ instruction)

TimeStamp Counter 64-bit register automatically increments with every cpu clock-cycle 63 0 The ‘rdtsc’ instruction returns this register’s current value: rdtsc # EAX = least-significant 32-bits from TSC # EDX = most-significant 32-bits from TSC (This 64-bit register is initialized to zero at system-startup)