Published by Παρθενιά Παπακώστας. Modified over 5 years ago.
1
Computer Architecture and System Programming Laboratory
TA Session 12 x86-SSE text string processing instructions
2
X86-SSE Programming – Text Strings (SSE4.2)
An implicit-length text string uses a terminating End-Of-String (EOS) character. x86-SSE (SSE4.2) includes four SIMD text string instructions, PCMPESTRI, PCMPESTRM, PCMPISTRI, and PCMPISTRM, each of which can process a text string fragment of up to 128 bits (16 byte characters or 8 word characters). Suppose you are given a text string fragment and want to create a mask that indicates the positions of the uppercase characters within it. For the text string "Ab1cDE23f4gHi5J6", the uppercase characters occupy positions 0, 4, 5, 11, and 14, so each 1 in the desired mask marks one of those positions. The desired character range ('A' to 'Z') and the text string fragment are loaded into registers XMM1 and XMM2, respectively.
3
Resulting mask: 0x4831 = 0100100000110001b (bits 0, 4, 5, 11, and 14 are set, one for each uppercase character).
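The mask construction described above can be modelled in scalar C. This is only a sketch under stated assumptions, not the SSE4.2 instruction itself: `range_mask16` is a hypothetical helper name, and it imitates just the single-range, bit-mask case for byte characters.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model (assumption: byte characters, one range, bit-mask output).
   Bit i of the result is 1 when s[i] lies in [lo, hi].
   Scanning stops at the first '\0', as for an implicit-length string. */
uint16_t range_mask16(const unsigned char *s,
                      unsigned char lo, unsigned char hi)
{
    uint16_t mask = 0;
    for (size_t i = 0; i < 16 && s[i] != '\0'; i++) {
        if (s[i] >= lo && s[i] <= hi)
            mask |= (uint16_t)(1u << i);
    }
    return mask;
}
```

Calling `range_mask16` on "Ab1cDE23f4gHi5J6" with the range 'A' to 'Z' yields 0x4831, the value shown on the slide.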
4
Output format: when bit 6 of the immediate is set, the mask value is expanded to bytes.
Multiple character ranges: XMM1 contains two range pairs, one for uppercase letters and one for lowercase letters.
Embedded EOS: when the text string fragment includes an embedded EOS ('\0') character, ZF is set to 1 and the final mask value excludes matching range characters that follow the EOS.
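The three variations above (byte-expanded output, several range pairs, and an embedded EOS) can be combined in one scalar model. The sketch below is an illustration only; `ranges_byte_mask` is a hypothetical name, and the model assumes byte characters and a '\0'-terminated list of (low, high) pairs.

```c
#include <stddef.h>

/* Scalar model (assumption: byte characters).
   ranges: '\0'-terminated list of (low, high) pairs, at most 16 bytes.
   out[i] = 0xFF if s[i] falls in any range, 0x00 otherwise.
   Characters at or after the first EOS in s never match.
   Returns 1 if s contains an EOS within its 16 bytes (models ZF). */
int ranges_byte_mask(const unsigned char *ranges,
                     const unsigned char *s,
                     unsigned char *out)
{
    size_t nr = 0, len = 0;
    while (nr < 16 && ranges[nr] != '\0') nr++;
    while (len < 16 && s[len] != '\0') len++;

    for (size_t i = 0; i < 16; i++) {
        out[i] = 0x00;
        if (i >= len)
            continue;                       /* past EOS: forced to no match */
        for (size_t j = 0; j + 1 < nr; j += 2) {
            if (s[i] >= ranges[j] && s[i] <= ranges[j + 1]) {
                out[i] = 0xFF;
                break;
            }
        }
    }
    return len < 16;
}
```

With the ranges 'A'..'Z' and 'a'..'z', letters before the EOS produce 0xFF bytes, while letters after it produce 0x00.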
5
RFLAGS is set in a non-standard manner in order to supply the most relevant information:
CF flag: reset if IntRes2 is equal to zero, set otherwise
ZF flag: set if any byte/word of xmm2/mem128 is null, reset otherwise
SF flag: set if any byte/word of xmm1 is null, reset otherwise
OF flag: IntRes2[0]
AF flag: reset
PF flag: reset
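The flag rules listed above translate directly into a few comparisons. The following sketch models them for byte characters only; `StrFlags`, `has_null16`, and `pcmpistr_flags` are hypothetical names introduced for illustration, and AF and PF are omitted because they are always reset.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical container for the four informative flags. */
typedef struct {
    int cf, zf, sf, of;
} StrFlags;

/* Returns 1 if any of the 16 bytes is '\0'. */
int has_null16(const unsigned char *v)
{
    for (size_t i = 0; i < 16; i++)
        if (v[i] == '\0')
            return 1;
    return 0;
}

/* Scalar model of the flag results (assumption: byte characters). */
StrFlags pcmpistr_flags(uint16_t intres2,
                        const unsigned char *xmm1,
                        const unsigned char *xmm2)
{
    StrFlags f;
    f.cf = intres2 != 0;        /* CF: set unless IntRes2 is zero      */
    f.zf = has_null16(xmm2);    /* ZF: the string operand contains EOS */
    f.sf = has_null16(xmm1);    /* SF: the first operand contains EOS  */
    f.of = intres2 & 1;         /* OF: IntRes2[0]                      */
    return f;
}
```

For the uppercase example (IntRes2 = 0x4831, a two-byte range in xmm1, and a full 16-character fragment in xmm2) this model gives CF = 1, ZF = 0, SF = 1, OF = 1.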
6
section .data
    str:        db 'Ab1cDE23f4gHi5J6'
    AZ_mask:    db 'A', 'Z'                 ; one range pair: 'A'..'Z'
                times 14 db 0
    imm:        equ 01000100b               ; assumed value: unsigned bytes, ranges, mask expanded to bytes (bit 6)
    AZ2az_mask: times 16 db ('a' - 'A')     ; distance between upper and lower case
    result:     times 16 db 0
                db `\n\0`

extern printf

section .text
global main
main:
    enter 0, 0
    movdqu xmm1, [AZ_mask]
    movdqu xmm2, [str]
    pcmpistrm xmm1, xmm2, imm               ; xmm0 = 0xFF in every uppercase position
    movdqu xmm3, [AZ2az_mask]
    pand xmm0, xmm3                         ; keep ('a' - 'A') only in uppercase positions
    paddb xmm2, xmm0                        ; convert those characters to lowercase
    movdqu [result], xmm2
    mov rdi, result
    mov rax, 0
    call printf
    leave
    ret

MOVDQU xmm1, xmm2/m128: Move unaligned double quadword from xmm2/m128 to xmm1.
PADDB xmm1, xmm2/m128: Add packed byte integers from xmm2/m128 and xmm1.
PAND xmm1, xmm2/m128: Bitwise AND of xmm2/m128 and xmm1.
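The data flow of the program above (byte mask, then AND, then packed add) can be followed in a scalar C sketch. It is a model for reading the assembly, not a replacement for it; `lower16` is a hypothetical name and the loop body stands in for what the three SIMD instructions do to all 16 bytes at once.

```c
#include <stddef.h>

/* Scalar model of the pcmpistrm / pand / paddb sequence
   (assumption: byte characters, range 'A'..'Z'). */
void lower16(const unsigned char *in, unsigned char *out)
{
    int past_eos = 0;
    for (size_t i = 0; i < 16; i++) {
        if (in[i] == '\0')
            past_eos = 1;                   /* no matches at or after EOS */

        /* pcmpistrm with byte-expanded output: 0xFF for uppercase */
        unsigned char mask =
            (!past_eos && in[i] >= 'A' && in[i] <= 'Z') ? 0xFF : 0x00;

        /* pand keeps ('a' - 'A') where mask is 0xFF; paddb adds it */
        out[i] = (unsigned char)(in[i] + (mask & ('a' - 'A')));
    }
}
```

Applied to "Ab1cDE23f4gHi5J6", the model produces "ab1cde23f4ghi5j6": only the bytes selected by the mask are shifted by 'a' - 'A'.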
7
Equal any (imm[3:2] = 00). The result is a bit mask: 1 if the character belongs to the set, 0 if not.
pcmpistrm xmm1, xmm2, imm8
[Figure: xmm1 holds the set characters '2', 'a', 'b', 'k', '1' followed by '\0'; xmm2 holds the string characters 'a', 'C', 'k', '1' followed by '\0'; xmm0 receives the mask as 00/FF bytes.]

Equal each (imm[3:2] = 10). The result is a bit mask: 1 if the corresponding bytes are equal, 0 if not.
pcmpistrm xmm1, xmm2, imm8
[Figure: xmm1 holds 'a', 'b', 'k', '1' followed by '\0'; xmm2 holds 'a', 'C', 'k', '1' followed by '\0'; xmm0 receives the mask as 00/FF bytes.]
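The two aggregation modes above can be compared side by side in a scalar sketch. The function names are hypothetical and the model covers byte characters with bit-mask output only. One detail follows the instruction's documented handling of invalid elements: in equal each mode, positions where both strings have already ended compare as equal, so the upper bits of the mask are 1.

```c
#include <stddef.h>
#include <stdint.h>

/* Length of an implicit-length fragment, capped at 16 bytes. */
size_t frag_len16(const unsigned char *s)
{
    size_t n = 0;
    while (n < 16 && s[n] != '\0') n++;
    return n;
}

/* Equal any: bit i = 1 if s[i] is one of the characters in set. */
uint16_t equal_any_mask16(const unsigned char *set, const unsigned char *s)
{
    size_t ns = frag_len16(set), len = frag_len16(s);
    uint16_t mask = 0;
    for (size_t i = 0; i < len; i++)
        for (size_t j = 0; j < ns; j++)
            if (s[i] == set[j]) {
                mask |= (uint16_t)(1u << i);
                break;
            }
    return mask;
}

/* Equal each: bit i = 1 if a[i] == b[i]; past the end of both
   strings the comparison is treated as true, past only one as false. */
uint16_t equal_each_mask16(const unsigned char *a, const unsigned char *b)
{
    size_t la = frag_len16(a), lb = frag_len16(b);
    uint16_t mask = 0;
    for (size_t i = 0; i < 16; i++) {
        int a_valid = i < la, b_valid = i < lb;
        int eq = (a_valid && b_valid) ? (a[i] == b[i])
                                      : (!a_valid && !b_valid);
        if (eq)
            mask |= (uint16_t)(1u << i);
    }
    return mask;
}
```

For the set "2abk1" and the string "aCk1", equal any gives 0x000D (positions 0, 2, and 3). For "abk1" against "aCk1", equal each gives 0xFFFD: only position 1 differs.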
8
Equal ordered (imm[3:2] = 11). The result is a bit mask: 1 if the substring is found at the corresponding position, 0 otherwise.
pcmpistrm xmm1, xmm2, imm8
[Figure: xmm1 holds the substring 'W', 'e' followed by '\0'; xmm2 holds the string to be searched; xmm0 receives the mask as 00/FF bytes.]
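Equal ordered mode amounts to a substring search inside one 16-byte fragment. The sketch below models it for byte characters; `equal_ordered_mask16` is a hypothetical name, and the haystack "WeWillWe" in the usage note is an invented example, not the string from the slide.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of equal ordered (assumption: byte characters).
   Bit i = 1 if needle occurs in hay starting at position i.
   Comparisons that would run past byte 15 are skipped, so a match
   that is cut off by the end of a full fragment still reports 1. */
uint16_t equal_ordered_mask16(const unsigned char *needle,
                              const unsigned char *hay)
{
    size_t nl = 0, hl = 0;
    while (nl < 16 && needle[nl] != '\0') nl++;
    while (hl < 16 && hay[hl] != '\0') hl++;

    uint16_t mask = 0;
    for (size_t i = 0; i < 16; i++) {
        int match = 1;
        for (size_t j = 0; j < nl && i + j < 16; j++) {
            /* haystack already ended, or characters differ */
            if (i + j >= hl || hay[i + j] != needle[j]) {
                match = 0;
                break;
            }
        }
        if (match)
            mask |= (uint16_t)(1u << i);
    }
    return mask;
}
```

Searching for "We" in "WeWillWe" sets bits 0 and 6, that is 0x0041. Position 2 starts with 'W' but is followed by 'i', so it does not match.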
9
Example: RCX = 16 (invalid index)
IntRes1 calculation: mask according to the given range.
IntRes2 calculation (negative polarity): the bitwise negation of IntRes1.
RCX = index of the least significant set bit in IntRes2.
No bit of IntRes2 is set here, so RCX = 16 (invalid index).
[Figure: bit index (15 down to 0) and bit value tables for IntRes1 and IntRes2, with RFLAGS and RCX.]
10
Example: RCX = 11 (index of the '\0' character, or length of the string)
IntRes1 calculation: mask according to the given range.
IntRes2 calculation (negative polarity): the bitwise negation of IntRes1.
RCX = index of the least significant set bit in IntRes2.
The least significant set bit of IntRes2 is bit 11, so RCX = 11 (index of the '\0' character, or length of the string).
[Figure: bit index (15 down to 0) and bit value tables for IntRes1 and IntRes2, with RFLAGS and RCX.]
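The IntRes1, IntRes2, and index steps from the two examples can be traced in scalar C. This is a sketch that assumes byte characters, a single range, negative polarity applied to all 16 bits, and least-significant-index output; `first_out_of_range16` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of the index computation.
   Returns the index of the first position that is NOT a valid
   character inside [lo, hi], or 16 if there is no such position. */
unsigned first_out_of_range16(const unsigned char *s,
                              unsigned char lo, unsigned char hi)
{
    /* IntRes1: mask according to the given range (stops at EOS) */
    uint16_t intres1 = 0;
    for (size_t i = 0; i < 16 && s[i] != '\0'; i++)
        if (s[i] >= lo && s[i] <= hi)
            intres1 |= (uint16_t)(1u << i);

    /* IntRes2: negative polarity negates every bit of IntRes1 */
    uint16_t intres2 = (uint16_t)~intres1;

    /* index of the least significant set bit, 16 if none is set */
    for (unsigned idx = 0; idx < 16; idx++)
        if ((intres2 >> idx) & 1u)
            return idx;
    return 16;
}
```

With the range 0x01..0xFF, the full fragment "Ab1cDE23f4gHi5J6" returns 16, and the fragment "Ab1cDE23f4g" followed by '\0' returns 11, matching the two examples.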
11
12
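section .data
    str:      db 'Ab1cDE23f4gHi5J6'
              db `Ab1cDE23f4g\0`
    EOS_mask: db 0x1, 0xFF                  ; one range pair: every non-null byte
              times 14 db 0
    imm:      equ 00010100b                 ; assumed value: unsigned bytes, ranges, negative polarity, least significant index

section .text
global strlen
strlen:
    enter 0, 0
    xor rax, rax
    xor rcx, rcx
    movdqu xmm1, [EOS_mask]
.loop:
    add rax, rcx                            ; skip the fragment scanned in the previous cycle
    pcmpistri xmm1, [str+rax], imm          ; RCX = index of EOS in this fragment, or 16
    jnz .loop                               ; ZF = 0: no EOS in this fragment
    add rax, rcx                            ; add the offset of EOS within the last fragment
    leave
    ret

[Figures: RFLAGS and RCX after the first loop cycle and after the second loop cycle.]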
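The loop structure of the strlen routine can be mirrored in scalar C: scan 16 bytes at a time, accumulate the per-fragment index, and stop once a fragment contains the EOS. The sketch below is a model for following the control flow; `eos_index16` and `strlen_blocks` are hypothetical names, and `eos_index16` stands in for the index that PCMPISTRI returns.

```c
#include <stddef.h>

/* Index of the first '\0' within the next 16 bytes, or 16 if none.
   Never reads past the EOS. */
unsigned eos_index16(const unsigned char *p)
{
    unsigned i = 0;
    while (i < 16 && p[i] != '\0')
        i++;
    return i;
}

/* Fragment-by-fragment length computation. */
size_t strlen_blocks(const char *s)
{
    size_t total = 0;
    for (;;) {
        unsigned idx = eos_index16((const unsigned char *)s + total);
        total += idx;               /* 16 for a full fragment          */
        if (idx < 16)               /* EOS found (ZF = 1 in the model) */
            return total;
    }
}
```

For the 27-character string formed by "Ab1cDE23f4gHi5J6" followed by "Ab1cDE23f4g", the first cycle contributes 16 and the second contributes 11, so the result is 27.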