1
LUM final presentation
Chanit Giat, Rachel Stahl
Instructor: Artyom Borzin
2
PROXY CACHE ENGINE
The proxy cache engine gives hardware support to a server's OS in order to improve its service rate, and adds security features. The main memory of a network server is the fast storage device, where recently accessed data is kept. When a new request for data is received, the application must search the memory. If the data is found, the response is sent; otherwise the data must be read from a slower storage device (disk, tape) and then sent to the user.
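The lookup flow described on this slide can be sketched in software. This is only an illustrative model, not the engine's design: the class name `ProxyCache`, the dictionary standing in for the slow storage device, and the hit/miss counters are all assumptions made for the example.

```python
class ProxyCache:
    """Toy model of the lookup flow: main memory first, slow storage on a miss."""

    def __init__(self, slow_storage):
        self.memory = {}                  # fast store: recently accessed data
        self.slow_storage = slow_storage  # hypothetical stand-in for disk or tape
        self.hits = 0
        self.misses = 0

    def get(self, path):
        if path in self.memory:           # found in main memory: respond at once
            self.hits += 1
            return self.memory[path]
        self.misses += 1                  # otherwise read the slower device
        data = self.slow_storage[path]
        self.memory[path] = data          # keep it for later requests
        return data


disk = {"/index.html": b"<html>hello</html>"}
cache = ProxyCache(disk)
first = cache.get("/index.html")    # miss: read from the slow store
second = cache.get("/index.html")   # hit: served from main memory
```

The second request never touches the slow store, which is the case the engine is meant to make fast.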
3
PROXY CACHE ENGINE
The system stores the mapping information for all files in main memory and calculates the exact path to the required file if it is present in main memory. If it is not present, the system orders the operating system to bring it from the storage device and supplies the path to the free memory space.
The system holds 2 main databases:
A main memory, which holds up to 2Meg paths to the server's memory, and their aging parameters.
A bit map table, which allows faster memory management by holding the free-space image of the main memory.
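A bit map that holds the free-space image can be modelled as one bit per path slot. The sketch below is a software analogy under assumed conventions (bit set means slot in use, lowest free index is returned first); the class and method names are hypothetical and the real table lives in SRAM.

```python
class BitMap:
    """Free-space image: bit i is 1 when path slot i is in use."""

    def __init__(self, num_slots):
        self.num_slots = num_slots
        self.bits = 0

    def find_free(self):
        """Return the lowest free slot index, or None when every slot is used."""
        for i in range(self.num_slots):
            if not (self.bits >> i) & 1:
                return i
        return None

    def mark_used(self, i):
        self.bits |= 1 << i

    def mark_free(self, i):
        self.bits &= ~(1 << i)

    def count_free(self):
        return self.num_slots - bin(self.bits).count("1")


bm = BitMap(8)
bm.mark_used(0)
bm.mark_used(1)
slot = bm.find_free()        # slots 0 and 1 are taken
free_now = bm.count_free()
bm.mark_free(0)
slot_after = bm.find_free()  # slot 0 is available again
```

Scanning a compact bit image is cheaper than walking the path records themselves, which is the benefit the slide attributes to the table.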
4
Main functions:
Search: returns the path to the main memory, or a path to a free space in the memory.
Set attributes: sets the file's aging attributes, as supplied by the OS.
Delete: deletes a certain path from the memory.
Count free: returns the number of free path slots in the memory.
Init: initializes the machine.
(Age: when the number of records exceeds a specified number, the system cleans up some of them.)
SEARCH command packet fields (figure): Length, CID=1, ASIS, Site#, Data.
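The command set can be mirrored by a small software model. Everything here is an assumption for illustration: the class `PathTable`, the `(hit, index)` return shape of `search`, and the `{"ttl": 60}` attribute value are hypothetical, and the aging clean-up is left out.

```python
class PathTable:
    """Toy model of the Search / Set attributes / Delete / Count free / Init commands."""

    def __init__(self, num_slots):
        self.num_slots = num_slots
        self.init()

    def init(self):
        self.index_of = {}                       # path -> slot index
        self.attrs = {}                          # slot index -> aging attributes
        self.free = list(range(self.num_slots))  # free slot indices

    def search(self, path):
        """Return (hit, index): the stored slot on a hit, a fresh slot on a miss."""
        if path in self.index_of:
            return True, self.index_of[path]
        if not self.free:
            return False, None                   # table is full
        index = self.free.pop(0)
        self.index_of[path] = index
        return False, index

    def set_attributes(self, path, aging):
        self.attrs[self.index_of[path]] = aging

    def delete(self, path):
        index = self.index_of.pop(path)
        self.attrs.pop(index, None)
        self.free.append(index)

    def count_free(self):
        return len(self.free)


table = PathTable(4)
miss = table.search("/a")                 # first lookup allocates a slot
hit = table.search("/a")                  # second lookup finds it
table.set_attributes("/a", {"ttl": 60})   # hypothetical aging attribute
free_before = table.count_free()
table.delete("/a")
free_after = table.count_free()
```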
5
Previous uArchitecture
Block diagram: Local Bus Interface, Reg. file, Data Stream Controller, Input FIFO, Decoder, CRC unit, Database Manager (DBM), UTCAM, SRAM (Bit Map), Output FIFO.
6
uArchitecture changes:
Doubling the front-end of the machine, including: Input FIFO, Decoder, CRC unit.
Buffering between the decoders and the DBM with a FIFO.
The search for a free index in the Bit Map is now done in parallel with the rest of the command execution.
8
New uArchitecture
Block diagram (build 1): FrontEnd with Input FIFO, Decoder, CRC.
9
New uArchitecture
Block diagram (build 2): FrontEnd0 and FrontEnd1, each with Input FIFO, Decoder, CRC.
10
New uArchitecture
Block diagram (build 3): Double FrontEnd (FrontEnd0, FrontEnd1), DBM FIFO.
11
New uArchitecture
Block diagram (build 4): Local Bus Interface, Reg. file, Data Stream Controller, Double FrontEnd (FrontEnd0, FrontEnd1), DBM FIFO, Output FIFO.
12
New uArchitecture
Block diagram (build 5): Local Bus Interface, Reg. file, Data Stream Controller, Double FrontEnd (FrontEnd0, FrontEnd1), DBM FIFO, DBM, Output FIFO.
13
Data Flow
Block diagram: Local Bus Interface, Reg. file, Data Stream Controller, Double FrontEnd (FrontEnd0, FrontEnd1, each with Input FIFO, Decoder, CRC), DBM FIFO, DBM, Output FIFO.
16
Data stream ctrl
Block diagram: Local Bus Interface, Reg. file, Data Stream Controller, Input FIFO 0, Input FIFO 1, Output FIFO.
State machine: states FIFO 0 and FIFO 1; transition labels Sys_clr and !sot & lwr.
sot: start of transaction. lwr: specifies write/read from the system.
17
Sim: Data Stream ctrl
Waveform annotations: reading from register file (crc); data enters FIFO 0; data enters FIFO 1.
18
DBM FIFO
Block diagram: DBM FIFO feeding the DBM.
State machine: states WAIT ON GO0, DEC0, WAIT ON GO1, DEC1; transition labels go0 & !dbm_full, go1 & !dbm_full, fifo_wrdone, Sys_clr.
go0/1: FrontEnd0/1 (decoder0/1) is ready. dbm_full: DBM FIFO is full. fifo_wrdone: write to FIFO is done.
19
Sim: DBM FIFO
State encoding: 1 = wait on go0, 2 = DEC0, 4 = wait on go1, 8 = DEC1.
State machine: states WAIT ON GO0, DEC0, WAIT ON GO1, DEC1; transition labels go0 & !dbm_full, go1 & !dbm_full, fifo_wrdone, Sys_clr.
Waveform annotations: DBM FIFO samples data from decoder 0; DBM FIFO samples data from decoder 1.
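The DBM FIFO state machine can be sketched as a next-state function using the one-hot encoding from this slide. The order of the arcs (DEC0 leads to WAIT ON GO1, DEC1 leads back to WAIT ON GO0) is an assumption read off the diagram labels, and the reset is modelled as a plain boolean `reset` argument without committing to the polarity of Sys_clr.

```python
# One-hot state encoding as listed on the slide.
WAIT_ON_GO0, DEC0, WAIT_ON_GO1, DEC1 = 1, 2, 4, 8


def next_state(state, go0=False, go1=False, dbm_full=False,
               fifo_wrdone=False, reset=False):
    """Assumed transition function; the arc order is not confirmed by the source."""
    if reset:
        return WAIT_ON_GO0
    if state == WAIT_ON_GO0 and go0 and not dbm_full:
        return DEC0                 # sample data from decoder 0
    if state == DEC0 and fifo_wrdone:
        return WAIT_ON_GO1
    if state == WAIT_ON_GO1 and go1 and not dbm_full:
        return DEC1                 # sample data from decoder 1
    if state == DEC1 and fifo_wrdone:
        return WAIT_ON_GO0
    return state                    # no enabled arc: hold the state


state = WAIT_ON_GO0
trace = []
for inputs in ({"go0": True}, {"fifo_wrdone": True},
               {"go1": True}, {"fifo_wrdone": True}):
    state = next_state(state, **inputs)
    trace.append(state)

stalled = next_state(WAIT_ON_GO0, go0=True, dbm_full=True)
```

Under these assumptions the two decoders are served in strict alternation, and a full DBM FIFO holds the machine in its wait state.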
20
DBM
Block diagram: Double FrontEnd, DBM interface, DBM FIFO, issue logic, execution unit, REQ packet, packer, bit map unit.
Annotation: saves the last bad decoder status, which goes to the Output FIFO with the next successful command.
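The bad-status behaviour in the annotation can be illustrated with a small latch model. The class name, the status strings, and the order in which the two statuses are emitted are all hypothetical; the source only states that the last bad decoder status travels with the next successful command.

```python
class BadStatusLatch:
    """Holds the most recent bad decoder status until a command succeeds."""

    def __init__(self):
        self.pending = None

    def on_command(self, ok, status):
        """Return the statuses to push to the output FIFO for this command."""
        if not ok:
            self.pending = status    # remember only the latest failure
            return []
        out = [status]
        if self.pending is not None:
            out.append(self.pending)  # piggyback the saved bad status
            self.pending = None
        return out


latch = BadStatusLatch()
r1 = latch.on_command(False, "crc_err_a")
r2 = latch.on_command(False, "crc_err_b")   # overwrites the earlier failure
r3 = latch.on_command(True, "ok_1")
r4 = latch.on_command(True, "ok_2")
```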
21
Sim: bad decoder status
22
Register file
Previously, the user could read the system's current parameters from the register file: command id, CRC value, file's site, etc.
Since we have 2 pipes, the register file had to be changed:
Some registers contain data from both pipes.
For others, the pipe from which to read the parameters must be specified.
23
ADD - old
States: IDLE, FND_NINDX, ADD_NINDX, ACK_NINDX. Transition labels: (ad_en)&&(!ad_done), (bm_s4f_done), (ad_error), (ad_new_done), (bm_s4f_done).
Finding a new free index: ~40 clk cycles. Updating the bit map: ~10 clk cycles.
s4f - old
States: IDLE, FNEW, ACKN. Transition labels: !Sys_clr, Fnew_done, Ackn_done, Bm_s4f_new_ack, !Bm_s4f_new_ack.
24
ADD - new
States: IDLE, ADD_NINDX, ACK_NINDX. Transition labels: !Sys_clr, (ad_en)&&(!ad_done)&&(bm_index_valid), (ad_err), (ad_new_done), (bm_ack_rcvd).
The new index is found while the 'ADD' module is idle (which lasts for more than 50 cycles).
s4f - new
States: WT_FOR_ACK, FNEW, ACKN. Transition labels: !Sys_clr, Bm_index_valid, Ackn_done, Add_ack.
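The idea behind the new scheme, finding the next free index ahead of time so that ADD only has to consume it, can be sketched as a prefetcher. This is a software analogy with hypothetical names (`FreeIndexPrefetcher`, `take`); in hardware the search runs concurrently with ADD, which this sequential model only imitates by refilling immediately after each consumption.

```python
class FreeIndexPrefetcher:
    """Keeps one free index ready so the ADD step does not wait for a search."""

    def __init__(self, used, num_slots):
        self.used = set(used)
        self.num_slots = num_slots
        self.index = None
        self.index_valid = False
        self.find_new()              # done while ADD is still idle

    def find_new(self):
        for i in range(self.num_slots):
            if i not in self.used:
                self.index, self.index_valid = i, True
                return
        self.index, self.index_valid = None, False

    def take(self):
        """Called by ADD: consume the ready index, then look for the next one."""
        if not self.index_valid:
            raise RuntimeError("no free index ready")
        i = self.index
        self.used.add(i)             # record the consumed slot as used
        self.find_new()
        return i


finder = FreeIndexPrefetcher(used={0, 2}, num_slots=4)
a = finder.take()
b = finder.take()
ready_after = finder.index_valid
```

Because the search result is already valid when ADD starts, the roughly 40-cycle search and 10-cycle bit map update quoted for the old design no longer sit on the command's critical path.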
25
Sim: add, s4f
s4f state encoding: 0 = wait for ack, 1 = find new index, 2 = ack old index.
add state encoding: 1 = idle, 2 = add index, 4 = ack index.
29
Performance
The main function is the 'search' command:
Long path (up to 512 bytes) => long CRC calculation => long decoding stage.
Access to main memory => if the requested path is not found, a new record is added to the memory, which includes finding a new index and acknowledging the added record (at least 4 memory accesses).
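Why a long path means a long decoding stage can be illustrated by feeding a CRC one dword at a time and counting the steps. The source does not name the CRC polynomial or the datapath width, so `zlib.crc32` and the 4-byte step are stand-in assumptions chosen only to make the sketch runnable.

```python
import zlib


def crc_over_dwords(path: bytes):
    """Feed the path to a CRC four bytes at a time; return (crc, number of steps)."""
    crc = 0
    steps = 0
    for offset in range(0, len(path), 4):
        crc = zlib.crc32(path[offset:offset + 4], crc)  # running CRC update
        steps += 1
    return crc, steps


longest = b"a" * 512                 # maximum path length quoted on the slide
crc_long, steps_long = crc_over_dwords(longest)
crc_short, steps_short = crc_over_dwords(b"a" * 102)
```

The step count grows linearly with the path length, so a 512-byte path needs about five times as many CRC steps as a 102-byte one.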
30
Performance
2 input FIFOs: double the rate of receiving data from the OS.
2 decoders: allow decoding of 2 commands in parallel. Significant for several long 'search' commands in a row.
DBM FIFO: separates the decoding and execution of commands, enabling them to run in parallel.
31
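Performance
2 search commands, each with 102 bytes of path (on which the CRC is working). Simulation times, old architecture vs. new architecture:
Ads_n falls (first search): 628n (one value on the slide)
First dword in Input FIFO (is_usedw): old 719n, new 718n
End of decoding (crc_done): 6128n (one value on the slide)
dbm_fifo->fifo_input is ready: 6202n (one value on the slide)
Pck__en rises: old 9344n, new 8560n
Sot falls: old 9468, new 2318n
First dword in Input FIFO (is_usedw): 2380n (one value on the slide)
End of decoding (crc_done): old 14574n, new 7869
Pck__en rises: old 15408n, new 9486
Unlabeled figures at the end of the slide: 7942, 8625, 6064, 926.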
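The timings on this slide can be compared with a few subtractions. The sketch assumes that the single 628n start time (Ads_n falls) applies to both runs, that all values share one time unit, and that the two Pck__en times mark the completion of the first and second search; none of this is stated explicitly in the source.

```python
START = 628  # Ads_n falls (first search); assumed common to both runs

# Times at which Pck__en rises for the first and second search command.
old = {"first_done": 9344, "second_done": 15408}
new = {"first_done": 8560, "second_done": 9486}


def summarize(times):
    """Total time for both searches and the gap between the two completions."""
    return {
        "total": times["second_done"] - START,
        "gap": times["second_done"] - times["first_done"],
    }


old_summary = summarize(old)
new_summary = summarize(new)
speedup = old_summary["total"] / new_summary["total"]
```

The two gaps come out as 6064 and 926, which match two of the unlabeled figures on the slide. Under the stated assumptions the pair of searches finishes roughly 1.67 times faster on the new architecture.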
32
Performance
The search for a free index now executes in parallel with the other execution stages of a command. This saves ~50 clock cycles per 'search' command, which usually takes ~400-1000 cycles.
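As a quick check of what the ~50-cycle saving means relative to the quoted command length, the fraction can be computed for both ends of the ~400-1000 cycle range. The helper name `saving_fraction` is hypothetical and the inputs are the approximate figures from the slide.

```python
def saving_fraction(total_cycles, saved_cycles=50):
    """Share of a command's cycles removed by the parallel free-index search."""
    return saved_cycles / total_cycles


short_command = saving_fraction(400)    # lower end of the quoted range
long_command = saving_fraction(1000)    # upper end of the quoted range
```

With these figures the saving is between 5% and 12.5% of a 'search' command.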
33
The end …
34
Sim: s4f
State encoding: 0 = wait for ack, 1 = find new index, 2 = ack old index.
State machine: states WT_FOR_ACK, FNEW, ACKN; transition labels !Sys_clr, Bm_index_valid, Ackn_done, Add_ack.