Southampton: Oct 99 Asynchronous Circuit Compilation

AMULET3-H
- Asynchronous macrocell
  - ARM compatible processor core
  - Full custom RAM
  - Compiled ROM
  - Balsa compiled DMA controller
  - Test I/F, synchronous and off-chip bus bridges
- Synchronous peripherals
  - Designed by commercial partner...

AMULET3 System
[Diagram labels: CPU / RAM, ROM, DMAC, Periph1, Sync bridge, MARBLE, SOCB]

DMA Local RAM Access
[Diagram labels: CPU / RAM, ROM, DMAC, Periph1, Sync bridge, MARBLE, SOCB]

DMA Peripheral Accesses
[Diagram labels: CPU / RAM, ROM, DMAC, Periph1, Sync bridge, MARBLE, SOCB, DMA requests]

Requirements / Specification
- 16 clients, 32 channels
- 3 channel types: complicated register structure
- Programmable client-to-channel mapping (1 to many)
- Support synchronous requests
- Transfers mostly between synchronous clients

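The programmable one-to-many mapping can be pictured with a small software model. This is only a sketch: the slides do not say how the mapping is stored, so the per-channel "client" field, the `ChannelMap` class and its method names are all hypothetical. Only the sizes (16 clients, 32 channels) come from the requirements above.

```python
NUM_CLIENTS = 16   # from the requirements slide
NUM_CHANNELS = 32  # from the requirements slide


class ChannelMap:
    """Hypothetical model: each channel records which client triggers it."""

    def __init__(self):
        # None means the channel is not yet mapped to any client.
        self.client_of = [None] * NUM_CHANNELS

    def assign(self, channel, client):
        """Program one channel to respond to one client."""
        if not 0 <= channel < NUM_CHANNELS:
            raise ValueError("channel out of range")
        if not 0 <= client < NUM_CLIENTS:
            raise ValueError("client out of range")
        self.client_of[channel] = client

    def channels_for(self, client):
        """Return every channel mapped to this client (one client, many channels)."""
        return [ch for ch, c in enumerate(self.client_of) if c == client]
```

Storing the client per channel (rather than a channel list per client) is one simple way to guarantee that a channel has at most one client while a client may own several channels.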
Controller Structure

Transfer engine Operation
[Diagram labels, repeated on each step: DMA controller; Target (Slave) interface; Initiator (Master) interface; DMA registers (RB Regs); Transfer engine; TI mem; Arbitration; async control; DMA SPI (sync control); synchronous island; self timed region; Client Requests; DMA_rq[n]; request reset; DMA_irq/DMA_fiq; SOCB clock; Address; Data]
1. Request arrives
2. TE signals RB
3. RB returns address/count
4. TE performs transfer
5. Addresses are incremented
6. TE returns count/addr to RB; RB resets request signals

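The sequential steps above can be sketched as a small behavioural model. This is an illustration, not the Balsa description: the class and function names are hypothetical, memory is a plain list, one word is moved per request, and the count is assumed to decrement by one per transfer (the slides only say that addresses are incremented and that count/addr are returned).

```python
class RegisterBank:
    """Hypothetical register bank (RB): per-channel source, destination, count."""

    def __init__(self):
        self.regs = {}         # channel -> [src, dst, count]
        self.requests = set()  # channels with a pending request
        self.passes = 0        # number of passes through the register bank

    def program(self, channel, src, dst, count):
        self.regs[channel] = [src, dst, count]

    def read(self, channel):
        self.passes += 1
        return tuple(self.regs[channel])

    def write_back(self, channel, src, dst, count):
        self.passes += 1
        self.regs[channel] = [src, dst, count]
        self.requests.discard(channel)  # RB resets the request signal


def sequential_transfer(rb, mem, channel):
    """One transfer in the sequential design, following the six steps above."""
    rb.requests.add(channel)             # 1. request arrives
    src, dst, count = rb.read(channel)   # 2-3. TE signals RB; RB returns address/count
    mem[dst] = mem[src]                  # 4. TE performs transfer
    src, dst = src + 1, dst + 1          # 5. addresses are incremented
    count -= 1                           #    (assumed: count decrements per word)
    rb.write_back(channel, src, dst, count)  # 6. TE returns count/addr to RB
```

In this model every transfer costs two register bank passes (one read, one write-back), which is the property the later "Two Controller Descriptions" slide identifies as slow.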

Two Controller Descriptions
- Sequential (previous slides)
  - Very simple control flow
  - Requires two passes through register bank
  - Slow! Only memory decoupling helps
- Parallel (next slides)
  - Decouple TE actions from memory R/W with a new unit: Transfer Interface
  - Interrupt the register bank on end of transfer

Controller Structure
“Parallel” Design

Operation
[Diagram labels, repeated on each step: DMA controller; Target (Slave) interface; Initiator (Master) interface; DMA registers (RB Regs); Transfer engine (TE); Transfer interface (TI); TI mem; Addr; end of run; Arbitration; async control; DMA SPI (sync control); synchronous island; self timed region; Client Requests; DMA_rq[n]; request reset; DMA_irq/DMA_fiq; SOCB clock; Address; Data]
1. Request arrives
2. TE signals RB
3. RB responds || incrementing counter (increment in reg bank)
4. TE sends addr to TI || RB reg writes
5. TI performs transfer
6. TI signals RB if last transfer completed

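The parallel steps above can be sketched in the same style. Again this is a hedged illustration rather than the real design: `ParallelModel` and its method names are hypothetical, the concurrency marked `||` on the slides is modelled by doing the register update in the same register bank pass as the read, and a queue stands in for the decoupling between TE and TI. The count decrement and the "last transfer" test are assumptions.

```python
from collections import deque


class ParallelModel:
    """Hypothetical model of the parallel design (RB, TE and TI)."""

    def __init__(self, mem):
        self.mem = mem
        self.regs = {}           # channel -> [src, dst, count]
        self.ti_queue = deque()  # addresses handed from TE to TI
        self.rb_passes = 0       # passes through the register bank
        self.finished = []       # channels whose run has ended

    def program(self, channel, src, dst, count):
        self.regs[channel] = [src, dst, count]

    def request(self, channel):
        """Steps 1-4: one register bank pass reads and increments together."""
        self.rb_passes += 1
        src, dst, count = self.regs[channel]
        # RB responds || increments in the register bank (assumed count - 1).
        self.regs[channel] = [src + 1, dst + 1, count - 1]
        last = count == 1
        # TE sends the addresses to TI while RB completes its register writes.
        self.ti_queue.append((channel, src, dst, last))

    def ti_step(self):
        """Steps 5-6: TI performs one queued transfer."""
        channel, src, dst, last = self.ti_queue.popleft()
        self.mem[dst] = self.mem[src]
        if last:
            self.finished.append(channel)  # TI signals RB: end of run
```

Compared with the sequential sketch, each transfer here touches the register bank once, and the memory access is taken off the TE's path by the TI queue.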

The Design
- 919 lines of Balsa describing register bank control, TE and TI
- Custom register banks and Synchronous Peripheral Interface
- Miscellaneous glue standard cells
  - Register bank controllers
  - MARBLE interfaces
- Compass Design Automation CAD

Design Partitioning
- MARBLE bus: outside of DMA controller
- Balsa synthesised standard cells
- Custom “regular” layout
- Hand designed standard cells

DMA Controller Floor-Plan

Implementation Technology
- 0.35 µm, 3LM CMOS
- Standard cells from ARM Ltd.
- Locally designed complex gates and asynchronous elements/gates
- Automated standard cell P&R
- Only “essential” and simple gate level optimisation (by hand)

Simulation
- LARD behavioural modelling
- EPIC TimeMill transistor level simulation
  - on schematic
  - on cap. extracted netlist
- TimeMill is calibrated using SPICE and claims to be within a few % of SPICE
- Results measured for a run of 16 memory-to-memory transfers

Comparing Performance