1
EE249 Discussion
A Method for Architecture Exploration for Heterogeneous Signal Processing Systems
Sam Williams
EE249 Discussion Section
October 15, 2002
2
Related Work – System Level Modeling and Analysis
Polis/CFSMs
–Elements are mapped to hardware and software components
–Performance evaluated via simulation
–Hardware/software synthesis
Chinook
–Design of embedded systems
–Mapping to IP blocks
–Synthesized communication
RASSP System
–VHDL modeling of DSPs
–ADEPT environment for hardware/software co-design
Abstraction of architecture models can provide a speedup in design space exploration
SPADE separates architecture from application models
–Functionality is not modeled in architecture
3
Basics – Workloads and Resources
Applications generate workloads
–Computation
–Communication
–Storage
Architecture provides resources
–Computation: processors, coprocessors, ASICs, etc.
–Communication: buses, Ethernet, specialized interfaces, etc.
–Memory: RAMs, ROMs, etc.
A system is the realization of a graph connecting computation and memory components via communication components, together with the mapping of applications onto it
4
Basics – Traces
Signals
–Logic transitions
–Hardware specific
–Example (figure labels): a, b, c
Instructions
–Specific to ISA
–RISC instructions
–Example: dli a0,addr; ld a1,0(a0); addi a1,1,a1; sd a1,0(a0)
Macro Instructions / Functions (extremely coarse-grain)
–iDCT
–Structure moves
–Example: load_next_frame(frame); decode_frame(frame,temp); copy_frame_to_buffer(temp); update(); …
5
Architecture Modeling
Functional models not required
Data-dependent behavior results in data-dependent traces
Built from a library of components
Processing Resource:
–Trace Driven Execution Unit (TDEU) = trace interpreter
  Table of latencies for each instruction
  Could be extended for other metrics (power, cost, etc.)
–Some number of communication interfaces
  Translates generic internal protocol to a specific one
Other resources include buses and memories
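A minimal sketch of a trace driven execution unit acting as a trace interpreter, as the slide describes. The class name, the latency table, and every cycle count here are hypothetical and chosen only for illustration; they do not come from the SPADE tooling.

```python
# Hypothetical latency table: cycles per macro instruction (values are made up).
LATENCY_TABLE = {
    "load_next_frame": 500,
    "decode_frame": 12000,
    "copy_frame_to_buffer": 800,
    "update": 50,
}


class TraceDrivenExecutionUnit:
    """Interprets trace entries by looking up latencies; no functionality is computed."""

    def __init__(self, latency_table):
        self.latency_table = latency_table
        self.busy_cycles = 0

    def execute(self, trace):
        """Consume a trace (iterable of instruction names) and return the cycles spent."""
        cycles = sum(self.latency_table[name] for name in trace)
        self.busy_cycles += cycles
        return cycles


tdeu = TraceDrivenExecutionUnit(LATENCY_TABLE)
frame_trace = ["load_next_frame", "decode_frame", "copy_frame_to_buffer", "update"]
cycles_per_frame = tdeu.execute(frame_trace)
```

Extending the model to another metric such as power would, under the same assumptions, amount to adding a second table keyed by the same instruction names.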
6
Application Modeling
Map functions to Kahn process networks
Unbounded FIFOs – acceptable approximation
Read/Write operations
–generate a trace entry (bytes transferred over channel)
–perform the port accesses in the Kahn process network
Execution operation
–only generates trace entries
[Figure: diagram labeled F1, F2, F3, P1, P2]
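The three operations above could be sketched roughly as follows. This is a single-threaded illustration with hypothetical class names and a made-up trace-entry format. It also sidesteps the blocking-read semantics of a real Kahn process network by running the producer before the consumer.

```python
from collections import deque


class Channel:
    """Unbounded FIFO, as in a Kahn process network."""

    def __init__(self):
        self.fifo = deque()


class Process:
    """Application process that performs port accesses and records trace entries."""

    def __init__(self, name):
        self.name = name
        self.trace = []

    def write(self, channel, data):
        channel.fifo.append(data)                # actual port access
        self.trace.append(("write", len(data)))  # trace entry: bytes transferred

    def read(self, channel):
        data = channel.fifo.popleft()            # a real Kahn read would block when empty
        self.trace.append(("read", len(data)))
        return data

    def execute(self, function_name):
        # Trace entry only; no port access is performed.
        self.trace.append(("execute", function_name))


ch = Channel()
p1, p2 = Process("P1"), Process("P2")
p1.execute("F1")
p1.write(ch, bytes(64))
block = p2.read(ch)
p2.execute("F2")
```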
7
Mapping, Simulation, and Analysis
Mapping
–Processes are mapped to a TDEU (n to 1)
–Ports are mapped to interfaces of the TDEU (1 to 1)
Simulation
–Application and architecture models are co-simulated
–Traces are generated on the fly
–Performance figures are obtained by running the traces on the architecture model
Analysis
–Utilization, stalls, latencies, bandwidth
–Could add power, area, cost, etc.
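As an illustration of the n-to-1 process mapping and of one analysis metric (utilization), here is a small sketch. The process names, unit names, and cycle counts are all hypothetical; a real flow would take the busy cycles from the co-simulation.

```python
# Hypothetical mapping: several processes share one execution unit (n to 1).
process_to_tdeu = {"P1": "cpu0", "P2": "cpu0", "P3": "dsp0"}

# Hypothetical busy cycles per process, as a simulation might report them.
busy_cycles = {"P1": 300, "P2": 200, "P3": 750}
total_cycles = 1000  # length of the simulated time window


def utilization(process_to_tdeu, busy_cycles, total_cycles):
    """Return the fraction of the window each execution unit spent busy."""
    per_unit = {}
    for process, unit in process_to_tdeu.items():
        per_unit[unit] = per_unit.get(unit, 0) + busy_cycles[process]
    return {unit: cycles / total_cycles for unit, cycles in per_unit.items()}


result = utilization(process_to_tdeu, busy_cycles, total_cycles)
```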
8
The Y-Chart
Applications and architecture are clearly separable
Several applications will be run on this system
Representative applications are collected
Designer makes a best guess at architecture
System is evaluated by mapping each application to the architecture, simulating, and analyzing the resulting numbers
Designer then redesigns architecture and/or applications and repeats the mapping/simulation flow
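The iterate-until-satisfied flow above can be caricatured as a loop. In the presented method the redesign step is done by the designer; this sketch instead assumes a fixed list of candidate architectures, and the toy `evaluate` and `meets_goals` functions are hypothetical stand-ins for mapping, simulation, and analysis.

```python
def explore(applications, candidate_architectures, evaluate, meets_goals):
    """Try candidates in order; return the first that satisfies every application."""
    for architecture in candidate_architectures:
        results = [evaluate(app, architecture) for app in applications]  # map + simulate
        if all(meets_goals(r) for r in results):                         # analyze
            return architecture, results
    return None, []


# Toy stand-ins (hypothetical): an application is a cycle demand per frame,
# an architecture is a cycle budget per frame period.
applications = [900, 1200]
candidates = [1000, 1500]


def evaluate(app, arch):
    return arch - app  # slack in cycles


def meets_goals(slack):
    return slack >= 0


chosen, slacks = explore(applications, candidates, evaluate, meets_goals)
```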
9
Y-Chart (continued)
[Figure: Y-chart flow diagram. Labels: Applications (C/C++), Application Models, SpecBlocks, Architecture Model, Mapping, Analysis, Simulations, remap, repartition, rearchitect, Function|Latency Table, Cycle accurate simulator, Databook, Guesses]
10
MPEG2 Example
C code was partitioned and mapped to a Kahn process network
Run standalone to gather frequencies of operations and bandwidth requirements
Mapped to TriMedia MPEG2 system (10 processing elements, 33 interfaces)
Simulations on a series of streams / bus loads / frame periods, resulting in a frames-dropped metric
Slowdown of the performance simulation was about 3600× relative to hardware
–300 CPU days for a 2 hour movie
–Limits analysis to short clips
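The 300 CPU days figure follows directly from the slowdown factor. A quick check, using only the numbers on the slide (the variable names are illustrative):

```python
SLOWDOWN = 3600   # simulation time divided by real time, from the slide
movie_hours = 2   # length of the movie, from the slide

simulation_hours = movie_hours * SLOWDOWN
simulation_days = simulation_hours / 24
print(simulation_hours, simulation_days)
```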
11
Conclusion
Easy exploration of heterogeneous programmable architectures
On-the-fly trace-driven co-simulation
Functionality is not required, only behavior
Can be extended to analyze any number of metrics (power, cost, area, etc.), although the authors did not do so
–Frames_Dropped(x,y,z,…)=0
–Power(x,y,z,…)<25W
–Cost(x,y,z,…)<$30
×Application is partitioned by hand
×Mapping is performed by hand
×Performance characteristics of components must be simulated, known, or estimated
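If the extra metrics were available, the three example constraints could be checked per design point along these lines. The function name and the sample values are hypothetical; only the thresholds come from the slide.

```python
def is_feasible(frames_dropped, power_watts, cost_dollars):
    """Check a design point against the example constraints from the slide."""
    return frames_dropped == 0 and power_watts < 25 and cost_dollars < 30


# Hypothetical design points.
ok = is_feasible(frames_dropped=0, power_watts=18.5, cost_dollars=27.0)
too_hot = is_feasible(frames_dropped=0, power_watts=25.0, cost_dollars=27.0)
```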