Advanced Computer Architecture & Processing Systems Research Lab Ongoing Computer Engineering Research Projects at the Lucian Blaga University of Sibiu


Ongoing Computer Engineering Research Projects at the Lucian Blaga University of Sibiu
Prof. Lucian VINTAN, PhD, Director, Advanced Computer Architecture & Processing Systems Research Lab

The Research Team
- Prof. Lucian VINTAN, PhD – Research Chair
- Assoc. Prof. Adrian FLOREA, PhD
- Senior Lecturer Daniel MORARIU, PhD
- Senior Lecturer Ion MIRONESCU, PhD
- Lecturer Arpad GELLERT, PhD
- Radu CRETULESCU, PhD student
- Horia CALBOREAN, PhD student
- Ciprian RADU, PhD student

Computing hardware
- 14 Intel compute nodes (2-processor HS21 blades with quad-core Intel Xeon)
- 2 Cell compute nodes (2-processor QS22 blades with IBM PowerXCell 8i Processor)

Our current research topics
- Anticipatory Techniques in Advanced Processor Architectures
- An Automatic Design Space Exploration Framework for Multicore Architecture Optimizations
- Optimizing Application Mapping Algorithms for NoCs through a Unified Framework
- Optimal Computer Architecture for CFD calculation
- Adaptive Meta-classifiers for Text Documents

Anticipatory Techniques in Advanced Processor Architectures
Prof. Lucian VINTAN, PhD; Assoc. Prof. Adrian FLOREA, PhD; Lecturer Arpad GELLERT, PhD

Fetch Bottleneck
- The fetch rate is limited by the basic-block size (7-8 instructions in SPEC 2000).
- Solutions: Trace Cache and Multiple (M-1) Branch Predictors.
- Branch prediction increases ILP by predicting branch directions and targets and speculatively processing multiple basic blocks in parallel.
- As instruction issue width and pipeline depth grow, accurate branch prediction becomes more essential.
Some Challenges
- Identifying and solving some difficult-to-predict branches (unbiased branches).
- Helping the computer architect better understand branch predictability, and whether the predictor should be improved for difficult-to-predict branches.

Difficult-to-predict unbiased branches
A difficult-to-predict branch in a certain dynamic context → unbiased → "highly shuffled".

Predicting Unbiased Branches
- State-of-the-art branch predictors are unable to accurately predict unbiased branches.
- The problem: finding new relevant information that could reduce their entropy, instead of developing new predictors.
- Challenge: adequately representing unbiased branches in the feature space.
- Accurately predicting unbiased branches is still an open problem.

Random Degree Metrics
Based on:
- Hidden Markov Model (HMM): a strong method to evaluate the predictability of the sequences generated by unbiased branches;
- Discrete entropy of the sequences generated by unbiased branches;
- Compression rate (Gzip, Huffman) of the sequences generated by unbiased branches.
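Two of the metrics named on this slide, discrete entropy and compression rate, can be illustrated on a recorded taken/not-taken sequence. The sketch below is only an illustration under stated assumptions: the function names and the two sample sequences are hypothetical, and Python's `zlib` (the DEFLATE algorithm that gzip also uses) stands in for the Gzip tool mentioned on the slide. It is not the lab's measurement code.

```python
import math
import random
import zlib
from collections import Counter


def shannon_entropy(outcomes):
    """Zeroth-order discrete entropy, in bits per outcome, of a branch outcome sequence."""
    counts = Counter(outcomes)
    total = len(outcomes)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def compression_ratio(outcomes):
    """Compressed size divided by raw size, storing one byte per outcome."""
    raw = bytes(outcomes)
    return len(zlib.compress(raw, 9)) / len(raw)


# Hypothetical sequences: 1 = taken, 0 = not taken.
biased = [1] * 950 + [0] * 50            # strongly biased branch
rng = random.Random(0)                   # fixed seed so the run is repeatable
shuffled = [rng.randint(0, 1) for _ in range(1000)]  # unbiased, "highly shuffled"

for name, seq in (("biased", biased), ("shuffled", shuffled)):
    print(name, round(shannon_entropy(seq), 3), round(compression_ratio(seq), 3))
```

A biased sequence has low entropy and compresses well, while a shuffled unbiased sequence sits near 1 bit per outcome and compresses poorly, which is the sense in which both metrics measure a "random degree".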

Issue Bottleneck (Data-flow)
- Conventional processing models are limited in their processing speed by the dynamic program's critical path (Amdahl).
- 2 solutions:
  - Dynamic Instruction Reuse (DIR) is a non-speculative technique.
  - Value Prediction (VP) is a speculative technique.
- Common issue: value locality.
- Challenges:
  - Selective Instruction Reuse (MUL & DIV)
  - Selective Load Value Prediction ("Critical Loads")
  - Exploiting Selective Instruction Reuse and Value Prediction in a Superscalar / Simultaneous Multithreaded (SMT) Architecture to anticipate Long-Latency Instructions
- Results
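The difference between the two techniques can be sketched in software. The classes below are a simplified model under stated assumptions, not the hardware structures evaluated by the authors: `ReuseBuffer` and `LastValuePredictor` are hypothetical names, the FIFO replacement policy and the confidence threshold are illustrative choices, and only the default capacity of 1024 entries comes from the slides.

```python
class ReuseBuffer:
    """Non-speculative reuse: a stored result is returned only when the
    opcode and both operand values match a previous execution exactly."""

    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.table = {}  # (opcode, a, b) -> result, kept in insertion order

    def execute(self, opcode, a, b, compute):
        key = (opcode, a, b)
        if key in self.table:
            return self.table[key], True           # hit: execution is skipped
        result = compute(a, b)
        if len(self.table) >= self.capacity:
            self.table.pop(next(iter(self.table)))  # evict the oldest entry (FIFO)
        self.table[key] = result
        return result, False


class LastValuePredictor:
    """Speculative prediction: guess that a load at a given PC returns the
    value it returned last time, once that value has repeated often enough."""

    def __init__(self, threshold=2, max_confidence=3):
        self.threshold = threshold
        self.max_confidence = max_confidence
        self.table = {}  # pc -> [last_value, confidence]

    def predict(self, pc):
        entry = self.table.get(pc)
        if entry is not None and entry[1] >= self.threshold:
            return entry[0]
        return None  # not confident enough: no prediction is made

    def update(self, pc, actual):
        entry = self.table.setdefault(pc, [actual, 0])
        if entry[0] == actual:
            entry[1] = min(entry[1] + 1, self.max_confidence)
        else:
            entry[0], entry[1] = actual, 0  # misprediction resets confidence


rb = ReuseBuffer()
print(rb.execute("mul", 6, 7, lambda a, b: a * b))  # first execution: a miss
print(rb.execute("mul", 6, 7, lambda a, b: a * b))  # same operands: a hit
```

A reuse hit is always correct because the operands are compared, whereas a value prediction must be verified against the real load result and recovered from when wrong. Both rely on value locality.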

Exploiting Selective Instruction Reuse and Value Prediction in a Superscalar Architecture
- Selective Instruction Reuse (MUL & DIV)
- Selective Load Value Prediction (Critical Loads)

Selective Instruction Reuse and Value Prediction in Simultaneous Multithreaded Architectures
Figure: SMT architecture (M-Sim) enhanced with per-thread RB and LVPT structures. Blocks shown: Fetch Unit, Branch Predictor, PC, I-Cache, Decode, Issue Queue, Rename Table, Physical Register File, ROB, LVPT, Functional Units, LSQ, D-Cache, RB.

Exploiting Selective Instruction Reuse and Value Prediction in a Superscalar Architecture
The M-SIM Simulator
Figure: blocks shown: Cycle-Level Performance Simulator, Hardware Configuration, SPEC Benchmark, Power Models, Hardware Access Counts, Performance Estimation, Power Estimation.

Exploiting Selective Instruction Reuse and Value Prediction in a Superscalar Architecture
Figure: relative IPC speedup and relative energy-delay product gain with a Reuse Buffer of 1024 entries, the Trivial Operation Detector, and the Load Value Predictor.

Conclusions and Further Work
- Indexing the SLVP table with the memory address instead of the instruction address (PC);
- Exploiting N-value locality instead of 1-value locality;
- Generating the thermal maps for the optimal superscalar and SMT configurations (and, if necessary, developing a runtime thermal manager);
- Understanding and exploiting instruction reuse and value prediction benefits in a multicore architecture.

Anticipatory multicore architectures
- Anticipatory multicores would significantly reduce the pressure on the interconnection network → performance/energy.
- Between value prediction, multithreading, and the cache coherence/consistency mechanisms there are subtle, not well-understood relationships; data consistency errors → consistency violation detection and recovery.
- The cause of the inconsistency: VP might execute some dependent instructions out of order.
- Dynamic Instruction Reuse in a multicore system: Reuse Buffer coherence problems, cache coherence mechanisms.

An Automatic Design Space Exploration Framework for Multicore Architecture Optimizations
Horia CALBOREAN, PhD student; Prof. Lucian VINTAN, PhD

Multiobjective optimization
- The number of (heterogeneous) cores in the processor keeps growing, and the systems become more and more complex.
- More configurations have to be simulated (NP-hard problem).
- The time needed to simulate all configurations is prohibitive.
- Performance evaluation has become a multiobjective evaluation.

Solutions
- Reducing simulation time:
  - parallel & distributed simulation
  - sampling simulation
- Reducing the number of simulations:
  - intelligent multiobjective algorithms
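Multiobjective algorithms compare configurations by Pareto dominance rather than by a single score. The sketch below shows that comparison only, assuming every objective is to be minimised; the configuration names and the (CPI, energy) values are made up for illustration and are not results from the slides.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in
    at least one. All objectives are assumed to be minimised."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(results):
    """Keep the configurations that no other configuration dominates."""
    return {
        name: objs
        for name, objs in results.items()
        if not any(dominates(other, objs) for other in results.values())
    }


# Hypothetical simulation results: configuration -> (CPI, energy).
results = {"A": (1.0, 9.0), "B": (2.0, 4.0), "C": (3.0, 5.0), "D": (4.0, 1.0)}
print(sorted(pareto_front(results)))
```

Here C is dropped because B is better in both objectives, while A, B and D remain as incomparable trade-offs. An exploration algorithm tries to approximate this front with far fewer simulations than an exhaustive sweep.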

Proposed framework
- We developed FADSE (Framework for Automatic Design Space Exploration)
- Compatible with most of the existing simulators
- Portable: implemented in Java
- Includes many well-known multiobjective algorithms
- Able to run simulators and also well-known test problems

Existing tools
- Bound to a certain simulator (Magellan)
- Lack portability: bound to a certain operating system (M3Explorer, Magellan)
- Perform design space exploration of small parts of the system (only the cache: Archexplorer)

FADSE – application architecture

Features
- Parallel simulation (client-server model)
- Ability to introduce constraints through an XML interface
- Easily configurable through XML files:
  - change the DSE algorithm,
  - specify input parameters and their possible values,
  - specify desired output metrics, etc.

Our target
- Perform an evaluation of the existing algorithms on different simulators
- Find out which one performs best
- Improve the algorithms: map them onto the specific problem of design space exploration

Conclusions
- We have developed a framework that performs automatic design space exploration
- Extensible, portable
- Many implemented multiobjective algorithms (through the use of jMetal)
- Reduces time through parallel & distributed execution of simulators

Optimizing Application Mapping Algorithms for NoCs through a Unified Framework
Ciprian RADU, PhD student; Prof. Lucian VINTAN, PhD

Outline
- Introduction
  - The application mapping problem for NoCs
  - The relation between application mapping and routing
- Evaluating application mapping algorithms for Networks-on-Chip
  - The framework design
  - The ns-3 NoC simulator
- Automatic Design Space Exploration for Networks-on-Chip
  - The framework

The application mapping problem for NoCs

Application mapping & routing

Evaluating application mapping algorithms for Networks-on-Chip
- Existing application mapping algorithms are currently evaluated on specific NoCs
  - e.g., NoCs with 2D mesh topology
- Existing comparisons between the algorithms are not made on the same NoC architecture
- We propose a unified framework for the evaluation and optimization of application mapping algorithms on different NoC designs

The framework design
3 major components:
- a module that contains the implementation of different application mapping algorithms;
- a network traffic generator;
- a Network-on-Chip simulator.
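To make the mapping problem concrete, the sketch below places four cores on a 2x2 mesh by exhaustive search. It rests on assumptions that are not in the slides: the cost function (traffic volume multiplied by Manhattan hop count) is one commonly used objective rather than the one used by the framework, and the core names and traffic volumes are invented. Exhaustive search is only feasible for tiny meshes, which is why heuristic mapping algorithms exist.

```python
from itertools import permutations


def hops(tile_a, tile_b, width):
    """Manhattan distance between two tiles of a 2D mesh, numbered row by row."""
    ax, ay = tile_a % width, tile_a // width
    bx, by = tile_b % width, tile_b // width
    return abs(ax - bx) + abs(ay - by)


def comm_cost(mapping, traffic, width):
    """Sum over communicating core pairs of traffic volume times hop count."""
    return sum(
        volume * hops(mapping[src], mapping[dst], width)
        for (src, dst), volume in traffic.items()
    )


def best_mapping(cores, traffic, width, height):
    """Try every assignment of cores to distinct tiles and keep the cheapest."""
    tiles = range(width * height)
    candidates = (dict(zip(cores, perm)) for perm in permutations(tiles, len(cores)))
    return min(candidates, key=lambda m: comm_cost(m, traffic, width))


# Hypothetical application: (source core, destination core) -> traffic volume.
cores = ["cpu", "mem", "dsp", "io"]
traffic = {("cpu", "mem"): 100, ("cpu", "dsp"): 10, ("dsp", "mem"): 5, ("dsp", "io"): 1}

mapping = best_mapping(cores, traffic, width=2, height=2)
print(mapping, comm_cost(mapping, traffic, 2))
```

The heaviest flows end up on adjacent tiles. In the framework described here, a traffic generator and a NoC simulator would then evaluate such a mapping instead of a closed-form cost.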

The framework design flow

The ns-3 NoC simulator
- Based on ns-3, an event-driven simulator for Internet systems
- Aims for a good accuracy-speed trade-off
- Flexible and scalable
- Current parameters:
  - packet size, packet injection rate, packet injection probability;
  - buffer size;
  - network size;
  - switching mechanism (SAF, VCT, Wormhole);
  - routing protocol (XY, YX, SLB, SO);
  - network topology (2D mesh, Irvine mesh);
  - traffic patterns (bit-complement, bit-reverse, matrix transpose, uniform random).
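Some of the listed parameters have compact textbook definitions. The sketch below shows XY routing on a 2D mesh and three of the synthetic traffic patterns, using the usual definitions on node indices; it is an independent illustration and is not taken from the ns-3 NoC source code. Node numbering (row by row) and all function names are assumptions.

```python
def xy_route(src, dst, width):
    """Dimension-ordered XY routing on a 2D mesh: move along X until the
    column matches, then along Y. Returns the list of visited node indices."""
    x, y = src % width, src // width
    dst_x, dst_y = dst % width, dst // width
    path = [src]
    while x != dst_x:
        x += 1 if dst_x > x else -1
        path.append(y * width + x)
    while y != dst_y:
        y += 1 if dst_y > y else -1
        path.append(y * width + x)
    return path


def bit_complement(src, bits):
    """Destination is the source index with every address bit inverted."""
    return ~src & ((1 << bits) - 1)


def bit_reverse(src, bits):
    """Destination is the source index with its address bits in reverse order."""
    return int(format(src, f"0{bits}b")[::-1], 2)


def matrix_transpose(src, bits):
    """Destination swaps the upper and lower halves of the address bits,
    which maps node (x, y) to node (y, x) on a square mesh."""
    half = bits // 2
    low = src & ((1 << half) - 1)
    return (low << half) | (src >> half)


# 4x4 mesh: 16 nodes, 4 address bits.
print(xy_route(0, 10, width=4))
print(bit_complement(5, 4), bit_reverse(1, 4), matrix_transpose(6, 4))
```

YX routing is the same procedure with the two loops swapped.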

Automatic Design Space Exploration for Networks-on-Chip
- Motivation
  - There is no NoC suitable for all kinds of workload
  - There is an exponential number of possible NoC architectures
- Exhaustive DSE is no longer suitable
- Automatic DSE uses a heuristic-driven exploration of the design space
  - Disadvantage: near-optimal solutions
  - Advantage: speed

The framework
- Components:
  - DSE module
  - NoC simulator
- The DSE module determines the parameters of the NoC architecture
  - Uses algorithms from Artificial Intelligence
- The NoC simulator (ns-3 NoC) is automatically configured to simulate the network architecture determined by the DSE module
- The simulation results (network performance) help the DSE module generate a better NoC architecture
Figure labels: Design Space Exploration module, Network-on-Chip simulator, Configure the simulator, Simulation results.
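The configure/simulate/feedback loop can be sketched with a simple hill-climbing search. Everything specific here is an assumption made for illustration: `mock_simulator` is a made-up formula standing in for a real ns-3 NoC run, the two parameters and their value ranges are hypothetical, and hill climbing is just one possible heuristic, not necessarily one the framework uses.

```python
def mock_simulator(config):
    """Stand-in for one simulator run. Returns a latency-like score to be
    minimised; the formula is invented purely for illustration."""
    buffer_size, packet_size = config
    return (buffer_size - 6) ** 2 + (packet_size - 4) ** 2 + 10


def neighbours(config, space):
    """Configurations that differ from config by one step in one parameter."""
    for i, values in enumerate(space):
        idx = values.index(config[i])
        for j in (idx - 1, idx + 1):
            if 0 <= j < len(values):
                yield config[:i] + (values[j],) + config[i + 1:]


def explore(space, simulate, start):
    """Hill climbing: configure, simulate, and move to the first neighbour
    that improves the score, until no neighbour is better."""
    current, score = start, simulate(start)
    improved = True
    while improved:
        improved = False
        for candidate in neighbours(current, space):
            candidate_score = simulate(candidate)
            if candidate_score < score:
                current, score, improved = candidate, candidate_score, True
                break
    return current, score


# Hypothetical design space: allowed buffer sizes and packet sizes.
space = ([2, 4, 6, 8], [2, 4, 8, 16])
best, best_score = explore(space, mock_simulator, start=(2, 16))
print(best, best_score)
```

Such a search can stop in a local optimum, which matches the trade-off stated on the previous slide: near-optimal solutions in exchange for speed.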

Optimal computer architecture for CFD calculation
Senior Lecturer Ion Dan MIRONESCU, PhD; Prof. Lucian VINTAN, PhD

Practical application
Modelling and simulation of multiscale, multicomponent, multiphase flow in complex geometry (ongoing projects) for:
- optimisation of sugar crystallisation
- prediction of the flow properties of polymer-based disperse systems (starch and starch fractions, microbial polysaccharides)
HPC/CFD

Goals
- Speed-up of this application on the given architecture
- Finding the optimal manycore architecture for the CFD application (e.g. NoC)

Method - Lattice Boltzmann (Chirila, 2010)

Method advantages
- easy discretization of complex geometry
- easy incorporation of "multi" models
- easy parallelisation
- easy coupling to other scale models (Molecular Dynamics)

Computational model
Figure labels: Local Values, Ghost data, COMPUTE, EXCHANGE.
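The compute/exchange pattern named in the figure can be illustrated on a one-dimensional field split into two blocks. This is a minimal sketch under stated assumptions: a three-point average stands in for the Lattice Boltzmann update, the two blocks live in one process instead of exchanging messages, and all names are hypothetical.

```python
def step(cells):
    """COMPUTE: update every interior cell from its two neighbours.
    The first and last cells (boundary or ghost) are left unchanged."""
    interior = [
        (cells[i - 1] + cells[i] + cells[i + 1]) / 3
        for i in range(1, len(cells) - 1)
    ]
    return [cells[0]] + interior + [cells[-1]]


def exchange(left, right):
    """EXCHANGE: copy each block's edge value into the neighbour's ghost cell."""
    left[-1] = right[1]    # left block's ghost cell <- right block's first owned cell
    right[0] = left[-2]    # right block's ghost cell <- left block's last owned cell


def run_partitioned(field, steps):
    """Split the field into two blocks with one ghost cell each and iterate."""
    mid = len(field) // 2
    left, right = field[:mid + 1], field[mid - 1:]
    for _ in range(steps):
        exchange(left, right)
        left, right = step(left), step(right)
    return left[:-1] + right[1:]   # drop the ghost cells when reassembling


def run_global(field, steps):
    """Reference: the same update applied to the undivided field."""
    for _ in range(steps):
        field = step(field)
    return field


field = [0.0, 0.0, 0.0, 9.0, 0.0, 0.0, 0.0, 0.0]
print(run_partitioned(field, 5) == run_global(field, 5))
```

Because ghost cells are refreshed before every compute phase, the partitioned run reproduces the undivided run. On a real platform the exchange becomes message passing between cores or blades.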

General-purpose manycore platform
What can be used and what must be accounted for:
- ILP (superscalar, out-of-order, branch prediction)
- Task- and thread-level parallelism (multicore/multiprocessor)
- Mixed programming model (shared memory on a blade, message passing between blades)
- Cache system

Special-purpose manycore platform
What can be used and what must be accounted for:
- SIMD
- Task- and thread-level parallelism (hardware multithreading, multicore/multiprocessor)
- Message passing
- Local store model: full user control

Charm++
- provides a high-level abstraction of a parallel program
- cooperating message-driven objects called chares
- support for load balancing, fault tolerance, automatic checkpointing
- support for all architectures through a specific low-level tier
- NAMD MD is implemented in Charm++

Charm++ LB implementation

Charm++ LB implementation

DSE
Search for optimal values of:
- sites/block
- blocks (chares)/core, /thread, /blade
- communication patterns

Adaptive Meta-classifiers for Text Documents
Prof. Lucian VINTAN, PhD; Daniel MORARIU, PhD; Radu CRETULESCU, PhD student

Introduction
- We investigated a way to create a new adaptive meta-classifier for classifying text documents in order to increase the classification accuracy.
- During the first processing phase (pre-classification) the meta-classifier uses a non-adaptive selector.
- In the second phase (classification) we use a feed-forward neural network based on the back-propagation learning method.

The architecture of the adaptive meta-classifier M-BP

Classification accuracy

Time necessary for reaching the given total error

Conclusions
- This new adaptive meta-classifier uses 8 types of SVM classifiers and one Naïve Bayes classifier to transpose the input data from a large-scale space into a much smaller space.
- The best results (99.74% classification accuracy) were obtained using a neural network with 192 neurons in the hidden layer.
- The meta-classifier managed to exceed the maximum "theoretical" limit of 98.63% that could be reached by an ideal non-adaptive meta-classifier that always chooses the correct prediction if at least one classifier provides it.
- For Reuters2000 text documents we obtained classification accuracy up to 99.74%.
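The "theoretical" limit mentioned above is the accuracy of an oracle that picks a correct base prediction whenever one exists. The sketch below computes that bound on toy data; the labels and the two base classifiers' outputs are invented, and the function name is hypothetical.

```python
def oracle_accuracy(predictions, labels):
    """Fraction of documents for which at least one base classifier is right.
    predictions holds one list of predicted classes per base classifier."""
    hits = sum(
        any(per_classifier[i] == labels[i] for per_classifier in predictions)
        for i in range(len(labels))
    )
    return hits / len(labels)


# Hypothetical toy data: 4 documents, 2 base classifiers.
labels = ["a", "b", "c", "a"]
predictions = [
    ["a", "b", "b", "c"],   # classifier 1
    ["a", "c", "c", "b"],   # classifier 2
]
print(oracle_accuracy(predictions, labels))
```

On the last toy document both base classifiers are wrong, so no selector that only picks among their outputs can be right there. An adaptive second stage is not restricted to those outputs, which is presumably how the reported accuracy could exceed this bound.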

Some References – Computer Architectures
- L. VINTAN, A. GELLERT, A. FLOREA, M. OANCEA, C. EGAN – Understanding Prediction Limits through Unbiased Branches, Eleventh Asia-Pacific Computer Systems Architecture Conference, Shanghai 6-8th, September,
- A. GELLERT, A. FLOREA, M. VINTAN, C. EGAN, L. VINTAN – Unbiased Branches: An Open Problem, The Twelfth Asia-Pacific Computer Systems Architecture Conference (ACSAC 2007), Seoul, Korea, August 23-25th,
- VINTAN L. N., FLOREA A., GELLERT A. – Random Degrees of Unbiased Branches, Proceedings of The Romanian Academy, Series A: Mathematics, Physics, Technical Sciences, Information Science, Volume 9, Number 3, pp , Bucharest,
- A. GELLERT, A. FLOREA, L. VINTAN – Exploiting Selective Instruction Reuse and Value Prediction in a Superscalar Architecture, Journal of Systems Architecture, vol. 55, issue 3, pp , ISSN , Elsevier,
- GELLERT A., PALERMO G., ZACCARIA V., FLOREA A., VINTAN L., SILVANO C. – Energy-Performance Design Space Exploration in SMT Architectures Exploiting Selective Load Value Predictions, Design, Automation & Test in Europe International Conference (DATE 2010), March 8-12, 2010, Dresden, Germany
- CALBOREAN H., VINTAN L. – An Automatic Design Space Exploration Framework for Multicore Architecture Optimizations, Proceedings of The 9-th IEEE RoEduNet International Conference, ISBN, Sibiu, June 24-26, (indexed in IEEE Xplore Digital Library)
- RADU C., VINTAN L. – Optimizing Application Mapping Algorithms for NoCs through a Unified Framework, Proceedings of The 9-th IEEE RoEduNet International Conference, ISBN, Sibiu, June 24-26, (indexed in IEEE Xplore Digital Library)
- L. N. VINTAN – Direcţii de cercetare în domeniul sistemelor multicore / Main Challenges in Multicore Architecture Research, Revista Romana de Informatica si Automatica, ISSN: , ICI Bucuresti, vol. 19, no. 3, 2009, v.


References - Meta-classifiers for Text Documents
- CRETULESCU R., MORARIU D., VINTAN L. – Eurovision-like weighted Non-Adaptive Meta-classifier for Text Documents, Proceedings of the 8th RoEduNet IEEE International Conference Networking in Education and Research, pp , ISBN , Galati, December 2009 (indexed in ISI Web of Science)
- MORARIU D., CRETULESCU R., VINTAN L. – Improving a SVM Meta-classifier for Text Documents by using Naïve Bayes, International Journal of Computers, Communications & Control (IJCCC), Agora University Editing House - CCC Publications, ISSN 1841 – 9836, E-ISSN , Vol. V, No. 3, pp , 2010
- CRETULESCU R., MORARIU D., VINTAN L., COMAN I. D. – An Adaptive Meta-classifier for Text Documents, The 16th International Conference on Information Systems Analysis and Synthesis: ISAS 2010, Orlando, Florida, USA, April 6th – 9th 2010