Multiprocessor research at Åbo Akademi Ralph-Johan Back.

Early 80's
Work on distributed systems
–joint work with Reino Kurki-Suonio
–formalizing, constructing, analyzing, and verifying distributed systems
–point-to-point networks, WLANs, etc.
–courses on distributed systems at ÅA
–research on the language CSP (C.A.R. Hoare)
–very theoretical work; we needed some practical case studies to look at

Mid 80's
Inmos starts designing the transputer
–a special processor with four links for communicating with its neighbours
–specifically designed for building parallel processors (T800)

Hathi project
VTT Oulu contacted us, wanting to build a parallel computer
–Tapani Äijänen and Kari Leppänen from VTT
TEKES project (Hathi, ):
–VTT built the hardware
–ÅA built the software
First version: Hathi-1, 16 processors
–Mats Aspnäs was project manager from ÅA
Used for experimentation
Lots of case studies in building parallel systems

Finsoft III program
TEKES starts a new large research program (Finsoft, in three parts, )
–I was director of Finsoft III: parallel processing and neural networks
–12 research projects in Finsoft III: 7 in parallel processing and 5 in neural networks
–ÅA had 2 large projects: Millipede and Centipede
 Millipede: massively parallel processors
 Centipede: construction and correctness of parallel systems
–Quite a large number of people were engaged in Millipede
 U. Solin and Hong Shen got Ph.D.s from this, plus many M.Sc. theses.

Millipede
Built Hathi-2 in Millipede
–VTT/Oulu built the hardware
–ÅA built the software
Largest supercomputer in Finland at that time
–connected to the university network (internet)
–a reasonable number of research groups, at different universities, were using Hathi-2
–different kinds of applications, mainly CS and scientific computing
Many Ph.D. and M.Sc. theses grew out of Hathi-1 and Hathi-2.

Hathi-2
Hathi-2 construction
–100 T800 floating-point transputers
–connected in a mesh structure
–dynamically reconfigurable connection structure
–25 smaller transputers connected in a ring to monitor and control the floating-point processors
Physically
–the size of a large refrigerator
–with 25 boards
–each board containing 4 T800s and 1 smaller transputer

Hathi-2
Hathi-2 system software
–reconfiguring the connection structure
–mapping the logical processor structure onto the physical structure
–monitoring software for performance measurements
Hathi-2 application software
–nuclear physics
–solving differential equations
–CFD (computational fluid dynamics)
–cosmology
–full-text retrieval
–...
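To give a flavour of what "mapping the logical processor structure onto the physical structure" involves, here is a minimal sketch in Go. It places a linear chain of logical processes onto a 2D mesh using a boustrophedon ("snake") path, so consecutive logical processes land on physically adjacent nodes. The 10x10 size matches Hathi-2's 100 T800s, but the mapping shown is an illustrative textbook heuristic, not the actual Hathi-2 placement algorithm.

```go
package main

import "fmt"

// meshCoord maps logical process id p (0..rows*cols-1) onto a
// rows x cols mesh along a snake path: even rows run left to
// right, odd rows right to left, so process p and process p+1
// always sit on neighbouring mesh nodes.
func meshCoord(p, cols int) (row, col int) {
	row = p / cols
	col = p % cols
	if row%2 == 1 { // reverse direction on odd rows
		col = cols - 1 - col
	}
	return
}

func main() {
	// Process 9 ends row 0 at (0, 9); process 10 starts row 1
	// at the right edge, (1, 9), directly below its predecessor.
	r, c := meshCoord(10, 10)
	fmt.Println(r, c) // prints: 1 9
}
```

Real placement is much harder: arbitrary process graphs, not chains, must be embedded while minimizing link dilation, which is why the next slide speaks of "strong heuristics".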

Main challenges
Writing software as a collection of parallel processes was difficult
–the Occam language from Inmos, based on CSP
–orchestrating communication by means of message passing
–problems with deadlocks and livelocks
Lots of algorithmic problems
–laying out a logical process net on a physical processor network required strong heuristics
–monitoring a computation without interfering with it required careful embedded-system design
–partitioning software to allow for parallel computation
In the end, writing software for massively parallel processors was not that much more difficult than writing ordinary software
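The CSP communication style described above survives today in Go, whose channels descend from the same Hoare formalism that Occam implemented. The sketch below shows the key semantics: an unbuffered channel send blocks until a matching receive, a rendezvous just like an Occam channel between two transputer processes, which is exactly why deadlock analysis was central. The process names here are illustrative, not from the original Hathi software.

```go
package main

import "fmt"

// worker is one CSP-style process: it repeatedly receives a value,
// computes on it, and sends the result on. Each send on the
// unbuffered out channel blocks until the consumer receives it.
func worker(in <-chan int, out chan<- int) {
	for x := range in {
		out <- x * x // rendezvous: blocks until the receiver is ready
	}
	close(out)
}

func main() {
	in := make(chan int)  // unbuffered: sender and receiver synchronize
	out := make(chan int) // unbuffered as well

	go worker(in, out)

	// Producer process: feeds three values, then signals completion.
	go func() {
		for i := 1; i <= 3; i++ {
			in <- i
		}
		close(in)
	}()

	// Consumer process: if this loop were missing, the worker's
	// send would block forever and the program would deadlock.
	for v := range out {
		fmt.Println(v) // prints 1, 4, 9 on separate lines
	}
}
```

Note the deadlock hazard the slide alludes to: with unbuffered channels, every send must be matched by a receive in another concurrently running process, so a cycle of processes all waiting to send blocks permanently.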

Early 90's
I went on sabbatical to Caltech ( )
–a center for massively parallel computer research
–Chuck Seitz built the first Cosmic Cube
–Chuck was working on a really large processor array (16K processors)
However
–I worked mostly on formal methods in programming
–and participated in Alain Martin's group on the design of asynchronous VLSI circuits
–I was not that interested in multiprocessor hardware design

T9000 transputer
We planned to build Hathi-3
–the T800 processors had become too slow compared to the competition
We needed the new T9000 transputers
–very efficient
–had wormhole routing, allowing arbitrary dynamic routing at run time
–competed with the best Intel and Motorola processors of the time
Inmos had difficulties building the T9000 in quantity
–the quality of the produced processors was not good enough
–Inmos was bought up by other companies
–in the end, they stopped development of a new transputer generation

End of Hathi
We did not want to continue with other processors
–Intel and Motorola processors were much inferior to Inmos transputers for parallel systems
–we could not build our own transputer
Hathi-3 was never built
Hathi-2 was unplugged in