Course Outline
• Outline of heterogeneous hardware
  – Heterogeneous clusters
  – Local networks of computers
  – Organizational networks of computers
  – Global general purpose networks of computers
• Typical uses of heterogeneous networks
  – Traditional use
  – Parallel computing
  – Distributed computing
Course Outline (ctd)
• Programming issues: results and open problems
  – Performance
  – Fault tolerance
• Software
  – Parallel programming systems
    » MPI, HPF, HeteroMPI, mpC
  – Distributed programming systems
    » NetSolve/GridSolve
References
• Reading materials
  – Alexey Lastovetsky. Parallel Computing on Heterogeneous Networks. John Wiley & Sons, 423 pp., June 2003.
  – Papers and Internet resources to be provided
  – Course webpage
Hardware platforms
• Heterogeneous hardware for parallel and distributed computing
  – Multiple processors
  – Communication network interconnecting the processors
  – Can be heterogeneous in many ways
    » Only one way for a distributed memory multiprocessor system to be homogeneous
Homogeneous system
• Homogeneous system
  – Identical processors
  – Homogeneous communication network
    » Links of the same latency and bandwidth between any pair of processors
  – Dedicated system
    » One application at a time
  – Designed for high performance parallel computing
    » Used to run a small number of similar applications
Homogeneity vs heterogeneity
• Homogeneity
  – Easy to break
    » Multitasking makes the system heterogeneous
  – May be expensive to keep
    » Cluster of commodity processors
      • Cheap alternative to vendor systems
      • Should be a dedicated system to stay homogeneous
      • Cannot be upgraded in part
• Heterogeneity is natural
Taxonomy of heterogeneous systems
• Heterogeneous systems (in increasing order of heterogeneity)
  – Vendor designed heterogeneous systems
  – Heterogeneous clusters
  – Local networks of computers
  – Organizational networks of computers
  – Global general purpose networks of computers
Vendor designed heterogeneous systems
• A new, interesting trend
  – Motivated by applications
    » Impressive GPU performance
• Cell architecture (IBM, Sony and Toshiba)
  – 9 cores: 1 PowerPC and 8 special-purpose SPE processors
    » 204 Gigaflops in 32-bit arithmetic per chip at 3.2 GHz
• Kei-soku Petaflop system (Japan, by 2011)
  – Possibly heterogeneous
    » Scalar and vector components, special purpose accelerators
Heterogeneous cluster
• Heterogeneous cluster
  – Still a dedicated system designed for parallel computing, but
    » Processors may not be identical
    » Communication network may have a regular but heterogeneous structure
      • Faster segments interconnected by slower links
    » May be a multi-user system
      • Still dedicated to high performance computing
      • Dynamic performance characteristics of processors
Heterogeneous cluster (ctd)
• Heterogeneity of processors
  – Different architectures
  – Different models of the same architecture
  – The same architecture and model but running different operating systems
  – The same architecture, model and OS but different configurations or basic software (compilers, run-time libraries, etc.)
Local network of computers (LNC)
LNC (ctd)
• An LNC is a multi-user system by nature
• Like highly heterogeneous clusters, an LNC
  – Consists of processors of different architectures
    » Dynamically changing their performance characteristics
  – Is interconnected via a heterogeneous communication network
• Unlike HCs, an LNC is
  – A general-purpose computer system
    » Typically associated with an individual organization
LNC: communication network
• Communication network of a typical LNC
  – Primary factors determining the structure
    » Structure of the organisation, tasks that are solved, security requirements, construction restrictions, budget limitations, qualification of technical personnel, etc.
  – High performance computing
    » A secondary factor, if considered at all
  – Constantly developing
    » Occasionally and incrementally
    » The communication network reflects the evolution of the organization rather than its current snapshot
LNC: communication network (ctd)
• As a result, the communication network
  – May be extremely heterogeneous
  – May have links of low speed and/or narrow bandwidth
LNC: functional heterogeneity
• Different computers may have different functions
  – Relatively isolated
  – Providing services to other computers in the LNC
  – Providing services to local and external computers
• Different functions => different levels of integration into the network
  – The heavier the integration, the less predictable the performance characteristics
LNC: functional heterogeneity (ctd)
• A heavily used server may be configured differently
  – Often configured to avoid paging
    » Leads to abnormal termination of some applications if they do not fit into the main memory
    » The loss of continuity of performance characteristics
LNC: decentralized system
• Components of an LNC are not as strongly integrated and controlled as in HCs
  – Relatively autonomous computers
    » Used and administered independently
  – Configuration of the LNC is dynamic
    » Computers can come and go
      • Users switch them on and off
      • Users may re-boot computers
Global network of computers
• Global network of computers (GNC)
  – Includes geographically distributed computers
• Three main types of GNCs
  – Dedicated system for HPC
    » Consists of several interconnected homogeneous systems and/or HCs
      • Similar to HCs
  – Organizational network
  – General-purpose global network
GNC (ctd)
• Organizational GNC
  – Comprises the computer resources of some individual organization
  – Can be seen as a geographically extended LNC
  – Typically very well managed
  – Typically of a higher level of centralization, integration and uniformity than LNCs
• Organizational GNCs are similar to LNCs
GNC (ctd)
• General-purpose GNC
  – Consists of individual computers interconnected via the Internet
    » Each computer is managed independently
    » The most heterogeneous, irregular, loosely integrated and dynamic type of heterogeneous network
• Some other heterogeneous sets of interconnected processing devices (not for scientific computing)
  – Mobile telecommunication systems
  – Embedded control multiprocessor systems
Grid based systems
• Grid based systems
  – A single login to a group of resources
    » No need to separately log in to each of the resources at each session
  – Grid middleware
    » Keeps a list of available resources discovered and added in the past
    » Upon the user's login to the Grid based system
      • Detects which of the resources are available now
      • Logs in to the available resources on behalf of the user
  – Services on top of this = Grid operating environment
    » Different models (Globus, Unicore)
Uses of heterogeneous networks
• Heterogeneous networks are used
  – Traditionally
  – For parallel computing
  – For distributed computing
Uses of heterogeneous networks (ctd)
• Traditional use
  – The network is used as an extension of the user's computer
    » The computer may be serial or parallel
    » Applications are traditional
      • May be run on the user's computer
      • Data and code are provided by the user
    » Applications may be run on any relevant computer of the network
      • Scheduled by an operating environment
      • Aimed at better utilization of the computing resources (rather than at higher performance)
Uses of heterogeneous networks (ctd)
• Parallel computing
  – The network is used as a parallel computing system
    » The user provides a dedicated parallel application
      • The (source) code and data are provided by the user
      • The source code is sent to the computers, where it is locally compiled
      • All computers are supposed to have all libraries necessary to produce local executables
  – High performance is the main goal of this use
Uses of heterogeneous networks (ctd)
• Distributed computing
  – Some components of the application are not available on the user's computer
    » Some code may only be available on remote computers
      • Lack of resources to execute the code
      • Too much effort and/or resources needed compared to the frequency of its execution
      • Not available for installation
      • Meaningful only if executed remotely (e.g., an ATM)
Uses of heterogeneous networks (ctd)
• Some components of the input data cannot be provided by the user and reside on remote storage devices
  – The size of the data is too big for the disk storage of the user's computer
  – The input data are provided by a third party
    » Remote scientific device, remote database, remote application
• Both some code and some data are not available
Programming issues
• Programming issues
  – Performance
    » Primary for parallel programming
    » High priority for distributed programming
      • Primary for high performance distributed programming
  – Fault tolerance
    » Primary for distributed computing
    » Never used to be primary for parallel computing
      • Becoming primary for large scale parallel computing and heterogeneous parallel computing
Programming issues (ctd)
• Programming issues (ctd)
  – Arithmetic heterogeneity
    » Has never been an issue for homogeneous parallel systems
      • Arithmetic data types are uniformly represented
      • Transfer between processors does not change values
    » Different representations on different processors
    » Arithmetic values may change during transfer in heterogeneous communication networks
    » Not a serious issue in distributed computing
      • If the method is not ill-conditioned
Processor Heterogeneity
• Heterogeneity of processors
  – The processors run at different speeds
    » A good parallel program for MPPs evenly distributes the workload
    » Ported to a heterogeneous cluster, the program will align its performance with the slowest processor
    » A good parallel application for a NoC must distribute computations unevenly
      • Taking into account the difference in processor speed
      • Ideally, the volume of computation performed by a processor should be proportional to its speed
Processor Heterogeneity (ctd)
• Example: multiplication of two dense n×n matrices A and B on a p-processor heterogeneous cluster
  – Matrices A, B, and C are unevenly partitioned in one dimension
  – The area of the slice mapped to each processor is proportional to its speed
  – The slices mapped onto a single processor are shaded black in the slide figure
  – During execution, this processor requires all of matrix A (shaded gray)
  – A sketch of such a speed-proportional partitioning is given below
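To make the speed-proportional partitioning concrete, here is a minimal C sketch (not taken from the course materials) that splits the n columns of the matrices among p processors so that each processor's share is roughly proportional to its speed; the speed values, the function name partition, and the round-robin hand-out of leftover columns are illustrative assumptions only.

    /* Minimal sketch (illustrative, not from the course code):
     * give processor i a number of columns proportional to its speed s[i]. */
    #include <stdio.h>

    static void partition(int n, int p, const double s[], int cols[])
    {
        double total = 0.0;
        int assigned = 0;
        for (int i = 0; i < p; i++) total += s[i];
        for (int i = 0; i < p; i++) {
            cols[i] = (int)(n * s[i] / total);   /* proportional share, rounded down */
            assigned += cols[i];
        }
        /* hand out the remaining columns one by one so the shares sum to n */
        for (int i = 0; assigned < n; i = (i + 1) % p, assigned++)
            cols[i]++;
    }

    int main(void)
    {
        double s[] = {10.0, 14.0, 9.0};          /* hypothetical relative speeds */
        int cols[3];
        partition(500, 3, s, cols);
        for (int i = 0; i < 3; i++)
            printf("processor %d gets %d columns\n", i, cols[i]);
        return 0;
    }

With the hypothetical speeds 10, 14 and 9, the 500 columns are split as 152, 212 and 136, so faster processors receive proportionally larger slices.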
Processor Heterogeneity (ctd)
• The efficiency of any application that unevenly distributes computations over heterogeneous processors strongly depends on the accuracy of the estimation of the relative speed of the processors
  – If this estimation is not accurate, the load of the processors will be unbalanced
• Traditional approach
  – Run some test code once
  – Use the estimation when distributing computations (a timing sketch follows below)
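A minimal sketch of this traditional approach, under the assumption that a simple dense update loop is a representative test code: the kernel is run once, timed with the standard C clock() function, and the inverse of the measured time is taken as the processor's speed estimate. The kernel and the problem size are illustrative choices, not the actual benchmark used in the course.

    /* Minimal sketch (illustrative): time one run of a test kernel and use
     * 1/time as the processor's speed when distributing computations. */
    #include <stdio.h>
    #include <time.h>

    #define N 500
    static double a[N][N];                       /* test data, zero-initialized */

    static double time_test_kernel(void)
    {
        clock_t start = clock();
        for (int k = 0; k < N; k++)              /* a simple dense update stands in */
            for (int j = 0; j < N; j++)          /* for the application's real      */
                for (int i = 0; i < N; i++)      /* computations                    */
                    a[i][j] += a[i][k] * a[k][j];
        return (double)(clock() - start) / CLOCKS_PER_SEC;
    }

    int main(void)
    {
        double t = time_test_kernel();
        /* printing an element keeps the compiler from discarding the kernel */
        printf("checksum %g, time %.3f s, speed estimate %.3f runs/s\n",
               a[N - 1][N - 1], t, 1.0 / t);
        return 0;
    }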
Processor Heterogeneity (ctd)
• Two processors of the same architecture, differing only in clock rate
  – No problem to accurately estimate their relative speed
    » The relative speed will be the same for any application
• Processors of different architectures
  – Differ in everything
    » Set of instructions, number of IEUs, number of registers, memory hierarchy, size of each memory level, and so on
  – May demonstrate different relative speeds for different applications
Processor Heterogeneity (ctd)
• Example: consider two implementations of a 500×500 matrix Cholesky factorisation
  – The code

      for(k=0; k<500; k++) {
        for(i=k, lkk=sqrt(a[k][k]); i<500; i++)
          a[i][k] /= lkk;
        for(j=k+1; j<500; j++)
          for(i=j; i<500; i++)
            a[i][j] -= a[i][k]*a[j][k];
      }

    estimated the relative speed of a SPARCstation-5 and a SPARCstation-20 as 10:9
Processor Heterogeneity (ctd)
• Example (ctd)
  – The code

      for(k=0; k<500; k++) {
        for(i=k, lkk=sqrt(a[k][k]); i<500; i++)
          a[i][k] /= lkk;
        for(i=k+1; i<500; i++)
          for(j=i; j<500; j++)
            a[i][j] -= a[k][j]*a[k][i];
      }

    estimated their relative speed as 10:14
  – Routine dpotf2 from LAPACK, solving the same problem, estimated their relative speed as 10:10
Processor Heterogeneity (ctd)
• Multi-tasking is another complication
  – The real performance can change dynamically depending on external computations and communications
Processor Heterogeneity (ctd)
• A factor preventing accurate estimation of the relative speed
  – Different levels of integration into the network
    » The heavier the integration, the less predictable the performance characteristics
      • Constant, unpredictable fluctuations in the workload
  – A problem for general-purpose local and global networks
    » Dedicated clusters are much more predictable
Processor Heterogeneity (ctd)
• Memory heterogeneity
  – Computational tasks may not fit into the main memory
  – Different sizes of the main memory, different paging algorithms
    » The relative speed may not be constant; it may depend on the problem size
  – Estimations obtained in the absence of paging may be inaccurate when paging occurs
  – Some computers are configured to avoid paging
Communication Network
• Communication network interconnecting heterogeneous processors
  – Has a significant impact on the optimal distribution of computations over the processors
  – Can only be ignored if the contribution of communication operations to the total execution time is much smaller than that of the computations
Communication Network (ctd)
• Optimal distribution of computations taking account of communication links
  – Up to p² communication links in addition to the p processors
    » Increases the complexity of the problem even if the optimal distribution involves all available processors
  – If some links are slow (high latency and/or low bandwidth)
    » It may be profitable not to involve all available processors
    » The problem becomes extremely complex
      • All subsets of the set of processors should be tested
Communication Network (ctd)
• Taking account of communication operations
  – The relative speed of processors makes no sense
    » It cannot help to estimate the cost of computations
  – The processor speed has to be absolute
    » Volume of computations per unit of time
      • Given a volume of computations, the execution time of the computations can be calculated
  – Similarly, communication links have to be characterized by absolute units of performance (a sketch follows below)
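As an illustration of what absolute units can look like, the sketch below characterizes a processor by a speed in operations per second and a link by a latency and a bandwidth, so that computation time (volume/speed) and communication time (latency + message size/bandwidth) are both expressed in seconds and can be added and compared. The numerical values and the simple linear communication model are assumptions for illustration, not parameters given in the course.

    /* Minimal sketch (illustrative): predicted times from absolute performance units. */
    #include <stdio.h>

    /* time to perform `ops` operations on a processor of speed `s` (operations per second) */
    static double comp_time(double ops, double s) { return ops / s; }

    /* time to send `bytes` over a link with latency `lat` (s) and bandwidth `bw` (bytes/s) */
    static double comm_time(double bytes, double lat, double bw) { return lat + bytes / bw; }

    int main(void)
    {
        double ops   = 2.0 * 500 * 500 * 500;    /* e.g. a 500x500 matrix multiplication */
        double speed = 2.0e8;                    /* 200 Mflop/s, hypothetical             */
        double bytes = 500.0 * 500 * 8;          /* one slice of doubles                  */
        printf("computation: %.2f s, communication: %.4f s\n",
               comp_time(ops, speed),
               comm_time(bytes, 1.0e-4, 1.0e8)); /* 100 us latency, 100 MB/s, hypothetical */
        return 0;
    }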
Summary of performance issues
• Two basic, related issues to be addressed
  – Performance model of the HNOC
    » Quantifying the ability of the HNOC to perform computations and communications
  – Performance model of the application
    » Quantifying the computations and communications performed by the application
  – Given such models, the parameters of the application and the HNOC, and a mapping of the application to the HNOC → a prediction of the execution time
    » Incorporated into applications or programming systems, this may lead to better distribution and, hence, higher performance
Summary of performance issues (ctd)
• Performance model of the HNOC
  – PM of a set of heterogeneous processors
    » To estimate the execution time of computations
  – PM of the communication network
    » To predict the execution time of communication operations