Presentation transcript: Metrics (Computer Science, University of Warwick)

Slide 1: Metrics

- FLOPS (FLoating point Operations Per Second): a measure of the numerical processing power of a CPU, which can be an indicator of its scientific computing capability.
- The floating-point format is a variation of scientific notation: a real number is represented using a mantissa, a base, and an exponent.
- Storing real numbers in computers:
  - A fixed-length word (e.g. 64 bits) is used as the storage space for a real number.
  - The mantissa is normalised (1.61 is normalised, 16.1 is not).
  - The mantissa and exponent are converted to base 2.
  - One bit stores the sign, some bits store the mantissa, and the rest store the exponent.
- Advantages and disadvantages:
  - A fixed-length space can store a wide overall range of values.
  - If 64 bits are used to store a real number, with 11 bits for the exponent, 52 bits for the mantissa, and the remaining 1 bit for the sign, we can derive the range of numbers this storage layout can represent.
  - More bits for the mantissa give higher precision but a smaller range.
  - More bits for the exponent give a wider range but lower precision.
  - The difference between two successive representable numbers is not uniform.
  - Numbers that cannot be converted exactly to base 2 must be rounded to fit the format, which leads to cases where algebraic rules do not appear to apply.
- The LINPACK benchmark produces a FLOPS result. It solves a dense system of linear equations by Gaussian elimination.
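The properties listed on this slide can be observed directly on a 64-bit float. The following is a minimal sketch, assuming Python 3.9 or later (for `math.ulp`); the helper name `decompose_double` is hypothetical and not part of the original slides.

```python
import math
import struct


def decompose_double(x: float) -> tuple[int, int, int]:
    """Split a 64-bit float into its raw (sign, exponent, mantissa) bit fields."""
    bits = struct.unpack(">Q", struct.pack(">d", x))[0]
    sign = bits >> 63                      # 1 bit
    exponent = (bits >> 52) & 0x7FF        # 11 bits, stored with a bias of 1023
    mantissa = bits & ((1 << 52) - 1)      # 52 bits, the leading 1 is implicit
    return sign, exponent, mantissa


print(decompose_double(1.0))  # (0, 1023, 0)

# The gap between successive representable numbers grows with magnitude.
print(math.ulp(1.0))   # 2.220446049250313e-16
print(math.ulp(1e16))  # 2.0

# Rounding to base 2 makes familiar algebraic rules appear to fail.
print(0.1 + 0.2 == 0.3)                          # False
print((0.1 + 0.2) + 0.3 == 0.1 + (0.2 + 0.3))    # False
```

The last two lines show why floating-point results should usually be compared with a tolerance (for example `math.isclose`) rather than with `==`.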

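Slide 2: Example of Floating Point Numbers

172.625                 (base 10)
10101100.101 x 2^0      (base 2)
1.0101100101 x 2^7      (base 2, normalised)

Using 32 bits (4 bytes) to store the number, with 1 bit for the sign, 8 bits for the exponent, and the remaining 23 bits for the mantissa:

0   00000111   00000000000010101100101
S   Exp        Mantissa

This layout is a simplified illustration: it stores the exponent 7 unbiased and the mantissa digits right-aligned. IEEE 754 single precision instead stores a biased exponent (7 + 127) and only the fraction bits after the implicit leading 1.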
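The conversion above can be reproduced in a few lines. This is a sketch only: `to_binary` and `float32_fields` are hypothetical helper names, and the second function shows the IEEE 754 single-precision encoding used by real hardware, which differs from the slide's simplified layout (biased exponent, implicit leading 1).

```python
import struct


def to_binary(x: float, frac_bits: int = 8) -> str:
    """Binary expansion of a non-negative number; the fraction is cut at frac_bits."""
    whole = int(x)
    frac = x - whole
    digits = []
    for _ in range(frac_bits):
        if frac == 0:
            break
        frac *= 2                          # shift the next binary digit in front of the point
        digits.append("1" if frac >= 1 else "0")
        frac -= int(frac)
    return f"{whole:b}." + ("".join(digits) or "0")


def float32_fields(x: float) -> tuple[str, str, str]:
    """Return the (sign, exponent, fraction) bit strings of x as an IEEE 754 float32."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    s = f"{bits:032b}"
    return s[0], s[1:9], s[9:]


print(to_binary(172.625))        # 10101100.101
print(float32_fields(172.625))   # ('0', '10000110', '01011001010000000000000')
```

The exponent field `10000110` is 134, that is 7 plus the bias of 127, and the fraction field holds `0101100101` followed by zero padding.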

Slide 3: Metrics

- MIPS (Millions of Instructions Per Second): a measure of the speed of a processor.
  - Peak MIPS rates (usually vendor supplied) can be misleading.
  - Jokingly expanded as "Meaningless Information on Performance for Salespeople".
  - People seldom refer to it.

Slide 4: Metrics

- SPECint: measures a processor's integer processing capabilities.
  - Latest version: SPECint2006.
  - Can test the CPU, memory, and compiler, but cannot test networking or I/O.
  - Consists of a series of 12 benchmarks (including compression and compilation).
  - Each benchmark has a reference time.
  - Dividing the reference time by the measured runtime of the benchmark and multiplying by 100 gives a base ratio.
  - For example, suppose we run the benchmark 401.bzip2, whose reference time is 1400 sec, and the actual runtime is 140 sec. The base ratio is then 1400 / 140 * 100 = 1000.
  - The ratios are averaged (SPEC uses the geometric mean) to produce a final performance figure for the processor.
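The scoring arithmetic on this slide can be sketched as follows. The function names are hypothetical, the list of ratios passed to `overall_score` is made-up illustration data rather than measured results, and the use of a geometric mean reflects how SPEC combines ratios rather than anything stated on the slide.

```python
from math import prod


def base_ratio(reference_time: float, measured_time: float) -> float:
    """Ratio of reference time to measured runtime, scaled by 100 as on the slide."""
    return reference_time / measured_time * 100


def overall_score(ratios: list[float]) -> float:
    """Geometric mean of the per-benchmark ratios."""
    return prod(ratios) ** (1 / len(ratios))


# The slide's example: reference time 1400 sec, measured runtime 140 sec.
print(base_ratio(1400, 140))  # 1000.0

# Hypothetical ratios for three benchmarks, for illustration only.
print(overall_score([1000.0, 800.0, 1250.0]))
```

A geometric mean is used so that a large ratio on one benchmark cannot dominate the final figure the way it would in an arithmetic mean.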

Slide 5: SPECint2006 benchmark suite

Benchmark        Language   Category
400.perlbench    C          Programming Language
401.bzip2        C          Compression
403.gcc          C          C Compiler
429.mcf          C          Combinatorial Optimization
445.gobmk        C          Artificial Intelligence
456.hmmer        C          Search Gene Sequence
458.sjeng        C          Artificial Intelligence
462.libquantum   C          Physics / Quantum Computing
464.h264ref      C          Video Compression
471.omnetpp      C++        Discrete Event Simulation
473.astar        C++        Path-finding Algorithms
483.xalancbmk    C++        XML Processing

Slide 6: Metrics

Communication:
- Bandwidth (bytes/sec): how much data can be sent per second over the network.
- Latency (seconds): the time between one processor sending a message and the other processor receiving it.
- Interconnection type: on-board interconnection or over networks.
- Topologies: bus, crossbar, hub, switch.
- Protocols: stacks; unicast, multicast, broadcast.

Storage capabilities:
- Storage facilities: register, cache, memory, hard disk.
- Bandwidth and latency:
  - Bandwidth: how much data can be accessed per second in a given storage facility.
  - Latency: the time between sending a data access request and receiving the requested data.
- Memory hierarchies (CPU register -> cache -> main memory -> remote memory).
- Local and remote file systems.
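Bandwidth and latency are often combined into a simple first-order cost model: transfer time is roughly latency plus size divided by bandwidth. That model is an assumption added here for illustration and does not appear on the slide. The function names and the example link parameters (1 microsecond latency, 1 GB/s bandwidth) are hypothetical.

```python
def transfer_time(size_bytes: float, latency_s: float, bandwidth_bytes_per_s: float) -> float:
    """First-order estimate: fixed start-up latency plus time to stream the payload."""
    return latency_s + size_bytes / bandwidth_bytes_per_s


def effective_bandwidth(size_bytes: float, latency_s: float, bandwidth_bytes_per_s: float) -> float:
    """Bytes per second actually achieved once latency is included."""
    return size_bytes / transfer_time(size_bytes, latency_s, bandwidth_bytes_per_s)


# Hypothetical link: 1 microsecond latency, 1 GB/s peak bandwidth.
LATENCY_S = 1e-6
BANDWIDTH = 1e9

for size in (8, 1_000, 1_000_000):
    achieved = effective_bandwidth(size, LATENCY_S, BANDWIDTH)
    print(f"{size:>9} bytes: {achieved / BANDWIDTH:.1%} of peak bandwidth")
```

Under this model small messages are dominated by latency and large messages by bandwidth, which is why both figures are quoted for a network or storage device.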

Slide 7: Top500 Supercomputer list

- Website: www.top500.org
- The Top500 project started in 1993 and is updated twice a year.
- Aims to track trends in HPC.
- Uses LINPACK to measure performance (FLOPS).
- Essentially, LINPACK solves a dense system of linear equations Ax = b (commonly encountered in engineering).
- Users are allowed to change the problem size to get the maximum performance, which is used to rank the supercomputers.
- Theoretical peak performance is also given for reference.
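To make the measurement concrete, here is a toy LINPACK-style sketch in pure Python: it solves a random dense system Ax = b by Gaussian elimination with partial pivoting and converts the elapsed time into a FLOPS figure. It is not the real benchmark code. The function names are hypothetical, and the operation count 2n^3/3 + 2n^2 is the commonly quoted estimate for this algorithm, used here as an assumption.

```python
import random
import time


def solve(a: list[list[float]], b: list[float]) -> list[float]:
    """Solve Ax = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    a = [row[:] for row in a]   # work on copies so the inputs are not modified
    b = b[:]
    for k in range(n):
        # Pick the row with the largest entry in column k as the pivot row.
        p = max(range(k, n), key=lambda i: abs(a[i][k]))
        a[k], a[p] = a[p], a[k]
        b[k], b[p] = b[p], b[k]
        for i in range(k + 1, n):
            m = a[i][k] / a[k][k]
            for j in range(k, n):
                a[i][j] -= m * a[k][j]
            b[i] -= m * b[k]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(a[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (b[i] - s) / a[i][i]
    return x


def measure_flops(n: int, seed: int = 0) -> float:
    """Time one solve of a random n x n system and return estimated FLOPS."""
    rng = random.Random(seed)
    a = [[rng.random() for _ in range(n)] for _ in range(n)]
    b = [rng.random() for _ in range(n)]
    start = time.perf_counter()
    solve(a, b)
    elapsed = time.perf_counter() - start
    operations = 2 * n**3 / 3 + 2 * n**2   # assumed operation count
    return operations / elapsed


print(f"{measure_flops(100) / 1e6:.1f} MFLOPS")
```

Rerunning `measure_flops` with different values of `n` mirrors the rule that the problem size may be tuned to obtain the best achieved rate. Interpreted Python will report a tiny fraction of what optimised numerical libraries achieve on the same hardware.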

Slide 8: Top500 Supercomputer list

- Tends to represent parallel computers, so distributed systems such as SETI@Home are neglected.
- Does not consider storage or I/O issues.
- Both custom-designed machines and commodity machines win positions in the list.
- General trend towards commodity machines (COTS: Commodity Off-The-Shelf). BlueGene/L, however, is not a COTS machine.
- Connecting a large number of machines with relatively lower performance is more rewarding than connecting a small number of machines each with high performance.
- Read the paper: "A note on the Zipf distribution of Top500 supercomputers" (download from my homepage).
- Performance doubles each year, better than Moore's Law.
  - Moore's Law: performance doubles approximately every 18 months.
- Dominated by the United States (location map of the Top100 machines: http://www.top500.org/lists/2006/11/top100map).
- UK supercomputers in the list:
  - Cambridge: No. 20 (http://www.top500.org/system/8267)
  - AWE: No. 15
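The gap between the two doubling rates quoted on this slide compounds quickly. A small sketch, using only the two doubling periods stated above (12 months and 18 months); the function name is hypothetical.

```python
def growth_factor(years: float, doubling_period_years: float) -> float:
    """Performance multiple after `years` if performance doubles every period."""
    return 2 ** (years / doubling_period_years)


for years in (3, 6, 9):
    top500 = growth_factor(years, 1.0)   # doubling every year
    moore = growth_factor(years, 1.5)    # doubling every 18 months
    print(years, top500, moore)
# 3 8.0 4.0
# 6 64.0 16.0
# 9 512.0 64.0
```

After nine years the yearly doubling rate is ahead by a factor of eight.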

Slide 9: Top Machine: BlueGene/L

- First supercomputer in the Blue Gene project.
- Specialised system based on the Power architecture.
  - Individual PowerPC 440 processors at 700 MHz.
  - Two processors reside in a single chip.
  - Two chips reside on a "compute card" with 512 MB memory.
  - 16 of these compute cards are placed on a node board.
  - 32 node boards fit into one cabinet, and there are 64 cabinets.
  - 131,072 CPUs (2 x 2 x 16 x 32 x 64) with a theoretical peak of 183.5 TFLOPS.
  - Multiple network topologies are available, which can be selected depending on the application.
- High density of processors in a small area:
  - Low-power and (comparatively) slow processors, just lots of them!
  - Fast, low-latency interconnects.
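Multiplying out the packaging hierarchy described on this slide gives the machine totals. The sketch below uses only the per-level counts and the 183.5 TFLOPS peak quoted above; the constant and function names are hypothetical.

```python
PROCESSORS_PER_CHIP = 2
CHIPS_PER_COMPUTE_CARD = 2
COMPUTE_CARDS_PER_NODE_BOARD = 16
NODE_BOARDS_PER_CABINET = 32
CABINETS = 64
MEMORY_PER_COMPUTE_CARD_MB = 512
PEAK_FLOPS = 183.5e12  # theoretical peak quoted on the slide


def total_compute_cards() -> int:
    return COMPUTE_CARDS_PER_NODE_BOARD * NODE_BOARDS_PER_CABINET * CABINETS


def total_processors() -> int:
    return PROCESSORS_PER_CHIP * CHIPS_PER_COMPUTE_CARD * total_compute_cards()


def total_memory_gb() -> float:
    return total_compute_cards() * MEMORY_PER_COMPUTE_CARD_MB / 1024


print(total_processors())   # 131072
print(total_memory_gb())    # 16384.0

# Peak rate per processor implied by the slide's figures.
print(f"{PEAK_FLOPS / total_processors() / 1e9:.2f} GFLOPS per processor")  # 1.40 GFLOPS per processor
```

The implied 1.4 GFLOPS per processor at 700 MHz corresponds to two floating-point operations per clock cycle, which fits the slide's point that the machine relies on many modest processors rather than a few fast ones.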

