Published by Bertold Günther. Modified over 6 years ago.
1
Grid Means Business
OGF-20, Manchester, May 2007
Scaling up to the Enterprise Level: Using 15,000 CPUs 66% of the Time
Micron Technology, Inc.
Brooklin J. Gore, Senior Fellow, Advanced Computing
2004 Micron Technology, Inc. All rights reserved. Information is subject to change without notice.
2
Using 15,000 CPUs 66% of the Time Agenda
Micron Overview
Micron Grid Overview
Grid Application Overview
System Management
Best Practices
Conclusions
11/8/2018
3
Micron Overview
[Figure: product overview. Recoverable labels: "…move…", "Capture…", "…Store"]
4
Micron Overview
5
Micron Grid Overview
[Figure: grid locations. Labels: Idaho-3, Idaho-2, Virginia, Idaho-1, Italy, Japan, Utah, Singapore-1, Singapore-2]
6
Micron Grid Overview
14,077 Processors, TFlops, 63rd Top500 Rank
529 TeraBytes Disk
102 user accounts in all pools
1,281,636 job hours, 1780 Processor-months
Primarily Windows, plus Linux, some Solaris
Condor system managed in-house
Centralized governance, distributed management
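The job-hour and processor-month figures on this slide agree if a processor-month is taken to be 30 days of 24 hours (720 processor-hours). The slide does not define the unit, so that convention is an assumption. A quick check in Python:

```python
# Assumption: one processor-month = 30 days x 24 hours = 720 processor-hours.
# The slide does not state which convention was used.
HOURS_PER_PROCESSOR_MONTH = 30 * 24


def processor_months(job_hours: float) -> float:
    """Convert accumulated job hours into processor-months."""
    return job_hours / HOURS_PER_PROCESSOR_MONTH


job_hours = 1_281_636  # figure quoted on the slide
print(f"{processor_months(job_hours):.0f} processor-months")  # prints "1780 processor-months"
```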
7
Micron Grid Overview
8
Grid-Enabled Application Processing Models
Push: Data processors are pushed to dynamically allocated grid resources
Pull: Data is pulled to processors on pre-allocated grid resources
Portable: Processors may be dynamically deployed on resources far away from source data
9
The Push Processing Model
Good for ‘long running’ jobs: long run times mitigate the ~30s grid process activation delay
One unit of work associated with one grid job
Grid resources dynamically allocated as needed to run jobs
Data (pointer) is pushed along with job to grid resource
[Figure: work items A1, A2, … AL arrive in a Work Queue (Work In, Work Out). "There is work to do (schedule job to do work)" goes to the Grid Scheduler, which allocates each job to an available resource among Grid Resources P1, P2, P3, P4, … Px.]
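The push model can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Micron's implementation: `GridJob`, `make_jobs`, `activation_overhead`, and the data pointers are hypothetical names, and only the ~30 s activation delay comes from the slide. It shows the one-job-per-work-unit mapping and why the delay only matters for short jobs.

```python
from dataclasses import dataclass

ACTIVATION_DELAY_S = 30.0  # approximate figure from the slide


@dataclass
class GridJob:
    work_id: str
    data_pointer: str  # a reference to the data, not the data itself


def make_jobs(work_queue):
    """Push model: wrap each unit of work in exactly one grid job."""
    return [GridJob(work_id, pointer) for work_id, pointer in work_queue]


def activation_overhead(run_time_s: float) -> float:
    """Fraction of a job's wall-clock time spent on activation."""
    return ACTIVATION_DELAY_S / (ACTIVATION_DELAY_S + run_time_s)


# Hypothetical work items and data pointers, for illustration only.
work_queue = [("A1", "//store/a1"), ("A2", "//store/a2")]
jobs = make_jobs(work_queue)
print(len(jobs))                            # one grid job per work unit
print(f"{activation_overhead(3600):.1%}")   # one-hour job
print(f"{activation_overhead(30):.1%}")     # 30-second job
```

For a one-hour job the activation delay is under 1% of wall-clock time; for a 30-second job it is half of it, which is the case the pull model addresses.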
10
The Pull Processing Model
Good for many very short jobs: avoids the ~30s grid process activation delay
Work queue size not equal to grid job queue size
Work processors are pre-provisioned onto grid resources
Work processors pull work from queue; … is classic example
[Figure: work items A1, A2, … AL in a Work Queue (Work In, Work Out) with levels marked High Water (Add Processor) and Low (Remove Processor). The Grid Scheduler (un-)provisions processors on (from) Grid Resources P1, P2, P3, P4, … Px.]
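A minimal Python sketch of the pull model follows, assuming a simple one-step scaling rule. The function names, thresholds, and the one-processor-per-pass policy are hypothetical; the slide only says that a high-water mark adds a processor and a low mark removes one.

```python
from collections import deque


def desired_processors(queue_len, current, high_water, low_water, max_processors):
    """Return the processor count after one scheduler pass.

    Add a processor when the queue is above the high-water mark,
    remove one when it is below the low-water mark, otherwise hold steady.
    """
    if queue_len > high_water and current < max_processors:
        return current + 1
    if queue_len < low_water and current > 0:
        return current - 1
    return current


def run_processor(queue, handle):
    """A pre-provisioned processor drains work items without a per-item grid job."""
    results = []
    while queue:
        results.append(handle(queue.popleft()))
    return results


queue = deque(range(10))
n = desired_processors(len(queue), current=1, high_water=8, low_water=2, max_processors=4)
print(n)                                            # queue is above high water, so 2
print(run_processor(queue, lambda x: x * x)[:3])    # [0, 1, 4]
```

Because the processors are already running, each work item costs a queue read rather than a grid job activation, so the work queue length and the grid job count are decoupled.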
11
The Portable Processing Model
Portable Grid applications:
  Low data in/out
  Compute bound
So, follow-the-moon:
  Direct jobs to sites where the workers aren’t
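One way to picture follow-the-moon scheduling is to pick the sites where it is currently night. The Python sketch below is illustrative only: the UTC offsets are fixed approximations that ignore daylight saving, the 19:00 to 07:00 window is an assumption, and the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Assumption: fixed UTC offsets (no daylight saving) for some of the sites
# named earlier in the deck; the values are illustrative only.
SITE_UTC_OFFSET_HOURS = {
    "Idaho": -7,
    "Virginia": -5,
    "Italy": 1,
    "Japan": 9,
    "Singapore": 8,
}


def is_night(local_hour: int, start: int = 19, end: int = 7) -> bool:
    """True when local_hour falls in the overnight window [start, end)."""
    return local_hour >= start or local_hour < end


def sites_after_hours(now_utc: datetime) -> list[str]:
    """Return the sites where it is currently night, i.e. the workers are away."""
    result = []
    for site, offset in SITE_UTC_OFFSET_HOURS.items():
        local = now_utc + timedelta(hours=offset)
        if is_night(local.hour):
            result.append(site)
    return result


now = datetime(2007, 5, 9, 15, 0, tzinfo=timezone.utc)
print(sites_after_hours(now))  # ['Japan', 'Singapore']
```

This only works for jobs that match the slide's criteria: compute bound with little data in or out, so moving the processor far from the source data is cheap.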
12
Grid Applications Overview
Don’t believe in one killer grid application…
…but many general purpose grid applications:
  Manufacturing applications (widget processing)
  Engineering applications (repetitive tasks)
  Reporting applications (chart generation)
  Data mining (log file processing)
  Software development (build, test, package, deploy)
  Security (proactive port scanning)
  Grid-enabled script engines (MATLAB, JMP, R, etc.)
13
Grid System Management
Software deployment and upgrades
  Unix systems use common system image on shared file system
  Windows systems use Altiris to deploy/install/upgrade software on local file system
Grid host configuration
  Three-tier configuration files: global, pool-wide, host
  Unix systems utilize shared file system
  Windows ‘cron job’ checks for updates to central files and copies to local system
  Today file-based, want web-based
Compute host and job management
  Global web interface
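The three-tier configuration can be pictured as layered dictionaries. The Python sketch below assumes that more specific tiers override more general ones (host over pool-wide over global); the slide does not state the precedence, and the setting names are hypothetical.

```python
def merge_config(global_cfg: dict, pool_cfg: dict, host_cfg: dict) -> dict:
    """Layer the three tiers; later (more specific) tiers override earlier ones.

    Assumption: host overrides pool-wide, which overrides global.
    """
    merged = dict(global_cfg)
    merged.update(pool_cfg)
    merged.update(host_cfg)
    return merged


# Hypothetical setting names, for illustration only.
global_cfg = {"idle_minutes_before_start": 15, "max_jobs": 1}
pool_cfg = {"max_jobs": 2}
host_cfg = {"idle_minutes_before_start": 5}

print(merge_config(global_cfg, pool_cfg, host_cfg))
# {'idle_minutes_before_start': 5, 'max_jobs': 2}
```

Keeping most settings in the global tier and overriding only where needed fits the deck's theme of centralized governance with distributed management.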
14
Grid System Management
15
Grid System Management
The tricky parts
  Configuring hosts so grid jobs don’t impact users
    Run jobs when machine user is idle
    Evict jobs when user comes back
    (Don’t worry about CPU: let the OS do it)
  Optimizing the grid application data ‘chunk’ size
    Too big: hard to checkpoint, tough on network
    Too small: file overhead is high, use messages
    (Best to be configurable and dial in with experience)
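A configurable chunk size can be as simple as a parameter on the function that splits the work. The Python sketch below is a generic illustration; `chunked` and the sample sizes are hypothetical, not taken from the deck.

```python
from typing import Iterator, Sequence


def chunked(items: Sequence, chunk_size: int) -> Iterator[Sequence]:
    """Yield consecutive slices of at most chunk_size items."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(items), chunk_size):
        yield items[start:start + chunk_size]


work = list(range(10))
for size in (3, 5):
    chunks = list(chunked(work, size))
    # chunk size, number of chunks, contents of the last chunk
    print(size, len(chunks), chunks[-1])
```

Exposing `chunk_size` as a setting rather than a constant lets operators tune it per application as experience accumulates, which is the slide's recommendation.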
16
Governance Best Practices
Infrastructure
  Fast, fat networks
  Centralized (fast) data stores
  Common system images
  Fast (fat) desktops
[Figure: “Edge push” Effect. Labels: master (traditional client), job dispatch, work units, workers, result set, database server]
17
Governance Best Practices
People
  Articulate the value proposition
  Focus on low-hanging fruit
  Integrate new grid processes with existing system management processes
  Educate, educate, educate
Technology is Easy, People are Hard
18
Governance Best Practices
Grid Management
  Centralize
    Grid Center of Excellence
    Global standards
    Grid tools development
  Distribute
    Grid resource management
    Application support
  Align
    Pools with Identity/Data Domains
19
Conclusion
Large scale grid computing on shared desktop systems in the enterprise is doable today and is…
  Not that difficult (from a technology perspective)
  Not that expensive (from a people and money perspective)
  Practical (from a grid-applications perspective)
20
Grid Means Business
Scaling up to the Enterprise
Brooklin J. Gore, Senior Fellow, Advanced Computing
Open Grid Forum 20, Manchester, UK, May 2007
Micron and the Micron logo are trademarks and/or service marks of Micron Technology, Inc. All other trademarks are the property of their respective owners.
21
Abstract
This presentation outlines the goal of a global company to increase the utilization of its computing infrastructure to 66%. We show how a grid computing system is used to combine the computing resources of almost 15,000 CPUs at 6 sites on three continents. The presentation discusses how the grid software is deployed to the machines, most of which are knowledge-worker desktops running Microsoft Windows, and how each machine is configured and managed. The presentation also covers key best practices and touches on the breadth of applications that are running on the system.