1
QUEST Overview
QUEST Support Team
Academic and Research Technology (A&RT), NUIT
David Chen, Ph.D.
2
Quest Workshop Schedule
Other topics in preparation:
- Understanding parallel performance
- QUEST1 and QUEST2 performance comparison
- TotalView, performance counters
- Intel Hyper-Threading Technology, etc.
3
Agenda
What is Quest?
- Hardware layout
- Who is involved?
What's inside?
- Operating system
- Supported software
- Job scheduler
How to get access
- Who gets access, and how?
- How do allocations work?
- Important dates
What are we doing here?
- What is Quest?
- What can it do for me?
- How do I get access?
Where to go for more info/help?
- Quest website
- Quest announcement list
- MOTD (message of the day)
4
What is Quest?
Quest is Northwestern University's large-scale, shared, high performance computing system, providing University researchers with a clustered computing platform in support of research. The implementation of Quest reflects the University's multi-year high performance computing initiative to maintain Northwestern's reputation as a leading research institution.
- How much do I need to pay to use Quest?
- Can I add my hardware to Quest?
- Can I align my grant or proposal with Quest?
Access to Quest is allocation driven (more about this later!).
5
What Does Quest Look Like?
6
Quest Implementation
Phase I
- 3008 Nehalem cores (376 nodes): initial installation, July 2009
- 1024 Nehalem cores (128 nodes): upgrade, Feb. 2010, paid by specific user groups
- 8 cores per node, 48 GB memory per node, no scratch disks
- Interconnect is DDR InfiniBand
Phase II
- 3024 Westmere cores (252 nodes): installed Dec. 2010
- 12 cores per node, 48 GB memory per node
- 1 TB scratch disk in 126 nodes
- Interconnect is QDR InfiniBand
Total: 7056 cores (756 nodes)
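The totals above can be checked by summing the three installations. A quick sketch in Python; the variable names are illustrative only, and the figures are the ones listed on this slide:

```python
# Core and node counts per installation, as listed on the slide.
installations = [
    ("Phase I, initial (July 2009)", 3008, 376),
    ("Phase I, upgrade (Feb. 2010)", 1024, 128),
    ("Phase II (Dec. 2010)", 3024, 252),
]

total_cores = sum(cores for _, cores, _ in installations)
total_nodes = sum(nodes for _, _, nodes in installations)

for name, cores, nodes in installations:
    # Cores per node follows from dividing cores by nodes.
    print(f"{name}: {cores} cores / {nodes} nodes = {cores // nodes} cores per node")

print(f"Total: {total_cores} cores ({total_nodes} nodes)")
```

The per-node division gives 8 cores for both Phase I installations and 12 for Phase II, matching two quad-core and two 6-core sockets respectively.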
7
Northwestern University
[Figure: Quest system layout, showing users on the Northwestern University intranet, 4 login nodes, 2 mgmt nodes, 4 storage nodes (QUEST Storage), and the QUEST-1 and QUEST-2 compute partitions linked by Gigabit Ethernet and InfiniBand.]
8
Quest Compute Node Specifications
Partition quest1
- Node name: qnode
- Quantity: 504
- IBM product name: Dx360 M2
- Processor (Intel Xeon, 2 sockets/node): 2.26 GHz quad-core Nehalem E5520
- Memory (DDR3): 48 GB 1333 MHz (12 x 4 GB)
- Internal disk: none
Partition quest2
- Node name: qnode
- Quantity: 256
- IBM product name: Dx360 M3
- Processor (Intel Xeon, 2 sockets/node): 2.66 GHz 6-core Westmere X5650
- Memory (DDR3): 48 GB (6 x 8 GB)
- Internal disk: 1 TB 7200 RPM SATA (in 126 nodes)
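One practical consequence of these specifications is the memory available per core when a job fills a node. A small sketch, assuming the job uses every core on the node and that quest2 nodes carry 48 GB (6 x 8 GB); the dictionary layout and function name are illustrative:

```python
# Per-node figures taken from the specification table.
# quest2's 48 GB total is assumed from 6 x 8 GB DIMMs.
nodes = {
    "quest1": {"sockets": 2, "cores_per_socket": 4, "memory_gb": 12 * 4},
    "quest2": {"sockets": 2, "cores_per_socket": 6, "memory_gb": 6 * 8},
}

def memory_per_core_gb(spec):
    """Memory per core when all cores on a node are in use."""
    cores = spec["sockets"] * spec["cores_per_socket"]
    return spec["memory_gb"] / cores

for partition, spec in nodes.items():
    print(f"{partition}: {memory_per_core_gb(spec):.1f} GB per core")
```

Under these assumptions a fully packed quest1 node leaves 6 GB per core and a quest2 node leaves 4 GB per core.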
9
Quest OS
Operating system: Red Hat 5.3, stateless OS image
- We boot from remote storage and the image lives in memory.
- Software stack is uniform; nodes boot quickly.
- Hard disks are the #1 source of failure (memory is #2, power supply is #3).
- Less power, less heat, tastes great, less filling.
- Is there a downside?
10
Quest Software
What's already installed?
- Compilers/debuggers: gcc, Intel, TotalView (C, C++, Fortran)
- MPI: OpenMPI (What? No MPICH or Intel MPI?)
- Commonly used applications: Matlab, Mathematica, etc.
A more complete list of installed software is online.
What about my application software?
- We encourage users to install application software in their own directories. Why?
- A&RT will help.
11
Job Scheduler: What is it and why should I care?
A popular restaurant has many of the same challenges and solutions:
- Some people show up with lots of friends and stay for a long time.
- Some people are just there for a quick meal.
- Some people are part owners and have faster access to some tables.
- Everyone wants to eat right now!
12
Job Scheduler (a few details)
Moab is the scheduler
- Quest uses queues (and partitions, advanced reservations, etc.) to organize jobs according to:
  - number of cores a job uses
  - amount of wall-clock time the job is expected to take
  - other considerations as needed
- The more accurate the request, the better the scheduler can do its job.
Torque is the resource manager
- Torque keeps track of all resources, job requests, and running jobs.
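To make the core-count and wall-clock requests concrete, here is a minimal sketch that builds a Torque-style batch script as text. The `#PBS` directives are standard Torque options; the helper name `make_job_script`, the job name, the program `./my_program`, and the choice of the `short` queue are assumptions for illustration, not Quest's official template.

```python
def make_job_script(name, queue, nodes, cores_per_node, walltime, command, account=None):
    """Return the text of a Torque-style batch script (a sketch only)."""
    lines = [
        "#!/bin/bash",
        f"#PBS -N {name}",                              # job name
        f"#PBS -q {queue}",                             # target queue
        f"#PBS -l nodes={nodes}:ppn={cores_per_node}",  # nodes and cores per node
        f"#PBS -l walltime={walltime}",                 # expected wall-clock time
    ]
    if account is not None:
        lines.append(f"#PBS -A {account}")              # allocation to charge
    # Run from the directory the job was submitted from.
    lines += ["cd $PBS_O_WORKDIR", command]
    return "\n".join(lines) + "\n"

script = make_job_script(
    name="demo",
    queue="short",            # hypothetical queue choice
    nodes=2,
    cores_per_node=8,         # quest1 nodes have 8 cores
    walltime="01:00:00",
    command="mpirun -np 16 ./my_program",  # hypothetical program
)
print(script)
```

The two `-l` lines are where the accuracy of the request matters: the scheduler plans around the cores and wall-clock time stated there.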
13
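Batch Job Queues
Assignment: PIM
- qnode IDs: 0001 – 0002 (2) in partition quest1, 0505 (1) in partition quest2
- Queues assigned: none
- Accessibility: PIM only
Assignment: hyperthreading
- qnode IDs: 0003 – 0006 (4)
- Queues assigned: ht
- Accessibility: All users
Assignment: art
- qnode IDs: 0007 – 0012 (6), 0018 – 0020 (3)
- Queues assigned: admin
- Accessibility: Admin only
Assignment: users
- Nodes: 488 in partition quest1 (~132 nodes owned by buy-in groups), 255 in partition quest2 (~21 nodes owned by buy-in groups)
- Queues assigned: short, long, wide, normal, debug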
14
So, who can use Quest?
Any Northwestern University researcher or educator may request an allocation as a computational investigator (CI) for the purpose of research or education. The CI must first complete an online account application, which includes the following information:
- a project proposal, including a description of the intended work
- amount of CPU time needed during the allocation term
- amount of storage space to be shared by the project team, and
- some supplemental information to support the request
Allocations will cover 1-year terms (until April 2011).
Allocations will cover 6-month terms (after June 2011).
15
Available Allocations
16
Allocation Review Cycle
For Research and Special Allocations (quarter: period; submission deadline)
- Fall: October 1 - December 31; deadline September 1
- Winter: January 1 - March 31; deadline December 1
- Spring: April 1 - June 30; deadline March 1
- Summer: July 1 - September 30; deadline June 1
For All Other Allocations (type: review frequency; allocation size)
- Development: monthly; 25,000 hours
- Education: weekly; 10,000 hours
- Test: short notice; 2,500 hours
- Workshop: 1,000 hours
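When estimating the CPU time to request, a rough calculation helps in picking an allocation type. The sketch below assumes usage is counted as cores multiplied by wall-clock hours, which is a common convention but is not stated in these slides; the function names and the example workload are hypothetical.

```python
# Allocation sizes (in hours) from the table above.
ALLOCATION_HOURS = {
    "Development": 25_000,
    "Education": 10_000,
    "Test": 2_500,
    "Workshop": 1_000,
}

def core_hours(cores, wall_hours, runs=1):
    """Assumed charging model: cores x wall-clock hours x number of runs."""
    return cores * wall_hours * runs

def fits(allocation_type, cores, wall_hours, runs=1):
    """True if the estimated usage fits within the given allocation size."""
    return core_hours(cores, wall_hours, runs) <= ALLOCATION_HOURS[allocation_type]

# Hypothetical workload: 20 runs of a 64-core job, 12 hours each.
needed = core_hours(cores=64, wall_hours=12, runs=20)
print(needed)                            # 15360
print(fits("Development", 64, 12, 20))   # True
print(fits("Education", 64, 12, 20))     # False
```

Under that assumption, the example workload would fit a Development allocation but exceed an Education one.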
17
Where to Get Help?
Online instructions
Still stuck? The Quest support team (A&RT) can help with:
- Job submission problems
- System problems, software installations
- Fitting into the scheduler
- Performance issues with an application
- Access to licenses external to Quest
18
Questions?