Yeti Operations INTRODUCTION AND DAY 1 SETTINGS. Rob Lane HPC Support Research Computing Services CUIT


1 Yeti Operations INTRODUCTION AND DAY 1 SETTINGS

2 Rob Lane HPC Support Research Computing Services CUIT hpc-support@columbia.edu

3 Topics
1. Yeti Operations Committee
2. Introduction to Yeti
3. Rules of Operation

4 1. Yeti Operations Committee
Determines cluster policy
In the process of being set up
In the meantime we need a policy for day 1 of operations

5 2. Introduction to Yeti

6 Final Node Count
Node Type               Number of Nodes
Standard (64 GB)        38
Intermediate (128 GB)   8
High Memory (256 GB)    35
Infiniband              16
GPU                     4
Total                   101


8 Meet Your New Neighbors
Group
afsis
astro
ccls
eeeng
journ
ocp
psych
sscc
stats
xenon

9 Group Shares
Group    Share %
afsis    2.12
astro    6.36
ccls     19.43
eeeng    2.12
journ    2.12
ocp      10.60
psych    2.12
sscc     19.08
stats    33.92
xenon    2.12

10 Other Groups
Renters
Free Tier
CUIT

11 Rules of Operation
1. Job Priority
2. Job Characteristics
3. Queues
4. Guaranteed Access

12 Job Priority
Every job waiting to run is assigned a priority by the scheduling software.
The priority determines the order of jobs waiting in the queue.

13 Job Priority Components
Group’s share vs. recent usage
User’s recent usage
Other factors

14 Recent Usage
What does “recent” mean?
It’s configurable
Yeti’s setting: 7 days
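The slides name the priority components but not the formula the scheduler uses. The sketch below is one possible reading, not Yeti's actual configuration: usage is summed over the 7-day window, a group is boosted when its share exceeds its recent usage, and a user's own recent usage counts against them. The function names, the linear form, and both weights are assumptions made for illustration, and the "other factors" are left out.

```python
from datetime import datetime, timedelta

# From the slides: Yeti treats the last 7 days as "recent".
RECENT_WINDOW = timedelta(days=7)


def recent_usage(records, now, window=RECENT_WINDOW):
    """Sum core-hours for usage records that ended inside the window.

    records: iterable of (end_time, core_hours) pairs (hypothetical format).
    """
    cutoff = now - window
    return sum(hours for end_time, hours in records if end_time >= cutoff)


def priority(group_share_pct, group_recent, user_recent, total_recent,
             group_weight=1.0, user_weight=0.5):
    """Hypothetical linear priority score; higher runs sooner.

    group_share_pct: the group's share from the Group Shares table.
    group_recent, user_recent, total_recent: core-hours used in the window.
    group_weight, user_weight: assumed weights, not Yeti settings.
    """
    if total_recent == 0:
        # Nothing has run recently, so only the share matters.
        return group_weight * group_share_pct
    group_used_pct = 100.0 * group_recent / total_recent
    user_used_pct = 100.0 * user_recent / total_recent
    # Under-used groups gain priority; heavy recent users lose some.
    return (group_weight * (group_share_pct - group_used_pct)
            - user_weight * user_used_pct)
```

With identical recent usage, a job from a group with a larger share (for example stats at 33.92%) scores higher under this sketch than one from a group with a smaller share (astro at 6.36%).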

15 Job Characteristics
Nodes and cores
Time
Memory

16 Job Queues (subject to change)
Queue         Time Limit   Memory Limit   Max. User Run
Batch 1       12 hours     4 GB           512
Batch 2       12 hours     16 GB          128
Batch 3       5 days       16 GB          64
Batch 4       3 days       None           8
Interactive   4 hours      None           4
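To see how a job's requested time and memory line up with these limits, here is a small sketch that lists the queues whose limits a request fits within. The limits come from the table above; the function name and the idea of checking eligibility this way are illustrative assumptions, since the slides do not say how the scheduler routes jobs to queues.

```python
# (name, time limit in hours, memory limit in GB or None, max. user run)
QUEUES = [
    ("Batch 1", 12, 4, 512),
    ("Batch 2", 12, 16, 128),
    ("Batch 3", 5 * 24, 16, 64),
    ("Batch 4", 3 * 24, None, 8),
    ("Interactive", 4, None, 4),
]


def eligible_queues(walltime_hours, memory_gb, interactive=False):
    """Return names of queues whose time and memory limits fit the request."""
    names = []
    for name, time_limit, mem_limit, _max_user_run in QUEUES:
        # Batch requests only consider batch queues, and vice versa.
        if (name == "Interactive") != interactive:
            continue
        if walltime_hours > time_limit:
            continue
        # A memory limit of None means the queue has no memory cap.
        if mem_limit is not None and memory_gb > mem_limit:
            continue
        names.append(name)
    return names
```

For example, a 48-hour job needing 64 GB fits only Batch 4, while a 100-hour job needing 64 GB fits no queue in the table as listed.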

17 Guaranteed Access
New mechanism
Subject to review by Yeti Operations Committee
We’re going to try it out in the meantime

18 Guaranteed Access
Groups have each been assigned systems.
Group jobs get priority access to their own systems.
“Guaranteed Access” means there will be a known maximum wait time before your job starts running.

19 Guaranteed Access Example
The group astro owns the node Brussels.
Only two types of jobs will be allowed on Brussels:
1. Astro jobs
2. Short jobs
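The Brussels rule can be written as a one-line check. This is a minimal sketch: the function name is made up, and the cutoff for a "short" job is passed in as a parameter because the slides do not state it.

```python
def may_run_on(node_owner, job_group, job_hours, short_limit_hours):
    """Return True if a job may run on a node owned by node_owner.

    short_limit_hours is a hypothetical parameter: the slides do not say
    how long a "short" job may be.
    """
    owned_by_group = job_group == node_owner
    is_short = job_hours <= short_limit_hours
    return owned_by_group or is_short
```

Assuming, purely for illustration, a 12-hour cutoff, a 100-hour astro job and a 4-hour stats job would both be allowed on Brussels, while a 100-hour stats job would not. Under that reading, an outside job occupies an owned node for at most the short-job limit, which is presumably where the known maximum wait time comes from.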

20 Job Queues (subject to change)
Queue         Time Limit   Memory Limit   Max. User Run
Batch 1       12 hours     4 GB           512
Batch 2       12 hours     16 GB          128
Batch 3       5 days       16 GB          64
Batch 4       3 days       None           8
Interactive   4 hours      None           4

21 Guaranteed Access Debate
Good because researchers have guaranteed access rights to nodes
Bad because long jobs lose access to many nodes

22 Thanks!
Comments and Questions?
hpc-support@columbia.edu

