Extreme Scalability RAT Report
Sergiu Sanielevici

Mission and Membership
Mission: Recommend how TeraGrid should meet the challenges faced by our users and ourselves in the productive utilization of Track 2 Systems as integrated into the TeraGrid.
RAT chartered 6/28
Members volunteered from all RPs: Alameda, Brown, Dennis, Gaither, Lathrop, Lynch, Majumdar, Milfeld, Nystrom, Sanielevici, Sheppard, Whitson.
Deliverables:
–Whitepaper describing challenges and proposed paths to solution
–Draft charter of a new working group to deal with these challenges

Whitepaper Recommendations: Technical Challenge Areas
–Designing applications for scaling and robustness
–Coding for performance on multi-core systems
–Coding for performance on specific T2 architectures
–Tools for debugging applications at scale
–Tools for optimizing applications at scale
–Work and data flows for extracting knowledge from petascale simulations
TG to deal with these Challenges by building the infrastructure proposed in the following slides:

Whitepaper Recommendations: Infrastructure (1)
TG to charter a new Extreme Scalability Working Group (XSWG)
XSWG to spearhead creation of an Extreme Scalability R&D Grid (XSG):
–Suitable TG machines including Track-2 access
–Consistent R&D environment, including XS Kit in CTSS
–Meta-scheduling system and policies
–Documentation

Whitepaper Recommendations: Infrastructure (2)
XSWG to draft, then help to implement, a new Extreme Scalability Allocations policy for granting access to the XSG
–SU amount at today’s MRAC level, but supporting application and tools R&D and training
–Clear-cut eligibility criteria, e.g.
    –NSF PetaApps awards and their equivalents sponsored by other programs or agencies;
    –Relevant NSF SDCI awards and their equivalents sponsored by other programs or agencies;
    –Applications, currently running on TG resources, that have demonstrated scalability to at least 4000 cores with at least 60% parallel efficiency;
    –Academic scientists, commercial vendor personnel and TeraGrid staff who collaborate on R&D projects undertaken in support of the XSWG mission;
    –Academic scientists, commercial vendor personnel and TeraGrid staff who collaborate on EOT projects undertaken in support of the XSWG mission.

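The scalability criterion above (at least 4000 cores with at least 60% parallel efficiency) can be made concrete with a small calculation. The slide does not say how parallel efficiency is to be measured, so the sketch below assumes the common strong-scaling definition: speedup relative to a smaller baseline run of the same problem, divided by the ideal speedup. The function names, default thresholds as parameters, and the timings in the usage example are hypothetical illustrations, not part of the whitepaper.

```python
def parallel_efficiency(base_cores, base_seconds, cores, seconds):
    """Strong-scaling efficiency of a run relative to a smaller baseline run.

    Assumption: both runs solve the same fixed-size problem. The slide does
    not specify a definition; this is one common choice.
    """
    if min(base_cores, cores) <= 0 or min(base_seconds, seconds) <= 0:
        raise ValueError("core counts and timings must be positive")
    speedup = base_seconds / seconds      # how much faster the larger run is
    ideal_speedup = cores / base_cores    # what perfect scaling would give
    return speedup / ideal_speedup


def meets_xs_criterion(base_cores, base_seconds, cores, seconds,
                       min_cores=4000, min_efficiency=0.60):
    """Check the slide's thresholds: >= 4000 cores and >= 60% efficiency."""
    efficiency = parallel_efficiency(base_cores, base_seconds, cores, seconds)
    return cores >= min_cores and efficiency >= min_efficiency


if __name__ == "__main__":
    # Illustrative timings only, not measurements from any TeraGrid system.
    eff = parallel_efficiency(512, 1000.0, 4096, 180.0)
    print(f"efficiency: {eff:.2f}")
    print("eligible:", meets_xs_criterion(512, 1000.0, 4096, 180.0))
```

With these made-up numbers, going from 512 to 4096 cores is an ideal speedup of 8, the observed speedup is 1000/180, and the efficiency comes out to roughly 0.69, which would pass both thresholds under this definition. A weak-scaling definition (problem size grown with core count) would need a different formula.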
Whitepaper Recommendations: Infrastructure (3)
XSWG to foster collaborative R&D projects
–Between RPs, computational and computer scientists, applied mathematicians, and commercial vendors
–Methods: write joint proposals to funding agencies, licensing, access to XSG, etc.
–Resulting software may become part of CTSS XS Kit
XSWG to coordinate the creation and teaching of HPC University workshops and modules.
XSWG to work with the TeraGrid User Facing and EOT teams on documentation and dissemination.

Whitepaper Recommendations: XSWG Charter
Report via GIG User Support Coordination Area
Membership encouraged from all RPs, required from Track-2 sites
Ensure presence of skills required to execute XSWG tasks by appointing Core Members:
–WG Leader(s);
–Leaders for each Framework task and each Challenge area;
–Alternates to ensure presence at all WG activities.