National Energy Research Scientific Computing Center (NERSC)
PDSF at NERSC
Thomas M. Langley
NERSC Center Division, LBNL
November 19, 2003
PDSF: A Tool for Science
–What is PDSF?
–History
–Our clients and their science
–Configuration and Administration
–What is in the future for PDSF?
–Conclusion
PDSF: What is PDSF?
PDSF is a large Linux cluster constructed in house from off-the-shelf components.
It is designed to support large numbers of applications with large data-capacity requirements, and is tuned for serial processing.
A cooperative effort between NERSC and the HEPNP communities, PDSF provides a shared alternative to individual computing facilities for each project.
PDSF: History
PDSF was initially developed in 1991 at the SSC National Laboratory, composed of mid-range workstations and shared disk.
It was acquired by NERSC in 1996 in collaboration with the HEPNP community.
PDSF: Our clients and their science
PDSF has a varied client community.
Collider Facilities
–STAR/RHIC
–CDF/FNAL
–ATLAS/CERN
–E871/FNAL
–ALICE/CERN
Neutrino Experiments
–Sudbury Neutrino Observatory (SNO)
–KamLAND
–AMANDA
Astrophysics
–Deep Search
–Supernova Factory
–Large Scale Structure
PDSF: Our clients and their science – STAR/RHIC
STAR is an experiment using the RHIC facility at Brookhaven National Laboratory. The primary goal of this field of research is to re-create in the laboratory a novel state of matter, the quark-gluon plasma (QGP), which is predicted by the standard model of particle physics (quantum chromodynamics) to have existed ten millionths of a second after the Big Bang and may exist in the cores of very dense stars. STAR, the largest PDSF client, uses PDSF for detector simulations and software development. www.bnl.gov/star.htm
PDSF: Our clients and their science – CDF/FNAL
The Collider Detector at Fermilab (CDF) experiment studies high-energy particle collisions at the world's highest-energy particle accelerator. The goal is to discover the identity and properties of the particles that make up the universe and to understand the forces and interactions between them. www-cdf.fnal.gov
PDSF: Our clients and their science – ATLAS/CERN
The goal of the ATLAS experiment for the Large Hadron Collider at the CERN laboratory in Switzerland is to explore the fundamental nature of matter and the basic forces that shape our universe. The ATLAS collaboration uses PDSF for detector simulation and software development. atlasexperiment.org
PDSF: Our clients and their science – E871/FNAL
The HyperCP (E871) collaboration at Fermilab is searching for asymmetries between matter and antimatter, or CP violation, in Lambda and Xi hyperon decays, as well as in charged kaon decays. The collaboration also has an extensive program of hyperon and kaon physics outside of CP violation. ppd.fnal.gov/experiments/e871/
PDSF: Our clients and their science – ALICE/CERN
The ALICE (A Large Ion Collider Experiment) collaboration at CERN is building a dedicated heavy-ion detector to exploit the unique physics potential of nucleus-nucleus interactions at LHC energies. The aim is to study the physics of strongly interacting matter at extreme energy densities, where the formation of a new phase of matter, the quark-gluon plasma, is expected. The ALICE collaboration uses PDSF for detector simulations and software development. alice.web.cern.ch/Alice/AliceNew/
PDSF: Our clients and their science – SNO
The Sudbury Neutrino Observatory (SNO) is taking data that has provided revolutionary insight into the properties of neutrinos and the core of the sun. The detector was built 6800 feet underground in INCO's Creighton mine near Sudbury, Ontario. SNO is a heavy-water Cherenkov detector designed to detect neutrinos produced by fusion reactions in the sun. The SNO collaboration uses PDSF for software development and data analysis production. neutrino.lbl.gov/index.html
PDSF: Our clients and their science – KamLAND
KamLAND, the Kamioka Liquid Scintillator Anti-Neutrino Detector, is the largest scintillation detector ever constructed. KamLAND uses PDSF for data analysis and production, processing the massive amounts of data generated by the experiment. www.awa.tohoku.ac.jp/html/KamLAND
PDSF: Our clients and their science – AMANDA
The AMANDA telescope consists of neutrino detectors buried between 1 and 1.5 miles beneath the snow surface at the geographic South Pole. The primary objective of AMANDA is to discover sources of very-high-energy neutrinos from galactic and extragalactic sources. www.amanda.uci.edu
PDSF: Our clients and their science – Deep Search
The DeepSearch experiment is affiliated with the Supernova Cosmology Project at Berkeley Lab. Its goal is to search for distant supernovae. PDSF is used for data reduction and analysis. panisse.lbl.gov
PDSF: Our clients and their science – Supernova Factory
The Nearby Supernova Factory (SNfactory) is designed to address a wide range of supernova issues using detailed observations of low-redshift supernovae. SNfactory relies heavily on PDSF for computational support and data storage. snfactory.lbl.gov
PDSF: Our clients and their science – Large Scale Structure
The Large Scale Structure (LSS) group models the formation and evolution of structure in the universe, and uses PDSF for its simulations.
PDSF: Configuration and Administration
PDSF has grown significantly since it was first installed.
1997 Configuration
–64 nodes
–224GB disk
–10Mb/sec network fabric
–100Mb/sec connection to ESnet
PDSF: Configuration and Administration
1998 Configuration
–Added 20 Intel nodes for 84 total
–Increased disk to 490GB
–100Mb/sec network on the new nodes
PDSF: Configuration and Administration
1999 Configuration
–Removed HP and Sun nodes
–Increased Intel-based nodes to 48
–Added 7 disk vaults with 64GB each
–Added a Sun E450 with 210GB disk
–100Mb/sec network throughout the cluster
–Added FDDI connectivity to HPSS
PDSF: Configuration and Administration
2000 Configuration
–Increased the number of nodes to 152
–Disk increased to 7.5TB
–Introduced GigE between high-performance nodes, HPSS and ESnet
PDSF: Configuration and Administration
2002 Configuration
–Increased the number of nodes to 207
–Introduced Athlon technology
–Disk increased to 35TB, including new hardware IDE RAID
–Added new Extreme 7i switches
PDSF: Configuration and Administration
2003 Configuration
–Increased the number of nodes to 414
–Introduced Opteron technology
–Total disk capacity now at 131.5TB
–Installed the first NAS on PDSF
–Added lower-cost Dell switches
PDSF: Configuration and Administration
Interactive and Compute Nodes
Interactive nodes – provide the point of entry into the cluster: user logins via SSH and job submission to LSF (sketched below).
Compute nodes – scheduled by LSF, these provide the bulk computing resource of the cluster; no interactive logins. They consist of regular nodes and high-bandwidth nodes, the latter having faster processors and more local storage.
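As a rough illustration of that workflow, here is a minimal sketch of submitting a job script to LSF from an interactive node. The queue name, output file pattern and job script are invented examples, not PDSF's actual configuration; only the bsub command and its -q and -o options are standard LSF.

```python
# Minimal sketch: submit a job script to LSF from an interactive node.
# The queue "medium" and the script "./analysis_job.sh" are illustrative
# assumptions, not PDSF's real configuration.
import subprocess

def submit_job(script: str, queue: str = "medium") -> None:
    # bsub is LSF's standard submission command:
    #   -q selects the batch queue
    #   -o writes the job's output to a file (%J expands to the job ID)
    subprocess.run(["bsub", "-q", queue, "-o", "job.%J.out", script], check=True)

if __name__ == "__main__":
    submit_job("./analysis_job.sh")
```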
PDSF: Configuration and Administration
Administrative nodes – used to administer the cluster; run the LSF manager, monitoring utilities, web server, etc.
Logging nodes – perform consolidated logging functions, the email gateway, etc.
Development nodes – a small pool of processors made available for testing new system functions such as file systems.
Grid nodes – Globus gateways to the DOE Science Grid.
Console servers – provide remote access to each cluster node by way of serial connections.
PDSF: Configuration and Administration
Data nodes – provide 90TB of shared storage to the cluster.
–Small disk vaults – older technology; use software RAID and integrated IDE controllers.
–Raidzones – the first hardware RAID devices in PDSF; use IDE drives.
–3ware – the current technology; IDE RAID with 80 to 300GB drives.
–NAS – 10TB of configurable storage.
An additional 60TB is provided as locally attached disk on the compute nodes.
PDSF: Configuration and Administration
RedHat Linux 7.3 is installed on all nodes; an upgrade to RedHat 8.0 is being prepared.
LSF provides batch scheduling and queue management. Each client group is allocated service by way of a share value, similar to a percentage of available resources (see the sketch after this slide).
Each user's runtime environment is customized with the modules package to reflect their development environment. This permits several versions of software to be installed that would otherwise conflict with each other.
High-capacity, high-speed backup is provided by NERSC's HPSS system.
Open-source and in-house monitoring software provides 24x7 monitoring of the environment with appropriate alerts and notifications.
An internally developed hardware database is updated manually and automatically to reflect changing hardware conditions due to equipment additions, deletions and failures, permitting component locating and tracking.
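To make the share value concrete, here is a small illustrative sketch of how such values translate into fractions of the cluster. The group names and numbers are invented for the example, not PDSF's actual allocations.

```python
# Illustrative sketch: fair-share values normalized to a fraction of the
# cluster. Group names and share values are invented examples, not PDSF's
# real allocations.
shares = {"star": 40, "sno": 20, "kamland": 15, "atlas": 10, "other": 15}

total = sum(shares.values())
for group, share in shares.items():
    # Each group's entitlement is its share divided by the sum of all shares.
    print(f"{group:8s} -> {share / total:6.1%} of available resources")
```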
PDSF in the spotlight
PDSF was one of the first operational Linux clusters for public use in the world, and has been in continuous operation longer than any other Linux cluster.
Using the capabilities of PDSF, the SNO collaboration was able to determine that neutrinos have mass. The KamLAND collaboration performed analysis of their data on PDSF that verified SNO's findings.
Utilizing the large data-handling capabilities and flexible environment of PDSF, the Supernova Factory discovered 43 new supernovae in its first year, an astounding record.
STAR has published more than 30 Physical Review Letters articles made possible by work performed on PDSF.
PDSF: The Future
Remain poised to take advantage of the newest technologies as they become available.
Continue to develop new tools to manage the cluster more effectively and reduce system outages.
Investigate ways to foster cooperation with installations operating Linux clusters in the global community.
Continue to expand the relationship with the astrophysics community, and look for additional interest outside of HEPNP.
Grid computing will play an ever-increasing role.
PDSF: Conclusion
PDSF allows client groups to fully leverage shared computational resources.
PDSF has allowed NERSC to examine Linux and evaluate different models and approaches to providing computing.
PDSF continues to evolve while maintaining production-quality service.
PDSF provides a uniquely managed system where projects with very different environmental requirements can operate concurrently.
With its large storage capacity, high number of compute nodes, very low cost and extraordinary availability, PDSF provides capabilities otherwise unobtainable for our scientific users.
Great science is being done on PDSF!
PDSF: Contacts
Thank you! For more information visit our website at http://pdsf.nersc.gov or contact us by email.
PDSF Support Staff
–Tom Langley: tmlangley@lbl.gov
–Shane Cannon: scannon@lbl.gov
–Cary Whitney: clwhitney@lbl.gov
PDSF User Support
–Iwona Sakrejda: isakrejda@lbl.gov
Ernest Orlando Lawrence Berkeley National Laboratory