Published by Bruno Scott. Modified over 9 years ago.
1
HPDC Report
Domenico Vicinanza, CERN IT-GD-OPS
CERN, July 12th weekly OPS section meeting
2
HPDC '07 in a nutshell
Held in Monterey (CA, USA), June 25-29
Four parallel workshops:
–Grid Monitoring
–Workflows in Support of Large-Scale Science (WORKS07)
–Joint EGEE and OSG Workshop on Data Handling in Production Grids
–Challenges of Large Applications in Distributed Environments (CLADE 2007)
Three-day conference
http://www.isi.edu/hpdc2007/
3
Grid Monitoring WS
Monitoring in Grids
–Fabric monitoring
    Publishing Service Availability information to the local fabric monitoring
    Nagios (integration with SAM)
–Monitoring from the VO/user perspective
    INCA (San Diego Supercomputer Center)
    http://inca.sdsc.edu/
    Aiming to integrate (part of) their testing infrastructure within the SAM framework
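One way to picture "publishing Service Availability information to the local fabric monitoring" is as a passive service check handed to Nagios. The sketch below only formats a `PROCESS_SERVICE_CHECK_RESULT` external command line, which is a standard Nagios mechanism. The status names (`ok`, `warn`, `error`), the host name, and the service name are assumptions made for illustration; they are not taken from SAM or from the talk.

```python
import time

# Standard Nagios plugin return codes.
NAGIOS_OK, NAGIOS_WARNING, NAGIOS_CRITICAL, NAGIOS_UNKNOWN = 0, 1, 2, 3

# Hypothetical mapping from availability-test status strings to Nagios codes.
# These status names are assumptions for illustration, not SAM's vocabulary.
STATUS_TO_NAGIOS = {
    "ok": NAGIOS_OK,
    "warn": NAGIOS_WARNING,
    "error": NAGIOS_CRITICAL,
}


def passive_check_line(host, service, status, output, timestamp=None):
    """Format one PROCESS_SERVICE_CHECK_RESULT external command line."""
    if timestamp is None:
        timestamp = int(time.time())
    # Anything not in the mapping is reported as UNKNOWN.
    code = STATUS_TO_NAGIOS.get(status.lower(), NAGIOS_UNKNOWN)
    # The command must fit on a single line, so flatten newlines in the output.
    one_line_output = output.replace("\n", " ")
    return (f"[{timestamp}] PROCESS_SERVICE_CHECK_RESULT;"
            f"{host};{service};{code};{one_line_output}")


# Hypothetical host and service names.
line = passive_check_line("ce01.example.org", "job-submit", "error",
                          "submission timed out", timestamp=1183900000)
```

In a real setup the resulting line would be written to the Nagios external command file; where that file lives depends on the local installation.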
4
cont...
Interoperability of the monitoring tools and OSG-LCG interoperability
–Overview of the work done by the Grid Service Monitoring Working Group
–Service Availability Monitor as one of the main components of the monitoring framework prototype for the WLCG/EGEE infrastructure (SAM Team paper)
5
cont...
Other monitoring tools:
–RGMA (a general framework for information exchange on large-scale distributed infrastructures)
–GridICE
–gLite LB
–Centralized logging systems
    Syslog-NG (OSG)
    Splunk (Fermilab)
6
Syslog-NG
New system logging utility used by OSG
Can replace the regular syslog daemon or run in parallel with it
More powerful facilities for filtering, formatting, and redirecting log messages
Open source license
Administered through a PHP/MySQL tool
7
Syslog-NG facilities
Can filter log messages based on log level, system host, facility, IP address, or regular expressions
Can reformat and modify messages using template facilities
Inputs can be files or sockets
Outputs can be other hosts, files, or sockets
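The filter-then-template pipeline described on this slide can be sketched in a few lines of Python. This is a conceptual illustration only, not Syslog-NG configuration syntax: the `LogMessage` type, the `make_filter` and `route` helpers, and the example hosts are all hypothetical names introduced here.

```python
import re
from dataclasses import dataclass
from string import Template

# Syslog severities from the syslog protocol (lower number = more severe).
SEVERITY = {"emerg": 0, "alert": 1, "crit": 2, "err": 3,
            "warning": 4, "notice": 5, "info": 6, "debug": 7}


@dataclass
class LogMessage:
    host: str
    facility: str
    level: str
    text: str


def make_filter(max_level="debug", host_pattern=None, facility=None,
                text_pattern=None):
    """Build a predicate combining level, host, facility and regex conditions."""
    host_re = re.compile(host_pattern) if host_pattern else None
    text_re = re.compile(text_pattern) if text_pattern else None

    def accept(msg):
        if SEVERITY[msg.level] > SEVERITY[max_level]:
            return False  # message is less severe than the threshold
        if host_re and not host_re.search(msg.host):
            return False
        if facility and msg.facility != facility:
            return False
        if text_re and not text_re.search(msg.text):
            return False
        return True

    return accept


def route(messages, accept, template):
    """Return reformatted lines for the messages that pass the filter."""
    fmt = Template(template)
    return [fmt.substitute(host=m.host, facility=m.facility,
                           level=m.level, text=m.text)
            for m in messages if accept(m)]


# Hypothetical messages from worker nodes and a user interface host.
msgs = [
    LogMessage("wn01.example.org", "daemon", "err", "job 42 failed: disk full"),
    LogMessage("wn02.example.org", "daemon", "info", "job 43 started"),
    LogMessage("ui01.example.org", "auth", "err", "login failed"),
]
worker_errors = make_filter(max_level="err", host_pattern=r"^wn\d+",
                            text_pattern=r"failed")
lines = route(msgs, worker_errors, "$host [$facility.$level] $text")
```

Here the filter keeps only messages at level `err` or more severe, from hosts whose names start with `wn`, whose text matches `failed`; the template then rewrites each surviving message into a single line. Sending those lines to a file, socket, or remote host would be the "output" stage.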
8
Splunk
Commercial software used to archive and query log messages
Web interface allows log messages to be categorized and correlated
Messages can be queried and sorted based on categorization and other parameters
Also used at Fermilab for internal log collection
9
Other topics: Open Grids
–BOINC (Berkeley Open Infrastructure for Network Computing)
    improvements in load balancing
    new checkpointing methods
    reliability issues
–RIDGE (a kind of BOINC improvement)
    observes past behavior and estimates a reliability rating for worker nodes
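The idea of rating worker nodes from their past behavior can be illustrated with a very simple estimator. The sketch below uses a Laplace-smoothed success ratio; this is a stand-in chosen for illustration and is not claimed to be the estimator RIDGE uses. The class name, method names, and node names are hypothetical.

```python
from collections import defaultdict


class ReliabilityTracker:
    """Rate worker nodes from the observed outcomes of past work units."""

    def __init__(self):
        self._successes = defaultdict(int)
        self._attempts = defaultdict(int)

    def record(self, node, succeeded):
        """Record one finished work unit for a node."""
        self._attempts[node] += 1
        if succeeded:
            self._successes[node] += 1

    def rating(self, node):
        """Laplace-smoothed success ratio; a node with no history rates 0.5."""
        ok = self._successes.get(node, 0)
        total = self._attempts.get(node, 0)
        return (ok + 1) / (total + 2)

    def most_reliable(self, nodes, k):
        """Return the k highest-rated nodes, best first."""
        return sorted(nodes, key=self.rating, reverse=True)[:k]


tracker = ReliabilityTracker()
for outcome in (True, True, True, False):
    tracker.record("node-a", outcome)
for outcome in (True, False, False, False):
    tracker.record("node-b", outcome)

# "node-c" has no history yet, so it falls between the other two.
best_two = tracker.most_reliable(["node-a", "node-b", "node-c"], 2)
```

The smoothing keeps a node with a single lucky result from being rated as perfectly reliable, which matters when the scheduler uses the rating to decide where to place or replicate work.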
10
Other topics: future improvements
Provisioning models (modeling needs)
–performance-cost optimization in grids
–Genetic Algorithm formulation for provisioning resources for an application
Condor extensions (data-driven workflow planning)
Scalable I/O virtualization
–dynamically manage virtualized components among multiple guest domains
11
Environment issues
How HPDC is affecting the environment
–warming
–efforts to deliver energy
–cooling systems
Role of renewable energy sources in the future of HPDC
Solar energy to provide electric power to operate the computers and for cooling
Covering roofs with solar cells:
–How much can a house compute?
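The question "how much can a house compute?" comes down to a short back-of-the-envelope calculation. The sketch below shows its shape only. Every number in the usage example is an illustrative placeholder, not a figure from the talk, and the function and parameter names are hypothetical.

```python
def nodes_powered_by_roof(roof_area_m2, irradiance_w_per_m2, panel_efficiency,
                          node_power_w, cooling_overhead):
    """Estimate how many compute nodes a solar roof could run.

    cooling_overhead is the extra power spent on cooling, expressed as a
    fraction of the node's own power draw (0.5 means 50% on top).
    """
    # Electrical power delivered by the panels.
    electrical_w = roof_area_m2 * irradiance_w_per_m2 * panel_efficiency
    # Power needed per node once cooling is included.
    per_node_w = node_power_w * (1.0 + cooling_overhead)
    return int(electrical_w // per_node_w)


# Placeholder inputs, chosen only to make the arithmetic easy to follow:
# 50 m2 of roof, 200 W/m2 average irradiance, 15% panel efficiency,
# 250 W per node, 50% cooling overhead.
example_nodes = nodes_powered_by_roof(50, 200, 0.15, 250, 0.5)
```

With these placeholders the roof delivers 1500 W and each node costs 375 W including cooling, so the estimate is four nodes. Real values for irradiance, efficiency, and node power vary widely and would have to be measured.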
12
Conclusions
Well-established SAM awareness
Defining a common monitoring exchange format (Grid Monitoring WG)
–started a growing network of monitoring tool integration/interaction
–interest in including/feeding SAM results from/to other tools (fabric monitoring)
Importance of logs (and log analysis tools)
Strong need for improved modeling of
–resources, needs, workflows
13
Bibliography
SAM Team paper (PDF), minutes and slides:
–http://indico.cern.ch/conferenceDisplay.py?confId=18405