Published by Laney Jenner. Modified over 10 years ago.
1
IST-2006-026409 www.eu-eela.org E-infrastructure shared between Europe and Latin America
Climate Application Final Report
Jose M. Gutierrez, Valvanuz Fernandez, Antonio S. Cofiño, Fernando García, Jesús Fernandez, Richard Miguel San Martín, Mauricio Carrillo, Gabriela Rosas, Amelia Diaz, Delia Acuña, Rodrigo Abarca, Claudio Baeza
UC-Spain, SENAMHI-Perú, UDEC-Chile
2
IST-2006-026409 E-infrastructure shared between Europe and Latin America www.eu-eela.org EGRIS-1, Itacuruçá (Brasil), 4.12.2006
Challenges: Enabling grid computing for climate model simulation
Global circulation models provide a coarse description of the ocean and atmosphere (200 km resolution) and have to be linked to regional models to obtain useful representations over areas of interest.
CAM and WRF are open-source, state-of-the-art global and regional models, respectively. They need to be run in cascade.
[Diagram: Sea surface temperature -> CAM -> WRF -> output converter -> NCAR Graphics library]
Regional models depend on many parameters related to sub-grid physical processes (multi-parametric jobs).
3
Challenges: Enabling data mining applications on simulations
The high-dimensional character of the data involved in climate simulations requires efficient data mining techniques to extract useful knowledge.
Unsupervised clustering partitions the simulation databases, producing characteristic weather or climate types (or groups) that govern the global dynamics.
The Self-Organizing Map (SOM) is one of the most popular clustering algorithms and is especially suitable for visualizing and modeling high-dimensional data.
The weather types can be locally projected to obtain statistical regional forecasts of variables of interest.
[Figure, right: Precipitation at two different stations in Peru for an El Niño period.]
4
Climate Cascade Demo
Ensemble prediction systems comprise multiple runs of a weather model with slightly different initial conditions and/or model parameterizations. The resulting simulations contain valuable information about the sampled sources of uncertainty.
[Diagram: Sea surface temperature -> CAM -> WRF (par 1), WRF (par 2), …, WRF (par n) -> SE -> SOM. One El Niño year, 365 simulations.]
Compare the SOM distribution of each parameterization.
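The comparison of SOM distributions across parameterizations can be illustrated with a minimal sketch. It assumes a one-dimensional SOM written with the Python standard library only, and it uses synthetic two-dimensional points in place of real WRF output. The function names (`train_som`, `best_matching_unit`, `som_distribution`), the number of nodes and the decay schedules are illustrative choices, not details taken from the slides.

```python
import math
import random


def best_matching_unit(nodes, x):
    """Index of the node whose weight vector is closest to sample x."""
    return min(
        range(len(nodes)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(nodes[j], x)),
    )


def train_som(data, n_nodes=4, epochs=50, lr0=0.5, seed=0):
    """Train a 1-D self-organizing map and return its node weight vectors."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Initialise every node from a randomly chosen sample (copied).
    nodes = [list(rng.choice(data)) for _ in range(n_nodes)]
    sigma0 = max(n_nodes / 2.0, 1.0)
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)              # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.1  # neighbourhood width shrinks
        for x in rng.sample(data, len(data)):  # shuffled pass over the data
            bmu = best_matching_unit(nodes, x)
            for j, w in enumerate(nodes):
                # Gaussian neighbourhood on the 1-D node index.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * sigma ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return nodes


def som_distribution(nodes, data):
    """Fraction of samples assigned to each node (the 'weather type' frequencies)."""
    counts = [0] * len(nodes)
    for x in data:
        counts[best_matching_unit(nodes, x)] += 1
    return [c / len(data) for c in counts]


# Synthetic stand-ins for the daily output of two parameterizations (assumption).
rng = random.Random(1)
par1 = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(100)]
par2 = [[rng.gauss(1.5, 1.0), rng.gauss(0.5, 1.0)] for _ in range(100)]

# Train one map on the pooled data, then compare how each run populates it.
som = train_som(par1 + par2)
dist1 = som_distribution(som, par1)
dist2 = som_distribution(som, par2)
print("par 1:", [round(p, 2) for p in dist1])
print("par 2:", [round(p, 2) for p in dist2])
```

Training a single map on the pooled data keeps the node indices comparable between runs, so differences between the two frequency vectors reflect differences between the parameterizations rather than between two independently trained maps.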
5
Application Achievements
CAM and WRF are running in the Lost Island Grid.
CAM is a data producer and WRF a data consumer: WRF is fed with CAM data.
We use the SYSTEM call from Fortran 90 to upload information to LFC and AMGA. All of this is done through shell scripts.
Progress has been made on task management and monitoring using AMGA.
After polling which data are available, the user decides which job to run:
– Start, restart or cancel CAM experiment jobs
– Start or cancel WRF tasks
6
CAM Status
[Diagram: the CAM simulation and the LFC, SE, AMGA and R-GMA services. Flow labels: DATA; Metadata; Information; CAM job Status; CAM job Status & Checkpoint; CAM job Status & Data Output. Part of the diagram is marked "To be implemented".]
7
CAM: Community Atmosphere Model
The Community Atmosphere Model (CAM) is the latest in a series of global atmosphere models developed at NCAR for the weather and climate research communities.
– Grid size: 128 x 64 x 27 (XYZ) = 221184 gridpoints
– 6 output time steps = 197 MB NetCDF -> 33 MB/tstep
– This includes ALL default variables (32 x 3D + 56 x 2D)
– WRF only requires 5 x 3D and 9 x 2D variables as input (effective size: 5 MB/step = 620 MB/month with 6-hourly input; 720 GB per 100 years)
– 1 year of simulation takes 48 CPU hours. A climate simulation of 100 years takes 7 CPU months.
A case study simulating the climate of the past century will require a CAM job that runs for 7 months, so checkpointing is an important feature.
For ENSEMBLES studies (multi-parametric), those figures increase.
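The figures on this slide can be checked with a few lines of arithmetic. The sketch below assumes 1 GB = 1000 MB, a 31-day month for the monthly volume, a 360-day year for the century volume, and 30-day months for the CPU time; these conventions are inferred from the quoted numbers and are not stated on the slide.

```python
# Grid size quoted on the slide.
NX, NY, NZ = 128, 64, 27
gridpoints = NX * NY * NZ

# Full CAM output: 197 MB for 6 output time steps.
mb_per_step_all = 197 / 6

# WRF input subset: 5 MB per step, 6-hourly input means 4 steps per day.
WRF_MB_PER_STEP = 5
STEPS_PER_DAY = 4


def wrf_input_mb(days):
    """Volume of CAM data (in MB) that WRF needs for the given number of days."""
    return WRF_MB_PER_STEP * STEPS_PER_DAY * days


month_mb = wrf_input_mb(31)                   # assumption: 31-day month
century_gb = wrf_input_mb(360 * 100) / 1000   # assumption: 360-day years

# CPU time: 48 CPU hours per simulated year.
cpu_hours_century = 48 * 100
cpu_months = cpu_hours_century / 24 / 30      # assumption: 30-day months

print(gridpoints, round(mb_per_step_all), month_mb, century_gb, round(cpu_months, 1))
```

With actual calendar years (365 days) the century volume comes out slightly higher than the quoted 720 GB, which suggests the slide rounds to 30-day months.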
8
Data and Metadata
Datasets produced by the WRF and CAM models are stored in the LFC catalog.
The metadata of these datasets is extracted and uploaded to AMGA.
CAM produces a checkpoint dataset, which is uploaded to LFC; the user is notified through AMGA.
9
Application Workflow
The user queries the status of the CAM job.
If the job is not running, the user queries AMGA to find out whether it has finished.
If it has not, the user looks up the last checkpoint file and restarts the CAM job.
While the CAM job is running, the user queries AMGA about the datasets produced by CAM and then triggers the WRF jobs.
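The decision logic described on this slide can be sketched as a small pure function. The function name, its arguments and the returned action strings are hypothetical; in the real application the inputs would come from job-status and AMGA queries, which are not reproduced here. The "start_from_scratch" branch is an added assumption for the case where no checkpoint exists, which the slide does not cover.

```python
from typing import Optional


def decide_cam_action(running: bool, finished: bool,
                      last_checkpoint: Optional[str]) -> str:
    """Return the next user action for a CAM job (illustrative sketch only)."""
    if running:
        # While CAM runs, new CAM datasets registered in AMGA trigger WRF jobs.
        return "trigger_wrf_on_new_datasets"
    if finished:
        # AMGA reports that the CAM job completed; nothing to restart.
        return "cam_finished"
    if last_checkpoint is not None:
        # Job stopped before completion: restart from the last checkpoint file.
        return f"restart_from:{last_checkpoint}"
    # Assumption: with no checkpoint available, the job starts over.
    return "start_from_scratch"


# Hypothetical checkpoint name, used only to exercise the function.
print(decide_cam_action(running=False, finished=False,
                        last_checkpoint="checkpoint_0042"))
```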
10
Application Workflow Schema
[Diagram, reconstructed from its labels; step numbers in the original figure are not recoverable.]
UI, supersubmiter.sh:
– Generate CAM.jdl
– Submit CAM.jdl (Resource Broker)
– Insert entry in CAM collection (AMGA)
WN, running CAM:
– change_status_CAM.sh: update status, insert "runon" in CAM collection (AMGA)
– upload_info_CAM.sh: insert entry in WRF collection (AMGA)
– checkpoint_CAM.sh: insert entry in CHECKPOINT collection (AMGA)
– update_history_CAM.sh: insert entry in HISTORYCAM collection (AMGA)
11
Coordinator: Task States
1: Task is ready to be scheduled
2: Task has been submitted to the Grid by the coordinator
3: Task is running on the Grid
4: Task completed successfully
5: Task execution or submission failed
6: Task was cancelled by the user through the coordinator
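The six coordinator states map naturally onto an enumeration. In the sketch below the state numbers follow the slide, but the state names and the table of allowed transitions are assumptions, since the arrows of the original state diagram are not recoverable from the text.

```python
from enum import IntEnum


class TaskState(IntEnum):
    READY = 1       # ready to be scheduled
    SUBMITTED = 2   # submitted to the Grid by the coordinator
    RUNNING = 3     # running on the Grid
    DONE = 4        # completed successfully
    FAILED = 5      # execution or submission failed
    CANCELLED = 6   # cancelled by the user through the coordinator


# Assumed transitions (not taken from the slide).
ALLOWED = {
    TaskState.READY: {TaskState.SUBMITTED, TaskState.FAILED, TaskState.CANCELLED},
    TaskState.SUBMITTED: {TaskState.RUNNING, TaskState.FAILED, TaskState.CANCELLED},
    TaskState.RUNNING: {TaskState.DONE, TaskState.FAILED, TaskState.CANCELLED},
    TaskState.DONE: set(),
    TaskState.FAILED: set(),
    TaskState.CANCELLED: set(),
}


def advance(current: TaskState, new: TaskState) -> TaskState:
    """Move a task to a new state, rejecting transitions outside the assumed table."""
    if new not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {new.name}")
    return new


state = TaskState.READY
for target in (TaskState.SUBMITTED, TaskState.RUNNING, TaskState.DONE):
    state = advance(state, target)
print(state.name)
```

Storing the integer value of the state in a metadata entry keeps it compatible with the numbering used on the slide.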
12
CAM Info & Data Flow
[Diagram: CAM, WRF and the LFC, SE, AMGA and R-GMA services. Flow labels: DATA; Metadata; Information. Part of the diagram is marked "To be implemented".]
13
WRF Info & Data Flow
[Diagram: WRF and the LFC, SE, AMGA and R-GMA services. Flow labels: DATA; Metadata; Information. Part of the diagram is marked "To be implemented".]
14
WRF and User Interaction
[Diagram: UI, coordinator, WRF and the LFC, SE, AMGA and R-GMA services. Flow labels: DATA; Metadata; Information. Part of the diagram is marked "To be implemented".]
15
Application Workflow Improvement
[Diagram components: Portal, Coordinator, CAM, WRF, AMGA, RB, R-GMA. Labels: management; monitoring.]
16
Issues about application development
A uniform framework for application development is missing.
– gLite is a mixture of different initiatives.
LFC and SE operations are unreliable; sometimes it is not possible to delete or recover data.
– Is the application responsible for reliability, or the Grid?
An application workflow framework is required to gain more monitoring and control over the applications.
– Job submission is a quest: you never know what is going to happen, and there is little chance of post-mortem analysis.
Metadata is an important issue in data management that has not been well established.
– A metadata system is really useful for development.
APIs are not well tested.
– The C and Python APIs certainly work, but the Perl and Java releases are not well tested. (Not enough user community?)
17
What was expected from EGRIS
DAGs and checkpointable job submission.
– Restart of jobs with dependencies.
Using the metadata catalog from worker nodes:
– Loading metadata with the AMGA API from the WN.
– Integration of the metadata catalogs and the dataset catalogue.
Data access protocol for datasets.
– OpenDAP service in the Storage Element.
Development of a portal for job submission and monitoring:
– Authentication management from the portal.
– Monitoring the status of jobs.
– Retrieval of information from the metadata catalog.
18
Thanks
Thanks to the EELA project for organizing EGRIS-1.
Thanks to the local committee for setting up the Lost Island Grid.
Thanks to the tutors for their help.
And thanks to Valva, Claudio and Mauricio for their effort in migrating the Climate Application to the Grid.
19
Questions?