1 Med-CORDEX database

Med-CORDEX database
= netcdf files + their info
= file system + relational database
= XFS + MySQL db
= file server + LAMP server (Linux, Apache, MySQL and PHP)

www.medcordex.eu

2 file server

NETAPP FAS3240 HA Storage System
- dual controller
- RAID DP technology (two simultaneous disk failures allowed)

Environment:
- dual power supply (one coming from UPS)
- air-conditioned room

3 LAMP server

HP DL575G7 Linux Server
- SLES 11SP2 Operating System
- no users: the machine is dedicated to acting as a web server (not only for the Med-CORDEX database)

Apache 2.4.6, PHP 5.5.10, Tomcat 7.0.52, JVM 1.7.0_55, MySQL 5.0.96, pure-ftpd 1.0.36

Environment:
- dual power supply (one coming from UPS)
- air-conditioned room

4 paths & filenames

ATMOSPHERIC DATA

PATH
/MEDCORDEX/ / / / / / / / /
Our PATH shortcut: /MEDCORDEX/ALL (files are not listable)

FILENAME
VariableName_Domain_GCMModelName_CMIP5ExperimentName_CMIP5EnsembleMember_RCMModelName_RCMVersionID_Frequency[_StartTime-EndTime].nc

According to "CORDEX Archive Design", O. B. Christensen, W. J. Gutowski, G. Nikulin, and S. Legutke, http://cordex.dmi.dk
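The filename convention above can be illustrated with a minimal Python sketch that splits a name into its tokens. It assumes that no token contains an underscore and that StartTime and EndTime are separated by a single hyphen; the function name `parse_filename` and the example filename in the usage note are hypothetical, not taken from the archive.

```python
from typing import Dict, Optional

# Token names in the order given by the FILENAME template on the slide.
TOKEN_NAMES = [
    "VariableName", "Domain", "GCMModelName", "CMIP5ExperimentName",
    "CMIP5EnsembleMember", "RCMModelName", "RCMVersionID", "Frequency",
]


def parse_filename(fname: str) -> Dict[str, Optional[str]]:
    """Split a filename into its tokens (assumes no token contains '_')."""
    if not fname.endswith(".nc"):
        raise ValueError(f"not a .nc file: {fname}")
    parts = fname[:-3].split("_")
    # 8 mandatory tokens, plus the optional StartTime-EndTime token.
    if len(parts) not in (8, 9) or any(p == "" for p in parts):
        raise ValueError(f"unexpected number of tokens in: {fname}")
    tokens: Dict[str, Optional[str]] = dict(zip(TOKEN_NAMES, parts[:8]))
    tokens["StartTime"] = None
    tokens["EndTime"] = None
    if len(parts) == 9:
        start, sep, end = parts[8].partition("-")
        if not sep or not start or not end:
            raise ValueError(f"bad StartTime-EndTime token in: {fname}")
        tokens["StartTime"], tokens["EndTime"] = start, end
    return tokens
```

For example, `parse_filename("tas_MED-44_GCM-X_historical_r1i1p1_RCM-Y_v1_mon_195001-200512.nc")` (a made-up name) returns a dictionary with `VariableName` set to `"tas"` and `StartTime` set to `"195001"`.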

5 paths & filenames

OCEAN DATA
No standard has been defined yet (AFAIK).
Shall we use http://cmip-pcmdi.llnl.gov/cmip5/output_req.html#req_list ?

6 paths & filenames

All tokens that form the PATH are derived from the FILENAME, except the Institution, which is the name of the directory where each data provider has placed its files, e.g. /incoming_MEDCORDEX/ENEA -> ENEA

In the db we use all tokens plus one more piece of information: realm, which is either atmosphere or ocean. Realm is deduced from the VariableName.

THUS WE HAVE A CONSTRAINT! Variable names must ALL be unique, regardless of the realm they belong to.
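A minimal Python sketch of why the uniqueness constraint follows: if realm is looked up from the variable name alone, one name can map to only one realm. The table `VARIABLE_REALM`, its four entries, and both function names are illustrative assumptions, not the actual Med-CORDEX variable list or code.

```python
from pathlib import PurePosixPath

# Hypothetical lookup table: each variable name maps to exactly one realm.
# A name shared by an atmospheric and an oceanic variable could not be
# represented here, which is the constraint stated on the slide.
VARIABLE_REALM = {
    "tas": "atmosphere",
    "pr": "atmosphere",
    "tos": "ocean",
    "sos": "ocean",
}


def realm_of(variable_name: str) -> str:
    """Deduce the realm from the variable name alone."""
    try:
        return VARIABLE_REALM[variable_name]
    except KeyError:
        raise ValueError(f"unknown variable name: {variable_name}") from None


def institution_of(incoming_path: str) -> str:
    """Take the Institution from the directory under /incoming_MEDCORDEX."""
    parts = PurePosixPath(incoming_path).parts
    idx = parts.index("incoming_MEDCORDEX")  # ValueError if not present
    if idx + 1 >= len(parts):
        raise ValueError(f"no institution directory in: {incoming_path}")
    return parts[idx + 1]
```

With these definitions, `institution_of("/incoming_MEDCORDEX/ENEA")` gives `"ENEA"`, matching the example on the slide.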

7 uploading files

Data providers with data to upload can use ANY ftp client:

ftp ftp://user:passw@www.medcordex.eu
cd /incoming_MEDCORDEX/$INST
mput *.nc            (all files into the same flat dir)
put PLEASEGO.txt     (any size, may be empty)

where $INST is the code of their institution (e.g. ENEA).

They then wait for the automatic daily procedure to start (at 20:00).
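The same upload can be scripted. The sketch below uses Python's standard `ftplib` interface and sends PLEASEGO.txt last, following the command order on the slide. The function name `upload_batch` is hypothetical, and the connection is passed in so the login details stay with the caller.

```python
import io
import os
from typing import Iterable


def upload_batch(ftp, inst: str, local_files: Iterable[str]) -> None:
    """Upload netcdf files, then the empty PLEASEGO.txt flag file.

    `ftp` is an already logged-in ftplib.FTP object (or anything with the
    same cwd/storbinary methods); `inst` is the institution code.
    """
    ftp.cwd(f"/incoming_MEDCORDEX/{inst}")
    for path in local_files:
        # All files go into the same flat directory, as the slide requires.
        with open(path, "rb") as fh:
            ftp.storbinary(f"STOR {os.path.basename(path)}", fh)
    # The flag file may be empty; it only signals that the batch is ready.
    ftp.storbinary("STOR PLEASEGO.txt", io.BytesIO(b""))


# Usage (not run here; credentials are placeholders):
#   import ftplib, glob
#   ftp = ftplib.FTP("www.medcordex.eu")
#   ftp.login("user", "passw")
#   upload_batch(ftp, "ENEA", glob.glob("*.nc"))
#   ftp.quit()
```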

8 ingesting files

The "ingesting procedure" runs automatically every day at 20:00.

For each dir /incoming_MEDCORDEX/$INST containing PLEASEGO.txt, and for each other file in that dir, the procedure:
1. verifies that it is a netcdf file (ncdump -h works properly)
2. splits the filename into tokens and checks their compliance with the CORDEX standard
3. checks the validity of the variable name (it is already known)
4. creates the right $PATH in /MEDCORDEX
5. moves the file into its $PATH
6. inserts/updates the file's record in the db (including the ncdump -h output)

(continues on the next slide)

9 ingesting files

When all of a data provider's files have been processed, an e-mail is sent to the provider with the log of what happened while ingesting the data.

After ingesting all files from all data providers, the procedure:
1. computes some statistics, taking figures from the db and the ftp logs, and publishes them on www.medcordex.eu/stats
2. makes all links in /MEDCORDEX/ALL
3. copies the whole /MEDCORDEX directory to another host
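The per-file steps of the ingesting procedure could look roughly like the Python sketch below. It is an illustration under stated assumptions, not the actual procedure: the netcdf check, the set of known variables, the target sub-directory layout, and the db write are all supplied by the caller (`is_netcdf`, `known_variables`, `target_subdir`, and `record` are hypothetical names), and the filename check is reduced to counting tokens.

```python
import shutil
from pathlib import Path
from typing import Callable, List, Set


def ingest_dir(
    incoming: Path,
    archive_root: Path,
    is_netcdf: Callable[[Path], bool],
    known_variables: Set[str],
    target_subdir: Callable[[str, List[str]], Path],
    record: Callable[[Path, List[str]], None],
) -> List[str]:
    """Process one /incoming_MEDCORDEX/$INST directory; return the log."""
    log: List[str] = []
    if not (incoming / "PLEASEGO.txt").exists():
        return log  # the provider has not flagged the batch as ready
    for f in sorted(incoming.iterdir()):
        if f.name == "PLEASEGO.txt" or not f.is_file():
            continue
        if not is_netcdf(f):                      # step 1
            log.append(f"{f.name}: rejected, not a netcdf file")
            continue
        parts = f.stem.split("_") if f.suffix == ".nc" else []
        if len(parts) not in (8, 9):              # step 2 (simplified)
            log.append(f"{f.name}: rejected, filename not compliant")
            continue
        if parts[0] not in known_variables:       # step 3
            log.append(f"{f.name}: rejected, unknown variable {parts[0]}")
            continue
        dest_dir = archive_root / target_subdir(incoming.name, parts)
        dest_dir.mkdir(parents=True, exist_ok=True)   # step 4
        dest = dest_dir / f.name
        shutil.move(str(f), str(dest))            # step 5
        record(dest, parts)                       # step 6
        log.append(f"{f.name}: ingested")
    return log
```

The returned log corresponds to what would be mailed to the data provider; rejected files are left in the incoming directory in this sketch.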

10 downloading files

- FTP Server (can be accessed by any ftp client)
- THREDDS Data Server (software by unidata.ucar.edu)

11 downloading data (using any FTP client)

cmd line:
- ftp $f/$p/ ; dir ; get filen.nc        ("dir" does not work in /ALL)
- ncftp -u $hymex www.medcordex.eu ; cd $p ; get filen.nc
- wget $f/$p/file.nc
- wget -r $f/$p                          (recursive get, not available in /ALL)

browser:
- $f/$p
- $f/$p/filen.nc

where:
$f = ftp://user:passw@www.medcordex.eu
$p = MEDCORDEX/MED-xx/…/…/….
$p = MEDCORDEX/ALL
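The "cd $p ; get filen.nc" sequence can also be done from a script. The sketch below uses Python's standard `ftplib` interface; `download_file` is a hypothetical helper name, and the connection is passed in already logged in.

```python
import os


def download_file(ftp, remote_dir: str, filename: str, local_dir: str = ".") -> str:
    """Fetch one file from the FTP server and return the local path.

    `ftp` is an already logged-in ftplib.FTP object (or anything with the
    same cwd/retrbinary methods); `remote_dir` plays the role of $p.
    """
    ftp.cwd(remote_dir)
    local_path = os.path.join(local_dir, filename)
    with open(local_path, "wb") as fh:
        # retrbinary calls fh.write for every block received.
        ftp.retrbinary(f"RETR {filename}", fh.write)
    return local_path


# Usage (not run here; credentials are placeholders):
#   import ftplib
#   ftp = ftplib.FTP("www.medcordex.eu")
#   ftp.login("user", "passw")
#   download_file(ftp, "MEDCORDEX/ALL", "filen.nc")
#   ftp.quit()
```

Because directory listing is disabled in /MEDCORDEX/ALL, the exact filename has to be known in advance when that shortcut path is used.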

12 downloading data (using THREDDS)

services (password required only to get netcdf files):
- OPeNDAP: use files remotely, download them
- HTTP server: download files
- netcdf subset: select & download sections of each file
- WCS (Web Coverage Service): serves data to WCS clients
- WMS (Web Map Service): serves data to WMS clients
- NCML (NetCDF Markup Language): to define a CDM dataset
- ISO: description of the file in ISO 19115(-2) metadata
- UDDC (Unidata Attribute Convention for Data Discovery): provides recommendations for netCDF attributes that can be added to netCDF files

13 downloading data (using THREDDS)

cmd line:
- ncdump -h $t/dodsC/$p/file.nc
- cdo showdate $t/dodsC/$p/file.nc
- cdo copy $t/dodsC/$p/file.nc local.nc
- ferret: use $t/dodsC/$p/file.nc
tested with: netcdf 4.3.1.1, cdo 1.6.4rc6, ferret 6.9

browser:
- www.medcordex.eu/tds    (MEDCORDEX/ALL is invisible)

where:
$p = MEDCORDEX/MED-xx/…/…/….
$p = MEDCORDEX/ALL
$t = https://user:passw@www.medcordex.eu:8290/medcordex
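The OPeNDAP address passed to ncdump, cdo, or ferret is simply $t/dodsC/$p/file.nc. A small Python sketch of assembling it is shown here; the helper name `opendap_url` is hypothetical, and percent-encoding the credentials is an added precaution for passwords containing characters such as "@", not something stated on the slide.

```python
from urllib.parse import quote

# Host, port and context path taken from the definition of $t on the slide.
THREDDS_BASE = "www.medcordex.eu:8290/medcordex"


def opendap_url(user: str, passw: str, path: str, filename: str) -> str:
    """Build $t/dodsC/$p/<filename> with credentials embedded in the URL."""
    cred = f"{quote(user, safe='')}:{quote(passw, safe='')}"
    return f"https://{cred}@{THREDDS_BASE}/dodsC/{path.strip('/')}/{filename}"
```

For example, `opendap_url("user", "passw", "MEDCORDEX/ALL", "file.nc")` yields `https://user:passw@www.medcordex.eu:8290/medcordex/dodsC/MEDCORDEX/ALL/file.nc`.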

14 db fields

For each ingested netcdf file, the following are recorded:

code, path, fname, size, ncdump, realm,
Institution, VariableName, Domain, GCMModelName, CMIP5ExperimentName, CMIP5EnsembleMember, RCMModelName, RCMVersionID, RCMmodel, Frequency, StartTime, EndTime
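As an illustration of how such a record could be stored and updated, the Python sketch below uses the standard `sqlite3` module as a stand-in for the MySQL database. Only the field names come from the slide. The table name `files`, the column types, the use of `code` as an integer key, and the choice of (path, fname) as the uniqueness key for the insert/update step are all assumptions.

```python
import sqlite3

# Column names from the slide; types and keys are assumptions.
SCHEMA = """
CREATE TABLE files (
    code INTEGER PRIMARY KEY,
    path TEXT NOT NULL,
    fname TEXT NOT NULL,
    size INTEGER,
    ncdump TEXT,
    realm TEXT,
    Institution TEXT,
    VariableName TEXT,
    Domain TEXT,
    GCMModelName TEXT,
    CMIP5ExperimentName TEXT,
    CMIP5EnsembleMember TEXT,
    RCMModelName TEXT,
    RCMVersionID TEXT,
    RCMmodel TEXT,
    Frequency TEXT,
    StartTime TEXT,
    EndTime TEXT,
    UNIQUE (path, fname)
)
"""

COLUMNS = {
    "path", "fname", "size", "ncdump", "realm", "Institution",
    "VariableName", "Domain", "GCMModelName", "CMIP5ExperimentName",
    "CMIP5EnsembleMember", "RCMModelName", "RCMVersionID", "RCMmodel",
    "Frequency", "StartTime", "EndTime",
}


def upsert_file(conn: sqlite3.Connection, record: dict) -> None:
    """Insert a file record, replacing any earlier one for the same file."""
    unknown = set(record) - COLUMNS
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    cols = ", ".join(record)
    marks = ", ".join("?" for _ in record)
    conn.execute(
        f"INSERT OR REPLACE INTO files ({cols}) VALUES ({marks})",
        list(record.values()),
    )
```

Re-ingesting a file with the same path and fname then leaves a single row carrying the newest values, which is one way to read "inserts/updates the file's record".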

15 statistics as of May 22, 2014

Institution    netcdf files      size in GB
CMCC                   5896            90.5
CNRM                   7803 (3rd)     493.5 (1st)
ENEA                  14023 (2nd)      97.7
GUF                   62784 (1st)     303.6 (3rd)
ICPT                   5404           101.1
INSTM                   160             0.2
IPSL                   1606           113.7
LMD                     739           429.0 (2nd)
UCL                    1012           101.8
Total                 99427          1732.0

