Bringing Processing Close to the Data
Richard MORENO
CEOS - WGISS, 15 March 2016

SUMMARY
    Context
    Data downloading improvement
    Processing close to the data
    Big data and distributed architecture
    Is Cloud computing THE solution?

Context: need to change the data usage
    Big data
        Big volume
        Big processing
    Limitations
        Cost of storage of several PB
        Bandwidth resources
    No more downloading the full archive or bulk extraction
    Bring the processing close to the data
    Copernicus / Datacube
        Integration of the Copernicus mirrors
            In Europe
            Worldwide ???
        Need / possibility to federate datacubes: cube of cubes

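To make the bandwidth limitation concrete, here is a rough back-of-the-envelope sketch. The archive size (1 PB) and the link speeds (1 and 10 Gbit/s) are illustrative assumptions, not figures from the presentation, and the calculation ignores protocol overhead and link sharing, so real transfers would take longer.

```python
SECONDS_PER_DAY = 86_400
PB = 10**15  # decimal petabyte, in bytes


def transfer_days(volume_bytes, link_bits_per_s):
    """Ideal time in days to move volume_bytes over a link, with no overhead."""
    return volume_bytes * 8 / link_bits_per_s / SECONDS_PER_DAY


# Assumed values, for illustration only.
for gbps in (1, 10):
    days = transfer_days(1 * PB, gbps * 10**9)
    print(f"1 PB over {gbps} Gbit/s: about {days:.0f} days")
# prints:
# 1 PB over 1 Gbit/s: about 93 days
# 1 PB over 10 Gbit/s: about 9 days
```

Even under these optimistic assumptions, moving a multi-PB archive takes months, which is the motivation for moving the processing instead of the data.
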
Data downloading improvement
    Tools / Standards / Interoperability: French Copernicus CollGS
        Natural language for searching data of interest
        Web services access: OpenSearch
        Bulk extraction
            Metalink
            JDownloader
    Do not solve
        Bandwidth resources
        Duplication of storage

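As an illustration of Metalink-based bulk extraction, the sketch below reads a Metalink 4 document (the XML format of RFC 5854) and lists, for each file, its size and mirror URLs, which is the information a download manager needs. The sample document, file name and mirror hosts are made up for the example; this is not the output of the CollGS or of any real catalogue.

```python
import xml.etree.ElementTree as ET

# Metalink 4 XML namespace, as defined in RFC 5854.
NS = {"ml": "urn:ietf:params:xml:ns:metalink"}

# Hypothetical sample: file name, size and mirror URLs are invented.
SAMPLE = """<metalink xmlns="urn:ietf:params:xml:ns:metalink">
  <file name="product_0001.zip">
    <size>1048576</size>
    <url>https://mirror-a.example.org/product_0001.zip</url>
    <url>https://mirror-b.example.org/product_0001.zip</url>
  </file>
</metalink>
"""


def list_downloads(metalink_xml):
    """Return one dict per <file>: its name, size in bytes and mirror URLs."""
    root = ET.fromstring(metalink_xml)
    downloads = []
    for file_el in root.findall("ml:file", NS):
        size_text = file_el.findtext("ml:size", default=None, namespaces=NS)
        downloads.append({
            "name": file_el.get("name"),
            "size": int(size_text) if size_text else None,
            "urls": [u.text.strip() for u in file_el.findall("ml:url", NS)],
        })
    return downloads


for item in list_downloads(SAMPLE):
    print(item["name"], item["size"], item["urls"])
```

Listing several mirrors per file helps the client pick a source, but every byte still crosses the network and lands in a second copy, which is the limitation noted on the slide.
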
Processing close to the data
    Different types of processing
        Interactive processing via web services: WPS
        Interactive processing via MMI
            Google engine, GA Analytics
            Expression language
            Notebook (e.g. Jupyter)
        Mass processing on HPC / Cloud
        Sandbox for algorithm / processing tuning

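For the web-service case, a minimal sketch of how a client might ask a server to run a process next to the data, using the key-value-pair (GET) encoding of a WPS 1.0.0 Execute request. The endpoint, the process identifier and the input names are hypothetical; a real server advertises its processes and inputs through GetCapabilities and DescribeProcess.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def wps_execute_url(endpoint, process_id, inputs):
    """Build a WPS 1.0.0 Execute request URL (key-value-pair encoding)."""
    # DataInputs is a ';'-separated list of name=value pairs;
    # urlencode percent-encodes the '=' and ';' inside that value.
    data_inputs = ";".join(f"{name}={value}" for name, value in inputs.items())
    query = urlencode({
        "service": "WPS",
        "version": "1.0.0",
        "request": "Execute",
        "identifier": process_id,
        "datainputs": data_inputs,
    })
    return f"{endpoint}?{query}"


# Hypothetical endpoint, process name and inputs.
url = wps_execute_url(
    "https://wps.example.org/wps",
    "ndvi_mean",
    {"tile": "31TCJ", "year": 2016},
)
print(url)

# Decoding the query shows the inputs as the server would receive them.
print(parse_qs(urlsplit(url).query)["datainputs"][0])
```

Only the request and the result travel over the network; the input products stay on the server side.
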
Big data and distributed architecture
    Is big data compatible with a distributed architecture?
    Examples
        OGC OWS-10
        ESA and European agencies federated pilot
        EUCLID project
    Can this be generalized?
    Is a centralized platform / cloud the only solution?

Is Cloud computing THE solution?
    Advantages and disadvantages of a Cloud computing based architecture?