As Earth Observation (EO) continues to deliver data of ever-growing volume and complexity, it has become an integral contributor to ‘Big Data’. Big Data, however, poses an increasing challenge for providers and users alike, since moving the data between them incurs considerable costs and delays. Linking up the user’s analysis with the provider’s data holdings is a key promise of a ‘cloud’: a large IT infrastructure composed of high-throughput and high-performance computing nodes and massive storage facilities connected through ultrafast networks. Other advantages include potential synergies through shared usage as well as enormous flexibility and scalability in available resources. Cloud computing has transformed working with Big Data, opening new possibilities for data distribution and analytics. Many EO data providers are considering how best to take advantage of cloud computing and are about to transfer, or have already started transferring, crucial parts of their IT operations to cloud environments.
Combining and integrating processing chains, data archives and dissemination hubs under an overarching cloud architecture promises to lead not only to more harmonized, time- and cost-efficient product generation and access, but also to an improved and integrated use of the respective data and products. This has spurred the development of several cloud-based EO platforms, several of which are described by Gomes in a recent publication, among them Google Earth Engine, the most prominent forerunner in this field, the Sentinel Hub, and the European Commission’s Joint Research Centre’s own Big Data Analytics Platform (BDAP, formerly JEODPP).
This new way of working with geospatial data has inspired the notion of treating large amounts of platform-hosted geospatial data as ‘Geospatial Data Cubes’ (GDC), characterised by specific technological aspects of working with multidimensional data. Data Cubes have been adopted by some national authorities as valuable tools for data access and utilisation, as demonstrated e.g. by the Australian Geospatial Data Cube.
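The core of the data-cube idea can be illustrated with a minimal sketch: satellite imagery stacked into a multidimensional array indexed by (time, y, x), on which analyses become simple reductions along a dimension. The example below uses plain Python and invented values purely for illustration; real data cubes (e.g. the Open Data Cube software or xarray-based stacks) offer far richer, labelled n-dimensional APIs.

```python
# Illustrative sketch only: a "cube" as a (time, y, x) nested list of pixel
# values, with a per-pixel mean over time as a typical data-cube operation.

def temporal_mean(cube):
    """Reduce a (time, y, x) cube to a (y, x) grid of per-pixel
    means along the temporal dimension."""
    n_t = len(cube)           # number of acquisitions (time steps)
    n_y = len(cube[0])        # rows
    n_x = len(cube[0][0])     # columns
    return [
        [sum(cube[t][y][x] for t in range(n_t)) / n_t for x in range(n_x)]
        for y in range(n_y)
    ]

# Three hypothetical 2x2 "acquisitions" of the same area at different dates.
cube = [
    [[1.0, 2.0], [3.0, 4.0]],   # t = 0
    [[3.0, 2.0], [1.0, 4.0]],   # t = 1
    [[2.0, 2.0], [2.0, 4.0]],   # t = 2
]

print(temporal_mean(cube))  # [[2.0, 2.0], [2.0, 4.0]]
```

The point of the abstraction is that the user reasons about dimensions (time, space, bands) rather than about individual files or scenes, which is what makes platform-hosted archives tractable at scale.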
The Copernicus programme is addressing this challenge in various ways and at different levels, but first and foremost through contributions to the funding of the Data and Information Access Services (DIAS), which bring together several of Europe’s leading cloud providers to foster and improve access to, exchange of, and dissemination of Copernicus data and information. This multitude of solutions has in turn prompted, particularly in Europe, calls for a federation of platforms, for instance in the European Open Science Cloud, and for harmonisation of their interfaces to reduce the risk of lock-in with single vendors and to foster interoperability between platforms and user domains.
The recently finalized Horizon 2020 project openEO, for example, has proposed a harmonized application programming interface (API) allowing access to a variety of participating cloud infrastructures using essentially the same source code. Currently, a follow-on project funded by ESA aims to turn this API into the openEO Platform, an entry point to a wider federation of cloud-based infrastructures whose complexity can then effectively be hidden from the user.
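The "same source code on different back-ends" idea rests on describing the analysis itself in a back-end-independent form: openEO expresses workflows as JSON process graphs submitted over a common API. The sketch below is schematic only; the node and process names are loosely modelled on openEO processes, while the back-end URLs and the `submit` helper are hypothetical and not the actual openEO client API.

```python
import json

# Schematic sketch: the analysis is described once as a back-end-independent
# process graph and can then be serialised unchanged for any compliant
# back-end. Hypothetical URLs and helper; not the real openEO client API.

process_graph = {
    "load": {
        "process_id": "load_collection",
        "arguments": {
            "id": "SENTINEL2_L2A",
            "temporal_extent": ["2021-06-01", "2021-06-30"],
        },
    },
    "reduce": {
        "process_id": "reduce_dimension",
        "arguments": {
            "data": {"from_node": "load"},
            "dimension": "t",
            "reducer": "mean",  # simplified: real graphs embed a sub-graph here
        },
        "result": True,
    },
}

def submit(backend_url, graph):
    """Hypothetical helper: pair a back-end URL with the serialised graph,
    as would be POSTed to that back-end's API endpoint."""
    return backend_url, json.dumps({"process": {"process_graph": graph}})

# The identical analysis description can target different (made-up) back-ends.
payloads = [submit(url, process_graph)
            for url in ("https://backend-a.example", "https://backend-b.example")]
print(payloads[0][1] == payloads[1][1])  # same request body for both back-ends
```

Because the user's code only builds the graph, the choice of back-end reduces to a URL, which is what allows a federation to hide infrastructure differences from the user.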
Cloud technology and KCEO
The KCEO considers cloud technologies a key factor in achieving its goal of fostering the uptake of EO products and information by potential applications, and also in ensuring that the Copernicus Programme itself can take advantage of these technologies and infrastructures in implementing its processing and Services. It will therefore closely monitor developments and translate them into concrete advice for its stakeholders. An example is a recently released study, undertaken by a group of leading international experts, analysing the benefits, options and risks of transferring the Global Land Component of the Copernicus Land Service to a cloud environment. This report points, inter alia, at several primary benefits expected from transferring the production chain of the service to a cloud environment:
greater transparency and reproducibility of the data processing chains, enhancing trust;
better scalability to handle the increasing volume of data arriving from current and future satellites;
better integration with downstream processing chains and thus more value delivered to research and industry.