Overheard at the 1st ‘ask me anything’ session

Enhancing your data storage with the EOSC portal

If you are a researcher, you know that data is the essential element of your everyday work. The way you store your data plays a major role in how easily it can be accessed, used, reused and kept secure.

Which EOSC services can help you with all of your data storage needs? What is the concrete added value of these resources? These questions, and more, were examined during the first ‘ask me anything’ webinar on 1 February.

Data storage services available on EOSC

The EOSC portal, the online registry where researchers can discover, order and access EOSC services and resources employed by several EC-funded projects, offers a catalogue of data storage options to support your research. On the EOSC marketplace, data storage services are organised into categories, for example services to archive or back up data and services to replicate or synchronise data.

These services can serve different purposes in a research data workflow:

  • mid-term storage during a project (for active data) connected to computing facilities
  • mid-term storage resources after/between projects, accessible from computing facilities to store non-active research data
  • mid- to long-term storage that is accessible from computing facilities to store non-active stable research data, with optional value-added services (integrity checks, replications, ...)
  • long-term data preservation and sharing resources for non-active FAIR data.

Uses and benefits

Most of the services in this category of the EOSC marketplace are generic and, therefore, can be used in any scientific domain. Here are some examples:

  • B2SAFE, B2DROP, B2SHARE: 3 services provided by EUDAT CDI to, respectively, virtualise large-scale data resources; store, synchronise and share data with peers; and store, publish and share research data in a FAIR way.
  • CESNET, CSCS, INFN services: These resources comprise storage facilities and object storage services.

It is worth noting that the resources mentioned above can be accessed, free of charge, until June 2023 via the DICE project.

EGI Online Storage and EGI Data Hub are 2 other services available through EOSC, both provided by the EGI Foundation. Both cover file and object storage and enable researchers to access scientific datasets in a scalable way.

Domain-specific services are also available. For instance, the Open Energy Platform is a modular framework for research data management in energy system analysis as well as a community database for energy data. In the near future, the portal will offer data storage services for the Earth Observation community; these new services will be provided by the C-SCALE project.

The above services can either be accessed (and/or ordered) directly via the portal, or the portal will provide a link to get in touch with the provider. Keep in mind that this depends on the policies of the provider and on how the user intends to use the service.

2 data storage services in action

Concrete cases – from EU-funded projects using and contributing to EOSC – illustrate the added value of the platform’s available resources. With respect to data storage services, we can look at 2 examples from the RELIANCE project, Research Lifecycle Management for Earth Science communities and Copernicus users in EOSC.

The RELIANCE project is leveraging B2DROP services to support the Earth Science community. During the management of the research lifecycle, researchers produce multiple resources that, in many cases, are scattered and not easily shareable (e.g. locally stored) or too large to be kept locally. In response to these challenges, B2DROP will be used as a personal space to store these resources. For the RELIANCE project, it has been assessed as instrumental in keeping research data synchronised, up to date and shareable with other researchers.

Another example comes from the Geohazard community. Since the second half of 2020, Sentinel-1 data have been gradually moved from online rolling cache to long-term storage. This means: (1) a 2-step approach (first, order and wait; second, access the data); (2) no data retention policy, although stringent quotas are still applied; (3) issues with the data pipeline for processing long-term series. In light of this, the Geohazard community has been able to come together, through the RELIANCE and DICE projects, to provide 300-500 TB of storage. This will provide access to long time series of Sentinel-1 data (currently not available from the Copernicus provider) and an environment for researchers to access, analyse and reuse data, as well as support the development of cloud-based services for open science.

Next steps

These are just a few examples of how the data storage resources offered by the EOSC marketplace can serve the needs of different research communities. For more information, watch the entire ‘ask me anything’ webinar on data storage.

All of the data storage services mentioned above, and during the session, can be found on the EOSC marketplace. Furthermore, if you have a data storage service that could be interesting for other users, you can easily onboard it to the portal (check the EOSC providers' portal).

Last but not least, if you are interested in software resources, do not miss the next ‘ask me anything’ webinar organised by EOSC Future on 1 March at 14.00 CET. Registration is open: https://eoscfuture.eu/eventsfuture/ask-me-anything-session-2-software/

09 February 2022