Earth Data Hub documentation

Please also refer to the Earth Data Hub Getting Started page.

Earth Data Hub offers an innovative and super-efficient way to access data. Here is what you need to know to get started.

Datasets are published as Zarr stores. Every user has a monthly quota of allowed requests, so downloads must be authenticated.

We will show how to obtain and use your authentication credentials later. Let's start by connecting to a test dataset that doesn't need authentication.

Open your first dataset

The easiest way to get started with Earth Data Hub is by using Python and Xarray, on our public test dataset.

Make sure Python is set up, then install the basic tools by running:

pip install xarray zarr dask aiohttp

Now you are ready to open our public test dataset. Start Python and run:

import xarray as xr

xr.open_dataset(
    "https://data.earthdatahub.destine.eu/public/test-dataset-v0.zarr",
    chunks={},
    engine="zarr",
)

This will display the data as an xarray.Dataset. If you use a different set of tools, we suggest getting them to work with the public test dataset at "https://data.earthdatahub.destine.eu/public/test-dataset-v0.zarr" before attempting to set up authentication.

Set up your credentials for authenticated data access

To access most datasets on Earth Data Hub you simply need to obtain an EDH personal access token and instruct your tools to use it when downloading the data. However, there is an exception: for datasets belonging to the Destination Earth Climate Adaptation digital twin (ClimateDT), you need to rely on the Data Cache Management Service to access the data. You can follow one of the ClimateDT tutorials to learn how to authenticate via the Data Cache Management Service.

How to obtain an Earth Data Hub personal access token

To obtain a personal access token you first need to register on the DestinE platform. Then go to the Earth Data Hub account settings, where you can find your default personal access token or create additional ones.

Adding the token to the URL

In the following, we will access the same small test dataset as before, but this time from a URL that is authorisation protected: https://data.earthdatahub.destine.eu/private/test-dataset-v0.zarr.

The easiest way is to pass the personal access token as a password in the Zarr store URL, for example:

import xarray as xr

xr.open_dataset(
    "https://edh:<your personal access token>@data.earthdatahub.destine.eu/private/test-dataset-v0.zarr",
    chunks={},
    engine="zarr",
)
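To avoid pasting the token into source files, the URL can be assembled at run time. The following is a minimal sketch, not an official Earth Data Hub helper: the function name with_token and the environment variable EDH_TOKEN are hypothetical, and the sketch assumes the HTTP client decodes percent-encoded credentials, as is usual.

```python
import os
from urllib.parse import quote, urlsplit, urlunsplit


def with_token(url: str, token: str, user: str = "edh") -> str:
    """Return `url` with `user:token@` inserted before the host."""
    parts = urlsplit(url)
    # Percent-encode the token so characters such as "@" or ":" cannot
    # be mistaken for URL delimiters.
    netloc = f"{user}:{quote(token, safe='')}@{parts.netloc}"
    return urlunsplit(parts._replace(netloc=netloc))


# EDH_TOKEN is a hypothetical variable name; a placeholder is used if it is unset.
token = os.environ.get("EDH_TOKEN", "<your personal access token>")
url = with_token(
    "https://data.earthdatahub.destine.eu/private/test-dataset-v0.zarr", token
)
```

The resulting url can be passed to xr.open_dataset in place of the literal string shown above.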

Configuring the token in the .netrc file

A more convenient way to set up the access token, if you plan to use the system regularly, is to configure the .netrc file as follows:

machine data.earthdatahub.destine.eu
  password <your personal access token>
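The entry can also be added with a short script. This is a minimal sketch using only the Python standard library: the function name add_netrc_entry is hypothetical, and the sketch assumes the token contains no whitespace. It appends the same password-only entry shown above and restricts the file to its owner, since it holds a secret.

```python
import os
from pathlib import Path


def add_netrc_entry(netrc_path: Path, machine: str, token: str) -> bool:
    """Append a password-only entry for `machine`; return False if one exists."""
    existing = netrc_path.read_text() if netrc_path.exists() else ""
    if f"machine {machine}" in existing:
        return False  # leave an existing entry untouched
    # Start on a fresh line if the file does not already end with one.
    prefix = "" if existing == "" or existing.endswith("\n") else "\n"
    with netrc_path.open("a") as f:
        f.write(f"{prefix}machine {machine}\n  password {token}\n")
    # Restrict the file to its owner (has limited effect on Windows).
    os.chmod(netrc_path, 0o600)
    return True


# Example call (not executed here, because it would modify your home directory):
# add_netrc_entry(Path.home() / ".netrc",
#                 "data.earthdatahub.destine.eu",
#                 "<your personal access token>")
```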

Once this is set up, you can use the private URL directly, without embedding the token in it. The same applies to the other URLs that you find in the catalogue:

import xarray as xr

xr.open_dataset(
    "https://data.earthdatahub.destine.eu/private/test-dataset-v0.zarr",
    storage_options={"client_kwargs":{"trust_env":True}},
    chunks={},
    engine="zarr",
)

Keep in mind that some tools do not use the .netrc file by default but can be instructed to do so. For example, Xarray / Zarr need the storage_options={"client_kwargs":{"trust_env":True}} option.

You will find the code snippet to access the data with Xarray on each dataset page. It will work out of the box once you have set up the .netrc file as described above.

Quota limits

Request Limit: 500000 requests per user per month

Authentication: Required for all data access

Recommended Approach: we recommend carefully planning your data retrieval strategy, as downloading entire datasets can quickly consume your quota. Our tutorials provide guidance on how to efficiently access and work with large datasets while managing your request limits effectively.
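To get a feel for how a selection translates into requests, the sketch below counts the chunks that a rectangular selection overlaps. It assumes that each chunk read costs one request, which is typical for Zarr over HTTP but is not stated in this documentation. The chunk shape and the selection are hypothetical, so check the actual chunking of a dataset (for example via the encoding of its variables in Xarray) before relying on the numbers.

```python
from math import prod

MONTHLY_QUOTA = 500_000  # requests per user per month, from the limit above


def chunks_touched(selection, chunk_shape):
    """Count the chunks overlapped by half-open (start, stop) index ranges."""
    return prod(
        (stop - 1) // size - start // size + 1
        for (start, stop), size in zip(selection, chunk_shape)
    )


# Hypothetical variable chunked as (time=1, lat=100, lon=100):
# reading 720 time steps over a 150 x 150 window touches 720 * 2 * 2 chunks.
requests = chunks_touched([(0, 720), (0, 150), (0, 150)], (1, 100, 100))
print(requests, f"{requests / MONTHLY_QUOTA:.2%} of the monthly quota")
```

The function expects non-empty ranges (start < stop). Selections aligned with chunk boundaries touch fewer chunks than misaligned ones of the same size, which is one reason to plan a retrieval around the dataset's chunking.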