Walk-through: Visualising Data
Description & purpose: This notebook showcases the functionality of the Earth Observation Data Hub (EODH). It provides a snapshot of the Hub, the pyeodh API client and the various datasets as of February 2025. Much of what follows is still relevant, although some user interface components may have changed. Contact [EODH enquiries](mailto:enquiries@eodatahub.org.uk) if further help is required.
Set-up
The following cell only needs to be run on the EODH AppHub. If you are working in a local Python environment, install the required packages as you normally would. The easiest route is to install all dependencies with uv (https://docs.astral.sh/uv/getting-started/installation/) by running uv sync in the root directory of the repository.
# If needed you can install a package in the current AppHub Jupyter environment using pip
# For instance, we will need at least the following libraries
%pip install --upgrade pyeodh geopandas matplotlib numpy folium xarray geoviews hvplot holoviews datashader cartopy
Introduction
In this notebook we will explore different ways to visualise data held on the Hub. The Notebooks can be run on the JupyterHub instance on the Hub (linked to your user account) or locally with a suitably set up environment.
# Imports
import os
# Import the Python API Client
import pyeodh
import folium
import xarray as xr
import hvplot.xarray # Needed for xarray plotting with geoviews
import holoviews as hv
import datashader as ds
from holoviews import opts
from holoviews.operation.datashader import rasterize
import cartopy.crs as ccrs # For coordinate reference systems
import geoviews as gv
import requests
from IPython.display import Image, display
# Load the bokeh and matplotlib plotting backends for geoviews
gv.extension("bokeh", "matplotlib")
Accessing Different DataTypes
For this part of the workshop exercises we are going to look at how to visualise two different datasets. This should give you a foundation for visualising other data held on the Hub and elsewhere. The first thing we need to do is connect to the EODH and get information about the collections we are interested in: CMIP6 and Sentinel 2 ARD.
# Connect to the Hub
# Change base_url to point to the server that you want to connect to
client = pyeodh.Client(base_url="https://eodatahub.org.uk").get_catalog_service()
catalog = client.get_catalog("public/catalogs/ceda-stac-catalogue")
# Get each collection
cmip6 = catalog.get_collection("cmip6")
sentinel2_ard = catalog.get_collection("sentinel2_ard")
We can interrogate the temporal and spatial extents of the data holdings. Below, we print the temporal extent of the CMIP6 collection and the spatial extent of the Sentinel 2 ARD collection. This helps us understand what date ranges or spatial extents we can use when visualising the data.
# Get the collection metadata
extent = ["spatial", "temporal"]
print("Dataset extent - CMIP6: ", cmip6.extent.to_dict()[extent[1]])
print("Dataset extent - Sentinel 2 ARD: ", sentinel2_ard.extent.to_dict()[extent[0]])
Dataset extent - CMIP6: {'interval': [['1850-01-01T00:00:00Z', '4114-12-16T12:00:00Z']]}
Dataset extent - Sentinel 2 ARD: {'bbox': [[-9.00034454651177, 49.48562028352171, 3.1494256015866995, 61.33444247301668]]}
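As a minimal sketch of how the printed spatial extent might be used, the snippet below checks whether a point of interest falls inside the Sentinel 2 ARD bounding box. The bounding box values are copied from the output above; the helper name `bbox_contains` and the two test points (approximate coordinates) are assumptions made for illustration and are not part of pyeodh.

```python
# Spatial extent copied from the output above: [west, south, east, north] in degrees.
SENTINEL2_ARD_BBOX = [
    -9.00034454651177,
    49.48562028352171,
    3.1494256015866995,
    61.33444247301668,
]


def bbox_contains(bbox, lon, lat):
    """Return True if the point (lon, lat) lies inside or on the bounding box."""
    west, south, east, north = bbox
    return west <= lon <= east and south <= lat <= north


# Illustrative points with approximate coordinates (chosen for this sketch).
solent = (-1.3, 50.8)
paris = (2.35, 48.86)

print("Solent inside extent:", bbox_contains(SENTINEL2_ARD_BBOX, *solent))
print("Paris inside extent: ", bbox_contains(SENTINEL2_ARD_BBOX, *paris))
```

In the notebook itself, the same list should be available as `sentinel2_ard.extent.to_dict()["spatial"]["bbox"][0]`, judging by the structure of the printed dictionary.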
As this is an exercise, we have already chosen some data to look at.
The CMIP6 item has been generated using the CIESM model, running the high-emission SSP5-8.5 scenario. It contains monthly-averaged upward shortwave radiation at the surface (rsus) on a regular grid.
The Sentinel 2 ARD image covers an area across the south of England, including the Solent.
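The model, scenario, variable and grid described above are all encoded in the CMIP6 item identifier used in the next cell. The sketch below splits that identifier into named facets. The facet names follow the `cmip6:` attribute names that the item metadata reports later in this notebook, with `version` assumed for the final component; `parse_cmip6_id` is a hypothetical helper, not a pyeodh function.

```python
# Facet order assumed from the dotted identifier and the item's cmip6:* attributes.
CMIP6_FACETS = (
    "mip_era",
    "activity_id",
    "institution_id",
    "source_id",
    "experiment_id",
    "variant_label",
    "table_id",
    "variable_id",
    "grid_label",
    "version",
)


def parse_cmip6_id(item_id):
    """Split a dotted CMIP6 item identifier into a dict of named facets."""
    parts = item_id.split(".")
    if len(parts) != len(CMIP6_FACETS):
        raise ValueError(
            f"Expected {len(CMIP6_FACETS)} dot-separated parts, got {len(parts)}"
        )
    return dict(zip(CMIP6_FACETS, parts))


facets = parse_cmip6_id(
    "CMIP6.ScenarioMIP.THU.CIESM.ssp585.r1i1p1f1.Amon.rsus.gr.v20200806"
)
for name, value in facets.items():
    print(f"{name:>15}: {value}")
```

Reading the facets this way makes it easier to see which part of the identifier to change when looking for a different model, scenario or variable.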
# Look for a specific item
cmip6_item = cmip6.get_item(
"CMIP6.ScenarioMIP.THU.CIESM.ssp585.r1i1p1f1.Amon.rsus.gr.v20200806"
)
sentinel_item = sentinel2_ard.get_item(
"neodc.sentinel_ard.data.sentinel_2.2023.11.17.S2A_20231117_latn509lonw0008_T30UXB_ORB137_20231117131218_utm30n_osgb"
)
# Get information and links to the item assets held in STAC
print("---" * 20, "CMIP6", "---" * 20)
print(cmip6_item.assets)
print("")
print("---" * 20, "SENTINEL 2", "---" * 20)
print(sentinel_item.assets)
------------------------------------------------------------ CMIP6 ------------------------------------------------------------
{'reference_file': <Asset href=https://dap.ceda.ac.uk/badc/cmip6/metadata/kerchunk/pipeline1/ScenarioMIP/THU/CIESM/kr1.0/CMIP6_ScenarioMIP_THU_CIESM_ssp585_r1i1p1f1_Amon_rsus_gr_v20200806_kr1.0.json>, 'data0001': <Asset href=https://dap.ceda.ac.uk/badc/cmip6/data/CMIP6/ScenarioMIP/THU/CIESM/ssp585/r1i1p1f1/Amon/rsus/gr/v20200806/rsus_Amon_CIESM_ssp585_r1i1p1f1_gr_402901-411412.nc>}
------------------------------------------------------------ SENTINEL 2 ------------------------------------------------------------
{'cloud': <Asset href=https://dap.ceda.ac.uk/neodc/sentinel_ard/data/sentinel_2/2023/11/17/S2A_20231117_latn509lonw0008_T30UXB_ORB137_20231117131218_utm30n_osgb_clouds.tif>, 'cloud_probability': <Asset href=https://dap.ceda.ac.uk/neodc/sentinel_ard/data/sentinel_2/2023/11/17/S2A_20231117_latn509lonw0008_T30UXB_ORB137_20231117131218_utm30n_osgb_clouds_prob.tif>, 'cog': <Asset href=https://dap.ceda.ac.uk/neodc/sentinel_ard/data/sentinel_2/2023/11/17/S2A_20231117_latn509lonw0008_T30UXB_ORB137_20231117131218_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif>, 'metadata': <Asset href=https://dap.ceda.ac.uk/neodc/sentinel_ard/data/sentinel_2/2023/11/17/S2A_20231117_latn509lonw0008_T30UXB_ORB137_20231117131218_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref_meta.xml>, 'saturated_pixels': <Asset href=https://dap.ceda.ac.uk/neodc/sentinel_ard/data/sentinel_2/2023/11/17/S2A_20231117_latn509lonw0008_T30UXB_ORB137_20231117131218_utm30n_osgb_sat.tif>, 'thumbnail': <Asset href=https://dap.ceda.ac.uk/neodc/sentinel_ard/data/sentinel_2/2023/11/17/S2A_20231117_latn509lonw0008_T30UXB_ORB137_20231117131218_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref_thumbnail.jpg>, 'topographic_shadow': <Asset href=https://dap.ceda.ac.uk/neodc/sentinel_ard/data/sentinel_2/2023/11/17/S2A_20231117_latn509lonw0008_T30UXB_ORB137_20231117131218_utm30n_osgb_toposhad.tif>, 'valid_pixels': <Asset href=https://dap.ceda.ac.uk/neodc/sentinel_ard/data/sentinel_2/2023/11/17/S2A_20231117_latn509lonw0008_T30UXB_ORB137_20231117131218_utm30n_osgb_valid.tif>}
We can then use the information above to gather links to the asset data:
cmip6_kerchunk_asset = cmip6_item.assets["reference_file"]
sentinel2_ard_cog_asset = sentinel_item.assets["cog"]
# print the href
print(cmip6_kerchunk_asset.href)
print(sentinel2_ard_cog_asset.href)
https://dap.ceda.ac.uk/badc/cmip6/metadata/kerchunk/pipeline1/ScenarioMIP/THU/CIESM/kr1.0/CMIP6_ScenarioMIP_THU_CIESM_ssp585_r1i1p1f1_Amon_rsus_gr_v20200806_kr1.0.json
https://dap.ceda.ac.uk/neodc/sentinel_ard/data/sentinel_2/2023/11/17/S2A_20231117_latn509lonw0008_T30UXB_ORB137_20231117131218_utm30n_osgb_vmsk_sharp_rad_srefdem_stdsref.tif
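When an item has many assets, it can help to filter them by file type rather than reading the full printout. The following is a small sketch that works on a plain dictionary of asset key to href. The hrefs here are shortened, hypothetical placeholders (not the real CEDA URLs), and `hrefs_with_suffix` is a helper name invented for this example.

```python
def hrefs_with_suffix(hrefs, suffix):
    """Return the {asset key: href} pairs whose href ends with the given suffix."""
    suffix = suffix.lower()
    return {key: href for key, href in hrefs.items() if href.lower().endswith(suffix)}


# Hypothetical, shortened hrefs standing in for the real asset links.
example_hrefs = {
    "cloud": "https://example.org/scene_clouds.tif",
    "cog": "https://example.org/scene_stdsref.tif",
    "metadata": "https://example.org/scene_stdsref_meta.xml",
    "thumbnail": "https://example.org/scene_stdsref_thumbnail.jpg",
}

geotiffs = hrefs_with_suffix(example_hrefs, ".tif")
print(sorted(geotiffs))
```

With the real item, the input dictionary could be built as `{key: asset.href for key, asset in sentinel_item.assets.items()}`, since each asset exposes an `href` attribute as shown above.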
# Get information on the cloud product
product = cmip6_item.get_cloud_products()
product
<DataPointCloudProduct: CMIP6.ScenarioMIP.THU.CIESM.ssp585.r1i1p1f1.Amon.rsus.gr.v20200806-reference_file (Format: kerchunk)>
 - bbox: [-180.0, -90.0, 178.75, 90.0]
 - asset_id: CMIP6.ScenarioMIP.THU.CIESM.ssp585.r1i1p1f1.Amon.rsus.gr.v20200806-reference_file
 - cloud_format: kerchunk
Attributes:
 - title: CMIP6.ScenarioMIP.THU.CIESM.ssp585.r1i1p1f1.Amon.rsus.gr.v20200806
 - datetime: 4072-01-01T00:00:00Z
 - created: 2025-01-24T14:29:23.741213Z
 - updated: 2025-08-29T03:10:04.291804Z
 - start_datetime: 4029-01-16T12:00:00Z
 - end_datetime: 4114-12-16T12:00:00Z
 - cmip6:further_info_url: https://furtherinfo.es-doc.org/CMIP6.THU.CIESM.ssp585.none.r1i1p1f1
 - cmip6:source_type: AOGCM
 - cmip6:institution_id: THU
 - cmip6:variable_long_name: Surface Upwelling Shortwave Radiation
 - cmip6:mip_era: CMIP6
 - project: CMIP6
 - cmip6:access: ['HTTPServer']
 - cmip6:nominal_resolution: 100 km
 - cmip6:sub_experiment_id: none
 - cmip6:frequency: mon
 - cmip6:experiment_id: ssp585
 - cmip6:table_id: Amon
 - cmip6:activity_id: ScenarioMIP
 - product: model-output
 - model_cohort: Registered
 - cmip6:grid_label: gr
 - cmip6:cf_standard_name: surface_upwelling_shortwave_flux_in_air
 - cmip6:data_specs_version: 01.00.29
 - cmip6:variable_id: rsus
 - cmip6:variable_units: W m-2
 - cmip6:source_id: CIESM
 - cmip6:experiment_title: update of RCP8.5 based on SSP5
 - cmip6:citation_url: http://cera-www.dkrz.de/WDCC/meta/CMIP6/CMIP6.ScenarioMIP.THU.CIESM.ssp585.r1i1p1f1.Amon.rsus.gr.v20200806.json
 - cmip6:variant_label: r1i1p1f1
 - realm: ['atmos']
 - cmip6:retracted: False
 - cmip6:grid: gs1x1
We can see that this item contains a single product, a kerchunk reference file, which allows us to access the dataset as a single object. The format doesn't actually matter here, as the tools we are using allow us to open any recognised format into xarray with minimal understanding of the data format on the user side.
The following code cell opens the file as an xarray dataset and presents the user with information about the array structure.
# Note: this rebinds the name `ds`, shadowing the datashader alias imported earlier
ds = product.open_dataset()
ds
<xarray.Dataset> Size: 228MB
Dimensions: (lat: 192, bnds: 2, lon: 288, time: 1032)
Coordinates:
* lat (lat) float64 2kB -90.0 -89.06 -88.12 -87.17 ... 88.12 89.06 90.0
* lon (lon) float64 2kB 0.0 1.25 2.5 3.75 ... 355.0 356.2 357.5 358.8
* time (time) object 8kB 4029-01-16 12:00:00 ... 4114-12-16 12:00:00
Dimensions without coordinates: bnds
Data variables:
lat_bnds (lat, bnds) float64 3kB dask.array<chunksize=(192, 2), meta=np.ndarray>
lon_bnds (lon, bnds) float64 5kB dask.array<chunksize=(288, 2), meta=np.ndarray>
rsus (time, lat, lon) float32 228MB dask.array<chunksize=(1, 192, 288), meta=np.ndarray>
time_bnds (time, bnds) object 17kB dask.array<chunksize=(1, 2), meta=np.ndarray>
Attributes: (12/46)
Conventions: CF-1.7 CMIP-6.2
activity_id: ScenarioMIP
branch_method: standard
branch_time_in_child: 735110.0
branch_time_in_parent: 735110.0
cmor_version: 3.6.0
... ...
table_id: Amon
table_info: Creation Date:(20 February 2019) MD5:510997cd0a2c...
title: CIESM output prepared for CMIP6
tracking_id: hdl:21.14100/26602daf-2379-491b-ad98-2ebb2f581db7
variable_id: rsus
variant_label: r1i1p1f1
Looking at the output above we can see that the dataset has four data variables. We can perform various selections on the data, but specifically we need to choose the variable containing the data to plot and then select the time step we are interested in. We will use the geoviews package to plot the resulting map.
data_var = list(ds.data_vars)[2] # Choose the data variable (modify if needed)
print("Variable: ", data_var)
timestep = 555 # Choose any index from 0 to 1031 (the xarray output shows 1032 time steps)
Variable: rsus
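Picking the variable by position works here, but the index depends on the order in which the variables happen to be stored. A possible alternative, sketched below, is to choose the variable by its dimensions, so that the bounds variables (`lat_bnds`, `lon_bnds`, `time_bnds`) are skipped. The dictionary of variable name to dimensions is typed in by hand from the xarray output above, and `gridded_variables` is a hypothetical helper name.

```python
def gridded_variables(var_dims, required=("time", "lat", "lon")):
    """Return the names of variables that have every required dimension."""
    return [
        name
        for name, dims in var_dims.items()
        if all(dim in dims for dim in required)
    ]


# Variable names and dimensions copied from the xarray output above.
var_dims = {
    "lat_bnds": ("lat", "bnds"),
    "lon_bnds": ("lon", "bnds"),
    "rsus": ("time", "lat", "lon"),
    "time_bnds": ("time", "bnds"),
}

print(gridded_variables(var_dims))  # ['rsus']
```

With the opened dataset, the same mapping could be built as `{name: var.dims for name, var in ds.data_vars.items()}`.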
selected_data = ds[data_var].isel(time=timestep) # Select the data for the time step
# Plot the data
plot = selected_data.hvplot.quadmesh(
x="lon",
y="lat",
cmap="plasma",
geo=True,
projection=ccrs.PlateCarree(),
coastline=True,
title=f"Time Step: {ds.time.values[timestep]}",
)
plot
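The plot title reports the timestamp of the chosen time step. Because the data are monthly and, according to the xarray output, run from January 4029 to December 4114 (1032 steps), the calendar month can also be worked out from the index alone. The sketch below shows that arithmetic; `timestep_to_year_month` is a hypothetical helper, and its defaults are taken from the dataset summary above rather than read from the file.

```python
def timestep_to_year_month(index, start_year=4029, start_month=1, n_steps=1032):
    """Map a zero-based monthly time index to a (year, month) pair."""
    if not 0 <= index < n_steps:
        raise IndexError(f"index must be between 0 and {n_steps - 1}, got {index}")
    # Count months from the start of start_year, then split into years and months.
    months = (start_month - 1) + index
    return start_year + months // 12, months % 12 + 1


print(timestep_to_year_month(0))     # (4029, 1)
print(timestep_to_year_month(555))   # (4075, 4)
print(timestep_to_year_month(1031))  # (4114, 12)
```

This also makes the valid range explicit: indices run from 0 to 1031, so passing 1032 raises an error, just as it would when indexing the dataset.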