Pangeo and openEO offer a training session that will guide you through the Pangeo (http://pangeo.io/) and openEO (https://openeo.org/) ecosystems for developing efficient pipelines for big Earth science data. The emphasis is on how the two ecosystems complement each other, with the goal of teaching attendees how to fully exploit both frameworks to run complex data workflows. Attendees will learn about open, reproducible, and scalable Earth science. The workshop is designed to help anyone interested in starting their journey with Pangeo and openEO while avoiding common pitfalls.
All the Python packages used during this training are open source. The sample datasets used in the tutorial are Earth Observation (EO) datasets that are freely available to everyone and can also be used for real scientific analysis.
The training material is openly licensed too (CC BY 4.0).
The workshop aims to empower attendees to learn new skills and build confidence in using them in their own work. The tutorial alternates follow-along demonstrations with hands-on exercises that check attendees' understanding. There will be multiple opportunities to ask questions and to discuss with the Pangeo and openEO communities.
This workshop assumes prior knowledge of the Python programming language and the basics of Xarray. We recommend that learners with no prior knowledge of Python or Xarray familiarise themselves with them beforehand, for instance using the Software Carpentry training material (https://swcarpentry.github.io/python-novice-gapminder/), Project Pythia (https://foundations.projectpythia.org/core/xarray.html), the Xarray tutorial (https://tutorial.xarray.dev), or the Pangeo Galaxy Training material (https://training.galaxyproject.org/training-material/topics/climate/tutorials/pangeo-notebook/tutorial.html).
9:00 Welcome (5 minutes)
9:05 Introduction and Motivation (15 minutes)
9:20 Overview of the Pangeo ecosystem (20 minutes)
9:40 Understanding Xarray to avoid common pitfalls (30 minutes)
10:10 Interactive Visualization with hvPlot (20 minutes)
10:30 Break (30 minutes)
11:00 Getting started with openEO (15 minutes)
11:15 Accessing data with openEO (25 minutes)
11:40 Processing data with openEO (30 minutes)
12:10 Working with data cubes with openEO (20 minutes)
Part-3: Unlocking the Power of Space Data with Pangeo & openEO
14:00 Understanding what openEO does best and how to use it to streamline your data analysis (25 minutes)
14:25 Scaling with openEO (25 minutes)
14:50 Understanding when and how to exploit Pangeo to customise your algorithm and analyse multiple data sources (20 minutes)
15:10 Introduction to chunking (20 minutes)
16:00 Scaling with Dask (30 minutes)
16:30 Cloud-friendly access to archival data with kerchunk (25 minutes)
16:55 Create Analysis Ready Cloud Optimised (ARCO) data (25 minutes)
17:20 Common workflow that combines the best of the two “worlds” (30 minutes)
17:50 Wrap-up and feedback survey (10 minutes)