WIS2 in a box training
WIS2 in a box (wis2box) is a Free and Open Source Software (FOSS) Reference Implementation of a WMO WIS2 Node. The project provides a plug-and-play toolset to ingest, process, and publish weather/climate/water data using standards-based approaches in alignment with the WIS2 principles. wis2box also provides access to all data in the WIS2 network. wis2box is designed to have a low barrier to entry for data providers, providing enabling infrastructure and services for data discovery, access, and visualization.
This training provides step-by-step explanations of various aspects of the wis2box project as well as a number of exercises to help you publish and download data from WIS2. The training is provided in the form of overview presentations as well as hands-on practical exercises.
Participants will be able to work with sample test data and metadata, as well as integrate their own data and metadata.
This training covers a wide range of topics (install/setup/configuration, publishing/downloading data, etc.).
Goals and learning outcomes
The goals of this training are to become familiar with the following:
- WIS2 architecture core concepts and components
- data and metadata formats used in WIS2 for discovery and access
- WIS2-in-a-box (wis2box) architecture and environment
- wis2box-core functions:
- metadata management
- data ingestion and transformation to BUFR-format
- MQTT-broker for WIS2-message publishing
- HTTP-endpoint for data download
- API-endpoint for programmatic access to data
Navigation
The left hand navigation provides a table of contents for the entire training.
The right hand navigation provides a table of contents for a specific page.
Prerequisites
Knowledge
- Basic Linux commands (see the cheatsheet)
- Basic knowledge of networking and Internet protocols
Software
This training requires the following tools:
- An instance running the Ubuntu OS (provided by WMO trainers during local training sessions); see the "Your student VM" section
- SSH client to access your instance
- MQTT Explorer on your local machine
- SCP and FTP client to copy files from your local machine
Conventions
Question
A section marked like this invites you to answer a question.
Also you will notice tips and notes sections within the text:
Tip
Tips share help on how to best achieve tasks.
Note
Notes provide additional information on the topic covered by the practical session, as well as how to best achieve tasks.
Examples are indicated as follows:
Configuration
Snippets which need to be typed on a terminal/console are indicated as:
echo 'Hello world'
Training location and materials
A live version of this training is always available at https://training.wis2box.wis.wmo.int.
The training contents, wiki and issue tracker are managed on GitHub at https://github.com/wmo-im/wis2box-training.
Printing the material
This training can be exported to PDF. To save or print this training material, go to the print page, and select File > Print > Save as PDF.
Exercise materials
Exercise materials can be downloaded from the exercise-materials.zip zipfile.
Support
For issues/bugs/suggestions or improvements/contributions to this training, please use the GitHub issue tracker.
All wis2box bugs, enhancements and issues can be reported on GitHub.
For additional support or questions, please contact wis@wmo.int.
The wis2box core documentation can always be found at https://docs.wis2box.wis.wmo.int.
Contributions are always encouraged and welcome!
20 - 24 March 2023: Windhoek, Namibia 🇳🇦
Venue
Hotel Safari, Managed By Accor
Corner Of Auas Aviation Road, 9000 Windhoek, Namibia
Learning Objectives
- to be able to explain WIS2 architecture
- to be able to install and configure the WIS2 in a box software
- to be able to operate a WIS2 Node
Timetable
Day 1
Time | Subject | Instructor(s) | Presentation(s) |
---|---|---|---|
8h30 - 9h | Registration | ||
09h - 11h | Opening photo and Round table | ||
11h - 11h30 | Break | ||
11h30 - 12h | WIS2 Implementation Plan | Hassan Haddouch | Presentation |
12h - 12h30 | WIS2 Architecture | Enrico Fucile | Presentation |
12h30 - 13h30 | Lunch break | ||
13h30 - 14h30 | Practical session: Connecting to WIS2 over MQTT | Maaike Limper/Timo Proescholdt | |
14h30 - 15h15 | Introducing WIS2 in a box | Tom Kralidis | Presentation |
15h15 - 15h45 | Break | ||
15h45 - 16h30 | WIS2 in a box case studies | Enrico Fucile | Presentation |
16h30 - 17h | Wrap-up / open issues | Timo Proescholdt |
Day 2
Time | Subject | Instructor(s) | Presentation(s) |
---|---|---|---|
9h - 9h30 | The training environment | Timo Proescholdt | Presentation |
9h30 - 11h | Practical session: Accessing your student VM | Maaike Limper/Timo Proescholdt | |
11h - 11h30 | Break | ||
11h30 - 12h30 | WIS2 in a box architecture, project and installation | Tom Kralidis | Architecture Project and installation |
12h30 - 13h30 | Lunch break | ||
13h30 - 15h | Practical session: Installing WIS2 box with test data | Tom Kralidis/Maaike Limper | |
15h - 15h30 | Break | ||
15h30 - 16h30 | Practical session: Installing WIS2 box with test data | Tom Kralidis/Maaike Limper | |
16h30 - 17h | Group discussion: insights and challenges installing wis2box | Timo Proescholdt |
Day 3
Time | Subject | Instructor(s) | Presentation(s) |
---|---|---|---|
9h - 10h | BUFR and ecCodes | Enrico Fucile | BUFR in a nutshell BUFR with ecCodes |
10h - 11h | Conversion to BUFR: synop2bufr | Enrico Fucile/Maaike Limper | SYNOP to BUFR |
11h - 11h30 | Break | ||
11h30 - 12h30 | Conversion to BUFR: csv2bufr | Enrico Fucile/Maaike Limper | CSV to BUFR |
12h30 - 13h30 | Lunch break | ||
13h30 - 14h | DAYCLI | Enrico Fucile | Daily Climate data (DAYCLI) |
14h - 14h30 | Practical session: Converting DAYCLI data to BUFR | Enrico Fucile | |
14h30 - 15h | Break | ||
15h - 16h30 | Convert your data | Enrico Fucile/Maaike Limper | |
16h30 - 17h | Group discussion: insights and challenges converting data to BUFR | Timo Proescholdt |
Day 4
Time | Subject | Instructor(s) | Presentation(s) |
---|---|---|---|
9h - 11h | Installing wis2box with test data; Configuring data mappings; Configuring WIS2 discovery metadata; Configuring station metadata | Tom Kralidis/Maaike Limper | |
11h - 11h30 | Break | ||
11h30 - 12h30 | wis2box practical session: runtime configuration setup (continued) | Tom Kralidis/Maaike Limper | |
12h30 - 13h30 | Lunch break | ||
13h30 - 15h30 | Practical session: Data ingest and monitoring | Tom Kralidis/Maaike Limper | |
15h30 - 17h | Practical session: Downloading and finding data from WIS2 | Tom Kralidis/Maaike Limper | WIS2 data access |
Day 5
Time | Subject | Instructor(s) | Presentation(s) |
---|---|---|---|
9h - 10h30 | Joining WIS2 as a WIS2 Node | Timo Proescholdt | Participating in the pilot phase as a WIS2 node |
10h30 - 11h | Next steps with WIS2 | Hassan Haddouch/Enrico Fucile | Next steps with WIS2 |
11h - 11h30 | Round table: participants' plans for WIS2 | all | WIS2 National Plan |
11h30 - 12h | Closing ceremony | all |
Participants
Country | Name | |
---|---|---|
1 | Algeria 🇩🇿 | Abderraouf KADEM |
2 | Congo 🇨🇬 | Francilly Lardin SAMBA |
3 | Eswatini 🇸🇿 | Mbongeni Ian GAMEDZE |
4 | Kenya 🇰🇪 | Peter Kipkorir MUTAI |
5 | Malawi 🇲🇼 | Patrick Edward MTINGWI |
6 | Morocco 🇲🇦 | Chems Eddine EL GARRAI |
7 | Namibia 🇳🇦 | Odillo KGOBETSI |
8 | Namibia 🇳🇦 | Laina AMUNJELA (Ms.) |
9 | Namibia 🇳🇦 | Seblonica IMALWA (Ms.) |
10 | Namibia 🇳🇦 | Prince Julian HAOSEB |
11 | Namibia 🇳🇦 | Uvatera TJIMUNE |
12 | Namibia 🇳🇦 | Vilho NDEUNYEMA |
13 | South Africa 🇿🇦 | Tshepo Hope TAWANE |
14 | South Africa 🇿🇦 | Teko Mohono |
15 | South Africa 🇿🇦 | Thula Ngidi |
16 | South Africa 🇿🇦 | Christa Ferreira (Ms.) |
17 | Tanzania 🇹🇿 | Rose Dudley SENYAGWA (Ms.) |
18 | Zambia 🇿🇲 | Naomi MUCHIPU (Ms.) |
19 | Zimbabwe 🇿🇼 | Joyce BANDA (Ms.) |
20 | Zimbabwe 🇿🇼 | Wilfred Janda CHAWAGUTA |
21 | Zimbabwe 🇿🇼 | Zvidzai Malvin KANENGONI |
Instructors
Name | Organization |
---|---|
Enrico Fucile | WMO Secretariat 🇺🇳 |
Hassan Haddouch | WMO Secretariat 🇺🇳 |
Tom Kralidis | Meteorological Service of Canada 🇨🇦 |
Maaike Limper | WMO Secretariat 🇺🇳 |
Timo Proescholdt | WMO Secretariat 🇺🇳 |
12 - 16 June 2023: Port of Spain, Trinidad and Tobago 🇹🇹
IN PREPARATION
9 - 13 October 2023: Jakarta, Indonesia 🇮🇩
IN PREPARATION
Practical sessions
Connecting to WIS2 over MQTT
Introduction
WIS2 uses the MQTT protocol to advertise the availability of weather/climate/water data. The WIS2 Global Broker subscribes to all WIS2 Nodes in the network and republishes the messages it receives. The Global Cache subscribes to the Global Broker, downloads the data referenced in the message, and then republishes the message on the `cache` topic with a new URL. The Global Discovery Catalogue publishes discovery metadata from the Global Broker and provides a search API.
As part of the WIS2 Pilot Phase in 2023, Météo-France is running a Global Broker. In this practical session you will learn how to use the MQTT Explorer tool to review the topics available on this Global Broker and be able to display WIS2 notification messages.
Using MQTT Explorer to connect to the Global Broker
One way to view messages published by this Global Broker is to use MQTT Explorer, which can be downloaded from the MQTT Explorer website.
Open MQTT Explorer and add a new connection as follows:
Click on the 'ADVANCED' button and add the following topics to subscribe to:
Note
When setting up MQTT subscriptions you can use the following wildcards:
- Single-level (+): a single-level wildcard replaces one topic level
- Multi-level (#): a multi-level wildcard replaces multiple topic levels
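To make the wildcard semantics concrete, here is a minimal illustrative sketch in Python of how a broker matches a topic filter against a topic. This is a simplified re-implementation for learning purposes, not code from MQTT Explorer or any MQTT library:

```python
def topic_matches(pattern: str, topic: str) -> bool:
    """Sketch of MQTT wildcard matching: '+' matches exactly one
    topic level, '#' matches all remaining levels."""
    p_levels = pattern.split("/")
    t_levels = topic.split("/")
    for i, p in enumerate(p_levels):
        if p == "#":
            return True  # matches this level and everything below it
        if i >= len(t_levels):
            return False  # topic has fewer levels than the filter
        if p != "+" and p != t_levels[i]:
            return False  # literal level does not match
    return len(p_levels) == len(t_levels)

# '+' replaces exactly one level (here: any country code)
print(topic_matches("origin/a/wis2/+/#", "origin/a/wis2/mwi/malawi_wmo_demo/data"))  # True
# '#' replaces any number of remaining levels
print(topic_matches("cache/#", "cache/a/wis2/mwi"))  # True
```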
Click 'BACK', then 'SAVE' to save your connection and subscription details. Then click 'CONNECT':
At this point, the following should appear in the MQTT Explorer session:
You are now ready to start exploring the WIS2 topics and message structure and answer the following questions:
Question
How can we distinguish the originating country providing the data? How many countries are sharing data?
Question
Find the latest message received in the following topic:
origin/a/wis2/mwi/malawi_wmo_demo/data/core/weather/surface-based-observations/synop
What is the URL from which we can download the data in BUFR format?
Question
Find the latest message received on the following topic:
cache/a/wis2/mwi/malawi_wmo_demo/data/core/weather/surface-based-observations/synop
What is the URL we can use to download the data? What is the difference between this URL and the URL in the previous question?
Question
How does the WIS2 message define the location where the data was observed? And how you can determine the time that the data was published?
Note
MQTT Explorer is a helpful tool to review the topic structure for a given MQTT broker and visually work with the MQTT protocol. There exist many MQTT client and server software. To work with MQTT programmatically (for example, in Python), paho-mqtt can be used to connect to an MQTT broker and process incoming messages.
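As a taste of programmatic processing, the sketch below parses a notification payload with Python's standard `json` module and extracts the download link. The payload shown is a simplified, hypothetical example: real WIS2 notification messages are GeoJSON-based, with a `links` array (the `canonical` link points to the data) and a publication time in `properties.pubtime`; consult the WIS2 notification message specification for the authoritative structure.

```python
import json
from typing import Optional

def canonical_url(message: str) -> Optional[str]:
    """Return the canonical (download) URL from a WIS2 notification
    message payload, or None if no canonical link is present."""
    msg = json.loads(message)
    for link in msg.get("links", []):
        if link.get("rel") == "canonical":
            return link.get("href")
    return None

# Simplified, illustrative payload -- not a complete real notification
payload = json.dumps({
    "properties": {"pubtime": "2023-03-22T06:00:00Z"},
    "links": [
        {"rel": "canonical", "type": "application/x-bufr",
         "href": "http://example.org/data/example.bufr4"},
    ],
})
print(canonical_url(payload))  # http://example.org/data/example.bufr4
```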
Conclusion
Congratulations!
Congratulations! In this practical session, you learned:
- how to subscribe to the WIS2 Global Broker from Météo-France using MQTT Explorer
- the WIS2 topic structure
- the WIS2 notification message structure
- the difference between Global Broker messages published on the `origin` and `cache` topics
Your student VM
Introduction
As part of locally run wis2box training sessions, you can access your personal student VM on the local training network named "WIS2-training".
Note
If you want to run this training outside of a local training session, you can provide your own instance using any Cloud Provider:
- GCP (Google Cloud Platform): VM instance `e2-medium`
- AWS (Amazon Web Services): EC2 instance `t3a.medium`
- Azure (Microsoft): Azure Virtual Machine `standard_b2s`
Select Ubuntu Server 20.04 LTS as the OS and run the setup script available in student-vm-setup.zip on your instance to ensure you have all required software.
If you are using the student VM provided during local WIS2 training sessions, the required software will already be installed.
Note
The student VMs provided during WIS2 local training sessions have the following command-line editors pre-installed:
- vi
- vim
- nano
- emacs
Connect to your student VM on the local training network
Use the following configuration to connect your PC to the local WiFi network broadcast in the room during WIS2 training:
- SSID: WIS2-training
- password: dataismagic!
Use an SSH client to connect to your student VM using the following:
- Host: (provided during in-person training)
- Port: 22
- Username: (provided during in-person training)
- Password: wis2training (default password to be changed after logging in)
Tip
Contact a trainer if you are unsure about the hostname/username or have issues connecting.
Once connected, please change your password to ensure others cannot access your VM:
limper@student-vm:~$ passwd
Changing password for testuser.
Current password:
New password:
Retype new password:
passwd: password updated successfully
Verify software versions
To be able to practice conversion to BUFR, the student VM comes with ecCodes, synop2bufr and csv2bufr pre-installed:
Check the ecCodes version via the `bufr_dump` command:
bufr_dump -V
ecCodes Version 2.28.0
Check synop2bufr version:
synop2bufr --version
synop2bufr, version 0.4.0
Check csv2bufr version:
csv2bufr --version
csv2bufr, version 0.5.1
To be able to run wis2box, the student VM also comes with Python, Docker and Docker Compose pre-installed.
Check docker version:
docker --version
Docker version 20.10.17, build 100c701
Check Docker Compose version:
docker-compose --version
docker-compose version 1.29.0, build unknown
Check Python version:
python3 --version
Python 3.8.10
Inspect the exercise materials
Inspect the contents of your home directory; these are the materials used as part of the training and practical sessions.
ls ~/
exercise-materials wis2box-1.0b3
To access the material on your local machine rather than from the command line, you can use SCP. Using WinSCP, you can create a new SCP connection to your VM as follows:
And you should be able to access the exercise-materials:
Conclusion
Congratulations!
In this practical session, you learned how to:
- access your student VM over SSH and WinSCP
- verify the required software for the practical exercises is installed
- verify you have access to exercise materials for this training on your local student VM
Converting SYNOP data to BUFR
Introduction
Surface synoptic observations (SYNOP) data are used to report weather observations from surface stations (manned or automated). synop2bufr is a tool to help convert SYNOP data to BUFR. ecCodes is a package for reading and writing data in the GRIB and BUFR formats.
In this session you will learn about converting a SYNOP report into the WMO BUFR format using the above mentioned tools, as well as the relationship between SYNOP reports and BUFR messages.
Preparation
Warning
Ensure that you are logged into your student VM.
Navigate to the `exercise-materials/synop2bufr-exercises` directory and make sure that the exercise directories are there.
cd ~/exercise-materials/synop2bufr-exercises
ls
Tip
You should be able to see the following directories: `answers ex_1 ex_2 ex_3 ex_4 ex_5`
synop2bufr primer
Below are essential `synop2bufr` commands and configurations:
transform
The `transform` command converts a SYNOP message to BUFR:
synop2bufr transform --metadata my_file.csv --output-dir ./my_directory --year message_year --month message_month my_SYNOP.txt
Note that if the metadata, output directory, year and month options are not specified, they will assume their default values:
Option | Default |
---|---|
--metadata | station_list.csv |
--output-dir | The current working directory. |
--year | The current year. |
--month | The current month. |
In the examples, the year and month are not given, so feel free to specify a date yourself or use the default values.
ecCodes primer
ecCodes provides both command line tools and can be embedded in your own applications. Below are some useful command line utilities to work with BUFR data.
bufr_dump
The `bufr_dump` command is a generic BUFR information tool. It has many options, but the following will be the most applicable to the exercises:
bufr_dump -p my_bufr.bufr4
This will display the BUFR content on your screen. If you are interested in the values taken by a particular variable, use the `egrep` command:
bufr_dump -p my_bufr.bufr4 | egrep -i temperature
This will display variables related to temperature in your BUFR data. If you want to do this for multiple types of variables, separate the patterns with `|` (regular-expression alternation):
bufr_dump -p my_bufr.bufr4 | egrep -i 'temperature|wind'
Inspecting SYNOP data and BUFR conversion
Exercise 1
Navigate to the `exercise-materials/synop2bufr-exercises/ex_1` directory and inspect the SYNOP message file `message.txt`:
cd ~/exercise-materials/synop2bufr-exercises/ex_1
more message.txt
Question
How many SYNOP reports are in this file?
Inspect the station list:
more station_list.csv
Question
How many stations are listed in the station list?
Question
Convert `message.txt` to BUFR format.
Tip
See the synop2bufr primer section.
Note
BUFR files have no set file extension, however it is recommended to use `.bufr4`.
Inspect the resulting BUFR data using `bufr_dump`.
Question
Compare the latitude and longitude values to those in the station list.
Tip
See the ecCodes primer section.
Exercise 2
Navigate to the `exercise-materials/synop2bufr-exercises/ex_2` directory and inspect the SYNOP message file `message.txt`:
cd ~/exercise-materials/synop2bufr-exercises/ex_2
more message.txt
Question
How many SYNOP reports are in this file?
Inspect the station list:
more station_list.csv
Question
How many stations are listed in the station list?
Question
Convert `message.txt` to BUFR format.
Question
Based on the results of this and the previous exercise, how would you predict the number of resulting BUFR files from the number of SYNOP reports and the number of stations listed in the station metadata file?
Inspect the resulting BUFR data using `bufr_dump`.
Question
Check that each of the output BUFR files contains a different WIGOS Station Identifier (WSI).
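Background: a WIGOS Station Identifier consists of four hyphen-separated blocks: identifier series, issuer of identifier, issue number, and local identifier. A small illustrative sketch for splitting a WSI into these blocks (the example WSI is one used elsewhere in this training):

```python
def parse_wsi(wsi: str) -> dict:
    """Split a WIGOS Station Identifier into its four blocks:
    <identifier series>-<issuer>-<issue number>-<local identifier>.
    The local identifier may itself contain hyphens, so split only 3 times."""
    series, issuer, issue, local = wsi.split("-", 3)
    return {"series": series, "issuer": issuer,
            "issue": issue, "local": local}

print(parse_wsi("0-454-2-AWSBILIRA"))
# {'series': '0', 'issuer': '454', 'issue': '2', 'local': 'AWSBILIRA'}
```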
Exercise 3
Navigate to the `exercise-materials/synop2bufr-exercises/ex_3` directory and inspect the SYNOP message file `message.txt`:
cd ~/exercise-materials/synop2bufr-exercises/ex_3
more message.txt
This SYNOP message only contains one longer report with more sections.
Inspect the station list:
more station_list.csv
Question
Is it problematic that this file contains more stations than there are reports in the SYNOP message?
Note
The station list file is a source of metadata for `synop2bufr`, providing the information missing from the alphanumeric SYNOP report but required in the BUFR SYNOP.
Question
Convert `message.txt` to BUFR format.
Inspect the resulting BUFR data using `bufr_dump`.
Question
Find the following variables:
- Air temperature (K) of the report
- Total cloud cover (%) of the report
- Total period of sunshine (mins) of the report
- Wind speed (m/s) of the report
Tip
You may find the last command of the ecCodes primer section useful.
Exercise 4
Navigate to the `exercise-materials/synop2bufr-exercises/ex_4` directory and inspect the SYNOP message file `message_incorrect.txt`:
cd ~/exercise-materials/synop2bufr-exercises/ex_4
more message_incorrect.txt
Question
What is incorrect about this SYNOP file?
Attempt to convert `message_incorrect.txt` using `station_list.csv`.
Question
What problem(s) did you encounter with this conversion?
Exercise 5
Navigate to the `exercise-materials/synop2bufr-exercises/ex_5` directory and inspect the SYNOP message file `message.txt`:
cd ~/exercise-materials/synop2bufr-exercises/ex_5
more message.txt
Attempt to convert `message.txt` to BUFR format using `station_list_incorrect.csv`.
Question
What problem(s) did you encounter with this conversion?
Considering the error presented, justify the number of BUFR files produced.
Conclusion
Congratulations!
In this practical session, you learned:
- the principles of SYNOP data reporting
- how to use `synop2bufr` to convert SYNOP data to BUFR format
- how to use `bufr_dump` to inspect the content of BUFR data
Converting CSV data to BUFR
Introduction
CSV is a commonly used format for recording tabular data. `csv2bufr` is a tool to help convert CSV data to BUFR.
In this session you will learn to create BUFR data from CSV, using custom and flexible configuration (mappings) in support of meeting WMO GBON requirements.
Preparation
Warning
Ensure that you are logged into your student VM.
Navigate to the `exercise-materials/csv2bufr-exercises` directory and make sure that the exercise directories are there.
cd ~/exercise-materials/csv2bufr-exercises
ls
Tip
You should be able to see the following directories: `BUFR_tables answers ex_1 ex_2 ex_3 ex_4 ex_5`
csv2bufr primer
Below are essential `csv2bufr` commands and configurations:
mappings create
The `mappings create` command creates an empty BUFR mapping template JSON file, which maps CSV column headers to their corresponding ecCodes elements:
csv2bufr mappings create <BUFR descriptors> --output <my_template.json>
For more information, see the following example.
data transform
The `data transform` command converts a CSV file to BUFR format:
csv2bufr data transform --bufr-template <my_template.json> --output-dir <./my_directory> <my_data.csv>
Note
The output directory is not required, and by default is the current working directory.
ecCodes BUFR refresher
bufr_dump
The `bufr_dump` command allows you to inspect the BUFR files created by the conversion. It has numerous options; the following will be most applicable to the exercises:
bufr_dump -p <my_bufr.bufr4>
This will display the content of your BUFR file on screen. If you are interested in the values taken by a particular variable, use the `egrep` command:
bufr_dump -p <my_bufr.bufr4> | egrep -i temperature
This will display the variables related to temperature in your BUFR data. If you want to do this for multiple types of variables, separate the patterns with `|` (regular-expression alternation):
bufr_dump -p <my_bufr.bufr4> | egrep -i 'temperature|wind'
Inspecting CSV data and BUFR conversion
Exercise 1: Converting a CSV file to BUFR
In this exercise we will look at a pre-configured mapping file for the CSV data, and will use this to convert the data to BUFR.
Navigate to the `ex_1` directory:
cd ~/exercise-materials/csv2bufr-exercises/ex_1
and open the CSV data `ex_1.csv`.
- How many header rows are there in this data?
- Which row contains the column names?
Now open the mappings file `mappings_1.json`.
Note
csv2bufr mappings files have no set file extension, however it is recommended to use `.json`.
- Verify that `"number_header_rows"` and `"column_names_row"` are the same as your answers above.
- Locate each of the CSV column names in this mappings file.
- Using the `data transform` command, use the mappings file to convert this CSV data to BUFR.
- Use `bufr_dump` to find the latitude and longitude values stored in the output BUFR file. Verify these values against the CSV file.
Exercise 2: Correcting the datetime format
In this exercise we will investigate the correct format to present the datetime of an observation in the CSV file.
Navigate to the `ex_2` directory:
cd ~/exercise-materials/csv2bufr-exercises/ex_2
and open the CSV data `ex_2.csv`.
- What are the differences in the way that the datetime is represented in this CSV file compared to the previous one?
Now open the mappings file `mappings_2.json`. By looking at the ecCodes keys related to dates and times, it should be clear that it is not possible to map the datetime with the CSV in its current state.
- Create new columns in the CSV file for each component of the datetime, with appropriate column names to match those of the mappings file.
- Using the `data transform` command, use the mappings file to convert this CSV data to BUFR.
Exercise 3: Handling changes to the CSV data
In this exercise we consider the following scenario: given the same CSV data but with different column names, how can we adjust the mappings file to convert this data to BUFR? For simplicity, we will only look at one column name change.
Navigate to the `ex_3` directory:
cd ~/exercise-materials/csv2bufr-exercises/ex_3
- Using the `data transform` command, attempt to convert the CSV data to BUFR. What error appears?
Open the CSV data `ex_3.csv`.
- What column name has been changed?
Open the mappings file `mappings_3.json`.
- Find the original column name in this mappings file, and change it to the new name.
- Using the `data transform` command, use the mappings file to convert this CSV data to BUFR.
- Use `bufr_dump` to verify that `relativeHumidity` has the same value as in the CSV data.
Exercise 4: Unit conversion
In this exercise, we expand on the work above by handling not only changes to column names, but also changes to the units of the data. We achieve this by using `offset` and `scale` in the mappings file.
Navigate to the `ex_4` directory:
cd ~/exercise-materials/csv2bufr-exercises/ex_4
and open the CSV data `ex_4.csv`.
- In which row are the units of the variables written?
You should notice that `BP` now has units of hPa instead of Pa. Moreover, the air temperature and dewpoint temperature now have column names `AirTempC` and `DewPointTempC`, with units of degrees C instead of K.
- What power of 10 is needed to convert hPa to Pa?
- What constant must be added to convert degrees C to K?
Open the mappings file `mappings_3.json`. Find the lines corresponding to the variables above.
- Convert `BP` to Pa by adding the following line to the right of `"data:BP"`: `"offset": "const:0", "scale": "const:x"`, where `x` is your answer in part 3.
- Change the column names of air temperature and dewpoint temperature in the mappings file to match those of the CSV file, as you did in the previous exercise.
- Convert `AirTempC` to K by adding the following line to the right of `"data:AirTempC"`: `"offset": "const:y", "scale": "const:0"`, where `y` is your answer in part 4.
- Convert `DewPointTempC` to K by adding the following line to the right of `"data:DewPointTempC"`: `"offset": "const:y", "scale": "const:0"`, where `y` is your answer in part 4.
- Using the `data transform` command, use the mappings file to convert this CSV data to BUFR.
- Use the `bufr_dump` command to verify that `nonCoordinatePressure`, `airTemperature` and `dewpointTemperature` have the values you would expect after conversion.
Exercise 5: Implementing quality control
In this exercise, we will implement some minimum and maximum tolerable values to prevent clearly incorrect data from being converted to BUFR. To do this, we will use `valid_min` and `valid_max` in the mappings file.
Navigate to the `ex_5` directory:
cd ~/exercise-materials/csv2bufr-exercises/ex_5
and open the CSV data `ex_5.csv`.
- Which two variables have values that are clearly incorrect?
- For each of these variables, decide on some sensible minimum and maximum tolerable values.
Open the mappings file `mappings_4.json`. Find the lines corresponding to the variables above.
- Implement these minimum and maximum values by adding the following line to the right of the `data:` code: `"valid_min": "const:a", "valid_max": "const:b"`, where `a` and `b` are the values you chose in part 2.
- Using the `data transform` command, use this mappings file to convert this CSV data to BUFR. What happens? Is a BUFR file written? Justify why.
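Conceptually, `valid_min` and `valid_max` amount to a simple range check on each value before it is encoded. The sketch below is an illustrative re-implementation of that idea (the bounds shown are hypothetical, not values from the exercise):

```python
def in_valid_range(value, valid_min=None, valid_max=None):
    """Range check mirroring valid_min/valid_max: a value outside the
    configured bounds fails the check and should not be encoded."""
    if valid_min is not None and value < valid_min:
        return False
    if valid_max is not None and value > valid_max:
        return False
    return True

# Hypothetical bounds for an air temperature in K
print(in_valid_range(298.15, valid_min=183.15, valid_max=333.15))  # True
print(in_valid_range(3000.0, valid_min=183.15, valid_max=333.15))  # False
```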
Conclusion
Congratulations!
In this practical session, you learned:
- the basic usage of `csv2bufr`
- how to update a simple csv2bufr mappings file for a variety of scenarios, including GBON requirements, unit conversion, and quality control/range checking
- how to run `csv2bufr` on a test data file and convert it to BUFR format
Converting DAYCLI data to BUFR
Introduction
Daily climate (DAYCLI) reports provide daily climate observations used to monitor extremes, such as daily maximum and minimum temperature, daily total precipitation, and more.
This session will focus on understanding the structure of a typical DAYCLI CSV file, converting it to BUFR format, and inspecting the resulting contents.
If this data is recorded in CSV format, we can use csv2bufr to convert this data to BUFR. Moreover, if the structure of the CSV file is correct, then one does not need to configure a mappings file for the conversion to BUFR, as a DAYCLI mapping template comes included with csv2bufr to manage such data.
Preparation
Warning
Ensure that you are logged into your student VM.
Navigate to the `exercise-materials/daycli-exercises` directory.
cd ~/exercise-materials/daycli-exercises
Exercise
Edit the file `daycli.csv`. Compare the column structure to that of the final slides of the presentation we just viewed.
Convert this CSV file to BUFR using the built-in daycli mapping file:
csv2bufr data transform --bufr-template daycli --output-dir ./ daycli.csv
Inspect the output BUFR files using `bufr_dump` and verify the data is the same as in the original DAYCLI CSV file.
Conclusion
Congratulations!
In this practical session, you learned how to:
- verify the structure of a typical DAYCLI CSV file
- convert this DAYCLI CSV file into BUFR using csv2bufr and the built-in daycli mapping template
- inspect the contents of the DAYCLI BUFR files created
wis2box services introduction
Introduction
In this session you will run the wis2box software that was pre-installed on your student VM using the test data configuration.
You will review and access the services provided by your wis2box (the MQTT broker and the HTTP-accessible services) and see how the services work when manually ingesting some test data.
Preparation
Login to your designated VM with your username and password.
Note
The latest wis2box-release 'wis2box-1.0b3' was pre-installed on your student VM using the release archive available on GitHub.
You can always find the latest 'wis2box-setup' archive at https://github.com/wmo-im/wis2box/releases
wis2box start and status
Navigate to the directory containing the wis2box software stack:
cd ~/wis2box-1.0b3
Start wis2box with the following command:
python3 wis2box-ctl.py start
Inspect the status with the following command:
python3 wis2box-ctl.py status
Repeat this command until all services are up and running.
Question
What services are running? Which ports are used for each service?
Note
wis2box runs as a set of Docker containers managed by docker-compose.
The services are defined in the various `docker-compose*.yml` files, which can be found in the `~/wis2box-1.0b3/` directory.
The script 'wis2box-ctl.py' is used to run the underlying Docker Compose commands that control the wis2box services.
wis2box UI
Open a web browser and visit the page `http://<your-host>`.
This is the default wis2box web application (running via the wis2box-ui container).
Click the "EXPLORE" option on `http://<your-host>`.
View latest data per station on the wis2box UI
Click on a station in the station list or hover your mouse over a station in the map to see the latest data for that station.
View data profile over time per measured variable
After selecting a station in the map, click on "data" and select a variable to see a graph of the measured variable over time.
Question
What is the last timestamp in UTC for which the Malawi station "Bilira" received data?
wis2box API
Open a new tab and navigate to the page `http://<your-host>/oapi`.
This is the wis2box API (running via the wis2box-api container).
To view collections currently published to the API, click "View the collections in this service".
Question
What collections are currently available?
Question
How many data notifications have been published?
Question
How many stations are configured?
wis2box-broker
Open the MQTT Explorer on your computer and prepare a new connection to connect to your broker (running via the wis2box-broker container).
Use the following connection details:
- Protocol: mqtt://
- Host: `<your-host>`
- Port: 1883
- Username: wis2box
- Password: wis2box
- under 'ADVANCED', subscribe to the topics `$SYS/#` and `origin/#`
Make sure to click "SAVE" to store your connection details.
Once you are connected, you should see statistics being published by your broker on the `$SYS` topic.
Make sure MQTT Explorer is connected to your broker before proceeding to the next exercise:
publishing WIS2 data
To demonstrate how the wis2box can publish WIS2 data we will manually ingest some data from the command line:
In your SSH client window, ensure you are in the ~/wis2box-1.0b3 directory and login to the wis2box-management container as follows:
cd ~/wis2box-1.0b3/
python3 wis2box-ctl.py login
Note
This command is equivalent to docker exec -it wis2box-management /bin/bash, meaning that you have entered an interactive shell inside the wis2box-management container.
Run the following command to ingest some additional data for Malawi:
wis2box data ingest -th mwi.mwi_met_centre.data.core.weather.surface-based-observations.synop -p /data/wis2box/observations/malawi-new-data/
After the data ingestion ran successfully, you should be able to view new messages that have been published on your wis2box broker in MQTT Explorer.
Question
What is the topic used to publish notifications for new data? How many WIS2 data notifications have been published?
download data
What is the URL that allows you to download the published data in BUFR format?
Copy and paste the URL in your browser to verify you can download the corresponding .bufr4 file.
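The download URL can also be picked out of a notification programmatically: WIS2 data notifications are JSON messages whose links array contains the download location with rel set to canonical. A minimal sketch (the sample payload below is abridged and hypothetical):

```python
import json

def canonical_download_url(notification: str):
    """Return the href of the rel='canonical' link in a WIS2 data
    notification message, or None if no such link is present."""
    msg = json.loads(notification)
    for link in msg.get('links', []):
        if link.get('rel') == 'canonical':
            return link.get('href')
    return None

# Abridged, hypothetical notification payload as seen in MQTT Explorer
sample = json.dumps({
    'links': [
        {'rel': 'canonical',
         'type': 'application/x-bufr',
         'href': 'http://example.org/data/example.bufr4'}
    ]
})
print(canonical_download_url(sample))  # → http://example.org/data/example.bufr4
```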
Go back to your browser and visit the wis2box UI.
review new data
Find the stations for which you ingested new data and verify new data is available.
Conclusion
Congratulations!
In this practical session, you learned how to:
- start wis2box and check the status of its components
- ingest some test observation data
- access the wis2box UI, API, MinIO UI and Grafana dashboard
Data ingest and monitoring
Introduction
In this session you will learn various ways to ingest data into your wis2box and learn how you can monitor if your data is being ingested without errors.
preparation
Download test-data
Download the following file onto your local computer:
WIGOS_0-454-2-AWSBILIRA_2023-06-01T1055.csv
Verify data mappings
Ensure you are logged into the wis2box-management container on your student VM:
python3 wis2box-ctl.py login
Open your data mappings file:
vi $WIS2BOX_DATA_MAPPINGS
Question
Which plugins are configured for your dataset?
Open the Grafana dashboard
Open the Grafana dashboard home-page at http://<your-host>:3000
Question
Are there any errors reported so far?
Have there been any WIS2 notifications published in the last 24 hours?
Keep a web browser tab open with the Grafana dashboard during the next few exercises to monitor the status of your data publishing.
Ingesting your data into wis2box
You can use multiple methods to ingest data into wis2box and start publishing notifications to WIS2. Previously you used the wis2box data ingest command from within the wis2box-management container, which requires the data to be available on the wis2box instance.
Another method for manually ingesting data is to use the MinIO admin interface to upload a file into the wis2box-incoming bucket.
If your data-collection software supports sending data to an FTP-endpoint you could use the optional wis2box-ftp container setup.
You can also automate data ingest using a script to copy data into the wis2box-incoming bucket at regular intervals, for example using Python and the MinIO client.
MinIO admin interface
Open a new tab in your web browser and visit the MinIO admin interface page.
You should see the buckets 'wis2box-archive', 'wis2box-incoming' and 'wis2box-public'.
You can click 'browse' to view the contents of the buckets.
Navigate to the wis2box-incoming bucket:
Click the Create new path button:
Create the following path '/test/data/':
And then upload the file 'WIGOS_0-454-2-AWSBILIRA_2023-06-01T1055.csv' into the folder 'wis2box-incoming/test/data'
View the Grafana dashboard
Check the Grafana dashboard and find the error reported after uploading the file.
Navigate the directory structure until you are in the folder 'wis2box-incoming/mwi/mwi_wmo_demo/data/core/weather/surface-based-observations/synop':
Upload the file 'WIGOS_0-454-2-AWSBILIRA_2023-06-01T1055.csv' to 'wis2box-incoming/mwi/mwi_wmo_demo/data/core/weather/surface-based-observations/synop'
View the Grafana dashboard
Check the Grafana dashboard; can you confirm the wis2box workflow was initiated after you uploaded your data? In case you see any errors, try to use the information provided in the dashboard to resolve the errors.
View new messages on your wis2box-broker
Check MQTT-explorer, can you confirm that new messages were successfully published on your wis2box-broker?
Note
wis2box interprets the folder structure in the wis2box-incoming bucket as the corresponding topic hierarchy for the file.
The topic mwi.mwi_wmo_demo.data.core.weather.surface-based-observations.synop corresponds to the path mwi/mwi_wmo_demo/data/core/weather/surface-based-observations/synop.
If there are no data mappings defined for the topic hierarchy corresponding to the directory that received data, wis2box will not initiate the workflow.
wis2box FTP
To allow data to be submitted over FTP you can use the 'wis2box-ftp' image. This container provides a service that forwards data received over FTP to MinIO. Start the service with the following command:
docker-compose -f docker-compose.wis2box-ftp.yml --env-file ftp.env up -d
To test the FTP service, you can use WinSCP on your local laptop and prepare the connection to the wis2box-ftp container as follows:
Replace "Host name" with that of your student VM and use the username and password for the FTP as specified by FTP_USER and FTP_PASSWORD in your ftp.env file.
Once you have established the connection you will land in an empty directory.
Select the option to create a 'new directory':
Enter the topic hierarchy value of your dataset as a directory:
Now enter the directory you created and you can copy your data sample from your host machine to trigger the wis2box data ingest.
Check your Grafana dashboard.
Question
Did you manage to successfully publish WIS2 notifications for your data?
If not, review the errors reported and try to determine what went wrong.
Note
You can run docker logs wis2box-ftp to check if the FTP service is running correctly.
Note
To change the username/password for the wis2box-ftp, edit the file ftp.env and set your own values for FTP_USER and FTP_PASS:
FTP_USER=wis2box
FTP_PASS=wis2box123
Then restart the wis2box-ftp service with the following commands:
docker-compose -f docker-compose.wis2box-ftp.yml down
docker-compose -f docker-compose.wis2box-ftp.yml --env-file ftp.env up -d
MinIO Python client (optional exercise)
You may want to automate data ingestion from your system into wis2box using Python scripts.
MinIO provides a Python client which can be installed as follows:
pip3 install minio
On your student machine the 'minio' module for Python is already installed.
Go to the directory exercise-materials/wis2box-setup and run the example script using the following command:
cd ~/exercise-materials/wis2box-setup
python3 examples/scripts/copy_to_incoming.py
Note
The sample script needs to be modified before it can be used.
The script needs to know the correct endpoint for accessing MinIO on your wis2box. If wis2box is running on your host, the MinIO endpoint is available at http://<your-host>:9000.
The sample script provides the basic structure for copying a file into MinIO. Try to ingest a data sample of your choosing using this script.
ingest data using Python
Use the Python example provided to create your own Python script to ingest data into your wis2box.
Ensure that you:
- define the correct MinIO endpoint for your host
- define the correct path in MinIO for the topics defined in your data-mappings.yml
- determine the correct local path where the script can access the data to ingest
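As a sketch of what such a script can look like: the helper below applies the topic-to-path mapping described earlier, and the endpoint and credentials are placeholder assumptions to replace with your own values.

```python
# Sketch: copy a local file into the wis2box-incoming bucket.
# Requires the 'minio' package (pip3 install minio); the endpoint and
# credentials below are illustrative defaults, not your real values.

def incoming_object_name(topic_hierarchy: str, filename: str) -> str:
    """Map a dotted topic hierarchy to the folder path wis2box expects
    inside the wis2box-incoming bucket."""
    return topic_hierarchy.replace('.', '/') + '/' + filename

def upload_to_incoming(local_path: str, topic_hierarchy: str,
                       endpoint: str = '<your-host>:9000',
                       access_key: str = 'minio', secret_key: str = 'minio123'):
    from minio import Minio  # deferred so the helper above works without minio
    client = Minio(endpoint, access_key=access_key,
                   secret_key=secret_key, secure=False)
    filename = local_path.rsplit('/', 1)[-1]
    client.fput_object('wis2box-incoming',
                       incoming_object_name(topic_hierarchy, filename),
                       local_path)
```

For example, upload_to_incoming('/tmp/WIGOS_0-454-2-AWSBILIRA_2023-06-01T1055.csv', 'mwi.mwi_wmo_demo.data.core.weather.surface-based-observations.synop') would place the file under the synop folder of the wis2box-incoming bucket.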
Ensure that the script runs correctly and new data notifications are published on your wis2box broker. Review and correct any errors reported on the Grafana dashboard:
Conclusion
Congratulations!
In this practical session, you learned how to:
- trigger wis2box workflow using different data ingest methods
- monitor the status of your data ingest and publishing
Using the API to query data
Introduction
In this practical session you will learn how to use the API service to query data ingested in wis2box.
Conclusion
Congratulations!
In this practical session, you learned how to:
- use the wis2box API to run queries on your data
wis2box environment variables
Introduction
In this session you will customize your wis2box environment variables and restart your wis2box.
Preparation
- Login to your VM
- Stop wis2box, and use Docker commands to remove volumes
dev.env
The wis2box setup reads environment variables from dev.env. A basic example is provided by test.env in your current directory.
Review your new setup
Login to the wis2box-management container using the following command:
python3 wis2box-ctl.py login
Run the following command to view the environment variables used by your wis2box:
wis2box environment show
Note the variables you have set for WIS2BOX_HOST_DATADIR, WIS2BOX_URL, WIS2BOX_API_URL, etc.
Review that your wis2box-broker and MinIO storage passwords have been updated.
Run the following command to view the environment variable WIS2BOX_HOST_DATADIR:
echo $WIS2BOX_HOST_DATADIR
returns:
/home/<your-username>/wis2box-data/
And check the content of /data/wis2box inside the wis2box-management container:
ls /data/wis2box/
returns:
data-mappings.yml metadata
Note
The directory defined by $WIS2BOX_HOST_DATADIR gets mounted as /data/wis2box inside the wis2box-management container.
Conclusion
Congratulations!
In this practical session, you learned how to:
- reinitialize wis2box-services
- set the wis2box-data-directory
- set the logging level for the wis2box-services
- set custom passwords for your broker and storage
Configuring data mappings
Introduction
wis2box uses a number of configuration files to allow for a simple setup of the system. At the heart of wis2box is data ingest and publishing, which is driven by the wis2box data mappings. The basic concept of a data mapping is to connect a WIS2 topic to a defined ingest and publish workflow, along with associated files/templates. In this session, you will add to the data mappings in support of publishing your data via wis2box.
Preparation
Note
Ensure you are logged into the wis2box-management container on your student VM:
cd ~/exercise-materials/wis2box-setup
python3 wis2box-ctl.py login
Configure a data mapping
Note
Ensure you are logged into the wis2box-management container before continuing.
Inspect the wis2box environment to locate the data mappings in use by the system, as defined by the WIS2BOX_DATA_MAPPINGS environment variable:
wis2box environment show | grep WIS2BOX_DATA_MAPPINGS
Question
Where are the live data mappings located?
Question
How can using the WIS2BOX_DATA_MAPPINGS environment variable be valuable, as compared to /data/wis2box/data-mappings.yml?
Add CSV data
Let's add a data mapping for wis2box to process CSV data. Inspect the contents of the sample SYNOP CSV data mapping:
cat ~/exercise-materials/wis2box-setup/synop-csv-mappings.yml
Question
What topic is defined in this mapping? What values of the topic are placeholders to be updated later in this session?
Copy and paste the above file contents into the $WIS2BOX_DATA_MAPPINGS file (either manually or via the command below):
tail -n +2 exercise-materials/wis2box-setup/test-data/data-mappings.yml >> $WIS2BOX_DATA_MAPPINGS
Tip
Be sure that the first data: line from the above file is omitted when copying/pasting into the $WIS2BOX_DATA_MAPPINGS file.
Open the data mappings file:
vi $WIS2BOX_DATA_MAPPINGS
Verify that the file contents you copied from ~/exercise-materials/wis2box-setup/synop-csv-mappings.yml are now part of the live data mappings file.
Update the [country] and [centre_id] values in your new/added data mapping. Use your username as the centre_id topic value.
Tip
The country value should match one of the countries in the country list of the WIS2 Topic Hierarchy.
Note
Centre ids will be officially managed and introduced as part of the WIS2 Topic Hierarchy throughout the WIS2 Pilot Phase, at which point each centre's id will be in the centre_id list of the WIS2 Topic Hierarchy. centre_id values should be lower case and contain no accents or special characters. Dashes should be used instead of underscores.
Note
The file-pattern values throughout the data mapping provide a regular expression used to match filenames. Ensure your filenames are formatted as per the regular expression in the new data mapping: the filename includes WIGOS_ as a fixed value, followed by the WIGOS Station Identifier (WSI), followed by an underscore (_), as well as any other information (i.e. a datestamp). Ensure the file extension is .csv. A real-world example would be WIGOS_0-454-2-AWSBALAKA_2021-11-18T0955.csv.
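You can sanity-check your filenames against such a convention with a quick regular expression test. The pattern below is an illustrative stand-in for the mapping's actual file-pattern value, which you should copy from your data mappings:

```python
import re

# Illustrative pattern: 'WIGOS_' + WSI + '_' + anything + '.csv'
FILE_PATTERN = r'^WIGOS_(\d+-\d+-\d+-\w+)_.*\.csv$'

def filename_matches(filename: str) -> bool:
    """Check whether a filename follows the expected naming convention."""
    return re.match(FILE_PATTERN, filename) is not None

print(filename_matches('WIGOS_0-454-2-AWSBALAKA_2021-11-18T0955.csv'))  # True
print(filename_matches('balaka-observations.csv'))                      # False
```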
Tip
Remember your dataset topic for the WIS2 discovery metadata exercise.
Restart wis2box
In order for data mappings to take effect, restart wis2box as follows:
python3 wis2box-ctl.py restart
Conclusion
Congratulations!
In this practical session, you learned how to:
- inspect the live wis2box data mappings
- add a new data mapping
- update the country and centre_id values in a new data mapping
- update the file-pattern value to match your data filename convention
Configuring WIS2 discovery metadata
Introduction
As described in the overviews, WIS2 requires discovery metadata describing the data you share with WIS2 Global Services. This session will walk you through creating and publishing discovery metadata from a configuration file using wis2box.
Preparation
Note
Ensure you are running MQTT Explorer and you are connected to the broker on your student VM before continuing.
Note
Ensure you are logged into the wis2box-management container on your student VM:
cd ~/exercise-materials/wis2box-setup
python3 wis2box-ctl.py login
Creating discovery metadata
Copy the test discovery metadata into your own file (you may name the file whatever you wish):
cp ~/exercise-materials/wis2box-setup/test-data/mwi-surface-weather-observations.yml ~/my-discovery-metadata.yml
Inspect the sample discovery metadata:
more ~/my-discovery-metadata.yml
Note
All values in the discovery metadata configuration are required and should be included.
Question
How does line 3 of your discovery metadata file relate to the new data mapping in the previous session?
Update the following values in the discovery metadata configuration:
- wis2box.topic_hierarchy: the topic hierarchy that categorizes the data (this value should be the same as the definition in your newly created data mapping)
- wis2box.country: the 3-letter country code, in lower case
- wis2box.centre_id: your centre id as defined in the previous exercise
- metadata.identifier: a unique identifier consisting of urn:x-wmo:md:[country]:[centre_id]:[dataset_name], where [dataset_name] can be any name of your choosing. Remember this value for API validation later on in this exercise
- identification.title: a human readable title describing your data
- identification.abstract: a human readable description describing your data
- identification.dates.creation: when the discovery metadata was created (today's date)
- identification.extents.spatial (bbox): the bounding box coordinates of your data (minimum longitude, minimum latitude, maximum longitude, maximum latitude), in decimal degrees
- identification.extents.temporal (begin): the begin and end time of your data (keeping the end time as null is suitable for ongoing observations)
- contact.pointOfContact: your organization's point of contact information
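Since the identifier must follow the urn:x-wmo:md:[country]:[centre_id]:[dataset_name] shape, a quick self-check can catch typos before publishing. The helper and the example identifiers below are illustrative only:

```python
import re

# urn:x-wmo:md:<3-letter lowercase country>:<centre id>:<dataset name>
IDENTIFIER_PATTERN = r'^urn:x-wmo:md:[a-z]{3}:[a-z0-9-]+:[^:]+$'

def valid_identifier(identifier: str) -> bool:
    """Loose shape check for a wis2box discovery metadata identifier."""
    return re.match(IDENTIFIER_PATTERN, identifier) is not None

print(valid_identifier('urn:x-wmo:md:mwi:wmo-test:surface-weather-observations'))  # True
print(valid_identifier('urn:x-wmo:md:Malawi:wmo_test:obs'))                        # False
```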
Tip
The configuration is based on the YAML format. Consult the YAML cheatsheet for more information.
Tip
Ensure that bbox values are correctly signed (for example, use the minus sign [-
] for southern or western hemispheres.
Tip
The following tools can be valuable for deriving your identification.extents.spatial.bbox:
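Alternatively, if you already know your station coordinates, the bounding box can be computed directly from them; a minimal sketch (the coordinates below are invented):

```python
def bbox(coords):
    """Return [min_lon, min_lat, max_lon, max_lat] from (lon, lat) pairs,
    i.e. the order expected by identification.extents.spatial.bbox."""
    lons = [lon for lon, _ in coords]
    lats = [lat for _, lat in coords]
    return [min(lons), min(lats), max(lons), max(lats)]

# Invented (longitude, latitude) pairs for three stations
stations = [(34.97, -14.55), (35.27, -15.32), (33.78, -13.98)]
print(bbox(stations))  # → [33.78, -15.32, 35.27, -13.98]
```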
Publishing discovery metadata
Run the following command to publish your discovery metadata:
wis2box metadata discovery publish ~/my-discovery-metadata.yml
Ensure that your discovery metadata was published to the API, by navigating to http://<your-host>/oapi/collections/discovery-metadata.
Ensure that your discovery metadata was also published to the broker, by looking for a new metadata message in MQTT Explorer.
Question
Do you see your new discovery metadata in the API?
Click on your discovery metadata record and inspect the content, noting how it relates to the discovery metadata configuration created earlier in this session.
Update the title of your discovery metadata, and re-publish:
vi ~/my-discovery-metadata.yml
wis2box metadata discovery publish ~/my-discovery-metadata.yml
Ensure that your discovery metadata updates were published to the API, by refreshing the page to your discovery metadata.
Question
Are you able to see the updates you made in the configuration?
Feel free to update additional values and re-publish your discovery metadata to get a better idea of how and where discovery metadata content is updated.
Publishing your dataset to the API
Run the below command to add the data to the API:
wis2box data add-collection ~/my-discovery-metadata.yml
Ensure that your dataset was published to the API, by navigating to http://<your-host>/oapi/collections/<metadata.identifier>.
Question
Do you see your new dataset in the API?
Question
Do you see any data coming from your new dataset in the API? If not, why not?
Conclusion
Congratulations!
In this practical session, you learned how to:
- create discovery metadata
- publish discovery metadata
- update and re-publish discovery metadata
Configuring station metadata
Introduction
wis2box requires station metadata to be able to process and publish observations to WIS2. This session will walk you through creating and publishing station metadata from a configuration file using wis2box. wis2box uses a fixed station metadata list as part of its runtime operation.
Preparation
Note
Ensure you are logged into the wis2box-management container on your student VM:
cd ~/exercise-materials/wis2box-setup
python3 wis2box-ctl.py login
Creating station metadata
Update the file ~/exercises-metadata/test-data/metadata/station/station_list.csv, adding your new station:
vi ~/exercises-metadata/test-data/metadata/station/station_list.csv
For each new station, add a row to the end of the file with the following values:
- station_name: the human readable name of the station
- wigos_station_identifier: the WSI issued for the station
- traditional_station_identifier: the traditional station identifier, if a WSI does not exist
- facility_type: the station/platform type (use Land (fixed) for land stations)
- latitude: the latitude, in decimal degrees
- longitude: the longitude, in decimal degrees
- elevation: station elevation, in metres above sea level
- territory_name: the human readable country name
- wmo_region: the Roman numeral of your country's Region, based on the WMO Regional Associations
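The same row can also be appended programmatically with Python's csv module. The station values below are invented examples following the column order described above; verify that order against the header row of your actual station_list.csv before using this:

```python
import csv

# Hypothetical new station, with fields in the order described above
row = ['Balaka', '0-454-2-AWSBALAKA', '', 'Land (fixed)',
       '-14.98', '34.96', '625', 'Malawi', 'I']

# Append the row to the station list (path is relative/illustrative)
with open('station_list.csv', 'a', newline='') as f:
    csv.writer(f).writerow(row)
```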
Tip
You can also derive your station information from the WMO OSCAR/Surface system.
Tip
Ensure that latitude and longitude values are correctly signed (for example, use the minus sign [-] for southern or western hemispheres).
Publishing station metadata
Run the following command to publish your station metadata:
wis2box metadata station publish-collection
Ensure that your new station metadata was published to the API, by navigating to http://<your-host>/oapi/collections/stations.
Question
Do you see your new station metadata?
Click on your station metadata record and inspect the content, noting how it relates to the station list configuration updated earlier in this session.
Conclusion
Congratulations!
In this practical session, you learned how to:
- update station metadata
- publish station metadata
Downloading and finding data from WIS2
Introduction
In this session you will learn how to discover data from the WIS2 Global Discovery Catalogue (GDC) and download data from a WIS2 Global Broker (GB).
Preparation
Note
Before starting please login to your student VM.
Downloading data with pywis-pubsub
The first practical session used MQTT Explorer to connect to the Météo-France Global Broker.
Let's now use pywis-pubsub, a command line tool, to subscribe.
pywis-pubsub subscribe --help
Update the sample configuration (see the sections marked TBD) to connect to the Météo-France Global Broker:
vi ~/exercise-materials/pywis-pubsub-exercises/config.yml
Open MQTT Explorer and connect to the Météo-France Global Broker.
Update the following values in the configuration:
- broker:
    - username: everyone
    - password: everyone
    - host: globalbroker.meteo.fr
    - port: 8883
- subscribe_topics: fill in one or more topics, e.g. origin/# and cache/# (on separate lines)
- storage.option.path: add a directory to where data should be downloaded
Run the pywis-pubsub command:
pywis-pubsub subscribe -c ~/exercise-materials/pywis-pubsub-exercises/config.yml
Question
What is the format of the data notifications that are displayed on the screen?
Question
Is there data being downloaded? How can we run the pywis-pubsub command to be able to download the data (hint: review the options when running the pywis-pubsub subscribe --help command)?
Stop the pywis-pubsub command (CTRL-C) and update the configuration to be able to download the data to /tmp/wis2-data.
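To keep an eye on what has arrived, a tiny helper can count the files under the storage path. The default path here is the one configured above; it is parameterized so you can point it anywhere:

```python
from pathlib import Path

def count_downloads(root: str = '/tmp/wis2-data') -> int:
    """Count files downloaded so far under the pywis-pubsub storage path."""
    base = Path(root)
    if not base.exists():
        return 0
    return sum(1 for p in base.rglob('*') if p.is_file())

print(count_downloads())
```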
Finding data with pywiscat
Let's use pywiscat to query the GDC:
pywiscat wis2 search
Question
How many records are returned from the search?
Let's try querying the GDC with a keyword:
pywiscat wis2 search -q radar
Question
What is the data policy of the results?
Try additional queries with -q.
Tip
The -q flag allows for the following syntax:
- -q sea: find all records with the word "sea"
- -q "NOT sea": find all records that do not contain the word "sea"
- -q "sea AND ice": find all records with both "sea" and "ice"
- -q "sea OR ice": find all records with either "sea" or "ice"
- -q "sea NOT ice": find all records with "sea" but not "ice"
- -q "sea~": fuzzy search
If searching for terms with spaces, enclose in double quotes.
Let's get more details on a specific search result that we are interested in:
pywiscat wis2 get <id>
Tip
Use the id value from the previous search.
Conclusion
Congratulations!
In this practical session, you learned how to:
- use pywis-pubsub to subscribe to a Global Broker and download data to your local system
- use pywiscat to discover datasets from the Global Discovery Catalogue
Answers ↵
Converting CSV data to BUFR answers
Exercise 1
- There are 4 header rows
- Row 2 contains the column names
- Found by checking the lines 6 and 7 in the mapping file
- These are found in the data: lines of the mapping file
- The CSV data is converted to BUFR using the following command:
csv2bufr data transform ex_1.csv --bufr-template mappings_1.json
- The latitude and longitude can be verified using the following command:
bufr_dump -p WIGOS_0-454-2-AWSMULANJE_20230526T005500.bufr4 | egrep -i 'latitude|longitude'
Exercise 2
- In the previous CSV file, the datetime was split into six columns representing the year, month, day, hour, minute, and second of the observation. In this CSV file, the datetime is stored in a single column.
- This is done by recreating the six columns of the previous CSV file, and deleting the new datetime column.
- The CSV data is written to BUFR using:
csv2bufr data transform ex_2.csv --bufr-template mappings_2.json
Exercise 3
- When running:
csv2bufr data transform ex_3.csv --bufr-template mappings_3.json
You should see the following error:
Column RH not found in input data
followed by the input data dictionary.
- The relative humidity column now has the name RelativeHumidity instead of RH.
. -
We do this by changing the line:
{"eccodes_key": "#1#relativeHumidity", "value":"data:RH"}
to:
{"eccodes_key": "#1#relativeHumidity", "value":"data:RelativeHumidity"}
- The CSV data is written to BUFR by re-using:
csv2bufr data transform ex_3.csv --bufr-template mappings_3.json
- The relative humidity can be verified using:
bufr_dump -p WIGOS_0-454-2-AWSMULANJE_20230526T005500.bufr4 | egrep -i relativeHumidity
which should give a value of 54.
Note
The value returned is not exactly the same value as what is present in the CSV file (54.09). This is because this particular ecCodes element is always encoded as an integer, so the value is rounded.
Exercise 4
- The units are found in row 3.
- 2, as 1hPa = 100Pa.
- 273.15, as 0 degrees C = 273.15 K
- You should have the following line in your mappings file:
{"eccodes_key": "#1#nonCoordinatePressure", "value":"data:BP", "offset": "const:0", "scale": "const:2"}
- You should have the following lines in your mappings file:
{"eccodes_key": "#1#airTemperature", "value":"data:AirTempC"}
{"eccodes_key": "#1#dewpointTemperature", "value":"data:DewPointTempC"}
- {"eccodes_key": "#1#airTemperature", "value":"data:AirTempC", "offset": "const:273.15", "scale": "const:0"}
- {"eccodes_key": "#1#dewpointTemperature", "value":"data:DewPointTempC", "offset": "const:273.15", "scale": "const:0"}
- The CSV data is written to BUFR by re-using:
csv2bufr data transform ex_4.csv --bufr-template mappings_4.json
- The converted variables can be verified using:
bufr_dump -p WIGOS_0-454-2-AWSMULANJE_20230526T005500.bufr4 | egrep -i 'nonCoordinatePressure|airTemperature|dewpointTemperature'
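The offset and scale conversions used in this exercise can be sanity-checked by hand. Assuming scale acts as a power-of-ten multiplier and offset as an additive term (an assumption that reproduces both conversions here; check the csv2bufr documentation for the exact formula):

```python
def to_si(value: float, offset: float = 0.0, scale: int = 0) -> float:
    """Apply a power-of-ten scale and an additive offset, mirroring the
    'scale'/'offset' values used in the mappings above (assumed formula)."""
    return value * 10 ** scale + offset

print(to_si(1013.25, scale=2))               # hPa -> Pa
print(round(to_si(20.0, offset=273.15), 2))  # degrees C -> K
```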
Exercise 5
- Non coordinate pressure (QNH) takes a negative value and the air temperature is too high.
- This will vary from person to person, but here is an example:

| Variable | Minimum | Maximum |
| --- | --- | --- |
| Pressure (Pa) | 0 | 150000 |
| Temperature (C) | -20 | 50 |

- You should have the following lines in your mappings file:
{"eccodes_key": "#1#pressureReducedToMeanSeaLevel", "value":"data:QNH", "valid_min": "const:0", "valid_max": "const:150000"}
{"eccodes_key": "#1#airTemperature", "value":"data:AirTempC", "valid_min": "const:-20", "valid_max": "const:50", "offset": "const:273.15", "scale": "const:0"}
{"eccodes_key": "#1#dewpointTemperature", "value":"data:DewPointTempC", "valid_min": "const:-20", "valid_max": "const:50", "offset": "const:273.15", "scale": "const:0"}
Note
The valid minimum and maximum values should take the same units as the CSV data.
- You will see the following error:
ECCODES ERROR : encode_double_value: pressureReducedToMeanSeaLevel (010051). Value (-102043) out of range (minAllowed=0, maxAllowed=163830). ECCODES ERROR : Cannot encode pressureReducedToMeanSeaLevel=-102043 (subset=1) error calling codes_set(49887440, 'pack', True): Value out of coding range
and no BUFR file is written. This is because the pressure reduced to sea level variable is out of the range of minimum and maximum valid values.
Note
There will be no error for the air temperature being out of range. This is because once one value out of range error occurs, the csv2bufr process is halted.
Converting SYNOP data to BUFR answers
Exercise 1
- There is 1 SYNOP report, as there is only 1 delimiter (=)
- There is 1 station
- This is done using the transform command, for example:
synop2bufr transform --metadata station_list.csv --output-dir . message.txt
- This can be done using the following command:
bufr_dump -p WIGOS_0-20000-0-15015_20230221T120000.bufr4 | egrep -i 'latitude|longitude'
Exercise 2
- There are 3 SYNOP reports, as there are 3 delimiters (=)
- There are 3 stations
- This is done using the transform command, for example:
synop2bufr transform --metadata station_list.csv --output-dir . message.txt
- The number of BUFR files output is determined by the number of valid SYNOP reports in the text file, provided the station TSI of each report can be found in the station list file with a corresponding WSI
- This can be done using the following commands:
bufr_dump -p WIGOS_0-20000-0-15015_20230221T120000.bufr4 | egrep -i 'wigos'
bufr_dump -p WIGOS_0-20000-0-15020_20230221T120000.bufr4 | egrep -i 'wigos'
bufr_dump -p WIGOS_0-20000-0-15090_20230221T120000.bufr4 | egrep -i 'wigos'
Note that if you have a directory with just these 3 BUFR files, you can use Linux wildcards as follows:
bufr_dump -p *.bufr4 | egrep -i 'wigos'
Exercise 3
- No, this is not a problem provided that there exists a row in the station list file with a station TSI matching that of the SYNOP report we are trying to convert
- This is done using the transform command, for example:
synop2bufr transform --metadata station_list.csv --output-dir . message.txt
- This can be done in one command:
bufr_dump -p WIGOS_0-20000-0-15260_20230221T115500.bufr4 | egrep -i 'temperature|cover|sunshine|wind'
Of course a command for each variable can also be used.
Exercise 4
- The SYNOP reports are missing the delimiter (=) that allows synop2bufr to distinguish one report from another
- Attempting to convert should raise the following error:
ERROR:synop2bufr:Delimiters (=) are not present in the string, thus unable to identify separate SYNOP reports.
Exercise 5
- One of the station TSIs (15015) has no corresponding metadata in the file, which will prohibit synop2bufr from accessing additional necessary metadata to convert the first SYNOP report to BUFR
- The error is:
ERROR:synop2bufr:Missing WSI for station 15015
- There are 3 SYNOP reports but only 2 BUFR files have been produced. This is because one of the SYNOP reports lacked the necessary metadata as mentioned above
Ended: Answers
Ended: Practical sessions
Cheatsheets ↵
Linux cheatsheet
Overview
The basic concepts of working in a Linux operating system are files and directories (folders) organized in a tree structure within an environment.
Once you login to a Linux system, you are working in a shell in which you can work on files and directories, by executing commands which are installed on the system. The Bash shell is a common and popular shell which is typically found on Linux systems.
Bash
Directory Navigation
- Entering an absolute directory:
cd /dir1/dir2
- Entering a relative directory:
cd ./somedir
- Move one directory up:
cd ..
- Move two directories up:
cd ../..
- Move to your "home" directory:
cd ~
File Management
- Listing files in the current directory:
ls
- Listing files in the current directory with more detail:
ls -l
- List the root of the filesystem:
ls -l /
- Create an empty file:
touch foo.txt
- Create a file from an echo command:
echo "hi there" > test-file.txt
- Copy a file:
cp file1 file2
- Wildcards: operate on file patterns:
ls -l fil* # matches file1 and file2
- Concatenate two files into a new file called newfile:
cat file1 file2 > newfile
- Append another file into newfile:
cat file3 >> newfile
- Delete a file:
rm newfile
- Delete all files with the same file extension:
rm *.dat
- Create a directory
mkdir dir1
Chaining commands together with pipes
Pipes allow a user to send the output of one command to another using the pipe | symbol:
echo "hi" | sed 's/hi/bye/'
- Filtering command outputs using grep:
echo "id,title" > test-file.txt
echo "1,birds" >> test-file.txt
echo "2,fish" >> test-file.txt
echo "3,cats" >> test-file.txt
cat test-file.txt | grep fish
- Ignoring case:
grep -i FISH test-file.txt
- Count matching lines:
grep -c fish test-file.txt
- Return outputs not containing keyword:
grep -v birds test-file.txt
- Count the number of lines in test-file.txt:
wc -l test-file.txt
- Display output one screen at a time:
more test-file.txt
...with controls:
- Scroll down line by line: Enter
- Go to next page: Space bar
- Go back one page: b
- Display the first 3 lines of the file:
head -3 test-file.txt
- Display the last 2 lines of the file:
tail -2 test-file.txt
Docker cheatsheet
Overview
Docker allows for creating virtual environments in an isolated manner in support of virtualization of computing resources. The basic concept behind Docker is containerization, where software runs as services, interacting with other software containers, for example.
The typical Docker workflow involves creating and building images, which are then run as live containers.
Image management
- List available images
docker images
- Build an image from a Dockerfile:
cat << EOF > Dockerfile
FROM ubuntu:latest
RUN apt-get update
RUN apt-get install -y nginx
CMD ["echo", "Hello from my first Docker setup!"]
EOF
- Building the image:
docker build -t my-image:local .
- Removing an image:
docker rmi my-image:local
Volume Management
- List all created volumes:
docker volume ls
- Create a volume:
docker volume create my-volume
- Display detailed information on a volume:
docker volume inspect my-volume
- Remove a volume:
docker volume rm my-volume
- Remove all unused volumes:
docker volume prune
Container Management
- Create a container from an image, with an interactive terminal (-it) and a mounted volume (-v):
docker run -it -v $(pwd):/app my-image:local
- Display a list of currently running containers:
docker ps
- List of all containers:
docker ps -a
- Start a container:
docker start my-container # restarts an existing (stopped) container
- Enter the interactive terminal of a running container:
Tip
use docker ps to find the container id to use in the command below
docker exec -it my-container /bin/bash
- Remove a container:
docker rm my-container
- Remove a running container:
docker rm -f my-container
YAML cheatsheet
Overview
Many wis2box configuration files are managed in YAML (YAML Ain't Markup Language) format. YAML allows for flexible configurations, using indentation for grouping and nesting.
Indentation
YAML is characterized by indentation. It is important to ensure consistent indentation in a YAML file.
It is recommended to use 4 spaces for a default indentation level.
It is also strongly recommended to NOT use tabs for indentation.
A simple YAML file:
my-configuration: # <- this is a section
    # this is a comment
    # you can add comments in this manner as you wish
    section: # <- this is a nested section
        subsection: # <- this is another nested section
            value1: 123 # an integer
            value2: 1.23 # a float
            value3: '123' # a number forced into a string
            value4: my value # a string (does not need quoting)
            value5: [1, 2, 3, 4] # an array/list
            # lists can contain any data type, including more lists or sections
            my-list: # another list, same result as value5 above
                - item 1
                - item 2
                - item 3
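Since tabs are not allowed for indentation, a quick way to check a YAML file is to search it for literal tab characters; a minimal sketch (the file name and contents are illustrative):

```shell
# Write a small, space-indented YAML file.
cat << 'EOF' > my-config.yml
my-configuration:
    section:
        value1: 123
EOF

# grep for a literal tab character; none should be found.
TAB="$(printf '\t')"
if grep -q "$TAB" my-config.yml; then
    echo "tabs found: fix indentation"
else
    echo "indentation OK"
fi
```

For this file, which uses 4-space indentation throughout, the check prints "indentation OK".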
WIS2 in a box cheatsheet
Overview
wis2box runs as a suite of Docker Compose services. The wis2box-ctl.py command is a utility (written in Python) to run Docker Compose commands easily.
wis2box command essentials
Building
- Build all of wis2box:
python3 wis2box-ctl.py build
- Build a specific wis2box Docker image:
python3 wis2box-ctl.py build wis2box-management
- Update wis2box:
python3 wis2box-ctl.py update
Starting and stopping
- Start wis2box:
python3 wis2box-ctl.py start
- Stop wis2box:
python3 wis2box-ctl.py stop
- Verify all wis2box containers are running:
python3 wis2box-ctl.py status
- Login to a wis2box container (wis2box-management by default):
python3 wis2box-ctl.py login
- Login to a specific wis2box container:
python3 wis2box-ctl.py login wis2box-api
Design time commands (metadata management and publishing)
Note
You must be logged into the wis2box-management container to run the commands below
- Publish discovery metadata:
wis2box metadata discovery publish /path/to/discovery-metadata-file.yml
- Publish station metadata:
wis2box metadata station publish-collection
- Add a collection of observations to the wis2box API, based on discovery metadata:
wis2box data add-collection /path/to/discovery-metadata-file.yml
- Ingest data into the wis2box-incoming bucket to trigger processing and publishing:
wis2box data ingest --topic-hierarchy topic.hierarchy.path --path /path/to/directory/of/data/files