Configuring data mappings

Introduction

wis2box uses a number of configuration files to keep system setup simple. At the heart of wis2box is data ingest and publishing, which are driven by wis2box data mappings. A data mapping associates a WIS2 topic with a defined ingest and publish workflow and the files/templates that workflow uses. In this session, you will extend the data mappings so that your data can be published via wis2box.

Preparation

Note

Ensure you are logged into the wis2box-management container on your student VM:

cd ~/exercise-materials/wis2box-setup
python3 wis2box-ctl.py login

Configure a data mapping

Note

Ensure you are logged into the wis2box-management container before continuing.

Inspect the wis2box environment to locate the data mappings in use by the system, as defined by the WIS2BOX_DATA_MAPPINGS environment variable:

wis2box environment show | grep WIS2BOX_DATA_MAPPINGS

Question

Where are the live data mappings located?

Question

How can using the WIS2BOX_DATA_MAPPINGS environment variable be valuable, as compared to /data/wis2box/data-mappings.yml?

Add CSV data

Let's add a data mapping for wis2box to process CSV data. Inspect the contents of the sample SYNOP CSV data mapping:

cat ~/exercise-materials/wis2box-setup/synop-csv-mappings.yml

Question

What topic is defined in this mapping? What values of the topic are placeholders to be updated later in this session?

Copy and paste the above file contents into the $WIS2BOX_DATA_MAPPINGS file (either manually or via the command below):

tail -n +2 ~/exercise-materials/wis2box-setup/synop-csv-mappings.yml >> "$WIS2BOX_DATA_MAPPINGS"

Tip

Be sure that the first data: line from the above file is omitted when copying/pasting into the $WIS2BOX_DATA_MAPPINGS file.
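The tail -n +2 command prints a file starting from its second line, which is how the leading data: line is skipped. As a minimal sketch of the same idea, the Python helper below drops the first line only if it really is the data: key, so a file that lacks it is left unchanged. The function name and the sample topic are hypothetical and are not part of wis2box.

```python
def strip_top_level_key(text, key="data:"):
    """Return text without its first line, if that line is the given key."""
    lines = text.splitlines(keepends=True)
    if lines and lines[0].strip() == key:
        lines = lines[1:]
    return "".join(lines)


if __name__ == "__main__":
    # Hypothetical mapping fragment, for illustration only.
    sample = "data:\n    example.topic:\n        plugins: {}\n"
    print(strip_top_level_key(sample), end="")
```

The returned text can then be appended to the live data mappings file, so that the file keeps a single top-level data: key.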

Open the data mappings file:

vi $WIS2BOX_DATA_MAPPINGS

Verify that the contents you copied from ~/exercise-materials/wis2box-setup/synop-csv-mappings.yml are now part of the live data mappings file.

Update the [country] and [centre_id] values in your newly added data mapping. Use your username as the centre_id value.
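After editing, it can be useful to confirm that no placeholder was missed. The following is a minimal sketch, assuming the placeholders appear literally as [country] and [centre_id] in the file. The helper name is hypothetical and the script is not part of wis2box.

```python
import os

# Placeholder strings as they appear in the sample mapping.
PLACEHOLDERS = ("[country]", "[centre_id]")


def find_placeholders(text):
    """Return (line_number, line) pairs that still contain a placeholder."""
    hits = []
    for number, line in enumerate(text.splitlines(), start=1):
        if any(placeholder in line for placeholder in PLACEHOLDERS):
            hits.append((number, line.strip()))
    return hits


if __name__ == "__main__":
    # Reads the live data mappings file if the variable is set.
    path = os.environ.get("WIS2BOX_DATA_MAPPINGS")
    if path and os.path.exists(path):
        with open(path, encoding="utf-8") as fh:
            for number, line in find_placeholders(fh.read()):
                print(f"{path}:{number}: {line}")
```

If the script prints nothing, no placeholder text remains in the file.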

Tip

The country value should match one of the countries in the country list of the WIS2 Topic Hierarchy.

Note

Centre ids will be officially managed and introduced as part of the WIS2 Topic Hierarchy throughout the WIS2 Pilot Phase, at which point each centre's id will be in the centre_id list of the WIS2 Topic Hierarchy. centre_id values should be lower case and contain no accents or special characters. Dashes should be used instead of underscores.
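The naming rules in this note can be expressed as a small check. This is a sketch under stated assumptions: it accepts lower case ASCII letters and digits separated by single dashes. Whether digits are permitted is an assumption, since the note does not say, and the function name is hypothetical.

```python
import re

# Assumed rule set derived from the note: lower case ASCII letters and
# digits (digits are an assumption), with dashes as the only separator.
CENTRE_ID_RE = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_centre_id(value):
    """Return True if value follows the assumed centre_id naming rules."""
    return CENTRE_ID_RE.fullmatch(value) is not None


if __name__ == "__main__":
    for candidate in ["my-centre", "My-Centre", "my_centre", "météo"]:
        print(candidate, is_valid_centre_id(candidate))
```

A value passing this check is not necessarily registered in the centre_id list of the WIS2 Topic Hierarchy; the check only covers the character rules.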

Note

The file-pattern values throughout the data mapping are regular expressions used to match filenames. Ensure your filenames follow the regular expression in the new data mapping: the fixed prefix WIGOS_, followed by the WIGOS Station Identifier (WSI), followed by an underscore (_), followed by any other information (e.g. a datestamp). Ensure the file extension is .csv. A real-world example would be WIGOS_0-454-2-AWSBALAKA_2021-11-18T0955.csv.
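To test filenames before ingest, a pattern can be tried against them with Python's re module. The pattern below is an assumption written to match the convention described in this note; copy the actual file-pattern value from your own data mapping before relying on the result. The function name is hypothetical.

```python
import re

# Assumed pattern for illustration; replace it with the file-pattern
# value from your own data mapping.
FILE_PATTERN = r"^WIGOS_(\d-\d+-\d+-\w+)_.*\.csv$"


def extract_wsi(filename):
    """Return the WSI captured from filename, or None if it does not match."""
    match = re.match(FILE_PATTERN, filename)
    return match.group(1) if match else None


if __name__ == "__main__":
    names = [
        "WIGOS_0-454-2-AWSBALAKA_2021-11-18T0955.csv",
        "0-454-2-AWSBALAKA_2021-11-18T0955.csv",        # missing WIGOS_ prefix
        "WIGOS_0-454-2-AWSBALAKA_2021-11-18T0955.txt",  # wrong extension
    ]
    for name in names:
        print(name, "->", extract_wsi(name))
```

With this assumed pattern, only the first filename matches, and the captured group is the WSI 0-454-2-AWSBALAKA.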

Tip

Remember your dataset topic for the WIS2 discovery metadata exercise.

Restart wis2box

In order for data mappings to take effect, restart wis2box as follows:

python3 wis2box-ctl.py restart

Conclusion

Congratulations!

In this practical session, you learned how to:

  • inspect the live wis2box data mappings
  • add a new data mapping
  • update the country and centre_id values in the new data mapping
  • update the file-pattern value to match your data filename convention