Configuring data mappings
Introduction
wis2box uses a number of configuration files to keep system setup simple. At the heart of wis2box is data ingest and publishing, which are driven by wis2box data mappings. A data mapping associates a WIS2 topic with a defined ingest and publish workflow, along with the files and templates that workflow uses. In this session, you will extend the data mappings so that you can publish your data via wis2box.
Preparation
Note
Ensure you are logged into the wis2box-management container on your student VM:
cd ~/exercise-materials/wis2box-setup
python3 wis2box-ctl.py login
Configure a data mapping
Note
Ensure you are logged into the wis2box-management container before continuing.
Inspect the wis2box environment to locate the data mappings in use by the system, as defined by the WIS2BOX_DATA_MAPPINGS environment variable:
wis2box environment show | grep WIS2BOX_DATA_MAPPINGS
Question
Where are the live data mappings located?
Question
How can using the WIS2BOX_DATA_MAPPINGS environment variable be valuable, as compared to /data/wis2box/data-mappings.yml?
Add CSV data
Let's add a data mapping for wis2box to process CSV data. Inspect the contents of the sample SYNOP CSV data mapping:
cat ~/exercise-materials/wis2box-setup/synop-csv-mappings.yml
Question
What topic is defined in this mapping? What values of the topic are placeholders to be updated later in this session?
Copy and paste the above file contents into the $WIS2BOX_DATA_MAPPINGS file (either manually or via the command below):
tail -n +2 ~/exercise-materials/wis2box-setup/synop-csv-mappings.yml >> $WIS2BOX_DATA_MAPPINGS
Tip
Be sure that the first line (data:) from the above file is omitted when copying/pasting into the $WIS2BOX_DATA_MAPPINGS file.
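The effect of tail -n +2 (output everything from the second line onward) can be sketched in Python as follows. This is a minimal illustration and not part of wis2box: the function names drop_first_line and append_mapping are hypothetical, and the sample text only assumes that the mapping file begins with a top-level data: line, as described in the tip above.

```python
from pathlib import Path


def drop_first_line(text: str) -> str:
    """Return text without its first line (the equivalent of `tail -n +2`)."""
    lines = text.splitlines(keepends=True)
    return "".join(lines[1:])


def append_mapping(source: Path, target: Path) -> None:
    """Append the contents of source to target, omitting source's first line."""
    body = drop_first_line(source.read_text(encoding="utf-8"))
    with target.open("a", encoding="utf-8") as fh:
        fh.write(body)


# Illustrative content only; the real mapping file will differ.
sample = "data:\n    example.topic:\n        key: value\n"
trimmed = drop_first_line(sample)
```

Here trimmed keeps the indented entries but not the leading data: line, so the appended text does not introduce a second data: key into the live file.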
Open the data mappings file:
vi $WIS2BOX_DATA_MAPPINGS
Verify that the contents you copied from ~/exercise-materials/wis2box-setup/synop-csv-mappings.yml are now part of the live data mappings file.
Update the [country] and [centre_id] values in your new/added data mapping. Use your username as the centre_id value.
Tip
The country value should match one of the countries in the country list of the WIS2 Topic Hierarchy.
Note
Centre ids will be officially managed and introduced as part of the WIS2 Topic Hierarchy throughout the WIS2 Pilot Phase, at which point each centre's id will be in the centre_id list of the WIS2 Topic Hierarchy. centre_id values should be lower case and contain no accents or special characters. Dashes should be used instead of underscores.
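The naming rules in the note above can be checked before editing the mapping. The following is a minimal sketch, not a wis2box function: is_valid_centre_id is a hypothetical helper, and allowing digits alongside lower-case ASCII letters is an assumption that the note does not state.

```python
import re

# Assumed rule set, derived from the note above: lower-case ASCII letters
# (and, as an assumption, digits), with single dashes as the only separator.
CENTRE_ID_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_centre_id(centre_id: str) -> bool:
    """Return True if centre_id follows the assumed naming rules."""
    return CENTRE_ID_PATTERN.fullmatch(centre_id) is not None


# Hypothetical values, for illustration only.
candidates = ["mwi-met-centre", "mwi_met_centre", "MWI-Met", "météo"]
results = {c: is_valid_centre_id(c) for c in candidates}
```

Only the first candidate passes: the others contain an underscore, upper-case letters, or an accented character.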
Note
The file-pattern values throughout the data mapping provide a regular expression used to match filenames. Ensure your filenames follow the regular expression in the new data mapping: WIGOS_ as a fixed prefix, followed by the WIGOS Station Identifier (WSI), followed by an underscore (_), followed by any other information (e.g. a datestamp). Ensure the file extension is .csv. A real-world example would be WIGOS_0-454-2-AWSBALAKA_2021-11-18T0955.csv.
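A filename can be tested against a file-pattern before any data is ingested. The sketch below is illustrative only: the regular expression is an assumption reconstructed from the description above, and extract_wsi is a hypothetical helper, so substitute the actual file-pattern value from your data mapping.

```python
import re
from typing import Optional

# ASSUMPTION: reconstructed from the description above (WIGOS_ prefix, WSI,
# underscore, free text, .csv). Use the file-pattern from your own mapping.
ASSUMED_FILE_PATTERN = r"^WIGOS_(\d-\d+-\d+-\w+)_.*\.csv$"


def extract_wsi(filename: str, pattern: str = ASSUMED_FILE_PATTERN) -> Optional[str]:
    """Return the WSI captured from filename, or None if it does not match."""
    match = re.match(pattern, filename)
    return match.group(1) if match else None


good = extract_wsi("WIGOS_0-454-2-AWSBALAKA_2021-11-18T0955.csv")
no_prefix = extract_wsi("0-454-2-AWSBALAKA_2021-11-18T0955.csv")
wrong_ext = extract_wsi("WIGOS_0-454-2-AWSBALAKA_2021-11-18T0955.txt")
```

With the assumed pattern, good is "0-454-2-AWSBALAKA", while no_prefix and wrong_ext are None because the fixed WIGOS_ prefix or the .csv extension is missing.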
Tip
Remember your dataset topic for the WIS2 discovery metadata exercise.
Restart wis2box
In order for data mappings to take effect, restart wis2box as follows:
python3 wis2box-ctl.py restart
Conclusion
Congratulations!
In this practical session, you learned how to:
- inspect the live wis2box data mappings
- add a new data mapping
- update the country and centre_id values
- update the file-pattern value to match your data filename convention