Converting CSV data to BUFR
Introduction
CSV data is a commonly used format for recording tabular data. csv2bufr is a tool to help convert CSV data to BUFR.
In this session you will learn to create BUFR data from CSV, using custom and flexible configuration (mappings) in support of meeting WMO GBON requirements.
Preparation
Warning
Ensure that you are logged into your student VM.
Navigate to the exercise-materials/csv2bufr-exercises directory and make sure that the exercise directories are there.
cd ~/exercise-materials/csv2bufr-exercises
ls
Tip
You should be able to see the following directories: BUFR_tables answers ex_1 ex_2 ex_3 ex_4 ex_5
csv2bufr primer
Below are the essential csv2bufr commands and configurations:
mappings create
The mappings create command creates an empty BUFR mapping template as a JSON file, which maps CSV column headers to their corresponding ecCodes elements:
csv2bufr mappings create <BUFR descriptors> --output <my_template.json>
For more information, see the following example.
data transform
The data transform command converts a CSV file to BUFR format:
csv2bufr data transform --bufr-template <my_template.json> --output-dir <./my_directory> <my_data.csv>
Note
The output directory is optional and defaults to the current working directory.
ecCodes BUFR refresher
bufr_dump
The bufr_dump command allows you to inspect the BUFR files created by the conversion. It has numerous options; the following will be most applicable to the exercises:
bufr_dump -p <my_bufr.bufr4>
This will display the content of your BUFR file on screen. If you are interested in the values taken by a particular variable, use the egrep command:
bufr_dump -p <my_bufr.bufr4> | egrep -i temperature
This will display the variables related to temperature in your BUFR data. To do this for multiple types of variables, combine the search terms with the alternation operator (|) inside the quoted pattern:
bufr_dump -p <my_bufr.bufr4> | egrep -i 'temperature|wind'
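To see what a case-insensitive pattern such as 'temperature|wind' selects, the following Python sketch applies the same pattern to a few made-up lines in key=value style. The sample text is hypothetical and is not real bufr_dump output; only the pattern itself comes from the command above.

```python
import re

# Hypothetical key=value lines, loosely imitating decoded BUFR content.
sample_output = """\
airTemperature=298.15
dewpointTemperature=296.05
windSpeed=3.2
relativeHumidity=88
"""

# Same pattern as egrep -i 'temperature|wind': a line is kept when
# either alternative matches, ignoring upper/lower case.
pattern = re.compile(r"temperature|wind", re.IGNORECASE)

matching_lines = [
    line for line in sample_output.splitlines() if pattern.search(line)
]

for line in matching_lines:
    print(line)
```

The relativeHumidity line is dropped because it contains neither search term.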
Inspecting CSV data and BUFR conversion
Exercise 1: Converting a CSV file to BUFR
In this exercise we will look at a pre-configured mapping file for the CSV data, and will use this to convert the data to BUFR.
Navigate to the ex_1 directory:
cd ~/exercise-materials/csv2bufr-exercises/ex_1
and open the CSV data ex_1.csv.
- How many header rows are there in this data?
- Which row contains the column names?
Now open the mappings file mappings_1.json.
Note
csv2bufr mappings files have no set file extension; however, it is recommended to use .json.
- Verify that "number_header_rows" and "column_names_row" are the same as your answers above.
- Locate each of the CSV column names in this mappings file.
- Using the data transform command and the mappings file, convert this CSV data to BUFR.
- Use bufr_dump to find the latitude and longitude values stored in the output BUFR file. Verify these values using the CSV file.
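The roles of "number_header_rows" and "column_names_row" can be pictured with a small Python sketch. It assumes that both settings are 1-based row numbers, that the first tells the converter how many rows to skip before the data starts, and that the second points at the row holding the column names. The CSV content below is invented for illustration and is not the content of ex_1.csv.

```python
import csv
import io

# Hypothetical CSV: row 1 holds the column names, row 2 holds the units.
csv_text = """\
station_id,latitude,longitude
-,degrees,degrees
ST001,-6.2,39.2
"""

# Assumed meaning of the two settings (1-based row numbers).
number_header_rows = 2  # rows to skip before the data starts
column_names_row = 1    # header row that names the columns

rows = list(csv.reader(io.StringIO(csv_text)))
column_names = rows[column_names_row - 1]
data_rows = rows[number_header_rows:]

# Pair each data value with its column name, as a mapping lookup would.
records = [dict(zip(column_names, row)) for row in data_rows]
print(records)
```

If the two values in the mappings file disagree with the layout of the CSV file, a units row could be read as data or the wrong row could be used for the column names.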
Exercise 2: Correcting the datetime format
In this exercise we will investigate the correct format to present the datetime of an observation in the CSV file.
Navigate to the ex_2 directory:
cd ~/exercise-materials/csv2bufr-exercises/ex_2
and open the CSV data ex_2.csv.
- What are the differences in the way that the datetime is represented in this CSV file compared to the previous one?
Now open the mappings file mappings_2.json. By looking at the ecCodes keys related to dates and times, it should be clear that the datetime cannot be mapped while the CSV is in its current state.
- Create new columns in the CSV file for each component of the datetime, with column names that match those in the mappings file.
- Using the data transform command and the mappings file, convert this CSV data to BUFR.
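For larger files, editing the columns by hand becomes tedious, so the split can also be scripted. The sketch below is one possible approach using only the Python standard library. The input column name timestamp, its ISO-style format, and the output column names year, month, day, hour and minute are all assumptions; replace them with the names used in your CSV file and mappings file.

```python
import csv
import io
from datetime import datetime

# Hypothetical input with one combined datetime column.
csv_text = """\
station_id,timestamp,air_temperature
ST001,2023-03-01T06:30:00,298.15
"""

reader = csv.DictReader(io.StringIO(csv_text))
rows = []
for row in reader:
    stamp = datetime.strptime(row.pop("timestamp"), "%Y-%m-%dT%H:%M:%S")
    # One column per datetime component; rename to match the mappings file.
    row.update(year=stamp.year, month=stamp.month, day=stamp.day,
               hour=stamp.hour, minute=stamp.minute)
    rows.append(row)

# Write the reshaped rows back out as CSV text.
output = io.StringIO()
writer = csv.DictWriter(output, fieldnames=list(rows[0].keys()),
                        lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
print(output.getvalue())
```

To work on a real file, the two io.StringIO objects would be replaced with files opened for reading and writing.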
Exercise 3: Handling changes to the CSV data
In this exercise we consider the following scenario: given the same CSV data but with different column names, how can we adjust the mappings file to convert this data to BUFR? For simplicity, we will only look at one column name change.
Navigate to the ex_3 directory:
cd ~/exercise-materials/csv2bufr-exercises/ex_3
- Using the data transform command, attempt to convert the CSV data to BUFR. What error appears?
Open the CSV data ex_3.csv.
- What column name has been changed?
Open the mappings file mappings_3.json.
- Find the original column name in this mapping file, and change it to the new name.
- Using the data transform command and the mappings file, convert this CSV data to BUFR.
- Use bufr_dump to verify that relativeHumidity has the same value as in the CSV data.
Exercise 4: Unit conversion
In this exercise, we expand on the work above by handling not only changes to column names, but also changes to the units of the data. We achieve this by using offset and scale in the mappings file.
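As a rough illustration of how the two settings combine, the sketch below assumes that the converted value is computed as value × 10^scale + offset. This is consistent with the power-of-10 question later in this exercise, but the formula is not stated on this page, so treat it as an assumption. The function name apply_scale_offset and the example quantities are hypothetical and deliberately unrelated to the exercise data.

```python
def apply_scale_offset(value, scale, offset):
    """Return value * 10**scale + offset (assumed conversion rule)."""
    return value * 10 ** scale + offset


# Millimetres to metres: multiply by 10**-3, no offset.
depth_m = apply_scale_offset(250.0, scale=-3, offset=0)

# Pure shift: scale 0 means multiply by 10**0 == 1, then add the offset.
shifted = apply_scale_offset(12.0, scale=0, offset=100.0)

print(depth_m, shifted)
```

Under this assumption, a scale of 0 leaves the magnitude unchanged and an offset of 0 leaves the value unshifted, which is why each conversion in this exercise sets one of the two to 0.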
Navigate to the ex_4 directory:
cd ~/exercise-materials/csv2bufr-exercises/ex_4
and open the CSV data ex_4.csv.
- In which row are the units of the variables written?
You should notice that BP now has units hPa instead of Pa. Moreover, the air temperature and dewpoint temperature now have column names AirTempC and DewPointTempC, with units C instead of K.
- What power of 10 is needed to convert hPa to Pa?
- What constant must be added to convert degrees C to K?
Open the mappings file mappings_3.json. Find the lines corresponding to the variables above.
- Convert BP to Pa by adding the following to the right of "data:BP":
  "offset": "const:0", "scale": "const:x"
  where x is the power of 10 you found above.
- Change the column names of air temperature and dewpoint temperature in the mappings file to match those in the CSV file, as you did in the previous exercise.
- Convert AirTempC to K by adding the following to the right of "data:AirTempC":
  "offset": "const:y", "scale": "const:0"
  where y is the constant you found above.
- Convert DewPointTempC to K by adding the following to the right of "data:DewPointTempC":
  "offset": "const:y", "scale": "const:0"
  where y is the same constant.
- Using the data transform command and the mappings file, convert this CSV data to BUFR.
- Use the bufr_dump command to verify that nonCoordinatePressure, airTemperature and dewpointTemperature have the values you would expect after conversion.
Exercise 5: Implementing quality control
In this exercise, we will implement some minimum and maximum tolerable values to prevent clearly incorrect data from being converted to BUFR. To do this, we will use valid_min and valid_max in the mappings file.
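A range check of the kind that valid_min and valid_max describe can be sketched in a few lines of Python. The function within_range, the inclusive bounds, and the humidity limits are assumptions made for illustration only. How csv2bufr reacts when a value fails the check is left for you to discover in the exercise.

```python
def within_range(value, valid_min, valid_max):
    """Return True when valid_min <= value <= valid_max (inclusive)."""
    return valid_min <= value <= valid_max


# Hypothetical limits for a relative humidity column, in percent.
humidity_min, humidity_max = 0, 100

readings = [45, 101, -3, 100]
flags = [within_range(r, humidity_min, humidity_max) for r in readings]

print(list(zip(readings, flags)))
```

Limits should be wide enough to accept every physically plausible value, so that only clearly incorrect readings are rejected.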
Navigate to the ex_5 directory:
cd ~/exercise-materials/csv2bufr-exercises/ex_5
and open the CSV data ex_5.csv.
- Which two variables have values that are clearly incorrect?
- For each of these variables, decide on some sensible minimum and maximum tolerable values.
Open the mappings file mappings_4.json. Find the lines corresponding to the variables above.
- Implement these minimum and maximum values by adding the following to the right of the data: code:
  "valid_min": "const:a", "valid_max": "const:b"
  where a and b are the minimum and maximum values you chose above.
- Using the data transform command and this mappings file, convert this CSV data to BUFR. What happens? Is a BUFR file written? Explain why.
Conclusion
Congratulations!
In this practical session, you learned:
- The basic usage of csv2bufr
- How to update a simple csv2bufr mapping file for a variety of scenarios, including for GBON requirements, unit conversion, and quality control/range checking
- How to run csv2bufr on a test data file and convert it to BUFR format