Converting CSV data to BUFR

Introduction

CSV data is a commonly used format for recording tabular data. csv2bufr is a tool to help convert CSV to BUFR data.

In this session you will learn to create BUFR data from CSV, using custom and flexible configuration (mappings) in support of meeting WMO GBON requirements.

Preparation

Warning

Ensure that you are logged into your student VM.

Navigate to the exercise-materials/csv2bufr-exercises directory and make sure that the exercise directories are there.

cd ~/exercise-materials/csv2bufr-exercises
ls

Tip

You should be able to see the following directories: BUFR_tables, answers, ex_1, ex_2, ex_3, ex_4, ex_5.

csv2bufr primer

Below are essential csv2bufr commands and configurations:

mappings create

The mappings create command creates an empty BUFR mapping template JSON file, which maps CSV column headers to their corresponding ecCodes element:

csv2bufr mappings create <BUFR descriptors> --output <my_template.json>

For more information, see the following example.

data transform

The data transform command converts a CSV file to BUFR format:

csv2bufr data transform --bufr-template <my_template.json> --output-dir <./my_directory> <my_data.csv>

Note

The output directory is not required, and by default is the current working directory.

ecCodes BUFR refresher

bufr_dump

The bufr_dump command allows you to inspect the BUFR files created by the conversion. It has numerous options; the following will be most applicable to the exercises:

bufr_dump -p <my_bufr.bufr4>

This will display the content of your BUFR file on screen. If you are interested in the values taken by a particular variable, pipe the output to the egrep command:

bufr_dump -p <my_bufr.bufr4> | egrep -i temperature

This will display the variables related to temperature in your BUFR data. To do this for several types of variables at once, separate the search terms with a pipe (|) inside the quoted pattern:

bufr_dump -p <my_bufr.bufr4> | egrep -i 'temperature|wind'
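
To illustrate what the egrep -i filter above does, here is a minimal Python sketch that keeps only the lines matching a case-insensitive pattern. The sample text and the helper name filter_lines are hypothetical and only stand in for real bufr_dump output, which may be laid out differently.

```python
import re


def filter_lines(dump_text, pattern):
    """Return the lines of dump_text that match pattern, ignoring case."""
    regex = re.compile(pattern, re.IGNORECASE)
    return [line for line in dump_text.splitlines() if regex.search(line)]


# Hypothetical key=value lines, for illustration only (not real bufr_dump output).
sample = """\
latitude=55.15
airTemperature=298.15
dewpointTemperature=296.15
windSpeed=3.2
relativeHumidity=85
"""

# One search term, as in: egrep -i temperature
print(filter_lines(sample, "temperature"))

# Two alternatives separated by a pipe, as in: egrep -i 'temperature|wind'
print(filter_lines(sample, "temperature|wind"))
```

Inside the pattern, the pipe means "either term", which is why the second call also returns the wind line.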

Inspecting CSV data and BUFR conversion

Exercise 1: Converting a CSV file to BUFR

In this exercise we will look at a pre-configured mapping file for the CSV data, and will use this to convert the data to BUFR.

Navigate to the ex_1 directory:

cd ~/exercise-materials/csv2bufr-exercises/ex_1

and open the CSV data ex_1.csv.

  1. How many header rows are there in this data?
  2. Which row contains the column names?

Now open the mappings file mappings_1.json.

Note

csv2bufr mappings files have no set file extension; however, it is recommended to use .json.

  1. Verify that "number_header_rows" and "column_names_row" are the same as your answers above.

  2. Locate each of the CSV column names in this mappings file.

  3. Using the data transform command and the mappings file, convert this CSV data to BUFR.

  4. Use bufr_dump to find the latitude and longitude values stored in the output BUFR file. Verify these values using the CSV file.
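
As a rough picture of what the "number_header_rows" and "column_names_row" settings describe, the Python sketch below reads a small CSV text in the same spirit. It is not csv2bufr's own code: the helper read_with_headers and the sample data are hypothetical, and it assumes that rows are counted starting from 1.

```python
import csv
import io


def read_with_headers(csv_text, number_header_rows, column_names_row):
    """Split CSV text into column names and data records.

    number_header_rows: how many leading rows are headers, not data.
    column_names_row: which header row holds the column names
                      (assumed here to be counted from 1).
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    names = rows[column_names_row - 1]
    records = [dict(zip(names, row)) for row in rows[number_header_rows:]]
    return names, records


# Hypothetical data: row 1 holds the column names, row 2 holds the units.
sample = """\
station,temp,pressure
-,K,Pa
A1,298.15,101325
A1,297.65,101300
"""

names, records = read_with_headers(sample, number_header_rows=2, column_names_row=1)
print(names)
print(records[0])
```

If the number of header rows were set too low in this sketch, the units row would be read as an observation, which is why the two values need to match the file.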

Exercise 2: Correcting the datetime format

In this exercise we will investigate the correct format to present the datetime of an observation in the CSV file.

Navigate to the ex_2 directory:

cd ~/exercise-materials/csv2bufr-exercises/ex_2

and open the CSV data ex_2.csv.

  1. What are the differences in the way that the datetime is represented in this CSV file compared to the previous one?

Now open the mappings file mappings_2.json. By looking at the ecCodes keys related to dates and times, it should be clear that the datetime cannot be mapped while the CSV is in its current state.

  1. Create new columns in the CSV file for each component of the datetime, with appropriate column names to match those of the mapping file.

  2. Using the data transform command and the mappings file, convert this CSV data to BUFR.
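
For larger files, adding the datetime columns by hand becomes tedious. The following Python sketch shows one way it could be scripted. It rests on several assumptions: the column name timestamp, the ISO-like datetime format, and the new column names (year, month, day, hour, minute) are all hypothetical, and the sketch handles a single header row only. The real column names must match whatever your mappings file expects.

```python
import csv
import io
from datetime import datetime


def split_datetime(csv_text, source_column, fmt):
    """Append year, month, day, hour and minute columns to CSV text.

    source_column and fmt describe the existing datetime column;
    both are assumptions that must be adapted to the real file.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    parts = ["year", "month", "day", "hour", "minute"]
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames + parts,
                            lineterminator="\n")
    writer.writeheader()
    for row in reader:
        stamp = datetime.strptime(row[source_column], fmt)
        for part in parts:
            row[part] = getattr(stamp, part)  # e.g. stamp.year, stamp.month
        writer.writerow(row)
    return out.getvalue()


# Hypothetical one-line sample, for illustration only.
sample = "timestamp,air_temperature\n2023-03-01T12:30:00,298.15\n"
result = split_datetime(sample, "timestamp", "%Y-%m-%dT%H:%M:%S")
print(result)
```

The original datetime column is kept, so the result can still be compared against the source data.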

Exercise 3: Handling changes to the CSV data

In this exercise we consider the following scenario: given the same CSV data but with different column names, how can we adjust the mappings file to convert this data to BUFR? For simplicity, we will only look at one column name change.

Navigate to the ex_3 directory:

cd ~/exercise-materials/csv2bufr-exercises/ex_3

  1. Using the data transform command, attempt to convert the CSV data to BUFR. What error appears?

Open the CSV data ex_3.csv.

  1. What column name has been changed?

Open the mappings file mappings_3.json.

  1. Find the original column name in this mapping file, and change it to the new name.
  2. Using the data transform command and the mappings file, convert this CSV data to BUFR.
  3. Use bufr_dump to verify that relativeHumidity has the same value as the CSV data.
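
The same edit can be scripted when many mappings files need updating. The Python sketch below is a minimal illustration only. It assumes the mappings file holds a "data" list whose entries carry "eccodes_key" and "value" fields; check this layout against your own file. The column names humidity and rel_hum and the helper rename_column are hypothetical.

```python
import json


def rename_column(mapping, old_name, new_name):
    """Return a copy of mapping with "data:<old_name>" repointed to new_name."""
    updated = json.loads(json.dumps(mapping))  # deep copy via a JSON round trip
    for entry in updated.get("data", []):
        if entry.get("value") == f"data:{old_name}":
            entry["value"] = f"data:{new_name}"
    return updated


# Hypothetical mapping fragment; the layout is an assumption, not a full template.
mapping = {
    "number_header_rows": 1,
    "column_names_row": 1,
    "data": [
        {"eccodes_key": "#1#relativeHumidity", "value": "data:humidity"},
        {"eccodes_key": "#1#airTemperature", "value": "data:air_temp"},
    ],
}

updated = rename_column(mapping, "humidity", "rel_hum")
print(json.dumps(updated["data"], indent=2))
```

Only the "value" side changes. The ecCodes key stays the same because the BUFR element being written has not changed, only the name of the CSV column that feeds it.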

Exercise 4: Unit conversion

In this exercise, we expand on the work above by not only handling changes to column names, but also the units of the data. We achieve this by using offset and scale in the mappings file.

Navigate to the ex_4 directory:

cd ~/exercise-materials/csv2bufr-exercises/ex_4

and open the CSV data ex_4.csv.

  1. In which row are the units of the variables written?

You should notice that BP now has units hPa instead of Pa. Moreover, the air temperature and dewpoint temperature now have column names AirTempC and DewPointTempC, with units C instead of K.

  1. What power of 10 is needed to convert hPa to Pa?
  2. What constant must be added to convert degrees C to K?

Open the mappings file mappings_4.json. Find the lines corresponding to the variables above.

  1. Convert BP to Pa by adding the following line to the right of "data:BP":

    "offset": "const:0", "scale": "const:x"
    

    where x is the power of 10 you found above for converting hPa to Pa.

  2. Change the column names of air temperature and dewpoint temperature in the mappings file to match those in the CSV file, as you did in the previous exercise.

  3. Convert AirTempC to K by adding the following line to the right of "data:AirTempC":

    "offset": "const:y", "scale": "const:0"
    

    where y is the constant you found above for converting degrees C to K.

  4. Convert DewPointTempC to K by adding the following line to the right of "data:DewPointTempC":

    "offset": "const:y", "scale": "const:0"
    

    where y is the same constant used for AirTempC.

  5. Using the data transform command and the mappings file, convert this CSV data to BUFR.

  6. Use the bufr_dump command to verify that nonCoordinatePressure, airTemperature and dewpointTemperature have the values you would expect after conversion.
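
To see how scale and offset work together, the Python sketch below applies them to a value. It assumes the converted value is computed as value × 10^scale + offset, which is consistent with the settings used in this exercise (a scale of 0 leaves the value unchanged, an offset of 0 adds nothing). The helper name and the example units are hypothetical and deliberately differ from those in the exercise.

```python
def apply_scale_offset(value, scale, offset):
    """Convert a value, assuming: value * 10**scale + offset."""
    return value * 10 ** scale + offset


# Power-of-10 change only: 2.5 km expressed in metres (scale 3, offset 0).
print(apply_scale_offset(2.5, 3, 0))

# Constant shift only: a height of 12.0 m above a datum that itself lies
# 100.0 m above the reference level (scale 0, offset 100.0).
print(apply_scale_offset(12.0, 0, 100.0))
```

A unit change that is purely a power of 10 needs only scale, while a change that shifts the zero point needs only offset.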

Exercise 5: Implementing quality control

In this exercise, we will implement some minimum and maximum tolerable values to prevent clearly incorrect data from being converted to BUFR. To do this, we will use valid_min and valid_max in the mappings file.

Navigate to the ex_5 directory:

cd ~/exercise-materials/csv2bufr-exercises/ex_5

and open the CSV data ex_5.csv.

  1. Which two variables have values that are clearly incorrect?
  2. For each of these variables, decide on some sensible minimum and maximum tolerable values.

Open the mappings file mappings_5.json. Find the lines corresponding to the variables above.

  1. Implement these minimum and maximum values by adding the following line to the right of the "data:" entry for each variable:

    "valid_min": "const:a", "valid_max": "const:b"
    

    where a and b are values you chose in part 2.

  2. Using the data transform command and this mappings file, convert this CSV data to BUFR. What happens? Is a BUFR file written? Explain why.
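
The range check itself is a simple comparison, sketched below in Python. This only illustrates the idea behind valid_min and valid_max. It makes no claim about what csv2bufr does with a value that fails the check, which is what this exercise asks you to observe. The variable names, readings and limits are hypothetical.

```python
def in_valid_range(value, valid_min, valid_max):
    """Return True when valid_min <= value <= valid_max."""
    return valid_min <= value <= valid_max


# Hypothetical readings and tolerable limits, for illustration only.
readings = {"wind_speed": 4.2, "relative_humidity": 180.0}
limits = {"wind_speed": (0.0, 75.0), "relative_humidity": (0.0, 100.0)}

# Collect the names of the variables whose values fall outside their limits.
flagged = [name for name, value in readings.items()
           if not in_valid_range(value, *limits[name])]
print(flagged)
```

Limits should be wide enough to accept every physically plausible value, so that only clearly incorrect data is caught.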

Conclusion

Congratulations!

In this practical session, you learned:

  • The basic usage of csv2bufr
  • How to update a simple csv2bufr mapping file for a variety of scenarios, including for GBON requirements, unit conversion, and quality control/range checking
  • How to run csv2bufr on a test data file and convert it to BUFR format