Keep validity variable in HARP exported product

Hi,

I would like to keep the “SO2_column_number_density_validity” variable in the HARP exported product, using the following code within the “operations” argument of harp.import_product:
[…]

    keep(SO2_column_number_density_validity);

I receive the following error message:

    harp._harppy.CLibraryError: cannot keep non-existent variable SO2_column_number_density_validity

I don’t understand this error, since this variable should exist according to the “Mapping description” at http://stcorp.github.io/harp/doc/html/ingestions/S5P_L2_SO2.html.

Many thanks in advance.

Best regards,
Davide

Can you provide more details on how you perform the import? It is very likely that the operations that you perform before the keep somehow result in the variable being removed.

Thank you for the fast answer.
Here is the code:

    product_preprocessed = harp.import_product(
        product_for_preprocessing,
        operations=(
            f'latitude >= {lat_min} [degree_north]; latitude <= {lat_max} [degree_north];'
            f'longitude >= {lon_min} [degree_east]; longitude < {lon_max} [degree_east];'
            f'bin_spatial({step_lat}, {lat_min}, 0.063, {step_lon}, {lon_min}, 0.031);'
            'derive(latitude {latitude});'
            'derive(longitude {longitude});'
            'keep(SO2_column_number_density_validity)'
        )
    )

I also tried:

    product_preprocessed = harp.import_product(
        product_for_preprocessing,
        operations=(
            f'latitude >= {lat_min} [degree_north]; latitude <= {lat_max} [degree_north];'
            f'longitude >= {lon_min} [degree_east]; longitude < {lon_max} [degree_east];'
            'SO2_column_number_density_validity > 50;'
            f'bin_spatial({step_lat}, {lat_min}, 0.063, {step_lon}, {lon_min}, 0.031);'
            'derive(latitude {latitude});'
            'derive(longitude {longitude});'
            'keep(SO2_column_number_density_validity)'
        )
    )

I get the same error message.

If you perform a spatial binning, then the validity information will be removed. This is because you can’t perform an area weighted average of quality flags.
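The validity flag can still be used as a filter *before* the spatial binning, so that only good-quality pixels contribute to the grid. A minimal sketch, with placeholder grid parameters and kept variables (adapt to your own region and variable list):

```python
# Apply the quality filter first, then bin; the validity variable itself
# is consumed by the filter and will not appear in the gridded output.
lat_min, lon_min = 40.0, 150.0  # placeholder grid offsets
operations = (
    "SO2_column_number_density_validity > 50;"   # filter before binning
    f"bin_spatial(100, {lat_min}, 0.063, 200, {lon_min}, 0.031);"
    "derive(latitude {latitude});"
    "derive(longitude {longitude});"
    "keep(latitude,longitude,SO2_column_number_density)"
)
# product = harp.import_product("S5P_OFFL_L2__SO2____....nc", operations=operations)
```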

Thank you for your answer.

Ok, I understand. So how could I obtain (if it is possible, perhaps with another operation) an exported HARP product on a regular latitude/longitude grid, replicable across different L1B and L2 products, with the “column number density validity” variable for each pixel in the resulting exported L2 product?
I would like to analyse, for the same latitude/longitude point, the radiance values of an L1B product at certain wavelengths and the corresponding SO2 column number density with its validity.

Thanks for your help.

You can’t do this on a regular lat/lon grid. If you want to compare L1b with L2, you should do this using the original satellite viewing geometry. You can align the L1b and L2 pixels by matching the datetime and scan_subindex variable values.
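A minimal sketch of the matching step, using plain Python lists as stand-ins for the datetime and scan_subindex values you would read from the two products (all values below are made up):

```python
# Synthetic stand-ins for the per-pixel datetime and scan_subindex values
l1b_time = [100, 101, 102, 103]
l1b_scan = [0, 1, 0, 1]
l2_time = [101, 103, 104]
l2_scan = [1, 1, 0]

# Index L1b pixels by their (datetime, scan_subindex) key, then look up
# each L2 pixel; unmatched pixels are simply skipped.
l1b_index = {(t, s): i for i, (t, s) in enumerate(zip(l1b_time, l1b_scan))}
matches = [(l1b_index[(t, s)], j)
           for j, (t, s) in enumerate(zip(l2_time, l2_scan))
           if (t, s) in l1b_index]
print(matches)  # pairs of (L1b index, L2 index) → [(1, 0), (3, 1)]
```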

Thanks for your suggestions.
I tried to investigate further the structure of three files to which I only applied a crop over an area of interest with HARP. The files are the L1B B2, L1B B3 and L2 SO2_OFFL products for the same date. With the following code:

print(b2, b3, so2)

for i in range(0, 1):
    print(np.array(b2['datetime'][i]), np.array(b3['datetime'][i]), np.array(so2['datetime_start'][i]))
    print(np.array(b2['scan_subindex'][i]), np.array(b3['scan_subindex'][i]), np.array(so2['scan_subindex'][i]))
    print(np.array(b2['latitude'][i]), np.array(b3['latitude'][i]), np.array(so2['latitude'][i]))
    print(np.array(b2['longitude'][i]), np.array(b3['longitude'][i]), np.array(so2['longitude'][i]))

I obtain the following results:

Dimensions:               (independent_4: 4, spectral: 497, time: 19035)
Dimensions without coordinates: independent_4, spectral, time
Data variables:
    scan_subindex         (time) int16 ...
    datetime              (time) datetime64[ns] ...
    orbit_index           int32 ...
    latitude              (time) float32 ...
    longitude             (time) float32 ...
    latitude_bounds       (time, independent_4) float32 ...
    longitude_bounds      (time, independent_4) float32 ...
    sensor_latitude       (time) float32 ...
    sensor_longitude      (time) float32 ...
    sensor_altitude       (time) float32 ...
    solar_zenith_angle    (time) float32 ...
    solar_azimuth_angle   (time) float32 ...
    sensor_zenith_angle   (time) float32 ...
    sensor_azimuth_angle  (time) float32 ...
    wavelength            (time, spectral) float32 ...
    photon_radiance       (time, spectral) float32 ...
    index                 (time) int32 ...
Attributes:
    Conventions:     HARP-1.0
    datetime_start:  7110.123880706018
    datetime_stop:   7110.125530659721
    source_product:  S5P_OFFL_L1B_RA_BD2_20190620T020303_20190620T034433_0872...
    history:         2021-02-08T22:16:48Z [harp-1.11] harp.import_product('S5...
<xarray.Dataset>
Dimensions:               (independent_4: 4, spectral: 497, time: 19138)
Dimensions without coordinates: independent_4, spectral, time
Data variables:
    scan_subindex         (time) int16 ...
    datetime              (time) datetime64[ns] ...
    orbit_index           int32 ...
    latitude              (time) float32 ...
    longitude             (time) float32 ...
    latitude_bounds       (time, independent_4) float32 ...
    longitude_bounds      (time, independent_4) float32 ...
    sensor_latitude       (time) float32 ...
    sensor_longitude      (time) float32 ...
    sensor_altitude       (time) float32 ...
    solar_zenith_angle    (time) float32 ...
    solar_azimuth_angle   (time) float32 ...
    sensor_zenith_angle   (time) float32 ...
    sensor_azimuth_angle  (time) float32 ...
    wavelength            (time, spectral) float32 ...
    photon_radiance       (time, spectral) float32 ...
    index                 (time) int32 ...
Attributes:
    Conventions:     HARP-1.0
    datetime_start:  7110.123830706018
    datetime_stop:   7110.125493159722
    source_product:  S5P_OFFL_L1B_RA_BD3_20190620T020303_20190620T034433_0872...
    history:         2021-02-08T22:13:00Z [harp-1.11] harp.import_product('S5...
<xarray.Dataset>
Dimensions:                                               (independent_4: 4, time: 19138, vertical: 34)
Dimensions without coordinates: independent_4, time, vertical
Data variables:
    scan_subindex                                         (time) int16 ...
    datetime_start                                        (time) datetime64[ns] ...
    datetime_length                                       float32 ...
    orbit_index                                           int32 ...
    validity                                              (time) int32 ...
    latitude                                              (time) float32 ...
    longitude                                             (time) float32 ...
    latitude_bounds                                       (time, independent_4) float32 ...
    longitude_bounds                                      (time, independent_4) float32 ...
    sensor_latitude                                       (time) float32 ...
    sensor_longitude                                      (time) float32 ...
    sensor_altitude                                       (time) float32 ...
    solar_zenith_angle                                    (time) float32 ...
    solar_azimuth_angle                                   (time) float32 ...
    sensor_zenith_angle                                   (time) float32 ...
    sensor_azimuth_angle                                  (time) float32 ...
    pressure                                              (time, vertical) float64 ...
    SO2_column_number_density                             (time) float32 ...
    SO2_column_number_density_uncertainty_random          (time) float32 ...
    SO2_column_number_density_uncertainty_systematic      (time) float32 ...
    SO2_column_number_density_validity                    (time) int8 ...
    SO2_column_number_density_amf                         (time) float32 ...
    SO2_column_number_density_amf_uncertainty_random      (time) float32 ...
    SO2_column_number_density_amf_uncertainty_systematic  (time) float32 ...
    SO2_column_number_density_avk                         (time, vertical) float32 ...
    SO2_volume_mixing_ratio_dry_air_apriori               (time, vertical) float32 ...
    SO2_slant_column_number_density                       (time) float32 ...
    SO2_type                                              (time) int8 ...
    O3_column_number_density                              (time) float32 ...
    O3_column_number_density_uncertainty                  (time) float32 ...
    absorbing_aerosol_index                               (time) float32 ...
    cloud_albedo                                          (time) float32 ...
    cloud_albedo_uncertainty                              (time) float32 ...
    cloud_fraction                                        (time) float32 ...
    cloud_fraction_uncertainty                            (time) float32 ...
    cloud_altitude                                        (time) float32 ...
    cloud_altitude_uncertainty                            (time) float32 ...
    cloud_pressure                                        (time) float32 ...
    cloud_pressure_uncertainty                            (time) float32 ...
    surface_albedo                                        (time) float32 ...
    surface_altitude                                      (time) float32 ...
    surface_altitude_uncertainty                          (time) float32 ...
    surface_pressure                                      (time) float32 ...
    index                                                 (time) int32 ...
Attributes:
    Conventions:     HARP-1.0
    datetime_start:  7110.123830706018
    datetime_stop:   7110.125505659722
    source_product:  S5P_OFFL_L2__SO2____20190620T020303_20190620T034433_0872...
    history:         2021-02-08T22:14:46Z [harp-1.11] harp.import_product('S5...
------------
2019-06-20T02:58:23.293000000 2019-06-20T02:58:18.973000000 2019-06-20T02:58:18.973000000
364 385 385
44.000114 44.000736 44.000736
157.1179 158.71748 158.71748

It seems that L1B_B3 and SO2 are aligned, but this doesn’t happen with L1B_B2, which also has a slightly different “time” dimension (19035 for B2 versus 19138 for B3 and SO2).
Is what I obtained correct? If so, could you suggest a way to compare information for the same lat/lon points across B2, B3 and SO2? In this case it seems that I can’t match the same datetime and scan_subindex for B2, B3 and SO2.

Many thanks in advance.

It seems that band 2 and 3 have a different viewing geometry, so I am not sure that you can easily align them.

I also don’t really understand what you want to do with the validity after you have aligned the pixels. You generally only use this flag to filter on (as you already do with SO2_column_number_density_validity > 50).

Thank you for your answer.

Do you know whether the different viewing geometry of bands 2 and 3 is constant for every date, or whether it varies from orbit to orbit? I could also try to analyse the B2/B3 alignment for other dates.

I would like to compare the SO2_column_number_density obtained with a different method against the value reported in the L2 TROPOMI product. If I use the filter SO2_column_number_density_validity > 50, then I don’t know whether a pixel has validity 60, 75 or 100 (for example). I suppose the data quality differs between these values, although in general it is only recommended to use data with SO2_column_number_density_validity > 50.
So my question is: is it enough for the qa value to be above 50 to consider the data “fully reliable”, regardless of whether it is 60, 75 or 100, or does each specific qa value above 50 also matter (and if so, how much does it weigh on data reliability)?

Thanks for your help.

If you want to understand the quality differences between the different qa_value values, you should probably have a look at the PUM and ATBD, as is also mentioned in Section 3.1 of the Product Readme File (PRF). The qa_value is generally just a combination of other quality measures that are also found elsewhere in the product.

If you only want to use the quality flag to filter out poor data, without really knowing which conditions might impact quality, then just go with the recommendations from the PRF.

If you are actively investigating yourself which conditions impact quality (e.g. cloud fraction, amf, solar angle, etc.) then you probably don’t want to look at the quality flag itself anyway, but at the impacting quality influence variables directly.
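For example, you could filter on an influence variable such as cloud_fraction directly instead of on the aggregated flag. A minimal sketch with made-up values standing in for the product arrays:

```python
# Synthetic stand-ins for per-pixel values read from the SO2 product
cloud_fraction = [0.1, 0.6, 0.05, 0.9]
so2 = [1.2, 0.8, 2.1, 0.5]

# Keep only nearly cloud-free scenes, rather than relying on validity > 50;
# the 0.3 threshold here is purely illustrative.
clear = [v for v, cf in zip(so2, cloud_fraction) if cf < 0.3]
print(clear)  # → [1.2, 2.1]
```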

Many thanks for your answer.

I only have one more question at the moment: when using the bin_spatial operation with HARP, what would in your opinion be the best choice of latitude/longitude grid to exploit TROPOMI’s spatial resolution as much as possible and to “homogenize” the data, while working with SO2 L2 TROPOMI products acquired both before and after August 2019?
At first I was thinking of setting 0.06 lat x 0.03 lon, but now I’m not sure that is the best choice, or whether a square grid (e.g. 0.03 x 0.03) would be better. I suppose it also depends on whether I’m working with data acquired at high or middle latitudes, for example, and maybe I should consider different bin_spatial settings depending on which part of the world the TROPOMI products cover. Any suggestion you may have in this regard would be really appreciated.

Thank you!

See also this response. SO2 uses ‘native resolution’ similar to NO2, so I would recommend something around 0.02 x 0.02.
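A sketch of building a 0.02 x 0.02 grid for bin_spatial; the region bounds below are placeholders, and this assumes (per my reading of the HARP operations documentation) that the first and fourth arguments count grid edge points, i.e. cells + 1:

```python
# Placeholder region of interest
lat_min, lat_max = 35.0, 45.0
lon_min, lon_max = 135.0, 145.0
step = 0.02

# Number of grid cells along each axis
n_lat = round((lat_max - lat_min) / step)  # 500 cells
n_lon = round((lon_max - lon_min) / step)  # 500 cells

# bin_spatial takes edge counts (cells + 1), offsets, and step sizes
operations = (
    f"bin_spatial({n_lat + 1}, {lat_min}, {step}, "
    f"{n_lon + 1}, {lon_min}, {step});"
    "derive(latitude {latitude});derive(longitude {longitude})"
)
print(operations)
```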