L2__CO____ 3-days average

Hello,

I was trying to reproduce a L2__CO map like the one at the website (https://maps.s5p-pal.com/co/).
I first tried averaging data that I downloaded by applying:

basic_operations = ";".join([
    "CO_column_number_density_validity>50",
    "keep(latitude_bounds,longitude_bounds,CO_column_number_density )",
    "bin_spatial(801,-40,0.1,1600,-100,0.1)",
    "derive(latitude {latitude})",
    "derive(longitude {longitude})",
])
reduce_operations=";".join([
    "squash(time, (latitude, longitude, latitude_bounds, longitude_bounds))",
    "bin()"
])
s5p_co_merged = harp.import_product(files, basic_operations, reduce_operations=reduce_operations)

but I obtained quite a coarse image. So I have plotted the data provided at the website for those specific days but the image looked still quite different than the website:

I was thinking if the difference could be the use of a 3-days moving average, while I am using the usual time average implemented by Harp. But even in this case, I don’t understand why the data provided by the website look still different when I plot them unless the moving average is applied afterwards.
Also, how is the 3-days moving average applied? Is it considering one day before and one ahead the selected day? And can HARP also perform moving average?

Thank you in advance for the help!

Best,
Serena

Be aware that for the Mapping Portal the CO is shown as volume mixing ratios and not as number densities.
The HARP operations that are performed at the start are:

CO_column_number_density_validity>50
keep(CO_column_number_density, H2O_column_number_density, altitude, surface_pressure, pressure_bounds, scan_subindex, latitude, latitude_bounds, longitude, longitude_bounds, datetime_start, datetime_length)
derive(CO_column_volume_mixing_ratio {time} [ppbv])

The difference may thus be in how you derive the VMR based on how HARP does this? (your harp operations do not show this conversion)

In addition, the Mapping Portal performs a custom median-based destriping algorithm to reduce some of the destriping effects. This was introduced before a L2 processor that included the destriping was released. At some point in the near future the maps will be regenerated based on the reprocessed collection 3 data and the maps will then use the destriped CO data from the product itself (i.e. harp ingestion option co=corrected).

I don’t understand what you mean with ‘the usual time average implemented by Harp’. The maps on the Mapping Portal actually use the harp time averaging approach. This is done by first creating daily grids and then merging them per 3 days (every day, so a moving average) using the bin() operation.

Note that you can see the full operation set in the history global attribute from the gridded netcdf files that you can download from the Mapping Portal (via the download icon on the bottom right of the plots).

1 Like

Oh. And you are using a different color bar. Check the color at e.g. 100 ppbv

Hi Sander,
thank you for your quick answer.
Yes, I just realized that the map used in the Mapping Portal seems to use a logarithmic color bar, so this is probably the issue when I try to plot the data downloaded directly from there.

I have tried to reproduce the same map using collection 3 data (available in the ATM-MPC workspace) to make a comparison for the same days. I assume that there I don’t need to use the option corrected anymore.
I have tried to ingest in this way (I made a spatial selection otherwise the kernel would crash):

basic_operations = ";".join([
    "CO_column_number_density_validity>50",
    "keep(CO_column_number_density, H2O_column_number_density, altitude, surface_pressure, pressure_bounds, scan_subindex, latitude_bounds, longitude_bounds, datetime_start, datetime_length)",
    "bin_spatial(801,-40,0.1,1600,-100,0.1)",
    "derive(latitude {latitude})",
    "derive(longitude {longitude})",
    "derive(CO_column_volume_mixing_ratio {time} [ppbv])",
])

s5p_co_merged = harp.import_product(files, basic_operations,)

But I get this error:

---------------------------------------------------------------------------
CLibraryError                             Traceback (most recent call last)
File <timed exec>:1

File /opt/jupyterlab/lib/python3.10/site-packages/harp/_harppy.py:1242, in import_product(filename, operations, options, reduce_operations, post_operations)
   1239 # Import the product as a C product.
   1240 if _lib.harp_import(_encode_path(file), _encode_string(operations), _encode_string(options),
   1241                     c_product_ptr) != 0:
-> 1242     raise CLibraryError()
   1243 if _lib.harp_product_is_empty(c_product_ptr[0]) == 1:
   1244     _lib.harp_product_delete(c_product_ptr[0])

CLibraryError: could not derive variable 'CO_column_volume_mixing_ratio {time}'

I read from the documentation that the variable and dimension does not have to exist yet so I am not sure what I am doing wrong here. I have also tried to use the same name CO_column_number_density, but it gave me the same error.

Thanks again!

Best,
Serena

It is important to be aware that HARP performs all operations in sequence in the order you provide it. The way you formulated it now is that the bin_spatial is performed before your derive of CO_column_volume_mixing_ratio. After a bin_spatial your data would have gained a latitude and longitude dimension, so you would then have to use derive(CO_column_volume_mixing_ratio {time,latitude,longitude} [ppbv]).

However, this is not very efficient. It is better to derive the vmr before the spatial binning and then throw everything away that is no longer needed. e.g.:

basic_operations = ";".join([
    "CO_column_number_density_validity>50",
    "keep(CO_column_number_density, H2O_column_number_density, altitude, surface_pressure, pressure_bounds, latitude_bounds, longitude_bounds, datetime_start, datetime_length)",
    "derive(CO_column_volume_mixing_ratio {time} [ppbv])",
    "derive(datetime_stop {time})",
    "keep(CO_column_volume_mixing_ratio, latitude_bounds, longitude_bounds, datetime_start, datetime_stop)",
    "bin_spatial(801,-40,0.1,1600,-100,0.1)",
    "derive(latitude {latitude})",
    "derive(longitude {longitude})",
])
1 Like

Thanks Sander, indeed with the correct order now it worked!
Best,
Serena

Hello Sander
I am working with RPRO CO datasets.
Do I need to perform

reduce_operations = “squash(time, (latitude, longitude, latitude_bounds, longitude_bounds));bin()”
or not after the basic operation? Followed by

mean_CO=harp.import_product(files_in, operations, reduce_operations=reduce_operations)

tropomi_co=harp.import_product(filename_tropomi, operations, options=“co=corrected”)
I took this from Usercase_6_CO.

My question is if, I am using “reduce_operation” . Is it okay to use option=“co=corrected” ?

mean_CO=harp.import_product(files_in, operations, reduce_operations=reduce_operations, options=“co=corrected”)

Yes. You can combine these arguments in the harp import command.

you mean by not removing the “reduce_operations”?

Indeed. The last command you provided, which combines all the arguments, should work.