How to obtain the vertical column from amf_clear in a qa4ecv file

If I want to grid the vertical column density (VCD) from a qa4ecv file with GOME, I do the following to obtain the L3 grid:

harpmerge -f hdf5 -a "keep(latitude,longitude,tropospheric_HCHO_column_number_density,latitude_bounds,longitude_bounds);bin_spatial(3600,-90,0.05,7200,-180,0.05);derive(latitude {latitude});derive(longitude {longitude})" -ar "bin();squash(time, (latitude,longitude));exclude(latitude_bounds_weight,longitude_bounds_weight,latitude_weight,longitude_weight)" ./QA4ECV/WinB/2003/02/10/* test_hcho_gome_200302.nc

where /QA4ECV/WinB/2003/02/10/ contains QA4ECV files that are recognized as HARP products.

This works fine, but what if, instead of tropospheric_HCHO_column_number_density, I want to grid VCD_clear, which is obtained by:

$$\mathrm{VCD_{clear}} = \frac{\text{tropospheric\_HCHO\_column\_number\_density\_amf}}{\text{amf\_clear}} \times \text{tropospheric\_HCHO\_column\_number\_density}$$

where amf_clear is the dataset </PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/amf_clear>.

</PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/amf_clear> is not recognized by HARP, and as far as I know there is no HARP operation to calculate VCD_clear.

A solution would be to create a new HARP product that includes VCD_clear as one of its variables, but I wonder whether this is possible with the HARP C interface, called from the command line?
thanks

There are two different ways to do this.

The first is that we modify HARP to allow reading of the clear sky VCD and AMF (I assume that ‘clear’ means ‘clear sky’, i.e. no clouds). This would then be a HARP ingestion option (e.g. ‘amf=clear_sky’). When that option is enabled, HARP would return the clear sky AMF for tropospheric_HCHO_column_number_density_amf and the AMF-scaled version of the VCD for tropospheric_HCHO_column_number_density (instead of the regular VCD and AMF). I think this might be a case where we could make such a change in HARP. Can you maybe describe what this scaled VCD would represent? Is this simply ‘the tropospheric column assuming clear sky conditions’ (or is there more to it)?

The other approach is that you perform this calculation yourself. This is probably best done in Python. You would read the original product using the HARP Python interface and any additional data using e.g. CODA (see e.g. SCIAMACHY L2 to L3 with CODA? - #2 by svniemeijer). You can then scale the VCD and replace the AMF values yourself in Python (the HARP variable data are just numpy arrays). Finally, you could export the modified harp.Product with harp.export_product(), passing your gridding steps as operations.
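A minimal sketch of this second approach could look like the following. The function names are made up for illustration. It assumes that the product is imported one file at a time without any filtering, so that the flattened amf_clear array read by CODA lines up element by element with the HARP variables, and that fill values in amf_clear have already been turned into NaN. Neither assumption is confirmed in this thread, so check both against your own files.

```python
import numpy as np


def scale_vcd_to_clear_sky(vcd, amf, amf_clear):
    """Return VCD * AMF / AMF_clear; NaN where AMF_clear is not finite and positive."""
    vcd = np.asarray(vcd, dtype=float)
    amf = np.asarray(amf, dtype=float)
    amf_clear = np.asarray(amf_clear, dtype=float)
    valid = np.isfinite(amf_clear) & (amf_clear > 0)
    out = np.full(vcd.shape, np.nan)
    out[valid] = vcd[valid] * amf[valid] / amf_clear[valid]
    return out


def make_clear_sky_product(filename):
    """Hypothetical helper: import one QA4ECV file and swap in the clear sky VCD and AMF."""
    # Imported here so the scaling function above works without HARP/CODA installed.
    import coda
    import harp

    product = harp.import_product(filename)

    # Read the dataset that HARP does not ingest directly with CODA.
    pf = coda.open(filename)
    try:
        amf_clear = coda.fetch(pf, "PRODUCT", "SUPPORT_DATA", "DETAILED_RESULTS", "amf_clear")
    finally:
        coda.close(pf)
    # Assumption: flattening reproduces the pixel order that HARP uses for its time dimension.
    amf_clear = np.asarray(amf_clear, dtype=float).ravel()

    vcd = product.tropospheric_HCHO_column_number_density
    amf = product.tropospheric_HCHO_column_number_density_amf
    if amf_clear.shape != vcd.data.shape:
        raise ValueError("amf_clear does not match the HARP variables; was the import filtered?")

    vcd.data = scale_vcd_to_clear_sky(vcd.data, amf.data, amf_clear)
    amf.data = amf_clear  # the AMF variable now holds the clear sky AMF
    return product
```

The returned product can then be passed to harp.export_product() in the usual way.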

Thanks a lot for the answer. I tried what you suggested in the second point with Python:

import harp

# import the product
imp = harp.import_product(files, operations="keep(tropospheric_HCHO_column_number_density,tropospheric_HCHO_column_number_density_amf,validity,latitude_bounds,longitude_bounds)")


# calculate VCD_clear = VCD * AMF / AMF_clear
imp.tropospheric_HCHO_column_number_density.data = imp.tropospheric_HCHO_column_number_density_amf.data / amfclear * imp.tropospheric_HCHO_column_number_density.data  # rescale the VCD to VCD_clear

where amfclear is AMF_clear, which is not accessible through the imported HARP product.

It would already help if there were a HARP variable such as tropospheric_HCHO_column_number_density_amf_clear, because then the scaling could be done more easily. We would then also have access to VCD and VCD_clear at the same time.

VCD_clear = VCD * AMF / AMF_clear is not applicable to all molecules; it is only useful under certain conditions. VCD_clear does indeed represent ‘the tropospheric column assuming clear sky conditions’.

Having custom postfixes for variables is exactly what HARP is trying to prevent. HARP does not allow multiple variants of the same quantity to exist in the same product. This removes any confusion about which variables belong together and provides a fixed naming convention that allows for derivations (see the full list of algorithms).

If you have a tropospheric HCHO column number density, then with HARP you should expect that variable to always be available as tropospheric_HCHO_column_number_density. And its AMF will always be called tropospheric_HCHO_column_number_density_amf.

If you want the VCD calculated from a different AMF, then you are actually using a different algorithm. And ‘algorithm’ or ‘sensor’ specific aspects are exactly what HARP is trying to harmonise away. So you would have to store that in a separate HARP product (or file) where tropospheric_HCHO_column_number_density_amf is your clear sky AMF and tropospheric_HCHO_column_number_density is the VCD calculated using that clear sky AMF.
You would use a different filename (or a different variable name for your harp.Product() in Python) to keep track of which is which.
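As a rough illustration of keeping the variants apart, the sketch below writes each harp.Product to its own file, using the gridding steps from the question as export operations. The helper names, the variant labels and the file naming pattern are made up for this example, and it assumes that harp.export_product() accepts an operations argument as in recent HARP releases.

```python
# Gridding steps copied from the harpmerge command earlier in this thread.
GRID_OPERATIONS = ";".join([
    "bin_spatial(3600,-90,0.05,7200,-180,0.05)",
    "derive(latitude {latitude})",
    "derive(longitude {longitude})",
])


def variant_filename(prefix, variant):
    """Build an output name that records which variant the file holds (hypothetical pattern)."""
    return f"{prefix}_{variant}.nc"


def export_variants(products, prefix):
    """Export each variant to its own file; 'products' maps a label to a harp.Product."""
    import harp  # imported here so the helpers above work without HARP installed

    written = []
    for variant, product in products.items():
        filename = variant_filename(prefix, variant)
        # Both files use the standard HARP variable names; only the filename differs.
        harp.export_product(product, filename, operations=GRID_OPERATIONS)
        written.append(filename)
    return written
```

For example, export_variants({"regular": regular, "clear_sky": clear}, "test_hcho_gome_200302") would write two gridded files whose variables have identical names, so they can be compared like any two HARP products.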

From the HARP perspective there is no difference between comparing two slightly different retrieval algorithms for data from the same sensor (i.e. regular AMF vs. clear sky AMF) and comparing two totally different instruments (e.g. S5P and OMI). In both cases you should store each variant in its own file. This is the only way to harmonise the structure and naming convention between both variants.

We will look into implementing an ingestion option for the QA4ECV product that provides a ‘clear sky amf’.