Speed things up with operations?

mariliza · November 23, 2022, 7:34pm

try:
    # import the entire product in lat/lon but only keep the variables we want
    product = harp.import_product(infile, operations="keep(latitude, longitude, "
        "solar_zenith_angle, viewing_zenith_angle, scan_subindex, "
        "datetime, "
        "O3_column_number_density, O3_column_number_density_uncertainty, "
        "cloud_top_albedo, cloud_top_pressure, cloud_fraction, "
        "cloud_optical_depth)")

    # three variables are not in the ingestion definition, so we need to ingest them by hand
    pf = coda.open(infile)      
    product.O3_temp = harp.Variable(coda.fetch(pf, "/DETAILED_RESULTS/O3/O3Temperature"), ["time"])
    product.EW_CorrF_O3 = harp.Variable(coda.fetch(pf, "/DETAILED_RESULTS/O3/EastWestPostCorrectionFactorO3"), ["time"])
    product.Effective_SCD_O3 = harp.Variable(coda.fetch(pf, "/DETAILED_RESULTS/O3/ESCRingCorrected"), ["time"])
    coda.close(pf)
    
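    # apply the geospatial filter
    try:
        filtered_product = harp.execute_operations(product, "point_distance(40.6335,22.9563,150[km])")
        # only append when the filter succeeded, so a failed filter cannot leave filtered_product undefined or stale
        productlist.append(filtered_product)
    except harp.Error:
        pass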
    
except harp.Error:
    pass

# Export the product
try:
    average = harp.execute_operations(productlist)
    harp.export_product(average, outpath + "test_ozone.nc")
except harp.Error:
    pass

Hi Sander,

I am wondering if there is a faster way to perform the operations above. I am ingesting a GOME2 L2 orbital file, adding the variables that are not in the ingestion definition, applying a spatial filter and then outputting, per day, whatever it found. Do you see any way this process might become faster? For example, I noticed that productlist ends up as a list of 14 harp products, 13 of which are empty and not needed, and only one of which actually has data [14 orbits per day, only one over SKG]. Per day this is not an issue of course, but what if I run a month? A year? Multiple years?! Horror.

Is there a way to “coda.fetch” from within harp.import_product, so that all this can happen in one go?

All of the above is in Python with HARP 1.16.

Many thanks as ever,
MariLiza

sander.niemeijer · November 24, 2022, 5:51pm

What you can do is already perform the point_distance filter as part of the initial ingestion.
If you then also keep() the index variable, this variable will give you the set of indices that you can use as index filter into the numpy arrays that get returned from coda.fetch.

By the way, are you sure that average = harp.execute_operations(productlist) is correct? The way this works is that it will just concatenate the results. To average it, you would have to use a bin() operation as parameter.
You might rather want to use harp.concatenate() (which might be faster), and if you then still want to perform a bin() operation, you could then do this as part of the harp.export_product().
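A rough sketch of how the two suggestions above could fit together in a per-file loop. The helper names build_operations and merge_overpasses are hypothetical, and the sketch assumes that harp.import_product raises harp.NoDataError when the filter removes every sample, so empty orbits never reach the list. Any variables fetched by hand with coda would be added inside the loop, before the product is appended.

```python
def build_operations(variables, lat, lon, radius_km):
    """Build a HARP operations string: keep() the variables, then filter by distance."""
    keep = "keep(" + ",".join(variables) + ")"
    return f"{keep};point_distance({lat},{lon},{radius_km}[km])"


def merge_overpasses(files, operations):
    """Import each file with the filter applied and concatenate the non-empty results."""
    import harp  # imported here so build_operations() can be used without HARP installed

    products = []
    for path in files:
        try:
            products.append(harp.import_product(path, operations=operations))
        except harp.NoDataError:
            # Assumption: an orbit with no samples near the site raises NoDataError.
            # Skip it instead of storing an empty product.
            continue
    # Returns None when no orbit passed over the site.
    return harp.concatenate(products) if products else None


# The site coordinates are the ones used in the question above.
ops = build_operations(["latitude", "longitude", "datetime", "index"], 40.6335, 22.9563, 150)
print(ops)
```

If an average is wanted after all, a bin() operation could then be passed through the operations argument of harp.export_product, as described above.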

mariliza · November 24, 2022, 6:39pm

Thanks for the swift reply! No binning this time, I’m just creating an “overpass” type file.

The index tip is great, would you be so kind as to help me with the syntax of the coda.fetch command in this case? pretty please?

I’ll use harp.concatenate to “merge” all the data it found, sweet.

sander.niemeijer · November 25, 2022, 1:45pm

It is not a modification of the coda.fetch command. You just need to index the result.

See below for a working example:

import coda
import harp

infile = "GOME_O3-NO2-NO2Tropo-BrO-SO2-H2O-HCHO-OClO_L2_20120101012547_055_METOPA_26984_DLR_04.HDF5"

product = harp.import_product(infile,operations = "keep(latitude, longitude, solar_zenith_angle,viewing_zenith_angle, "
	"scan_subindex, datetime, O3_column_number_density,O3_column_number_density_uncertainty, cloud_top_albedo, "
	"cloud_top_pressure, cloud_fraction, cloud_optical_depth, index);point_distance(-50,110,150[km])")

with coda.open(infile) as pf:
    # three variables are not in the ingestion definition, so we need to ingest them by hand
    product.O3_temp = harp.Variable(coda.fetch(pf, "/DETAILED_RESULTS/O3/O3Temperature")[product.index.data], ["time"])
    product.EW_CorrF_O3 = harp.Variable(coda.fetch(pf, "/DETAILED_RESULTS/O3/EastWestPostCorrectionFactorO3")[product.index.data], ["time"])
    product.Effective_SCD_O3 = harp.Variable(coda.fetch(pf, "/DETAILED_RESULTS/O3/ESCRingCorrected")[product.index.data], ["time"])
    
harp.export_product(product, "./test_ozone.nc")
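To make the indexing step concrete, here is a small illustration with made-up numbers, using only numpy. The array full stands in for what coda.fetch returns for the whole orbit, and index stands in for product.index.data after the spatial filter; both are invented for this sketch.

```python
import numpy as np

# Stand-in for the full-length array that coda.fetch would return (one value per sample).
full = np.array([10.0, 11.0, 12.0, 13.0, 14.0, 15.0])

# Stand-in for product.index.data after a filter kept samples 1, 2 and 4.
index = np.array([1, 2, 4])

# Same form as coda.fetch(...)[product.index.data]: pick only the samples HARP kept.
subset = full[index]
print(subset)
```

The result has one element per remaining HARP sample, which is why it can be attached to the product with the ["time"] dimension.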

mariliza · November 25, 2022, 7:22pm

Well, for me using [product.index.data] is something new! Wish I had known about it long ago.
This has sped up the loop, as has using concatenate at the end [for multiple files].
Many thanks,
MariLiza

mariliza · May 15, 2023, 1:06pm

Good afternoon @sander.niemeijer,

I am wondering if IDL can also support such operations as
product.Effective_SCD_O3 = harp.Variable(coda.fetch(pf, "/DETAILED_RESULTS/O3/ESCRingCorrected")[product.index.data], ["time"])

I have IDL codes for CO and do not wish to change them to python, if possible.

While this one command:
dummy = reform(coda_fetch(pf, "PRODUCT/SUPPORT_DATA/DETAILED_RESULTS/surface_albedo_2325"))
of course works, I would like not only to include this in the harp product already imported but also to apply that product's product.index to this extra parameter, like you showed me for Python.

I get various "% Object reference type required in this context: HARP." errors with most things I try.

Many thanks,
MariLiza

PS. I get the feeling I am the only IDL-user in this forum… … …

sander.niemeijer · May 15, 2023, 1:39pm

My IDL skills are rather rusty, so I am not sure exactly how to do this in IDL.
But the representation of a HARP product and its variables is just a hierarchy of anonymous structs in IDL. You should not need any HARP routines to modify or adapt an imported HARP product.

mariliza · May 15, 2023, 2:58pm

Yes, I see what you mean. I should use simple IDL structure commands to add to the harp product structure.

My concern is that I do not think I am picking the correct array elements from the harp-ingested product when applying product.index.data to the coda-imported array. I was hoping that doing this via the harp.Variable(coda_fetch, …) command, all in one go, would ensure that I am doing this right.

Thanks anyway!
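One way to check the alignment concern above is to take a variable that exists both in the HARP product and in the raw file, in the same units (latitude, for example), apply the index to the raw array, and compare the result with what HARP ingested. The sketch below is in Python with numpy, and the function name index_is_aligned is hypothetical. It assumes the raw array has one element per HARP sample, in the same order. The same comparison should carry over to plain IDL array subscripting, though that is untested here.

```python
import numpy as np


def index_is_aligned(raw_full, index, harp_values, rtol=1e-6):
    """Return True if raw_full[index] reproduces the values HARP ingested."""
    raw_full = np.asarray(raw_full)
    index = np.asarray(index, dtype=int)
    harp_values = np.asarray(harp_values)

    # An index outside the raw array means the two do not describe the same samples.
    if index.size and (index.min() < 0 or index.max() >= raw_full.shape[0]):
        return False

    picked = raw_full[index]
    if picked.shape != harp_values.shape:
        return False
    return bool(np.allclose(picked, harp_values, rtol=rtol))


# Made-up numbers: the raw file has four samples, HARP kept samples 0 and 2.
print(index_is_aligned([1.0, 2.0, 3.0, 4.0], [0, 2], [1.0, 3.0]))
```

If the check passes for a variable that HARP also ingests, applying the same index to the extra, hand-fetched variables should select the matching samples.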