Speed things up with operations?

try:
    # import the entire product in lat/lon but only keep the variables we want
    product = harp.import_product(infile, operations="keep(latitude, longitude, "
        "solar_zenith_angle, viewing_zenith_angle, scan_subindex, "
        "datetime, "
        "O3_column_number_density, O3_column_number_density_uncertainty, "
        "cloud_top_albedo, cloud_top_pressure, cloud_fraction, "
        "cloud_optical_depth)")

    # three variables are not in the ingestion definition, so we need to ingest them by hand
    pf = coda.open(infile)      
    product.O3_temp = harp.Variable(coda.fetch(pf, "/DETAILED_RESULTS/O3/O3Temperature"), ["time"])
    product.EW_CorrF_O3 = harp.Variable(coda.fetch(pf, "/DETAILED_RESULTS/O3/EastWestPostCorrectionFactorO3"), ["time"])
    product.Effective_SCD_O3 = harp.Variable(coda.fetch(pf, "/DETAILED_RESULTS/O3/ESCRingCorrected"), ["time"])
    coda.close(pf)
    
    #apply the geospatial filter
    try:
        filtered_product = harp.execute_operations(product, "point_distance(40.6335,22.9563,150[km])")
        # only append when the filter succeeded, so a stale or undefined product is never added
        productlist.append(filtered_product)
    except harp.Error:
        pass
    
except harp.Error:
    pass

#Export the product
try:
    average = harp.execute_operations(productlist)
    harp.export_product(average, outpath + "test_ozone.nc")
except harp.Error:
    pass


Hi Sander,

I am wondering if there is a faster way to perform the operations above. I am ingesting a GOME2 L2 orbital file, adding the variables that are not in the ingestion definition, applying a spatial filter, and then writing out, per day, whatever it found. Do you see any way this process might become faster? For example, I noticed that productlist ends up with 14 harp products, 13 of which are empty and not needed, and only one that actually has data [14 orbits per day, only one over SKG]. Per day this is not an issue of course, but what if I run a month? A year? Multiple years?! Horror.

Is there a way to “coda.fetch” from within harp.import_product, so that all this can happen in one go?

All of the above is in Python with HARP 1.16.

Many thanks as ever,
MariLiza

What you can do is perform the point_distance filter already as part of the initial ingestion.
If you then also keep() the index variable, it will give you the set of indices that you can use as an index filter on the numpy arrays returned by coda.fetch.

By the way, are you sure that average = harp.execute_operations(productlist) is correct? The way this works is that it will just concatenate the results. To average it, you would have to use a bin() operation as parameter.
You might rather want to use harp.concatenate() (which might be faster), and if you then still want to perform a bin() operation, you could then do this as part of the harp.export_product().

Thanks for the swift reply! No binning this time, I’m just creating an “overpass” type file.

The index tip is great. Would you be so kind as to help me with the syntax of the coda.fetch command in this case? Pretty please?

I’ll use harp.concatenate to “merge” all the data it found, sweet.

It is not a modification of the coda.fetch command. You just need to index the result.

See below for a working example:

import coda
import harp

infile = "GOME_O3-NO2-NO2Tropo-BrO-SO2-H2O-HCHO-OClO_L2_20120101012547_055_METOPA_26984_DLR_04.HDF5"

product = harp.import_product(infile,operations = "keep(latitude, longitude, solar_zenith_angle,viewing_zenith_angle, "
	"scan_subindex, datetime, O3_column_number_density,O3_column_number_density_uncertainty, cloud_top_albedo, "
	"cloud_top_pressure, cloud_fraction, cloud_optical_depth, index);point_distance(-50,110,150[km])")

with coda.open(infile) as pf:
    # three variables are not in the ingestion definition, so we need to ingest them by hand
    product.O3_temp = harp.Variable(coda.fetch(pf, "/DETAILED_RESULTS/O3/O3Temperature")[product.index.data], ["time"])
    product.EW_CorrF_O3 = harp.Variable(coda.fetch(pf, "/DETAILED_RESULTS/O3/EastWestPostCorrectionFactorO3")[product.index.data], ["time"])
    product.Effective_SCD_O3 = harp.Variable(coda.fetch(pf, "/DETAILED_RESULTS/O3/ESCRingCorrected")[product.index.data], ["time"])
    
harp.export_product(product, "./test_ozone.nc")
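
The indexing step in the example above can be tried in isolation with plain NumPy, without any HARP or CODA file at hand. The sketch below is only an illustration: full_orbit_values stands in for the array that coda.fetch would return, kept_indices stands in for product.index.data, and both names and all values are made up for the demonstration.

```python
import numpy as np

# Stand-in for what coda.fetch() returns: one value per pixel in the full orbit.
# (Hypothetical values; a real file would supply the actual array.)
full_orbit_values = np.array([210.0, 215.5, 220.1, 225.3, 230.7, 228.4])

# Stand-in for product.index.data: positions of the pixels that survived
# the filter applied during ingestion (hypothetical values).
kept_indices = np.array([1, 4, 5], dtype=np.int32)

# Integer-array indexing picks exactly those pixels, in the same order,
# so the result has one entry per remaining sample in the filtered product.
filtered_values = full_orbit_values[kept_indices]

print(filtered_values)
```

Because the result has the same length as the filtered time dimension, it can be wrapped in harp.Variable(..., ["time"]) exactly as in the example above.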

Well, for me, using [product.index.data] is something new! I wish I had known about it long ago.
This has sped up the loop, as has using concatenate at the end [for multiple files].
Many thanks,
MariLiza