Subsetting Data using lat/lon values

Hello,

I have been using the below code to analyse Sentinel-5P NO2 data:

import harp

product = harp.import_product("path\month\*.nc",
    operations="tropospheric_NO2_column_number_density_validity>75;keep(latitude_bounds,longitude_bounds,tropospheric_NO2_column_number_density,surface_zonal_wind_velocity,surface_meridional_wind_velocity);bin_spatial(1801,-90,0.1,3601,-180,0.1);derive(tropospheric_NO2_column_number_density [Pmolec/cm2])",
    post_operations="bin();squash(time, (latitude_bounds,longitude_bounds));derive(latitude {latitude});derive(longitude {longitude});exclude(latitude_bounds,longitude_bounds,latitude_bounds_weight,longitude_bounds_weight,count,weight)")

I am only interested in the Mediterranean Sea region. Even though the files in my directory are only those relevant to my area of interest, I am still processing a lot of unnecessary data. This results in problems when trying to compute monthly plots from around 90 NetCDF files. At the moment I can only run about 30 files at a time, otherwise my Python session crashes. It is a computational limitation I have to live with.

Is there a way to subset the data with HARP using specific lat/lon values so that I can process just the data relevant to me?

Thanks in advance for your help.

Hi Ryan,

There are several things you can do.

If you are only interested in the Mediterranean area, then use a spatial binning that matches that area, for instance bin_spatial(251,25,0.1,551,-10,0.1). This will already reduce the size a bit.
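For example, your operations string would then look as follows (a sketch reusing the filter and variable list from your original call; the new bin_spatial arguments define 251 latitude edges starting at 25N and 551 longitude edges starting at 10W, both with 0.1 degree steps, i.e. roughly 25N-50N and 10W-45E):

operations = (
    "tropospheric_NO2_column_number_density_validity>75;"
    "keep(latitude_bounds,longitude_bounds,tropospheric_NO2_column_number_density,"
    "surface_zonal_wind_velocity,surface_meridional_wind_velocity);"
    "bin_spatial(251,25,0.1,551,-10,0.1);"
    "derive(tropospheric_NO2_column_number_density [Pmolec/cm2])"
)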

What I would further recommend is to only ingest daily grids as a start:

grid1 = harp.import_product(".../day1/*.nc", operations="...", post_operations="...")
grid2 = harp.import_product(".../day2/*.nc", operations="...", post_operations="...")
...

daily_grids = [grid1, grid2, ...]

And then merge these daily grids into a single monthly grid using:

monthly_grid = harp.execute_operations(daily_grids, "", "bin()")

This is both fast and memory efficient.
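Putting both suggestions together, a minimal sketch (the day1, day2, ... subdirectory layout and the 30-day range are assumptions; adapt them to how your files are organised, and reuse your existing operations and post_operations strings where the "..." placeholders appear):

import harp

# the regional operations string from above
operations = "...;bin_spatial(251,25,0.1,551,-10,0.1);..."
# your existing post_operations string
post_operations = "..."

daily_grids = []
for day in range(1, 31):  # assumption: one subdirectory per day
    # build one daily grid at a time to keep memory use low
    grid = harp.import_product(f".../day{day}/*.nc",
                               operations=operations,
                               post_operations=post_operations)
    daily_grids.append(grid)

# average the daily grids into a single monthly grid
monthly_grid = harp.execute_operations(daily_grids, "", "bin()")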
