Issues with harpmerge command line tool and ingesting and reading the product with VISAN

Dear all,

I am experiencing some trouble with the HARP command line tools (v1.9.2) and VISAN (v4.0).

I want to create an animated plot of the UVAI (S5P_OFFL_L2__AER_AI) generated by the Australian fires between 29/12/2019 and 08/01/2020 (following this ESA animation on Australian fire).

I am hence downloading all products located between Eastern Australia and Western South America for dates between 29/12/2019 and 08/01/2020. What I do first is the following:

harpmerge -ap ‘bin(); squash(time, (latitude,longitude))’ -a ‘latitude > -42 [degree_north]; latitude < -7 [degree_north]; absorbing_aerosol_index_validity > 80; bin_spatial(176,-42,0.2,1801,-180,0.2); derive(longitude {longitude}); derive(latitude {latitude})’ S5P_OFFL_L2__AER_AI_20191229*.nc …/Processing_daily_map/Australia_fire_20191229.nc

I do this for each day between 29/12/2019 and 08/01/2020 so that I have a regular map between Eastern Australia and South Western America for each of the 11 days (with the same Lat/lon grid). The generation of the daily products over the South Pacific works well. Then I simply merge the 11 resulting maps into a single file (to create an animated plot within VISAN) with:

harpmerge Australia_fire.nc …/Australia_20191229_20200108/Australia_fire_all.nc*

But when trying to use the wplot command on the Australia_fire_all.nc product in VISAN, it gives me the following error:

raceback (most recent call last):
File “”, line 1, in
File “/home/rus/anaconda3/lib/python3.7/site-packages/visan/commands.py”, line 642, in wplot
dataSetId = plot.AddGridData(latitude, longitude, data)
File “/home/rus/anaconda3/lib/python3.7/site-packages/visan/plot/worldplotframe.py”, line 485, in AddGridData
raise ValueError(“plot latitudeGrid should be the same for all grids”)
ValueError: plot latitudeGrid should be the same for all grids

Which I do not understand because I browsed the Australia_fire_all.nc product with VISAN and every variable of interest (latitude, longitude and absorbing aerosol index) has the dimension 11 x 175 x 1800 and hence has the same latitude grid.
Sometimes there are differences of 10^-14 between some latitude coordinates (same kind of differences for longitude coordinates) from one day to an other. But I do not know whether my issue is due to these (tiny compared to the grid resolution) differences and how to go around them because I applied the exact same command line to generate each daily products.

Also, I tried to use the longitude_range operation to handle the international lines but I did not manage to make it work correctly:

Longitude_range(142 [degree_east], -82 [degree_east]).

Because it seems to exclude the measures between [142,180] and between [-180,-82]. I tried many other possibilities but it never gave me the expected results.

For your information, I had done the same process to study UVAI generated by a fire over California and the process was working very well. Except that I did not need to merge the products day by day ahead because the area was covered by a single product per day. Here, as the area is way larger, I would need to create merged product day by day and then merge them all which seems to work but then VISAN doesn’t seem to be able to handle it.

I am probably doing something wrong but I cannot figure out what.

Any help would be appreciated!

Thank you in advance

Simon

Hi Simon,

Regarding the grid inconsistency, VISAN will indeed complain if your time series of lat/lon grids are not exactly the same. It performs a numpy.all(product.longitude.data[:] == product.longitude.data[0]) as check and if that fails then you will get the error.
I also don’t understand how you can get those slight differences. Are you perhaps creating the daily grid files on different machines (using different CPUs)? Otherwise I can’t explain this behaviour.
On a single machine, repeating these HARP operations should always result in exactly the same answer.
Note that the squash(time, (latitude,longitude)) in creating the daily files also performs a similar equality check of the lat/lon grid of the 14 orbit-based grids within a day. And that one doesn’t seem to be failing.
In any case, you can work around this for the plot, by just providing the lat/lon grids explicitly: wplot(product.latitude.data[0], product.longitude.data[0], product.absorbing_aerosol_index_validity)

For the longitude_range operation, I will check what is going wrong. What should already work is to make sure that the range is provided as min/max values (but potentially exceeding -180 or 180), so in your case use e.g. longitude_range(142 [degree_east], 278 [degree_east])

The documentation for longitude_range was indeed wrong.
We updated it to be in line with the actual behaviour.

Hi Sander,

Thank you very much for your prompt reply!
I agree with you, those slight differences should not happen when the computations are carried out on the same machines. And I am performing the harpmerge computation on the same virtual machines which has 8CPUs. I do not know whether the same CPU is used for the harpmerge command line for each day. Is there a way for me to find this information?

Also I was thinking, perhaps there is a way to reduce the precision of the computed lat/lon corrdinates to 4 significant digits with the HARP operations? Having acoordinate of 12.99999999999999 instead of 12.1 for some lat/lon grid products doesn’t not make much sense anyway and ends up in my issue …

It does not surprise me that the squash(time, (latitude,longitude)) operation works for the creation of the daily products as the creation of the daily products is carried out through a single command line per day. But then the resulting lat/lon grid of the daily files are not all exactly the same. Those daily files were created separately from successive calls to the same HARP command line on the same machine.

I tried your workaround yesterday and it seems to work fine, thank you for the tip :).

Ok I will try the longitude_range operation accordingly the. Thank you so much!

Simon

Also I was thinking, perhaps there is a way to reduce the precision of the computed lat/lon corrdinates to 4 significant digits with the HARP operations?

HARP doesn’t have a variant that performs nice rounding, but you can cast the latitude to an integer using proper unit scaling:

harpdump -d -a 'keep(latitude);derive(latitude int16 [1e-2degree_north])' yourfile.nc

Hello!
I am a new user and mainly new to process earth observation S5p data.I have already downloaded the files about an area and I want to find & estimate the PM 2.5 aerosol in specific coordinates of this area. Have you any idea what is the process via VISAN & HARP?
The RUS copernicus video in Youtube shows the way about Air quality monitoring with Sentinel-5p.What more I have to do,in your opinion? Is there any option(info)?

Dear Margarita,

S5P does not measure PM2.5.

As far as I know there is no satellite that can directly measure PM2.5. There are some ways to derive it from satellite AOD data as is described here, but S5P also does not provide AOD.

1 Like

Hi Sander,

I feel like VISAN does not handle the S5P Aerosol index products nominally.
Now when I perform a HARP import operation within VISAN on such a product and then when I simply write wplot(product) it plots the surface wind velocity instead of the aerosol index. So I went around this by fully writing wplot(product.latitude.data, product.longitude.data, product.absorbing_aerosol_index.data) but I wanted to check with you whether I was doing something wrong?

Best,

Simon

Hi Simon,

VISAN includes some generic routines to try to figure out the main variable in a harp product. Since AAI is often an ancillary parameter for other products, it gets deprioritized.

We will check whether can tune the variable selection so it selects AAI correctly for this product.

In any case, an easier way to specify which variable to use for plotting is to use the value parameter:

wplot(product, value="absorbing_aerosol_index")
1 Like

Hi everybody,

When going through the RUS Training ATMO01 Air Quality Monitoring (https://rus-copernicus.eu/portal/wp-content/uploads/library/education/training/ATMO01_AirQuality_Monitoring.pdf), I’ve run into two problems with harpmerge (Anaconda PowerShell, Windows 10):

  • When using * in the filename to combine more input files, I only get the ‘could not find’ error message, exactly as described here. For now, I have solved it by specifying a directory (so that all files in that directory will be included; which means having to copy the respective files for every daily averaged product separately) but I’d like to know why the syntax with * is not working…

  • After generating the daily averaged files, I wanted to combine them to create a monthly averaged file (following the description in the RUS Training, except of specifying a directory instead of using *), using the following:
    harpmerge -ap ‘bin(); squash(time, (latitude,longitude))’ ./ …/…/Processing/NO2/Global/l3_NO2_201901_quality_75.nc
    but I only get
    ERROR: invalid variable ‘latitude_bounds_weight’ (should be have same initial dimensions as ‘weight’)

Unfortunately, I cannot find any reference for such an error. Could you please help me and let me know what should I change to be able to generate monthly average?
Thank you very much in advance!
Lenka

1 Like

Hi Lenka,

The * doesn’t work on Windows since on Linux and macOS it is actually replaced by the shell into a list of filenames before the harp command is executed. On Windows the * gets passed directly to the harp tool and harp then indeed complains that it can’t find a file that is called ‘*’.

Regarding the weight error, please make sure you generate all files with the latest version of HARP (v1.9.2). There are some important fixes in HARP especially for this case.

Hi Sander,

Thank you very much.
As for *, I thought it could be something like that. Luckily, there are ways to deal with it.

As for HARP version, I do have 1.9.2 and get the weight error anyway.
Any ideas how can I solve this?


I’ve opened the daily averaged files in QGIS and checked the Layer Properties; for latitude_bounds_weight, it shows Dimensions X: 2 Y: 230 Bands: 1; while for weight (as well as *tropospheric_NO2_column_number_density etc), it’s X: 720 Y: 230 Bands: 1

The daily averages were generated according to the RUS Training using, e.g.,
harpmerge -ap 'bin(); squash(time, (latitude,longitude))' -a 'latitude > -55 [degree_north]; latitude < 60 [degree_north]; tropospheric_NO2_column_number_density_validity > 75; bin_spatial(231,-55,0.5,721,-180,0.5); derive(longitude {longitude});derive(latitude {latitude})' ./ ../../Processing/NO2/l3_NO2_20190124_quality_75.nc

It seems like you stumbled upon a different bug in HARP. We will try to fix this soon.

In the mean time you can work around this by excluding the lat/lon weight variables after you performed a bin() operation.
So, wherever you now use:

-ap 'bin(); squash(time, (latitude,longitude))'

you should use:

-ap 'bin(); squash(time, (latitude,longitude)); exclude(latitude_bounds_weight,longitude_bounds_weight,latitude_weight,longitude_weight)'

Hi Sander,

Thanks again! Using your code, I was able to come further and combine several daily averaged files (tried with 5 daily files in the beginning, everything seemed to be going well), so that’s great!

However, there’s another issue obviously: after generating all 31 daily files for January 2020, I have found I’m not able to combine them, because no output file is generated (and no error message shown). There were several GB free RAM left when harpmerge was running so that should not be an issue. But what then?

I’ve experimented a bit and found that combining my daily files for Jan 1-Jan 24 worked well, but when adding Jan 25, no file was generated anymore. Adding Jan 26 or Jan 27 up Jan1-Jan 24 worked, but adding another file (Jan 28 or Jan 29 or Jan 30 or Jan 31) => again, no output. Then, I removed the added files and went back to Jan 1-Jan 24 and to my huge surprise, haven’t got any output while it worked before. I don’t understand it.

So, what I’ve done for now: I’ve combined the daily files for Jan1-Jan16, and, separately, Jan17-Jan31, and finally, combined these two half-monthly files. Hence, the resulting file will not be exactly the average of all the daily files but the difference should be small enough. But I think this is not how harpmerge is supposed to behave, or at least I haven’t seen any remark on limited number of input files.

Hi everybody,

When I try to run RUS Training ATMO01 Air Quality Monitoring using Windows 10 I have similar problem “The system cannot find the file specified.”
harpmerge -a ‘latitude > 32 [degree_north]; latitude < 42 [degree_north]; longitude > -126 [degree_east]; longitude < -116 [degree_east]; absorbing_aerosol_index_validity>80; bin_spatial(101,32,0.1,101,-126,0.1); derive(longitude {longitude});derive(latitude {latitude})’ e:\ResultsS5P\ e:\ResultsS5P\result.nc

I would like to ask you what is the difference between the results obtain by runing the tutorial and the data provide by Earth Engine Data Catalog? https://developers.google.com/earth-engine/datasets/catalog/COPERNICUS_S5P_NRTI_L3_NO2

Also, I would like to ask you if it can be used the results regarding air pollutants NO2 detected by S-5P at local level (cities level) or the resolution is too low?

Thank you in advance!
Kinga

Hi Lenka,

How are you providing the files as input to the harpmerge command?
Are you providing them one-by-one as arguments? Or do you put them into a subdirectory and provide the path to that subdirectory as input?

On Windows there is a limit to the command line length, but I am not sure whether that would be the cause of your issue.
In any case, please try the approach of putting all daily files into a subdirectory and have harpmerge run on that subdirectory (as you did with creating the daily gridded file from the orbit files).

Otherwise I have no idea what might be going wrong for you. The problem doesn’t sound familiar.

Hi Kinga,

First, I am not familiar with the RUS Training. So unless the creators of this training are members of this forum and respond, you might want to contact them directly.

The “The system cannot find the file specified.” error is probably because you haven’t installed HARP yet (or not correctly).

Regarding the NO2 at local level, the resolution of S5P is about 3.5km by 5.6km.

Hi Sander,
Indeed, what I did was putting all the files into a subdirectory and using the code

harpmerge -ap ‘bin(); squash(time, (latitude,longitude))’ ./ …/result_202001_quality_75.nc

Up to some number of files in that subdirectory, it did work, but after adding more files, it didn’t anymore. Very strange…

Regarding the invalid variable latitude_bounds_weight error, a better solution to this problem is to actually squash the time dimensions of the lat/lon variables before performing the bin() operation:

-ap 'squash(time, (latitude,longitude,latitude_bounds,longitude_bounds));bin()'

This approach is the preferred approach since the squash operation will raise errors if you try to merge grids with different lat/lon axis (whereas performing a bin() first would just average the axis values.

1 Like