# How is the average of a HARP "bin" calculated? Role of weight and count variables?

I understand that since HARP 1.9 or so the HARP bin operation no longer computes an arithmetic average but a weighted average.
Suppose, for example, a “point_distance(lat,lon,50 [km]);bin();” operation on satellite orbit files in which the satellite pixel dimensions are present (i.e., there are latitude_bounds and longitude_bounds variables).

The result is a weighted average. But how is it exactly calculated? By weighting according to the pixel area?

In the resulting file there are the new “weight” and “count” variables.

“weight” will be used in subsequent bin operations, correct?
Does “count” play a role in subsequent bin operations? Perhaps when the weight variable has been excluded beforehand?

For a limited number of variables, there is now a corresponding “_weight” variable:
latitude_weight, longitude_weight, solar_zenith_angle_weight, solar_azimuth_angle_weight, sensor_zenith_angle_weight, sensor_azimuth_angle_weight

What is the role of these new _weight variables? Why only for these variables?

The best documentation on this is currently the documentation of the C library functions harp_product_bin_spatial and harp_product_bin.

We know that this is something we need to document better.

Thanks. As I use the “bin” operation, I guess “harp_product_bin” (and not harp_product_bin_spatial) is the relevant C function.
The explanation for harp_product_bin_spatial seems a bit more complete than for harp_product_bin.

harp_product_bin_spatial:
“In addition, a ‘weight’ variable will be added that will contain the sum of weights for the contribution to each cell. If a variable contained NaN values then a variable specific weight variable will be created with only the sum of weights for the non-NaN entries.”

-> I guess the same holds for harp_product_bin, with “to each cell” replaced by “to each bin”?
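To make the quoted sentence concrete, here is a minimal sketch of how a general weight and a variable-specific weight could differ when some entries are NaN. It is plain Python, not HARP code; the function name `bin_with_nan` and the example numbers are hypothetical, and the handling of an all-NaN bin is an assumption.

```python
import math

def bin_with_nan(values, weights):
    """Weighted mean that skips NaN entries (illustrative only, not HARP's code).

    Returns the mean, the sum of all weights, and the sum of the weights
    of the non-NaN entries (the "variable specific" weight in the quote).
    """
    total_weight = sum(weights)
    valid = [(v, w) for v, w in zip(values, weights) if not math.isnan(v)]
    variable_weight = sum(w for _, w in valid)
    if variable_weight > 0:
        mean = sum(v * w for v, w in valid) / variable_weight
    else:
        mean = math.nan  # assumption: a bin with no valid entries stays NaN
    return mean, total_weight, variable_weight

# Hypothetical samples contributing to one cell/bin; the second one is NaN.
mean, total_weight, variable_weight = bin_with_nan(
    [2.0, math.nan, 4.0], [0.5, 0.25, 0.25]
)
```

In this sketch the general weight is 1.0, while the variable-specific weight is only 0.75 because the NaN sample does not contribute to it.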

In harp_product_bin_spatial, area binning means that “each sample will be allocated to each lat/lon grid cell based on the amount of overlap.”

-> For harp_product_bin, there are no grid cells and so no overlap can be calculated. So I guess weighting uses surface area, not amount of overlap?

harp_product_bin_spatial actually does both temporal binning and spatial binning (although in practice you would mostly use it with a single temporal target bin).
harp_product_bin only does the temporal binning.

The documentation is pretty accurate. There is no ‘spatial overlap’ in the temporal binning case (harp_product_bin). Weight variables are only created for angular variables (to store the length of the combined unit vectors).
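As an illustration of why an angular variable would need its own weight, here is a minimal sketch of averaging angles via unit vectors. It is plain Python, not HARP's implementation; the function names are hypothetical, and the way the stored length is reused in a second averaging step is an assumption based on the remark about "the length of the combined unit vectors".

```python
import math

def average_angles(angles_deg):
    """Average angles by summing their unit vectors (illustrative sketch).

    Returns the mean angle in degrees and the length of the summed vector.
    """
    x = sum(math.cos(math.radians(a)) for a in angles_deg)
    y = sum(math.sin(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(y, x)), math.hypot(x, y)

def combine_averaged_angles(binned):
    """Combine (mean_angle, length) pairs from an earlier averaging step.

    Assumption: the stored length acts as the weight of each mean angle,
    so that the original vector sum can be rebuilt.
    """
    x = sum(length * math.cos(math.radians(a)) for a, length in binned)
    y = sum(length * math.sin(math.radians(a)) for a, length in binned)
    return math.degrees(math.atan2(y, x)), math.hypot(x, y)

# Two angles on either side of north: the arithmetic mean would be 180.
first_angle, first_length = average_angles([350.0, 10.0])
second_angle, second_length = average_angles([20.0])

# Second step using the stored lengths, compared with a direct average.
two_step = combine_averaged_angles(
    [(first_angle, first_length), (second_angle, second_length)]
)
direct = average_angles([350.0, 10.0, 20.0])
```

The length of the summed vector (just under 2 for the first pair) cannot be derived from a sample count alone, which would explain why these variables carry a separate weight.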

Ok.
Does this mean that in the above example (bin() per orbit file), an arithmetic average of the individual pixels will be taken?
And would, in a subsequent bin() operation (combining several “binned-per-overpass” instances), the average be weighted using the “count” variable?

That is indeed how it will happen.
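A small numerical sketch of this two-step behaviour, assuming the first bin() takes a plain arithmetic mean and stores the number of samples, and the second bin() weights each per-overpass mean by that count. This is plain Python with hypothetical function names and made-up values, not HARP code.

```python
def bin_samples(values):
    """First step: arithmetic mean of the samples plus the sample count."""
    return sum(values) / len(values), len(values)

def bin_binned(binned):
    """Second step: combine (mean, count) pairs into a count-weighted mean."""
    total_count = sum(count for _, count in binned)
    mean = sum(mean * count for mean, count in binned) / total_count
    return mean, total_count

# Hypothetical pixel values from two overpasses with different pixel counts.
overpass_a = [1.0, 2.0, 3.0]
overpass_b = [10.0]

per_overpass = [bin_samples(overpass_a), bin_samples(overpass_b)]
combined_mean, combined_count = bin_binned(per_overpass)

# Reference: a single arithmetic mean over all pixels at once.
direct_mean, direct_count = bin_samples(overpass_a + overpass_b)
```

With count weighting, the two-step result (4.0) matches the mean over all four pixels. An unweighted mean of the two per-overpass means would give 6.0 instead, over-representing the overpass with a single pixel.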