Data Description
Satellites with global coverage provide several variables that are related to soil moisture. This section gives an overview of these variables, including their description and their spatial distribution.
00 Variables Definition
ERS Backscatter
The European Remote Sensing Satellite (ERS) scatterometer sends out a radiation pulse and measures the amount of radiation that is reflected or scattered back in the direction of the instrument. The fraction of radiation received at the instrument is called the backscatter and is expressed in decibels [dB]. Observations from active microwave instruments are very sensitive to soil moisture, because the presence of water in the soil strongly changes the soil's dielectric properties and hence its scattering properties.
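Since the backscatter is expressed in decibels, it may help to recall how a linear power ratio maps to that logarithmic unit. The sketch below is only illustrative: the function name `to_decibel` and the two ratios are hypothetical and are not taken from the ERS data.

```python
import math


def to_decibel(power_ratio: float) -> float:
    """Convert a linear power ratio (received / transmitted) to decibels."""
    if power_ratio <= 0:
        raise ValueError("power ratio must be positive")
    return 10.0 * math.log10(power_ratio)


# Hypothetical ratios, chosen only to illustrate the conversion:
# a surface that scatters back ten times more power is 10 dB higher.
weak_return_db = to_decibel(0.01)
strong_return_db = to_decibel(0.1)

print(f"weak return:   {weak_return_db:.1f} dB")
print(f"strong return: {strong_return_db:.1f} dB")
```

Because the scale is logarithmic, a difference of 10 dB between two observations corresponds to a factor of ten in received power.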
SSM/I Surface Emissivities
The Special Sensor Microwave/Imager (SSM/I) instrument measures the brightness temperature, defined as the temperature a perfect black body would need to have in order to emit the same amount of radiation. The instrument measures at both vertical and horizontal polarization. At horizontal polarization the emissivities are much more sensitive to soil moisture than at vertical polarization. In general, however, both passive microwave observations are less strongly related to soil moisture than the active microwave observations.
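The link between brightness temperature and emissivity can be sketched under a strong simplification: in the microwave range (Rayleigh-Jeans approximation) and neglecting any atmospheric contribution, the brightness temperature is roughly the emissivity times the physical surface temperature. The function name and all numbers below are hypothetical; the actual emissivity product may well be derived with a more complete treatment.

```python
def emissivity(brightness_temp_k: float, surface_temp_k: float) -> float:
    """Estimate emissivity as brightness temperature / physical temperature.

    Assumes the Rayleigh-Jeans approximation and ignores the atmosphere.
    """
    if surface_temp_k <= 0:
        raise ValueError("surface temperature must be positive (kelvin)")
    return brightness_temp_k / surface_temp_k


# Hypothetical brightness temperatures for one surface at 300 K,
# one value per polarization.
surface_temp_k = 300.0
e_h = emissivity(255.0, surface_temp_k)  # horizontal polarization
e_v = emissivity(285.0, surface_temp_k)  # vertical polarization

print(f"emissivity_h = {e_h:.2f}, emissivity_v = {e_v:.2f}")
```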
Surface Temperature Diurnal Amplitude
The amplitude of the diurnal surface temperature cycle is based on the infrared surface temperature measurements made available by the International Satellite Cloud Climatology Project (ISCCP). It can be used as an indicator of soil moisture because the thermal inertia of the soil depends on its moisture content. A wet soil has a higher thermal inertia, which means that its surface temperature changes less for a given incident solar flux. In a dry soil, on the other hand, the same incident solar flux results in a stronger temperature change and hence a larger diurnal cycle amplitude.
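To make the wet versus dry contrast concrete, the sketch below computes a diurnal amplitude from a series of surface temperatures. It assumes the amplitude is taken as the difference between the daily maximum and minimum, which may differ from the definition used in the ISCCP-based product; the temperature series are hypothetical.

```python
def diurnal_amplitude(temps_k: list[float]) -> float:
    """Return daily maximum minus daily minimum surface temperature."""
    if not temps_k:
        raise ValueError("at least one temperature is required")
    return max(temps_k) - min(temps_k)


# Hypothetical surface temperatures (kelvin) sampled over one day.
wet_soil = [288.0, 290.0, 294.0, 296.0, 293.0, 289.0]
dry_soil = [284.0, 290.0, 302.0, 308.0, 298.0, 287.0]

wet_amplitude = diurnal_amplitude(wet_soil)  # 8.0 K
dry_amplitude = diurnal_amplitude(dry_soil)  # 24.0 K

print(f"wet soil: {wet_amplitude} K, dry soil: {dry_amplitude} K")
```

With these illustrative numbers the dry soil shows the larger amplitude, in line with its lower thermal inertia.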
NDVI Data Product
The normalized difference vegetation index (NDVI) quantifies the amount of vegetation present on the observed part of the surface. It is computed from the reflectance measured in the visible and near-infrared parts of the spectrum. NDVI and soil moisture are strongly related, since more plants grow on moist soil than on dry soil. However, the exact relationship between the two depends on the region observed.
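The standard NDVI formula is the normalized difference between the near-infrared and the red (visible) reflectance, (NIR - RED) / (NIR + RED). A minimal sketch follows; the function name and the reflectance values are hypothetical and only illustrate the computation.

```python
def ndvi(nir: float, red: float) -> float:
    """Normalized difference vegetation index from two reflectances."""
    total = nir + red
    if total == 0:
        return float("nan")  # undefined when both reflectances are zero
    return (nir - red) / total


# Hypothetical reflectances: dense vegetation reflects strongly in the
# near-infrared and absorbs red light, bare soil much less so.
vegetated = ndvi(nir=0.5, red=0.1)
bare_soil = ndvi(nir=0.3, red=0.25)

print(f"vegetated: {vegetated:.2f}, bare soil: {bare_soil:.2f}")
```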
Soil Moisture LMD Dataset
Training the neural network requires a global target dataset of soil moisture. In this analysis, that role is filled by the soil moisture product of the land surface model developed by the Laboratoire de Météorologie Dynamique (LMD) in Paris. The model's soil moisture product is provided as an index ranging between 0 and 1, where 0 corresponds to the local wilting point and 1 corresponds to the local saturation level.
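One way to picture such an index is as a rescaling of the soil water content between the two local reference levels. The sketch below assumes a linear scaling, which is an assumption: the text only fixes the two end points, and the LMD model may define the index differently. The function name and the values are hypothetical.

```python
def soil_moisture_index(theta: float, wilting_point: float, saturation: float) -> float:
    """Scale water content linearly to [0, 1] between wilting point and saturation."""
    if saturation <= wilting_point:
        raise ValueError("saturation must exceed the wilting point")
    index = (theta - wilting_point) / (saturation - wilting_point)
    return min(1.0, max(0.0, index))  # clip to the valid index range


# Hypothetical volumetric water contents (m3/m3) for one location.
wilting, saturated = 0.10, 0.45
at_wilting = soil_moisture_index(0.10, wilting, saturated)
at_saturation = soil_moisture_index(0.45, wilting, saturated)
in_between = soil_moisture_index(0.275, wilting, saturated)

print(at_wilting, at_saturation, round(in_between, 2))
```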
The dataset with all the satellite observations is provided in CSV format and contains more than 600,000 datapoints. The latitude and longitude are given for each datapoint. The datapoints do not follow any standard grid; their original distribution is preserved, without any regridding or interpolation. Values of backscatter, emissivities, temperature amplitude, NDVI and the LMD soil moisture data product are also given for each datapoint. Because the statistical analysis requires valid values, the entries containing Not-a-Number (NaN) values are removed, leaving a subset of around 45,000 valid datapoints.
01 Remove the Not-a-Number
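The NaN removal described above can be sketched on a small, made-up table. The snippet assumes that missing values appear as the string ' NaN' (with a leading space) and uses hypothetical values; only the column names follow the dataset.

```python
import pandas as pd

# Hypothetical sample: missing values are stored as the string ' NaN'.
sample = pd.DataFrame(
    {
        "latitude": [10.5, 11.5, 12.5],
        "backscatter": ["-12.3", " NaN", "-10.8"],
        "ndvi": ["0.41", "0.37", " NaN"],
    }
)

# Turn the ' NaN' strings into real missing values, then drop every
# row that has at least one missing entry.
sample = sample.replace(" NaN", pd.NA)
clean = sample.dropna()

n_removed = len(sample) - len(clean)
print(f"kept {len(clean)} of {len(sample)} rows, removed {n_removed}")
```

Note that `dropna()` discards a row as soon as any single variable is missing, which is why the number of valid datapoints can be much smaller than the number of rows in the file.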
02 Spatial Distribution
The spatial distribution of the different variables is shown here. Overall, the available data is located at lower latitudes, except for some patches in North and South America and some parts of Africa. The target soil moisture data product shows high values along the equator and also in some specific regions in northern Europe and North America.
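A quick numerical complement to the maps is to count the datapoints and average the soil moisture per latitude band. The sketch below uses a handful of hypothetical points and arbitrary band limits; it is not derived from the actual dataset.

```python
import pandas as pd

# Hypothetical datapoints; only the column names follow the dataset.
points = pd.DataFrame(
    {
        "latitude": [-35.0, -5.0, 2.0, 8.0, 48.0, 55.0],
        "lmd_soilWetness": [0.30, 0.80, 0.90, 0.70, 0.60, 0.65],
    }
)

# Assign each point to a latitude band (the band limits are an arbitrary choice).
bands = pd.cut(
    points["latitude"],
    bins=[-90, -30, 30, 90],
    labels=["south", "tropics", "north"],
)

# Number of points and mean soil moisture index per band.
summary = points.groupby(bands, observed=False)["lmd_soilWetness"].agg(["count", "mean"])
print(summary)
```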
03 Basic Statistics
With the Pandas package in Python, we can read the data and compute basic statistics, focusing on the mean, maximum and minimum values of each variable.
Pandas DataFrame
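import pandas as pd

# Read the tab-separated input file (it has no header row)
data = pd.read_csv("./TP_3487347", sep="\t", header=None)

# Assign column names
data.columns = ["cellNr", "latitude", "longitude", "backscatter", "emissivity_v",
                "emissivity_h", "ts_amplitude", "ndvi", "lmd_soilWetness"]

# Replace the ' NaN' strings with missing values and drop incomplete rows
data = data.replace(" NaN", pd.NA)
data = data.dropna()

# Columns that contained ' NaN' are read as text; convert to numbers
# so that describe() reports mean, min and max
data = data.astype(float)

# Deliver basic statistics
data.describe()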
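`describe()` reports more than the three statistics of interest (it also includes the count, standard deviation and quartiles). If only the mean, maximum and minimum are needed, they can be requested directly with `agg`. The sketch below runs on a small hypothetical table, not on the actual dataset.

```python
import pandas as pd

# Hypothetical values; only the column names follow the dataset.
sample = pd.DataFrame(
    {
        "backscatter": [-12.0, -10.0, -14.0],
        "ndvi": [0.2, 0.4, 0.6],
    }
)

# One row per statistic, one column per variable.
summary = sample.agg(["mean", "max", "min"])
print(summary)
```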