Normality test
normality_test_array(data, bands=None, nodata_value=None)
Compute the Shapiro-Wilk test for normality on the input NumPy array.
A 3D input array is assumed to represent a multiband raster whose first dimension is the number of bands (the same shape Rasterio produces when it reads a raster into an array). Normality is tested for each band separately. NaN values and, optionally, a specified nodata value are masked out before the calculations.
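
The per-band masking described above can be sketched with SciPy's `scipy.stats.shapiro`. This is only an illustration of the idea, not the toolkit's implementation: the helper name `shapiro_per_band` and the result keys are hypothetical.

```python
import numpy as np
from scipy.stats import shapiro


def shapiro_per_band(data, nodata_value=None):
    """Hypothetical helper: Shapiro-Wilk on each band of a (bands, rows, cols) array."""
    results = {}
    for index, band in enumerate(data, start=1):
        values = band.astype(float).ravel()
        # Mask out NaN values and, if given, the nodata value.
        mask = np.isnan(values)
        if nodata_value is not None:
            mask |= values == nodata_value
        statistic, p_value = shapiro(values[~mask])
        # The key names below are illustrative assumptions.
        results[f"band_{index}"] = {
            "statistic": float(statistic),
            "p_value": float(p_value),
        }
    return results


rng = np.random.default_rng(0)
raster = rng.normal(size=(2, 20, 20))  # two bands of 400 cells each
raster[0, 0, 0] = np.nan               # a missing cell in band 1
raster[1, 0, 0] = -9999.0              # a nodata cell in band 2
band_results = shapiro_per_band(raster, nodata_value=-9999.0)
```

Without the mask, the single `-9999.0` cell would dominate the second band and the test would reject normality for reasons unrelated to the data itself.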
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | ndarray | Numpy array containing the input data. Array should either be 1D, 2D or 3D. | required |
bands | Optional[Sequence[int]] | Band selection. Applies only if input array is 3D. If None, normality is tested for each band. | None |
nodata_value | Optional[Number] | Nodata value to be masked out. Optional parameter. | None |
Returns:
Type | Description |
---|---|
Dict[str, Dict[str, float]] | Test statistic and p_value for each selected band in a dictionary. |
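
A result of this type can be screened against a significance level. The sketch below makes no claim about the exact key names used in the returned dictionaries, which this section does not state, so the p-value key is passed in as a parameter. The helper name and the example values are made up for illustration.

```python
from typing import Dict, List


def entries_rejecting_normality(
    results: Dict[str, Dict[str, float]], p_key: str, alpha: float = 0.05
) -> List[str]:
    """Hypothetical helper: names whose p-value falls below the significance level."""
    return [name for name, stats in results.items() if stats[p_key] < alpha]


# Made-up values for illustration, not real output of the function.
example_results = {
    "first": {"statistic": 0.99, "p_value": 0.62},
    "second": {"statistic": 0.81, "p_value": 0.001},
}
rejected = entries_rejecting_normality(example_results, p_key="p_value")
```

A p-value below the chosen level means the hypothesis of normality is rejected for that entry. A p-value above it only means the test found no evidence against normality.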
Raises:
Type | Description |
---|---|
EmptyDataException | The input data is empty. |
InvalidRasterBandException | One or more of the selected bands were not found in the input data. |
InvalidDataShapeException | The input data has an incorrect number of dimensions (more than 3). |
SampleSizeExceededException | The input data exceeds the maximum of 5000 samples. |
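
For rasters larger than the 5000-sample limit, one possible workaround is to draw a random subsample before testing. The sketch below is an assumption about how a caller might do this, not something the toolkit does itself. The helper name `subsample_band` is hypothetical, and this section does not say whether the limit is counted before or after masking.

```python
import numpy as np

MAX_SAMPLES = 5000  # limit stated for SampleSizeExceededException


def subsample_band(band, max_samples=MAX_SAMPLES, seed=0):
    """Hypothetical helper: randomly keep at most max_samples values from one band."""
    values = np.asarray(band, dtype=float).ravel()
    if values.size <= max_samples:
        return values
    rng = np.random.default_rng(seed)
    # Sample without replacement so that no cell is counted twice.
    return rng.choice(values, size=max_samples, replace=False)


large_band = np.random.default_rng(1).normal(size=(100, 100))  # 10,000 cells
sample = subsample_band(large_band)
```

The resulting 1D array is within the limit and could then be passed as `data`. Fixing the seed keeps the subsample, and therefore the test result, reproducible.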
Source code in eis_toolkit/exploratory_analyses/normality_test.py
normality_test_dataframe(data, columns=None)
Compute the Shapiro-Wilk test for normality on the input DataFrame.
Nodata values are dropped automatically.
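
The column-wise behaviour can be sketched with pandas and SciPy. This is an illustration only, not the toolkit's implementation: it assumes that "nodata" means missing values (NaN) and drops them per column, and the helper name `shapiro_per_column` and the result keys are hypothetical.

```python
import numpy as np
import pandas as pd
from scipy.stats import shapiro


def shapiro_per_column(df, columns=None):
    """Hypothetical helper: Shapiro-Wilk on each selected column, ignoring missing values."""
    selected = list(columns) if columns is not None else list(df.columns)
    results = {}
    for column in selected:
        # Assumption: nodata means NaN, dropped separately for each column.
        values = df[column].dropna().to_numpy(dtype=float)
        statistic, p_value = shapiro(values)
        results[column] = {"statistic": float(statistic), "p_value": float(p_value)}
    return results


rng = np.random.default_rng(0)
frame = pd.DataFrame({"a": rng.normal(size=200), "b": rng.exponential(size=200)})
frame.loc[0, "a"] = np.nan  # a missing value that gets dropped
column_results = shapiro_per_column(frame)
```

In this example column `b` is drawn from an exponential distribution, so its p-value is very small and normality is rejected for it.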
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data | DataFrame | Dataframe containing the input data. | required |
columns | Optional[Sequence[str]] | Column selection. If None, normality is tested for all columns. | None |
Returns:
Type | Description |
---|---|
Dict[str, Dict[str, float]] | Test statistic and p_value for each selected column in a dictionary. |
Raises:
Type | Description |
---|---|
EmptyDataException | The input data is empty. |
InvalidColumnException | One or more of the selected columns were not found in the input data. |
NonNumericDataException | The selected columns contain non-numeric data, or no numeric columns were found. |
SampleSizeExceededException | The input data exceeds the maximum of 5000 samples. |
Source code in eis_toolkit/exploratory_analyses/normality_test.py