Descriptive statistics

descriptive_statistics_dataframe(input_data, column)

Generate descriptive statistics from vector data.

Generates the minimum, maximum, mean, quantiles (25%, 50% and 75%), standard deviation, relative standard deviation and skewness of the selected column.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input_data | Union[DataFrame, GeoDataFrame] | Data to generate descriptive statistics from. | required |
| column | str | Specify the column to generate descriptive statistics from. | required |

Returns:

| Type | Description |
| --- | --- |
| dict | The descriptive statistics, in the order listed in the description above. |

Source code in eis_toolkit/exploratory_analyses/descriptive_statistics.py
@beartype
def descriptive_statistics_dataframe(input_data: Union[pd.DataFrame, gpd.GeoDataFrame], column: str) -> dict:
    """Generate descriptive statistics from vector data.

    Generates min, max, mean, quantiles(25%, 50% and 75%), standard deviation, relative standard deviation and skewness.

    Args:
        input_data: Data to generate descriptive statistics from.
        column: Specify the column to generate descriptive statistics from.

    Returns:
        The descriptive statistics in previously described order.
    """
    if column not in input_data.columns:
        raise InvalidColumnException
    data = input_data[column]
    statistics = _descriptive_statistics(data)
    return statistics
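
The listing above delegates to the private helper `_descriptive_statistics`, whose body is not shown on this page. As a rough illustration of the statistics named in the description, here is a minimal standard-library sketch. The function name `describe`, the dictionary keys, the use of the sample standard deviation, and the biased moment-based skewness are all assumptions made for this example; the toolkit's helper may use different keys or estimators.

```python
import math
import statistics


def describe(values):
    """Return the statistics named above, in the documented order.

    Illustrative only: key names and estimators are assumptions,
    not the toolkit's actual implementation.
    """
    data = [float(v) for v in values]
    n = len(data)
    if n < 2:
        raise ValueError("at least two values are needed")

    mean = statistics.fmean(data)
    # "inclusive" interpolates linearly between the sorted data points.
    q25, q50, q75 = statistics.quantiles(data, n=4, method="inclusive")
    # Sample standard deviation (divides by n - 1); an assumption.
    std = statistics.stdev(data)

    # Biased skewness: third central moment over the second raised to 1.5.
    m2 = sum((x - mean) ** 2 for x in data) / n
    m3 = sum((x - mean) ** 3 for x in data) / n
    skew = m3 / m2 ** 1.5 if m2 > 0 else math.nan

    return {
        "min": min(data),
        "max": max(data),
        "mean": mean,
        "25%": q25,
        "50%": q50,
        "75%": q75,
        "standard_deviation": std,
        # Relative standard deviation is undefined when the mean is zero.
        "relative_standard_deviation": std / mean if mean != 0 else math.nan,
        "skew": skew,
    }


stats = describe([1, 2, 3, 4, 5])
```

Because Python dictionaries preserve insertion order, iterating over the result yields the statistics in the same order in which the description lists them.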

descriptive_statistics_raster(input_data)

Generate descriptive statistics from raster data.

Generates the minimum, maximum, mean, quantiles (25%, 50% and 75%), standard deviation, relative standard deviation and skewness. Nodata values are removed from the data before the statistics are computed.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| input_data | DatasetReader | Data to generate descriptive statistics from. | required |

Returns:

| Type | Description |
| --- | --- |
| dict | The descriptive statistics, in the order listed in the description above. |

Source code in eis_toolkit/exploratory_analyses/descriptive_statistics.py
@beartype
def descriptive_statistics_raster(input_data: rasterio.io.DatasetReader) -> dict:
    """Generate descriptive statistics from raster data.

    Generates min, max, mean, quantiles(25%, 50% and 75%), standard deviation, relative standard deviation and skewness.
    Nodata values are removed from the data before the statistics are computed.

    Args:
        input_data: Data to generate descriptive statistics from.

    Returns:
        The descriptive statistics in previously described order.
    """
    data = input_data.read().flatten()
    nodata_value = input_data.nodata
    data = data[data != nodata_value]
    statistics = _descriptive_statistics(data)
    return statistics
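
The nodata filter in the listing above relies on a plain `!=` comparison. Two cases may deserve attention when adapting it: a raster with no nodata value defined (rasterio then reports `nodata` as `None`), and a nodata value of NaN, which never compares equal to anything, itself included, so `!=` keeps every pixel. The sketch below illustrates the distinction with plain Python lists standing in for the flattened raster array. The helper name `drop_nodata` is hypothetical and not part of the toolkit.

```python
import math


def drop_nodata(values, nodata):
    """Return the values that are not nodata.

    Hypothetical helper: covers a missing nodata value (None)
    and a NaN nodata value, which a plain `!=` test does not catch.
    """
    if nodata is None:
        # No nodata value defined: nothing to remove.
        return list(values)
    if isinstance(nodata, float) and math.isnan(nodata):
        # NaN != NaN is always True, so test with isnan instead.
        return [v for v in values if not math.isnan(v)]
    return [v for v in values if v != nodata]


pixels = [1.0, -9999.0, 2.5, float("nan"), 4.0]
kept_sentinel = drop_nodata(pixels, -9999.0)   # removes only the -9999.0 sentinel
kept_nan = drop_nodata(pixels, float("nan"))   # removes only the NaN pixel
kept_all = drop_nodata(pixels, None)           # removes nothing
```

With a NumPy array the same idea would typically be expressed with a boolean mask, testing for NaN explicitly when the nodata value is NaN.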