Weights of evidence

`weights_of_evidence_calculate_responses(output_arrays, nr_of_deposits, nr_of_pixels)`

Calculate the posterior probabilities for the given generalized weight arrays.

Parameters:

Name	Type	Description	Default
`output_arrays`	`Sequence[Dict[str, ndarray]]`	List of output array dictionaries returned by weights of evidence calculations. For each dictionary, the generalized weight and generalized standard deviation arrays are used and summed together pixel-wise to calculate the posterior probabilities. If generalized arrays are not found, the W+ and S_W+ arrays are used instead (this is the case when the outputs come from unique weight calculations).	required
`nr_of_deposits`	`int`	Number of deposit pixels in the input data for weights of evidence calculations.	required
`nr_of_pixels`	`int`	Number of evidence pixels in the input data for weights of evidence calculations.	required

Returns:

Type	Description
`ndarray`	Array of posterior probabilities.
`ndarray`	Array of standard deviations in the posterior probability calculations.
`ndarray`	Array of confidence values for the prospectivity estimates in the posterior probability array.
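
The calculation performed by this function can be summarized with the formulas below. The notation is introduced here for illustration and does not come from the library: $N_D$ is `nr_of_deposits`, $N$ is `nr_of_pixels`, $W_j$ and $s_j$ are the weight and standard deviation arrays of evidence layer $j$, and all sums are taken pixel-wise over the layers.

```latex
\begin{aligned}
p_0 &= \frac{N_D}{N}, \qquad
\operatorname{logit}(p_0) = \ln\frac{p_0}{1 - p_0} \\
P &= \frac{\exp\!\big(\sum_j W_j + \operatorname{logit}(p_0)\big)}
          {1 + \exp\!\big(\sum_j W_j + \operatorname{logit}(p_0)\big)} \\
\sigma_P &= \sqrt{\Big(\frac{1}{N_D} + \sum_j s_j^2\Big)\, P^2} \\
C &= \frac{P}{\sigma_P} = \frac{1}{\sqrt{\frac{1}{N_D} + \sum_j s_j^2}}
\end{aligned}
```

The last equality holds wherever $P > 0$, so the confidence array depends only on the number of deposits and the summed weight variances.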

Source code in eis_toolkit/prediction/weights_of_evidence.py

@beartype
def weights_of_evidence_calculate_responses(
    output_arrays: Sequence[Dict[str, np.ndarray]], nr_of_deposits: int, nr_of_pixels: int
) -> Tuple[np.ndarray, np.ndarray, np.ndarray]:
    """Calculate the posterior probabilities for the given generalized weight arrays.

    Args:
        output_arrays: List of output array dictionaries returned by weights of evidence calculations.
            For each dictionary, generalized weight and generalized standard deviation arrays are used and summed
            together pixel-wise to calculate the posterior probabilities. If generalized arrays are not found,
            the W+ and S_W+ arrays are used instead (this is the case when the outputs come from unique weight
            calculations).
        nr_of_deposits: Number of deposit pixels in the input data for weights of evidence calculations.
        nr_of_pixels: Number of evidence pixels in the input data for weights of evidence calculations.

    Returns:
        Array of posterior probabilities.
        Array of standard deviations in the posterior probability calculations.
        Array of confidence values for the prospectivity estimates in the posterior probability array.
    """
    gen_weights_sum = sum(
        [
            item[GENERALIZED_WEIGHT_PLUS_COLUMN]
            if GENERALIZED_WEIGHT_PLUS_COLUMN in item.keys()
            else item[WEIGHT_PLUS_COLUMN]
            for item in output_arrays
        ]
    )
    gen_weights_variance_sum = sum(
        [
            np.square(item[GENERALIZED_S_WEIGHT_PLUS_COLUMN])
            if GENERALIZED_S_WEIGHT_PLUS_COLUMN in item.keys()
            else np.square(item[WEIGHT_S_PLUS_COLUMN])
            for item in output_arrays
        ]
    )

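    # Prior probability of a deposit in any pixel, and its log-odds (logit)
    prior_probabilities = nr_of_deposits / nr_of_pixels
    prior_log_odds = np.log(prior_probabilities / (1 - prior_probabilities))
    # Posterior probability is the logistic function of the summed weights plus the prior log-odds
    posterior_logit = gen_weights_sum + prior_log_odds
    posterior_probabilities = np.exp(posterior_logit) / (1 + np.exp(posterior_logit))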

    posterior_probabilities_squared = np.square(posterior_probabilities)
    posterior_probabilities_std = np.sqrt(
        (1 / nr_of_deposits + gen_weights_variance_sum) * posterior_probabilities_squared
    )

    confidence_array = posterior_probabilities / posterior_probabilities_std
    return posterior_probabilities, posterior_probabilities_std, confidence_array
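
The response calculation above can be tried out on small in-memory arrays. The following is a minimal sketch, not the library function: `posterior_from_weights` is a hypothetical helper, it takes the weight and standard deviation arrays directly instead of the output dictionaries, and it uses the algebraically equivalent form `1 / (1 + exp(-x))` of the logistic function. All input numbers are made up.

```python
import numpy as np


def posterior_from_weights(weight_arrays, std_arrays, nr_of_deposits, nr_of_pixels):
    """Hypothetical helper restating the response calculation for plain arrays."""
    # Pixel-wise sums over the evidence layers
    weights_sum = np.sum(weight_arrays, axis=0)
    variance_sum = np.sum(np.square(std_arrays), axis=0)

    # Prior probability and its log-odds
    prior_probability = nr_of_deposits / nr_of_pixels
    prior_logit = np.log(prior_probability / (1 - prior_probability))

    # Logistic function of summed weights plus prior log-odds
    posterior = 1 / (1 + np.exp(-(weights_sum + prior_logit)))
    posterior_std = np.sqrt((1 / nr_of_deposits + variance_sum) * np.square(posterior))
    confidence = posterior / posterior_std
    return posterior, posterior_std, confidence


# Two evidence layers on a 1 x 2 grid (illustrative values only)
weights = [np.array([[0.5, -0.2]]), np.array([[1.0, 0.0]])]
stds = [np.array([[0.1, 0.2]]), np.array([[0.2, 0.1]])]

posterior, posterior_std, confidence = posterior_from_weights(
    weights, stds, nr_of_deposits=10, nr_of_pixels=1000
)
```

Pixels with a positive weight sum end up with a posterior probability above the prior of 0.01, and pixels with a negative sum end up below it.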

`weights_of_evidence_calculate_weights(evidential_raster, deposits, raster_nodata=None, weights_type='unique', studentized_contrast_threshold=1, arrays_to_generate=None)`

Calculate weights of spatial associations.

Parameters:

Name	Type	Description	Default
`evidential_raster`	`DatasetReader`	The evidential raster.	required
`deposits`	`GeoDataFrame`	Vector point data representing the mineral deposits or occurrences.	required
`raster_nodata`	`Optional[Number]`	Nodata value of the raster, if it needs to be specified manually. Optional parameter, defaults to None (nodata from raster metadata is used).	`None`
`weights_type`	`Literal[unique, categorical, ascending, descending]`	Accepted values are 'unique', 'categorical', 'ascending' and 'descending'. Unique weights do not create generalized classes and do not use a studentized contrast threshold value, while categorical, cumulative ascending and cumulative descending weights do. Categorical weights are calculated so that all classes with studentized contrast below the defined threshold are grouped into one generalized class. Cumulative ascending and descending weights find the class with max contrast and group classes above/below into generalized classes. Generalized weights are also calculated for generalized classes.	`'unique'`
`studentized_contrast_threshold`	`Number`	Studentized contrast threshold value used with 'categorical', 'ascending' and 'descending' weight types. Used either directly as the reclassification threshold (categorical) or to check that the class with max contrast has a studentized contrast of at least the defined value (cumulative). Defaults to 1.	`1`
`arrays_to_generate`	`Optional[Sequence[str]]`	Arrays to generate from the computed weight metrics. All column names in the produced weights_df are valid choices. Defaults to ["Class", "W+", "S_W+"] for "unique" weights_type and ["Class", "W+", "S_W+", "Generalized W+", "Generalized S_W+"] for all other weight types.	`None`

Returns:

Type	Description
`DataFrame`	Dataframe with weights of spatial association between the input data.
`dict`	Dictionary of arrays for specified metrics.
`dict`	Raster metadata.
`int`	Number of deposit pixels.
`int`	Number of all evidence pixels.

Raises:

Type	Description
`ClassificationFailedException`	Unable to create generalized classes with the given studentized_contrast_threshold.
`InvalidColumnException`	Arrays to generate contains invalid column name(s).
`InvalidParameterValueException`	Input weights_type is not one of the accepted values.
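
To make the weight metrics more concrete, here is a minimal sketch of the textbook weights of evidence formulas for unique classes. It is an assumption that the library's internal `_unique_weights` follows the same formulation; `unique_weights_sketch` and its dictionary keys are hypothetical, and the sketch does no handling of classes with zero counts (which would lead to division by zero).

```python
import numpy as np


def unique_weights_sketch(deposits, evidence):
    """Hypothetical helper: textbook W+ / W- / contrast for each evidence class."""
    total_deposits = np.sum(deposits == 1)
    total_non_deposits = deposits.size - total_deposits
    results = {}
    for cls in np.unique(evidence):
        in_class = evidence == cls
        a = np.sum(in_class & (deposits == 1))  # deposit pixels inside the class
        b = total_deposits - a                  # deposit pixels outside the class
        c = np.sum(in_class) - a                # non-deposit pixels inside the class
        d = total_non_deposits - c              # non-deposit pixels outside the class

        w_plus = np.log((a / (a + b)) / (c / (c + d)))
        s_w_plus = np.sqrt(1 / a + 1 / c)
        w_minus = np.log((b / (a + b)) / (d / (c + d)))
        s_w_minus = np.sqrt(1 / b + 1 / d)
        contrast = w_plus - w_minus
        s_contrast = np.sqrt(s_w_plus**2 + s_w_minus**2)

        results[cls.item()] = {
            "pixel_count": a + c,
            "deposit_count": a,
            "w_plus": w_plus,
            "s_w_plus": s_w_plus,
            "w_minus": w_minus,
            "s_w_minus": s_w_minus,
            "contrast": contrast,
            "studentized_contrast": contrast / s_contrast,
        }
    return results


# Ten pixels, two evidence classes, three deposit pixels (made-up data)
evidence = np.array([1, 1, 1, 1, 2, 2, 2, 2, 2, 2])
deposits = np.array([1, 1, 0, 0, 1, 0, 0, 0, 0, 0])
weights = unique_weights_sketch(deposits, evidence)
```

In this toy data, class 1 holds two of the three deposits in four of the ten pixels, so its W+ is positive and its contrast equals ln(5).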

Source code in eis_toolkit/prediction/weights_of_evidence.py

@beartype
def weights_of_evidence_calculate_weights(
    evidential_raster: rasterio.io.DatasetReader,
    deposits: gpd.GeoDataFrame,
    raster_nodata: Optional[Number] = None,
    weights_type: Literal["unique", "categorical", "ascending", "descending"] = "unique",
    studentized_contrast_threshold: Number = 1,
    arrays_to_generate: Optional[Sequence[str]] = None,
) -> Tuple[pd.DataFrame, dict, dict, int, int]:
    """
    Calculate weights of spatial associations.

    Args:
        evidential_raster: The evidential raster.
        deposits: Vector point data representing the mineral deposits or occurrences.
        raster_nodata: Nodata value of the raster, if it needs to be specified manually. Optional parameter,
            defaults to None (nodata from raster metadata is used).
        weights_type: Accepted values are 'unique', 'categorical', 'ascending' and 'descending'.
            Unique weights do not create generalized classes and do not use a studentized contrast threshold value,
            while categorical, cumulative ascending and cumulative descending weights do. Categorical weights are
            calculated so that all classes with studentized contrast below the defined threshold are grouped into
            one generalized class. Cumulative ascending and descending weights find the class with max contrast
            and group classes above/below into generalized classes. Generalized weights are also calculated for
            generalized classes.
        studentized_contrast_threshold: Studentized contrast threshold value used with 'categorical', 'ascending' and
            'descending' weight types. Used either directly as the reclassification threshold (categorical) or to
            check that the class with max contrast has a studentized contrast of at least the defined value
            (cumulative). Defaults to 1.
        arrays_to_generate: Arrays to generate from the computed weight metrics. All column names
            in the produced weights_df are valid choices. Defaults to ["Class", "W+", "S_W+"]
            for "unique" weights_type and ["Class", "W+", "S_W+", "Generalized W+", "Generalized S_W+"]
            for all other weight types.

    Returns:
        Dataframe with weights of spatial association between the input data.
        Dictionary of arrays for specified metrics.
        Raster metadata.
        Number of deposit pixels.
        Number of all evidence pixels.

    Raises:
        ClassificationFailedException: Unable to create generalized classes with the given
            studentized_contrast_threshold.
        InvalidColumnException: Arrays to generate contains invalid column name(s).
        InvalidParameterValueException: Input weights_type is not one of the accepted values.
    """

    if arrays_to_generate is None:
        if weights_type == "unique":
            metrics_to_arrays = DEFAULT_METRICS_UNIQUE
        else:
            metrics_to_arrays = DEFAULT_METRICS_CUMULATIVE
    else:
        for col_name in arrays_to_generate:
            if col_name not in VALID_DF_COLUMNS:
                raise InvalidColumnException(f"Arrays to generate contains invalid metric / column name: {col_name}.")
        # Copy into a list; not every Sequence (e.g. a tuple) has a .copy() method
        metrics_to_arrays = list(arrays_to_generate)

    # 1. Preprocess data
    evidence_array = _read_and_preprocess_evidence(evidential_raster, raster_nodata)
    raster_meta = evidential_raster.meta

    # Rasterize deposits
    deposit_array, _ = rasterize_vector(
        geodataframe=deposits, default_value=1.0, base_raster_profile=raster_meta, fill_value=0.0
    )

    # Mask NaN out of the array
    nodata_mask = np.isnan(evidence_array)
    masked_evidence_array = evidence_array[~nodata_mask]
    masked_deposit_array = deposit_array[~nodata_mask]

    # 2. WofE calculations
    if weights_type == "unique" or weights_type == "categorical":
        wofe_weights = _unique_weights(masked_deposit_array, masked_evidence_array)
    elif weights_type == "ascending":
        wofe_weights = _cumulative_weights(masked_deposit_array, masked_evidence_array, ascending=True)
    elif weights_type == "descending":
        wofe_weights = _cumulative_weights(masked_deposit_array, masked_evidence_array, ascending=False)
    else:
        raise InvalidParameterValueException(
            "Expected weights_type to be one of unique, categorical, ascending or descending."
        )

    # 3. Create DataFrame based on calculated metrics
    df_entries = []
    for cls, metrics in wofe_weights.items():
        metrics = [round(metric, 4) if isinstance(metric, np.floating) else metric for metric in metrics]
        A, _, C, _, w_plus, s_w_plus, w_minus, s_w_minus, contrast, s_contrast, studentized_contrast = metrics
        df_entries.append(
            {
                CLASS_COLUMN: cls,
                PIXEL_COUNT_COLUMN: A + C,
                DEPOSIT_COUNT_COLUMN: A,
                WEIGHT_PLUS_COLUMN: w_plus,
                WEIGHT_S_PLUS_COLUMN: s_w_plus,
                WEIGHT_MINUS_COLUMN: w_minus,
                WEIGHT_S_MINUS_COLUMN: s_w_minus,
                CONTRAST_COLUMN: contrast,
                S_CONTRAST_COLUMN: s_contrast,
                STUDENTIZED_CONTRAST_COLUMN: studentized_contrast,
            }
        )
    weights_df = pd.DataFrame(df_entries)

    # 4. For categorical and cumulative weight types, calculate generalized classes and weights
    if weights_type == "categorical":
        weights_df = _generalized_classes_categorical(weights_df, studentized_contrast_threshold)
        weights_df = _generalized_weights_categorical(weights_df, masked_deposit_array)
    elif weights_type == "ascending" or weights_type == "descending":
        weights_df = _generalized_classes_cumulative(weights_df, studentized_contrast_threshold)
        weights_df = _generalized_weights_cumulative(weights_df, masked_deposit_array)

    # 5. Generate arrays for desired metrics
    arrays_dict = _generate_arrays_from_metrics(evidence_array, weights_df, metrics_to_arrays)

    # Return the number of deposit pixels and the number of all evidence pixels for use in calculate responses
    nr_of_deposits = int(np.sum(masked_deposit_array == 1))
    nr_of_pixels = int(np.size(masked_evidence_array))

    return weights_df, arrays_dict, raster_meta, nr_of_deposits, nr_of_pixels
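
The generalization rules used in step 4 can be illustrated with a small sketch. This is not the library's implementation: both helper functions are hypothetical, the class labels are arbitrary, a plain `ValueError` stands in for `ClassificationFailedException`, and it is an assumption that classes at or above the threshold keep their own label in the categorical case and that rows are already in cumulative order in the cumulative case.

```python
import numpy as np


def generalize_categorical(classes, studentized_contrast, threshold, grouped_label=-1):
    """Group every class whose studentized contrast is below the threshold into one class."""
    below = np.asarray(studentized_contrast) < threshold
    return np.where(below, grouped_label, np.asarray(classes))


def generalize_cumulative(contrast, studentized_contrast, threshold):
    """Split the cumulative classes at the row with maximum contrast."""
    contrast = np.asarray(contrast)
    best = int(np.argmax(contrast))
    # The class with maximum contrast must reach the studentized contrast threshold
    if studentized_contrast[best] < threshold:
        raise ValueError("No class reaches the studentized contrast threshold.")
    labels = np.ones(contrast.size, dtype=int)
    labels[: best + 1] = 2  # rows up to and including the maximum form one generalized class
    return labels


categorical = generalize_categorical([1, 2, 3], [0.5, 1.2, 2.0], threshold=1)
cumulative = generalize_cumulative([0.2, 0.9, 0.4], [0.5, 1.5, 0.8], threshold=1)
```

Raising the threshold above the studentized contrast of the maximum contrast class makes the cumulative variant fail, which mirrors the documented `ClassificationFailedException` case.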