Cell-Based Association

cell_based_association(cell_size, geodata, output_path, column=None, subset_target_attribute_values=None, add_name=None, add_buffer=None)

Creation of CBA matrix.

Initializes a CBA matrix from a vector file. The mesh is calculated according to the geometries contained in this file and the size of cells. Additional vector data can then be added to the matrix, based on targeted shapes and/or attributes.
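
To make the idea of a mesh derived from the geometries and the cell size concrete, here is a minimal pure-Python sketch. It is not the toolkit's implementation: the helpers `build_grid` and `presence` are hypothetical, the input is reduced to plain points, and the grid is assumed to start at the lower-left corner of the points' bounding box.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]
Cell = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def build_grid(points: List[Point], cell_size: int) -> List[Cell]:
    """Return square cells covering the bounding box of the points (hypothetical helper)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    xmin, ymin = min(xs), min(ys)
    # Keep at least one cell per axis, even if all points share a coordinate.
    n_cols = max(1, math.ceil((max(xs) - xmin) / cell_size))
    n_rows = max(1, math.ceil((max(ys) - ymin) / cell_size))
    return [
        (
            xmin + c * cell_size,
            ymin + r * cell_size,
            xmin + (c + 1) * cell_size,
            ymin + (r + 1) * cell_size,
        )
        for r in range(n_rows)
        for c in range(n_cols)
    ]


def presence(cells: List[Cell], points: List[Point]) -> List[int]:
    """Return 1 for each cell containing at least one point, else 0 (closed bounds)."""
    return [
        int(any(x0 <= x <= x1 and y0 <= y <= y1 for x, y in points))
        for x0, y0, x1, y1 in cells
    ]


points = [(0.0, 0.0), (25.0, 5.0), (5.0, 18.0)]
cells = build_grid(points, cell_size=10)
flags = presence(cells, points)
print(len(cells), flags)
```

With a cell size of 10, the three points span a 3 x 2 grid, and the flags mark the cells that contain a point. The real function works on arbitrary geometries rather than points, so treat this only as an illustration of the presence/absence idea.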

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| cell_size | int | Size of the cells. | required |
| geodata | List[GeoDataFrame] | GeoDataFrame used to create the CBA matrix. Additional GeoDataFrame(s) can be provided to add to the CBA matrix. | required |
| output_path | str | Name of the saved .tif file. | required |
| column | Optional[List[str]] | Name of the column of interest. If no attribute is specified, then an artificial attribute is created representing the presence or absence of the geometries of this file for each cell of the CBA grid. A categorical attribute will generate as many binary columns in the CBA matrix as there are values of interest (dummification). See parameter subset_target_attribute_values. Additional column(s) can be provided for each added GeoDataFrame. | None |
| subset_target_attribute_values | Optional[List[Union[None, list, str]]] | List of values of interest of the target attribute, in case a categorical target attribute has been specified. Allows filtering a subset of relevant values. Additional values can be provided for each added GeoDataFrame. | None |
| add_name | Optional[List[Union[str, None]]] | Name of the column(s) to add to the matrix. | None |
| add_buffer | Optional[List[Union[Number, bool]]] | Allows the use of a buffer around shapes before the intersection with CBA cells for the added GeoDataFrame(s). This minimizes border effects or increases the number of positive samples (i.e. cells with mineralization). The size of the buffer is expressed in the units of the CRS (meters for a projected CRS in meters). | None |
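
The dummification described for `column` and `subset_target_attribute_values` can be sketched in plain Python. This is an illustration under assumptions, not the toolkit's code: `dummify` is a hypothetical helper, and it works on a flat list of attribute values rather than on grid cells.

```python
from typing import Dict, List, Optional


def dummify(values: List[str], subset: Optional[List[str]] = None) -> Dict[str, List[int]]:
    """Build one binary column per category of interest (hypothetical helper)."""
    # Without a subset, every distinct value becomes a column (first-seen order).
    categories = subset if subset is not None else list(dict.fromkeys(values))
    return {cat: [int(v == cat) for v in values] for cat in categories}


lithology = ["granite", "schist", "granite", "gneiss"]

# All categories: three binary columns.
all_columns = dummify(lithology)

# Only the values of interest: two binary columns.
selected = dummify(lithology, subset=["granite", "gneiss"])

print(all_columns)
print(selected)
```

Restricting the subset keeps the matrix narrow when only a few categories are relevant to the analysis.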

Returns:

| Type | Description |
| --- | --- |
| GeoDataFrame | The created CBA matrix. |

Source code in eis_toolkit/vector_processing/cell_based_association.py
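# Imports required by this excerpt. The helpers _init_from_vector_data,
# _add_layer and _to_raster are defined elsewhere in the same module.
from numbers import Number
from typing import List, Optional, Union

import geopandas as gpd
from beartype import beartype

from eis_toolkit.exceptions import EmptyDataFrameException, InvalidColumnException, InvalidParameterValueException


@beartype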
def cell_based_association(
    cell_size: int,
    geodata: List[gpd.GeoDataFrame],
    output_path: str,
    column: Optional[List[str]] = None,
    subset_target_attribute_values: Optional[List[Union[None, list, str]]] = None,
    add_name: Optional[List[Union[str, None]]] = None,
    add_buffer: Optional[List[Union[Number, bool]]] = None,
) -> gpd.GeoDataFrame:
    """Creation of CBA matrix.

    Initializes a CBA matrix from a vector file. The mesh is calculated
    according to the geometries contained in this file and the size of cells.
    Additional vector data can then be added to the matrix, based on targeted
    shapes and/or attributes.

    Args:
        cell_size: Size of the cells.
        geodata: GeoDataFrame used to create the CBA matrix. Additional
            GeoDataFrame(s) can be provided to add to the CBA matrix.
        output_path: Name of the saved .tif file.
        column: Name of the column of interest. If no attribute is specified,
            then an artificial attribute is created representing the presence
            or absence of the geometries of this file for each cell of the CBA
            grid. A categorical attribute will generate as many binary columns
            in the CBA matrix as there are values of interest (dummification).
            See parameter subset_target_attribute_values. Additional
            column(s) can be provided for each added GeoDataFrame.
        subset_target_attribute_values: List of values of interest of the
            target attribute, in case a categorical target attribute has been
            specified. Allows filtering a subset of relevant values. Additional
            values can be provided for each added GeoDataFrame.
        add_name: Name of the column(s) to add to the matrix.
        add_buffer: Allows the use of a buffer around shapes before the
            intersection with CBA cells for the added GeoDataFrame(s). This
            minimizes border effects or increases the number of positive
            samples (i.e. cells with mineralization). The size of the buffer
            is expressed in the units of the CRS (meters for a projected CRS
            in meters).

    Returns:
        The created CBA matrix.
    """

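    # Swapping None to list values. column and subset_target_attribute_values
    # need one entry per GeoDataFrame; add_name and add_buffer only cover the
    # added GeoDataFrames, hence one entry fewer.
    if column is None:
        column = [""] * len(geodata)
    if subset_target_attribute_values is None:
        subset_target_attribute_values = [None] * len(geodata)
    if add_name is None:
        add_name = [None] * (len(geodata) - 1)
    if add_buffer is None:
        add_buffer = [False] * (len(geodata) - 1)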

    # Consistency checks on input data
    for frame in geodata:
        if frame.empty:
            raise EmptyDataFrameException("The input GeoDataFrame is empty.")

    if cell_size <= 0:
        raise InvalidParameterValueException("Expected cell size to be positive and non-zero.")

    add_buffer = [False if x == 0 else x for x in add_buffer]
    if any(num < 0 for num in add_buffer):
        raise InvalidParameterValueException("Expected buffer value to be positive, null or False.")

    for i, name in enumerate(column):
        if name == "":
            if subset_target_attribute_values[i] is not None:
                raise InvalidParameterValueException("Can't use subset of values if no column is targeted.")
        elif name not in geodata[i].columns:
            raise InvalidColumnException("Targeted column not found in the GeoDataFrame.")

    for i, subset in enumerate(subset_target_attribute_values):
        if subset is not None:
            # Look up the distinct values of the targeted column once per layer
            available_values = geodata[i][column[i]].unique()
            for value in subset:
                if value not in available_values:
                    raise InvalidParameterValueException("Subset of value(s) not found in the targeted column.")

    # Computation
    for i, data in enumerate(geodata):
        if i == 0:
            # Initialization of the CBA matrix from the first GeoDataFrame
            grid, cba = _init_from_vector_data(cell_size, data, column[0], subset_target_attribute_values[0])
        else:
            # Adding each further GeoDataFrame to the matrix; add_name and
            # add_buffer have no entry for the first one, hence the i - 1 offset
            cba = _add_layer(
                cba,
                grid,
                data,
                column[i],
                subset_target_attribute_values[i],
                add_name[i - 1],
                add_buffer[i - 1],
            )

    # Export
    _to_raster(cba, output_path)

    return cba
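
The listing above indexes `column` by layer (`column[i]`) but `add_name` and `add_buffer` with an offset (`[i - 1]`), because the first GeoDataFrame only initializes the grid. The sketch below illustrates that alignment in isolation. `plan_layers` is a hypothetical helper, not part of the toolkit, and the length checks it performs are an assumption about what a caller would want, not behavior of the function itself.

```python
from typing import List, Optional, Tuple, Union

Buffer = Union[int, float, bool]
LayerPlan = Tuple[int, str, Optional[str], Buffer]


def plan_layers(
    n_layers: int,
    column: List[str],
    add_name: List[Optional[str]],
    add_buffer: List[Buffer],
) -> List[LayerPlan]:
    """Pair each layer index with its column, name and buffer (hypothetical helper)."""
    if n_layers < 1:
        raise ValueError("at least one layer is required")
    if len(column) != n_layers:
        raise ValueError("column needs one entry per layer")
    if len(add_name) != n_layers - 1 or len(add_buffer) != n_layers - 1:
        raise ValueError("add_name and add_buffer need one entry per added layer")
    # A zero buffer is treated like no buffer, as in the listing above.
    buffers = [False if b == 0 else b for b in add_buffer]
    # The first layer initializes the grid: no added name, no buffer.
    plan: List[LayerPlan] = [(0, column[0], None, False)]
    for i in range(1, n_layers):
        plan.append((i, column[i], add_name[i - 1], buffers[i - 1]))
    return plan


plan = plan_layers(
    n_layers=3,
    column=["", "lithology", ""],
    add_name=["litho", "faults"],
    add_buffer=[0, 500],
)
for entry in plan:
    print(entry)
```

For three input layers, `column` therefore needs three entries while `add_name` and `add_buffer` need two each.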