Classification probability evaluation

plot_calibration_curve(y_true, y_prob, n_bins=5, plot_title='Calibration curve', ax=None, **kwargs)

Plot calibration curve (aka reliability diagram).

Calibration curve has the frequency of the positive labels on the y-axis and the predicted probability on the x-axis. Generally, the closer the calibration curve is to the line y=x, the better the model is calibrated.

Parameters:

    y_true (ndarray): True labels. Required.
    y_prob (ndarray): Predicted probabilities for the positive class. The array should come from a binary classifier. Required.
    n_bins (int): Number of bins used for the calibration curve. Defaults to 5.
    plot_title (Optional[str]): Title for the plot. Defaults to "Calibration curve".
    ax (Optional[Axes]): An existing Axes in which to draw the plot. Defaults to None.
    **kwargs: Additional keyword arguments passed to matplotlib.pyplot.plot.

Returns:

    Axes: Matplotlib axes containing the plot.

Source code in eis_toolkit/evaluation/classification_probability_evaluation.py
def plot_calibration_curve(
    y_true: np.ndarray,
    y_prob: np.ndarray,
    n_bins: int = 5,
    plot_title: Optional[str] = "Calibration curve",
    ax: Optional[plt.Axes] = None,
    **kwargs
) -> plt.Axes:
    """
    Plot calibration curve (aka reliability diagram).

    Calibration curve has the frequency of the positive labels on the y-axis and the predicted probability on
    the x-axis. Generally, the closer the calibration curve is to the line y=x, the better the model is calibrated.

    Args:
        y_true: True labels.
        y_prob: Predicted probabilities for the positive class. The array should come from
            a binary classifier.
        n_bins: Number of bins used for the calibration curve. Defaults to 5.
        plot_title: Title for the plot. Defaults to "Calibration curve".
        ax: An existing Axes in which to draw the plot. Defaults to None.
        **kwargs: Additional keyword arguments passed to matplotlib.pyplot.plot.

    Returns:
        Matplotlib axes containing the plot.
    """
    display = CalibrationDisplay.from_predictions(y_true, y_prob, n_bins=n_bins, ax=ax, **kwargs)
    out_ax = display.ax_
    out_ax.set(xlabel="Mean predicted probability", ylabel="Fraction of positives", title=plot_title)
    return out_ax
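
A minimal usage sketch (not part of the toolkit's documentation): the synthetic data, the scikit-learn classifier and the variable names are illustrative, and the import path follows the source file shown above.

import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

from eis_toolkit.evaluation.classification_probability_evaluation import plot_calibration_curve

# Synthetic binary classification data and a simple probabilistic classifier.
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
y_prob = LogisticRegression(max_iter=500).fit(X_train, y_train).predict_proba(X_test)[:, 1]

# Probabilities for the positive class go on the x-axis, observed frequencies on the y-axis.
ax = plot_calibration_curve(y_test, y_prob, n_bins=10)
plt.show()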

plot_det_curve(y_true, y_prob, plot_title='DET curve', ax=None, **kwargs)

Plot DET (detection error tradeoff) curve.

DET curve is a binary classification multi-threshold metric. DET curves are a variation of ROC curves where False Negative Rate is plotted on the y-axis instead of True Positive Rate. The ideal performance corner of the plot is bottom-left. When comparing the performance of different models, DET curves can be slightly easier to assess visually than ROC curves.

Parameters:

    y_true (ndarray): True labels. Required.
    y_prob (ndarray): Predicted probabilities for the positive class. The array should come from a binary classifier. Required.
    plot_title (Optional[str]): Title for the plot. Defaults to "DET curve".
    ax (Optional[Axes]): An existing Axes in which to draw the plot. Defaults to None.
    **kwargs: Additional keyword arguments passed to matplotlib.pyplot.plot.

Returns:

    Axes: Matplotlib axes containing the plot.

Source code in eis_toolkit/evaluation/classification_probability_evaluation.py
def plot_det_curve(
    y_true: np.ndarray,
    y_prob: np.ndarray,
    plot_title: Optional[str] = "DET curve",
    ax: Optional[plt.Axes] = None,
    **kwargs
) -> plt.Axes:
    """
    Plot DET (detection error tradeoff) curve.

    DET curve is a binary classification multi-threshold metric. DET curves are a variation of ROC curves where
    False Negative Rate is plotted on the y-axis instead of True Positive Rate. The ideal performance corner of
    the plot is bottom-left. When comparing the performance of different models, DET curves can be
    slightly easier to assess visually than ROC curves.

    Args:
        y_true: True labels.
        y_prob: Predicted probabilities for the positive class. The array should come from
            a binary classifier.
        plot_title: Title for the plot. Defaults to "DET curve".
        ax: An existing Axes in which to draw the plot. Defaults to None.
        **kwargs: Additional keyword arguments passed to matplotlib.pyplot.plot.

    Returns:
        Matplotlib axes containing the plot.
    """
    display = DetCurveDisplay.from_predictions(y_true, y_prob, ax=ax, **kwargs)
    out_ax = display.ax_
    out_ax.set(xlabel="False positive rate", ylabel="False negative rate", title=plot_title)
    return out_ax
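
A minimal comparison sketch (illustrative, not from the toolkit docs): two scikit-learn classifiers are drawn into the same Axes, which is the comparison use case mentioned above. Here `label` is simply forwarded through **kwargs to matplotlib's plot call; the data and model choices are assumptions.

import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

from eis_toolkit.evaluation.classification_probability_evaluation import plot_det_curve

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
prob_lr = LogisticRegression(max_iter=500).fit(X_train, y_train).predict_proba(X_test)[:, 1]
prob_rf = RandomForestClassifier(random_state=0).fit(X_train, y_train).predict_proba(X_test)[:, 1]

# Draw the first curve, then reuse the returned Axes for the second model.
ax = plot_det_curve(y_test, prob_lr, label="Logistic regression")
plot_det_curve(y_test, prob_rf, ax=ax, label="Random forest")
ax.legend()
plt.show()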

plot_precision_recall_curve(y_true, y_prob, plot_title='Precision-Recall curve', ax=None, **kwargs)

Plot precision-recall curve.

Precision-recall curve is a binary classification multi-threshold metric. Precision-recall curve shows the tradeoff between precision and recall for different classification thresholds. It can be a useful measure of success when classes are imbalanced.

Parameters:

    y_true (ndarray): True labels. Required.
    y_prob (ndarray): Predicted probabilities for the positive class. The array should come from a binary classifier. Required.
    plot_title (Optional[str]): Title for the plot. Defaults to "Precision-Recall curve".
    ax (Optional[Axes]): An existing Axes in which to draw the plot. Defaults to None.
    **kwargs: Additional keyword arguments passed to matplotlib.pyplot.plot.

Returns:

    Axes: Matplotlib axes containing the plot.

Source code in eis_toolkit/evaluation/classification_probability_evaluation.py
def plot_precision_recall_curve(
    y_true: np.ndarray,
    y_prob: np.ndarray,
    plot_title: Optional[str] = "Precision-Recall curve",
    ax: Optional[plt.Axes] = None,
    **kwargs
) -> plt.Axes:
    """
    Plot precision-recall curve.

    Precision-recall curve is a binary classification multi-threshold metric. Precision-recall curve shows
    the tradeoff between precision and recall for different classification thresholds.
    It can be a useful measure of success when classes are imbalanced.

    Args:
        y_true: True labels.
        y_prob: Predicted probabilities for the positive class. The array should come from
            a binary classifier.
        plot_title: Title for the plot. Defaults to "Precision-Recall curve".
        ax: An existing Axes in which to draw the plot. Defaults to None.
        **kwargs: Additional keyword arguments passed to matplotlib.pyplot.plot.

    Returns:
        Matplotlib axes containing the plot.
    """
    display = PrecisionRecallDisplay.from_predictions(y_true, y_prob, plot_chance_level=True, ax=ax, **kwargs)
    out_ax = display.ax_
    out_ax.set(xlabel="Recall", ylabel="Precision", title=plot_title)
    return out_ax
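
A minimal usage sketch (illustrative assumptions throughout): the data is deliberately imbalanced, since that is where the precision-recall curve is most informative.

import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

from eis_toolkit.evaluation.classification_probability_evaluation import plot_precision_recall_curve

# Imbalanced synthetic data: roughly 10 % positives.
X, y = make_classification(n_samples=1000, weights=[0.9, 0.1], random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=0)
y_prob = LogisticRegression(max_iter=500).fit(X_train, y_train).predict_proba(X_test)[:, 1]

ax = plot_precision_recall_curve(y_test, y_prob)
plt.show()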

plot_predicted_probability_distribution(y_prob, n_bins=5, plot_title='Distribution of predicted probabilities', ax=None, **kwargs)

Plot a histogram of the predicted probabilities.

Parameters:

    y_prob (ndarray): Predicted probabilities for the positive class. The array should come from a binary classifier. Required.
    n_bins (int): Number of bins used for the histogram. Defaults to 5.
    plot_title (Optional[str]): Title for the plot. Defaults to "Distribution of predicted probabilities".
    ax (Optional[Axes]): An existing Axes in which to draw the plot. Defaults to None.
    **kwargs: Additional keyword arguments passed to sns.histplot and matplotlib.

Returns:

    Axes: Matplotlib axes containing the plot.

Source code in eis_toolkit/evaluation/classification_probability_evaluation.py
def plot_predicted_probability_distribution(
    y_prob: np.ndarray,
    n_bins: int = 5,
    plot_title: Optional[str] = "Distribution of predicted probabilities",
    ax: Optional[plt.Axes] = None,
    **kwargs
) -> plt.Axes:
    """
    Plot a histogram of the predicted probabilities.

    Args:
        y_prob: Predicted probabilities for the positive class. The array should come from
            a binary classifier.
        n_bins: Number of bins used for the histogram. Defaults to 5.
        plot_title: Title for the plot. Defaults to "Distribution of predicted probabilities".
        ax: An existing Axes in which to draw the plot. Defaults to None.
        **kwargs: Additional keyword arguments passed to sns.histplot and matplotlib.

    Returns:
        Matplotlib axes containing the plot.
    """
    sns.set_theme(style="white")
    plt.figure()
    out_ax = sns.histplot(y_prob, bins=n_bins, ax=ax, **kwargs)
    out_ax.set(xlabel="Predicted probability", ylabel="Count", title=plot_title)
    return out_ax
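
A minimal usage sketch (illustrative): any array of probabilities in [0, 1] works as input; here the values are drawn from a Beta distribution rather than produced by a real classifier.

import matplotlib.pyplot as plt
import numpy as np

from eis_toolkit.evaluation.classification_probability_evaluation import plot_predicted_probability_distribution

# Synthetic "predicted probabilities" skewed towards low values.
rng = np.random.default_rng(0)
y_prob = rng.beta(2, 5, size=500)

ax = plot_predicted_probability_distribution(y_prob, n_bins=10)
plt.show()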

plot_roc_curve(y_true, y_prob, plot_title='ROC curve', ax=None, **kwargs)

Plot ROC (receiver operating characteristic) curve.

ROC curve is a binary classification multi-threshold metric. The ideal performance corner of the plot is top-left. AUC of the ROC curve summarizes model performance across different classification thresholds.

Parameters:

    y_true (ndarray): True labels. Required.
    y_prob (ndarray): Predicted probabilities for the positive class. The array should come from a binary classifier. Required.
    plot_title (Optional[str]): Title for the plot. Defaults to "ROC curve".
    ax (Optional[Axes]): An existing Axes in which to draw the plot. Defaults to None.
    **kwargs: Additional keyword arguments passed to matplotlib.pyplot.plot.

Returns:

    Axes: Matplotlib axes containing the plot.

Source code in eis_toolkit/evaluation/classification_probability_evaluation.py
def plot_roc_curve(
    y_true: np.ndarray,
    y_prob: np.ndarray,
    plot_title: Optional[str] = "ROC curve",
    ax: Optional[plt.Axes] = None,
    **kwargs
) -> plt.Axes:
    """
    Plot ROC (receiver operating characteristic) curve.

    ROC curve is a binary classification multi-threshold metric. The ideal performance corner of the plot
    is top-left. AUC of the ROC curve summarizes model performance across different classification thresholds.

    Args:
        y_true: True labels.
        y_prob: Predicted probabilities for the positive class. The array should come from
            a binary classifier.
        plot_title: Title for the plot. Defaults to "ROC curve".
        ax: An existing Axes in which to draw the plot. Defaults to None.
        **kwargs: Additional keyword arguments passed to matplotlib.pyplot.plot.

    Returns:
        Matplotlib axes containing the plot.
    """
    display = RocCurveDisplay.from_predictions(y_true, y_prob, plot_chance_level=True, ax=ax, **kwargs)
    out_ax = display.ax_
    out_ax.set(xlabel="False positive rate", ylabel="True positive rate", title=plot_title)
    return out_ax
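
A minimal usage sketch (illustrative data and model; the import path follows the source file shown above):

import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

from eis_toolkit.evaluation.classification_probability_evaluation import plot_roc_curve

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
y_prob = LogisticRegression(max_iter=500).fit(X_train, y_train).predict_proba(X_test)[:, 1]

# The chance level (diagonal) is drawn automatically by the wrapper.
ax = plot_roc_curve(y_test, y_prob)
plt.show()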

summarize_probability_metrics(y_true, y_prob)

Generate a comprehensive report of various evaluation metrics for classification probabilities.

The output includes ROC AUC, log loss, average precision and Brier score loss.

Parameters:

    y_true (ndarray): True labels. Required.
    y_prob (ndarray): Predicted probabilities for the positive class. The array should come from a binary classifier. Required.

Returns:

    Dict[str, float]: A dictionary containing the evaluated metrics.

Source code in eis_toolkit/evaluation/classification_probability_evaluation.py
def summarize_probability_metrics(y_true: np.ndarray, y_prob: np.ndarray) -> Dict[str, float]:
    """
    Generate a comprehensive report of various evaluation metrics for classification probabilities.

    The output includes ROC AUC, log loss, average precision and Brier score loss.

    Args:
        y_true: True labels.
        y_prob: Predicted probabilities for the positive class. The array should come from
            a binary classifier.

    Returns:
        A dictionary containing the evaluated metrics.
    """
    metrics = {}

    metrics["roc_auc"] = roc_auc_score(y_true, y_prob)
    metrics["log_loss"] = log_loss(y_true, y_prob)
    metrics["average_precision"] = average_precision_score(y_true, y_prob)
    metrics["brier_score_loss"] = brier_score_loss(y_true, y_prob)

    return metrics
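
A minimal usage sketch (illustrative data and model): the returned dictionary is keyed by "roc_auc", "log_loss", "average_precision" and "brier_score_loss", as in the source above.

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

from eis_toolkit.evaluation.classification_probability_evaluation import summarize_probability_metrics

X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
y_prob = LogisticRegression(max_iter=500).fit(X_train, y_train).predict_proba(X_test)[:, 1]

metrics = summarize_probability_metrics(y_test, y_prob)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")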