brier_score_loss#

sklearn.metrics.brier_score_loss(y_true, y_proba, *, sample_weight=None, pos_label=None, labels=None, scale_by_half='auto')[source]#

Compute the Brier score loss.

The smaller the Brier score loss, the better, hence the naming with β€œloss”. The Brier score measures the mean squared difference between the predicted probability and the actual outcome. The Brier score is a strictly proper scoring rule.

Read more in the User Guide.

Parameters:
y_truearray-like of shape (n_samples,)

True targets.

y_probaarray-like of shape (n_samples,) or (n_samples, n_classes)

Predicted probabilities. If y_proba.shape = (n_samples,) the probabilities provided are assumed to be that of the positive class. If y_proba.shape = (n_samples, n_classes) the columns in y_proba are assumed to correspond to the labels in alphabetical order, as done by LabelBinarizer.

sample_weightarray-like of shape (n_samples,), default=None

Sample weights.

pos_labelint, float, bool or str, default=None

Label of the positive class when y_proba.shape = (n_samples,). If not provided, pos_label will be inferred in the following manner:

  • if y_true in {-1, 1} or {0, 1}, pos_label defaults to 1;

  • else if y_true contains string, an error will be raised and pos_label should be explicitly specified;

  • otherwise, pos_label defaults to the greater label, i.e. np.unique(y_true)[-1].

labelsarray-like of shape (n_classes,), default=None

Class labels when y_proba.shape = (n_samples, n_classes). If not provided, labels will be inferred from y_true.

Added in version 1.7.

scale_by_halfbool or β€œauto”, default=”auto”

When True, scale the Brier score by 1/2 to lie in the [0, 1] range instead of the [0, 2] range. The default β€œauto” option implements the rescaling to [0, 1] only for binary classification (as customary) but keeps the original [0, 2] range for multiclass classification.

Added in version 1.7.

Returns:
scorefloat

Brier score loss.

Notes

For \(N\) observations labeled from \(C\) possible classes, the Brier score is defined as:

\[\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C}(y_{ic} - \hat{p}_{ic})^{2}\]

where \(y_{ic}\) is 1 if observation i belongs to class c, otherwise 0 and \(\hat{p}_{ic}\) is the predicted probability for observation i to belong to class c. The Brier score then ranges between \([0, 2]\).

In binary classification tasks the Brier score is usually divided by two and then ranges between \([0, 1]\). It can be alternatively written as:

\[\frac{1}{N}\sum_{i=1}^{N}(y_{i} - \hat{p}_{i})^{2}\]

where \(y_{i}\) is the binary target and \(\hat{p}_{i}\) is the predicted probability of the positive class.

References

Examples

>>> import numpy as np
>>> from sklearn.metrics import brier_score_loss
>>> y_true = np.array([0, 1, 1, 0])
>>> y_true_categorical = np.array(["spam", "ham", "ham", "spam"])
>>> y_prob = np.array([0.1, 0.9, 0.8, 0.3])
>>> brier_score_loss(y_true, y_prob)
0.0375
>>> brier_score_loss(y_true, 1-y_prob, pos_label=0)
0.0375
>>> brier_score_loss(y_true_categorical, y_prob, pos_label="ham")
0.0375
>>> brier_score_loss(y_true, np.array(y_prob) > 0.5)
0.0
>>> brier_score_loss(y_true, y_prob, scale_by_half=False)
0.075
>>> brier_score_loss(
...    ["eggs", "ham", "spam"],
...    [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.2, 0.2, 0.6]],
...    labels=["eggs", "ham", "spam"]
... )
0.146