mlnext.score.pr_curve

mlnext.score.pr_curve(y: ndarray, y_score: ndarray, *, y_true: ndarray | None = None, pos_label: str | int | None = None, sample_weight: List | ndarray | None = None) → PRCurve

Computes precision-recall pairs for different probability thresholds for binary classification tasks.

Adapted from https://scikit-learn.org/stable/modules/generated/sklearn.metrics.precision_recall_curve.html. The return value is changed to a PRCurve, which encapsulates not only the recall, precision and thresholds but also the tps, fps, tns and fns. This provides all parameters required for logging a PR curve in TensorBoard (https://github.com/tensorflow/tensorboard/blob/master/tensorboard/plugins/pr_curve/README.md). The results can also be used for further processing.
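As a rough illustration of the quantities described above, the following dependency-free sketch counts true/false positives and negatives at each distinct score threshold and derives precision and recall from them. It is not mlnext's implementation: the helper name `confusion_counts_per_threshold` is hypothetical, and the sketch assumes a sample is predicted positive when its score is greater than or equal to the threshold.

```python
def confusion_counts_per_threshold(y, y_score):
    """Return one dict per distinct threshold (hypothetical helper, not part of mlnext)."""
    rows = []
    for th in sorted(set(y_score)):
        # Assumption: a sample is predicted positive when score >= threshold.
        pred = [1 if s >= th else 0 for s in y_score]
        tp = sum(1 for p, t in zip(pred, y) if p == 1 and t == 1)
        fp = sum(1 for p, t in zip(pred, y) if p == 1 and t == 0)
        fn = sum(1 for p, t in zip(pred, y) if p == 0 and t == 1)
        tn = sum(1 for p, t in zip(pred, y) if p == 0 and t == 0)
        rows.append({
            "threshold": th, "tp": tp, "fp": fp, "fn": fn, "tn": tn,
            # tp + fp > 0 here because each threshold is itself one of the scores.
            "precision": tp / (tp + fp) if tp + fp else 1.0,
            "recall": tp / (tp + fn) if tp + fn else 0.0,
        })
    return rows


rows = confusion_counts_per_threshold([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
for row in rows:
    print(row)
```

Unlike the output in the Example section, this sketch also keeps the lowest threshold (0.1), where every sample is predicted positive.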

Parameters:
  • y (np.ndarray) – Binary labels, either {-1, 1} or {0, 1}. Otherwise, pos_label must be given.

  • y_score (np.ndarray) – Target scores in range [0, 1].

  • pos_label (str | int, optional) – The label of the positive class. When pos_label=None and y is in {-1, 1} or {0, 1}, pos_label is set to 1; otherwise an error is raised. Defaults to None.

  • sample_weight (T.Union[T.List, np.ndarray], optional) – Sample weights. Defaults to None.
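The interplay of pos_label and sample_weight can be sketched in plain Python as follows. This is only an illustration under stated assumptions, not mlnext's code: the helper `weighted_counts` is hypothetical, a score greater than or equal to the threshold is assumed to mark a positive prediction, and each sample is assumed to contribute its weight (1 when no weights are given) to the matching confusion-matrix cell, as scikit-learn does.

```python
def weighted_counts(y, y_score, threshold, pos_label=None, sample_weight=None):
    """Weighted TP/FP/FN/TN at one threshold (hypothetical helper, not part of mlnext)."""
    labels = set(y)
    if pos_label is None:
        # Mirrors the documented rule: only {-1, 1} or {0, 1} labels get a default.
        if not (labels <= {0, 1} or labels <= {-1, 1}):
            raise ValueError(f"pos_label must be given for labels {labels!r}")
        pos_label = 1
    if sample_weight is None:
        sample_weight = [1.0] * len(y)  # assumption: unweighted samples count as 1
    counts = {"tp": 0.0, "fp": 0.0, "fn": 0.0, "tn": 0.0}
    for label, score, weight in zip(y, y_score, sample_weight):
        actual = label == pos_label
        predicted = score >= threshold  # assumption: >= marks a positive prediction
        # "t"/"f": prediction correct or not; "p"/"n": predicted class.
        key = ("t" if actual == predicted else "f") + ("p" if predicted else "n")
        counts[key] += weight
    return counts


# String labels need an explicit pos_label.
by_label = weighted_counts(["ham", "spam", "spam"], [0.2, 0.9, 0.3], 0.5, pos_label="spam")
# The second sample (a false positive at threshold 0.35) counts twice.
by_weight = weighted_counts([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8], 0.35, sample_weight=[1, 2, 1, 1])
print(by_label)
print(by_weight)
```

Calling the helper with string labels and no pos_label raises a ValueError, which matches the behavior documented for pos_label=None.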

Returns:

Returns a PRCurve container for the results.

Return type:

PRCurve

Example

>>> import numpy as np
>>> from mlnext.score import pr_curve
>>> y = np.array([0, 0, 1, 1])
>>> y_scores = np.array([0.1, 0.4, 0.35, 0.8])
>>> curve = pr_curve(y, y_scores)
>>> print(curve)
  TH     ACC      F1     PRC     RCL     ANO      TP      FN      TN      FP      DA      TA
0.3500  0.7500  0.8000  0.6667  1.0000  1.0000    2       0       1       1       1       1
0.4000  0.5000  0.5000  0.5000  0.5000  1.0000    1       1       1       1       1       1
0.8000  0.7500  0.6667  1.0000  0.5000  1.0000    1       1       2       0       1       1
AUC: 0.7917
>>> # access fields
>>> print(curve.f1, curve.thresholds)
[0.8        0.5        0.66666667] [0.35 0.4  0.8 ]
>>> # confusion matrix for a specific threshold
>>> print(curve[np.argmax(curve.f1)])
P\A   1      0
1     2      1
0     0      1
accuracy: 0.7500
f1: 0.8000
recall: 1.0000
precision: 0.6667
>>> # convert to format for the tensorboard writer
>>> import tensorflow as tf
>>> import tensorboard.summary.v1 as tb_summary
>>> pr_curve_summary = tb_summary.pr_curve_raw_data_op(
...    "pr", **curve.to_tensorboard())
>>> writer = tf.summary.create_file_writer("./tmp/pr_curves")
>>> with writer.as_default():
...    tf.summary.experimental.write_raw_pb(pr_curve_summary, step=1)
Result:
../_images/pr_curve.png
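The AUC value printed in the example is consistent with trapezoidal integration of precision over recall. The sketch below reproduces it from the table values; whether mlnext computes the AUC this way is an assumption, as is closing the curve with the point (recall=0, precision=1), which scikit-learn's precision_recall_curve appends. The helper `trapezoid_area` is hypothetical.

```python
def trapezoid_area(xs, ys):
    """Area under a piecewise-linear curve via the trapezoidal rule."""
    area = 0.0
    for i in range(1, len(xs)):
        area += (xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1]) / 2.0
    return area


# Recall/precision pairs from the example table, ordered by increasing recall.
# Assumption: the curve starts at (recall=0, precision=1).
recall = [0.0, 0.5, 0.5, 1.0]
precision = [1.0, 1.0, 0.5, 2.0 / 3.0]
auc = trapezoid_area(recall, precision)
print(f"{auc:.4f}")  # 0.7917
```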