Mathieu Blondel

2010-10-12 04:54:51 UTC

Hello,

In the metrics module, the function precision_recall(y, probas_) currently outputs the precision/recall pairs for different probability thresholds. This is only one possible criterion for plotting the precision/recall curve, so I suggest renaming it to precision_recall_curve(y, probas_) and adding a new function precision_recall(y_true, y_pred). This will be a useful utility for plotting the precision/recall curve against other criteria (e.g., the value of a hyperparameter). See the patch below for what I'm proposing. In addition, the new code works in the multi-label setting too; the test below shows what I mean.
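To illustrate, here is a rough sketch of the kind of usage this enables. The hyperparameter C and the per-C predictions are stand-ins for what a real classifier would produce, and precision_recall is inlined from the patch below:

```python
import numpy as np

def precision_recall(y_true, y_pred):
    # same counting scheme as the proposed function in the patch below
    true_pos = np.sum(y_true[y_pred == 1] == 1)
    false_pos = np.sum(y_true[y_pred == 1] == 0)
    false_neg = np.sum(y_true[y_pred == 0] == 1)
    precision = true_pos / float(true_pos + false_pos)
    recall = true_pos / float(true_pos + false_neg)
    return precision, recall

# sweep a made-up hyperparameter C; the predictions here stand in
# for what a classifier trained with each C would output
y_true = np.array([1, 1, 0, 0, 1])
preds = {0.1: np.array([1, 0, 0, 0, 0]),
         1.0: np.array([1, 1, 0, 0, 0]),
         10.0: np.array([1, 1, 0, 1, 1])}
# each C contributes one (precision, recall) point to the curve
pairs = [precision_recall(y_true, preds[C]) for C in sorted(preds)]
```

With precision_recall factored out like this, the x-axis of the curve can be anything, not just a probability threshold.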

A similar change can be done for the ROC curve function.

Also, I noticed an inconsistency in the code: some metrics use function(y_true, y_pred) while others use function(y_pred, y_true). Some metrics aren't symmetric, so it would be nice to make this consistent.

The changes I propose will break existing programs, but I think it's better to do this sooner rather than later. If that's OK, I will make the necessary modifications and commit.

Mathieu

PATCH:

diff --git a/scikits/learn/metrics.py b/scikits/learn/metrics.py
index 393b411..f1c389a 100644
--- a/scikits/learn/metrics.py
+++ b/scikits/learn/metrics.py
@@ -115,7 +115,15 @@ def auc(x, y):
     return area

-def precision_recall(y, probas_):
+def precision_recall(y_true, y_pred):
+    true_pos = np.sum(y_true[y_pred == 1] == 1)
+    false_pos = np.sum(y_true[y_pred == 1] == 0)
+    false_neg = np.sum(y_true[y_pred == 0] == 1)
+    precision = true_pos / float(true_pos + false_pos)
+    recall = true_pos / float(true_pos + false_neg)
+    return precision, recall
+
+def precision_recall_curve(y, probas_):
     """compute Precision-Recall

     Parameters
@@ -149,11 +157,9 @@ def precision_recall(y, probas_):
     precision = np.empty(n_thresholds)
     recall = np.empty(n_thresholds)
     for i, t in enumerate(thresholds):
-        true_pos = np.sum(y[probas_ >= t] == 1)
-        false_pos = np.sum(y[probas_ >= t] == 0)
-        false_neg = np.sum(y[probas_ < t] == 1)
-        precision[i] = true_pos / float(true_pos + false_pos)
-        recall[i] = true_pos / float(true_pos + false_neg)
+        y_pred = np.ones(len(y))
+        y_pred[probas_ < t] = 0
+        precision[i], recall[i] = precision_recall(y, y_pred)

TEST:

def test_precision_recall_multilabel():
    Y_true = np.array([[1, 0, 1, 0],
                       [1, 0, 0, 0],
                       [0, 0, 0, 0],
                       [0, 1, 0, 0],
                       [0, 1, 1, 1]])
    Y_pred = np.array([[1, 1, 1, 0],
                       [1, 0, 0, 0],
                       [0, 1, 0, 0],
                       [0, 1, 0, 0],
                       [0, 0, 1, 1]])
    n_pred = 8.0
    n_corr_pred = 6.0
    n_labeled = 7.0
    precision = n_corr_pred / n_pred
    recall = n_corr_pred / n_labeled
    assert_equal((precision, recall),
                 precision_recall(Y_true, Y_pred))
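For what it's worth, the reason the same code handles 2-D label indicator matrices is that boolean masking flattens the selected entries into a 1-D array, so the counts in the test above can be checked directly:

```python
import numpy as np

Y_true = np.array([[1, 0, 1, 0],
                   [1, 0, 0, 0],
                   [0, 0, 0, 0],
                   [0, 1, 0, 0],
                   [0, 1, 1, 1]])
Y_pred = np.array([[1, 1, 1, 0],
                   [1, 0, 0, 0],
                   [0, 1, 0, 0],
                   [0, 1, 0, 0],
                   [0, 0, 1, 1]])
# Y_true[Y_pred == 1] is 1-D even though both inputs are 2-D, so the
# masking in precision_recall counts over all labels at once
n_corr_pred = np.sum(Y_true[Y_pred == 1] == 1)  # correctly predicted labels
n_pred = np.sum(Y_pred == 1)                    # all predicted labels
n_labeled = np.sum(Y_true == 1)                 # all true labels
```

These come out to 6, 8, and 7, matching the constants hard-coded in the test.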
