How to compute Spearman correlation in TensorFlow



I need to compute the Pearson and Spearman correlations and use them as metrics in TensorFlow. For Pearson, this is straightforward:


tf.contrib.metrics.streaming_pearson_correlation(y_pred, y_true)



From this answer https://stackoverflow.com/a/42730743/9494790 :

    samples = 1
    predictions_rank = tf.nn.top_k(y_pred, k=samples, sorted=True, name='prediction_rank').indices
    real_rank = tf.nn.top_k(y_true, k=samples, sorted=True, name='real_rank').indices
    rank_diffs = predictions_rank - real_rank
    rank_diffs_squared_sum = tf.reduce_sum(rank_diffs * rank_diffs)
    six = tf.constant(6)
    one = tf.constant(1.0)
    numerator = tf.cast(six * rank_diffs_squared_sum, dtype=tf.float32)
    divider = tf.cast(samples * samples * samples - samples, dtype=tf.float32)
    spearman_batch = one - numerator / divider
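
For reference, the snippet above appears to follow the simplified Spearman formula, which holds only when all n ranks are distinct. Here d_i is the difference between the two ranks of sample i and n is the number of samples (the symbol names are my own notation, not taken from the linked answer):

```latex
r_s = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n \left( n^2 - 1 \right)}
```

Note that with `samples = 1` the denominator n^3 - n is zero, so `samples` would presumably need to be the number of elements being ranked.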


Following the definition on Wikipedia https://en.wikipedia.org/wiki/Spearman%27s_rank_correlation_coefficient#Definition_and_calculation , I tried:

size = tf.size(y_pred)
indice_of_ranks_pred = tf.nn.top_k(y_pred, k=size)[1]
indice_of_ranks_label = tf.nn.top_k(y_true, k=size)[1]
rank_pred = tf.nn.top_k(-indice_of_ranks_pred, k=size)[1]
rank_label = tf.nn.top_k(-indice_of_ranks_label, k=size)[1]
rank_pred = tf.to_float(rank_pred)
rank_label = tf.to_float(rank_label)
spearman = tf.contrib.metrics.streaming_pearson_correlation(rank_pred, rank_label)


But this raises the following error:

tensorflow.python.framework.errors_impl.InvalidArgumentError: input must have at least k columns. Had 1, needed 32

[[{{node metrics/spearman/TopKV2}} = TopKV2[T=DT_FLOAT, sorted=true, _device="/job:localhost/replica:0/task:0/device:CPU:0"](lambda_1/add, metrics/pearson/pearson_r/variance_predictions/Size)]]
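
The message suggests that `tf.nn.top_k` is sorting along the last axis, which has only 1 column here, while `k` is the total number of elements (32). Below is a minimal sketch of a rank-based alternative that flattens the inputs first. It assumes TensorFlow 2.x with eager execution; `to_ranks` and `spearman_tf` are hypothetical helper names, and tied values get distinct ranks rather than the averaged ranks that `scipy.stats.spearmanr` uses.

```python
import tensorflow as tf

def to_ranks(x):
    # Flatten so sorting runs over all samples, not over a last axis of size 1.
    flat = tf.reshape(x, [-1])
    # argsort gives the sorting permutation; argsort of that permutation
    # gives the 0-based rank of every element (ties are NOT averaged).
    order = tf.argsort(flat)
    ranks = tf.argsort(order)
    return tf.cast(ranks, tf.float32)

def spearman_tf(y_true, y_pred):
    # Spearman correlation is the Pearson correlation of the ranks.
    r_true = to_ranks(y_true)
    r_pred = to_ranks(y_pred)
    r_true = r_true - tf.reduce_mean(r_true)
    r_pred = r_pred - tf.reduce_mean(r_pred)
    cov = tf.reduce_sum(r_true * r_pred)
    denom = tf.sqrt(tf.reduce_sum(tf.square(r_true)) *
                    tf.reduce_sum(tf.square(r_pred)))
    return cov / denom

# Example with column-shaped inputs, as a Keras model might produce.
y_true = tf.constant([[1.0], [2.0], [3.0], [4.0]])
y_pred = tf.constant([[0.1], [0.4], [0.3], [0.9]])
rho = spearman_tf(y_true, y_pred)
```

Because this stays inside TensorFlow ops, it does not need SciPy, but it only matches SciPy's result when there are no ties.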

One thing you can do is use TensorFlow's tf.py_function together with scipy.stats.spearmanr, defining the inputs and outputs like this:

import tensorflow as tf
from scipy.stats import spearmanr

def get_spearman_rankcor(y_true, y_pred):
    # Call SciPy's spearmanr on the float32-cast tensors through tf.py_function
    return tf.py_function(spearmanr, [tf.cast(y_pred, tf.float32),
                                      tf.cast(y_true, tf.float32)], Tout=tf.float32)
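
A slightly more explicit variant of the same idea is sketched below. It flattens the inputs and keeps only the correlation coefficient, since `spearmanr` also returns a p-value. It assumes TensorFlow 2.x and SciPy are installed; `_spearman_numpy` and `spearman_metric` are hypothetical names chosen for illustration.

```python
import tensorflow as tf
from scipy.stats import spearmanr

def _spearman_numpy(y_true, y_pred):
    # Inside tf.py_function the arguments are eager tensors,
    # so they can be converted to flat NumPy arrays.
    result = spearmanr(y_true.numpy().ravel(), y_pred.numpy().ravel())
    # spearmanr returns (correlation, p-value); keep only the correlation.
    return float(result[0])

def spearman_metric(y_true, y_pred):
    # Wrap the NumPy/SciPy computation so it can be called on tensors.
    return tf.py_function(
        _spearman_numpy,
        [tf.cast(y_true, tf.float32), tf.cast(y_pred, tf.float32)],
        Tout=tf.float32,
    )

# Example call on small tensors.
rho = spearman_metric(tf.constant([1.0, 2.0, 3.0, 4.0]),
                      tf.constant([0.1, 0.4, 0.3, 0.9]))
```

Since the function takes `(y_true, y_pred)`, it could presumably be passed in the `metrics` list of `model.compile`, though the value is then computed per batch and runs in Python rather than in the graph.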

