Is there a detailed description anywhere of the evaluation stats reported by “RegressionEvaluation”? I got the following output and I am confused about the RSE and PC values:
Column MSE MAE RMSE RSE PC R^2
col_0 1.03810e-02 1.01887e-01 1.01887e-01 Infinity -2.57662e+00 -9.70559e+06
col_1 2.37513e-02 1.54115e-01 1.54115e-01 Infinity NaN -2.22060e+07
col_2 1.78609e-02 1.33645e-01 1.33645e-01 Infinity NaN -1.66988e+07
col_3 5.25208e-02 2.29174e-01 2.29174e-01 Infinity 9.62097e-01 -4.91037e+07
col_4 5.27693e-02 2.29716e-01 2.29716e-01 Infinity 1.77998e+00 -4.93360e+07
col_5 2.19545e-02 1.48170e-01 1.48170e-01 Infinity NaN -2.05260e+07
col_6 3.32210e-02 1.82266e-01 1.82266e-01 Infinity NaN -3.10596e+07
col_7 3.73992e-02 1.93389e-01 1.93389e-01 Infinity NaN -3.49659e+07
col_8 2.33758e-02 1.52891e-01 1.52891e-01 Infinity NaN -2.18549e+07
col_9 2.57782e-02 1.60556e-01 1.60556e-01 Infinity NaN -2.41010e+07
......
@clasch-student - I would recommend normalizing your labels if you aren’t already. This is a configuration option on the normalizer. You can set it on any of the normalizers:
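For readers unsure what normalizing the labels means in practice, here is a minimal sketch in plain Java. It does not use the library's normalizer classes; `LabelStandardizerSketch` and its methods are hypothetical names made up for illustration. The idea is to rescale the labels to zero mean and unit variance before training, then map predictions back to the original scale.

```java
/** Hypothetical helper for illustration only; not a library class. */
final class LabelStandardizerSketch {
    private double mean;
    private double std = 1.0;

    /** Learn the mean and (population) standard deviation of the labels. */
    void fit(double[] labels) {
        double sum = 0;
        for (double y : labels) {
            sum += y;
        }
        mean = sum / labels.length;

        double squaredDeviation = 0;
        for (double y : labels) {
            squaredDeviation += (y - mean) * (y - mean);
        }
        std = Math.sqrt(squaredDeviation / labels.length);
        // Assumption for this sketch: fall back to 1.0 so constant labels
        // do not cause a division by zero.
        if (std == 0.0) {
            std = 1.0;
        }
    }

    /** Map labels to zero mean and unit variance. */
    double[] transform(double[] labels) {
        double[] out = new double[labels.length];
        for (int i = 0; i < labels.length; i++) {
            out[i] = (labels[i] - mean) / std;
        }
        return out;
    }

    /** Map normalized values (for example, predictions) back to the original scale. */
    double[] revert(double[] normalized) {
        double[] out = new double[normalized.length];
        for (int i = 0; i < normalized.length; i++) {
            out[i] = normalized[i] * std + mean;
        }
        return out;
    }
}
```

If you evaluate on normalized labels, the error values are in normalized units, so revert the predictions first if you want errors on the original scale.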
@clasch-student That can generally be a sign of overfitting. Still, it shouldn’t output Infinity. Do you mind sending me something that reproduces this, so I can see whether there are any edge cases to work around in our error calculations?
The Pearson coefficient, I guess, may turn out to be NaN when you get a negative number under the square root in the calculation:
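To make that concrete, here is a rough sketch of how these two statistics are commonly computed. This is not the actual RegressionEvaluation source: `RegressionStatsSketch` is a hypothetical class, and the single-pass, running-sums form of the Pearson formula is an assumption about the style of calculation. It shows two edge cases: a column whose labels are all identical gives a zero denominator (RSE becomes Infinity, PC becomes 0/0 = NaN), and the terms under the square roots are non-negative mathematically but can come out slightly negative through rounding, which also yields NaN.

```java
/** Hypothetical helper for illustration; not the actual RegressionEvaluation source. */
final class RegressionStatsSketch {

    /** Pearson correlation computed from running sums (single-pass form). */
    static double pearson(double[] labels, double[] preds) {
        int n = labels.length;
        double sumX = 0, sumY = 0, sumXY = 0, sumXX = 0, sumYY = 0;
        for (int i = 0; i < n; i++) {
            sumX += labels[i];
            sumY += preds[i];
            sumXY += labels[i] * preds[i];
            sumXX += labels[i] * labels[i];
            sumYY += preds[i] * preds[i];
        }
        double numerator = n * sumXY - sumX * sumY;
        // Each term is n^2 times a variance. Mathematically it is >= 0, but
        // rounding can make it slightly negative, and Math.sqrt of a negative
        // number is NaN. A zero term makes the whole denominator zero.
        double varTermX = n * sumXX - sumX * sumX;
        double varTermY = n * sumYY - sumY * sumY;
        return numerator / (Math.sqrt(varTermX) * Math.sqrt(varTermY));
    }

    /** Relative squared error: squared error over the labels' squared deviation from their mean. */
    static double relativeSquaredError(double[] labels, double[] preds) {
        double mean = 0;
        for (double y : labels) {
            mean += y;
        }
        mean /= labels.length;

        double squaredError = 0, labelDeviation = 0;
        for (int i = 0; i < labels.length; i++) {
            squaredError += (preds[i] - labels[i]) * (preds[i] - labels[i]);
            labelDeviation += (labels[i] - mean) * (labels[i] - mean);
        }
        // In floating point, a positive number divided by 0.0 is Infinity.
        return squaredError / labelDeviation;
    }

    public static void main(String[] args) {
        double[] constantLabels = {0.5, 0.5, 0.5, 0.5}; // every label identical
        double[] preds = {0.25, 0.75, 0.5, 1.0};
        System.out.println("RSE = " + relativeSquaredError(constantLabels, preds));
        System.out.println("PC  = " + pearson(constantLabels, preds));
    }
}
```

With the constant labels in `main`, RSE comes out as Infinity and PC as NaN. So one thing worth checking is whether the labels in each column actually vary across the evaluated examples. Labels with almost no variance would also fit the huge negative R^2 values, but that is a guess until the calculation is stepped through.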
As it looks like @agibsonccc is going to be busy for some more time, I suggest you put a breakpoint on those calculations and try to figure out what exactly is causing those values. Then maybe we can help you work around it.