Meaning of infinite RSE and NaN PC in Regression Evaluation Stats

Hi all

Is there a detailed description anywhere of the evaluation stats produced by “RegressionEvaluation”? I got the following output and I am confused about the RSE and PC values:

Column   MSE           MAE           RMSE          RSE        PC             R^2
col_0    1.03810e-02   1.01887e-01   1.01887e-01   Infinity   -2.57662e+00   -9.70559e+06
col_1    2.37513e-02   1.54115e-01   1.54115e-01   Infinity   NaN            -2.22060e+07
col_2    1.78609e-02   1.33645e-01   1.33645e-01   Infinity   NaN            -1.66988e+07
col_3    5.25208e-02   2.29174e-01   2.29174e-01   Infinity   9.62097e-01    -4.91037e+07
col_4    5.27693e-02   2.29716e-01   2.29716e-01   Infinity   1.77998e+00    -4.93360e+07
col_5    2.19545e-02   1.48170e-01   1.48170e-01   Infinity   NaN            -2.05260e+07
col_6    3.32210e-02   1.82266e-01   1.82266e-01   Infinity   NaN            -3.10596e+07
col_7    3.73992e-02   1.93389e-01   1.93389e-01   Infinity   NaN            -3.49659e+07
col_8    2.33758e-02   1.52891e-01   1.52891e-01   Infinity   NaN            -2.18549e+07
col_9    2.57782e-02   1.60556e-01   1.60556e-01   Infinity   NaN            -2.41010e+07
......

@agibsonccc, @AlexBlack: Maybe you guys can help?

@clasch-student - I would recommend normalizing your labels if you aren’t already. This is a configuration option on the normalizer. You can set it on any of the normalizers:

Hi @agibsonccc

I already do normalization on the labels …

Can you advise more about the outputs regarding RSE and PC?

@clasch-student that can generally be a sign of overfitting… still, it shouldn’t output infinity. Do you mind sending me something to reproduce this, so I can see if there are any edge cases to work around in our error calculations?

@agibsonccc hmm maybe yes…

Please find here an example: https://drive.google.com/file/d/13uZq921hNXAOp60QkQ3ZCl1lHJXLKCNb/view?usp=sharing

Just download, unzip and import it into an IDE (exported it from IntelliJ).

Run the file LSTMSequencePredictionExample.java and check the output in the console.

Let me know the results of your analysis and thanks in advance :)

@agibsonccc Have you already had time?

@agibsonccc Did it work? Sorry for stressing

RSE is sometimes explicitly set to infinity:
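
To illustrate the idea, here is a minimal plain-Java sketch of a relative squared error with such a guard. It is not DL4J's actual implementation: the class name `RseSketch`, the method name, and the `EPS` threshold are all assumptions made up for this example. RSE divides the squared prediction error by the squared deviation of the labels from their mean, so a label column with (almost) no variance leaves nothing meaningful to divide by.

```java
import java.util.Arrays;

final class RseSketch {
    // Hypothetical threshold, chosen for illustration only.
    static final double EPS = 1e-5;

    /** RSE = sum((pred - label)^2) / sum((label - mean(label))^2). */
    static double relativeSquaredError(double[] labels, double[] predictions) {
        double mean = Arrays.stream(labels).average().orElse(Double.NaN);
        double numerator = 0.0;   // squared prediction error
        double denominator = 0.0; // squared deviation of labels from their mean
        for (int i = 0; i < labels.length; i++) {
            double err = predictions[i] - labels[i];
            double dev = labels[i] - mean;
            numerator += err * err;
            denominator += dev * dev;
        }
        // Guard against dividing by (almost) zero: report infinity instead.
        return Math.abs(denominator) > EPS
                ? numerator / denominator
                : Double.POSITIVE_INFINITY;
    }

    public static void main(String[] args) {
        double[] predictions = {0.4, 0.6, 0.5, 0.7};
        double[] varyingLabels = {0.1, 0.9, 0.3, 0.8};
        double[] constantLabels = {0.5, 0.5, 0.5, 0.5};
        // Labels with real spread give a finite ratio.
        System.out.println(relativeSquaredError(varyingLabels, predictions));
        // Constant labels make the denominator zero, so the guard kicks in.
        System.out.println(relativeSquaredError(constantLabels, predictions));
    }
}
```

Under the usual definitions R^2 = 1 - RSE, so a label column whose variance is tiny compared to the error would also be consistent with very large negative R^2 values. That is only a hint at where to look, not a diagnosis of your data.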

The Pearson coefficient, I guess, may turn out to be NaN when you get negative numbers in the sqrt part of the calculation:
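
As a rough sketch of how that can happen, the plain-Java example below uses the textbook one-pass formula for the Pearson correlation built from running sums. This is an assumption about the general shape of such a calculation, not a copy of the library code, and the names `PearsonSketch`, `pearsonFromSums`, and `pearson` are hypothetical. In exact arithmetic the variance terms are never negative, but with floating-point sums they can round to slightly below zero, and `Math.sqrt` of a negative number is NaN. A column with exactly zero variance gives 0.0 / 0.0, which is NaN as well.

```java
final class PearsonSketch {
    /** One-pass Pearson correlation from running sums (textbook formula). */
    static double pearsonFromSums(long n, double sumX, double sumY,
                                  double sumXX, double sumYY, double sumXY) {
        double covTerm = n * sumXY - sumX * sumY;
        double varTermX = n * sumXX - sumX * sumX; // >= 0 in exact arithmetic
        double varTermY = n * sumYY - sumY * sumY; // >= 0 in exact arithmetic
        // Math.sqrt returns NaN for a negative argument; 0.0 / 0.0 is NaN too.
        return covTerm / (Math.sqrt(varTermX) * Math.sqrt(varTermY));
    }

    /** Accumulates the running sums for two columns and applies the formula. */
    static double pearson(double[] x, double[] y) {
        double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
        for (int i = 0; i < x.length; i++) {
            sx += x[i];
            sy += y[i];
            sxx += x[i] * x[i];
            syy += y[i] * y[i];
            sxy += x[i] * y[i];
        }
        return pearsonFromSums(x.length, sx, sy, sxx, syy, sxy);
    }

    public static void main(String[] args) {
        double[] predictions = {0.4, 0.6, 0.5, 0.7};
        double[] constantLabels = {0.5, 0.5, 0.5, 0.5};
        // Zero label variance: numerator and denominator are both zero.
        System.out.println(pearson(constantLabels, predictions));
        // Simulated rounding error: the x variance term is slightly negative.
        System.out.println(pearsonFromSums(4, 2.0, 2.2, 1.0 - 1e-12, 1.26, 1.1));
    }
}
```

A Pearson coefficient always lies in [-1, 1] by definition, so the values outside that range in the posted table (col_0 and col_4) may point in the same direction: the variance terms are so close to zero that rounding dominates the result.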

As it looks like @agibsonccc is going to be busy for some more time, I suggest you put a breakpoint on those calculations and try to figure out what exactly is causing those values; then maybe we can help you work around it.

Echoing @treo (thanks for being patient!): if you can meet us in the middle with the breakpoint, that would be super helpful.