My biggest trouble all along has not been understanding how matrix/algebra operations work, but understanding the nomenclature used in dl4j/nd4j and how those operations are invoked there. Add to that the fact that, understandably given the massive volume of information, some of the documentation contains errors.
I make all of this worse by not taking the time to really understand how you have implemented the matrix/algebra operations. I am trying to slow down.
I have corrected the embarrassing error I made in my first attempt to use permute. Now, the permute/reshape pairing works.
Now, when running the code, there are two problems:
- After about half a dozen datasets, all values turn to NaN.
- When processing the 17th dataset, for some reason unknown to me, the value of dim0 (miniBatchSize) changes from 32 to 22 and execution fails.
Below I have posted the last portion of the log, followed by the latest version of the code.
Please let me know what you think is going on and what I should be doing next.
Thanks.
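One more note before the log: here is my current guess at the arithmetic behind the second failure, based on the reshape error at the bottom of the log. This is only my reasoning, not something I have confirmed. The 17th dataset looks like a final, partial minibatch of 22 examples, while the graph still hardcodes miniBatchSize = 32 in a reshape, so the element counts stop matching:

// A minimal sketch of the mismatch I think the log shows. The values 22, 19 and 2
// are the last batch's shape taken from the log; 32 and 2 are the miniBatchSize
// and labelCount hardcoded in my graph.
public class ReshapeMismatchSketch {
    public static void main(String[] args) {
        long lastBatch = 22, timeSteps = 19, nOut = 2;
        long actualLength = lastBatch * timeSteps * nOut;                      // 836 elements actually present
        long hardcodedBatch = 32, labelCount = 2;
        long inferredTimeSteps = actualLength / (hardcodedBatch * labelCount); // the -1 dim is inferred as 13
        long expectedLength = hardcodedBatch * inferredTimeSteps * labelCount; // 32 * 13 * 2 = 832
        System.out.println(actualLength + " vs " + expectedLength);            // "836 vs 832", as in the log
    }
}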
Printing sd information
--- Summary ---
Variables: 17 (13 with arrays)
Functions: 8
SameDiff Function Defs: 0
Loss function variables: [loss]
--- Variables ---
- Name - - Array Shape - - Variable Type - - Data Type- - Output Of Function - - Inputs To Functions -
b1 [2] VARIABLE FLOAT <none>
bias [8] VARIABLE FLOAT <none> [lstmLayer]
input [32, 6, 33] PLACEHOLDER FLOAT <none> [lstmLayer]
label [32, 2, 33] PLACEHOLDER FLOAT <none> [log_loss]
loss - ARRAY FLOAT log_loss(log_loss)
lstmLayer [32, 2, 33] ARRAY FLOAT lstmLayer(lstmLayer) [permute]
lstmLayer:1 [32, 2] ARRAY FLOAT lstmLayer(lstmLayer)
matmul [1056, 2] ARRAY FLOAT matmul(matmul) [reshape_1]
out - ARRAY FLOAT softmax(softmax) [log_loss]
permute [32, 33, 2] ARRAY FLOAT permute(permute) [reshape]
permute_1 [32, 2, 33] ARRAY FLOAT permute_1(permute) [softmax]
rWeights [2, 8] VARIABLE FLOAT <none> [lstmLayer]
reshape [1056, 2] ARRAY FLOAT reshape(reshape) [matmul]
reshape_1 [32, 33, 2] ARRAY FLOAT reshape_1(reshape) [permute_1]
sd_var [] CONSTANT FLOAT <none> [log_loss]
w1 [2, 2] VARIABLE FLOAT <none> [matmul]
weights [6, 8] VARIABLE FLOAT <none> [lstmLayer]
--- Functions ---
- Function Name - - Op - - Inputs - - Outputs -
0 lstmLayer LSTMLayer [input, weights, rWeights, bias] [lstmLayer, lstmLayer:1]
1 permute Permute [lstmLayer] [permute]
2 reshape Reshape [permute] [reshape]
3 matmul Mmul [reshape, w1] [matmul]
4 reshape_1 Reshape [matmul] [reshape_1]
5 permute_1 Permute [reshape_1] [permute_1]
6 softmax SoftMax [permute_1] [out]
7 log_loss LogLoss [out, sd_var, label] [loss]
Added differentiated op log_loss
Added differentiated op softmax
Added differentiated op permute_1
Added differentiated op reshape_1
Added differentiated op matmul
Added differentiated op reshape
Added differentiated op permute
Added differentiated op lstmLayer
.
.
.
Executing op: [lstmLayer]
About to get variable in execute output
node_1:0 result shape: [32, 2, 43]; dtype: FLOAT; first values [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
About to get variable in execute output
node_1:1 result shape: [32, 2]; dtype: FLOAT; first values [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Debug info for node_2 input[0]; shape: [32, 2, 43]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Removing variable <1:0>
Executing op: [permute]
About to get variable in execute output
node_1:0 result shape: [32, 43, 2]; dtype: FLOAT; first values [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Removing variable <1:0>
Executing op: [zeros_as]
About to get variable in execute output
node_1:0 result shape: [32, 2]; dtype: FLOAT; first values [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Debug info for node_2 input[0]; shape: [32, 43, 2]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Reshape: Optional reshape arg was -99
Removing variable <1:0>
Executing op: [reshape]
Reshape: Optional reshape arg was -99
Reshape: new shape: {1376, 2}
About to get variable in execute output
node_1:0 result shape: [1376, 2]; dtype: FLOAT; first values [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Removing variable <1:0>
Executing op: [shape_of]
About to get variable in execute output
node_1:0 result shape: [3]; dtype: INT64; first values [32, 43, 2]
Removing variable <1:0>
Removing variable <1:1>
Executing op: [matmul]
About to get variable in execute output
node_1:0 result shape: [1376, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[0]; shape: [1376, 2]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Reshape: Optional reshape arg was -99
Removing variable <1:0>
Executing op: [reshape]
Reshape: Optional reshape arg was -99
Reshape: new shape: {32, 43, 2}
About to get variable in execute output
node_1:0 result shape: [32, 43, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Removing variable <1:0>
Executing op: [shape_of]
About to get variable in execute output
node_1:0 result shape: [2]; dtype: INT64; first values [1376, 2]
Debug info for node_2 input[0]; shape: [32, 43, 2]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Removing variable <1:0>
Executing op: [permute]
About to get variable in execute output
node_1:0 result shape: [32, 2, 43]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Removing variable <1:0>
Executing op: [softmax]
About to get variable in execute output
node_1:0 result shape: [32, 2, 43]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Removing variable <1:0>
Removing variable <1:1>
Removing variable <1:2>
Executing op: [log_loss_grad]
About to get variable in execute output
node_1:0 result shape: [32, 2, 43]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
About to get variable in execute output
node_1:1 result shape: []; dtype: FLOAT; first values [-nan]
About to get variable in execute output
node_1:2 result shape: [32, 2, 43]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Removing variable <1:0>
Removing variable <1:1>
Executing op: [softmax_bp]
About to get variable in execute output
node_1:0 result shape: [32, 2, 43]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[0]; shape: [32, 2, 43]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Removing variable <1:0>
Executing op: [permute]
About to get variable in execute output
node_1:0 result shape: [32, 43, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[0]; shape: [32, 43, 2]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[1]; shape: [2]; ews: [1]; order: [c]; dtype: [INT64]; first values: [1376, 2]
Reshape: Optional reshape arg was 1376
Removing variable <1:0>
Removing variable <1:1>
Executing op: [reshape]
Reshape: Optional reshape arg was 1376
Reshape: new shape: {1376, 2}
About to get variable in execute output
node_1:0 result shape: [1376, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Removing variable <1:0>
Removing variable <1:1>
Removing variable <1:2>
Executing op: [matmul_bp]
Executing op: [matmul]
About to get variable in execute output
node_1:0 result shape: [1376, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Executing op: [matmul]
About to get variable in execute output
node_1:0 result shape: [2, 2]; dtype: FLOAT; first values [nan, nan, nan, nan]
About to get variable in execute output
node_1:0 result shape: [1376, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
About to get variable in execute output
node_1:1 result shape: [2, 2]; dtype: FLOAT; first values [nan, nan, nan, nan]
Executing op: [adam_updater]
About to get variable in execute output
node_1:0 result shape: [2, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan]
About to get variable in execute output
node_1:1 result shape: [2, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan]
About to get variable in execute output
node_1:2 result shape: [2, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan]
Executing op: [subtract]
About to get variable in execute output
node_1:0 result shape: [2, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan]
Debug info for node_2 input[0]; shape: [1376, 2]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[1]; shape: [3]; ews: [1]; order: [c]; dtype: [INT64]; first values: [32, 43, 2]
Reshape: Optional reshape arg was 32
Removing variable <1:0>
Removing variable <1:1>
Executing op: [reshape]
Reshape: Optional reshape arg was 32
Reshape: new shape: {32, 43, 2}
About to get variable in execute output
node_1:0 result shape: [32, 43, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[0]; shape: [32, 43, 2]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Removing variable <1:0>
Executing op: [permute]
About to get variable in execute output
node_1:0 result shape: [32, 2, 43]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[0]; shape: [32, 6, 43]; ews: [1]; order: [f]; dtype: [FLOAT]; first values: [-0.226728, -0.226728, -0.129559, -0.097169, -0.161948, -0.161948, -0.161948, -0.161948, -0.161948, -0.161948, -0.097169, -0, -0, -0, -0, -0]
Debug info for node_2 input[1]; shape: [6, 8]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[2]; shape: [2, 8]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[3]; shape: [8]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Removing variable <1:0>
Removing variable <1:1>
Removing variable <1:2>
Removing variable <1:3>
Removing variable <1:4>
Removing variable <1:5>
Executing op: [lstmLayer_bp]
About to get variable in execute output
node_1:0 result shape: [32, 6, 43]; dtype: FLOAT; first values [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
About to get variable in execute output
node_1:1 result shape: [6, 8]; dtype: FLOAT; first values [-nan, -nan, nan, nan, nan, nan, nan, nan, -nan, -nan, nan, nan, nan, nan, nan, nan, -nan, -nan, nan, nan, nan, nan, nan, nan, -nan, -nan, nan, nan, nan, nan, nan, nan]
About to get variable in execute output
node_1:2 result shape: [2, 8]; dtype: FLOAT; first values [-nan, -nan, nan, nan, nan, nan, nan, nan, -nan, -nan, nan, nan, nan, nan, nan, nan]
About to get variable in execute output
node_1:3 result shape: [8]; dtype: FLOAT; first values [-nan, -nan, nan, nan, nan, nan, nan, nan]
Executing op: [adam_updater]
About to get variable in execute output
node_1:0 result shape: [6, 8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
About to get variable in execute output
node_1:1 result shape: [6, 8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
About to get variable in execute output
node_1:2 result shape: [6, 8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Executing op: [subtract]
About to get variable in execute output
node_1:0 result shape: [6, 8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Executing op: [adam_updater]
About to get variable in execute output
node_1:0 result shape: [2, 8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
About to get variable in execute output
node_1:1 result shape: [2, 8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
About to get variable in execute output
node_1:2 result shape: [2, 8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Executing op: [subtract]
About to get variable in execute output
node_1:0 result shape: [2, 8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Executing op: [adam_updater]
About to get variable in execute output
node_1:0 result shape: [8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
About to get variable in execute output
node_1:1 result shape: [8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
About to get variable in execute output
node_1:2 result shape: [8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Executing op: [subtract]
About to get variable in execute output
node_1:0 result shape: [8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[0]; shape: [22, 6, 19]; ews: [1]; order: [f]; dtype: [FLOAT]; first values: [-0.0647794, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0]
Debug info for node_2 input[1]; shape: [6, 8]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[2]; shape: [2, 8]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Removing variable <1:0>
Removing variable <1:1>
Removing variable <1:2>
Removing variable <1:3>
Executing op: [lstmLayer]
About to get variable in execute output
node_1:0 result shape: [22, 2, 19]; dtype: FLOAT; first values [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
About to get variable in execute output
node_1:1 result shape: [22, 2]; dtype: FLOAT; first values [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Debug info for node_2 input[0]; shape: [22, 2, 19]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Removing variable <1:0>
Executing op: [permute]
About to get variable in execute output
node_1:0 result shape: [22, 19, 2]; dtype: FLOAT; first values [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Removing variable <1:0>
Executing op: [zeros_as]
About to get variable in execute output
node_1:0 result shape: [22, 2]; dtype: FLOAT; first values [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Debug info for node_2 input[0]; shape: [22, 19, 2]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Reshape: Optional reshape arg was -99
Removing variable <1:0>
Executing op: [reshape]
Reshape: Optional reshape arg was -99
Reshape: new shape: {418, 2}
About to get variable in execute output
node_1:0 result shape: [418, 2]; dtype: FLOAT; first values [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Removing variable <1:0>
Executing op: [shape_of]
About to get variable in execute output
node_1:0 result shape: [3]; dtype: INT64; first values [22, 19, 2]
Removing variable <1:0>
Removing variable <1:1>
Executing op: [matmul]
About to get variable in execute output
node_1:0 result shape: [418, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[0]; shape: [418, 2]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Reshape: Optional reshape arg was -99
Error at [/home/runner/work/deeplearning4j/deeplearning4j/libnd4j/include/ops/declarable/generic/shape/reshape.cpp:163:0]:
Reshape: lengths before and after reshape should match, but got 836 vs 832
Removing variable <1:0>
Exception in thread "main" java.lang.RuntimeException: Op reshape with name reshape_1 failed to execute. Here is the error from c++: Op validation failed
at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.calculateOutputShape(NativeOpExecutioner.java:1672)
at org.nd4j.linalg.api.ops.DynamicCustomOp.calculateOutputShape(DynamicCustomOp.java:696)
at org.nd4j.autodiff.samediff.internal.InferenceSession.getAndParameterizeOp(InferenceSession.java:1363)
at org.nd4j.autodiff.samediff.internal.InferenceSession.getAndParameterizeOp(InferenceSession.java:68)
at org.nd4j.autodiff.samediff.internal.AbstractSession.output(AbstractSession.java:531)
at org.nd4j.autodiff.samediff.internal.AbstractSession.output(AbstractSession.java:154)
at org.nd4j.autodiff.samediff.internal.TrainingSession.trainingIteration(TrainingSession.java:129)
at org.nd4j.autodiff.samediff.SameDiff.fitHelper(SameDiff.java:1936)
at org.nd4j.autodiff.samediff.SameDiff.fit(SameDiff.java:1792)
at org.nd4j.autodiff.samediff.SameDiff.fit(SameDiff.java:1732)
at org.nd4j.autodiff.samediff.config.FitConfig.exec(FitConfig.java:172)
at org.nd4j.autodiff.samediff.SameDiff.fit(SameDiff.java:1712)
at org.deeplearning4j.examples.quickstart.modeling.recurrent.LocationNextNeuralNetworkV6_03.sameDiff3(LocationNextNeuralNetworkV6_03.java:240)
at org.deeplearning4j.examples.quickstart.modeling.recurrent.LocationNextNeuralNetworkV6_03.main(LocationNextNeuralNetworkV6_03.java:141)
private static int nIn = 6;
private static int nOut = 2;
private static int labelCount = 2;
private static int miniBatchSize = 32;
private static int numLabelClasses = -1;
private static SameDiff sd = SameDiff.create();
private static long dim0 = 0L;
private static long dim1 = 0L;
private static long dim2 = 0L;
private static Map<String,INDArray> placeholderData = new HashMap<>();
private static DataSet t;
public static void sameDiff3() throws IOException, InterruptedException
{
// ----- Load the training data -----
trainFeatures = new CSVSequenceRecordReader();
trainFeatures.initialize(new NumberedFileInputSplit(featuresDirTrain.getAbsolutePath() + "/%d.csv", 0, lastTrainCount));
trainLabels = new CSVSequenceRecordReader();
trainLabels.initialize(new NumberedFileInputSplit(labelsDirTrain.getAbsolutePath() + "/%d.csv", 0, lastTrainCount));
trainData = new SequenceRecordReaderDataSetIterator(trainFeatures, trainLabels, miniBatchSize, numLabelClasses,
true, SequenceRecordReaderDataSetIterator.AlignmentMode.ALIGN_END);
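// Note to self: if the number of sequences is not a multiple of miniBatchSize,
// I believe this iterator returns a smaller final minibatch (which would explain
// the batch of 22 in the log above).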
// ----- Load the test data -----
//Same process as for the training data.
testFeatures = new CSVSequenceRecordReader();
testFeatures.initialize(new NumberedFileInputSplit(featuresDirTest.getAbsolutePath() + "/%d.csv", 0, lastTestCount));
testLabels = new CSVSequenceRecordReader();
testLabels.initialize(new NumberedFileInputSplit(labelsDirTest.getAbsolutePath() + "/%d.csv", 0, lastTestCount));
testData = new SequenceRecordReaderDataSetIterator(testFeatures, testLabels, miniBatchSize, numLabelClasses,
true, SequenceRecordReaderDataSetIterator.AlignmentMode.ALIGN_END);
normalizer = new NormalizerStandardize();
normalizer.fitLabel(true);
normalizer.fit(trainData); //Collect the statistics (mean/stdev) from the training data. This does not modify the input data
trainData.reset();
while(trainData.hasNext()) {
normalizer.transform(trainData.next()); //Apply normalization to the training data
}
while(testData.hasNext()) {
normalizer.transform(testData.next()); //Apply normalization to the test data. This is using statistics calculated from the *training* set
}
trainData.reset();
testData.reset();
trainData.setPreProcessor(normalizer);
testData.setPreProcessor(normalizer);
System.out.println(" Printing traindata dataset shape - 1");
DataSet data = trainData.next();
System.out.println(Arrays.toString(data.getFeatures().shape()));
System.out.println(" Printing testdata dataset shape - 1");
DataSet data2 = testData.next();
System.out.println(Arrays.toString(data2.getFeatures().shape()));
trainData.reset();
testData.reset();
UIServer uiServer = UIServer.getInstance();
StatsStorage statsStorage = new InMemoryStatsStorage();
uiServer.attach(statsStorage);
int listenerFrequency = 1;
sd.setListeners(new ScoreListener());
t = trainData.next();
dim0 = t.getFeatures().size(0);
dim1 = t.getFeatures().size(1);
dim2 = t.getFeatures().size(2);
System.out.println(" features - dim0 - 0 - "+dim0);
System.out.println(" features - dim1 - 0 - "+dim1);
System.out.println(" features - dim2 - 0 - "+dim2);
trainData.reset();
getConfiguration();
int nEpochs = 4;
for (int i = 0; i < nEpochs; i++) {
Log.info("Epoch " + i + " starting. ");
History history = sd.fit(trainData, 1);
trainData.reset();
Log.info("Epoch " + i + " completed. ");
}
System.out.println(" Starting test data evaluation --- ");
//Evaluate on test set:
String outputVariable = "out";
Evaluation evaluation = new Evaluation();
sd.evaluate(testData, outputVariable, evaluation);
//Print evaluation statistics:
System.out.println(" evaluation.stats() - "+evaluation.stats());
String pathToSavedNetwork = "src/main/assets/location_next_neural_network_v6_03.zip";
File savedNetwork = new File(pathToSavedNetwork);
sd.save(savedNetwork, true);
// ModelSerializer.addNormalizerToModel(savedNetwork, normalizer);
System.out.println("----- Example Complete -----");
//Save the trained network for inference - FlatBuffers format
File saveFileForInference = new File("src/main/assets/sameDiffExampleInference.fb");
try {
sd.asFlatFile(saveFileForInference);
} catch (IOException e) {
throw new RuntimeException(e);
}
}
private static void getConfiguration()
{
placeholderData = new HashMap<>();
//Create input and label variables. Note that both placeholders hardcode the batch dimension to miniBatchSize (32).
SDVariable input = sd.placeHolder("input", DataType.FLOAT, miniBatchSize, nIn, -1);
SDVariable label = sd.placeHolder("label", DataType.FLOAT, miniBatchSize, nOut, -1);
placeholderData.put("input", t.getFeatures());
placeholderData.put("label", t.getLabels());
LSTMLayerConfig mLSTMConfiguration = LSTMLayerConfig.builder()
.lstmdataformat(LSTMDataFormat.NST)
.directionMode(LSTMDirectionMode.FWD)
.gateAct(LSTMActivations.SIGMOID)
.cellAct(LSTMActivations.SOFTPLUS)
.outAct(LSTMActivations.SOFTPLUS)
.retFullSequence(true)
.retLastC(false)
.retLastH(true)
.build();
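// NST data format: as I understand it, inputs are [miniBatchSize, nIn, timeSteps],
// which matches the [32, 6, 33] input shape in the summary above.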
LSTMLayerOutputs outputs = new LSTMLayerOutputs(sd.rnn.lstmLayer(
input,
LSTMLayerWeights.builder()
.weights(sd.var("weights", Nd4j.rand(DataType.FLOAT, nIn, 4 * nOut)))
.rWeights(sd.var("rWeights", Nd4j.rand(DataType.FLOAT, nOut, 4 * nOut)))
.bias(sd.var("bias", Nd4j.rand(DataType.FLOAT, 4 * nOut)))
.build(),
mLSTMConfiguration), mLSTMConfiguration);
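// These weights and biases are initialized with Nd4j.rand, i.e. uniform values in
// [0, 1). I do not know whether this, together with the SOFTPLUS activations, is
// related to the NaN problem, but it is on my list of things to check.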
// t.getFeatures().size(0) == input.getShape()[0] == miniBatchSize
// t.getFeatures().size(1) == input.getShape()[1] == nIn
// t.getFeatures().size(2) == input.getShape()[2] == TimeSteps
SDVariable layer0 = outputs.getOutput(); // full sequence output: [miniBatchSize, nOut, timeSteps], e.g. [32, 2, 33]
SDVariable layer0Permuted = sd.permute(layer0, 0, 2, 1); // [32, 33, 2]
SDVariable layer0PermutedReshaped = sd.reshape(layer0Permuted, -1, nOut); // [32*33, 2] = [1056, 2]
SDVariable w1 = sd.var("w1", new XavierInitScheme('c', nIn, nOut), DataType.FLOAT, nOut, labelCount);
SDVariable b1 = sd.var("b1", Nd4j.rand(DataType.FLOAT, labelCount)); // declared but never consumed (the summary above shows b1 feeding no functions)
SDVariable mmulOutput = layer0PermutedReshaped.mmul(w1); // [1056, 2]
SDVariable mmulOutputUnreshaped = sd.reshape(mmulOutput, miniBatchSize, -1, labelCount); // [32, 33, 2]; note the hardcoded miniBatchSize
SDVariable mmulOutputUnreshapedUnPermuted = sd.permute(mmulOutputUnreshaped, 0, 2, 1); // [32, 2, 33]
SDVariable out = sd.nn.softmax("out", mmulOutputUnreshapedUnPermuted);
SDVariable loss = sd.loss.logLoss("loss", label, out);
sd.setLossVariables("loss");
double learningRate = 1e-3;
TrainingConfig config = new TrainingConfig.Builder()
.l2(1e-4) //L2 regularization
.updater(new Adam(learningRate)) //Adam optimizer with specified learning rate
.dataSetFeatureMapping("input") //DataSet features array should be associated with variable "input"
.dataSetLabelMapping("label") //DataSet label array should be associated with variable "label"
.build();
sd.setTrainingConfig(config);
System.out.println(" Printing sd information");
System.out.println(sd.summary());
}
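For completeness, here is the direction I am considering for the dim0 problem: a variant of getConfiguration() in which nothing hardcodes miniBatchSize. This is only a sketch under two assumptions I have not verified yet, namely that placeholders accept -1 for the batch and time dimensions, and that sd.reshape() has an overload that takes an SDVariable shape; the getConfiguration2 name is just illustrative, and I dropped the unused b1:

private static void getConfiguration2()
{
    placeholderData = new HashMap<>();
    // Batch and time dimensions left as -1, so a smaller final minibatch should be legal
    SDVariable input = sd.placeHolder("input", DataType.FLOAT, -1, nIn, -1);
    SDVariable label = sd.placeHolder("label", DataType.FLOAT, -1, nOut, -1);
    LSTMLayerConfig mLSTMConfiguration = LSTMLayerConfig.builder()
            .lstmdataformat(LSTMDataFormat.NST)
            .directionMode(LSTMDirectionMode.FWD)
            .gateAct(LSTMActivations.SIGMOID)
            .cellAct(LSTMActivations.SOFTPLUS)
            .outAct(LSTMActivations.SOFTPLUS)
            .retFullSequence(true)
            .retLastC(false)
            .retLastH(true)
            .build();
    LSTMLayerOutputs outputs = new LSTMLayerOutputs(sd.rnn.lstmLayer(
            input,
            LSTMLayerWeights.builder()
                    .weights(sd.var("weights", Nd4j.rand(DataType.FLOAT, nIn, 4 * nOut)))
                    .rWeights(sd.var("rWeights", Nd4j.rand(DataType.FLOAT, nOut, 4 * nOut)))
                    .bias(sd.var("bias", Nd4j.rand(DataType.FLOAT, 4 * nOut)))
                    .build(),
            mLSTMConfiguration), mLSTMConfiguration);
    SDVariable layer0 = outputs.getOutput(); // [batch, nOut, timeSteps]
    SDVariable layer0Permuted = sd.permute(layer0, 0, 2, 1); // [batch, timeSteps, nOut]
    SDVariable layer0PermutedReshaped = sd.reshape(layer0Permuted, -1, nOut); // [batch*timeSteps, nOut]
    SDVariable w1 = sd.var("w1", new XavierInitScheme('c', nIn, nOut), DataType.FLOAT, nOut, labelCount);
    SDVariable mmulOutput = layer0PermutedReshaped.mmul(w1); // [batch*timeSteps, labelCount]
    // Reshape back using the runtime shape instead of the constant 32. Reusing
    // layer0Permuted's shape only works because labelCount == nOut (both 2) here;
    // otherwise the target shape would have to be assembled from its first two entries.
    SDVariable mmulOutputUnreshaped = sd.reshape(mmulOutput, sd.shape(layer0Permuted)); // [batch, timeSteps, labelCount]
    SDVariable mmulOutputUnreshapedUnPermuted = sd.permute(mmulOutputUnreshaped, 0, 2, 1); // [batch, labelCount, timeSteps]
    SDVariable out = sd.nn.softmax("out", mmulOutputUnreshapedUnPermuted);
    SDVariable loss = sd.loss.logLoss("loss", label, out);
    sd.setLossVariables("loss");
}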