My biggest difficulty all along has not been understanding how matrix/algebra operations work, but understanding the nomenclature used in dl4j/nd4j and what those operations are called there. On top of that, and understandably given the massive volume of information, some of the documentation contains errors.
I make all of this worse by not taking the time to really understand how you have implemented matrix/algebra operations. I am trying to slow down.
I have corrected the embarrassing error I made in my first attempt to use permute. Now, the permute/reshape pairing works.
Now, when running the code, there are two problems:
1. After about half a dozen datasets, all values turn to NaN.
2. When processing the 17th dataset, for some reason unknown to me, the value of dim0 (miniBatchSize) changes from 32 to 22 and execution fails.
Below I have posted the latest version of the code and the last portion of the log.
Please let me know what you think is going on and what I should be doing next.
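One possible explanation for the change from 32 to 22, offered as a guess rather than a diagnosis: a data set iterator normally returns a smaller final mini-batch when the number of sequences is not a multiple of miniBatchSize. The sketch below shows only that arithmetic. The total of 534 sequences is a hypothetical figure, chosen because 16 full batches of 32 plus a 17th batch of 22 add up to it; the class and method names are illustrative and not part of dl4j.

```java
final class LastBatchSize {

    /** Size of the batch at zero-based index batchIndex, or 0 if the data is already exhausted. */
    static int batchSizeAt(int totalExamples, int miniBatchSize, int batchIndex) {
        long start = (long) batchIndex * miniBatchSize;
        if (start >= totalExamples) {
            return 0;
        }
        return (int) Math.min(miniBatchSize, totalExamples - start);
    }

    public static void main(String[] args) {
        int totalExamples = 534; // hypothetical: 16 * 32 + 22
        int miniBatchSize = 32;  // value used in the posted code
        for (int index = 15; index <= 16; index++) {
            System.out.println("batch " + (index + 1) + " size = "
                    + batchSizeAt(totalExamples, miniBatchSize, index));
        }
    }
}
```

If that is what is happening, dim0 is only guaranteed to equal miniBatchSize for every batch except the last one.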
Printing sd information
--- Summary ---
Variables: 17 (13 with arrays)
Functions: 8
SameDiff Function Defs: 0
Loss function variables: [loss]
--- Variables ---
- Name - - Array Shape - - Variable Type - - Data Type- - Output Of Function - - Inputs To Functions -
b1 [2] VARIABLE FLOAT <none>
bias [8] VARIABLE FLOAT <none> [lstmLayer]
input [32, 6, 33] PLACEHOLDER FLOAT <none> [lstmLayer]
label [32, 2, 33] PLACEHOLDER FLOAT <none> [log_loss]
loss - ARRAY FLOAT log_loss(log_loss)
lstmLayer [32, 2, 33] ARRAY FLOAT lstmLayer(lstmLayer) [permute]
lstmLayer:1 [32, 2] ARRAY FLOAT lstmLayer(lstmLayer)
matmul [1056, 2] ARRAY FLOAT matmul(matmul) [reshape_1]
out - ARRAY FLOAT softmax(softmax) [log_loss]
permute [32, 33, 2] ARRAY FLOAT permute(permute) [reshape]
permute_1 [32, 2, 33] ARRAY FLOAT permute_1(permute) [softmax]
rWeights [2, 8] VARIABLE FLOAT <none> [lstmLayer]
reshape [1056, 2] ARRAY FLOAT reshape(reshape) [matmul]
reshape_1 [32, 33, 2] ARRAY FLOAT reshape_1(reshape) [permute_1]
sd_var [] CONSTANT FLOAT <none> [log_loss]
w1 [2, 2] VARIABLE FLOAT <none> [matmul]
weights [6, 8] VARIABLE FLOAT <none> [lstmLayer]
--- Functions ---
- Function Name - - Op - - Inputs - - Outputs -
0 lstmLayer LSTMLayer [input, weights, rWeights, bias] [lstmLayer, lstmLayer:1]
1 permute Permute [lstmLayer] [permute]
2 reshape Reshape [permute] [reshape]
3 matmul Mmul [reshape, w1] [matmul]
4 reshape_1 Reshape [matmul] [reshape_1]
5 permute_1 Permute [reshape_1] [permute_1]
6 softmax SoftMax [permute_1] [out]
7 log_loss LogLoss [out, sd_var, label] [loss]
Added differentiated op log_loss
Added differentiated op softmax
Added differentiated op permute_1
Added differentiated op reshape_1
Added differentiated op matmul
Added differentiated op reshape
Added differentiated op permute
Added differentiated op lstmLayer
Executing op: [lstmLayer]
About to get variable in execute output
node_1:0 result shape: [32, 2, 43]; dtype: FLOAT; first values [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
About to get variable in execute output
node_1:1 result shape: [32, 2]; dtype: FLOAT; first values [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Debug info for node_2 input[0]; shape: [32, 2, 43]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Removing variable <1:0>
Executing op: [permute]
About to get variable in execute output
node_1:0 result shape: [32, 43, 2]; dtype: FLOAT; first values [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Removing variable <1:0>
Executing op: [zeros_as]
About to get variable in execute output
node_1:0 result shape: [32, 2]; dtype: FLOAT; first values [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Debug info for node_2 input[0]; shape: [32, 43, 2]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Reshape: Optional reshape arg was -99
Removing variable <1:0>
Executing op: [reshape]
Reshape: Optional reshape arg was -99
Reshape: new shape: {1376, 2}
About to get variable in execute output
node_1:0 result shape: [1376, 2]; dtype: FLOAT; first values [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Removing variable <1:0>
Executing op: [shape_of]
About to get variable in execute output
node_1:0 result shape: [3]; dtype: INT64; first values [32, 43, 2]
Removing variable <1:0>
Removing variable <1:1>
Executing op: [matmul]
About to get variable in execute output
node_1:0 result shape: [1376, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[0]; shape: [1376, 2]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Reshape: Optional reshape arg was -99
Removing variable <1:0>
Executing op: [reshape]
Reshape: Optional reshape arg was -99
Reshape: new shape: {32, 43, 2}
About to get variable in execute output
node_1:0 result shape: [32, 43, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Removing variable <1:0>
Executing op: [shape_of]
About to get variable in execute output
node_1:0 result shape: [2]; dtype: INT64; first values [1376, 2]
Debug info for node_2 input[0]; shape: [32, 43, 2]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Removing variable <1:0>
Executing op: [permute]
About to get variable in execute output
node_1:0 result shape: [32, 2, 43]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Removing variable <1:0>
Executing op: [softmax]
About to get variable in execute output
node_1:0 result shape: [32, 2, 43]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Removing variable <1:0>
Removing variable <1:1>
Removing variable <1:2>
Executing op: [log_loss_grad]
About to get variable in execute output
node_1:0 result shape: [32, 2, 43]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
About to get variable in execute output
node_1:1 result shape: []; dtype: FLOAT; first values [-nan]
About to get variable in execute output
node_1:2 result shape: [32, 2, 43]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Removing variable <1:0>
Removing variable <1:1>
Executing op: [softmax_bp]
About to get variable in execute output
node_1:0 result shape: [32, 2, 43]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[0]; shape: [32, 2, 43]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Removing variable <1:0>
Executing op: [permute]
About to get variable in execute output
node_1:0 result shape: [32, 43, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[0]; shape: [32, 43, 2]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[1]; shape: [2]; ews: [1]; order: [c]; dtype: [INT64]; first values: [1376, 2]
Reshape: Optional reshape arg was 1376
Removing variable <1:0>
Removing variable <1:1>
Executing op: [reshape]
Reshape: Optional reshape arg was 1376
Reshape: new shape: {1376, 2}
About to get variable in execute output
node_1:0 result shape: [1376, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Removing variable <1:0>
Removing variable <1:1>
Removing variable <1:2>
Executing op: [matmul_bp]
Executing op: [matmul]
About to get variable in execute output
node_1:0 result shape: [1376, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Executing op: [matmul]
About to get variable in execute output
node_1:0 result shape: [2, 2]; dtype: FLOAT; first values [nan, nan, nan, nan]
About to get variable in execute output
node_1:0 result shape: [1376, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
About to get variable in execute output
node_1:1 result shape: [2, 2]; dtype: FLOAT; first values [nan, nan, nan, nan]
Executing op: [adam_updater]
About to get variable in execute output
node_1:0 result shape: [2, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan]
About to get variable in execute output
node_1:1 result shape: [2, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan]
About to get variable in execute output
node_1:2 result shape: [2, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan]
Executing op: [subtract]
About to get variable in execute output
node_1:0 result shape: [2, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan]
Debug info for node_2 input[0]; shape: [1376, 2]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[1]; shape: [3]; ews: [1]; order: [c]; dtype: [INT64]; first values: [32, 43, 2]
Reshape: Optional reshape arg was 32
Removing variable <1:0>
Removing variable <1:1>
Executing op: [reshape]
Reshape: Optional reshape arg was 32
Reshape: new shape: {32, 43, 2}
About to get variable in execute output
node_1:0 result shape: [32, 43, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[0]; shape: [32, 43, 2]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Removing variable <1:0>
Executing op: [permute]
About to get variable in execute output
node_1:0 result shape: [32, 2, 43]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[0]; shape: [32, 6, 43]; ews: [1]; order: [f]; dtype: [FLOAT]; first values: [-0.226728, -0.226728, -0.129559, -0.097169, -0.161948, -0.161948, -0.161948, -0.161948, -0.161948, -0.161948, -0.097169, -0, -0, -0, -0, -0]
Debug info for node_2 input[1]; shape: [6, 8]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[2]; shape: [2, 8]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[3]; shape: [8]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Removing variable <1:0>
Removing variable <1:1>
Removing variable <1:2>
Removing variable <1:3>
Removing variable <1:4>
Removing variable <1:5>
Executing op: [lstmLayer_bp]
About to get variable in execute output
node_1:0 result shape: [32, 6, 43]; dtype: FLOAT; first values [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
About to get variable in execute output
node_1:1 result shape: [6, 8]; dtype: FLOAT; first values [-nan, -nan, nan, nan, nan, nan, nan, nan, -nan, -nan, nan, nan, nan, nan, nan, nan, -nan, -nan, nan, nan, nan, nan, nan, nan, -nan, -nan, nan, nan, nan, nan, nan, nan]
About to get variable in execute output
node_1:2 result shape: [2, 8]; dtype: FLOAT; first values [-nan, -nan, nan, nan, nan, nan, nan, nan, -nan, -nan, nan, nan, nan, nan, nan, nan]
About to get variable in execute output
node_1:3 result shape: [8]; dtype: FLOAT; first values [-nan, -nan, nan, nan, nan, nan, nan, nan]
Executing op: [adam_updater]
About to get variable in execute output
node_1:0 result shape: [6, 8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
About to get variable in execute output
node_1:1 result shape: [6, 8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
About to get variable in execute output
node_1:2 result shape: [6, 8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Executing op: [subtract]
About to get variable in execute output
node_1:0 result shape: [6, 8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Executing op: [adam_updater]
About to get variable in execute output
node_1:0 result shape: [2, 8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
About to get variable in execute output
node_1:1 result shape: [2, 8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
About to get variable in execute output
node_1:2 result shape: [2, 8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Executing op: [subtract]
About to get variable in execute output
node_1:0 result shape: [2, 8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Executing op: [adam_updater]
About to get variable in execute output
node_1:0 result shape: [8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
About to get variable in execute output
node_1:1 result shape: [8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
About to get variable in execute output
node_1:2 result shape: [8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Executing op: [subtract]
About to get variable in execute output
node_1:0 result shape: [8]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[0]; shape: [22, 6, 19]; ews: [1]; order: [f]; dtype: [FLOAT]; first values: [-0.0647794, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0]
Debug info for node_2 input[1]; shape: [6, 8]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[2]; shape: [2, 8]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Removing variable <1:0>
Removing variable <1:1>
Removing variable <1:2>
Removing variable <1:3>
Executing op: [lstmLayer]
About to get variable in execute output
node_1:0 result shape: [22, 2, 19]; dtype: FLOAT; first values [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
About to get variable in execute output
node_1:1 result shape: [22, 2]; dtype: FLOAT; first values [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Debug info for node_2 input[0]; shape: [22, 2, 19]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Removing variable <1:0>
Executing op: [permute]
About to get variable in execute output
node_1:0 result shape: [22, 19, 2]; dtype: FLOAT; first values [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Removing variable <1:0>
Executing op: [zeros_as]
About to get variable in execute output
node_1:0 result shape: [22, 2]; dtype: FLOAT; first values [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Debug info for node_2 input[0]; shape: [22, 19, 2]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Reshape: Optional reshape arg was -99
Removing variable <1:0>
Executing op: [reshape]
Reshape: Optional reshape arg was -99
Reshape: new shape: {418, 2}
About to get variable in execute output
node_1:0 result shape: [418, 2]; dtype: FLOAT; first values [nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan, nan]
Removing variable <1:0>
Executing op: [shape_of]
About to get variable in execute output
node_1:0 result shape: [3]; dtype: INT64; first values [22, 19, 2]
Removing variable <1:0>
Removing variable <1:1>
Executing op: [matmul]
About to get variable in execute output
node_1:0 result shape: [418, 2]; dtype: FLOAT; first values [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Debug info for node_2 input[0]; shape: [418, 2]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [-nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan, -nan]
Reshape: Optional reshape arg was -99
Error at [/home/runner/work/deeplearning4j/deeplearning4j/libnd4j/include/ops/declarable/generic/shape/reshape.cpp:163:0]:
Reshape: lengths before and after reshape should match, but got 836 vs 832
Removing variable <1:0>
Exception in thread "main" java.lang.RuntimeException: Op reshape with name reshape_1 failed to execute. Here is the error from c++: Op validation failed
at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.calculateOutputShape(
at org.nd4j.linalg.api.ops.DynamicCustomOp.calculateOutputShape(
at org.nd4j.autodiff.samediff.internal.InferenceSession.getAndParameterizeOp(
at org.nd4j.autodiff.samediff.internal.InferenceSession.getAndParameterizeOp(
at org.nd4j.autodiff.samediff.internal.AbstractSession.output(
at org.nd4j.autodiff.samediff.internal.AbstractSession.output(
at org.nd4j.autodiff.samediff.internal.TrainingSession.trainingIteration(
at org.nd4j.autodiff.samediff.SameDiff.fitHelper(
at org.nd4j.autodiff.samediff.config.FitConfig.exec(
at org.deeplearning4j.examples.quickstart.modeling.recurrent.LocationNextNeuralNetworkV6_03.sameDiff3(
at org.deeplearning4j.examples.quickstart.modeling.recurrent.LocationNextNeuralNetworkV6_03.main(
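The two numbers in the reshape error above (836 vs 832) can be reproduced with plain arithmetic, which may help narrow down the cause. This is a sketch under an assumption: it supposes that a -1 dimension is inferred by integer division of the total length by the product of the fixed dimensions. That matches the log, but it is not taken from the libnd4j source. The class and method names are hypothetical.

```java
final class ReshapeLengthCheck {

    /** Length after reshaping to [fixed0, -1, fixed2], with the -1 inferred by integer division. */
    static long lengthAfterReshape(long totalLength, long fixed0, long fixed2) {
        long inferred = totalLength / (fixed0 * fixed2); // any remainder is truncated
        return fixed0 * inferred * fixed2;
    }

    public static void main(String[] args) {
        long labelCount = 2;
        long miniBatchSize = 32;

        // Full batch from the log: matmul output [1376, 2], i.e. 32 * 43 rows.
        long fullBatchLength = 32L * 43 * labelCount;
        // Final batch from the log: matmul output [418, 2], i.e. 22 * 19 rows.
        long lastBatchLength = 22L * 19 * labelCount;

        System.out.println(fullBatchLength + " -> "
                + lengthAfterReshape(fullBatchLength, miniBatchSize, labelCount));
        System.out.println(lastBatchLength + " -> "
                + lengthAfterReshape(lastBatchLength, miniBatchSize, labelCount));
    }
}
```

If this reading is right, the reshape to (miniBatchSize, -1, labelCount) only works while the actual batch has 32 examples. The earlier reshape to (-1, nOut) leaves the leading dimension free and so is unaffected by a batch of 22.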
private static int nIn = 6;
private static int nOut = 2;
private static int labelCount = 2;
private static int miniBatchSize = 32;
private static int numLabelClasses = -1;
private static SameDiff sd = SameDiff.create();
private static long dim0 = 0L;
private static long dim1 = 0L;
private static long dim2 = 0L;
private static Map<String,INDArray> placeholderData = new HashMap<>();
private static DataSet t;
public static void sameDiff3() throws IOException, InterruptedException
// ----- Load the training data -----
trainFeatures = new CSVSequenceRecordReader();
trainFeatures.initialize(new NumberedFileInputSplit(featuresDirTrain.getAbsolutePath() + "/%d.csv", 0, lastTrainCount));
trainLabels = new CSVSequenceRecordReader();
trainLabels.initialize(new NumberedFileInputSplit(labelsDirTrain.getAbsolutePath() + "/%d.csv", 0, lastTrainCount));
trainData = new SequenceRecordReaderDataSetIterator(trainFeatures, trainLabels, miniBatchSize, numLabelClasses,
true, SequenceRecordReaderDataSetIterator.AlignmentMode.ALIGN_END);
// ----- Load the test data -----
//Same process as for the training data.
testFeatures = new CSVSequenceRecordReader();
testFeatures.initialize(new NumberedFileInputSplit(featuresDirTest.getAbsolutePath() + "/%d.csv", 0, lastTestCount));
testLabels = new CSVSequenceRecordReader();
testLabels.initialize(new NumberedFileInputSplit(labelsDirTest.getAbsolutePath() + "/%d.csv", 0, lastTestCount));
testData = new SequenceRecordReaderDataSetIterator(testFeatures, testLabels, miniBatchSize, numLabelClasses,
true, SequenceRecordReaderDataSetIterator.AlignmentMode.ALIGN_END);
normalizer = new NormalizerStandardize();
normalizer.fitLabel(true);; //Collect the statistics (mean/stdev) from the training data. This does not modify the input data
while(trainData.hasNext()) {
normalizer.transform(; //Apply normalization to the training data
while(testData.hasNext()) {
normalizer.transform(; //Apply normalization to the test data. This is using statistics calculated from the *training* set
System.out.println(" Printing traindata dataset shape - 1");
DataSet data =;
System.out.println(" Printing testdata dataset shape - 1");
DataSet data2 =;
UIServer uiServer = UIServer.getInstance();
StatsStorage statsStorage = new InMemoryStatsStorage();
int listenerFrequency = 1;
sd.setListeners(new ScoreListener());
t =;
dim0 = t.getFeatures().size(0);
dim1 = t.getFeatures().size(1);
dim2 = t.getFeatures().size(2);
System.out.println(" features - dim0 - 0 - "+dim0);
System.out.println(" features - dim1 - 0 - "+dim1);
System.out.println(" features - dim2 - 0 - "+dim2);
int nEpochs = 4;
for (int i = 0; i < nEpochs; i++) {"Epoch " + i + " starting. ");
History history =, 1);
trainData.reset();"Epoch " + i + " completed. ");
System.out.println(" Starting test data evaluation --- ");
//Evaluate on test set:
String outputVariable = "out";
Evaluation evaluation = new Evaluation();
sd.evaluate(testData, outputVariable, evaluation);
//Print evaluation statistics:
System.out.println(" evaluation.stats() - "+evaluation.stats());
String pathToSavedNetwork = "src/main/assets/";
File savedNetwork = new File(pathToSavedNetwork);, true);
// ModelSerializer.addNormalizerToModel(savedNetwork, normalizer);
System.out.println("----- Example Complete -----");
//Save the trained network for inference - FlatBuffers format
File saveFileForInference = new File("src/main/assets/sameDiffExampleInference.fb");
try {
} catch (IOException e) {
throw new RuntimeException(e);
private static void getConfiguration()
placeholderData = new HashMap<>();
//Create input and label variables
SDVariable input = sd.placeHolder("input", DataType.FLOAT, miniBatchSize, nIn, -1);
SDVariable label = sd.placeHolder("label", DataType.FLOAT, miniBatchSize, nOut, -1);
placeholderData.put("input", t.getFeatures());
placeholderData.put("label", t.getLabels());
LSTMLayerConfig mLSTMConfiguration = LSTMLayerConfig.builder()
LSTMLayerOutputs outputs = new LSTMLayerOutputs(sd.rnn.lstmLayer(
.weights(sd.var("weights", Nd4j.rand(DataType.FLOAT, nIn, 4 * nOut)))
.rWeights(sd.var("rWeights", Nd4j.rand(DataType.FLOAT, nOut, 4 * nOut)))
.bias(sd.var("bias", Nd4j.rand(DataType.FLOAT, 4 * nOut)))
mLSTMConfiguration), mLSTMConfiguration);
// t.getFeatures().size(0) == input.getShape()[0] == miniBatchSize
// t.getFeatures().size(1) == input.getShape()[1] == nIn
// t.getFeatures().size(2) == input.getShape()[2] == TimeSteps
SDVariable layer0 = outputs.getOutput();
SDVariable layer0Permuted = sd.permute(layer0, 0, 2, 1);
SDVariable layer0PermutedReshaped = sd.reshape(layer0Permuted, -1, nOut);
SDVariable w1 = sd.var("w1", new XavierInitScheme('c', nOut, labelCount), DataType.FLOAT, nOut, labelCount); // fan-in/fan-out match the [nOut, labelCount] shape
SDVariable b1 = sd.var("b1", Nd4j.rand(DataType.FLOAT, labelCount));
SDVariable mmulOutput = layer0PermutedReshaped.mmul(w1);
SDVariable mmulOutputUnreshaped = sd.reshape(mmulOutput, miniBatchSize, -1, labelCount);
SDVariable mmulOutputUnreshapedUnPermuted = sd.permute(mmulOutputUnreshaped, 0, 2, 1);
SDVariable out = sd.nn.softmax("out", mmulOutputUnreshapedUnPermuted);
SDVariable loss = sd.loss.logLoss("loss", label, out);
double learningRate = 1e-3;
TrainingConfig config = new TrainingConfig.Builder()
.l2(1e-4) //L2 regularization
.updater(new Adam(learningRate)) //Adam optimizer with specified learning rate
.dataSetFeatureMapping("input") //DataSet features array should be associated with variable "input"
.dataSetLabelMapping("label") //DataSet label array should be associated with variable "label"
System.out.println(" Printing sd information");
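For the NaN problem, it may help to pin down the exact batch at which NaN first appears instead of reading it off the log. Below is a minimal sketch in plain Java, assuming the values of interest (for example the loss or one weight array) have been copied into one float array per batch. NanScan and its methods are hypothetical helpers, not dl4j or nd4j API.

```java
final class NanScan {

    /** Index of the first NaN in values, or -1 if there is none. */
    static int firstNaN(float[] values) {
        for (int i = 0; i < values.length; i++) {
            if (Float.isNaN(values[i])) {
                return i;
            }
        }
        return -1;
    }

    /** Zero-based index of the first batch that contains a NaN, or -1 if none does. */
    static int firstBatchWithNaN(float[][] batches) {
        for (int b = 0; b < batches.length; b++) {
            if (firstNaN(batches[b]) >= 0) {
                return b;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Made-up values for illustration only; they are not taken from the log.
        float[][] perBatchValues = {
            {0.5f, 0.25f},
            {0.75f, Float.NaN},
            {Float.NaN, Float.NaN}
        };
        System.out.println("first batch with NaN: " + firstBatchWithNaN(perBatchValues));
    }
}
```

Knowing the first affected batch would make it possible to inspect that batch's features and labels directly.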