I followed your suggestion and tried to replicate, in SameDiff, the original one-layer LSTM network I had implemented with dl4j. This was the starting point for my code. It seems to work up to a point, but I have no idea how meaningful the results are, because I have been unable to add a loss function due to a shape mismatch. After running for a number of cycles (two? I am not sure how to read the log file), it fails with a shape mismatch error. Below you will find the error log and the latest version of the code (the full error log is far longer). I got to this point by trial and error, without really understanding why the code works as far as it does. Please let me know what you think is going on. Thanks.
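For reference, this is roughly how I understand a loss function would normally be attached in SameDiff (a sketch adapted from the SameDiff examples, with a hypothetical helper name; it is not part of my code below, because the shape mismatch is exactly where I get stuck):

// Sketch only, not in the code at the end of this post. Assumes the SameDiff graph "sd" and the
// "label" and "out" variables built there; attachLoss is just a hypothetical helper name.
private static void attachLoss(SameDiff sd, SDVariable label, SDVariable out) {
    // log loss (negative log likelihood) between the softmax output "out" and the one-hot "label"
    SDVariable loss = sd.loss.logLoss("loss", label, out);
    // tell SameDiff which variable fit() should minimise during training
    sd.setLossVariables("loss");
}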
ERROR
Printing traindata dataset shape - 1
[32, 6, 57]
Printing testdata dataset shape - 1
[32, 6, 13]
Printing traindata feature and label dataset shape
[32, 6, 28]
[32, 2, 28]
features - dim0 - 32
features - dim1 - 6
features - dim2 - 28
Printing sd information
SameDiff(nVars=12,nOps=5)
--- Summary ---
Variables: 12 (5 with arrays)
Functions: 5
SameDiff Function Defs: 0
Loss function variables:
--- Variables ---
- Name - - Array Shape - - Variable Type - - Data Type- - Output Of Function - - Inputs To Functions -
add - ARRAY FLOAT add(add) [softmax]
b1 [1] VARIABLE FLOAT [add]
bias [2, 8] VARIABLE FLOAT [lstmLayer]
input [32, 6, 28] PLACEHOLDER FLOAT [lstmLayer]
label [32, 4] PLACEHOLDER FLOAT
lstmLayer - ARRAY FLOAT lstmLayer(lstmLayer) [reduce_mean]
matmul - ARRAY FLOAT matmul(matmul) [add]
out - ARRAY FLOAT softmax(softmax)
rWeights [2, 2, 8] VARIABLE FLOAT [lstmLayer]
reduce_mean - ARRAY FLOAT reduce_mean(reduce_mean) [matmul]
w1 [4, 4] VARIABLE FLOAT [matmul]
weights [2, 28, 8] VARIABLE FLOAT [lstmLayer]
--- Functions ---
- Function Name - - Op - - Inputs - - Outputs -
0 lstmLayer LSTMLayer [input, weights, rWeights, bias] [lstmLayer]
1 reduce_mean Mean [lstmLayer] [reduce_mean]
2 matmul Mmul [reduce_mean, w1] [matmul]
3 add AddOp [matmul, b1] [add]
4 softmax SoftMax [add] [out]
Added differentiated op softmax
Added differentiated op add
Added differentiated op matmul
Added differentiated op reduce_mean
Added differentiated op lstmLayer
Debug info for node_2 input[0]; shape: [32, 6, 28]; ews: [1]; order: [f]; dtype: [FLOAT]; first values: [0.480073, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0]
Debug info for node_2 input[1]; shape: [2, 28, 8]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [0.708899, 0.482904, 0.531003, 0.976715, 0.442821, 0.552599, 0.749225, 0.492696, 0.407787, 0.117383, 0.426929, 0.652086, 0.71485, 0.606519, 0.371322, 0.694302]
Debug info for node_2 input[2]; shape: [2, 2, 8]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [0.478817, 0.713919, 0.795261, 0.31311, 0.652732, 0.998342, 0.847164, 0.0842575, 0.200632, 0.335908, 0.0698259, 0.963265, 0.213671, 0.445501, 0.940424, 0.396508]
Removing variable <1:0>
Removing variable <1:1>
Removing variable <1:2>
Removing variable <1:3>
Executing op: [lstmLayer]
About to get variable in execute output
node_1:0 result shape: [32, 6, 4]; dtype: FLOAT; first values [0.849772, 0.73437, 5.83051, 8.03961, 1.46902, 1.28924, 2.67941, 3.98201, 1.91017, 1.73674, 1.42729, 2.21462, 3.25434, 3.84437, 1.34903, 1.72554, 5.95258, 8.76993, 0.933508, 1.02732, 11.9936, 19.9579, 0.680528, 0.64996, 0.868451, 0.76536, 1.02353, 1.11707, 0.993514, 0.691358, 0.39941, 0.334278]
Removing variable <1:0>
Removing variable <1:1>
Executing op: [matmul]
About to get variable in execute output
node_1:0 result shape: [32, 4]; dtype: FLOAT; first values [0.794465, 0.127477, -1.60831, -3.91174, 0.299192, 0.00638225, -0.0186475, -0.42616, 0.794887, 0.127653, -1.60946, -3.91384, 0.58846, 0.00785959, -0.438789, -1.62087, 1.11276, -0.0541707, -1.53608, -4.06199, 1.65794, -0.557415, -0.827336, -3.39457, 1.11354, -0.0576317, -1.52672, -4.04792, 1.55479, -0.499344, -0.85779, -3.3646]
Removing variable <1:0>
Removing variable <1:1>
Executing op: [add]
About to get variable in execute output
node_1:0 result shape: [32, 4]; dtype: FLOAT; first values [1.21663, 0.549647, -1.18614, -3.48957, 0.721362, 0.428552, 0.403522, -0.00399056, 1.21706, 0.549823, -1.18729, -3.49167, 1.01063, 0.43003, -0.0166191, -1.1987, 1.53493, 0.367999, -1.11391, -3.63982, 2.08011, -0.135245, -0.405167, -2.9724, 1.53571, 0.364538, -1.10456, -3.62575, 1.97696, -0.0771744, -0.43562, -2.94243]
Removing variable <1:0>
Removing variable <1:1>
Executing op: [softmax_bp]
About to get variable in execute output
node_1:0 result shape: [32, 4]; dtype: FLOAT; first values [0, 0, 0, 0, 0, 0, 0, 0, 3.69649e-08, 1.89676e-08, 3.33882e-09, 3.33284e-10, 0, 0, 0, 0, 4.29518e-08, 1.33717e-08, 3.03812e-09, 2.43005e-10, 4.97198e-08, 5.42515e-09, 4.14177e-09, 3.17872e-10, 0, 0, 0, 0, 0, 0, 0, 0]
Removing variable <1:0>
Removing variable <1:1>
Removing variable <1:2>
Executing op: [add_bp]
About to get variable in execute output
node_1:0 result shape: [32, 4]; dtype: FLOAT; first values [0, 0, 0, 0, 0, 0, 0, 0, 3.69649e-08, 1.89676e-08, 3.33882e-09, 3.33284e-10, 0, 0, 0, 0, 4.29518e-08, 1.33717e-08, 3.03812e-09, 2.43005e-10, 4.97198e-08, 5.42515e-09, 4.14177e-09, 3.17872e-10, 0, 0, 0, 0, 0, 0, 0, 0]
About to get variable in execute output
node_1:1 result shape: [1]; dtype: FLOAT; first values [1.78814e-07]
Executing op: [adam_updater]
About to get variable in execute output
node_1:0 result shape: [1]; dtype: FLOAT; first values [0.000992596]
About to get variable in execute output
node_1:1 result shape: [1]; dtype: FLOAT; first values [1.79738e-12]
About to get variable in execute output
node_1:2 result shape: [1]; dtype: FLOAT; first values [4.23958e-06]
Executing op: [subtract]
About to get variable in execute output
node_1:0 result shape: [1]; dtype: FLOAT; first values [0.421177]
Removing variable <1:0>
Removing variable <1:1>
Removing variable <1:2>
Executing op: [matmul_bp]
Executing op: [matmul]
About to get variable in execute output
node_1:0 result shape: [32, 4]; dtype: FLOAT; first values [0, 0, 0, 0, 0, 0, 0, 0, -7.22628e-09, 1.56904e-09, 3.0658e-08, -6.69388e-09, 0, 0, 0, 0, -1.12187e-08, 2.0442e-09, 3.38978e-08, -2.62647e-09, -1.62658e-08, 1.75764e-09, 3.76462e-08, 3.27964e-09, 0, 0, 0, 0, 0, 0, 0, 0]
Executing op: [matmul]
About to get variable in execute output
node_1:0 result shape: [4, 4]; dtype: FLOAT; first values [6.91882e-07, -4.85486e-07, 3.21119e-08, -2.81657e-09, 1.20087e-06, -2.27313e-07, 8.13507e-08, 3.53942e-09, 4.76888e-07, -1.98581e-07, 7.33345e-09, -1.04926e-08, 7.29635e-07, -1.09877e-07, 3.11691e-08, -7.05006e-09]
About to get variable in execute output
node_1:0 result shape: [32, 4]; dtype: FLOAT; first values [0, 0, 0, 0, 0, 0, 0, 0, -7.22628e-09, 1.56904e-09, 3.0658e-08, -6.69388e-09, 0, 0, 0, 0, -1.12187e-08, 2.0442e-09, 3.38978e-08, -2.62647e-09, -1.62658e-08, 1.75764e-09, 3.76462e-08, 3.27964e-09, 0, 0, 0, 0, 0, 0, 0, 0]
About to get variable in execute output
node_1:1 result shape: [4, 4]; dtype: FLOAT; first values [6.91882e-07, -4.85486e-07, 3.21119e-08, -2.81657e-09, 1.20087e-06, -2.27313e-07, 8.13507e-08, 3.53942e-09, 4.76888e-07, -1.98581e-07, 7.33345e-09, -1.04926e-08, 7.29635e-07, -1.09877e-07, 3.11691e-08, -7.05006e-09]
Executing op: [adam_updater]
About to get variable in execute output
node_1:0 result shape: [4, 4]; dtype: FLOAT; first values [-0.00099122, 0.000990289, 0.000923081, -0.000990391, 0.00096508, 0.000893534, -0.000993512, -0.000992771, 0.000995655, 0.000981697, 0.000982201, 0.000995039, 0.000974414, -0.00099488, 0.000988444, -0.000992701]
About to get variable in execute output
node_1:1 result shape: [4, 4]; dtype: FLOAT; first values [1.27447e-12, 1.03996e-12, 1.44018e-14, 1.06229e-12, 7.63809e-14, 7.04372e-15, 2.34463e-12, 1.88619e-12, 5.25077e-12, 2.87671e-13, 3.04502e-13, 4.02234e-12, 1.45039e-13, 3.77599e-12, 7.31585e-13, 1.84979e-12]
About to get variable in execute output
node_1:2 result shape: [4, 4]; dtype: FLOAT; first values [-3.56999e-06, 3.22486e-06, 3.795e-07, -3.2593e-06, 8.73968e-07, 2.65402e-07, -4.84218e-06, -4.34305e-06, 7.24627e-06, 1.6961e-06, 1.74501e-06, 6.34224e-06, 1.20433e-06, -6.14495e-06, 2.7048e-06, -4.30095e-06]
Executing op: [subtract]
About to get variable in execute output
node_1:0 result shape: [4, 4]; dtype: FLOAT; first values [-0.362927, 0.326351, 0.0367058, -0.324911, 0.0744229, 0.0279198, -0.484037, -0.433348, 0.718862, 0.170614, 0.173445, 0.633333, 0.112162, -0.612401, 0.26918, -0.429031]
Removing variable <1:0>
Removing variable <1:1>
Executing op: [reduce_mean_bp]
About to get variable in execute output
node_1:0 result shape: [32, 6, 4]; dtype: FLOAT; first values [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
Debug info for node_2 input[0]; shape: [32, 6, 28]; ews: [1]; order: [f]; dtype: [FLOAT]; first values: [0.480073, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0]
Debug info for node_2 input[1]; shape: [2, 28, 8]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [0.708899, 0.482904, 0.531003, 0.976715, 0.442821, 0.552599, 0.749225, 0.492696, 0.407787, 0.117383, 0.426929, 0.652086, 0.71485, 0.606519, 0.371322, 0.694302]
Debug info for node_2 input[2]; shape: [2, 2, 8]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [0.478817, 0.713919, 0.795261, 0.31311, 0.652732, 0.998342, 0.847164, 0.0842575, 0.200632, 0.335908, 0.0698259, 0.963265, 0.213671, 0.445501, 0.940424, 0.396508]
Debug info for node_2 input[3]; shape: [2, 8]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [0.634018, 0.182539, 0.817293, 0.734501, 0.221919, 0.359397, 0.909973, 0.57734, 0.274089, 0.148167, 0.000462413, 0.952388, 0.793436, 0.771765, 0.292129, 0.309894]
Removing variable <1:0>
Removing variable <1:1>
Removing variable <1:2>
Removing variable <1:3>
Removing variable <1:4>
Executing op: [lstmLayer_bp]
About to get variable in execute output
node_1:0 result shape: [32, 6, 28]; dtype: FLOAT; first values [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
About to get variable in execute output
node_1:1 result shape: [2, 28, 8]; dtype: FLOAT; first values [4.13802e-08, 4.89966e-09, 3.10218e-08, 4.43322e-09, 7.48696e-08, 1.10441e-08, 1.41268e-08, 1.25986e-08, 3.90659e-09, -2.75095e-10, 5.84314e-10, 2.44075e-11, -9.23789e-08, -2.17932e-08, 9.25006e-11, 3.13561e-10, 4.06404e-09, -2.42605e-10, 5.73112e-10, 2.1081e-11, -9.24447e-08, -2.11804e-08, 4.27572e-12, 3.33022e-10, 4.03604e-09, -2.6178e-10, 5.71726e-10, 2.10758e-11, 6.95787e-09, 1.86189e-09, -4.75046e-11, 3.26383e-10]
About to get variable in execute output
node_1:2 result shape: [2, 2, 8]; dtype: FLOAT; first values [-1.93821e-07, -1.25492e-08, -7.40319e-08, -6.34455e-09, -4.13854e-07, -2.18169e-08, -2.77321e-08, -3.26106e-08, -1.52975e-07, -1.05278e-08, -6.53342e-08, -5.4067e-09, -3.84343e-07, -3.68898e-09, -2.43036e-08, -2.48422e-08, 2.93766e-07, 7.21267e-08, 3.25517e-08, 7.91499e-08, 4.2865e-07, 2.01337e-07, 1.5855e-07, 3.08569e-07, 3.59455e-07, 7.63541e-08, 6.85051e-08, 9.02766e-08, 5.35456e-07, 2.2367e-07, 1.87411e-07, 3.63651e-07]
About to get variable in execute output
node_1:3 result shape: [2, 8]; dtype: FLOAT; first values [-1.12297e-07, -2.21489e-08, -4.55664e-08, -5.49458e-09, -2.71389e-07, -3.62453e-08, -4.93015e-08, -3.61429e-08, 2.75381e-07, 1.40554e-07, 1.31891e-07, 6.49031e-08, 3.66783e-07, 2.81401e-07, 2.12401e-07, 2.94411e-07]
Executing op: [adam_updater]
About to get variable in execute output
node_1:0 result shape: [2, 28, 8]; dtype: FLOAT; first values [0.000995561, 0.000993495, 0.000994083, 0.000996773, 0.000992921, 0.000994311, 0.000995798, 0.000993624, 0.000992306, 0.000973766, 0.000992647, 0.000995174, 0.00099559, 0.000994811, 0.000991556, 0.000995466, 0.000996443, 0.000996296, 0.000983858, 0.000941161, 0.00098094, 0.000986143, 0.000996359, 0.000994685, 0.000995813, 0.000994135, 0.000979586, 0.000934845, 0.000994655, 0.000996165, 0.000996138, 0.000994928]
About to get variable in execute output
node_1:1 result shape: [2, 28, 8]; dtype: FLOAT; first values [5.03119e-12, 2.3324e-12, 2.8229e-12, 9.54046e-12, 1.96752e-12, 3.05483e-12, 5.61542e-12, 2.4287e-12, 1.6632e-12, 1.37779e-13, 1.82271e-12, 4.2521e-12, 5.09685e-12, 3.67596e-12, 1.37879e-12, 4.82054e-12, 7.84838e-12, 7.23368e-12, 3.71478e-13, 2.55853e-14, 2.64868e-13, 5.06446e-13, 7.48859e-12, 3.50235e-12, 5.65498e-12, 2.87279e-12, 2.30261e-13, 2.05869e-14, 3.46353e-12, 6.74598e-12, 6.65147e-12, 3.8485e-12]
About to get variable in execute output
node_1:2 result shape: [2, 28, 8]; dtype: FLOAT; first values [7.09313e-06, 4.82953e-06, 5.31314e-06, 9.76759e-06, 4.4357e-06, 5.52709e-06, 7.49366e-06, 4.92822e-06, 4.07826e-06, 1.1738e-06, 4.26935e-06, 6.52086e-06, 7.13927e-06, 6.06301e-06, 3.71323e-06, 6.94306e-06, 8.85917e-06, 8.50516e-06, 1.92739e-06, 5.05823e-07, 1.62749e-06, 2.25045e-06, 8.65372e-06, 5.9181e-06, 7.52001e-06, 5.35988e-06, 1.51745e-06, 4.53731e-07, 5.88521e-06, 8.21345e-06, 8.15571e-06, 6.20367e-06]
Executing op: [subtract]
About to get variable in execute output
node_1:0 result shape: [2, 28, 8]; dtype: FLOAT; first values [0.707904, 0.48191, 0.530009, 0.975718, 0.441828, 0.551604, 0.748229, 0.491702, 0.406794, 0.116409, 0.425937, 0.65109, 0.713855, 0.605524, 0.37033, 0.693307, 0.884879, 0.849522, 0.191749, 0.0496409, 0.162692, 0.224271, 0.864375, 0.590812, 0.750964, 0.534996, 0.150759, 0.044438, 0.587457, 0.82033, 0.814575, 0.619368]
Executing op: [adam_updater]
About to get variable in execute output
node_1:0 result shape: [2, 2, 8]; dtype: FLOAT; first values [0.000993412, 0.000995589, 0.000996036, 0.000989999, 0.000995148, 0.000996842, 0.00099628, 0.000963691, 0.000984366, 0.000990671, 0.000956283, 0.000996728, 0.000985153, 0.000992951, 0.000996648, 0.000992083, 0.000983321, 0.000996377, 0.000990209, 0.000994209, 0.000955753, 0.000995983, 0.000996727, 0.000909622, 0.000986283, 0.000995567, 0.000994379, 0.000947502, 0.000994527, 0.000993998, 0.000974479, 0.00097095]
About to get variable in execute output
node_1:1 result shape: [2, 2, 8]; dtype: FLOAT; first values [2.2741e-12, 5.09495e-12, 6.31254e-12, 9.79972e-13, 4.20668e-12, 9.96238e-12, 7.17207e-12, 7.04438e-14, 3.96412e-13, 1.12762e-12, 4.78478e-14, 9.27763e-12, 4.40271e-13, 1.98436e-12, 8.83929e-12, 1.5702e-12, 3.4757e-13, 7.56255e-12, 1.02276e-12, 2.9478e-12, 4.66584e-14, 6.14613e-12, 9.27295e-12, 1.01296e-14, 5.1699e-13, 5.04343e-12, 3.12952e-12, 3.25743e-14, 3.30135e-12, 2.74225e-12, 1.458e-13, 1.11713e-13]
About to get variable in execute output
node_1:2 result shape: [2, 2, 8]; dtype: FLOAT; first values [4.76878e-06, 7.13794e-06, 7.9452e-06, 3.13047e-06, 6.48594e-06, 9.98124e-06, 8.46887e-06, 8.39314e-07, 1.99102e-06, 3.35803e-06, 6.91726e-07, 9.63211e-06, 2.09828e-06, 4.45464e-06, 9.40181e-06, 3.9626e-06, 1.86434e-06, 8.69635e-06, 3.19808e-06, 5.4294e-06, 6.83074e-07, 7.83978e-06, 9.62968e-06, 3.18273e-07, 2.27376e-06, 7.10176e-06, 5.59425e-06, 5.70743e-07, 5.74577e-06, 5.23669e-06, 1.20749e-06, 1.05695e-06]
Executing op: [subtract]
About to get variable in execute output
node_1:0 result shape: [2, 2, 8]; dtype: FLOAT; first values [0.477823, 0.712924, 0.794264, 0.31212, 0.651737, 0.997345, 0.846167, 0.0832938, 0.199648, 0.334917, 0.0688696, 0.962268, 0.212686, 0.444508, 0.939427, 0.395516, 0.182513, 0.867917, 0.318493, 0.541154, 0.0630651, 0.780968, 0.960386, 0.027832, 0.222795, 0.708416, 0.557745, 0.055224, 0.568228, 0.520438, 0.1179, 0.101088]
Executing op: [adam_updater]
About to get variable in execute output
node_1:0 result shape: [2, 8]; dtype: FLOAT; first values [0.000995028, 0.000982951, 0.000996144, 0.000995713, 0.000985779, 0.000991269, 0.000996535, 0.000994549, 0.000988706, 0.000979296, 0.000360328, 0.000996693, 0.000996049, 0.000995934, 0.000989367, 0.000989993]
About to get variable in execute output
node_1:1 result shape: [2, 8]; dtype: FLOAT; first values [4.00551e-12, 3.32392e-13, 6.67214e-12, 5.39405e-12, 4.80502e-13, 1.28904e-12, 8.27144e-12, 3.329e-12, 7.66411e-13, 2.23717e-13, 3.17307e-17, 9.08269e-12, 6.35366e-12, 5.99964e-12, 8.6584e-13, 9.78665e-13]
About to get variable in execute output
node_1:2 result shape: [2, 8]; dtype: FLOAT; first values [6.32895e-06, 1.82317e-06, 8.16837e-06, 7.34447e-06, 2.19205e-06, 3.59034e-06, 9.09481e-06, 5.76979e-06, 2.76843e-06, 1.49573e-06, 1.78132e-08, 9.53038e-06, 7.97104e-06, 7.74579e-06, 2.94253e-06, 3.12838e-06]
Executing op: [subtract]
About to get variable in execute output
node_1:0 result shape: [2, 8]; dtype: FLOAT; first values [0.633023, 0.181556, 0.816297, 0.733506, 0.220933, 0.358406, 0.908977, 0.576345, 0.273101, 0.147188, 0.000102085, 0.951392, 0.79244, 0.770769, 0.29114, 0.308904]
Debug info for node_2 input[0]; shape: [32, 6, 37]; ews: [1]; order: [f]; dtype: [FLOAT]; first values: [0.240036, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0, -0]
Debug info for node_2 input[1]; shape: [2, 28, 8]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [0.707904, 0.48191, 0.530009, 0.975718, 0.441828, 0.551604, 0.748229, 0.491702, 0.406794, 0.116409, 0.425937, 0.65109, 0.713855, 0.605524, 0.37033, 0.693307]
Debug info for node_2 input[2]; shape: [2, 2, 8]; ews: [1]; order: [c]; dtype: [FLOAT]; first values: [0.477823, 0.712924, 0.794264, 0.31212, 0.651737, 0.997345, 0.846167, 0.0832938, 0.199648, 0.334917, 0.0688696, 0.962268, 0.212686, 0.444508, 0.939427, 0.395516]
Removing variable <1:0>
Removing variable <1:1>
Removing variable <1:2>
Removing variable <1:3>
Executing op: [lstmLayer]
Error at [/home/runner/work/deeplearning4j/deeplearning4j/libnd4j/include/ops/declarable/generic/nn/recurrent/lstmLayer.cpp:226:0]:
LSTM_LAYER operation: wrong shape of input weights, expected is [2, 37, 8], but got [2, 28, 8] instead !
Exception in thread "main" java.lang.RuntimeException: Op with name lstmLayer and op type [lstmLayer] execution failed with message Op validation failed
at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:1905)
at org.nd4j.linalg.factory.Nd4j.exec(Nd4j.java:6554)
at org.nd4j.autodiff.samediff.internal.InferenceSession.doExec(InferenceSession.java:801)
at org.nd4j.autodiff.samediff.internal.InferenceSession.getOutputs(InferenceSession.java:255)
at org.nd4j.autodiff.samediff.internal.TrainingSession.getOutputs(TrainingSession.java:163)
at org.nd4j.autodiff.samediff.internal.TrainingSession.getOutputs(TrainingSession.java:45)
at org.nd4j.autodiff.samediff.internal.AbstractSession.output(AbstractSession.java:533)
at org.nd4j.autodiff.samediff.internal.AbstractSession.output(AbstractSession.java:154)
at org.nd4j.autodiff.samediff.internal.TrainingSession.trainingIteration(TrainingSession.java:129)
at org.nd4j.autodiff.samediff.SameDiff.fitHelper(SameDiff.java:1936)
at org.nd4j.autodiff.samediff.SameDiff.fit(SameDiff.java:1792)
at org.nd4j.autodiff.samediff.SameDiff.fit(SameDiff.java:1732)
at org.nd4j.autodiff.samediff.config.FitConfig.exec(FitConfig.java:172)
at org.nd4j.autodiff.samediff.SameDiff.fit(SameDiff.java:1712)
at org.deeplearning4j.examples.quickstart.modeling.recurrent.LocationNextNeuralNetworkV6.sameDiff(LocationNextNeuralNetworkV6.java:861)
at org.deeplearning4j.examples.quickstart.modeling.recurrent.LocationNextNeuralNetworkV6.main(LocationNextNeuralNetworkV6.java:199)
CODE
SameDiff sd = SameDiff.create();
Map<String,INDArray> placeholderData = new HashMap<>();

//Properties for dataset:
int nIn = 6;
int nOut = 2;
int miniBatchSize = 32;

while(trainData.hasNext()) {
    placeholderData = new HashMap<>();
    DataSet t = trainData.next();
    System.out.println(" Printing traindata feature and label dataset shape");
    System.out.println(Arrays.toString(t.getFeatures().shape()));
    System.out.println(Arrays.toString(t.getLabels().shape()));
    INDArray features = t.getFeatures();
    INDArray labels = t.getLabels();
    placeholderData.put("input", features);
    placeholderData.put("label", labels);
    long dim0 = t.getFeatures().size(0);
    long dim1 = t.getFeatures().size(1);
    long dim2 = t.getFeatures().size(2);
    System.out.println(" features - dim0 - "+dim0);
    System.out.println(" features - dim1 - "+dim1);
    System.out.println(" features - dim2 - "+dim2);

    //Create input and label variables
    SDVariable input = sd.placeHolder("input", DataType.FLOAT, dim0, dim1, dim2);
    SDVariable label = sd.placeHolder("label", DataType.FLOAT, miniBatchSize, 4);

    LSTMLayerConfig mLSTMConfiguration = LSTMLayerConfig.builder()
            .lstmdataformat(LSTMDataFormat.NTS)
            .directionMode(LSTMDirectionMode.BIDIR_CONCAT)
            .gateAct(LSTMActivations.SIGMOID)
            .cellAct(LSTMActivations.SOFTPLUS)
            .outAct(LSTMActivations.SOFTPLUS)
            .retFullSequence(true)
            .retLastC(false)
            .retLastH(false)
            .build();

    LSTMLayerOutputs outputs = new LSTMLayerOutputs(sd.rnn.lstmLayer(
            input,
            LSTMLayerWeights.builder()
                    .weights(sd.var("weights", Nd4j.rand(DataType.FLOAT, 2, dim2, 4 * nOut)))
                    .rWeights(sd.var("rWeights", Nd4j.rand(DataType.FLOAT, 2, nOut, 4 * nOut)))
                    .bias(sd.var("bias", Nd4j.rand(DataType.FLOAT, 2, 4 * nOut)))
                    .build(),
            mLSTMConfiguration), mLSTMConfiguration);
    // With retFullSequence(true), NTS data format and BIDIR_CONCAT, the LSTM layer returns a 3d
    // sequence output of shape [miniBatchSize, timeSteps, 2*nOut] (the log above shows [32, 6, 4])
    SDVariable layer0 = outputs.getOutput();
    SDVariable layer1 = layer0.mean(1);
    SDVariable w1 = sd.var("w1", new XavierInitScheme('c', nIn, nOut), DataType.FLOAT, 4, 4);
    SDVariable b1 = sd.var("b1", Nd4j.rand(DataType.FLOAT, 1));
    SDVariable out = sd.nn.softmax("out", layer1.mmul(w1).add(b1));

    //Create and set the training configuration
    double learningRate = 1e-3;
    TrainingConfig config = new TrainingConfig.Builder()
            .l2(1e-4)                        //L2 regularization
            .updater(new Adam(learningRate)) //Adam optimizer with specified learning rate
            .dataSetFeatureMapping("input")  //DataSet features array should be associated with variable "input"
            .dataSetLabelMapping("label")    //DataSet label array should be associated with variable "label"
            .build();
    sd.setTrainingConfig(config);

    System.out.println(" Printing sd information");
    System.out.println(sd.toString());
    System.out.println(sd.summary());

    //Perform training for 2 epochs
    int numEpochs = 2;
    sd.fit(trainData, numEpochs);

    //Evaluate on test set:
    String outputVariable = "softmax";
    Evaluation evaluation = new Evaluation();
    sd.evaluate(testData, outputVariable, evaluation);

    //Print evaluation statistics:
    System.out.println(evaluation.stats());
}