Is there an example of wiring up a network similar to keras-tcn (https://github.com/philipperemy/keras-tcn, a Keras Temporal Convolutional Network) in DL4J? I am trying to use this style of network on a sequence of GloVe word embeddings, with a setup like the one below:
ComputationGraphConfiguration conf = new NeuralNetConfiguration.Builder()
.seed(12345)
.optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
.updater(new Nesterovs()).weightInit(WeightInit.XAVIER).activation(Activation.TANH)
.graphBuilder()
.addInputs("in")
.addLayer("c0", new Convolution1D.Builder(3, 1).convolutionMode(ConvolutionMode.Causal).dilation(1).nIn(300).nOut(256).build(), "in")
.addLayer("c1", new Convolution1D.Builder(3, 1).convolutionMode(ConvolutionMode.Causal).dilation(2).nIn(256).nOut(256).build(), "c0")
.addLayer("c2", new Convolution1D.Builder(3, 1).convolutionMode(ConvolutionMode.Causal).dilation(4).nIn(256).nOut(256).build(), "c1")
.addLayer("c3", new Convolution1D.Builder(3, 1).convolutionMode(ConvolutionMode.Causal).dilation(8).nIn(256).nOut(256).build(), "c2")
.addLayer("p2", new GlobalPoolingLayer.Builder(PoolingType.MAX).build(), "c3")
.addLayer("batchNorm", new BatchNormalization.Builder().nIn(256).nOut(256).build(), "p2")
.addLayer("l0", new DenseLayer.Builder().nIn(256).nOut(256).dropOut(0.5).build(), "batchNorm")
.addLayer("l1", new DenseLayer.Builder().nIn(256).nOut(128).build(), "l0")
.addLayer("out", new OutputLayer.Builder().nIn(128).nOut(1).lossFunction(LossFunctions.LossFunction.XENT).activation(Activation.SIGMOID).build(), "l1")
.setOutputs("out")
.build();
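As an aside, the receptive field of this stack of causal convolutions (kernel size 3, dilations 1, 2, 4, 8) works out to 1 + Σ(k − 1)·d = 31 timesteps, so short sequences should still be covered. A quick sanity-check sketch (the helper name is mine, not DL4J API):

```java
// Receptive field of a stack of causal dilated 1-D convolutions:
// each layer with kernel size k and dilation d widens the field by (k - 1) * d.
public class ReceptiveField {
    static int receptiveField(int kernelSize, int[] dilations) {
        int rf = 1; // the current timestep itself
        for (int d : dilations) {
            rf += (kernelSize - 1) * d;
        }
        return rf;
    }

    public static void main(String[] args) {
        // kernel 3, dilations 1, 2, 4, 8 as in the config above
        System.out.println(receptiveField(3, new int[]{1, 2, 4, 8})); // prints 31
    }
}
```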
But I keep getting errors like the following:
Failed to execute op conv1d. Attempted to execute with 3 inputs, 1 outputs, 0 targs,0 bargs and 6 iargs. Inputs: [(FLOAT,[200,256,300],c), (FLOAT,[3,256,256],f), (FLOAT,[256],c)]. Outputs: [(FLOAT,[200,256,300],f)]. tArgs: -. iArgs: [3, 1, 0, 4, 2, 0]. bArgs: -. Op own name: "f428232c-01be-4f85-a02a-91a72a8adb6f" - Please see above message (printed out from c++) for a possible cause of error.
java.lang.RuntimeException: Op [conv1d] execution failed
at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:1723) ~[nd4j-native-1.0.0-beta6.jar:?]
at org.nd4j.linalg.factory.Nd4j.exec(Nd4j.java:6599) ~[nd4j-api-1.0.0-beta6.jar:1.0.0-beta6]
at org.deeplearning4j.nn.layers.convolution.Convolution1DLayer.causalConv1dForward(Convolution1DLayer.java:206) ~[deeplearning4j-nn-1.0.0-beta6.jar:?]
at org.deeplearning4j.nn.layers.convolution.Convolution1DLayer.preOutput(Convolution1DLayer.java:161) ~[deeplearning4j-nn-1.0.0-beta6.jar:?]
at org.deeplearning4j.nn.layers.convolution.ConvolutionLayer.activate(ConvolutionLayer.java:446) ~[deeplearning4j-nn-1.0.0-beta6.jar:?]
at org.deeplearning4j.nn.layers.convolution.Convolution1DLayer.activate(Convolution1DLayer.java:212) ~[deeplearning4j-nn-1.0.0-beta6.jar:?]
at org.deeplearning4j.nn.graph.vertex.impl.LayerVertex.doForward(LayerVertex.java:111) ~[deeplearning4j-nn-1.0.0-beta6.jar:?]
at org.deeplearning4j.nn.graph.ComputationGraph.ffToLayerActivationsInWS(ComputationGraph.java:2136) ~[deeplearning4j-nn-1.0.0-beta6.jar:?]
at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1373) ~[deeplearning4j-nn-1.0.0-beta6.jar:?]
at org.deeplearning4j.nn.graph.ComputationGraph.computeGradientAndScore(ComputationGraph.java:1342) ~[deeplearning4j-nn-1.0.0-beta6.jar:?]
at org.deeplearning4j.optimize.solvers.BaseOptimizer.gradientAndScore(BaseOptimizer.java:170) ~[deeplearning4j-nn-1.0.0-beta6.jar:?]
at org.deeplearning4j.optimize.solvers.StochasticGradientDescent.optimize(StochasticGradientDescent.java:63) ~[deeplearning4j-nn-1.0.0-beta6.jar:?]
at org.deeplearning4j.optimize.Solver.optimize(Solver.java:52) ~[deeplearning4j-nn-1.0.0-beta6.jar:?]
at org.deeplearning4j.nn.graph.ComputationGraph.fitHelper(ComputationGraph.java:1166) ~[deeplearning4j-nn-1.0.0-beta6.jar:?]
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1116) ~[deeplearning4j-nn-1.0.0-beta6.jar:?]
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1083) ~[deeplearning4j-nn-1.0.0-beta6.jar:?]
at org.deeplearning4j.nn.graph.ComputationGraph.fit(ComputationGraph.java:1019) ~[deeplearning4j-nn-1.0.0-beta6.jar:?]
at com.masked.aiproject.GloveDilatedCNN.trainEpoch(GloveDilatedCNN.java:264) ~[main/:?]
at com.masked.aiproject.GloveDilatedCNN.handleFile(GloveDilatedCNN.java:205) ~[main/:?]
at com.masked.aiproject.GloveDilatedCNN.lambda$main$0(GloveDilatedCNN.java:117) [main/:?]
at com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly(TrustedListenableFutureTask.java:125) [guava-26.0-jre.jar:?]
at com.google.common.util.concurrent.InterruptibleTask.run(InterruptibleTask.java:57) [guava-26.0-jre.jar:?]
at com.google.common.util.concurrent.TrustedListenableFutureTask.run(TrustedListenableFutureTask.java:78) [guava-26.0-jre.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:834) [?:?]
Caused by: java.lang.RuntimeException: could not create a dilated convolution forward descriptor
at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:2019) ~[nd4j-native-1.0.0-beta6.jar:?]
at org.nd4j.linalg.cpu.nativecpu.ops.NativeOpExecutioner.exec(NativeOpExecutioner.java:1713) ~[nd4j-native-1.0.0-beta6.jar:?]
... 25 more
Does anyone have an example or any tips on how to debug this?
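In case it helps to reproduce, this is roughly how I would try to isolate a single dilated causal layer to see whether the descriptor error occurs on its own (the output layer and shapes here are guesses to make the fragment self-contained, not my actual training code):

```java
// Minimal repro sketch: one causal dilated Convolution1D fed random data,
// to check whether the "dilated convolution forward descriptor" error
// happens outside the full network.
ComputationGraphConfiguration conf = new NeuralNetConfiguration.Builder()
        .seed(12345)
        .graphBuilder()
        .addInputs("in")
        .addLayer("c", new Convolution1D.Builder(3, 1)
                .convolutionMode(ConvolutionMode.Causal)
                .dilation(2)
                .nIn(256).nOut(256).build(), "in")
        .addLayer("out", new RnnOutputLayer.Builder()
                .nIn(256).nOut(1)
                .lossFunction(LossFunctions.LossFunction.MSE)
                .activation(Activation.IDENTITY).build(), "c")
        .setOutputs("out")
        .build();

ComputationGraph g = new ComputationGraph(conf);
g.init();

// Shape [miniBatch, channels, seqLength], matching the shapes in the error above.
INDArray feat = Nd4j.rand(new int[]{200, 256, 300});
g.outputSingle(feat); // forward pass only; fails here if the conv1d op is the culprit
```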
Edit: Added formatting so it is easier to read the post.