I am building a network whose first layer is a 1D convolution (Convolution1DLayer). The network runs for several training iterations and then the process dies with the error "Process finished with exit code -1073740791 (0xC0000409)". The crash only happens when using CUDA; on the CPU backend everything works fine. Everything also works when this layer is not the first one but, for example, the second or third. I have tried varying hyperparameters, activation functions, etc., but nothing helps. With a smaller batch size the network gets through a few more iterations before crashing.
You can reproduce it with the following code:
import java.util.ArrayList;
import java.util.Random;

import org.deeplearning4j.datasets.iterator.impl.ListDataSetIterator;
import org.deeplearning4j.nn.api.OptimizationAlgorithm;
import org.deeplearning4j.nn.conf.BackpropType;
import org.deeplearning4j.nn.conf.ConvolutionMode;
import org.deeplearning4j.nn.conf.GradientNormalization;
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.RNNFormat;
import org.deeplearning4j.nn.conf.WorkspaceMode;
import org.deeplearning4j.nn.conf.inputs.InputType;
import org.deeplearning4j.nn.conf.layers.Convolution1DLayer;
import org.deeplearning4j.nn.conf.layers.ConvolutionLayer;
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.nn.weights.WeightInit;
import org.deeplearning4j.optimize.listeners.ScoreIterationListener;
import org.nd4j.jita.conf.CudaEnvironment;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.dataset.DataSet;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.factory.Nd4j;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class ConvCrashRepro {

    public static void main(String[] args) {
        CudaEnvironment.getInstance().getConfiguration()
                .setMaximumDeviceCacheableLength(1024 * 1024 * 2048L)
                .setMaximumDeviceCache((long) (0.5 * 6096 * 1024 * 1024 * 2048L))
                .setMaximumHostCacheableLength(1024 * 1024 * 2048L)
                .setMaximumHostCache((long) (0.5 * 6096 * 1024 * 1024 * 2048L));
        Nd4j.getMemoryManager().setAutoGcWindow(100000);
        Nd4j.getMemoryManager().togglePeriodicGc(false);
        Nd4j.getEnvironment().allowHelpers(false);

        int batchSize = 64;
        int nEpochs = 10;
        int numSamples = 100;
        int inputSize = 10;
        int sequenceLength = 60;
        int outputSize = 3;

        // Generate random data in NCW layout: [miniBatch, channels, width]
        Random rng = new Random();
        ArrayList<DataSet> sets = new ArrayList<>();
        for (int i = 0; i < numSamples; i++) {
            double[][][] input = new double[1][inputSize][sequenceLength];
            for (int j = 0; j < inputSize; j++) {
                for (int k = 0; k < sequenceLength; k++) {
                    input[0][j][k] = rng.nextDouble();
                }
            }
            double[][][] output = new double[1][outputSize][sequenceLength];
            for (int j = 0; j < outputSize; j++) {
                for (int k = 0; k < sequenceLength; k++) {
                    output[0][j][k] = rng.nextDouble();
                }
            }
            INDArray in = Nd4j.createFromArray(input);
            INDArray out = Nd4j.createFromArray(output);
            sets.add(new DataSet(in, out));
        }
        DataSetIterator iterator = new ListDataSetIterator<>(sets, batchSize);

        MultiLayerConfiguration config = new NeuralNetConfiguration.Builder()
                .seed(123)
                .trainingWorkspaceMode(WorkspaceMode.ENABLED)
                .inferenceWorkspaceMode(WorkspaceMode.ENABLED)
                .optimizationAlgo(OptimizationAlgorithm.STOCHASTIC_GRADIENT_DESCENT)
                .gradientNormalization(GradientNormalization.ClipElementWiseAbsoluteValue)
                .weightInit(WeightInit.LECUN_NORMAL)
                .activation(Activation.TANH)
                .convolutionMode(ConvolutionMode.Causal)
                .cudnnAlgoMode(ConvolutionLayer.AlgoMode.NO_WORKSPACE)
                .updater(new Adam())
                .list()
                .setInputType(InputType.recurrent(inputSize, sequenceLength, RNNFormat.NCW))
                .layer(new Convolution1DLayer.Builder(3, 1) // kernel size 3, stride 1
                        .nOut(100)
                        .activation(Activation.TANH)
                        .build())
                .layer(new RnnOutputLayer.Builder(LossFunctions.LossFunction.MSE)
                        .activation(Activation.IDENTITY)
                        .nOut(outputSize)
                        .build())
                .backpropType(BackpropType.Standard)
                .build();

        MultiLayerNetwork model = new MultiLayerNetwork(config);
        System.out.println(model.summary());
        model.setListeners(new ScoreIterationListener(1));
        for (int i = 0; i < nEpochs; i++) {
            model.fit(iterator);
        }
    }
}
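As a sanity check, the parameter counts that model.summary() prints (see the console output below: 3 100 for the convolution layer, 303 for the output layer) match what I would expect from the layer shapes. A quick standalone computation, plain Java with no DL4J needed:

```java
public class ParamCountCheck {
    public static void main(String[] args) {
        int nIn = 10, convOut = 100, kernel = 3, nOutFinal = 3;
        // Convolution1DLayer: W has shape {convOut, nIn, kernel, 1}, plus one bias per output channel
        int convParams = convOut * nIn * kernel * 1 + convOut;
        // RnnOutputLayer: dense W {convOut, nOutFinal} applied per timestep, plus bias {nOutFinal}
        int outParams = convOut * nOutFinal + nOutFinal;
        System.out.println(convParams + " + " + outParams + " = " + (convParams + outParams));
        // prints "3100 + 303 = 3403"
        if (convParams + outParams != 3403) throw new AssertionError();
    }
}
```

So the model itself is tiny; the crash does not look like it could be caused by parameter memory alone.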
Here is the dependencies section of my pom.xml:
<dependencies>
    <!-- DL4j and Nd4j dependencies start -->
    <dependency>
        <groupId>org.nd4j</groupId>
        <artifactId>nd4j-native</artifactId>
        <version>${dl4j-master.version}</version>
    </dependency>
    <dependency>
        <groupId>org.nd4j</groupId>
        <artifactId>nd4j-native-platform</artifactId>
        <version>${dl4j-master.version}</version>
    </dependency>
    <dependency>
        <groupId>org.nd4j</groupId>
        <artifactId>nd4j-native-api</artifactId>
        <version>${dl4j-master.version}</version>
    </dependency>
    <dependency>
        <groupId>org.nd4j</groupId>
        <artifactId>nd4j-backend-impls</artifactId>
        <version>${dl4j-master.version}</version>
        <type>pom</type>
    </dependency>
    <dependency>
        <groupId>org.deeplearning4j</groupId>
        <artifactId>deeplearning4j-core</artifactId>
        <version>${dl4j-master.version}</version>
    </dependency>
    <dependency>
        <groupId>org.nd4j</groupId>
        <artifactId>nd4j-api</artifactId>
        <version>${dl4j-master.version}</version>
    </dependency>
    <dependency>
        <groupId>org.deeplearning4j</groupId>
        <artifactId>deeplearning4j-ui</artifactId>
        <version>${dl4j-master.version}</version>
    </dependency>
    <dependency>
        <groupId>org.webjars</groupId>
        <artifactId>jquery-ui</artifactId>
        <version>1.13.2</version>
    </dependency>
    <dependency>
        <groupId>org.webjars</groupId>
        <artifactId>jquery</artifactId>
        <version>3.7.1</version>
    </dependency>
    <dependency>
        <groupId>org.webjars</groupId>
        <artifactId>bootstrap</artifactId>
        <version>5.3.3</version>
    </dependency>
    <dependency>
        <groupId>org.webjars.bower</groupId>
        <artifactId>lodash</artifactId>
        <version>4.17.21</version>
    </dependency>
    <dependency>
        <groupId>com.beust</groupId>
        <artifactId>jcommander</artifactId>
        <version>1.82</version>
    </dependency>
    <dependency>
        <groupId>io.vertx</groupId>
        <artifactId>vertx-core</artifactId>
        <version>3.9.13</version>
    </dependency>
    <dependency>
        <groupId>io.vertx</groupId>
        <artifactId>vertx-web</artifactId>
        <version>3.9.13</version>
    </dependency>
    <!-- DL4j and Nd4j dependencies end -->
    <!-- CUDA dependencies start -->
    <dependency>
        <groupId>org.nd4j</groupId>
        <artifactId>nd4j-cuda-11.6</artifactId>
        <version>${dl4j-master.version}</version>
    </dependency>
    <dependency>
        <groupId>org.nd4j</groupId>
        <artifactId>nd4j-cuda-11.6</artifactId>
        <version>${dl4j-master.version}</version>
        <classifier>windows-x86_64-cudnn</classifier>
    </dependency>
    <!-- CUDA dependencies end -->
And here is the console output:
2024-05-17 12:18:41.504[1715937521504] | INFO | main | org.nd4j.linalg.factory.Nd4jBackend - Loaded [JCublasBackend] backend
2024-05-17 12:18:44.766[1715937524766] | INFO | main | org.nd4j.nativeblas.NativeOpsHolder - Number of threads used for linear algebra: 32
2024-05-17 12:18:44.821[1715937524821] | INFO | main | o.n.l.a.o.e.DefaultOpExecutioner - Backend used: [CUDA]; OS: [Windows 11]
2024-05-17 12:18:44.821[1715937524821] | INFO | main | o.n.l.a.o.e.DefaultOpExecutioner - Cores: [6]; Memory: [4,0GB];
2024-05-17 12:18:44.821[1715937524821] | INFO | main | o.n.l.a.o.e.DefaultOpExecutioner - Blas vendor: [CUBLAS]
2024-05-17 12:18:44.831[1715937524831] | INFO | main | o.nd4j.linalg.jcublas.JCublasBackend - ND4J CUDA build version: 11.6.55
2024-05-17 12:18:44.833[1715937524833] | INFO | main | o.nd4j.linalg.jcublas.JCublasBackend - CUDA device 0: [NVIDIA GeForce GTX 1060 6GB]; cc: [6.1]; Total memory: [6442319872]
2024-05-17 12:18:44.833[1715937524833] | INFO | main | o.nd4j.linalg.jcublas.JCublasBackend - Backend build information:
MSVC: 192930146
STD version: 201402L
DEFAULT_ENGINE: samediff::ENGINE_CUDA
HAVE_FLATBUFFERS
HAVE_CUDNN
2024-05-17 12:18:46.527[1715937526527] | INFO | main | o.d.nn.multilayer.MultiLayerNetwork - Starting MultiLayerNetwork with WorkspaceModes set to [training: ENABLED; inference: ENABLED], cacheMode set to [NONE]
==============================================================================
LayerName (LayerType) nIn,nOut TotalParams ParamsShape
==============================================================================
layer0 (Convolution1DLayer) 10,100 3 100 b:{100}, W:{100,10,3,1}
layer1 (RnnOutputLayer) 100,3 303 W:{100,3}, b:{3}
------------------------------------------------------------------------------
Total Parameters: 3 403
Trainable Parameters: 3 403
Frozen Parameters: 0
==============================================================================
2024-05-17 12:18:46.840[1715937526840] | INFO | main | o.d.o.l.ScoreIterationListener - Score at iteration 0 is 6.761971791585286
2024-05-17 12:18:46.884[1715937526884] | INFO | main | o.d.o.l.ScoreIterationListener - Score at iteration 1 is 5.826276425962095
2024-05-17 12:18:46.927[1715937526927] | INFO | main | o.d.o.l.ScoreIterationListener - Score at iteration 2 is 4.543416659037272
2024-05-17 12:18:46.957[1715937526957] | INFO | main | o.d.o.l.ScoreIterationListener - Score at iteration 3 is 3.873487967031973
Process finished with exit code -1073740791 (0xC0000409)
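One thing I noticed while writing this up: the cache limits I configure at the top of main evaluate to values far larger than the card's 6 GB (whether ND4J clamps these internally, I don't know, so this may or may not be related). A quick check of the arithmetic, using the "Total memory" value from the log above:

```java
public class CacheMathCheck {
    public static void main(String[] args) {
        // value passed to setMaximumDeviceCacheableLength / setMaximumHostCacheableLength
        long cacheableLength = 1024 * 1024 * 2048L;
        // value passed to setMaximumDeviceCache / setMaximumHostCache
        long cacheLimit = (long) (0.5 * 6096 * 1024 * 1024 * 2048L);
        // "Total memory: [6442319872]" reported for the GTX 1060 in the log
        long deviceMemory = 6_442_319_872L;

        System.out.println(cacheableLength);            // 2147483648  (2 GiB)
        System.out.println(cacheLimit);                 // 6545530159104  (~6.5 TB)
        System.out.println(cacheLimit / deviceMemory);  // 1016, i.e. ~1000x the GPU's memory
    }
}
```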