Error on saving large SameDiff model with FlatBuffers

Hi

I have a BERT model in SameDiff that I would like to save using the normal “save” method provided by SameDiff. However, I get the following error, caused by the FlatBuffers maximum buffer size:

Caused by: java.lang.AssertionError: FlatBuffers: cannot grow buffer beyond 2 gigabytes.
	at com.google.flatbuffers.FlatBufferBuilder.growByteBuffer(FlatBufferBuilder.java:210)
	at com.google.flatbuffers.FlatBufferBuilder.prep(FlatBufferBuilder.java:257)
	at com.google.flatbuffers.FlatBufferBuilder.startVector(FlatBufferBuilder.java:430)
	at org.nd4j.graph.FlatArray.createBufferVector(FlatArray.java:47)
	at org.nd4j.linalg.api.ndarray.BaseNDArray.toFlatArray(BaseNDArray.java:5449)
	at org.nd4j.linalg.api.ndarray.BaseNDArray.toFlatArray(BaseNDArray.java:5446)
	at org.nd4j.autodiff.samediff.SameDiff.asFlatBuffers(SameDiff.java:4893)
	at org.nd4j.autodiff.samediff.SameDiff.asFlatBuffers(SameDiff.java:4737)
	at org.nd4j.autodiff.samediff.SameDiff.asFlatBuffers(SameDiff.java:4959)
	at org.nd4j.autodiff.samediff.SameDiff.asFlatFile(SameDiff.java:5073)
	at org.nd4j.autodiff.samediff.SameDiff.save(SameDiff.java:4973)
	at org.nd4j.autodiff.samediff.SameDiff.save(SameDiff.java:4993)

Is there an alternative way to store the model? As far as I can see, every save path in SameDiff ends up using FlatBuffers.

I also save the updater state, which makes the model larger. However, I assume that serializing a large model together with the updater state should not be a problem?
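For a rough sense of why the updater state can matter here, the following is a back-of-the-envelope sketch, not a measurement of the actual file. It assumes 4-byte float values, a hypothetical parameter count of about 340 million, and an Adam-style updater that keeps two extra arrays per parameter; none of these numbers come from the model above. It also uses `Integer.MAX_VALUE` as the nominal limit from the error message, while the real threshold inside FlatBuffers may be lower because of how the buffer grows.

```java
class SerializedSizeEstimate {
    // Nominal limit taken from the error message: a Java ByteBuffer is indexed by int.
    // Assumption: the effective FlatBuffers threshold may be lower than this.
    static final long LIMIT_BYTES = Integer.MAX_VALUE;

    // Rough payload size: the parameters plus the extra updater arrays kept per parameter.
    static long estimateBytes(long paramCount, int bytesPerValue, int updaterArraysPerParam) {
        return paramCount * bytesPerValue * (1L + updaterArraysPerParam);
    }

    static boolean exceedsLimit(long bytes) {
        return bytes > LIMIT_BYTES;
    }

    public static void main(String[] args) {
        long params = 340_000_000L;                          // hypothetical parameter count
        long withoutUpdater = estimateBytes(params, 4, 0);   // float32 parameters only
        long withUpdater = estimateBytes(params, 4, 2);      // plus two Adam-style state arrays

        System.out.println("without updater state: " + withoutUpdater
                + " bytes, exceeds limit: " + exceedsLimit(withoutUpdater));
        System.out.println("with updater state:    " + withUpdater
                + " bytes, exceeds limit: " + exceedsLimit(withUpdater));
    }
}
```

Under these assumptions the parameters alone come to 1,360,000,000 bytes, which fits, while adding the updater state triples that to 4,080,000,000 bytes, which does not. The estimate ignores graph metadata and any other serialization overhead.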

I’m currently using beta7.

Thank you :)

Unfortunately, that is currently a limitation of FlatBuffers, and it is one of the reasons we’d like to move away from it eventually, as documented here: SameDiff: Implement new graph format (FlatBuffers storage problems to be resolved) · Issue #8312 · eclipse/deeplearning4j · GitHub

Without saving the updater state it works. I can live with that for now, since I do not plan to continue training an already trained model. In the future, however, this might become a requirement.

Hi,

I run into this problem even with saving of the updater state set to false.

Any suggestions as to how I can resolve/work around this issue?

Thanks

@adonnini what version are you using? Could you post more context?

@agibsonccc I am using m2.

The failure occurs when I execute the following code:

        try {
            sd.asFlatFile(saveFileForInference, false);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }

I run it after calling sd.fit.

Do you need any additional information?

Thanks