Differences in Shakespeare generator example

Hello, I’m a new visitor to the deep learning community, trying to find my way around after reading the O’Reilly book about DeepLearning4J. I tried to adapt the example from the book (GravesLSTMCharModellingExample), only to be told by Adam Gibson on Gitter that the example has already been ported here: deeplearning4j-examples/GenerateTxtModel.java at master · eclipse/deeplearning4j-examples (github.com). My attempt at the adaptation is here:

Graves LSTM example adapted for DL4J 1.0 (github.com)

Now, the one thing I noticed is that while my adapted example performs very badly (the network hardly converges), the official one works very well (even the first set of samples is already pretty good). I’m wondering what the major differences would be. Is it the choice of updater? Was the regularization rate too high in the original example (0.01 vs. 0.001)? Or did I make a configuration error somewhere that I didn’t notice?
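To make the question concrete, here is a sketch of the two configuration knobs I mean. This is not code copied from either repository; the layer sizes, learning rates, and overall builder chain are illustrative placeholders, and only the L2 values (0.01 vs. 0.001) come from my question above:

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.conf.NeuralNetConfiguration;
import org.deeplearning4j.nn.conf.layers.LSTM;
import org.deeplearning4j.nn.conf.layers.RnnOutputLayer;
import org.nd4j.linalg.activations.Activation;
import org.nd4j.linalg.learning.config.Adam;
import org.nd4j.linalg.learning.config.RmsProp;
import org.nd4j.linalg.lossfunctions.LossFunctions;

public class ConfigDiffSketch {

    // Builds one of two hypothetical configurations that differ only in the
    // two settings under discussion: the updater and the L2 rate.
    static MultiLayerConfiguration build(boolean likeMyAdaptation, int nIn, int nOut) {
        return new NeuralNetConfiguration.Builder()
                // Updater choice: RmsProp vs. Adam (learning rates are made up here).
                .updater(likeMyAdaptation ? new RmsProp(0.1) : new Adam(0.005))
                // Regularization rate: 0.01 in my adaptation vs. 0.001.
                .l2(likeMyAdaptation ? 0.01 : 0.001)
                .list()
                .layer(new LSTM.Builder().nIn(nIn).nOut(200)
                        .activation(Activation.TANH).build())
                .layer(new RnnOutputLayer.Builder(LossFunctions.LossFunction.MCXENT)
                        .activation(Activation.SOFTMAX).nIn(200).nOut(nOut).build())
                .build();
    }
}
```

In other words: is a delta of this size (updater family plus a 10x difference in L2) enough on its own to explain "hardly converges" vs. "good samples from the first epoch"?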

And BTW, the ScoreIterationListener doesn’t seem to print anything to the console in this case - is this expected behavior?
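For reference, this is how I understand the listener is meant to be attached (a sketch, not my exact code; the print frequency of 10 is just an example value, and `conf` stands for whatever MultiLayerConfiguration the network was built with):

```java
import org.deeplearning4j.nn.conf.MultiLayerConfiguration;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.optimize.listeners.ScoreIterationListener;

public class ListenerSketch {
    static MultiLayerNetwork withScoreListener(MultiLayerConfiguration conf) {
        MultiLayerNetwork net = new MultiLayerNetwork(conf);
        net.init();
        // My expectation: this logs the current score every 10 training iterations.
        net.setListeners(new ScoreIterationListener(10));
        return net;
    }
}
```

If the listener is attached like this and still prints nothing, is that a logging configuration issue (e.g., no SLF4J backend on the classpath) rather than a DL4J one?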