Model diverges after epoch N


I have a rather newbie question about training (see the picture).

Is it normal for a model to show divergence after epoch N (say N=100)? Or is it a sign of a poorly designed model/dataset?

What could cause the score to collapse at iteration #5000? (see the green curve in the picture)

The score seems to be affected by some bias (see the yellow curve in the picture). Can anything be deduced from such behaviour?

UPDATE: after reducing the learning rate, the model converges well.
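
For reference, one way to reduce the learning rate over training is a step-decay schedule. This is just a minimal framework-agnostic sketch; the `step_decay` helper and its parameter values are my own illustration, not the exact setup used above:

```python
def step_decay(initial_lr, epoch, drop=0.5, epochs_per_drop=50):
    """Multiply the learning rate by `drop` every `epochs_per_drop` epochs."""
    return initial_lr * (drop ** (epoch // epochs_per_drop))

# With these (illustrative) settings, the rate has been halved twice
# by epoch 100, which is often enough to prevent late-stage divergence.
print(step_decay(0.01, 0))    # 0.01
print(step_decay(0.01, 100))  # 0.0025
```

Most frameworks also ship schedulers that do this automatically (e.g. reducing the rate when a monitored metric plateaus), which avoids hand-tuning the drop epoch.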