BertInferenceExample, fine-tune question

In this example, I want to fine-tune my own BERT model by following these instructions:

https://github.com/KonduitAI/dl4j-dev-tools/tree/master/import-tests/model_zoo/bert

As Google says, the eval_accuracy should be expected to be between 84% and 88%, like this:
(screenshot of the expected eval results)
but after my fine-tuning, my result is:
(screenshot of my eval results)

My accuracy is only 68%, even though I followed the fine-tuning instructions strictly; I wonder where the problem is.
I appreciate your help.

I can get 'eval_accuracy = 0.8627451' on Linux using CPU to train.

The log:

INFO:tensorflow:Saving 'checkpoint_path' summary for global step 2751: /TF_Graphs/mrpc_output/model.ckpt-2751
I0720 10:59:38.224854 139784295372608 estimator.py:2109] Saving 'checkpoint_path' summary for global step 2751: /TF_Graphs/mrpc_output/model.ckpt-2751
INFO:tensorflow:evaluation_loop marked as finished
I0720 10:59:38.225406 139784295372608 error_handling.py:101] evaluation_loop marked as finished
INFO:tensorflow:***** Eval results *****
I0720 10:59:38.225584 139784295372608 run_classifier.py:923] ***** Eval results *****
INFO:tensorflow: eval_accuracy = 0.8627451
I0720 10:59:38.225656 139784295372608 run_classifier.py:925] eval_accuracy = 0.8627451
INFO:tensorflow: eval_loss = 0.7270211
I0720 10:59:38.225800 139784295372608 run_classifier.py:925] eval_loss = 0.7270211
INFO:tensorflow: global_step = 2751
I0720 10:59:38.225892 139784295372608 run_classifier.py:925] global_step = 2751
INFO:tensorflow: loss = 0.7270211
I0720 10:59:38.225950 139784295372608 run_classifier.py:925] loss = 0.7270211

I'll try Linux. Maybe it's caused by the system, because I fine-tuned BERT on Windows with tensorflow-gpu.

I tried to freeze chinese_L-12_H-768_A-12.zip using a command like 'python3 freezeTrainedBert.py --input_dir=/chinese_L-12_H-768_A-12 --ckpt=bert_model.ckpt' and got this error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Input node loss/Softmax not found in graph
Should I look into freezeTrainedBert.py?
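
One possible explanation: chinese_L-12_H-768_A-12 is the pretrained base model, so a classifier output node such as loss/Softmax would normally only exist in a checkpoint that has already been fine-tuned for classification. A quick sanity check (a sketch, assuming TF 1.x and that the unpacked directory contains bert_model.ckpt.meta) is to list the op names in the graph before trying to freeze it:

import tensorflow as tf

# Hypothetical path: the checkpoint prefix inside the unpacked chinese_L-12_H-768_A-12 directory.
ckpt_prefix = "/chinese_L-12_H-768_A-12/bert_model.ckpt"

# Rebuild the graph recorded in the .meta file and print candidate output-node names,
# to see whether a node like loss/Softmax actually exists in this checkpoint.
tf.train.import_meta_graph(ckpt_prefix + ".meta")
for op in tf.get_default_graph().get_operations():
    if "Softmax" in op.name or op.name.startswith("loss"):
        print(op.name)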

I haven't gotten to the freeze-graph step yet, but following the steps in that tutorial page it should be fine.
I re-ran the fine-tuning on Linux and found that fine-tuning on CPU works: I got eval_accuracy = 0.8480392.

So it's fine on CPU.
What I care about is how to load chinese_L-12_H-768_A-12 in order to do transfer training with custom data.

For custom data, you may need to write your own data processor.
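
If you are using Google's run_classifier.py, the usual way is to subclass its DataProcessor. A minimal sketch (assuming the google-research/bert run_classifier.py and a hypothetical tab-separated file with columns label, text_a, text_b) might look like this:

import os

from run_classifier import DataProcessor, InputExample  # from google-research/bert


class MyProcessor(DataProcessor):
  """Hypothetical processor for TSV files with columns: label, text_a, text_b."""

  def get_train_examples(self, data_dir):
    return self._create_examples(
        self._read_tsv(os.path.join(data_dir, "train.tsv")), "train")

  def get_dev_examples(self, data_dir):
    return self._create_examples(
        self._read_tsv(os.path.join(data_dir, "dev.tsv")), "dev")

  def get_labels(self):
    return ["0", "1"]

  def _create_examples(self, lines, set_type):
    examples = []
    for i, line in enumerate(lines):
      guid = "%s-%d" % (set_type, i)
      examples.append(InputExample(
          guid=guid, text_a=line[1], text_b=line[2], label=line[0]))
    return examples

You would then register it in the processors dict in run_classifier.py's main() so it can be selected with --task_name.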

Could you compare the results on CPU and GPU? Are you saying the numbers are different? Make sure to make it reproducible (setting a seed, same parameters, …) to see whether we have a reproducible issue here.
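
For example (a sketch, assuming TF 1.x; note that some GPU kernels are non-deterministic, so this only pins the obvious sources of randomness):

import random

import numpy as np
import tensorflow as tf

SEED = 42  # arbitrary fixed value, so repeated runs start from the same state
random.seed(SEED)
np.random.seed(SEED)
tf.set_random_seed(SEED)  # graph-level seed in TF 1.x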

Do you mean my fine-tuning process? I have compared results on both CPU & GPU and on both Windows & Ubuntu. I'd like to share my experiments.

I fine-tuned BERT on the MRPC task. The experiment is presented below.

Environment
python3.6
tensorflow 1.11.0 # use this for training on CPU
tensorflow-gpu 1.11.0 # use this for training on GPU

Parameters:
max_seq_length=128
train_batch_size=4
learning_rate=2e-5
num_train_epochs=3.0
All of these parameters are listed at
https://github.com/KonduitAI/dl4j-dev-tools/tree/master/import-tests/model_zoo/bert
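
For reference, the standard way to pass these values is as command-line flags to Google's run_classifier.py, roughly like this (a sketch; $BERT_BASE_DIR and $GLUE_DIR are placeholder paths for the pretrained checkpoint and the GLUE data):

python run_classifier.py \
  --task_name=MRPC \
  --do_train=true \
  --do_eval=true \
  --data_dir=$GLUE_DIR/MRPC \
  --vocab_file=$BERT_BASE_DIR/vocab.txt \
  --bert_config_file=$BERT_BASE_DIR/bert_config.json \
  --init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt \
  --max_seq_length=128 \
  --train_batch_size=4 \
  --learning_rate=2e-5 \
  --num_train_epochs=3.0 \
  --output_dir=/TF_Graphs/mrpc_output/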

Experiment
Win10 + tensorflow 1.11.0 + CPU: eval_accuracy = 0.68
Win10 + tensorflow-gpu 1.11.0 + GPU: eval_accuracy = 0.68
Ubuntu + tensorflow 1.11.0 + CPU: eval_accuracy = 0.84
Ubuntu + tensorflow-gpu 1.11.0 + GPU: eval_accuracy = 0.68

Obviously, only on Ubuntu with CPU is the result correct. So we can only fine-tune BERT on CPU under Ubuntu, but that takes 2-3 hours per training run; on GPU it only takes about a quarter of an hour per run, but the result is wrong.
That's all of my experiments.

Oh, it's ridiculous: on my virtual machine with Ubuntu, using CPU, eval_accuracy = 0.84; then I installed Ubuntu on my computer (dual boot), and using CPU, eval_accuracy = 0.68. I'm confused :joy:

Solved. I was passing the parameters in the wrong way; I had seen that method on a blog. Following Google's instructions and passing the parameters on the command line, everything is OK.