Is it possible to implement a CRF layer using ComputationGraph?

Hi,
Recently I needed an implementation of a BiLSTM-CRF model for one of my school projects. I searched online and found that DL4J implements neither a CRF layer nor BiLSTM-CRF. So I'm wondering: can I implement the CRF layer of BiLSTM-CRF using ComputationGraph and a custom layer, or at least using SameDiff? I'm not sure how to do it, but if it's possible, I'll dig into it.
Thanks and Regards

In theory, yes, it should be possible to implement it by extending SameDiffLayer, like the SelfAttentionLayer we currently have (https://github.com/eclipse/deeplearning4j/blob/master/deeplearning4j/deeplearning4j-nn/src/main/java/org/deeplearning4j/nn/conf/layers/SelfAttentionLayer.java). But the recurrent part might be a little tricky.

Thanks, I'll give it a try. BTW, will DL4J officially support a CRF layer at some point?

At the moment there are no plans to add a CRF Layer officially.

Hi, it’s me again. After reading a lot of documentation and code, I decided to give it a try. However, while following a PyTorch version of a CRF implementation (and according to the SameDiff documentation, PyTorch should be fairly similar to SameDiff), I ran into some differences.

Line 88 of crf.py uses an operation like this: torch.ones(emissions.shape[:2], dtype=torch.float). It seems the author creates an NDArray with a variable shape, where emissions has shape [seq_len, batch_size, nb_labels]. But I can’t find a corresponding operation in SameDiff. The getShape() function doesn’t return the true shape of the input data, which I confirmed with layerInput: calling layerInput.getShape() in defineLayer(...) returns [-1, 14, 99], which is obviously not the true shape, since the batch size won’t be -1. So… any ideas?
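For reference, the PyTorch line in question only builds a [seq_len, batch_size] tensor of ones (a mask over the first two axes of emissions). A NumPy sketch of what it computes, with made-up dimensions matching the shapes above:

```python
import numpy as np

# Hypothetical dimensions: emissions has shape [seq_len, batch_size, nb_labels]
seq_len, batch_size, nb_labels = 14, 4, 99
emissions = np.zeros((seq_len, batch_size, nb_labels), dtype=np.float32)

# torch.ones(emissions.shape[:2]) creates a mask over the first two axes
mask = np.ones(emissions.shape[:2], dtype=np.float32)
print(mask.shape)  # (14, 4)
```

The -1 returned by getShape() plays the role of a dynamic dimension whose concrete value is only known at execution time, so instead of reading the shape eagerly you generally have to build shape-dependent tensors from shape ops inside the graph, which are resolved at runtime.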

Also, at line 151 there is a for loop that depends on seq_length, which comes from tags.shape, and for now I have no idea how to do this in SameDiff. I’m trying to extend SameDiffOutputLayer, as the documentation in SameDiffLayer suggests.
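For context, a loop like that in CRF implementations typically accumulates the score of the gold tag sequence: one emission score plus one transition score per timestep. A NumPy sketch of that computation, for a single sequence (the function and variable names are mine, not from crf.py):

```python
import numpy as np

def gold_score(emissions, tags, transitions):
    """Score of one gold tag path: sum of emission and transition
    scores along the sequence.
    emissions: [seq_len, nb_labels], tags: [seq_len] (int),
    transitions: [nb_labels, nb_labels]."""
    score = emissions[0, tags[0]]
    for t in range(1, len(tags)):  # this is the loop over seq_length
        score += transitions[tags[t - 1], tags[t]] + emissions[t, tags[t]]
    return score

emissions = np.array([[1.0, 0.0],
                      [0.0, 2.0],
                      [3.0, 0.0]])
transitions = np.array([[0.5, -0.5],
                        [0.2,  0.1]])
tags = np.array([0, 1, 0])
print(gold_score(emissions, tags, transitions))  # 1.0 + (-0.5 + 2.0) + (0.2 + 3.0) = 5.7
```

Since the loop bound is the sequence length, in a static graph you would usually either unroll it up to a fixed maximum length (masking out padded timesteps) or express it with gather/sum-style tensor ops, which is what makes the port tricky.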

I’m a bit confused, since I can’t find corresponding operations for many PyTorch functions. It also looks like SameDiff builds a static graph that you cannot change dynamically, while PyTorch can. So… can you help me out? Maybe point me toward a proper / suitable way to do this?
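On the static-graph point: the other loop a CRF needs, the forward algorithm for the log-partition function, applies the same tensor operation at every timestep, so for a fixed (padded) maximum length it can in principle be unrolled into a static graph. A NumPy sketch of that recursion, checked against brute-force path enumeration (names are mine, not from any library):

```python
import numpy as np

def log_partition(emissions, transitions):
    """Forward algorithm in log space for a single sequence.
    emissions: [seq_len, nb_labels], transitions: [nb_labels, nb_labels].
    Returns log of the sum over all tag paths of exp(path score)."""
    alpha = emissions[0]                    # [nb_labels]
    for t in range(1, emissions.shape[0]):  # identical update each step
        # scores[i, j] = alpha[i] + transitions[i, j] + emissions[t, j]
        scores = alpha[:, None] + transitions + emissions[t][None, :]
        m = scores.max(axis=0)
        alpha = m + np.log(np.exp(scores - m).sum(axis=0))  # stable log-sum-exp
    m = alpha.max()
    return m + np.log(np.exp(alpha - m).sum())

# Sanity check: enumerate all 2^3 tag paths for a tiny 3-step, 2-label case
emissions = np.random.randn(3, 2)
transitions = np.random.randn(2, 2)
brute = []
for a in range(2):
    for b in range(2):
        for c in range(2):
            brute.append(emissions[0, a] + transitions[a, b] + emissions[1, b]
                         + transitions[b, c] + emissions[2, c])
print(np.isclose(log_partition(emissions, transitions),
                 np.log(np.exp(brute).sum())))  # True
```

Because each timestep is the same fixed op, one workable direction is to unroll both CRF recursions up to a maximum sequence length and mask padded steps; the graph stays static but still handles variable-length inputs.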