Issues with modifying the source code

Hi all, for my research I need to split the forward and backward phases of training into two separate functions; in DL4J both are done in a single function (i.e. fit()). Can anyone give me some advice on how to achieve this? I sincerely appreciate your help!

@fubuki you can either try using SameDiff (the lower-level API with more control) or use DL4J's external errors to do your own backpropagation.
In that case, you can call the gradient functions on your own.

Otherwise, for the DL4J API, a test example can be found here:
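
The gist of that example, as a rough sketch: `model`, `features`, and `externalError` below are stand-ins for your own initialized network, input batch, and externally computed dL/dOutput array.

```java
import java.util.List;

import org.deeplearning4j.nn.gradient.Gradient;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.nn.workspace.LayerWorkspaceMgr;
import org.nd4j.common.primitives.Pair;
import org.nd4j.linalg.api.ndarray.INDArray;

public class ExternalErrorsSketch {

    // One training step driven by an externally computed error (dL/dOutput)
    static void step(MultiLayerNetwork model, INDArray features, INDArray externalError) {
        // 1. Forward pass only, in training mode (activations are kept for backprop)
        List<INDArray> activations = model.feedForward(features, true);
        INDArray output = activations.get(activations.size() - 1);

        // 2. Backward pass, seeded with the external error instead of a built-in loss
        Pair<Gradient, INDArray> p = model.backpropGradient(externalError, LayerWorkspaceMgr.noWorkspaces());
        Gradient gradient = p.getFirst();        // per-parameter gradients
        INDArray epsilonAtInput = p.getSecond(); // dL/dInput, useful for chaining networks

        // 3. Let the updater apply learning rate / momentum, then update parameters in place
        int iteration = 0, epoch = 0;
        int minibatchSize = (int) features.size(0);
        model.getUpdater().update(model, gradient, iteration, epoch, minibatchSize,
                LayerWorkspaceMgr.noWorkspaces());
        model.params().subi(gradient.gradient());
    }
}
```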

Thank you for your advice! For the first solution, do you mean that I can achieve my idea using only the SameDiff API, without using DL4J?

@fubuki generally SameDiff is meant for lower-level control of graphs, similar to TensorFlow/PyTorch. You can look at:

Take a look at our quickstart as well:

It's up to you which route to go. Long term, SameDiff is superseding DL4J as the main API, though. DL4J's ComputationGraph and MultiLayerNetwork do have this functionality built in via external errors, as mentioned above. SameDiff is capable of external errors as well.
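
To give a flavor of that lower-level style, a toy SameDiff graph might look like the following; the names, shapes, and loss here are made up purely for illustration.

```java
import org.nd4j.autodiff.samediff.SDVariable;
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.api.buffer.DataType;
import org.nd4j.linalg.factory.Nd4j;

public class TinyGraphSketch {

    static SameDiff build() {
        SameDiff sd = SameDiff.create();

        // Placeholders are fed concrete arrays at execution time
        SDVariable in = sd.placeHolder("input", DataType.FLOAT, -1, 4);
        SDVariable label = sd.placeHolder("label", DataType.FLOAT, -1, 3);

        // Trainable variables with concrete initial values
        SDVariable w = sd.var("w", Nd4j.randn(DataType.FLOAT, 4, 3));
        SDVariable b = sd.var("b", Nd4j.zeros(DataType.FLOAT, 1, 3));

        // Forward graph plus a simple mean-squared-error style loss
        SDVariable out = sd.nn().softmax("out", in.mmul(w).add(b));
        SDVariable diff = out.sub(label);
        sd.mean("loss", diff.mul(diff));
        sd.setLossVariables("loss");
        return sd;
    }
}
```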

You may find more here:

I got it. Thank you for your guidance!

Hi, after reading the GitHub link you shared, I am still confused about the backward process in SameDiff. Generally, the backward process uses the forward output to perform backpropagation and updates the model's weights using the gradients. I am not sure which function / step updates the weights of the model.

@fubuki you would use external errors in combination with the fit function in that case. You can also use calculateGradients. After specifying external errors as in the examples listed, you can configure a training configuration similar to the other training examples.
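
A minimal sketch of the training-configuration part, assuming a graph with placeholders named "input" and "label" (those names, the updater choice, and the iterator are illustrative):

```java
import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.autodiff.samediff.TrainingConfig;
import org.nd4j.linalg.dataset.api.iterator.DataSetIterator;
import org.nd4j.linalg.learning.config.Adam;

public class TrainingSketch {

    static void train(SameDiff sd, DataSetIterator trainData) {
        TrainingConfig config = new TrainingConfig.Builder()
                .updater(new Adam(1e-3))          // the updater applies lr/momentum for you
                .dataSetFeatureMapping("input")   // placeholder that receives the features
                .dataSetLabelMapping("label")     // placeholder that receives the labels
                .build();
        sd.setTrainingConfig(config);

        int numEpochs = 1;
        sd.fit(trainData, numEpochs);             // forward + backward + update in one call
    }
}
```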

I will try it out. Thank you for your help!

Hi, if I use calculateGradients() to get the gradients, should I update each variable by writing code like "var.get('w').subi(grad.mul(lr))"? Or is there a function I can use to update the variables?

@fubuki Yes, you would usually update them like that.
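
To make that concrete, here is a hedged sketch of a manual SGD step built on calculateGradients; the placeholder names and the variables "w"/"b" are assumptions carried over from your snippet.

```java
import java.util.HashMap;
import java.util.Map;

import org.nd4j.autodiff.samediff.SameDiff;
import org.nd4j.linalg.api.ndarray.INDArray;

public class ManualSgdSketch {

    static void sgdStep(SameDiff sd, INDArray features, INDArray labels, double lr) {
        Map<String, INDArray> placeholders = new HashMap<>();
        placeholders.put("input", features);   // assumed placeholder names
        placeholders.put("label", labels);

        // Gradients of the loss w.r.t. the named variables
        Map<String, INDArray> grads = sd.calculateGradients(placeholders, "w", "b");

        // Plain SGD, applied in place: param -= lr * grad
        for (Map.Entry<String, INDArray> e : grads.entrySet()) {
            INDArray param = sd.getVariable(e.getKey()).getArr();
            param.subi(e.getValue().mul(lr));
        }
    }
}
```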

Got it. Thanks a lot!

Hi! When I use calculateGradients() and ExternalErrorFunction to execute the backward pass, it fails and reports "No array was provided for required placeholder variable 'input'". But I have already set a value for the "input" variable. How should I handle this? Thank you!

@fubuki that is saying that no gradient for input has been calculated, not that you didn't set a value for input. It sounds like there is an issue here somewhere. Could you file an issue with a reproducer for me to look at? Generally, placeholders shouldn't need gradients, though. I will need to look at this in more detail.

It's hard to tell whether there is a bug here or not. Thanks!

What about providing you with the source code on GitHub (test function included)? I will add some comments to the code explaining what my program is trying to do. Thank you!

@fubuki sure, happy to review.

@agibsonccc Here is the source code link: GitHub - fubukishiro/FTPipeHD_MC
The error occurs in the MNISTCNNTest/subModelTrainTest() function, on line 118 of MNISTCNNTest.java. I made a TODO comment on that line.
Forgive me for reusing some code from the SameDiff source code.
Thank you sincerely for your help!

@agibsonccc What my program wants to do is split a model into three sub-models. It first runs the forward pass through the three sub-models in consecutive order, then runs the backward pass through them in reverse order. For instance, the output of sub-model1 is the input of sub-model2, and the gradient of the input of sub-model2 is the external error of sub-model1. I have sketched the data flow below.
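
In code-sketch form, the flow looks roughly like the following. I am writing it with DL4J MultiLayerNetwork calls only as stand-ins (my actual implementation uses SameDiff); sub1/sub2/sub3 and lossGradWrtOut3 are hypothetical names.

```java
import java.util.List;

import org.deeplearning4j.nn.gradient.Gradient;
import org.deeplearning4j.nn.multilayer.MultiLayerNetwork;
import org.deeplearning4j.nn.workspace.LayerWorkspaceMgr;
import org.nd4j.common.primitives.Pair;
import org.nd4j.linalg.api.ndarray.INDArray;

public class PipelineSketch {

    static void step(MultiLayerNetwork sub1, MultiLayerNetwork sub2, MultiLayerNetwork sub3,
                     INDArray features, INDArray lossGradWrtOut3) {
        // Forward in consecutive order: each sub-model's last activation feeds the next
        List<INDArray> a1 = sub1.feedForward(features, true);
        List<INDArray> a2 = sub2.feedForward(a1.get(a1.size() - 1), true);
        List<INDArray> a3 = sub3.feedForward(a2.get(a2.size() - 1), true);

        // Backward in reverse order: the epsilon at each sub-model's input
        // (p.getSecond()) becomes the external error of the previous sub-model
        Pair<Gradient, INDArray> p3 = sub3.backpropGradient(lossGradWrtOut3, LayerWorkspaceMgr.noWorkspaces());
        Pair<Gradient, INDArray> p2 = sub2.backpropGradient(p3.getSecond(), LayerWorkspaceMgr.noWorkspaces());
        Pair<Gradient, INDArray> p1 = sub1.backpropGradient(p2.getSecond(), LayerWorkspaceMgr.noWorkspaces());
        // ...then apply each sub-model's updater and parameter update as in the external-errors example
    }
}
```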

@agibsonccc Hi, have you found any issues in my implementation? Thanks!

@fubuki sorry, I haven't had the bandwidth to do that. Assume that any time you ask me to look at source code, it's going to involve me cloning a repository, potentially correcting some code, and testing against multiple versions (potentially M1.1 and snapshots) in case I find issues. It's not just a matter of scanning the repo and magically finding the problem.

These things are usually at least a 1-2 hour affair that requires dedicated time. I'll try to get back to you early next week, after I've cleared out my backlog. If we're lucky, earlier.

@agibsonccc I totally understand that you are busy right now. I will wait for you to finish your backlog. I have always appreciated your help with my issue!