computeGradient Compute the gradient of the loss function with respect to the pre-activation output (preOutput): dL/dPreOut
Label/expected output
Output of the model (neural network), before the activation function is applied
Activation function that should be applied to preOutput
Mask array; may be null
Gradient dL/dPreOut
fixed, no need to change
computeGradientAndScore Compute both the score (loss function value) and the gradient. This is equivalent to calling computeScore and computeGradient individually.
Label/expected output
Output of the model (neural network)
Activation function that should be applied to preOutput
Mask array; may be null
Whether the score should be averaged (divided by number of rows in labels/output) or not
The score (loss function value) and gradient
fixed, no need to change
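As a rough illustration of what computeGradientAndScore combines, here is a plain-Python sketch (not the actual DL4J/ND4J implementation) that returns the score and the gradient in one pass. It assumes an identity activation (so yHat = preOut) and the loss L = (y - yHat)^2 + |y - yHat| described elsewhere in these notes; the function name and signature are illustrative only.

```python
def compute_gradient_and_score(labels, preout, average=True):
    # Identity activation assumed for this sketch: yHat = preOut.
    # Loss per element: L = (y - yHat)^2 + |y - yHat|
    score = 0.0
    grad = []
    for y, p in zip(labels, preout):
        d = y - p
        score += d * d + abs(d)
        sign = 1.0 if d >= 0 else -1.0
        # dL/dyHat = -2*(y - yHat) - sign(y - yHat); with identity activation,
        # dL/dPreOut equals dL/dyHat.
        grad.append(-2.0 * d - sign)
    if average:
        score /= len(labels)
    return score, grad
```

The real interface would operate on INDArrays and apply the configured activation function rather than assuming the identity.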
computeScore Compute the score (loss function value) for the given inputs.
Label/expected output
Output of the model (neural network)
Activation function that should be applied to preOutput
Mask array; may be null
Whether the score should be averaged (divided by number of rows in labels/preOutput) or not
Loss function value
fixed, no need to change
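For intuition, a minimal plain-Python sketch of computeScore, assuming the identity activation and the loss L = (y - yHat)^2 + |y - yHat| used elsewhere in these notes (the function name is illustrative; the real method works on INDArrays and honors the mask):

```python
def compute_score(labels, yhat, average=True):
    # Per-example loss: L = (y - yHat)^2 + |y - yHat|
    per_example = [(y - p) ** 2 + abs(y - p) for y, p in zip(labels, yhat)]
    total = sum(per_example)
    # Averaging divides by the number of rows in labels/output, per the doc above.
    return total / len(per_example) if average else total
```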
computeScoreArray Compute the score (loss function value) for each example individually. For input of shape [numExamples, nOut], returns the scores as a column vector: [numExamples, 1]
Labels/expected output
Output of the model (neural network)
Activation function that should be applied to preOutput
Loss function value for each example; column vector
fixed, no need to change
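A sketch of the per-example reduction computeScoreArray describes: each row of a [numExamples, nOut] input is summed into one score. Plain Python with nested lists standing in for INDArrays, identity activation assumed, loss L = (y - yHat)^2 + |y - yHat| as elsewhere in these notes:

```python
def compute_score_array(labels, yhat):
    # labels/yhat: [numExamples][nOut]; returns one loss value per example
    # (the column vector [numExamples, 1] mentioned above).
    return [sum((y - p) ** 2 + abs(y - p) for y, p in zip(lrow, prow))
            for lrow, prow in zip(labels, yhat)]
```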
computeGradient Compute the gradient with respect to preOut (the input to the final activation function of the network), using the chain rule.
In this case, L = (y - yHat)^2 + |y - yHat|
dL/dyHat = -2*(y - yHat) - sign(y - yHat), where sign(y - yHat) = +1 if y - yHat >= 0, else -1
dyHat/dPreOut = d(Activation(preOut))/dPreOut = Activation'(preOut)
dL/dPreOut = dL/dyHat * dyHat/dPreOut
Label/expected output
Output of the model (neural network), before the activation function is applied
Activation function that should be applied to preOutput
Mask array; may be null
Gradient dL/dPreOut
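The chain-rule derivation above can be sketched in plain Python. This assumes the identity activation by default (so yHat = preOut and Activation'(preOut) = 1); the `activation_prime` parameter is a hypothetical stand-in for the derivative of whatever activation is configured:

```python
def compute_gradient(labels, preout, activation_prime=lambda x: 1.0):
    grads = []
    for y, p in zip(labels, preout):
        # Identity activation assumed: yHat = preOut
        sign = 1.0 if (y - p) >= 0 else -1.0
        # dL/dyHat = -2*(y - yHat) - sign(y - yHat)
        dldyhat = -2.0 * (y - p) - sign
        # dL/dPreOut = dL/dyHat * Activation'(preOut)
        grads.append(dldyhat * activation_prime(p))
    return grads
```

In DL4J itself this element-wise product would be expressed with ND4J array operations rather than a Python loop.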
scoreArray Calculates the loss for a single data point, i.e., a batch size of one.
Labels/expected output
Output of the model (neural network)
Activation function that should be applied to preOutput
Mask associated with the labels
An array of the same shape and size as the output of the neural net, containing the loss contribution of each element.
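A sketch of scoreArray for a single example, again in plain Python with the identity activation and the loss L = (y - yHat)^2 + |y - yHat| assumed (the real method would apply the activation to preOutput and the mask to the result):

```python
def score_array(labels, yhat):
    # Element-wise loss; the result has the same shape as the network output.
    return [(y - p) ** 2 + abs(y - p) for y, p in zip(labels, yhat)]
```

Summing (or averaging) this array recovers the single-example score, which is how the per-element and per-example views relate.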