Custom loss function #934
-
Is it possible to define a custom loss function for a neural network in Brain.js like in Tensorflow?
If not, where might I begin to implement it? I took a look in the codebase and searched for terms like "loss", "cost", etc. without much luck. I found the error calculation functions, but I'm still lost tbh. Any help would be greatly appreciated 🥰
-
For instance, a custom loss function would greatly improve training times when using reinforcement learning.
-
OMG 🤣 Soooo... A lot more digging, and I found out that this was already solved 🥰
-
While `@brain/rl` seems great for smaller decision machines, I'm a teensy bit worried about the overhead of all that boilerplate for use cases where all that `rl` is used for is defining the loss function to use when training.
Is there perhaps a way to set a custom loss/cost/error function for training? If not, I'd be willing to implement the feature in brain.js if you decide you're comfortable with that 😊
For the past few years, I've been using Brain.js way more than tf.js; honestly, this would make it possible to switch over to using Brain.js exclusively.
In the meantime, I'll run some tests to determine how fit `@brain/rl` is for the work I do and report back my findings in case anyone else here would find the information useful 😊
-
Yea, let's support it. Help is greatly appreciated.
-
> Yea, let's support it. Help is greatly appreciated.

This line in `neural-network-gpu.ts` appears to be a good candidate 😊

```ts
this.backwardPropagate[this.outputLayer] = this.gpu.createKernelMap(
```

For CPU neural nets, I'm thinking about this line here in `neural-network.ts`:

```ts
this.calculateDeltas(value.output);
```
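To make that hook point concrete, here's a minimal, heavily simplified sketch of what the output-layer delta step could look like with an optional user-supplied loss. It's standalone for illustration only; the names `LossFunction` and `calculateOutputDeltas` are hypothetical, not brain.js internals:

```ts
// Hypothetical sketch: output-layer deltas with an optional custom loss.
// The default behaviour mirrors the usual (expected - actual) error under a sigmoid activation.
type LossFunction = (actual: number, expected: number) => number;

function calculateOutputDeltas(
  outputs: Float32Array, // activations of the output layer
  target: Float32Array,  // expected output values
  errors: Float32Array,  // error buffer for the output layer (written in place)
  deltas: Float32Array,  // delta buffer for the output layer (written in place)
  loss?: LossFunction    // optional custom loss; falls back to (expected - actual)
): void {
  for (let node = 0; node < outputs.length; node++) {
    const output = outputs[node];
    const error = loss ? loss(output, target[node]) : target[node] - output;
    errors[node] = error;
    deltas[node] = error * output * (1 - output); // sigmoid derivative
  }
  // Hidden-layer deltas would continue exactly as they do today.
}
```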
As for the API, I'm thinking about just adding another function, `loss`, to `INeuralNetworkOptions`, like `callback` and `log`. This way, if someone defines `loss`, that function will be used to determine the overall fitness of a given output, which can then be used with traditional backpropagation or even other training methods. (For example, I just finished an `ELM` implementation in my fork, but I haven't created a pull request yet; I'm finishing the `HELM` (Hybrid ELM) first, since it should do wonders for the training times.)
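As a very rough sketch of how that could look from the user's side (the `loss` option and its exact signature are the proposal under discussion here, not an existing brain.js API; everything else mirrors the current options):

```ts
import { NeuralNetwork } from 'brain.js';

const net = new NeuralNetwork<number[], number[]>({
  hiddenLayers: [3],
  // Hypothetical option: a per-output-neuron error, used in place of the
  // built-in (expected - actual) during backpropagation.
  loss: (actual: number, expected: number) => expected - actual,
});

net.train(
  [
    { input: [0, 0], output: [0] },
    { input: [0, 1], output: [1] },
    { input: [1, 0], output: [1] },
    { input: [1, 1], output: [0] },
  ],
  { log: true }
);
```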
I have much bigger ideas for the custom loss function than a simple HELM, but yeah. I'll update you once I have something nice for you, Robert 💞
-
Side note: I suggest playing around with that ELM implementation in your spare time. The training pattern is quite interesting!! For the standard XOR training set, it reaches an error of ~17% almost instantly (by iteration 1000), but then struggles to drop below 17%; once that threshold is reached, any further progress is extremely slow. I'm thinking we can leverage this in our HELM implementation by rushing to the ELM's peak optimization and then switching over to standard backpropagation for all further training, taking advantage of the extremely fast rush to the ELM's threshold while totally avoiding its weakness beyond that threshold (the ~17% I mentioned in the case of the standard XOR training set).
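Here's a rough sketch of that hand-off (this is just the idea, not code from the fork; `trainELM` and `trainBackprop` are hypothetical stand-ins for the two phases):

```ts
// Hypothetical two-phase trainer: rush the ELM to its plateau, then let
// backpropagation take over. Names and thresholds are illustrative only.
interface TrainableNet {
  error: number; // current training error, updated by each phase
}

function trainHELM(
  net: TrainableNet,
  trainELM: (net: TrainableNet) => void,      // one ELM training pass
  trainBackprop: (net: TrainableNet) => void, // standard backprop training to completion
  plateau = 0.17,          // e.g. the ~17% wall observed on the XOR set
  maxElmIterations = 1000  // safety cap so phase 1 always terminates
): void {
  // Phase 1: exploit the ELM's extremely fast initial convergence.
  for (let i = 0; i < maxElmIterations && net.error > plateau; i++) {
    trainELM(net);
  }
  // Phase 2: switch to backpropagation once the ELM stalls at its threshold.
  trainBackprop(net);
}
```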
-
btw the work is being done in this branch. Already have a working implementation for `NeuralNetworkGPU`, but I still need to implement it for `NeuralNetwork`/CPU-based nets. As of now, much is left to be desired, but it works 🥰 Currently implementing it for the CPU before adding a few parameters to the loss function to make it far more useful for training optimization. The end-goal API I'm thinking of looks like this (`InputType` is defined by `NeuralNetwork` and `NeuralNetworkGPU`):
```ts
function loss(
  this: IKernelFunctionThis,
  actual: KernelOutput,   // The actual output data of the neural network during this run.
  expected: KernelOutput, // The output data expected to be produced by the neural network given the provided input data.
  inputs: InputType,      // The input data provided to the entire neural network (the data passed into the `NeuralNetwork.run(input)` method).
  state: KernelOutput[][] // The current training state of the neural network; see below for more information.
) {
  return expected[this.thread.x] - actual[this.thread.x];
}
```
Where:

- `this: IKernelFunctionThis` is the GPU.js kernel function's `this` value.
- `actual: KernelOutput` is the current actual output data produced by running the neural network.
- `expected: KernelOutput` is the output data expected to be produced when running the neural network if it were perfectly trained and optimized.
- `inputs: InputType` is the input data that was provided to the neural network this run.
- `state: KernelOutput[][]` is a state matrix. To better understand the state matrix, consider this pseudo-indexer: `property = state[layer][neuron]`. Each value in the first order of the array corresponds to a layer of the neural network. Each value in the second order corresponds to a neuron in that layer. Lastly, each value returned when indexing the second order of the state matrix corresponds to that neuron's state tensor. In this context, the neuron's state tensor can be defined as "a tensor containing information specific to any particular neuron within an artificial neural network." While that definition is quite ad-hoc seeing as I just came up with it, I have to admit that it is some of my finest improv 😂
The aspect of the `state` parameter of the loss function that excites me the most is that it paves the way for implementing far more advanced training algorithms than the library currently supports, and with GPU acceleration at that!! The possibilities are even broader with the CPU algorithms, but performance limits their applications at the moment (nowhere near too limited to be useful, though!!! So many applications 🥰).
An example designed to optimize training for XOR might look like:
```ts
function loss(
  this: IKernelFunctionThis,
  actual: KernelOutput,   // The actual output data of the neural network during this run.
  expected: KernelOutput, // The output data expected to be produced by the neural network given the provided input data.
  inputs: InputType,      // The input data provided to the entire neural network (the data passed into the `NeuralNetwork.run(input)` method).
  state: KernelOutput[][] // The current training state of the neural network; see above for more information.
) {
  // Perform a standard loss function as our foundation to build upon.
  let neuronLoss = expected[this.thread.x] - actual[this.thread.x];
  // If ( o == i0 ^ i1 ), return 10% of the loss value.
  // Otherwise, return the full loss value.
  if (Math.round(actual[0]) === (Math.round(inputs[0]) ^ Math.round(inputs[1]))) {
    neuronLoss *= 0.1;
  }
  return neuronLoss;
}
```
which allows you to train a neural network to predict XOR calculations without ever having been fed a data set. Just stream random data as training input, allow it to produce random output at first, and train on the error from the calculation. Instead of telling the network, "Hey! There's a bounty reward for whoever can solve for Y given X," you can define a rule to determine what is or is not good behavior, or even create a hybrid loss function. The above kernel function is a simple example of a hybrid loss function that I kind of just tossed together to train a standard feed-forward network to solve for XOR by using the standard XOR training set. In my experience, this greatly improves training times when done correctly.
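For what it's worth, a hedged usage sketch of that idea (again, the `loss` option is the proposal from this thread rather than a published API, and the counts/thresholds are arbitrary):

```ts
import { NeuralNetworkGPU } from 'brain.js';

// `loss` here refers to the hybrid XOR kernel from the snippet above (hypothetical API).
const net = new NeuralNetworkGPU<number[], number[]>({ hiddenLayers: [3], loss });

// Stream randomly generated samples rather than a hand-curated data set; the hybrid
// loss rewards correct XOR behaviour directly, so the labels carry much less weight.
const randomXorSamples = Array.from({ length: 1000 }, () => {
  const a = Math.round(Math.random());
  const b = Math.round(Math.random());
  return { input: [a, b], output: [a ^ b] };
});

net.train(randomXorSamples, { iterations: 3000, errorThresh: 0.005 });
```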
Sending tons of positivity and good vibes your way friend 💕
-
I just realized that a `KernelOutput[][]` likely isn't possible given that GPU.js can currently only handle arrays up to rank 3. Perhaps a single network-wide `KernelOutput` then. It's already up to the loss function implementer to consider the neural network's topology/shape, and simply using `KernelOutput` allows the loss function implementer to have greater control over the training algorithm 😊
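Purely for illustration (none of these names exist in brain.js, and the offset bookkeeping would be the loss implementer's responsibility), indexing a single flat state buffer might look something like:

```ts
import { IKernelFunctionThis } from 'gpu.js';

// Hypothetical loss kernel reading one network-wide state buffer instead of
// a per-layer/per-neuron KernelOutput[][] matrix.
function loss(
  this: IKernelFunctionThis,
  actual: number[],
  expected: number[],
  inputs: number[],
  state: number[] // flat state buffer laid out according to the network's topology
): number {
  const outputLayerOffset = 0; // illustrative only; derived from the topology in practice
  const neuronState = state[outputLayerOffset + this.thread.x];
  // e.g. scale the standard per-neuron error by this neuron's state value.
  return (expected[this.thread.x] - actual[this.thread.x]) * neuronState;
}
```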
-
Here are some links in case you're interested 😊
- The fork
- The feature branch
- The normal training session (125000-175000 iterations)
- The loss-assisted training session (2500-3500 iterations)
-
Darn... I ran into a major roadblock. For some odd reason, it's refusing to let me add any parameters past "target" in the subkernel, and I can't possibly imagine why. It gives this error:
```
/home/celeste/Documents/GitHub/brain.js/node_modules/gpu.js/src/backend/function-node.js:972
return new Error(`${error} on line ${ splitLines.length }, position ${ lineBefore.length }:\n ${ debugString }`);
^

Error: Unknown argument inputs type on line 14, position 1:
actual, // The actual output data of the neural network during this run.
expected, // The output data expected to be produced by the neural network given the provided input data.
inputs, // The input data provided to entire neural network (the data passed into the `NeuralNetwork.run(input)` method).
state // The current training state of the neural network; see above for more information.
) {
// Perform a standard loss function as our foundation to build upon.
let error = expected - actual;
// if ( o == i0 ^ i1 ) then return 10% of the loss value.
// Otherwise, return the full loss value.
error *= inputs[this.thread.x];
return error;
}
}
at WebGLFunctionNode.astErrorOutput (/home/celeste/Documents/GitHub/brain.js/node_modules/gpu.js/src/backend/function-node.js:972:12)
at WebGLFunctionNode.astFunction (/home/celeste/Documents/GitHub/brain.js/node_modules/gpu.js/src/backend/web-gl/function-node.js:92:22)
at WebGLFunctionNode.astFunctionExpression (/home/celeste/Documents/GitHub/brain.js/node_modules/gpu.js/src/backend/function-node.js:1019:17)
at WebGLFunctionNode.astGeneric (/home/celeste/Documents/GitHub/brain.js/node_modules/gpu.js/src/backend/function-node.js:898:23)
at WebGLFunctionNode.toString (/home/celeste/Documents/GitHub/brain.js/node_modules/gpu.js/src/backend/function-node.js:324:32)
at FunctionBuilder.traceFunctionCalls (/home/celeste/Documents/GitHub/brain.js/node_modules/gpu.js/src/backend/function-builder.js:294:22)
at FunctionBuilder.traceFunctionCalls (/home/celeste/Documents/GitHub/brain.js/node_modules/gpu.js/src/backend/function-builder.js:296:16)
at FunctionBuilder.getPrototypes (/home/celeste/Documents/GitHub/brain.js/node_modules/gpu.js/src/backend/function-builder.js:331:55)
at FunctionBuilder.getPrototypeString (/home/celeste/Documents/GitHub/brain.js/node_modules/gpu.js/src/backend/function-builder.js:318:17)
at HeadlessGLKernel.translateSource (/home/celeste/Documents/GitHub/brain.js/node_modules/gpu.js/src/backend/web-gl/kernel.js:596:45)

Node.js v20.13.1
```
Any clues as to why GPU.js won't let me add more than 2 parameters to the subkernel?
-
Simplifying the subkernel didn't resolve the error either, so it's likely not parsing related...
```
Error: Unknown argument inputs type on line 7, position 1:
actual,
expected,
inputs,
state
) {
return expected - actual;
}
}
```
-
IT WORKS!!! AHHHHH OMG!!! Okay, okay, so it's working on both CPU and GPU now!!! The catch is that I need to inform `toJSON()` about the `lossState` property. Other than that, I think it should be ready 😊
-
Okay, this is honestly the coolest feature I've ever added to anything. This has me so excited!!! Like, to give you an idea of why I'm so excited: a custom loss function made training an autoencoder much faster, and within a single `NeuralNetwork` or `NeuralNetworkGPU`, meaning that the current autoencoder class, `AE`, could be split into `Autoencoder` and `AutoencoderGPU`, where each directly inherits from `NeuralNetwork` and `NeuralNetworkGPU` respectively, making them play much more nicely with the existing API 🥰 Okie, PR time 😊 💞
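A hedged sketch of what that split could look like (class and method names are illustrative only; the real classes would also wire up the reconstruction loss discussed above):

```ts
import { NeuralNetwork, NeuralNetworkGPU } from 'brain.js';

// Hypothetical: an autoencoder is just a network trained to reproduce its own input,
// so each variant can inherit directly from the matching base class.
class Autoencoder extends NeuralNetwork<number[], number[]> {
  trainAutoencoder(samples: number[][]) {
    return this.train(samples.map((sample) => ({ input: sample, output: sample })));
  }
}

class AutoencoderGPU extends NeuralNetworkGPU<number[], number[]> {
  trainAutoencoder(samples: number[][]) {
    return this.train(samples.map((sample) => ({ input: sample, output: sample })));
  }
}
```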