
2 Module rml-neural/activation

This module defines a set of activation functions: methods that may be used to determine the sensitivity of neurons in a network layer. To support both forward and backward propagation, each method comprises the activation function and its derivative. This Wikipedia page has a good overview of a number of activation functions.

2.1 Activation Function Structure

Contracts that encapsulate the pattern of a value that is either of a given data type or false (#f).

Contracts used to define the procedures used in the structures below. Both activation and derivative functions are represented as procedures that take a single real? or flonum? argument and return a single real? or flonum? value. They are equivalent to the following contract values.
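A minimal sketch of those contract values; flonum-activation/c is named in the note below, while activation/c as the name of the real?-valued form is an assumption:

  ;; activation/c is an assumed name for the real?-valued contract.
  (define activation/c (-> real? real?))
  (define flonum-activation/c (-> flonum? flonum?))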

In general it is preferable to use the flonum-activator? structure and the corresponding flonum-activation/c form, as this reduces numeric conversions and allows optimizations such as futures to work efficiently. See also Parallelism with Futures in The Racket Guide.
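As an illustration of why the flonum forms matter, this hedged sketch applies an activation function to every element of an flvector inside a future; flonum arithmetic is generally future-safe, so the work can proceed in parallel. The activation fl-sigmoid and the vector in-vec are hypothetical stand-ins:

  (require racket/flonum racket/future)

  ;; Apply f to every element of the flvector vec, in place.
  (define (activate-all! f vec)
    (for ([i (in-range (flvector-length vec))])
      (flvector-set! vec i (f (flvector-ref vec i)))))

  ;; fl-sigmoid and in-vec are hypothetical stand-ins.
  (define worker (future (Ξ» () (activate-all! fl-sigmoid in-vec))))
  (touch worker)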

struct

(struct activator (name f df α))
  name : symbol?
  f : (-> real? real?)
  df : (-> real? real?)
  α : (or/c real? #f)

This structure provides the activator function, its derivative, and an optional expectation value for a given method; a usage sketch follows the list below.

  • f is the activation function, \phi(v_i)

  • df is the activation derivative function, \phi^\prime(v_i)

  • α is an optional stochastic variable sampled from a uniform distribution at training time and fixed to the expectation value of the distribution at test time
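Racket derives accessors such as activator-f and activator-df from the struct declaration above; here is a minimal sketch of using them to drive forward and backward propagation (the surrounding training loop is assumed):

  ;; Forward pass: apply the activation function to the induced local field v.
  (define (forward-step act v)
    ((activator-f act) v))

  ;; Backward pass: apply the derivative when accumulating gradients.
  (define (backward-step act v)
    ((activator-df act) v))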

struct

(struct flonum-activator activator (name f df α))
  name : symbol?
  f : (-> flonum? flonum?)
  df : (-> flonum? flonum?)
  α : (or/c flonum? #f)

An extension to activator? that guarantees that all values passed to the functions f and df, as well as the value of α, are flonum?s. See also Fixnum and Flonum Optimizations in The Racket Guide. This allows for additional optimization, and all math operations are assumed to be flonum-safe.

Construct an instance of activator? or flonum-activator?, respectively. These constructors make the value for α explicitly optional.
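A hedged sketch of constructing a flonum activator for the logistic sigmoid; the constructor name make-flonum-activator is an assumption, since the constructors' actual signatures are not shown above:

  (require racket/flonum)

  ;; Logistic sigmoid over flonums.
  (define (sigmoid v)
    (fl/ 1.0 (fl+ 1.0 (exp (fl- 0.0 v)))))

  ;; Its derivative, expressed in terms of the sigmoid itself.
  (define (sigmoid-prime v)
    (let ([s (sigmoid v)])
      (fl* s (fl- 1.0 s))))

  ;; make-flonum-activator is an assumed constructor name; Ξ± is omitted,
  ;; relying on the constructor's optional argument.
  (define my-sigmoid
    (make-flonum-activator 'sigmoid sigmoid sigmoid-prime))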

2.2 Activation Functions

Each of the activator? structures below is defined by its activation function (the derivative is not shown). A sample plot shows the shape of the activation function in red and its derivative in turquoise.

\phi(v_i) = v_i

[Sample Plot]

\phi(v_i) = \begin{cases} 0 & \text{for } v_i < 0\\ 1 & \text{for } v_i \geq 0 \end{cases}

[Sample Plot]

\phi(v_i) = \frac{1}{1+e^{-v_i}}

[Sample Plot]
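For example, the logistic sigmoid's derivative, carried in the df slot of its activator, can be written in terms of \phi itself, which is one reason pairing f and df in a single structure is convenient:

\phi^\prime(v_i) = \phi(v_i)\left(1 - \phi(v_i)\right)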

\phi(v_i) = \tanh(v_i)

[Sample Plot]

\phi(v_i) = \tan^{-1}(v_i)

[Sample Plot]

\phi(v_i) = \frac{v_i}{1+\left|v_i\right|}

[Sample Plot]

\phi(v_i) = \frac{v_i}{\sqrt{1+\alpha v_{i}^2}}

[Sample Plot (Ξ± = 0.5)]
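As a worked example with \alpha = 0.5, as in the plot above: \phi(2) = 2/\sqrt{1 + 0.5 \cdot 2^2} = 2/\sqrt{3} \approx 1.155. As v_i grows, the output approaches the bound 1/\sqrt{\alpha} (\approx 1.414 here), so the curve flattens rather than growing like the identity.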

\phi(v_i) = \begin{cases} \frac{v_i}{\sqrt{1+\alpha v_{i}^2}} & \text{for } v_i < 0\\ v_i & \text{for } v_i \geq 0 \end{cases}

[Sample Plot (Ξ± = 0.5)]

\phi(v_i) = \begin{cases} 0 & \text{for } v_i < 0\\ v_i & \text{for } v_i \geq 0 \end{cases}

[Sample Plot]

\phi(v_i) = \begin{cases} \delta v_i & \text{for } v_i < 0\\ v_i & \text{for } v_i \geq 0 \end{cases}

[Sample Plot]

Note that the fixed form of this activator uses a delta value \delta=0.01.
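A minimal flonum sketch of that fixed form, with Ξ΄ = 0.01 written directly into the negative branch:

  (require racket/flonum)

  ;; Fixed leaky ReLU: scale negative inputs by Ξ΄ = 0.01, pass positives through.
  (define (leaky-relu v)
    (if (fl< v 0.0) (fl* 0.01 v) v))

  ;; Its derivative, for the df slot.
  (define (leaky-relu-prime v)
    (if (fl< v 0.0) 0.01 1.0))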

\phi(v_i) = \ln\left( 1 + e^{v_i} \right)

[Sample Plot]

\phi(v_i) = \frac{\sqrt{v_{i}^2+1}-1}{2}+v_i

[Sample Plot]

\phi(v_i) = \sin(v_i)

[Sample Plot]

\phi(v_i) = \begin{cases} 1 & \text{for } v_i = 0\\ \frac{\sin(v_i)}{v_i} & \text{for } v_i \neq 0 \end{cases}

[Sample Plot]

\phi(v_i) = e^{-v_{i}^2}

[Sample Plot]
