Why Is the Derivative of the Sigmoid Function Between 0 and 0.25?
The sigmoid function is defined as:
σ(x) = 1 / (1 + e^(−x))
To find its derivative, we use the chain rule. The derivative of σ(x) with respect to x is:
σ′(x) = σ(x)⋅(1 − σ(x))
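Here’s the step-by-step process:
1. First, express σ(x) in a more convenient form for differentiation: σ(x) = (1 + e^(−x))^(−1)
2. Define u = 1 + e^(−x), so that σ(x) = 1/u.
3. Differentiate 1/u with respect to x: d/dx (1/u) = −(1/u²)⋅(du/dx)
4. Now differentiate u with respect to x: du/dx = −e^(−x)
5. Combine these results: σ′(x) = −(1 / (1 + e^(−x))²)⋅(−e^(−x)) = e^(−x) / (1 + e^(−x))²
6. Simplify by noting that 1 − σ(x) = e^(−x) / (1 + e^(−x)), so σ′(x) = [1 / (1 + e^(−x))]⋅[e^(−x) / (1 + e^(−x))] = σ(x)⋅(1 − σ(x))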
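The closed form from the derivation can be sanity-checked numerically. The sketch below compares σ(x)⋅(1 − σ(x)) against a central finite-difference estimate of the slope. The helper names (`sigmoid`, `sigmoid_derivative`, `central_difference`), the step size `h`, and the sample points are illustrative choices, not part of the original derivation.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x: float) -> float:
    """Closed-form derivative: sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


def central_difference(f, x: float, h: float = 1e-5) -> float:
    """Numerical slope estimate: (f(x + h) - f(x - h)) / (2h)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


# Compare the analytic and numerical derivatives at a few sample points.
for x in (-4.0, -1.0, 0.0, 1.0, 4.0):
    analytic = sigmoid_derivative(x)
    numeric = central_difference(sigmoid, x)
    print(f"x={x:+.1f}  analytic={analytic:.6f}  numeric={numeric:.6f}")
```

The two columns should agree to several decimal places, which is what one would expect if the simplification in the final step is correct.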
To see why the derivative σ′(x) is always between 0 and 0.25, consider the properties of the sigmoid function:
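- The sigmoid function σ(x) outputs values strictly between 0 and 1.
- The product σ(x)(1−σ(x)) is largest when σ(x)=0.5, which occurs at x=0. Writing s=σ(x), this follows from s(1−s)=0.25−(s−0.5)², which can never exceed 0.25.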
Substituting σ(x)=0.5 into the derivative formula:
σ′(0)=0.5⋅(1−0.5)=0.5⋅0.5=0.25
The derivative σ′(x) achieves its maximum value of 0.25 when σ(x)=0.5.
For other values of σ(x):
- For any σ(x) other than 0.5, σ(x)(1−σ(x)) is strictly smaller than 0.25. As σ(x) gets close to 0 or 1, the product shrinks toward 0 because one of the factors (either σ(x) or 1−σ(x)) is close to 0.
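Because σ(x) never actually reaches 0 or 1, both factors stay positive and σ′(x) is never exactly 0. Therefore, σ′(x) is always in the range (0,0.25]: strictly greater than 0 and at most 0.25.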
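As a quick empirical check of this range, the following sketch evaluates the derivative on a grid of x values and records the largest and smallest results. The grid bounds (−10 to 10) and the step of 0.01 are arbitrary assumptions made for illustration. The grid is kept modest on purpose: in floating-point arithmetic, σ(x) rounds to exactly 1.0 for large positive x, so the computed derivative would appear as 0.0 even though the true value is positive.

```python
import math


def sigmoid(x: float) -> float:
    """Logistic sigmoid: 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_derivative(x: float) -> float:
    """Closed-form derivative: sigma(x) * (1 - sigma(x))."""
    s = sigmoid(x)
    return s * (1.0 - s)


# Grid from -10 to 10 in steps of 0.01 (illustrative choice).
xs = [i / 100.0 for i in range(-1000, 1001)]
values = [sigmoid_derivative(x) for x in xs]

largest = max(values)
smallest = min(values)
argmax = xs[values.index(largest)]  # x at which the maximum is attained

print(f"largest derivative:  {largest} at x = {argmax}")
print(f"smallest derivative: {smallest}")
print("all values in (0, 0.25]:", all(0.0 < v <= 0.25 for v in values))
```

On this grid the maximum lands at x = 0 with value 0.25, and every other value is positive and smaller, consistent with the range (0, 0.25].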