How Differentiation Works in Computers 🧮


11 min, 2199 words

differentiation

How It Started

It all started when I was watching the YouTube video on Finding The Slope Algorithm (Forward Mode Automatic Differentiation) by Computerphile:

In this video, Mark Williams demonstrates Forward Mode Automatic Differentiation and explains how it addresses the limitations of traditional methods like Symbolic and Numerical Differentiation. The video also introduces the concept of Dual Numbers and shows how they can be efficiently used to compute the gradient of a function at any point.

Let's start with the basic concepts of differentiation!

Differentiation

In mathematics, the derivative is a fundamental tool that quantifies the sensitivity to change of a function's output with respect to its input. The derivative of a function of a single variable at a chosen input value, when it exists, is the slope of the tangent line to the graph of the function at that point. The process of finding a derivative is called differentiation.

differentiation_animation

Mathematical Definition

A function of a real variable \( f(x) \) is differentiable at a point \( a \) of its domain, if its domain contains an open interval containing \( a \), and the limit \( L \) exists.

$$ L = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h} $$
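For example, applying this definition to \( f(x) = x^2 \) gives

$$ L = \lim_{h \to 0} \frac{(a + h)^2 - a^2}{h} = \lim_{h \to 0} \frac{2ah + h^2}{h} = \lim_{h \to 0} (2a + h) = 2a $$

so the derivative of \( x^2 \) at any point \( a \) is \( 2a \).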

Why Do We Calculate Derivatives?

Derivatives are a fundamental concept in calculus with many practical use cases across science, engineering, economics, and computer science. They measure how a quantity changes in response to a small change in another, commonly called "rate of change".

Think of a derivative as answering:

"If I nudge the input just a little, how much does the output change?"

This makes derivatives crucial wherever change, sensitivity, or optimization matters. Some important applications include:

  1. CFD (Computational Fluid Dynamics) 🌊: Simulates fluid flow by solving Navier–Stokes equations using partial derivatives of velocity, pressure, and density. These derivatives capture how small changes propagate, enabling realistic real-time simulations of smoke, fire, and airflow.

cfd

  2. Image Processing & Edge Detection 🖼️: Image processing uses derivatives like Sobel filters and Laplacians to detect edges by identifying rapid changes in pixel intensity. This helps highlight boundaries for applications in computer vision and object recognition.

sobel

  3. Signal Smoothing & Filtering 📡: Derivatives detect sudden spikes or noise in audio and sensor data, enabling smoothing and feedback control. This improves performance in applications like music processing, GPS filtering, and motion stabilization.

signal_processing

  4. Machine Learning & AI 🧠: In machine learning, derivatives guide gradient descent by showing how to adjust weights to minimize loss. Backpropagation uses these derivatives to efficiently update neural network parameters during training (a tiny sketch of this idea follows this list).

gradient_descent
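As promised above, here is a minimal, illustrative gradient descent sketch on a made-up one-parameter loss \( f(w) = (w - 3)^2 \); the loss, starting point, and learning rate are arbitrary choices for demonstration:

def loss(w):
    return (w - 3) ** 2

def d_loss(w):
    # derivative of the loss with respect to w
    return 2 * (w - 3)

w = 0.0   # initial parameter guess
lr = 0.1  # learning rate
for _ in range(50):
    w -= lr * d_loss(w)  # step against the gradient, downhill on the loss

print(w)  # approaches 3, the minimizer of the loss

Backpropagation applies exactly this idea, just with millions of parameters and the chain rule doing the bookkeeping.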

Differentiation Methods

When it comes to differentiating with a computer, our first instinct is often the method we learned in high school, using known rules for common functions and applying them step by step. Let’s take a closer look at how that works!

1. Symbolic Differentiation

Symbolic differentiation relies on the fundamental definition of a derivative: taking the limit of the difference quotient. After establishing the derivatives of basic functions, we can systematically apply differentiation rules to compute the derivatives of more complex expressions.

$$ \boxed{L = \lim_{h \to 0} \frac{f(a + h) - f(a)}{h}} $$

Let’s calculate it for a simple function: \( f(x)=x^n \)

$$ \displaylines{\begin{align} f(x) &= x^n \\ f'(x) &= \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} \\ &= \lim_{h \to 0} \frac{(x + h)^n - x^n}{h}\end{align}} $$

Now apply the Binomial Theorem:

$$ (x + h)^n = x^n + n x^{n-1} h + \frac{n(n-1)}{2!} x^{n-2} h^2 + \cdots + h^n $$

Substitute this back into the limit:

$$ \displaylines{\begin{align}f'(x) &= \lim_{h \to 0} \frac{x^n + nx^{n-1}h + \frac{n(n-1)}{2!}x^{n-2}h^2 + \cdots + h^n - x^n}{h} \\ &= \lim_{h \to 0} \frac{nx^{n-1}h + \frac{n(n-1)}{2!}x^{n-2}h^2 + \cdots + h^n}{h} \\ &= \lim_{h \to 0} \left(nx^{n-1} + \frac{n(n-1)}{2!}x^{n-2}h + \cdots + h^{n-1}\right) \\ &= nx^{n-1}\end{align}} $$

Using the power rule from symbolic differentiation, the derivative is:

$$ \frac{d}{dx} x^n = n x^{n-1} $$

This rule is one of the foundational results and forms the basis for differentiating more complex expressions built from powers of \(x\). You’ve probably seen a sheet that lists differentiation rules like this:

derivate_rules

By breaking a complex function into smaller components and systematically applying basic rules such as the power, product, and chain rules, you can compute its derivative step by step.
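To make that concrete, here is a minimal, illustrative sketch of a rule-based differentiator in Python. The tuple encoding of expressions and the diff helper are hypothetical, chosen only to show the idea of recursively applying rules; the result comes back as an unsimplified expression tree:

def diff(expr, var):
    # Differentiate a tiny tuple-encoded expression with respect to var.
    if isinstance(expr, (int, float)):  # constant rule: d/dx c = 0
        return 0
    if expr == var:  # d/dx x = 1
        return 1
    op, a, b = expr
    if op == "+":  # sum rule
        return ("+", diff(a, var), diff(b, var))
    if op == "*":  # product rule: (u*v)' = u'*v + u*v'
        return ("+", ("*", diff(a, var), b), ("*", a, diff(b, var)))
    if op == "^":  # power rule, for a constant exponent b
        return ("*", ("*", b, ("^", a, b - 1)), diff(a, var))
    raise ValueError(f"unknown operator: {op}")

# d/dx (x^2 + 3*x): prints an unsimplified tree equivalent to 2*x + 3
print(diff(("+", ("^", "x", 2), ("*", 3, "x")), "x"))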

A rule-based differentiator like this can be fleshed out in any programming language, but in Python the functionality is already available through the sympy library. Let’s look at an example:

import sympy as sp

x = sp.Symbol("x")

f = x**2 + sp.sin(x)
f_prime = sp.diff(f, x)

print("f(x):", f)
print("f'(x):", f_prime)

# Substitute x = π
value = f_prime.subs(x, sp.pi)
# Evaluate the numeric value
numeric_value = value.evalf()
print(f"f'(π)={value}={numeric_value}")

Output

f(x): x**2 + sin(x)
f'(x): 2*x + cos(x)
f'(π)=-1 + 2*pi=5.28318530717959

While symbolic differentiation provides the exact mathematical expression for a function's rate of change, this elegance often falls short in real-world applications. What if your function isn't a neat equation, but rather a stream of experimental data or a "black-box" algorithm?

In these common scenarios, symbolic methods are simply infeasible. This is precisely where numerical differentiation comes into the picture. By approximating the derivative using discrete function values, it allows us to analyze the behavior of functions derived from empirical observations, complex simulations, or even cutting-edge machine learning models, areas where symbolic methods can't reach.

2. Numerical Differentiation

Numerical differentiation is the process of approximating the derivative of a function using its values at discrete points, rather than deriving an exact symbolic expression. It's used when a function's formula is unknown, too complex, or only available as data.

The simplest method for numerical differentiation is the finite difference approximation.

numerical-differentiation

A simple two-point estimation is to compute the slope of a nearby secant line through the points \( (x, f(x)) \) and \( (x + h, f(x + h)) \). Here \( h \) is a small number representing a small change in \( x \); it can be either positive or negative. The slope of this line is

$$ {\displaystyle {\frac {f(x+h)-f(x)}{h}}.} $$

This expression is Newton's difference quotient (also known as a first-order divided difference).

The slope of this secant line differs from the slope of the tangent line by an amount that is approximately proportional to \( h \). As \( h \) approaches zero, the slope of the secant line approaches the slope of the tangent line. Therefore, the true derivative of \( f \) at \( x \) is the limit of the value of the difference quotient as the secant lines get closer and closer to being a tangent line:

$$ {\displaystyle f'(x)=\lim _{h\to 0}{\frac {f(x+h)-f(x)}{h}}.} $$

We can easily implement numerical differentiation for any function. Let's look at the Python implementation for this:

import math

def f(x):
    return math.sin(x)

def f_dash(f, x, dx):
    # Forward (Newton's) difference quotient
    return (f(x + dx) - f(x)) / dx

x = math.pi / 4
true_derivative = math.cos(x)
print("True Derivative: ", true_derivative)
print("Numerical Derivative: ", f_dash(f, x, dx=0.0001))

Output

True Derivative:  0.7071067811865476
Numerical Derivative:  0.7070714246693033

Another two-point formula is to compute the slope of a nearby secant line through the points \( (x - h, f(x - h)) \) and \( (x + h, f(x + h)) \). The slope of this line is

$$ {\displaystyle {\frac {f(x+h)-f(x-h)}{2h}}.} $$

This formula is known as the symmetric difference quotient. In this case the first-order errors cancel, so the slope of these secant lines differs from the slope of the tangent line by an amount that is approximately proportional to \( h^{2} \). Hence, for small values of \( h \), this is a more accurate approximation to the tangent line than the one-sided estimate.
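A short Taylor expansion shows why: expanding both samples around \( x \),

$$ f(x \pm h) = f(x) \pm h f'(x) + \frac{h^2}{2} f''(x) \pm \frac{h^3}{6} f'''(x) + \cdots $$

and subtracting them cancels the terms with even powers of \( h \), leaving

$$ \frac{f(x+h) - f(x-h)}{2h} = f'(x) + \frac{h^2}{6} f'''(x) + \cdots $$

so the leading error term is proportional to \( h^{2} \).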

Let's implement this in Python:

import math

def f(x):
    return math.sin(x)

def f_dash_symmetric(f, x, dx):
    # Symmetric difference quotient
    return (f(x + dx) - f(x - dx)) / (2 * dx)

x = math.pi / 4
true_derivative = math.cos(x)
print("True Derivative: ", true_derivative)
print("Numerical Derivative: ", f_dash_symetric(f, x, dx=0.0001))

Output

True Derivative:  0.7071067811865476
Numerical Derivative:  0.70710678000796

Numerical differentiation is quick and easy to implement, especially when an explicit formula for the derivative is unavailable or the function is defined only through data points. However, it comes with certain limitations. The results are highly sensitive to the choice of step size: too large, and the approximation is inaccurate; too small, and rounding errors dominate due to floating-point precision limits. The following diagram illustrates this tradeoff and the sweet spot for accurate calculations.

AbsoluteErrorNumericalDifferentiation

Let's try to replicate the results by calculating this error and plotting using Matplotlib:

import math
import matplotlib.pyplot as plt

def f(x):
    return math.sin(x)

def f_dash(f, x, dx):
    # Newton's difference
    return (f(x + dx) - f(x)) / dx

def f_dash_symmetric(f, x, dx):
    # Symmetric difference
    return (f(x + dx) - f(x - dx)) / (2 * dx)

# Point at which to differentiate
x = math.pi / 4
true_derivative = math.cos(x)

# dx values
dx_values = [2 ** (-i) for i in range(1, 50)]

# Compute numerical derivatives and errors
errors_newton = [abs(f_dash(f, x, dx) - true_derivative) for dx in dx_values]
errors_symmetric = [
    abs(f_dash_symmetric(f, x, dx) - true_derivative) for dx in dx_values
]

# Plotting
plt.figure(figsize=(16, 9))
plt.loglog(dx_values, errors_newton, marker="o", label="Newton's Difference")
plt.loglog(dx_values, errors_symmetric, marker="s", label="Symmetric Difference")
plt.xlabel("dx (step size)")
plt.ylabel("Absolute Error")
plt.title("Error Comparison of Numerical Differentiation Methods at x = π/4")
plt.grid(True, which="both", ls="--")
plt.legend()
plt.show()

Output:

Numerical Differentiation Comparison

We observe a pattern similar to the previous diagram, highlighting how sensitive numerical differentiation is to step size. Apart from this, it also tends to amplify any noise in the data, making it unreliable for real-world measurements or simulations with fluctuations. Additionally, it provides only pointwise estimates and does not yield a general formula, limiting its usefulness in analytical studies or symbolic manipulation.
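To see the noise problem concretely, here is a small illustrative sketch (not from the original discussion) in which a simulated measurement error of about \( 10^{-6} \) in the function values gets divided by \( 2\,dx \) and can show up as an error of roughly \( 10^{-2} \) in the estimated derivative:

import math
import random

def noisy_sin(x, eps=1e-6):
    # sin(x) plus a small simulated measurement error
    return math.sin(x) + random.uniform(-eps, eps)

x = math.pi / 4
dx = 1e-4
estimate = (noisy_sin(x + dx) - noisy_sin(x - dx)) / (2 * dx)

print("True Derivative:             ", math.cos(x))
print("Numerical Derivative (noisy):", estimate)  # can be off by roughly 1e-2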

3. Automatic Differentiation

Now let’s look into automatic differentiation, a powerful technique that computes exact derivatives by breaking down functions into elementary operations and applying the chain rule systematically.

Unlike numerical differentiation, it does not suffer from truncation error or step-size-related rounding problems, and unlike symbolic differentiation, it scales well to complex functions. This makes it especially powerful in machine learning and scientific computing, where accurate gradients are essential.

Automatic Differentiation

Automatic differentiation is thus neither numeric nor symbolic, nor is it a combination of both. In contrast to the more traditional numerical methods based on finite differences, it is "in theory" exact, and compared to symbolic algorithms, it is computationally inexpensive.
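To tie this back to the dual numbers from the Computerphile video mentioned at the start, here is a minimal, illustrative forward-mode sketch. The Dual class and its methods are a toy construction, not taken from any library: each value carries its derivative along, and every operation updates both using the usual rules.

import math

class Dual:
    # A value paired with its derivative, propagated operation by operation.
    def __init__(self, value, deriv=0.0):
        self.value = value  # f(x)
        self.deriv = deriv  # f'(x), carried alongside the value

    def __add__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # sum rule: (u + v)' = u' + v'
        return Dual(self.value + other.value, self.deriv + other.deriv)

    __radd__ = __add__

    def __mul__(self, other):
        other = other if isinstance(other, Dual) else Dual(other)
        # product rule: (u * v)' = u'*v + u*v'
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)

    __rmul__ = __mul__

def sin(d):
    # chain rule: (sin u)' = cos(u) * u'
    return Dual(math.sin(d.value), math.cos(d.value) * d.deriv)

def f(x):
    return x * x + sin(x)  # the same f(x) = x^2 + sin(x) as in the sympy example

x = Dual(math.pi, 1.0)  # seed the derivative dx/dx = 1
print("f'(pi) =", f(x).deriv)  # 2*pi + cos(pi) = 5.2831853...

The derivative comes out exact up to floating point, matching the symbolic result \( 2\pi + \cos(\pi) \) from earlier, with no step size to tune and no expression swell.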

References

  1. Derivative - Wikipedia
  2. Differentiation Rules - MATH MINDS ACADEMY
  3. Numerical differentiation - Wikipedia
  4. Automatic differentiation - Wikipedia

Stay Tuned

Hope you enjoyed reading this. Stay tuned for more cool stuff coming your way!