
Divergence, Gauss-Ostrogradsky theorem and Laplacian

September 20, 2021 9 min read

The Laplacian is an interesting object that was originally invented in multivariate calculus and field theory, but its generalizations arise in many areas of applied mathematics, from computer vision to spectral graph theory and from differential geometry to homology theory. In this post I am going to explain the intuition behind the Laplacian, which requires introducing the notion of divergence first. I'll also touch on the famous Gauss-Ostrogradsky theorem.

In the late 18th and early 19th centuries, great mathematicians such as Joseph-Louis Lagrange, Carl Friedrich Gauss and others were laying the foundations of modern physics. One of the most notable problems they studied was heat transfer. The mathematical methods developed to solve it, such as multivariate calculus and Fourier analysis, are invaluable and now ubiquitous in applied mathematics.

Throughout this post I will be using examples from heat transfer and diffusion problems as a motivation to explain the intuition behind mathematical concepts.

In order to introduce the Laplacian, I have to introduce divergence first, because the Laplacian is a special case of it: the divergence of a gradient. Once we have dealt with divergence, the properties of the Laplacian will become obvious.

Divergence

The notion of divergence in multivariate calculus has two very different definitions: a conceptual one and a technical one.

Their equivalence is not trivial to see, so I'll dedicate a separate section below to proving it.

Conceptual definition

Conceptually, the divergence of a vector field $\vec{F}$ at a point is the net outward flow of that field through an infinitesimal closed surface $S$ around the point, divided by the volume $V$ that the surface encloses:

$$\nabla \cdot F(x,y,z) = \lim \limits_{S, V \to 0} \frac{1}{|V|} \oiint \limits_S F \cdot \vec{n} \, dS$$

Here $\vec{n}$ is the outward unit normal to $S$.

Contour integral

Technical definition

Technically, the divergence of a vector field $\vec{F}$ is the sum of the partial derivatives of its components, each taken along its own coordinate: $\nabla \cdot F(x,y,z) = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}$.

Note that $\vec{F}$ is a vector field, not a scalar field. So divergence is not the dot product of a gradient with the vector $(1,1,1)$, as the abuse of notation might suggest: each partial derivative is applied to a different component of $\vec{F}$.

Infinitesimal cube

Proof of equivalence of conceptual and technical definitions

Say our infinitesimal surface surrounding the given point $(x,y,z)$ is a cube centered at that point, with edges aligned with the coordinate axes, so that the edge of length $dx$ is parallel to the $x$ axis, $dy$ to the $y$ axis and $dz$ to the $z$ axis.

As the cube is small, we can assume that the flow through each of its faces equals the area of that face multiplied by the normal component of the field at the center of the face.

Divergence, according to the conceptual definition, is the total flow through all 6 faces of the cube, divided by its volume. I will denote the part of the total divergence that corresponds to the flow through the pair of faces orthogonal to the $x$ axis as $\nabla_x \cdot F(x,y,z)$. Thus, the total divergence equals $\nabla \cdot F(x,y,z) = \nabla_x \cdot F(x,y,z) + \nabla_y \cdot F(x,y,z) + \nabla_z \cdot F(x,y,z)$.

Let us calculate the contribution of the two faces orthogonal to the $x$ axis:

$$\nabla_x \cdot F(x, y, z) = \frac{1}{V} \left(F_x(x+\frac{dx}{2}, y, z) - F_x(x-\frac{dx}{2}, y, z)\right) \cdot S = \frac{1}{dx \cdot dy \cdot dz} \left(F_x(x+\frac{dx}{2}, y, z) - F_x(x-\frac{dx}{2}, y, z)\right) \cdot dy \cdot dz =$$

$$= \frac{1}{dx \cdot dy \cdot dz} F_x'(x, y, z) \cdot dx \cdot dy \cdot dz = F_x'(x, y, z)$$

Here $S = dy \cdot dz$ is the area of each of the two faces and $F_x'$ denotes $\frac{\partial F_x}{\partial x}$. The minus sign appears because the outward normals of the two faces point in opposite directions.

So, the total divergence is:

$$\nabla \cdot F(x, y, z) = \nabla_x \cdot F(x, y, z) + \nabla_y \cdot F(x, y, z) + \nabla_z \cdot F(x, y, z) = F_x'(x, y, z) + F_y'(x, y, z) + F_z'(x, y, z) = \frac{\partial F_x}{\partial x} + \frac{\partial F_y}{\partial y} + \frac{\partial F_z}{\partial z}$$

We see that the conceptual definition reduces to the technical one.
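The equivalence can also be checked numerically. The sketch below is only an illustration, not part of the proof: the field `F`, the test point and the cube edge `h` are arbitrary choices made for this example. It sums the flow through the six faces of a small cube, divides by the volume of the cube and compares the result with the sum of partial derivatives computed by hand.

```python
def F(x, y, z):
    """Example vector field (an arbitrary choice for illustration)."""
    return (x * x * y, y * z, x + z ** 3)


def divergence_analytic(x, y, z):
    # Technical definition: d(x^2 y)/dx + d(y z)/dy + d(x + z^3)/dz
    return 2 * x * y + z + 3 * z * z


def divergence_from_flux(field, x, y, z, h=1e-3):
    """Conceptual definition: net outward flow through a cube of edge h
    centered at (x, y, z), divided by the volume of the cube.

    The flow through each face is approximated by the normal component
    of the field at the face center times the face area.
    """
    half = h / 2
    area = h * h
    flux = 0.0
    # Pair of faces orthogonal to the x axis (outward normals +x and -x)
    flux += (field(x + half, y, z)[0] - field(x - half, y, z)[0]) * area
    # Pair of faces orthogonal to the y axis
    flux += (field(x, y + half, z)[1] - field(x, y - half, z)[1]) * area
    # Pair of faces orthogonal to the z axis
    flux += (field(x, y, z + half)[2] - field(x, y, z - half)[2]) * area
    return flux / h ** 3


point = (0.7, -1.2, 0.5)
approx = divergence_from_flux(F, *point)
exact = divergence_analytic(*point)
print(f"flux / volume:       {approx:.6f}")
print(f"sum of partials:     {exact:.6f}")
```

The two numbers should agree up to an error that shrinks as `h` decreases.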

Invariance of divergence to coordinate change

We chose our infinitesimal cube so that its edges are parallel to the coordinate axes.

However, the divergence would not change if we chose the directions differently.

Indeed, note that divergence is the trace of the Jacobian matrix of $\vec{F}$, and the trace is invariant under similarity transformations (such as a rotation of coordinates).

To show this, first verify by direct calculation that $tr(A \cdot B) = tr(B \cdot A)$; it then follows that $tr(B A B^{-1}) = tr(A B^{-1} B) = tr(A)$.
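Both trace identities are easy to check on a concrete matrix. In the sketch below the matrix `J` stands in for a Jacobian and `R` is a rotation about the $z$ axis; both are hypothetical values picked for illustration. Since `R` is a rotation, its inverse is its transpose.

```python
import math


def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(a):
    return [list(row) for row in zip(*a)]


def trace(a):
    return sum(a[i][i] for i in range(3))


# Hypothetical Jacobian of some vector field at some point
J = [[1.0, 2.0, -0.5],
     [0.3, -4.0, 2.2],
     [1.5, 0.0, 0.7]]

# Rotation about the z axis by an arbitrary angle; R^{-1} equals R^T
theta = 0.83
c, s = math.cos(theta), math.sin(theta)
R = [[c, -s, 0.0],
     [s, c, 0.0],
     [0.0, 0.0, 1.0]]

# tr(AB) = tr(BA)
tr_ab = trace(matmul(J, R))
tr_ba = trace(matmul(R, J))

# tr(R J R^{-1}) = tr(J): the Jacobian expressed in rotated coordinates
J_rotated = matmul(matmul(R, J), transpose(R))

print(tr_ab, tr_ba)
print(trace(J), trace(J_rotated))
```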

Why does this work for non-rectangular surfaces?

TODO

Gauss-Ostrogradsky theorem

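The Gauss-Ostrogradsky theorem (also known as the divergence theorem) states that the flow of a vector field through a macroscopic closed surface $S$ equals the integral of its divergence over the volume $V$ enclosed by that surface: $\oiint \limits_S F \cdot \vec{n} \, dS = \iiint \limits_V \nabla \cdot F \, dV$.

It is proved by the same argument that we used for the infinitesimal surface and volume: split the whole macroscopic volume into infinitesimal cubes. The flows through every internal face shared by two neighbouring cubes cancel out, so only the flow through the outer surface remains.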
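The theorem can be illustrated numerically on a simple shape. The sketch below is a minimal check under assumptions chosen for this example: the volume is the unit cube $[0,1]^3$, the field `F` is arbitrary, and both integrals are approximated with the midpoint rule on an `n`-by-`n` grid.

```python
def F(x, y, z):
    """Example vector field (an arbitrary choice for illustration)."""
    return (x * x * y, y * z, x + z ** 3)


def div_F(x, y, z):
    # Divergence of F computed by hand
    return 2 * x * y + z + 3 * z * z


def midpoints(n):
    """Centers of n equal sub-intervals of [0, 1]."""
    return [(i + 0.5) / n for i in range(n)]


def volume_integral(n=40):
    """Midpoint-rule integral of the divergence over the unit cube."""
    pts = midpoints(n)
    cell_volume = (1.0 / n) ** 3
    return sum(div_F(x, y, z) for x in pts for y in pts for z in pts) * cell_volume


def surface_flux(n=40):
    """Midpoint-rule outward flow through the six faces of the unit cube."""
    pts = midpoints(n)
    patch_area = (1.0 / n) ** 2
    total = 0.0
    for u in pts:
        for v in pts:
            total += F(1.0, u, v)[0] - F(0.0, u, v)[0]  # faces x = 1 and x = 0
            total += F(u, 1.0, v)[1] - F(u, 0.0, v)[1]  # faces y = 1 and y = 0
            total += F(u, v, 1.0)[2] - F(u, v, 0.0)[2]  # faces z = 1 and z = 0
    return total * patch_area


flux = surface_flux()
volume = volume_integral()
print(f"flow through the surface:  {flux:.5f}")
print(f"integral of divergence:    {volume:.5f}")
```

For this particular field both integrals can also be computed by hand and equal 2; the small mismatch in the volume integral comes from the midpoint rule and shrinks as `n` grows.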

Laplacian

The Laplacian follows from divergence in one simple step.

Suppose that instead of a vector field $\vec{F}$ you have a scalar field $V$ (not to be confused with the volume $V$ above). For instance, instead of a flow of matter, you have a distribution of temperature or concentration over a volume.

Well, we can get a vector field out of that scalar field easily: just find the gradient of the scalar field and use it as the vector field.

The operator $\nabla \cdot \nabla V(x,y,z) = \Delta V(x,y,z)$, the divergence of the gradient, is called the Laplacian.

Being a divergence, the Laplacian is invariant to a rotation of coordinates as well (by the way, it is the trace of the Hessian).
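A small numerical sketch of this construction, assuming an arbitrary scalar field `f` and central finite differences with step `h` (both are choices made for this example): first take the gradient, then take the divergence of the resulting vector field, and compare with the sum of second derivatives computed by hand.

```python
import math


def f(x, y, z):
    """Example scalar field (an arbitrary choice for illustration)."""
    return x * x * y + math.sin(z)


def laplacian_exact(x, y, z):
    # d2f/dx2 + d2f/dy2 + d2f/dz2 computed by hand
    return 2 * y - math.sin(z)


def shifted(p, axis, delta):
    """Copy of point p moved by delta along the given axis."""
    q = list(p)
    q[axis] += delta
    return q


def gradient(func, p, h=1e-3):
    """Central-difference gradient of a scalar field at point p."""
    return [(func(*shifted(p, i, h)) - func(*shifted(p, i, -h))) / (2 * h)
            for i in range(3)]


def laplacian_as_div_grad(func, p, h=1e-3):
    """Divergence of the gradient: the i-th gradient component is
    differentiated along the i-th axis and the results are summed."""
    total = 0.0
    for i in range(3):
        g_plus = gradient(func, shifted(p, i, h), h)[i]
        g_minus = gradient(func, shifted(p, i, -h), h)[i]
        total += (g_plus - g_minus) / (2 * h)
    return total


point = (0.3, -0.8, 1.1)
numeric = laplacian_as_div_grad(f, point)
exact = laplacian_exact(*point)
print(numeric, exact)
```

The three terms summed in `laplacian_as_div_grad` are the diagonal entries of the Hessian, which is why the result is also its trace.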

Laplacian as a measure of non-conformness of a point

For the Laplacian there exists an alternative interpretation: you can draw an infinitesimal sphere around the point and calculate the average value of the function over that sphere. The difference between this average and the value at the point is proportional to the Laplacian:

$$\nabla^2 f(r_0) = \lim_{R \to 0} \frac{2d}{R^2} (\langle f \rangle_{shell} - f(r_0))$$

Here $d$ is the number of dimensions of the space, $R$ is the radius of the sphere and $\langle f \rangle_{shell}$ is the average of $f$ over the sphere centered at $r_0$.

If a point is very different from its neighbourhood, the magnitude of the Laplacian will be large. If its value equals the average of its surroundings, the Laplacian will be 0.

Proof of equivalence of definition of Laplacian and its interpretation as a measure of non-conformness

In three dimensions ($d = 3$) the average over a shell of radius $R$ is:

$$\langle f \rangle_{shell} = \frac{\int_{shell} f(r) d^2 r}{\int_{shell} d^2 r} = \frac{\int_{shell} f(r) d^2 r}{ 4 \pi R^2 }$$

Substituting it into the right-hand side of the formula gives:

$$\lim\limits_{R \to 0} \frac{2d}{R^2} (\langle f \rangle_{shell} - f(r_0)) = \lim\limits_{R \to 0} \frac{6}{R^2} \left(\frac{\int_{shell} f(r) d^2 r}{ 4 \pi R^2 } - f(r_0)\right) = \lim\limits_{R \to 0} \frac{6}{4 \pi R^4} \int_{shell} (f(r) - f(r_0)) d^2 r$$

By Taylor series, where $x$, $y$, $z$ are the coordinates of $r - r_0$ and all derivatives are taken at $r_0$:

$$f(r) - f(r_0) = \frac{\partial f}{\partial x} x + \frac{\partial f}{\partial y} y + \frac{\partial f}{\partial z} z + \frac{1}{2} \frac{\partial^2 f}{\partial x^2}x^2 + \frac{1}{2} \frac{\partial^2 f}{\partial y^2}y^2 + \frac{1}{2} \frac{\partial^2 f}{\partial z^2}z^2 + \frac{\partial^2 f}{\partial x \partial y}xy + \frac{\partial^2 f}{\partial x \partial z}xz + \frac{\partial^2 f}{\partial y \partial z}yz + ...$$

By symmetry, the terms that are odd in any coordinate integrate to zero over the shell: $\int_{shell} x d^2 r = 0$, $\int_{shell} xy d^2 r = 0$ etc.

The pure quadratic terms survive, and all three of them give the same integral:

$$\int_{shell} x^2 d^2 r = \int_{shell} y^2 d^2 r = \int_{shell} z^2 d^2 r = \frac{1}{3} \int_{shell} (x^2 + y^2 + z^2) d^2 r = \frac{1}{3} \int_{shell} R^2 d^2 r = \frac{4}{3} \pi R^4$$

Substitute this into the limit (the higher-order terms of the Taylor series vanish as $R \to 0$):

$$\lim\limits_{R \to 0} \frac{6}{4 \pi R^4} \int_{shell} (f(r) - f(r_0)) d^2 r = \lim\limits_{R \to 0} \frac{6}{4 \pi R^4} \cdot \frac{1}{2} \cdot \frac{4}{3} \pi R^4 \left(\frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}\right) = \frac{\partial^2 f}{\partial x^2} + \frac{\partial^2 f}{\partial y^2} + \frac{\partial^2 f}{\partial z^2}$$
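The same relation can be checked numerically. The sketch below rests on several assumptions made for this example: an arbitrary scalar field `f`, a small but finite radius instead of the limit, and a simple equal-area grid on the sphere (uniform in $\cos\theta$ and in the azimuthal angle) to approximate the average over the shell.

```python
import math


def f(x, y, z):
    """Example scalar field (an arbitrary choice for illustration)."""
    return x * x * y + math.sin(z)


def laplacian_exact(x, y, z):
    # Sum of second partial derivatives computed by hand
    return 2 * y - math.sin(z)


def shell_average(func, center, radius, n_z=200, n_phi=64):
    """Average of func over a sphere of the given radius around center.

    Points are uniform in cos(theta) and in phi, so every point
    represents the same area and a plain mean is enough.
    """
    cx, cy, cz = center
    total = 0.0
    for i in range(n_z):
        cos_theta = -1.0 + (i + 0.5) * 2.0 / n_z
        sin_theta = math.sqrt(1.0 - cos_theta * cos_theta)
        for j in range(n_phi):
            phi = 2.0 * math.pi * (j + 0.5) / n_phi
            total += func(cx + radius * sin_theta * math.cos(phi),
                          cy + radius * sin_theta * math.sin(phi),
                          cz + radius * cos_theta)
    return total / (n_z * n_phi)


def laplacian_from_shell(func, center, radius=0.05, dims=3):
    """(2d / R^2) * (<f>_shell - f(r_0)) at a finite radius R."""
    average = shell_average(func, center, radius)
    return 2 * dims / radius ** 2 * (average - func(*center))


r0 = (0.3, -0.8, 1.1)
estimate = laplacian_from_shell(f, r0)
exact = laplacian_exact(*r0)
print(estimate, exact)
```

Because the radius is finite, the estimate differs from the exact value by a term of order $R^2$, which is the part of the Taylor series dropped in the limit.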

Applications outside multivariate calculus and field theory

There are several discrete analogues of the continuous Laplacian that are used in various fields of computer science.

Discrete Laplace operator in computer vision for edge detection

A discrete version of the Laplace operator is a convolutional filter used in computer vision for edge detection. In the 2D case it is a 3-by-3 matrix of the following structure (written here with a positive center, which is the five-point finite-difference Laplacian with the sign flipped):

$$L = \begin{pmatrix} 0 & -1 & 0 \\ -1 & 4 & -1 \\ 0 & -1 & 0 \end{pmatrix}$$

Try applying it to a grayscale photo of a brick wall. Every pixel inside a brick will become black after applying the Laplace operator as a convolutional filter, because its neighbours in all directions have the same colour and the response is 0. But the edges of a brick will light up, because the pixels on the other side of the edge have a different colour.
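Here is a toy version of that experiment, assuming a synthetic 8-by-8 image with one bright square in place of a real photo. The helper `apply_filter` is a hypothetical name written for this sketch; it slides the kernel over the interior pixels only, so the output is 6-by-6. The kernel is symmetric, so it does not matter whether we call the operation convolution or correlation.

```python
KERNEL = [[0, -1, 0],
          [-1, 4, -1],
          [0, -1, 0]]


def apply_filter(image, kernel):
    """Apply a 3x3 kernel to every interior pixel of a 2D image.

    Border pixels are skipped, so the output is 2 rows and 2 columns
    smaller than the input.
    """
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(1, rows - 1):
        out_row = []
        for c in range(1, cols - 1):
            acc = 0
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    acc += kernel[dr + 1][dc + 1] * image[r + dr][c + dc]
            out_row.append(acc)
        out.append(out_row)
    return out


# Synthetic "brick": a 4x4 bright square on a dark 8x8 background
image = [[1 if 2 <= r <= 5 and 2 <= c <= 5 else 0 for c in range(8)]
         for r in range(8)]

response = apply_filter(image, KERNEL)
for row in response:
    print(" ".join(f"{value:2d}" for value in row))
```

Pixels deep inside the square and in the flat background give 0, while pixels along the boundary of the square give non-zero responses: positive just inside the edge and negative just outside it.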

Discrete Laplacian in spectral graph theory

This is a subject of a whole separate post on spectral graph theory.



Boris Burkov

Written by Boris Burkov who lives in Moscow, Russia, loves to take part in development of cutting-edge technologies, reflects on how the world works and admires the giants of the past. You can follow me in Telegram