Functional Analysis¶

Functions are vectors¶

Intuition¶

Tip

We usually think of vectors in finite dimensional case, e.g. \(\mathbf{u}\in\mathbb{R}^4\), as a list

\[\begin{split}\mathbf{u}=\begin{bmatrix}0.3426 \\1.3258 \\6.8943 \\8.3878 \\\end{bmatrix}\\\end{split}\]
An alternate way to look at this is as a list of tuples, which binds the dimension (integer index in this case) and the value of that dimension

\[\mathbf{u}=\left((0,0.3426),(1,1.3258),(2,6.8943),(3,8.387)\right)\]
- This can be represented by a function
  
  \[u:[0,1,2,3]\mapsto[0.3426,1.3258,6.8943,8.387]\]
- In general, these sort of functions can be represented by
  
  \[u:\mathcal{I}\mapsto\mathbb{R}\]
  
  where \(\mathcal{I}\) is a finite index set.
If we extend \(\mathcal{I}\) to be countably infinite or uncountable (e.g. \(\mathbb{N}\) or \(\mathbb{R}\)), then, we end up with infinite lists of tuples, e.g.

\[f=\{(x,y)|x,y\in\mathbb{R}\}\]
- This is the way functions are defined in set theory, where the dimension of each function (informally) is \(|\mathcal{X}|\).

Cardinality of Function Space¶

Note

The space of functions \(f:\mathcal{X}\mapsto\mathcal{Y}\) is often denoted by \(\mathcal{Y}^{\mathcal{X}}\).
- For every element \(x\in\mathcal{X}\), we have a choice to make its image map to some \(y\in\mathcal{Y}\).
- Therefore, the size of this choice for each \(x\in\mathcal{X}\) is \(|\mathcal{Y}|\).
- The size of the choice for all the elements in \(\mathcal{X}\) is therefore \(|\mathcal{Y}|^{|\mathcal{X}|}\).

Tip

To remember, we can think of the selection problem, where we have \(n\) items which are either selected or discarded.
- We know from combinatorics that the total number of such choices is \(2^n\).
- This can be formed as a function where each of the \(n\) elements map to either 0 or 1.
We also verify that function view of real-valued vectors which maps the index set (of size \(d\)) to reals also make sense since we represent the dimension as \(\mathbb{R}^d\).

Technicalities¶

As long as we’re restricting ourselves to the class of functions from one vector (linear) space \(\mathcal{X}\) to another \(\mathcal{Y}\) over the same underlying scalar field \(\mathbb{F}\), we can faithfully recover many useful properties from a finite dimensional vector spaces for the (potentially infinite dimensional) space of functions.

Addition and Scalar Multiplication¶

Note

For finite dimensional vectors \(\mathbf{u},\mathbf{v}\in V_{\mathcal{F}}\), where \(\mathcal{F}\) is the underlying field:
- [Vector Addition]: We add the corresponding values for each dimension.
- [Scalar Multiplication]: For any \(\alpha\in\mathcal{F}\), we multiply \(\alpha\in\mathcal{F}\) to the value of each dimension.
For functions \(f:\mathcal{X}\mapsto\mathcal{Y}\) and \(g:\mathcal{X}\mapsto\mathcal{Y}\), the dimensions are \(x\in\mathcal{X}\).
- [Vector Addition]: We can define vector addition of a function for each “dimension” \(x\)
  
  \[(f + g)(x) = f(x) + g(x)\]
- [Scalar Multiplication]: We can define scalar multiplication for each “dimension” \(x\)
  
  \[(\alpha\cdot f)(x) = \alpha\cdot f(x)\]

Tip

With \(\mathcal{X}\) and \(\mathcal{Y}\) being linear space over the same scalar field, the addition and scalar multiplication makes sense.

Inner Product¶

Note

For finite dimensional vectors \(\mathbf{u},\mathbf{v}\in V_{\mathcal{F}}\), to compute the inner (dot) product
- We take a scalar product of the corresponding values for each dimension.
- We sum the results across all dimensions.
  
  \[\langle\mathbf{u},\mathbf{v}\rangle=\sum_{i=1}^n u_i\cdot v_i\]

Warning

Let’s add a constraint that \(\mathcal{Y}\) is equipped with an inner product.
Let’s add a constraint that \(\mathcal{X}\) is equipped with a positive measure \(\mu(x)\) and \(\mathop{d\mu}(x)=\mathop{dx}\).

Note

For functions \(f:\mathcal{X}\mapsto\mathcal{Y}\) and \(g:\mathcal{X}\mapsto\mathcal{Y}\)
- We can take a scalar product for each dimension \(x\).
- Since \(\mathcal{X}\) can be uncountable, we replace the sum with integration
  
  \[\langle f,g\rangle=\int_{\mathcal{X}}f(x)\cdot g(x)\mathop{dx}\]

Tip

With \(\mathcal{Y}\) being an inner product space, dot product under the integral makes sense.

Orthogonality¶

Note

Two functions \(f\) and \(g\) are orthogonal if their inner product is 0.
Example: For real trig functions \(\sin:[0,\pi]\mapsto[0,1]\in\mathbb{R}\) and \(\cos:[0,\pi]\mapsto[-1,1]\in\mathbb{R}\)

\[\langle\sin,\cos\rangle=\int_\limits{0}^{\pi}\sin(x)\cos(x)\mathop{dx}=0\]

Norm - Induced by the Inner Product¶

Lp Space¶

Note

The inner product for finite vectors induces a norm (\(l_2\))

\[||\mathbf{u}||_2^2=\langle \mathbf{u},\mathbf{u}\rangle=\sum_{i=1}^n|u_i|^2\]
The inner product defined above induces a norm

\[||f||_2^2=\langle f,f\rangle=\int_{\mathcal{X}}|f(x)|^2\mathop{dx}\]
More generally, we can have

\[||f||_{L_p}=\left(\int_{\mathcal{X}}|f(x)|^p\mathop{dx}\right)^{1/p}\]

Tip

For more general measurable spaces where we have a measure \(\mu(x)\) defined

\[||f||_{L_p(\mathcal{X},\mu)}=\left(\int_{\mathcal{X}}|f(x)|^p\mathop{d\mu}(x)\right)^{1/p}\]
For \(p=\infty\)

\[||f||_{L_\infty(\mathcal{X},\mu)}=\text{ess}\sup_\limits{x\in\mathcal{X}}|f(x)|\]
We write the function space as \(L^p(\mathcal{X},\mathcal{Y})=\{f|f:\mathcal{X}\mapsto\mathcal{Y};\text{such that }L_p(\mathcal{X,\mu})\text{ exists}\}\)
- Example: \(L^2([0,1],\mathbb{R})\)

Metric - Induced by the Norm¶

Note

The \(l_p\) norm for finite vectors induces a metric

\[d(\mathbf{u}, \mathbf{v})=||\mathbf{u}-\mathbf{v}||_p=\left(\sum_{i=1}^n|u_i-v_i|^p\right)^{1/p}\]
We can define, similarly, for functions

\[d(f, g)=||f-g||_{L_p(\mathcal{X},\mu)}=\left(\int_\limits{i=1}^n|f(x)-g(x)|^p\mathop{d\mu}(x)\right)^{1/p}\]
- If \(d(f, g)=0\), then the functions are the same “almost everywhere”.
- In this case, whereever they are different has measure zero (e.g, finitely many points or “dimensions”).

Topological Properties¶

Note

With a metric defined, we can define topological properties such as convergence and complete function spaces.

Tip

Complete normed spaces are known as Banach Space.
Complete normed spaces where the norm is induced by inner-product are known as Hilbert Space.

Warning

Without the metric, the only topology we can have for the set of functions is the product topology (as suggested by \(\mathcal{Y}^{\mathcal{X}}\)).
With product topology, the only convergence that we can define is point-wise convergence which is a weak form of convergence.

Point-wise Convergence¶

Note

Let \((f_n)_{n=1}^\infty\) be a sequence of functions where \(f_n:\mathcal{X}\mapsto\mathcal{Y}\).
Let \(f\) be another function \(f:\mathcal{X}\mapsto\mathcal{Y}\)
We say that the sequence is point-wise converging towards \(f\)

\[\lim\limits_{n\to\infty}f_n=f\iff\forall x\in\mathcal{X}, \lim\limits_{n\to\infty}f_n(x)=f(x)\]

Uniform Convergence¶

Note

We can make a stronger convergence criterion for functions which map to a metric space \(\mathcal{Y}\) (i.e., where \(\sup\) makes sense).
We say that the sequence is uninformly converging towards \(f\)

\[\lim\limits_{n\to\infty}f_n=f\iff\lim\limits_{n\to\infty}\sup\{|f_n(x)-f(x)|:x\in\mathcal{X}\}=0\]

Tip

All uniformly convergent functions are point-wise convergent.
The converse is not true.

Linear Transforms¶

Linear Transforms on Euclidean Space¶

Note

We consider a set of unit-vectors (i.e. normalized but not necessarily orthogonal) in \(\mathbb{R}^n\) for a finite dimensional vector space as

\[\{\mathbf{a}_1,\cdots\mathbf{a}_n\}\]
We can find the projection of any vector \(\mathbf{u}\in\mathbb{R}^n\) onto each of the basis

\[\langle\mathbf{a}_i,\mathbf{u}\rangle\]
Under the new basis, this gives the i-th co-ordinate for the result vector \(\mathbf{v}\)

\[\begin{split}\begin{bmatrix}0\\\vdots\\v_i\\\vdots\\0\end{bmatrix}=\langle\mathbf{a}_i,\mathbf{u}\rangle\end{split}\]
We note that we can collect the basis vectors inside a matrix as rows and express the relation as

\[\begin{split}\mathbf{v}=\begin{bmatrix}-&\mathbf{a_1^*}&-\\ \vdots&\vdots&\vdots\\ -&\mathbf{a_n^*}&-\\\end{bmatrix}\mathbf{u}=\mathbf{A}^T\mathbf{u}\end{split}\]
We also note that the final vector can be written as a sum

\[\begin{split}\mathbf{v}=\sum_{i=1}^n\begin{bmatrix}0\\\vdots\\v_i\\\vdots\\0\end{bmatrix}=\sum_{i=1}^n\langle\mathbf{u},\mathbf{a}_i\rangle\end{split}\]

Function view¶

Note

Let \(\mathcal{I}=\{1,\cdots,d\}\) be the index set.
The vector \(\mathbf{u}\) defines a function \(u:\mathcal{I}\mapsto\mathbb{R}\in\mathcal{U}(\mathcal{I})\)
The basis vectors \(\mathbf{a}_k\) are also functions \(a_k:\mathcal{I}\mapsto\mathcal{H}\) where \(\mathcal{H}^{\mathcal{I}}\subseteq\mathbb{R}^d\) is the subspace spanned by the basis.
Then the matrix \(\mathbf{A}\) defines a linear transform \(A:\mathcal{H}\mapsto\mathcal{U}(\mathcal{I})\) where

\[v(i)=(Au)(i)\]

Linear Transforms on Function Space¶

Note

We consider functions \(f:E\mapsto\mathbb{R}\in\mathcal{F}(E)\) (similar to \(\mathcal{U}(\mathcal{I})\)) where \(E\) is any abstract space.
We consider basis functions \(h:E\mapsto\mathcal{H}\) where \(\mathcal{H}\) is a Hilbert space equipped with a norm

\[\langle \cdot,\cdot\rangle_{\mathcal{H}}\]
For any \(x\in E\), the projection onto a basis is given by

\[f(x)=\langle f,h(x)\rangle_{\mathcal{H}}\]
This defines a linear transform \(L:\mathcal{H}\mapsto\mathcal{F}(E)\) where

\[f(\cdot)=(Lf)(\cdot)\]

Kernel of a Linear Transform¶

Note

The kernel of a linear transform is found by inner products of the basis functions.
For finite dimensional case \(K:\mathcal{I}\times\mathcal{I}\mapsto\mathbb{R}\)

\[K(i,j)=\langle\mathbf{a}^*_i,\mathbf{a}^*_j\rangle\]
For the functional case \(K:E\times E\mapsto\mathbb{R}\)

\[K(x,y)=\langle h(y),h(x)\rangle_{\mathcal{H}}\]
For any fixed \(y_n\in E\), \(K(x,y_n)\) is a function of just \(x\in E\) in \(\mathcal{F}(E)\)
Therefore, often the basis functions are referred by just the kernel itself as

\[K(\cdot,x)=h(x)\]

and

\[K(x,y)=\langle K(\cdot,x),K(\cdot,y)\rangle\]
Symmetry: \(K(x,y)=K(y,x)\) due to the symmetry of the inner product.
Reproducing Property:
- Function evaluation is inner product: \(f(x)=\langle f,K(\cdot,x)\rangle_{\mathcal{H}}\)
- Therefore, knowing the kernel, we can recover any function.
Any function \(g\in\mathcal{F}(E)\) can be expressed as a linear combination of the basis

\[g(x)=\sum_m\alpha_m K(x,y_m)\]

Tip

We note that when the matrix is the centered, normalised data matrix, then the kernel gives the sample covariance matrix.
This hints as the usability of functional kernels for covariance functions for infinite dimensional Gaussian distributions (GPs).

Mercer Theorem - Eigenfunctions¶

Note

TODO

Other Integral Transforms¶

General Form¶

Note

TODO

Fourier Basis¶

Note

For “well-behaved” (i.e. square-integrable so that one can define \(L_2\) norm as per above) periodic functions, we can have basis functions of odd and even frequencies.
Schauder basis (allows for infinite sum over basis):
- A basis for functions in \(L^2([0,1],\mathbb{R})\) can be defined in terms of an infinite set of orthonormal functions`
  
  \[\{1, (\sqrt{2}\sin(2\pi kx))_{k=1}^\infty, (\sqrt{2}\cos(2\pi kx))_{k=1}^\infty\}\]
- The \(\sin\) functions account for odd-frequencies and the \(\cos\) functions account for even-frequencies.
Here we have 3 sets of basis functions, so we use 3 different kinds of normalised-projection co-efficients, \(a_0,a_k,b_k\)

\[f(x)=a_0\cdot1+\sum_{k=1}^\infty a_i\cdot\cos(2\pi kx)+\sum_{k=1}^\infty b_i\cdot\sin(2\pi kx)\]
\(a_0\) computes the projection of \(f(x)\) onto the constant function \(1\).

\[a_0=\frac{\int_\limits{[0,1]}1\cdot f(x)\mathop{dx}}{\int_\limits{[0,1]}1\cdot 1\mathop{dx}}=\int_\limits{[0,1]}f(x)\mathop{dx}\]
For each \(k>0\), \(a_k\) computes the projection of \(f(x)\) onto the even frequencies, \(\sqrt{2}\cos(2\pi nx)\).

\[a_k=\frac{\int_\limits{[0,1]}f(x)\cdot\sqrt{2}\cos(2\pi kx)\mathop{dx}}{\int_\limits{[0,1]}\sqrt{2}\cos(2\pi kx)\cdot\sqrt{2}\cos(2\pi kx)\mathop{dx}}\]
Similarly, for \(b_k\).

Useful Function Spaces¶

Sobolev Space¶

Holder Space¶

Reproducing kernel Hilbert space¶

Moore-Aronszajn Theorem¶

We assume that the kernel is positive definite. .. warning:

For every positive definite kernel :math:`K` on a set :math:`E`, there exists a unique RKHS :math:`\mathcal{H}_k` of functions on :math:`E` for which :math:`K` is the reproducing kernel.

Functional Analysis¶

Functions are vectors¶

Intuition¶

Cardinality of Function Space¶

Technicalities¶

Addition and Scalar Multiplication¶

Inner Product¶

Orthogonality¶

Norm - Induced by the Inner Product¶

Lp Space¶

Metric - Induced by the Norm¶

Topological Properties¶

Point-wise Convergence¶

Uniform Convergence¶

Linear Transforms¶

Linear Transforms on Euclidean Space¶

Function view¶

Linear Transforms on Function Space¶

Kernel of a Linear Transform¶

Mercer Theorem - Eigenfunctions¶

Other Integral Transforms¶

General Form¶

Fourier Basis¶

Useful Function Spaces¶

Sobolev Space¶

Holder Space¶

Reproducing kernel Hilbert space¶

Moore-Aronszajn Theorem¶

Generalised Functions - Distributions¶

Dirac-Delta Function¶