Functional Analysis

Functions are vectors

Intuition

Tip

  • We usually think of vectors in finite dimensional case, e.g. \(\mathbf{u}\in\mathbb{R}^4\), as a list

    \[\begin{split}\mathbf{u}=\begin{bmatrix}0.3426 \\1.3258 \\6.8943 \\8.3878 \\\end{bmatrix}\\\end{split}\]
  • An alternate way to look at this is as a list of tuples, which binds the dimension (integer index in this case) and the value of that dimension

    \[\mathbf{u}=\left((0,0.3426),(1,1.3258),(2,6.8943),(3,8.387)\right)\]
    • This can be represented by a function

      \[u:[0,1,2,3]\mapsto[0.3426,1.3258,6.8943,8.387]\]
    • In general, these sort of functions can be represented by

      \[u:\mathcal{I}\mapsto\mathbb{R}\]

      where \(\mathcal{I}\) is a finite index set.

  • If we extend \(\mathcal{I}\) to be countably infinite or uncountable (e.g. \(\mathbb{N}\) or \(\mathbb{R}\)), then, we end up with infinite lists of tuples, e.g.

    \[f=\{(x,y)|x,y\in\mathbb{R}\}\]
    • This is the way functions are defined in set theory, where the dimension of each function (informally) is \(|\mathcal{X}|\).

Cardinality of Function Space

Note

  • The space of functions \(f:\mathcal{X}\mapsto\mathcal{Y}\) is often denoted by \(\mathcal{Y}^{\mathcal{X}}\).

    • For every element \(x\in\mathcal{X}\), we have a choice to make its image map to some \(y\in\mathcal{Y}\).

    • Therefore, the size of this choice for each \(x\in\mathcal{X}\) is \(|\mathcal{Y}|\).

    • The size of the choice for all the elements in \(\mathcal{X}\) is therefore \(|\mathcal{Y}|^{|\mathcal{X}|}\).

Tip

  • To remember, we can think of the selection problem, where we have \(n\) items which are either selected or discarded.

    • We know from combinatorics that the total number of such choices is \(2^n\).

    • This can be formed as a function where each of the \(n\) elements map to either 0 or 1.

  • We also verify that function view of real-valued vectors which maps the index set (of size \(d\)) to reals also make sense since we represent the dimension as \(\mathbb{R}^d\).

Technicalities

As long as we’re restricting ourselves to the class of functions from one vector (linear) space \(\mathcal{X}\) to another \(\mathcal{Y}\) over the same underlying scalar field \(\mathbb{F}\), we can faithfully recover many useful properties from a finite dimensional vector spaces for the (potentially infinite dimensional) space of functions.

Addition and Scalar Multiplication

Note

  • For finite dimensional vectors \(\mathbf{u},\mathbf{v}\in V_{\mathcal{F}}\), where \(\mathcal{F}\) is the underlying field:

    • [Vector Addition]: We add the corresponding values for each dimension.

    • [Scalar Multiplication]: For any \(\alpha\in\mathcal{F}\), we multiply \(\alpha\in\mathcal{F}\) to the value of each dimension.

  • For functions \(f:\mathcal{X}\mapsto\mathcal{Y}\) and \(g:\mathcal{X}\mapsto\mathcal{Y}\), the dimensions are \(x\in\mathcal{X}\).

    • [Vector Addition]: We can define vector addition of a function for each “dimension” \(x\)

      \[(f + g)(x) = f(x) + g(x)\]
    • [Scalar Multiplication]: We can define scalar multiplication for each “dimension” \(x\)

      \[(\alpha\cdot f)(x) = \alpha\cdot f(x)\]

Tip

With \(\mathcal{X}\) and \(\mathcal{Y}\) being linear space over the same scalar field, the addition and scalar multiplication makes sense.

Inner Product

Note

  • For finite dimensional vectors \(\mathbf{u},\mathbf{v}\in V_{\mathcal{F}}\), to compute the inner (dot) product

    • We take a scalar product of the corresponding values for each dimension.

    • We sum the results across all dimensions.

      \[\langle\mathbf{u},\mathbf{v}\rangle=\sum_{i=1}^n u_i\cdot v_i\]

Warning

  • Let’s add a constraint that \(\mathcal{Y}\) is equipped with an inner product.

  • Let’s add a constraint that \(\mathcal{X}\) is equipped with a positive measure \(\mu(x)\) and \(\mathop{d\mu}(x)=\mathop{dx}\).

Note

  • For functions \(f:\mathcal{X}\mapsto\mathcal{Y}\) and \(g:\mathcal{X}\mapsto\mathcal{Y}\)

    • We can take a scalar product for each dimension \(x\).

    • Since \(\mathcal{X}\) can be uncountable, we replace the sum with integration

      \[\langle f,g\rangle=\int_{\mathcal{X}}f(x)\cdot g(x)\mathop{dx}\]

Tip

With \(\mathcal{Y}\) being an inner product space, dot product under the integral makes sense.

Orthogonality

Note

  • Two functions \(f\) and \(g\) are orthogonal if their inner product is 0.

  • Example: For real trig functions \(\sin:[0,\pi]\mapsto[0,1]\) and \(\cos:[0,\pi]\mapsto[0,1]\)

    \[\langle\sin,\cos\rangle=\int_\limits{0}^{\pi}\sin(x)\cos(x)\mathop{dx}=0\]

Norm - Induced by the Inner Product

Lp Space

Note

  • The inner product for finite vectors induces a norm (\(l_2\))

    \[||\mathbf{u}||_2^2=\langle \mathbf{u},\mathbf{u}\rangle=\sum_{i=1}^n|u_i|^2\]
  • The inner product defined above induces a norm

    \[||f||_2^2=\langle f,f\rangle=\int_{\mathcal{X}}|f(x)|^2\mathop{dx}\]
  • More generally, we can have

    \[||f||_{L_p}=\left(\int_{\mathcal{X}}|f(x)|^p\mathop{dx}\right)^{1/p}\]

Tip

  • For more general measurable spaces where we have a measure \(\mu(x)\) defined

    \[||f||_{L_p(\mathcal{X},\mu)}=\left(\int_{\mathcal{X}}|f(x)|^p\mathop{d\mu}(x)\right)^{1/p}\]
  • For \(p=\infty\)

    \[||f||_{L_\infty(\mathcal{X},\mu)}=\text{ess}\sup_\limits{x\in\mathcal{X}}|f(x)|\]
  • We write the function space as \(L^p(\mathcal{X},\mathcal{Y})=\{f|f:\mathcal{X}\mapsto\mathcal{Y};\text{such that }L_p(\mathcal{X,\mu})\text{ exists}\}\)

    • Example: \(L^2([0,1],\mathbb{R})\)

Metric - Induced by the Norm

Note

  • The \(l_p\) norm for finite vectors induces a metric

    \[d(\mathbf{u}, \mathbf{v})=||\mathbf{u}-\mathbf{v}||_2=\left(\sum_{i=1}^n|u_i-v_i|^p\right)^{1/p}\]
  • We can define, similarly, for functions

    \[d(f, g)=||f-g||_{L_p(\mathcal{X},\mu)}=\left(\int_\limits{i=1}^n|f(x)-g(x)|^p\mathop{d\mu}(x)\right)^{1/p}\]
    • If \(d(f, g)=0\), then the functions are the same “almost everywhere”.

    • In this case, they are different for at most finitely many “dimensions”.

Topological Properties

Note

With a metric defined, we can define topological properties such as convergence and complete function spaces.

Tip

Warning

  • Without the metric, the only topology we can have for the set of functions is the product topology (as suggested by \(\mathcal{Y}^{\mathcal{X}}\)).

  • With product topology, the only convergence that we can define is point-wise convergence which is a weak form of convergence.

Point-wise Convergence

Note

  • Let \((f_n)_{n=1}^\infty\) be a sequence of functions where \(f_n:\mathcal{X}\mapsto\mathcal{Y}\).

  • Let \(f\) be another function \(f:\mathcal{X}\mapsto\mathcal{Y}\)

  • We say that the sequence is point-wise converging towards \(f\)

    \[\lim\limits_{n\to\infty}f_n=f\iff\forall x\in\mathcal{X}, \lim\limits_{n\to\infty}f_n(x)=f(x)\]
Uniform Convergence

Note

  • We can make a stronger convergence criterion for functions which map to a metric space \(\mathcal{Y}\) (i.e., where \(\sup\) makes sense).

  • We say that the sequence is uninformly converging towards \(f\)

    \[\lim\limits_{n\to\infty}f_n=f\iff\lim\limits_{n\to\infty}\sup\{|f_n(x)-f(x)|:x\in\mathcal{X}\}=0\]

Tip

  • All uniformly convergent functions are point-wise convergent.

  • The converse is not true.

Linear Transforms

Linear Transforms on Euclidean Space

Note

  • We consider an normalized set of basis vectors (i.e. of unit-length but not necessarily orthogonal) in \(\mathbb{R}^n\) for a finite dimensional vector space as

    \[\{\mathbf{a}_1,\cdots\mathbf{a}_n\}\]
  • We can find the projection of any vector \(\mathbf{u}\in\mathbb{R}^n\) onto each of the basis

    \[\langle\mathbf{a}_i,\mathbf{u}\rangle\]
  • Under the new basis, this gives the i-th co-ordinate for the result vector \(\mathbf{v}\)

    \[\begin{split}\begin{bmatrix}0\\\vdots\\v_i\\\vdots\\0\end{bmatrix}=\langle\mathbf{a}_i,\mathbf{u}\rangle\end{split}\]
  • We note that we can collect the basis vectors inside a matrix as rows and express the relation as

    \[\begin{split}\mathbf{v}=\begin{bmatrix}-&\mathbf{a_1^*}&-\\ \vdots&\vdots&\vdots\\ -&\mathbf{a_n^*}&-\\\end{bmatrix}\mathbf{u}=\mathbf{A}^T\mathbf{u}\end{split}\]
  • We also note that the final vector can be written as a sum

    \[\begin{split}\mathbf{v}=\sum_{i=1}^n\begin{bmatrix}0\\\vdots\\v_i\\\vdots\\0\end{bmatrix}=\sum_{i=1}^n\langle\mathbf{u},\mathbf{a}_i\rangle\end{split}\]
Function view

Note

  • Let \(\mathcal{I}=\{1,\cdots,d\}\) be the index set.

  • The vector \(\mathbf{u}\) defines a function \(u:\mathcal{I}\mapsto\mathbb{R}\in\mathcal{U}(\mathcal{I})\)

  • The basis vectors \(\mathbf{a}_k\) are also functions \(a_k:\mathcal{I}\mapsto\mathcal{H}\) where \(\mathcal{H}^{\mathcal{I}}\subseteq\mathbb{R}^d\) is the subspace spanned by the basis.

  • Then the matrix \(\mathbf{A}\) defines a linear transform \(A:\mathcal{H}\mapsto\mathcal{U}(\mathcal{I})\) where

    \[v(i)=(Au)(i)\]
Linear Transforms on Function Space

Note

  • We consider functions \(f:E\mapsto\mathbb{R}\in\mathcal{F}(E)\) (similar to \(\mathcal{U}(\mathcal{I})\)) where \(E\) is any abstract space.

  • We consider basis functions \(h:E\mapsto\mathcal{H}\) where \(\mathcal{H}\) is a Hilbert space equipped with a norm

    \[\langle \cdot,\cdot\rangle_{\mathcal{H}}\]
  • For any \(x\in E\), the projection onto a basis is given by

    \[f(x)=\langle f,h(x)\rangle_{\mathcal{H}}\]
  • This defines a linear transform \(L:\mathcal{H}\mapsto\mathcal{F}(E)\) where

    \[f(\cdot)=(Lf)(\cdot)\]
Kernel of a Linear Transform

Note

  • The kernel of a linear transform is found by inner products of the basis functions.

  • For finite dimensional case \(K:\mathcal{I}\times\mathcal{I}\mapsto\mathbb{R}\)

    \[K(i,j)=\langle\mathbf{a}^*_i,\mathbf{a}^*_j\rangle\]
  • For the functional case \(K:E\times E\mapsto\mathbb{R}\)

    \[K(x,y)=\langle h(y),h(x)\rangle_{\mathcal{H}}\]
  • For any fixed \(y_n\in E\), \(K(x,y_n)\) is a function of just \(x\in E\) in \(\mathcal{F}(E)\)

  • Therefore, often the basis functions are referred by just the kernel itself as

    \[K(\cdot,x)=h(x)\]

    and

    \[K(x,y)=\langle K(\cdot,x),K(\cdot,y)\rangle\]
  • Symmetry: \(K(x,y)=K(y,x)\) due to the symmetry of the inner product.

  • Reproducing Property:

    • Function evaluation is inner product: \(f(x)=\langle f,K(\cdot,x)\rangle_{\mathcal{H}}\)

    • Therefore, knowing the kernel, we can recover any function.

  • Any function \(g\in\mathcal{F}(E)\) can be expressed as a linear combination of the basis

    \[g(x)=\sum_m\alpha_m K(x,y_m)\]

Tip

  • We note that when the matrix is the centered, normalised data matrix, then the kernel gives the sample covariance matrix.

  • This hints as the usability of functional kernels for covariance functions for infinite dimensional Gaussian distributions (GPs).

Moore-Aronszajn Theorem

Warning

  • We assume that the kernel is positive definite.

  • Then there exists a unique \(\mathcal{H}_k\) for which \(K\) is the kernel basis.

Mercer Theorem - Eigenfunctions

Note

TODO

Other Integral Transforms

General Form

Note

TODO

Fourier Basis

Note

  • For “well-behaved” (i.e. square-integrable so that one can define \(L_2\) norm as per above) periodic functions, we can have basis functions of odd and even frequencies.

  • Schauder basis (allows for infinite sum over basis):

    • A basis for functions in \(L^2([0,1],\mathbb{R})\) can be defined in terms of an infinite set of orthonormal functions`

      \[\{1, (\sqrt{2}\sin(2\pi nx))_{n=1}^\infty, (\sqrt{2}\cos(2\pi nx))_{n=1}^\infty\}\]
    • The \(\sin\) functions account for odd-frequencies and the \(\cos\) functions account for even-frequencies.

  • Here we have 3 sets of basis functions, so we use 3 different kinds of normalised-projection co-efficients, \(a_0,a_i,b_i\)

    \[f(x)=a_0\cdot1+\sum_{n=1}^\infty a_i\cdot\cos(2\pi nx)+\sum_{n=1}^\infty b_i\cdot\sin(2\pi nx)\]
  • \(a_0\) computes the projection of \(f(x)\) onto the constant function \(1\).

    \[a_0=\frac{\int_\limits{[0,1]}1\cdot f(x)\mathop{dx}}{\int_\limits{[0,1]}1\cdot 1\mathop{dx}}=\int_\limits{[0,1]}f(x)\mathop{dx}\]
  • For each \(k>0\), \(a_k\) computes the projection of \(f(x)\) onto the even frequencies, \(\sqrt{2}\cos(2\pi nx)\).

    \[a_k=\frac{\int_\limits{[0,1]}f(x)\cdot\sqrt{2}\cos(2\pi kx)\mathop{dx}}{\int_\limits{[0,1]}\sqrt{2}\cos(2\pi kx)\cdot\sqrt{2}\cos(2\pi kx)\mathop{dx}}\]
  • Similarly, for \(b_k\).

Useful Function Spaces

Sobolev Space

Holder Space

Reproducing kernel Hilbert space