Linear Transform¶

Linear Transform in Vector Space¶

Note

Let \(U\) and \(W\) be two vector spaces over the same scalar field \(\mathcal{F}\).
A linear transform is a function \(T:U\mapsto W\) if
- \(\forall\mathbf{u},\mathbf{v}\in U, T(\mathbf{u}+\mathbf{v})=T(\mathbf{u})+T(\mathbf{v})\)
- \(\forall c\in\mathcal{F},\forall\mathbf{u}\in U, T(c\cdot\mathbf{u})=c\cdot T(\mathbf{u})\)
This means, if we want to add or scale vecetors, it doesn’t matter whether we do it in the domain space before the mapping or in the range space after the mapping.

Tip

A linear transform is one-to-one when it’s onto.

Linear Operator¶

Tip

Linear transforms from \(U\) to \(U\) are called Linear Operators.
The set of all linear operators \(T:U\mapsto U\) is represented as \(L(U)\).

Space of Linear Transforms¶

Tip

The set of all linear transforms \(T:U\mapsto W\) is represented as \(L(U,W)\).

As a Vector Space over Addition? Always.¶

Composition of Linear Transforms¶

Note

We can define composite linear transforms in the usual way.
Let \(A:U\mapsto V\) and \(B:V\mapsto W\).
Then \((B\circ A)\in L(U,W)\) where \(\forall\mathbf{u}\in U, (B\circ A)(\mathbf{u})=B(A(\mathbf{u}))\).

As a Vector Space over Composition? Not Guaranteed.¶

Examples¶

Scalar Multiplication as a Linear Transform¶

Tip

For every scalar \(\alpha\in\mathbb{R}\), we can define a unique linear operator in \(L(\mathbb{R})\) with its already defined multiplication operator as \(\alpha:\mathbb{R}\mapsto\mathbb{R}\) where \(\forall x\in\mathbb{R}, \alpha(x)=\alpha\cdot x\).
We note that
- \(\forall u,v\in \mathbb{R}, \alpha(u+v)=\alpha(u)+\alpha(v)\)
- \(\forall c\in\mathbb{R},\forall u\in \mathbb{R}, \alpha(c\cdot u)=c\cdot\alpha(u)\)

Matrix-Vector Multiplication as Linear Transform¶

Tip

The matrix \(\mathbf{A}\) is a linear transform which maps \(\mathbb{C}^n\) dimensional vectors to \(\mathbb{C}^m\) dimensional vectors.

\[\mathbf{A}:\mathbb{C}^n\mapsto\mathbb{C}^m\]
The range of this transform is the column space of this transform

\[C(\mathbf{A})=\{\mathbf{A}\mathbf{x}\mathop{|}\forall \mathbf{x}\in\mathbb{C}^n\}\]
The transposed matrix \(\mathbf{A}^\top\) does the mapping the other way around (but it’s not necessarily the inverse transform)

\[\mathbf{A}^\top:\mathbb{C}^m\mapsto\mathbb{C}^n\]
The range of the transpose transform is the row space of \(\mathbf{A}\)

\[C(\mathbf{A}^\top)=\{\mathbf{A}^\top\mathbf{y}\mathop{|}\forall \mathbf{y}\in\mathbb{C}^m\}\]

Differentiation as a Linear Transform¶

Tip

Let \(X\) and \(Y\) be two vector spaces over \(\mathbb{R}\).
Let \(\mathcal{F}=\{f\mathop{|} f:X\mapsto Y\}\) be the set of all function from \(X\) into \(Y\).
Let \(\mathcal{D}=\{g\mathop{|} g:X\mapsto Y\}\) be the set of all differentiable functions.
The differentiation operator \(D=\frac{\mathop{d}}{\mathop{dx}}(\cdot)\) on its own defines a function \(D:\mathcal{G}\mapsto\mathcal{F}\).

\[D(g\in\mathcal{G})=g'\in\mathcal{F}\]
- Here \(g'(x)=\frac{\mathop{d}}{\mathop{dx}}(g)(x)\).
\(D\) applied on some specific function \(g\) (ready to be evaluated on any \(x\)) defines a linear transform \(D(g)\in\mathcal{F}:X\mapsto Y\) since
- \(\forall g_1,g_2\in \mathcal{G}, D(g_1+g_2)=D(g_1)+D(g_2)\)
- \(\forall c\in \mathbb{R},\forall g\in \mathcal{G}, D(c\cdot g)=c\cdot D(g)\)

Integration as a Linear Transform¶

Tip

Let \(X\) and \(Y\) be two vector spaces over \(\mathbb{R}\).
Let \(\mathcal{F}=\{f\mathop{|} f:X\mapsto Y\}\) be the set of all function from \(X\) into \(Y\).
Let \(\mathcal{I}=\{g\mathop{|} g:X\mapsto Y\}\) be the set of all integrable functions.
The integration operator \(I=\int(\cdot)\mathop{dx}\) on its own defines a function \(I:\mathcal{I}\mapsto\mathcal{F}\).

\[I(g\in\mathcal{I})=G\in\mathcal{F}\]
- Here \(G(x)=\int g(x)\mathop{dx}\).
\(I\) applied on some specific function \(g\) (ready to be evaluated on any \(x\)) defines a linear transform \(I(g)\in\mathcal{F}:X\mapsto Y\) since
- \(\forall g_1,g_2\in \mathcal{I}, I(g_1+g_2)=I(g_1)+I(g_2)\)
- \(\forall c\in \mathbb{R},\forall g\in \mathcal{I}, I(c\cdot g)=c\cdot I(g)\)

Matrix-vector multiplication¶

Let \(\mathbf{A}\) be a \(m\times n\) matrix.

Column view: \(\mathbf{a}_k\in\mathbb{R}^m\) are column vectors

\[\begin{split}\mathbf{A}=\begin{bmatrix} | & \cdots & |\\ \mathbf{a}_1 & \cdots & \mathbf{a}_n\\ | & \cdots & |\\ \end{bmatrix}\end{split}\]
Row view: \((\mathbf{a}^*_k)^\top\in\mathbb{R}^n\) are row vectors

\[\begin{split}\mathbf{A}=\begin{bmatrix}-&\mathbf{a}^*_1&-\\&\vdots&\\-&\mathbf{a}^*_m&-\end{bmatrix}\end{split}\]

Let \(\mathbf{x}\in\mathbb{R}^n\) be a column vector which can also be thought of as a \(n\times 1\) matrix

\[\begin{split}\mathbf{x}=(x_1,\cdots,x_n)^\top=\begin{bmatrix} x_1\\ \vdots\\ x_n \end{bmatrix}\end{split}\]

Note

Column view: The multiplication \(\mathbf{A}\mathbf{x}\) is a combination of the column vectors of \(\mathbf{A}\), where each vector \(\mathbf{a}_k\) is scaled as per \(x_k\).

\[\begin{split}\mathbf{A}\mathbf{x}=\begin{bmatrix} | & \cdots & |\\ \mathbf{a}_1 & \cdots & \mathbf{a}_n\\ | & \cdots & |\\ \end{bmatrix}\begin{bmatrix}x_1\\\vdots\\x_n\end{bmatrix}=x_1\begin{bmatrix}|\\ \mathbf{a}_1\\|\end{bmatrix}+\cdots+x_n\begin{bmatrix}|\\ \mathbf{a}_n\\|\end{bmatrix}\end{split}\]
Row view: It can also be thought of the collection of inner products with each row vectors

\[\begin{split}\mathbf{A}\mathbf{x}=\begin{bmatrix}\langle(\mathbf{a}^*_1)^\top,\mathbf{x}\rangle\\\vdots\\\langle(\mathbf{a}^*_m)^\top,\mathbf{x}\rangle\end{bmatrix}\end{split}\]

Attention

The equation \(\mathbf{A}\mathbf{x}=\mathbf{b}\) has a unique solution if \(\mathbf{b}\in C(\mathbf{A})\).

Matrix-matrix multiplication¶

Let \(\mathbf{A}\) be the matrix as before and let \(\mathbf{B}\) be a \(n\times p\) matrix written as a collection of rows similar to a vector

\[\begin{split}\mathbf{B}=\begin{bmatrix}-&\mathbf{b}^*_1&-\\&\vdots&\\-&\mathbf{b}^*_n&-\end{bmatrix}\end{split}\]

where \(\mathbf{b}^*_k\in\mathbb{R}^p\) are the row vectors.

Note

The multiplication \(\mathbf{A}\mathbf{B}\) is the sum of outer products \(\mathbf{u}\mathbf{v}^\top=\mathbf{a}_k \mathbf{b}^*_k\)

\[\begin{split}\mathbf{A}\mathbf{B}=\begin{bmatrix} | & \cdots & |\\ \mathbf{a}_1 & \cdots & \mathbf{a}_n\\ | & \cdots & |\\ \end{bmatrix}\begin{bmatrix}-&\mathbf{b}^*_1&-\\&\vdots&\\-&\mathbf{b}^*_n&-\end{bmatrix}=\begin{bmatrix}|\\ \mathbf{a}_1\\|\end{bmatrix}\begin{bmatrix}-&\mathbf{b}^*_1&-\end{bmatrix}+\cdots+\begin{bmatrix}|\\ \mathbf{a}_n\\|\end{bmatrix}\begin{bmatrix}-&\mathbf{b}^*_n&-\end{bmatrix}\end{split}\]

Independence, Rank, Inverse Mapping, Basis and Fundamental Subspaces¶

Independence¶

Note

Vector \(\mathbf{u}\) is linearly independent of vector \(\mathbf{v}\) if they are not in the same direction.
- There is no scalar \(a\in\mathbb{R}\) such that \(\mathbf{u}=a\mathbf{v}\)
Vector \(\mathbf{w}\) is linearly independent of vectors \(\mathbf{u}\) and \(\mathbf{v}\) if it is not in the same place spanned by these.
- There are no scalars \(a,b\in\mathbb{R}\) such that \(\mathbf{w}=a\mathbf{u}+b\mathbf{v}\)
Extends naturally for more dimensions.

Rank¶

Rank determines whether the linear transform \(\mathbf{A}\) defines a mapping which is onto or into.

Note

The number of independent column vectors in a matrix \(\mathbf{A}\) is the column-rank.
The number of independent row vectors in a matrix \(\mathbf{A}\) is the row-rank.

Attention

For any matrix \(\mathbf{A}\), column-rank and row-rank are the same, and it is called the rank of a matrix, \(r\leq m\) and \(r\leq n\).
\(r\) is the dimensionality of the column-space \(C(\mathbf{A})\) as well as the row-space \(C(\mathbf{A}^\top)\).
If \(m=n=r\), then the matrix is full-rank.

Inverse Mapping¶

Note

A full rank matrix \(\mathbf{A}:\mathbb{R}^n\mapsto\mathbb{R}^n\) defines a onto mapping, i.e. it spans the entire range.
In such cases, the operation is one-to-one as well. There are no two vectors in the domain which maps to the same vector in the range space.
We can define an inverse transform in this case as \(\mathbf{A}^{-1}:\mathbb{R}^n\mapsto\mathbb{R}^n\).

Basis¶

Note

For a matrix \(\mathbf{A}\) of rank \(r\), there are \(r\) independent column vectors which span \(\mathbb{R}^r\).
These column vectors form one basis of the column space.
We note that these don’t necessarily have to be orthogonal.

Attention

There can be multiple basis vectors for a matrix which span the same column space.

Fundamental Subspaces¶

Note

We define the null-space of \(\mathbf{A}:\mathbb{R}^n\mapsto\mathbb{R}^m\) as the subspace in the domain \(\mathbb{R}^n\) which maps to \(\mathbf{0}\) in the range \(\mathbb{R}^m\).

\[N(\mathbf{A})\subseteq \mathbb{R}^n\]
The vectors in the null-space span a \(n-r\) dimensional space where \(r\) is the rank of the matrix.
- We prefer the basis for the null-space to be orthogonal although it’s not a necessity.
The right-null-space is defined as the null-space of the transposed transform \(\mathbf{A}^\top\).

Attention

\(\dim(C(\mathbf{A}))=r\) and \(\dim(N(\mathbf{A}^\top))=m-r\)
\(\dim(C(\mathbf{A}^\top))=r\) and \(\dim(N(\mathbf{A}))=n-r\)

Orthogonality¶

Orthogonal vectors¶

Note

Two vectors \(\mathbf{u}\) and \(\mathbf{v}\) are orthogonal if \(\mathbf{u}^\top\mathbf{v}=0\).

Tip

Pythagoras: For \(\mathbf{x}\mathop{\bot}\mathbf{y}\)

\[||\mathbf{x}-\mathbf{y}||=(\mathbf{x}-\mathbf{y})^\top(\mathbf{x}-\mathbf{y})=\mathbf{x}^\top\mathbf{x}+\mathbf{y}^\top\mathbf{y}-\mathbf{x}^\top\mathbf{y}-\mathbf{y}^\top\mathbf{x}=\mathbf{x}^\top\mathbf{x}+\mathbf{y}^\top\mathbf{y}=||\mathbf{x}||+||\mathbf{y}||\]
In general, \(\mathbf{x}^\top\mathbf{y}=||\mathbf{x}||\cdot||\mathbf{y}||\cdot\cos\theta\)

Attention

If \(\mathbf{x}\in N(\mathbf{A})\), then for any \(k\), \(\mathbf{a}^*_k\mathop{\bot}\mathbf{x}\) as \((\mathbf{a}^*_k)^\top\mathbf{x}=0\).
Therefore, any vector in the null-space cannot be spanned by the row-space of \(\mathbf{A}\).

Orthonormal vectors¶

Note

Orthogonal vectors such that \(||\mathbf{u}||=1\).

Matrix with orthonormal columns¶

Note

Written as \(\mathbf{Q}\).
We note that \(\mathbf{Q}^\top\mathbf{Q}=\mathbf{I}\).
Doesn’t change the length: \(||\mathbf{Q}\mathbf{x}||=||\mathbf{x}||\) but might lose/gain a few dimensions though based on the dimensionality of \(\mathbf{Q}\).

\[||\mathbf{Q}\mathbf{x}||=(\mathbf{Q}\mathbf{x})^\top(\mathbf{Q}\mathbf{x})=\mathbf{x}^\top(\mathbf{Q}^\top\mathbf{Q})\mathbf{x}=\mathbf{x}^\top\mathbf{x}=||\mathbf{x}||\]
If \(\mathbf{Q}_1\) and \(\mathbf{Q}_2\) are matrices with orthonormal columns, then \(\mathbf{Q}=\mathbf{Q}_1\mathbf{Q}_2\) is also a matrix with orthonormal columns.

\[\mathbf{Q}^\top\mathbf{Q}=(\mathbf{Q}_1\mathbf{Q}_2)^\top(\mathbf{Q}_1\mathbf{Q}_2)=\mathbf{Q}_2^\top(\mathbf{Q}_1^\top\mathbf{Q}_1)\mathbf{Q}_2=\mathbf{Q}_2^\top\mathbf{Q}_2=\mathbf{I}\]

Projection matrices¶

Note

Any matrix that can be factorised as \(\mathbf{P}=\mathbf{Q}\mathbf{Q}^\top\) is a projection matrix.
For any vector \(\mathbf{v}\), \(\mathbf{P}\mathbf{v}\) is the orthogonal projection onto the column space of \(\mathbf{P}\).
Any vector \(\mathbf{v}\) can be broken into two parts
- Projection \(\mathbf{P}\mathbf{v}\)
- Error \(\mathbf{v}-\mathbf{P}\mathbf{v}\)

Attention

Repeated projection doesn’t change anything

\[\mathbf{P}^2=(\mathbf{Q}\mathbf{Q}^\top)(\mathbf{Q}\mathbf{Q}^\top)=\mathbf{Q}(\mathbf{Q}^\top\mathbf{Q})\mathbf{Q}^\top=\mathbf{Q}\mathbf{Q}^\top=\mathbf{P}\]
Projection matrices are symmetric

\[\mathbf{P}^\top=(\mathbf{Q}\mathbf{Q}^\top)^\top=(\mathbf{Q}^\top)^\top\mathbf{Q}^\top=\mathbf{Q}\mathbf{Q}^\top=\mathbf{P}\]

Orthogonal matrices¶

Note

Square matrices with orthonormal columns.

Attention

We have \(\mathbf{Q}^\top=\mathbf{Q}^{-1}\) since

\[\mathbf{Q}^\top\mathbf{Q}=\mathbf{Q}\mathbf{Q}^\top=\mathbf{I}\]
They represent a pure rotation or reflection in \(\mathbb{R}^n\) as neither the length or the dimensionality changes of any vector under this transformation.
- Positive determinant implies rotation, negative determinant implies reflection (as the orientation changes).

Orthonormal basis¶

Note

Standard co-ordinate vectors are an example of orthonormal basis.
It’s not necessary for basis vectors to be orthonormal but it’s desired.
For orthonormal basis, we can obtain the scalar along each component independently.
- Let the orthogonal basis vectors are \(\mathbf{q}_1,\cdots,\mathbf{q}_n\). Then any vector \(\mathbf{v}\in\mathbb{R}^n\) can be expressed as
  
  \[\mathbf{v}=c_1\mathbf{q}_1+\cdots+c_n\mathbf{q}_n\]
- The scalar along any \(\mathbf{q}_k\) can be obtained as \(c_k=\mathbf{q}_k^\top\mathbf{v}\) since
  
  \[\mathbf{q}_k^\top\mathbf{v}=c_1\mathbf{q}_k^\top\mathbf{q}_1+\cdots+c_k\mathbf{q}_k^\top\mathbf{q}_k+\cdots+c_n\mathbf{q}_k^\top\mathbf{q}_n=c_1\cdot0+\cdots+c_k\cdot1+\cdots+c_n\cdot0=c_k\]

Tip

We can create an orthogonal matrix \(\mathbf{Q}\) with the basis vectors as columns. Then all these coefficients can be found using \(\mathbf{Q}\mathbf{v}\).

Orthogonal subspace¶

Attention

\(C(\mathbf{A})\mathop{\bot} N(\mathbf{A}^\top)\) and \(C(\mathbf{A}^\top)\mathop{\bot} N(\mathbf{A})\)
\(\mathbf{A}:\text{span}\left(C(\mathbf{A}^\top)\mathop{\cup} N(\mathbf{A})\right)=\mathbb{R}^n\mapsto \text{span}\left(C(\mathbf{A})\mathop{\cup} N(\mathbf{A}^\top)\right)=\mathbb{R}^m\)