Linear Transform¶
Linear Transform in Vector Space¶
Note
Let \(U\) and \(W\) be two vector spaces over the same scalar field \(\mathcal{F}\).
A linear transform is a function \(T:U\mapsto W\) if
\(\forall\mathbf{u},\mathbf{v}\in U, T(\mathbf{u}+\mathbf{v})=T(\mathbf{u})+T(\mathbf{v})\)
\(\forall c\in\mathcal{F},\forall\mathbf{u}\in U, T(c\cdot\mathbf{u})=c\cdot T(\mathbf{u})\)
This means, if we want to add or scale vecetors, it doesn’t matter whether we do it in the domain space before the mapping or in the range space after the mapping.
Tip
A linear transform is one-to-one when it’s onto.
Linear Operator¶
Tip
Linear transforms from \(U\) to \(U\) are called Linear Operators.
The set of all linear operators \(T:U\mapsto U\) is represented as \(L(U)\).
Space of Linear Transforms¶
Tip
The set of all linear transforms \(T:U\mapsto W\) is represented as \(L(U,W)\).
As a Vector Space over Addition? Always.¶
See also
Let’s consider \(A,B\in L(U,W)\).
We can define a commutative addition in \(L(U,V)\) with the same scalar multiplication of \(W\).
Let \(C=(a\cdot A+b\cdot B)\) where for any \(a,b\in\mathcal{F}\) we have
\[\forall\mathbf{u}\in U, C(\mathbf{u})=(a\cdot A+b\cdot B)(\mathbf{u})=a\cdot A(\mathbf{u})+b\cdot B(\mathbf{u})\]We note that \(C\in L(U,W)\).
We also define an identity operator \(0_L\in L(U,W)\) such that \(\forall \mathbf{u}, 0_L(\mathbf{u})=\mathbf{0}\).
We note that \(A+0_L=0_L+A=A\).
We can define a unique additive inverse \(-A:U\mapsto W\).
\[A(\mathbf{u})+-A(\mathbf{u})=-A(\mathbf{u})+A(\mathbf{u})=0_L(\mathbf{u})=\mathbf{0}\]
Composition of Linear Transforms¶
Note
We can define composite linear transforms in the usual way.
Let \(A:U\mapsto V\) and \(B:V\mapsto W\).
Then \((B\circ A)\in L(U,W)\) where \(\forall\mathbf{u}\in U, (B\circ A)(\mathbf{u})=B(A(\mathbf{u}))\).
As a Vector Space over Composition? Not Guaranteed.¶
See also
Let’s consider \(A,B\in L(U)\).
We can consider \(\circ\) as an “addition” in \(L(U,V)\) with the same scalar multiplication of \(U\).
Let \(C=((b\cdot B)\circ (a\cdot A))\in L(U)\) where for any \(a,b\in\mathcal{F}\) we have
\[\forall\mathbf{u}\in U, C(\mathbf{u})=((b\cdot B)\circ (a\cdot A))(\mathbf{u})=ab\cdot B(A(\mathbf{u}))\]
We define the identity operator \(I:U\mapsto U\) such that \(\forall \mathbf{u}, I(\mathbf{u})=\mathbf{u}\).
We note that \(A\circ I = I\circ A = A\)
If the transform is onto, then we can define a unique composition inverse \(A^{-1}:U\mapsto U\) such that
\[(A\circ A^{-1})(\mathbf{u}) = (A^{-1}\circ A)(\mathbf{u}) = I(\mathbf{u}) = \mathbf{u}\]
Warning
HOWEVER, The composition operator is not always commutative.
It is generally NOT true that \((A\circ B)(\mathbf{u})=(B\circ A)(\mathbf{u})\).
Example where it IS commutative:
Let \(\mathbf{A}\) and \(\mathbf{B}\) be matrices with the same eigenvectors and possibly different eigenvalues.
In this case, the composition is commutative.
We note that this is a sufficient but not a necessary condition.
Attention
The composition operator, therefore, is better thought of as a multiplication.
Together with the addition and multiplication, the space of linear operators follows the structure of a ring.
Examples¶
Scalar Multiplication as a Linear Transform¶
Tip
For every scalar \(\alpha\in\mathbb{R}\), we can define a unique linear operator in \(L(\mathbb{R})\) with its already defined multiplication operator as \(\alpha:\mathbb{R}\mapsto\mathbb{R}\) where \(\forall x\in\mathbb{R}, \alpha(x)=\alpha\cdot x\).
We note that
\(\forall u,v\in \mathbb{R}, \alpha(u+v)=\alpha(u)+\alpha(v)\)
\(\forall c\in\mathbb{R},\forall u\in \mathbb{R}, \alpha(c\cdot u)=c\cdot\alpha(u)\)
Matrix-Vector Multiplication as Linear Transform¶
Tip
The matrix \(\mathbf{A}\) is a linear transform which maps \(\mathbb{C}^n\) dimensional vectors to \(\mathbb{C}^m\) dimensional vectors.
\[\mathbf{A}:\mathbb{C}^n\mapsto\mathbb{C}^m\]The range of this transform is the column space of this transform
\[C(\mathbf{A})=\{\mathbf{A}\mathbf{x}\mathop{|}\forall \mathbf{x}\in\mathbb{C}^n\}\]The transposed matrix \(\mathbf{A}^\top\) does the mapping the other way around (but it’s not necessarily the inverse transform)
\[\mathbf{A}^\top:\mathbb{C}^m\mapsto\mathbb{C}^n\]The range of the transpose transform is the row space of \(\mathbf{A}\)
\[C(\mathbf{A}^\top)=\{\mathbf{A}^\top\mathbf{y}\mathop{|}\forall \mathbf{y}\in\mathbb{C}^m\}\]
Differentiation as a Linear Transform¶
Tip
Let \(X\) and \(Y\) be two vector spaces over \(\mathbb{R}\).
Let \(\mathcal{F}=\{f\mathop{|} f:X\mapsto Y\}\) be the set of all function from \(X\) into \(Y\).
Let \(\mathcal{D}=\{g\mathop{|} g:X\mapsto Y\}\) be the set of all differentiable functions.
The differentiation operator \(D=\frac{\mathop{d}}{\mathop{dx}}(\cdot)\) on its own defines a function \(D:\mathcal{G}\mapsto\mathcal{F}\).
\[D(g\in\mathcal{G})=g'\in\mathcal{F}\]Here \(g'(x)=\frac{\mathop{d}}{\mathop{dx}}(g)(x)\).
\(D\) applied on some specific function \(g\) (ready to be evaluated on any \(x\)) defines a linear transform \(D(g)\in\mathcal{F}:X\mapsto Y\) since
\(\forall g_1,g_2\in \mathcal{G}, D(g_1+g_2)=D(g_1)+D(g_2)\)
\(\forall c\in \mathbb{R},\forall g\in \mathcal{G}, D(c\cdot g)=c\cdot D(g)\)
Integration as a Linear Transform¶
Tip
Let \(X\) and \(Y\) be two vector spaces over \(\mathbb{R}\).
Let \(\mathcal{F}=\{f\mathop{|} f:X\mapsto Y\}\) be the set of all function from \(X\) into \(Y\).
Let \(\mathcal{I}=\{g\mathop{|} g:X\mapsto Y\}\) be the set of all integrable functions.
The integration operator \(I=\int(\cdot)\mathop{dx}\) on its own defines a function \(I:\mathcal{I}\mapsto\mathcal{F}\).
\[I(g\in\mathcal{I})=G\in\mathcal{F}\]Here \(G(x)=\int g(x)\mathop{dx}\).
\(I\) applied on some specific function \(g\) (ready to be evaluated on any \(x\)) defines a linear transform \(I(g)\in\mathcal{F}:X\mapsto Y\) since
\(\forall g_1,g_2\in \mathcal{I}, I(g_1+g_2)=I(g_1)+I(g_2)\)
\(\forall c\in \mathbb{R},\forall g\in \mathcal{I}, I(c\cdot g)=c\cdot I(g)\)
Matrix-vector multiplication¶
Let \(\mathbf{A}\) be a \(m\times n\) matrix.
Column view: \(\mathbf{a}_k\in\mathbb{R}^m\) are column vectors
\[\begin{split}\mathbf{A}=\begin{bmatrix} | & \cdots & |\\ \mathbf{a}_1 & \cdots & \mathbf{a}_n\\ | & \cdots & |\\ \end{bmatrix}\end{split}\]Row view: \((\mathbf{a}^*_k)^\top\in\mathbb{R}^n\) are row vectors
\[\begin{split}\mathbf{A}=\begin{bmatrix}-&\mathbf{a}^*_1&-\\&\vdots&\\-&\mathbf{a}^*_m&-\end{bmatrix}\end{split}\]
Let \(\mathbf{x}\in\mathbb{R}^n\) be a column vector which can also be thought of as a \(n\times 1\) matrix
\[\begin{split}\mathbf{x}=(x_1,\cdots,x_n)^\top=\begin{bmatrix} x_1\\ \vdots\\ x_n \end{bmatrix}\end{split}\]
Note
Column view: The multiplication \(\mathbf{A}\mathbf{x}\) is a combination of the column vectors of \(\mathbf{A}\), where each vector \(\mathbf{a}_k\) is scaled as per \(x_k\).
\[\begin{split}\mathbf{A}\mathbf{x}=\begin{bmatrix} | & \cdots & |\\ \mathbf{a}_1 & \cdots & \mathbf{a}_n\\ | & \cdots & |\\ \end{bmatrix}\begin{bmatrix}x_1\\\vdots\\x_n\end{bmatrix}=x_1\begin{bmatrix}|\\ \mathbf{a}_1\\|\end{bmatrix}+\cdots+x_n\begin{bmatrix}|\\ \mathbf{a}_n\\|\end{bmatrix}\end{split}\]Row view: It can also be thought of the collection of inner products with each row vectors
\[\begin{split}\mathbf{A}\mathbf{x}=\begin{bmatrix}\langle(\mathbf{a}^*_1)^\top,\mathbf{x}\rangle\\\vdots\\\langle(\mathbf{a}^*_m)^\top,\mathbf{x}\rangle\end{bmatrix}\end{split}\]
Attention
The equation \(\mathbf{A}\mathbf{x}=\mathbf{b}\) has a unique solution if \(\mathbf{b}\in C(\mathbf{A})\).
Matrix-matrix multiplication¶
Let \(\mathbf{A}\) be the matrix as before and let \(\mathbf{B}\) be a \(n\times p\) matrix written as a collection of rows similar to a vector
\[\begin{split}\mathbf{B}=\begin{bmatrix}-&\mathbf{b}^*_1&-\\&\vdots&\\-&\mathbf{b}^*_n&-\end{bmatrix}\end{split}\]
where \(\mathbf{b}^*_k\in\mathbb{R}^p\) are the row vectors.
Note
The multiplication \(\mathbf{A}\mathbf{B}\) is the sum of outer products \(\mathbf{u}\mathbf{v}^\top=\mathbf{a}_k \mathbf{b}^*_k\)
\[\begin{split}\mathbf{A}\mathbf{B}=\begin{bmatrix} | & \cdots & |\\ \mathbf{a}_1 & \cdots & \mathbf{a}_n\\ | & \cdots & |\\ \end{bmatrix}\begin{bmatrix}-&\mathbf{b}^*_1&-\\&\vdots&\\-&\mathbf{b}^*_n&-\end{bmatrix}=\begin{bmatrix}|\\ \mathbf{a}_1\\|\end{bmatrix}\begin{bmatrix}-&\mathbf{b}^*_1&-\end{bmatrix}+\cdots+\begin{bmatrix}|\\ \mathbf{a}_n\\|\end{bmatrix}\begin{bmatrix}-&\mathbf{b}^*_n&-\end{bmatrix}\end{split}\]
Independence, Rank, Inverse Mapping, Basis and Fundamental Subspaces¶
Independence¶
Note
Vector \(\mathbf{u}\) is linearly independent of vector \(\mathbf{v}\) if they are not in the same direction.
There is no scalar \(a\in\mathbb{R}\) such that \(\mathbf{u}=a\mathbf{v}\)
Vector \(\mathbf{w}\) is linearly independent of vectors \(\mathbf{u}\) and \(\mathbf{v}\) if it is not in the same place spanned by these.
There are no scalars \(a,b\in\mathbb{R}\) such that \(\mathbf{w}=a\mathbf{u}+b\mathbf{v}\)
Extends naturally for more dimensions.
Rank¶
Rank determines whether the linear transform \(\mathbf{A}\) defines a mapping which is onto or into.
Note
The number of independent column vectors in a matrix \(\mathbf{A}\) is the column-rank.
The number of independent row vectors in a matrix \(\mathbf{A}\) is the row-rank.
Attention
For any matrix \(\mathbf{A}\), column-rank and row-rank are the same, and it is called the rank of a matrix, \(r\leq m\) and \(r\leq n\).
\(r\) is the dimensionality of the column-space \(C(\mathbf{A})\) as well as the row-space \(C(\mathbf{A}^\top)\).
If \(m=n=r\), then the matrix is full-rank.
Inverse Mapping¶
Note
A full rank matrix \(\mathbf{A}:\mathbb{R}^n\mapsto\mathbb{R}^n\) defines a onto mapping, i.e. it spans the entire range.
In such cases, the operation is one-to-one as well. There are no two vectors in the domain which maps to the same vector in the range space.
We can define an inverse transform in this case as \(\mathbf{A}^{-1}:\mathbb{R}^n\mapsto\mathbb{R}^n\).
Basis¶
Note
For a matrix \(\mathbf{A}\) of rank \(r\), there are \(r\) independent column vectors which span \(\mathbb{R}^r\).
These column vectors form one basis of the column space.
We note that these don’t necessarily have to be orthogonal.
Attention
There can be multiple basis vectors for a matrix which span the same column space.
Fundamental Subspaces¶
Note
We define the null-space of \(\mathbf{A}:\mathbb{R}^n\mapsto\mathbb{R}^m\) as the subspace in the domain \(\mathbb{R}^n\) which maps to \(\mathbf{0}\) in the range \(\mathbb{R}^m\).
\[N(\mathbf{A})\subseteq \mathbb{R}^n\]The vectors in the null-space span a \(n-r\) dimensional space where \(r\) is the rank of the matrix.
We prefer the basis for the null-space to be orthogonal although it’s not a necessity.
The right-null-space is defined as the null-space of the transposed transform \(\mathbf{A}^\top\).
Attention
\(\dim(C(\mathbf{A}))=r\) and \(\dim(N(\mathbf{A}^\top))=m-r\)
\(\dim(C(\mathbf{A}^\top))=r\) and \(\dim(N(\mathbf{A}))=n-r\)
Orthogonality¶
Orthogonal vectors¶
Note
Two vectors \(\mathbf{u}\) and \(\mathbf{v}\) are orthogonal if \(\mathbf{u}^\top\mathbf{v}=0\).
Tip
Pythagoras: For \(\mathbf{x}\mathop{\bot}\mathbf{y}\)
\[||\mathbf{x}-\mathbf{y}||=(\mathbf{x}-\mathbf{y})^\top(\mathbf{x}-\mathbf{y})=\mathbf{x}^\top\mathbf{x}+\mathbf{y}^\top\mathbf{y}-\mathbf{x}^\top\mathbf{y}-\mathbf{y}^\top\mathbf{x}=\mathbf{x}^\top\mathbf{x}+\mathbf{y}^\top\mathbf{y}=||\mathbf{x}||+||\mathbf{y}||\]In general, \(\mathbf{x}^\top\mathbf{y}=||\mathbf{x}||\cdot||\mathbf{y}||\cdot\cos\theta\)
Attention
If \(\mathbf{x}\in N(\mathbf{A})\), then for any \(k\), \(\mathbf{a}^*_k\mathop{\bot}\mathbf{x}\) as \((\mathbf{a}^*_k)^\top\mathbf{x}=0\).
Therefore, any vector in the null-space cannot be spanned by the row-space of \(\mathbf{A}\).
Orthonormal vectors¶
Note
Orthogonal vectors such that \(||\mathbf{u}||=1\).
Matrix with orthonormal columns¶
Note
Written as \(\mathbf{Q}\).
We note that \(\mathbf{Q}^\top\mathbf{Q}=\mathbf{I}\).
Doesn’t change the length: \(||\mathbf{Q}\mathbf{x}||=||\mathbf{x}||\) but might lose/gain a few dimensions though based on the dimensionality of \(\mathbf{Q}\).
\[||\mathbf{Q}\mathbf{x}||=(\mathbf{Q}\mathbf{x})^\top(\mathbf{Q}\mathbf{x})=\mathbf{x}^\top(\mathbf{Q}^\top\mathbf{Q})\mathbf{x}=\mathbf{x}^\top\mathbf{x}=||\mathbf{x}||\]If \(\mathbf{Q}_1\) and \(\mathbf{Q}_2\) are matrices with orthonormal columns, then \(\mathbf{Q}=\mathbf{Q}_1\mathbf{Q}_2\) is also a matrix with orthonormal columns.
\[\mathbf{Q}^\top\mathbf{Q}=(\mathbf{Q}_1\mathbf{Q}_2)^\top(\mathbf{Q}_1\mathbf{Q}_2)=\mathbf{Q}_2^\top(\mathbf{Q}_1^\top\mathbf{Q}_1)\mathbf{Q}_2=\mathbf{Q}_2^\top\mathbf{Q}_2=\mathbf{I}\]
Projection matrices¶
Note
Any matrix that can be factorised as \(\mathbf{P}=\mathbf{Q}\mathbf{Q}^\top\) is a projection matrix.
For any vector \(\mathbf{v}\), \(\mathbf{P}\mathbf{v}\) is the orthogonal projection onto the column space of \(\mathbf{P}\).
Any vector \(\mathbf{v}\) can be broken into two parts
Projection \(\mathbf{P}\mathbf{v}\)
Error \(\mathbf{v}-\mathbf{P}\mathbf{v}\)
Attention
Repeated projection doesn’t change anything
\[\mathbf{P}^2=(\mathbf{Q}\mathbf{Q}^\top)(\mathbf{Q}\mathbf{Q}^\top)=\mathbf{Q}(\mathbf{Q}^\top\mathbf{Q})\mathbf{Q}^\top=\mathbf{Q}\mathbf{Q}^\top=\mathbf{P}\]Projection matrices are symmetric
\[\mathbf{P}^\top=(\mathbf{Q}\mathbf{Q}^\top)^\top=(\mathbf{Q}^\top)^\top\mathbf{Q}^\top=\mathbf{Q}\mathbf{Q}^\top=\mathbf{P}\]
Orthogonal matrices¶
Note
Square matrices with orthonormal columns.
Attention
We have \(\mathbf{Q}^\top=\mathbf{Q}^{-1}\) since
\[\mathbf{Q}^\top\mathbf{Q}=\mathbf{Q}\mathbf{Q}^\top=\mathbf{I}\]They represent a pure rotation or reflection in \(\mathbb{R}^n\) as neither the length or the dimensionality changes of any vector under this transformation.
Positive determinant implies rotation, negative determinant implies reflection (as the orientation changes).
Orthonormal basis¶
Note
Standard co-ordinate vectors are an example of orthonormal basis.
It’s not necessary for basis vectors to be orthonormal but it’s desired.
For orthonormal basis, we can obtain the scalar along each component independently.
Let the orthogonal basis vectors are \(\mathbf{q}_1,\cdots,\mathbf{q}_n\). Then any vector \(\mathbf{v}\in\mathbb{R}^n\) can be expressed as
\[\mathbf{v}=c_1\mathbf{q}_1+\cdots+c_n\mathbf{q}_n\]The scalar along any \(\mathbf{q}_k\) can be obtained as \(c_k=\mathbf{q}_k^\top\mathbf{v}\) since
\[\mathbf{q}_k^\top\mathbf{v}=c_1\mathbf{q}_k^\top\mathbf{q}_1+\cdots+c_k\mathbf{q}_k^\top\mathbf{q}_k+\cdots+c_n\mathbf{q}_k^\top\mathbf{q}_n=c_1\cdot0+\cdots+c_k\cdot1+\cdots+c_n\cdot0=c_k\]
Tip
We can create an orthogonal matrix \(\mathbf{Q}\) with the basis vectors as columns. Then all these coefficients can be found using \(\mathbf{Q}\mathbf{v}\).
Orthogonal subspace¶
Attention
\(C(\mathbf{A})\mathop{\bot} N(\mathbf{A}^\top)\) and \(C(\mathbf{A}^\top)\mathop{\bot} N(\mathbf{A})\)
\(\mathbf{A}:\text{span}\left(C(\mathbf{A}^\top)\mathop{\cup} N(\mathbf{A})\right)=\mathbb{R}^n\mapsto \text{span}\left(C(\mathbf{A})\mathop{\cup} N(\mathbf{A}^\top)\right)=\mathbb{R}^m\)