Linear Transform

Linear Transform in Vector Space

Note

  • Let \(U\) and \(W\) be two vector spaces over the same scalar field \(\mathcal{F}\).

  • A linear transform is a function \(T:U\mapsto W\) if

    • \(\forall\mathbf{u},\mathbf{v}\in U, T(\mathbf{u}+\mathbf{v})=T(\mathbf{u})+T(\mathbf{v})\)

    • \(\forall c\in\mathcal{F},\forall\mathbf{u}\in U, T(c\cdot\mathbf{u})=c\cdot T(\mathbf{u})\)

  • This means, if we want to add or scale vecetors, it doesn’t matter whether we do it in the domain space before the mapping or in the range space after the mapping.

Tip

A linear transform is one-to-one when it’s onto.

Linear Operator

Tip

  • Linear transforms from \(U\) to \(U\) are called Linear Operators.

  • The set of all linear operators \(T:U\mapsto U\) is represented as \(L(U)\).

Space of Linear Transforms

Tip

The set of all linear transforms \(T:U\mapsto W\) is represented as \(L(U,W)\).

As a Vector Space over Addition? Always.

See also

  • Let’s consider \(A,B\in L(U,W)\).

  • We can define a commutative addition in \(L(U,V)\) with the same scalar multiplication of \(W\).

    • Let \(C=(a\cdot A+b\cdot B)\) where for any \(a,b\in\mathcal{F}\) we have

      \[\forall\mathbf{u}\in U, C(\mathbf{u})=(a\cdot A+b\cdot B)(\mathbf{u})=a\cdot A(\mathbf{u})+b\cdot B(\mathbf{u})\]
    • We note that \(C\in L(U,W)\).

  • We also define an identity operator \(0_L\in L(U,W)\) such that \(\forall \mathbf{u}, 0_L(\mathbf{u})=\mathbf{0}\).

    • We note that \(A+0_L=0_L+A=A\).

  • We can define a unique additive inverse \(-A:U\mapsto W\).

    \[A(\mathbf{u})+-A(\mathbf{u})=-A(\mathbf{u})+A(\mathbf{u})=0_L(\mathbf{u})=\mathbf{0}\]

Composition of Linear Transforms

Note

  • We can define composite linear transforms in the usual way.

  • Let \(A:U\mapsto V\) and \(B:V\mapsto W\).

  • Then \((B\circ A)\in L(U,W)\) where \(\forall\mathbf{u}\in U, (B\circ A)(\mathbf{u})=B(A(\mathbf{u}))\).

As a Vector Space over Composition? Not Guaranteed.

See also

  • Let’s consider \(A,B\in L(U)\).

  • We can consider \(\circ\) as an “addition” in \(L(U,V)\) with the same scalar multiplication of \(U\).

    • Let \(C=((b\cdot B)\circ (a\cdot A))\in L(U)\) where for any \(a,b\in\mathcal{F}\) we have

      \[\forall\mathbf{u}\in U, C(\mathbf{u})=((b\cdot B)\circ (a\cdot A))(\mathbf{u})=ab\cdot B(A(\mathbf{u}))\]
  • We define the identity operator \(I:U\mapsto U\) such that \(\forall \mathbf{u}, I(\mathbf{u})=\mathbf{u}\).

    We note that \(A\circ I = I\circ A = A\)

  • If the transform is onto, then we can define a unique composition inverse \(A^{-1}:U\mapsto U\) such that

    \[(A\circ A^{-1})(\mathbf{u}) = (A^{-1}\circ A)(\mathbf{u}) = I(\mathbf{u}) = \mathbf{u}\]

Warning

  • HOWEVER, The composition operator is not always commutative.

    • It is generally NOT true that \((A\circ B)(\mathbf{u})=(B\circ A)(\mathbf{u})\).

  • Example where it IS commutative:

    • Let \(\mathbf{A}\) and \(\mathbf{B}\) be matrices with the same eigenvectors and possibly different eigenvalues.

    • In this case, the composition is commutative.

    • We note that this is a sufficient but not a necessary condition.

Attention

  • The composition operator, therefore, is better thought of as a multiplication.

  • Together with the addition and multiplication, the space of linear operators follows the structure of a ring.

Examples

Scalar Multiplication as a Linear Transform

Tip

  • For every scalar \(\alpha\in\mathbb{R}\), we can define a unique linear operator in \(L(\mathbb{R})\) with its already defined multiplication operator as \(\alpha:\mathbb{R}\mapsto\mathbb{R}\) where \(\forall x\in\mathbb{R}, \alpha(x)=\alpha\cdot x\).

  • We note that

    • \(\forall u,v\in \mathbb{R}, \alpha(u+v)=\alpha(u)+\alpha(v)\)

    • \(\forall c\in\mathbb{R},\forall u\in \mathbb{R}, \alpha(c\cdot u)=c\cdot\alpha(u)\)

Matrix-Vector Multiplication as Linear Transform

Tip

  • The matrix \(\mathbf{A}\) is a linear transform which maps \(\mathbb{C}^n\) dimensional vectors to \(\mathbb{C}^m\) dimensional vectors.

    \[\mathbf{A}:\mathbb{C}^n\mapsto\mathbb{C}^m\]
  • The range of this transform is the column space of this transform

    \[C(\mathbf{A})=\{\mathbf{A}\mathbf{x}\mathop{|}\forall \mathbf{x}\in\mathbb{C}^n\}\]
  • The transposed matrix \(\mathbf{A}^\top\) does the mapping the other way around (but it’s not necessarily the inverse transform)

    \[\mathbf{A}^\top:\mathbb{C}^m\mapsto\mathbb{C}^n\]
  • The range of the transpose transform is the row space of \(\mathbf{A}\)

    \[C(\mathbf{A}^\top)=\{\mathbf{A}^\top\mathbf{y}\mathop{|}\forall \mathbf{y}\in\mathbb{C}^m\}\]

Differentiation as a Linear Transform

Tip

  • Let \(X\) and \(Y\) be two vector spaces over \(\mathbb{R}\).

  • Let \(\mathcal{F}=\{f\mathop{|} f:X\mapsto Y\}\) be the set of all function from \(X\) into \(Y\).

  • Let \(\mathcal{D}=\{g\mathop{|} g:X\mapsto Y\}\) be the set of all differentiable functions.

  • The differentiation operator \(D=\frac{\mathop{d}}{\mathop{dx}}(\cdot)\) on its own defines a function \(D:\mathcal{G}\mapsto\mathcal{F}\).

    \[D(g\in\mathcal{G})=g'\in\mathcal{F}\]
    • Here \(g'(x)=\frac{\mathop{d}}{\mathop{dx}}(g)(x)\).

  • \(D\) applied on some specific function \(g\) (ready to be evaluated on any \(x\)) defines a linear transform \(D(g)\in\mathcal{F}:X\mapsto Y\) since

    • \(\forall g_1,g_2\in \mathcal{G}, D(g_1+g_2)=D(g_1)+D(g_2)\)

    • \(\forall c\in \mathbb{R},\forall g\in \mathcal{G}, D(c\cdot g)=c\cdot D(g)\)

Integration as a Linear Transform

Tip

  • Let \(X\) and \(Y\) be two vector spaces over \(\mathbb{R}\).

  • Let \(\mathcal{F}=\{f\mathop{|} f:X\mapsto Y\}\) be the set of all function from \(X\) into \(Y\).

  • Let \(\mathcal{I}=\{g\mathop{|} g:X\mapsto Y\}\) be the set of all integrable functions.

  • The integration operator \(I=\int(\cdot)\mathop{dx}\) on its own defines a function \(I:\mathcal{I}\mapsto\mathcal{F}\).

    \[I(g\in\mathcal{I})=G\in\mathcal{F}\]
    • Here \(G(x)=\int g(x)\mathop{dx}\).

  • \(I\) applied on some specific function \(g\) (ready to be evaluated on any \(x\)) defines a linear transform \(I(g)\in\mathcal{F}:X\mapsto Y\) since

    • \(\forall g_1,g_2\in \mathcal{I}, I(g_1+g_2)=I(g_1)+I(g_2)\)

    • \(\forall c\in \mathbb{R},\forall g\in \mathcal{I}, I(c\cdot g)=c\cdot I(g)\)

Matrix-vector multiplication

Let \(\mathbf{A}\) be a \(m\times n\) matrix.

  • Column view: \(\mathbf{a}_k\in\mathbb{R}^m\) are column vectors

    \[\begin{split}\mathbf{A}=\begin{bmatrix} | & \cdots & |\\ \mathbf{a}_1 & \cdots & \mathbf{a}_n\\ | & \cdots & |\\ \end{bmatrix}\end{split}\]
  • Row view: \((\mathbf{a}^*_k)^\top\in\mathbb{R}^n\) are row vectors

    \[\begin{split}\mathbf{A}=\begin{bmatrix}-&\mathbf{a}^*_1&-\\&\vdots&\\-&\mathbf{a}^*_m&-\end{bmatrix}\end{split}\]

Let \(\mathbf{x}\in\mathbb{R}^n\) be a column vector which can also be thought of as a \(n\times 1\) matrix

\[\begin{split}\mathbf{x}=(x_1,\cdots,x_n)^\top=\begin{bmatrix} x_1\\ \vdots\\ x_n \end{bmatrix}\end{split}\]

Note

  • Column view: The multiplication \(\mathbf{A}\mathbf{x}\) is a combination of the column vectors of \(\mathbf{A}\), where each vector \(\mathbf{a}_k\) is scaled as per \(x_k\).

    \[\begin{split}\mathbf{A}\mathbf{x}=\begin{bmatrix} | & \cdots & |\\ \mathbf{a}_1 & \cdots & \mathbf{a}_n\\ | & \cdots & |\\ \end{bmatrix}\begin{bmatrix}x_1\\\vdots\\x_n\end{bmatrix}=x_1\begin{bmatrix}|\\ \mathbf{a}_1\\|\end{bmatrix}+\cdots+x_n\begin{bmatrix}|\\ \mathbf{a}_n\\|\end{bmatrix}\end{split}\]
  • Row view: It can also be thought of the collection of inner products with each row vectors

    \[\begin{split}\mathbf{A}\mathbf{x}=\begin{bmatrix}\langle(\mathbf{a}^*_1)^\top,\mathbf{x}\rangle\\\vdots\\\langle(\mathbf{a}^*_m)^\top,\mathbf{x}\rangle\end{bmatrix}\end{split}\]

Attention

The equation \(\mathbf{A}\mathbf{x}=\mathbf{b}\) has a unique solution if \(\mathbf{b}\in C(\mathbf{A})\).

Matrix-matrix multiplication

Let \(\mathbf{A}\) be the matrix as before and let \(\mathbf{B}\) be a \(n\times p\) matrix written as a collection of rows similar to a vector

\[\begin{split}\mathbf{B}=\begin{bmatrix}-&\mathbf{b}^*_1&-\\&\vdots&\\-&\mathbf{b}^*_n&-\end{bmatrix}\end{split}\]

where \(\mathbf{b}^*_k\in\mathbb{R}^p\) are the row vectors.

Note

The multiplication \(\mathbf{A}\mathbf{B}\) is the sum of outer products \(\mathbf{u}\mathbf{v}^\top=\mathbf{a}_k \mathbf{b}^*_k\)

\[\begin{split}\mathbf{A}\mathbf{B}=\begin{bmatrix} | & \cdots & |\\ \mathbf{a}_1 & \cdots & \mathbf{a}_n\\ | & \cdots & |\\ \end{bmatrix}\begin{bmatrix}-&\mathbf{b}^*_1&-\\&\vdots&\\-&\mathbf{b}^*_n&-\end{bmatrix}=\begin{bmatrix}|\\ \mathbf{a}_1\\|\end{bmatrix}\begin{bmatrix}-&\mathbf{b}^*_1&-\end{bmatrix}+\cdots+\begin{bmatrix}|\\ \mathbf{a}_n\\|\end{bmatrix}\begin{bmatrix}-&\mathbf{b}^*_n&-\end{bmatrix}\end{split}\]

Independence, Rank, Inverse Mapping, Basis and Fundamental Subspaces

Independence

Note

  • Vector \(\mathbf{u}\) is linearly independent of vector \(\mathbf{v}\) if they are not in the same direction.

    • There is no scalar \(a\in\mathbb{R}\) such that \(\mathbf{u}=a\mathbf{v}\)

  • Vector \(\mathbf{w}\) is linearly independent of vectors \(\mathbf{u}\) and \(\mathbf{v}\) if it is not in the same place spanned by these.

    • There are no scalars \(a,b\in\mathbb{R}\) such that \(\mathbf{w}=a\mathbf{u}+b\mathbf{v}\)

  • Extends naturally for more dimensions.

Rank

Rank determines whether the linear transform \(\mathbf{A}\) defines a mapping which is onto or into.

Note

  • The number of independent column vectors in a matrix \(\mathbf{A}\) is the column-rank.

  • The number of independent row vectors in a matrix \(\mathbf{A}\) is the row-rank.

Attention

  • For any matrix \(\mathbf{A}\), column-rank and row-rank are the same, and it is called the rank of a matrix, \(r\leq m\) and \(r\leq n\).

  • \(r\) is the dimensionality of the column-space \(C(\mathbf{A})\) as well as the row-space \(C(\mathbf{A}^\top)\).

  • If \(m=n=r\), then the matrix is full-rank.

Inverse Mapping

Note

  • A full rank matrix \(\mathbf{A}:\mathbb{R}^n\mapsto\mathbb{R}^n\) defines a onto mapping, i.e. it spans the entire range.

  • In such cases, the operation is one-to-one as well. There are no two vectors in the domain which maps to the same vector in the range space.

  • We can define an inverse transform in this case as \(\mathbf{A}^{-1}:\mathbb{R}^n\mapsto\mathbb{R}^n\).

Basis

Note

  • For a matrix \(\mathbf{A}\) of rank \(r\), there are \(r\) independent column vectors which span \(\mathbb{R}^r\).

  • These column vectors form one basis of the column space.

  • We note that these don’t necessarily have to be orthogonal.

Attention

  • There can be multiple basis vectors for a matrix which span the same column space.

Fundamental Subspaces

Note

  • We define the null-space of \(\mathbf{A}:\mathbb{R}^n\mapsto\mathbb{R}^m\) as the subspace in the domain \(\mathbb{R}^n\) which maps to \(\mathbf{0}\) in the range \(\mathbb{R}^m\).

    \[N(\mathbf{A})\subseteq \mathbb{R}^n\]
  • The vectors in the null-space span a \(n-r\) dimensional space where \(r\) is the rank of the matrix.

    • We prefer the basis for the null-space to be orthogonal although it’s not a necessity.

  • The right-null-space is defined as the null-space of the transposed transform \(\mathbf{A}^\top\).

Attention

  • \(\dim(C(\mathbf{A}))=r\) and \(\dim(N(\mathbf{A}^\top))=m-r\)

  • \(\dim(C(\mathbf{A}^\top))=r\) and \(\dim(N(\mathbf{A}))=n-r\)

Orthogonality

Orthogonal vectors

Note

Two vectors \(\mathbf{u}\) and \(\mathbf{v}\) are orthogonal if \(\mathbf{u}^\top\mathbf{v}=0\).

Tip

  • Pythagoras: For \(\mathbf{x}\mathop{\bot}\mathbf{y}\)

    \[||\mathbf{x}-\mathbf{y}||=(\mathbf{x}-\mathbf{y})^\top(\mathbf{x}-\mathbf{y})=\mathbf{x}^\top\mathbf{x}+\mathbf{y}^\top\mathbf{y}-\mathbf{x}^\top\mathbf{y}-\mathbf{y}^\top\mathbf{x}=\mathbf{x}^\top\mathbf{x}+\mathbf{y}^\top\mathbf{y}=||\mathbf{x}||+||\mathbf{y}||\]
  • In general, \(\mathbf{x}^\top\mathbf{y}=||\mathbf{x}||\cdot||\mathbf{y}||\cdot\cos\theta\)

Attention

  • If \(\mathbf{x}\in N(\mathbf{A})\), then for any \(k\), \(\mathbf{a}^*_k\mathop{\bot}\mathbf{x}\) as \((\mathbf{a}^*_k)^\top\mathbf{x}=0\).

  • Therefore, any vector in the null-space cannot be spanned by the row-space of \(\mathbf{A}\).

Orthonormal vectors

Note

Orthogonal vectors such that \(||\mathbf{u}||=1\).

Matrix with orthonormal columns

Note

  • Written as \(\mathbf{Q}\).

  • We note that \(\mathbf{Q}^\top\mathbf{Q}=\mathbf{I}\).

  • Doesn’t change the length: \(||\mathbf{Q}\mathbf{x}||=||\mathbf{x}||\) but might lose/gain a few dimensions though based on the dimensionality of \(\mathbf{Q}\).

    \[||\mathbf{Q}\mathbf{x}||=(\mathbf{Q}\mathbf{x})^\top(\mathbf{Q}\mathbf{x})=\mathbf{x}^\top(\mathbf{Q}^\top\mathbf{Q})\mathbf{x}=\mathbf{x}^\top\mathbf{x}=||\mathbf{x}||\]
  • If \(\mathbf{Q}_1\) and \(\mathbf{Q}_2\) are matrices with orthonormal columns, then \(\mathbf{Q}=\mathbf{Q}_1\mathbf{Q}_2\) is also a matrix with orthonormal columns.

    \[\mathbf{Q}^\top\mathbf{Q}=(\mathbf{Q}_1\mathbf{Q}_2)^\top(\mathbf{Q}_1\mathbf{Q}_2)=\mathbf{Q}_2^\top(\mathbf{Q}_1^\top\mathbf{Q}_1)\mathbf{Q}_2=\mathbf{Q}_2^\top\mathbf{Q}_2=\mathbf{I}\]

Projection matrices

Note

  • Any matrix that can be factorised as \(\mathbf{P}=\mathbf{Q}\mathbf{Q}^\top\) is a projection matrix.

  • For any vector \(\mathbf{v}\), \(\mathbf{P}\mathbf{v}\) is the orthogonal projection onto the column space of \(\mathbf{P}\).

  • Any vector \(\mathbf{v}\) can be broken into two parts

    • Projection \(\mathbf{P}\mathbf{v}\)

    • Error \(\mathbf{v}-\mathbf{P}\mathbf{v}\)

Attention

  • Repeated projection doesn’t change anything

    \[\mathbf{P}^2=(\mathbf{Q}\mathbf{Q}^\top)(\mathbf{Q}\mathbf{Q}^\top)=\mathbf{Q}(\mathbf{Q}^\top\mathbf{Q})\mathbf{Q}^\top=\mathbf{Q}\mathbf{Q}^\top=\mathbf{P}\]
  • Projection matrices are symmetric

    \[\mathbf{P}^\top=(\mathbf{Q}\mathbf{Q}^\top)^\top=(\mathbf{Q}^\top)^\top\mathbf{Q}^\top=\mathbf{Q}\mathbf{Q}^\top=\mathbf{P}\]

Orthogonal matrices

Note

Square matrices with orthonormal columns.

Attention

  • We have \(\mathbf{Q}^\top=\mathbf{Q}^{-1}\) since

    \[\mathbf{Q}^\top\mathbf{Q}=\mathbf{Q}\mathbf{Q}^\top=\mathbf{I}\]
  • They represent a pure rotation or reflection in \(\mathbb{R}^n\) as neither the length or the dimensionality changes of any vector under this transformation.

    • Positive determinant implies rotation, negative determinant implies reflection (as the orientation changes).

Orthonormal basis

Note

  • Standard co-ordinate vectors are an example of orthonormal basis.

  • It’s not necessary for basis vectors to be orthonormal but it’s desired.

  • For orthonormal basis, we can obtain the scalar along each component independently.

    • Let the orthogonal basis vectors are \(\mathbf{q}_1,\cdots,\mathbf{q}_n\). Then any vector \(\mathbf{v}\in\mathbb{R}^n\) can be expressed as

      \[\mathbf{v}=c_1\mathbf{q}_1+\cdots+c_n\mathbf{q}_n\]
    • The scalar along any \(\mathbf{q}_k\) can be obtained as \(c_k=\mathbf{q}_k^\top\mathbf{v}\) since

      \[\mathbf{q}_k^\top\mathbf{v}=c_1\mathbf{q}_k^\top\mathbf{q}_1+\cdots+c_k\mathbf{q}_k^\top\mathbf{q}_k+\cdots+c_n\mathbf{q}_k^\top\mathbf{q}_n=c_1\cdot0+\cdots+c_k\cdot1+\cdots+c_n\cdot0=c_k\]

Tip

  • We can create an orthogonal matrix \(\mathbf{Q}\) with the basis vectors as columns. Then all these coefficients can be found using \(\mathbf{Q}\mathbf{v}\).

Orthogonal subspace

Attention

  • \(C(\mathbf{A})\mathop{\bot} N(\mathbf{A}^\top)\) and \(C(\mathbf{A}^\top)\mathop{\bot} N(\mathbf{A})\)

  • \(\mathbf{A}:\text{span}\left(C(\mathbf{A}^\top)\mathop{\cup} N(\mathbf{A})\right)=\mathbb{R}^n\mapsto \text{span}\left(C(\mathbf{A})\mathop{\cup} N(\mathbf{A}^\top)\right)=\mathbb{R}^m\)