Linear Transformations#

Definition: Linear Transformation

A linear transformation or vector space homomorphism from a vector space \((V,F,+_V,\cdot_{FV})\) to a vector space \((W,F,+_W,\cdot_{FW})\) is a function \(T: V \to W\) which has the following property for all \(\lambda, \mu \in F\) and all \(\mathbf{u}, \mathbf{v} \in V\):

\[ T(\lambda \mathbf{u} + \mu \mathbf{v}) = \lambda T(\mathbf{u}) + \mu T (\mathbf{v}) \]
Example: Linearity of Identity

The identity function on any vector space \((V, F, +, \cdot)\) is linear.

Example: Linearity of Zero Function

Let \((V,F,+_V,\cdot_{FV})\) and \((W,F,+_W,\cdot_{FW})\) be vector spaces.

The function \(f: V \to W\) defined as

\[ f(\mathbf{v}) = \mathbf{0}_W \qquad \forall \mathbf{v} \in V \]

is linear.

Example: Linear Real Function

Every real function \(f: \mathbb{R} \to \mathbb{R}\) defined as

\[ f(x) = ax, \]

where \(a \in \mathbb{R}\), is linear.

However, every real function \(g: \mathbb{R} \to \mathbb{R}\) defined as

\[ g(x) = ax + b, \]

where \(a \in \mathbb{R}\) and \(b \in \mathbb{R} \setminus \{0\}\), is not linear, since \(g(0) = b \neq 0\), whereas every linear function must map \(0\) to \(0\).
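These two cases can be checked numerically. The sketch below (with arbitrarily chosen coefficients and test points) verifies the linearity condition for \(f\) and exhibits a violation for \(g\); note that an affine map does satisfy the condition whenever \(\lambda + \mu = 1\), so the samples deliberately use \(\lambda + \mu \neq 1\).

```python
# A quick numerical check of the linearity condition T(λu + μv) = λT(u) + μT(v).
def is_linear_on_samples(T, samples):
    """Test the linearity condition on sample (λ, μ, u, v) tuples."""
    return all(
        abs(T(lam * u + mu * v) - (lam * T(u) + mu * T(v))) < 1e-9
        for lam, mu, u, v in samples
    )

a, b = 3.0, 1.0               # arbitrary choices for the coefficients
f = lambda x: a * x           # linear
g = lambda x: a * x + b       # affine but not linear

# An affine map passes the test whenever λ + μ = 1, so use λ + μ ≠ 1 here.
samples = [(2.0, 1.0, 1.0, 4.0), (0.5, 0.25, -2.0, 3.0)]
print(is_linear_on_samples(f, samples))  # True
print(is_linear_on_samples(g, samples))  # False
```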

Example: Linearity of Differentiation

Let \(C^{\infty}\) be the set of all smooth real functions. It can be shown that \(C^{\infty}\) is a vector space. The function which maps each \(f \in C^{\infty}\) to its derivative \(f'\) is linear.
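Restricting to polynomials (a subspace of \(C^{\infty}\)), the linearity of differentiation can be checked numerically. A small sketch using NumPy's coefficient-array representation of polynomials, with arbitrarily chosen polynomials and scalars:

```python
import numpy as np

# Polynomials as coefficient arrays, highest degree first (numpy's convention).
p = np.array([1.0, 0.0, -2.0])    # x^2 - 2
q = np.array([3.0, 1.0,  5.0])    # 3x^2 + x + 5
alpha, beta = 2.0, -4.0           # arbitrary scalars

lhs = np.polyder(alpha * p + beta * q)               # D(αp + βq)
rhs = alpha * np.polyder(p) + beta * np.polyder(q)   # αD(p) + βD(q)
print(np.allclose(lhs, rhs))      # True
```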

Definition: Kernel

Let \((V, F, +_V, \cdot_{FV})\) and \((W, F, +_W, \cdot_{FW})\) be vector spaces.

The kernel of a linear transformation \(T: V \to W\) is the set of all vectors \(\mathbf{v} \in V\) which the transformation sends to the zero vector in \(W\):

\[ \{\mathbf{v} \in V \mid T(\mathbf{v}) = \mathbf{0}_W\} \]

Notation

\[ \ker(T) \]
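For a linear map between coordinate spaces given by a matrix \(A\), the kernel is the null space of \(A\) and can be computed numerically via the singular value decomposition. A sketch (the example matrix is chosen arbitrarily):

```python
import numpy as np

def kernel_basis(A, tol=1e-10):
    """Orthonormal basis (as columns) for ker(A) = {v : Av = 0}, via the SVD."""
    _, s, vh = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return vh[rank:].T        # the trailing right-singular vectors span ker(A)

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])     # rank 1, so its kernel is 2-dimensional
K = kernel_basis(A)
print(K.shape[1])                   # 2
print(np.allclose(A @ K, 0))        # True
```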

Theorem: Zero Vector to Zero Vector

Every linear transformation \(T: V \to W\) always transforms the zero vector of \(V\) to the zero vector of \(W\):

\[ T(\mathbf{0}_V) = \mathbf{0}_W \]
Proof

Since \(\mathbf{0}_V + \mathbf{0}_V = \mathbf{0}_V\), linearity gives

\[ T(\mathbf{0}_V) = T(\mathbf{0}_V + \mathbf{0}_V) = T(\mathbf{0}_V) + T(\mathbf{0}_V). \]

Adding \(-T(\mathbf{0}_V)\) to both sides yields \(T(\mathbf{0}_V) = \mathbf{0}_W\).

Theorem: Subspace Preservation

Let \(f: V \to W\) be a linear transformation from the vector space \(V\) to the vector space \(W\).

If \(U\) is a subspace of \(V\), then \(f(U)\) is a subspace of \(W\).

If \(\tilde{U}\) is a subspace of \(W\), then its inverse image \(f^{-1}(\tilde{U})\) is a subspace of \(V\).

Proof

We need to prove two claims: (I) that \(f(U)\) is a subspace of \(W\), and (II) that \(f^{-1}(\tilde{U})\) is a subspace of \(V\).

Proof of (I):

First, \(\mathbf{0}_W \in f(U)\): since \(U\) is a subspace, \(\mathbf{0}_V \in U\), and \(f(\mathbf{0}_V) = \mathbf{0}_W\) because \(f\) is linear.

Let \(\mathbf{w}_1, \mathbf{w}_2 \in f(U)\) and \(\mathbf{u}_1, \mathbf{u}_2 \in U\) be such that \(\mathbf{w}_1 = f(\mathbf{u}_1)\) and \(\mathbf{w}_2 = f(\mathbf{u}_2)\). Since \(\mathbf{u}_1 \in U\) and \(\mathbf{u}_2 \in U\), we know that \(\mathbf{u}_1 + \mathbf{u}_2 \in U\) and so \(f(\mathbf{u}_1 + \mathbf{u}_2) \in f(U)\). Since \(f\) is linear, we have

\[ f(\mathbf{u}_1 + \mathbf{u}_2) = f(\mathbf{u}_1) + f(\mathbf{u}_2) = \mathbf{w}_1 + \mathbf{w}_2 \in f(U). \]

Similarly, let \(\mathbf{w} \in f(U)\) and \(\mathbf{u} \in U\) be such that \(\mathbf{w} = f(\mathbf{u})\). Since \(U\) is a subspace, we know that \(\lambda \mathbf{u} \in U\) for all \(\lambda \in F\) and so \(f(\lambda \mathbf{u}) \in f(U)\). Since \(f\) is linear, we have

\[ f(\lambda \mathbf{u}) = \lambda f(\mathbf{u}) = \lambda \mathbf{w} \in f(U) \]

Proof of (II):

Since \(f(\mathbf{0}_V) = \mathbf{0}_W \in \tilde{U}\), we have \(\mathbf{0}_V \in f^{-1}(\tilde{U})\).

Let \(\mathbf{u}_1, \mathbf{u}_2 \in f^{-1}(\tilde{U})\), i.e. \(f(\mathbf{u}_1), f(\mathbf{u}_2) \in \tilde{U}\). Since \(\tilde{U}\) is a subspace and \(f\) is linear, \(f(\mathbf{u}_1 + \mathbf{u}_2) = f(\mathbf{u}_1) + f(\mathbf{u}_2) \in \tilde{U}\), and so \(\mathbf{u}_1 + \mathbf{u}_2 \in f^{-1}(\tilde{U})\).

Similarly, for every \(\lambda \in F\) we have \(f(\lambda \mathbf{u}_1) = \lambda f(\mathbf{u}_1) \in \tilde{U}\), and so \(\lambda \mathbf{u}_1 \in f^{-1}(\tilde{U})\).

Example: Kernel is a Subspace

The kernel of a linear transformation \(f: V \to W\) is always a subspace of \(V\).

Definition: Nullity

The nullity of a linear transformation \(f\) is the dimension of its kernel:

\[ \dim \ker f \]

Example: Image is a Subspace

The image of a linear transformation \(f: V \to W\) is always a subspace of \(W\).

Definition: Rank

The rank of a linear transformation \(f\) is the dimension of its image:

\[ \mathop{\operatorname{Rank}} f \overset{\text{def}}{=} \dim f(V) \]

Theorem: Rank-Nullity Theorem

Let \(V\) and \(W\) be finite dimensional vector spaces and let \(f: V \to W\).

If \(f\) is linear, then the dimension of \(V\) is equal to the sum of \(f\)'s nullity and \(f\)'s rank:

\[ \dim V = \dim \ker f + \mathop{\operatorname{Rank}} f \]
Proof

Let \(\mathbf{v}_1,\dotsc,\mathbf{v}_k\) be a basis for \(\ker f\). We extend it to a basis \(\mathbf{v}_1,\dotsc,\mathbf{v}_k, \mathbf{v}_{k+1},\dotsc,\mathbf{v}_n\) for \(V\).

Since \(V\) is the span of \(\mathbf{v}_1, \dotsc, \mathbf{v}_n\), the image of \(f\) is the following:

\[ f(V) = f(\operatorname{span}(\mathbf{v}_1, \dotsc, \mathbf{v}_n)) \]

Since the image of a span is the span of the image, we have the following:

\[ f(V) = \operatorname{span}(f(\mathbf{v}_1), \dotsc, f(\mathbf{v}_n)) \]

However, since \(\mathbf{v}_1,\dotsc,\mathbf{v}_k \in \ker f\), we know that \(f(\mathbf{v}_1) = \cdots = f(\mathbf{v}_k) = \mathbf{0}\) and so \(\mathbf{v}_1,\dotsc,\mathbf{v}_k\) don't contribute anything to the above span:

\[ \operatorname{span}(f(\mathbf{v}_1), \dotsc, f(\mathbf{v}_n)) = \operatorname{span}(f(\mathbf{v}_{k+1}), \dotsc, f(\mathbf{v}_n)) \]

Therefore, the image of \(f\) is the span of \(f(\mathbf{v}_{k+1}), \dotsc, f(\mathbf{v}_n)\):

\[ f(V) = \operatorname{span}(f(\mathbf{v}_{k+1}), \dotsc, f(\mathbf{v}_n)) \]

To see that \(f(\mathbf{v}_{k+1}), \dotsc, f(\mathbf{v}_n)\) are linearly independent, suppose that

\[ \mathbf{0}_W = \lambda_{k+1}f(\mathbf{v}_{k+1}) + \cdots + \lambda_n f(\mathbf{v}_n) \]

for some \(\lambda_{k+1}, \dotsc, \lambda_n \in F\). Since \(f\) is linear, we have:

\[ \mathbf{0}_W = \lambda_{k+1}f(\mathbf{v}_{k+1}) + \cdots + \lambda_n f(\mathbf{v}_n) = f(\lambda_{k+1}\mathbf{v}_{k+1} + \cdots + \lambda_n\mathbf{v}_n) \]

This means that \(\lambda_{k+1}\mathbf{v}_{k+1} + \cdots + \lambda_n\mathbf{v}_n \in \ker f\), so it can be written as a linear combination of the basis vectors \(\mathbf{v}_1,\dotsc,\mathbf{v}_k\) of \(\ker f\). But \(\mathbf{v}_1,\dotsc,\mathbf{v}_n\) are linearly independent, so this forces \(\lambda_{k+1} = \cdots = \lambda_n = 0\). Therefore, \(f(\mathbf{v}_{k+1}), \dotsc, f(\mathbf{v}_n)\) are linearly independent and, since they are a spanning set of \(f(V)\), they are a basis for \(f(V)\). Therefore, the rank of \(f\) is \(n - k\) and we have

\[ \dim V = n = k + (n - k) = \dim \ker f + \mathop{\operatorname{Rank}} f \]
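The theorem can be illustrated numerically for a map \(\mathbb{R}^6 \to \mathbb{R}^4\) given by a randomly chosen matrix; this is a sanity check, not a proof:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(-3, 4, size=(4, 6)).astype(float)   # some f: R^6 -> R^4

rank = np.linalg.matrix_rank(A)                      # Rank f
s = np.linalg.svd(A, compute_uv=False)
nullity = A.shape[1] - int(np.sum(s > 1e-10))        # dim ker f

print(rank + nullity == A.shape[1])                  # True: dim V = dim ker f + Rank f
```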

Theorem: Span of Image is Image of Span

Let \(f: V \to W\) be a linear transformation from the vector space \(V\) to the vector space \(W\).

If \(S \subseteq V\), then the span of the image \(f(S)\) is equal to the image of the span of \(S\):

\[ \mathop{\operatorname{span}}(f(S)) = f(\mathop{\operatorname{span}}(S)) \]
Proof

Since \(f\) is linear and \(\mathop{\operatorname{span}}(S)\) is a subspace of \(V\), we know that its image \(f(\mathop{\operatorname{span}}(S))\) is a subspace of \(W\). Since \(S \subseteq \mathop{\operatorname{span}}(S)\), we know that \(f(S) \subseteq f(\mathop{\operatorname{span}}(S))\). By the definition of span, we know that if \(U\) is a subspace which contains a set \(X\), then \(\mathop{\operatorname{span}}(X) \subseteq U\), i.e. we have \(\mathop{\operatorname{span}}(f(S)) \subseteq f(\mathop{\operatorname{span}}(S))\) because \(f(\mathop{\operatorname{span}}(S))\) is a subspace which contains \(f(S)\).

Similarly, since \(f\) is linear and \(\mathop{\operatorname{span}}(f(S))\) is a subspace of \(W\), we know that the inverse image \(f^{-1}(\mathop{\operatorname{span}}(f(S)))\) is a subspace of \(V\). Since \(f(S) \subseteq \mathop{\operatorname{span}}(f(S))\), we know that \(S \subseteq f^{-1}(\mathop{\operatorname{span}}(f(S)))\). Again by the definition of span, we have \(\mathop{\operatorname{span}}(S) \subseteq f^{-1}(\mathop{\operatorname{span}}(f(S)))\) because \(f^{-1}(\mathop{\operatorname{span}}(f(S)))\) is a subspace which contains \(S\). Applying \(f\) to both sides and using \(f(f^{-1}(A)) \subseteq A\), we obtain \(f(\mathop{\operatorname{span}}(S)) \subseteq \mathop{\operatorname{span}}(f(S))\).

We have shown that \(\mathop{\operatorname{span}}(f(S)) \subseteq f(\mathop{\operatorname{span}}(S))\) and \(f(\mathop{\operatorname{span}}(S)) \subseteq \mathop{\operatorname{span}}(f(S))\), i.e. \(\mathop{\operatorname{span}}(f(S)) = f(\mathop{\operatorname{span}}(S))\).

Example

The function \(f: \mathbb{R}^3 \to \mathbb{R}^4\) with

\[ f \left(\begin{bmatrix}x \\ y \\ z\end{bmatrix}\right) = \begin{bmatrix}x + z \\ x + 2y \\ y - z \\ x + y + z\end{bmatrix} \]

is linear.

Suppose

\[ S = \left\{\begin{bmatrix}1 \\ 0 \\ 0\end{bmatrix}, \begin{bmatrix}1 \\ 1 \\ 0\end{bmatrix}\right\}. \]

The span of \(S\) is

\[ \begin{aligned} \mathop{\operatorname{span}} S &= \left\{\lambda \begin{bmatrix}1 \\ 0 \\ 0\end{bmatrix} + \mu \begin{bmatrix}1 \\ 1 \\ 0\end{bmatrix}: \lambda, \mu \in \mathbb{R}\right\} \\ &= \left\{\begin{bmatrix}\lambda \\ 0 \\ 0\end{bmatrix} + \begin{bmatrix}\mu \\ \mu \\ 0\end{bmatrix}: \lambda, \mu \in \mathbb{R}\right\} \\ &= \left\{\begin{bmatrix}\lambda + \mu \\ \mu \\ 0\end{bmatrix}: \lambda, \mu \in \mathbb{R}\right\} \\ &= \left\{\begin{bmatrix}r \\ s \\ 0\end{bmatrix}: r, s \in \mathbb{R}\right\} \end{aligned} \]

Therefore, \(f(\mathop{\operatorname{span}}(S))\) is

\[ \begin{aligned} f(\mathop{\operatorname{span}}(S)) &= f\left( \left\{\begin{bmatrix}r \\ s \\ 0\end{bmatrix}: r, s \in \mathbb{R}\right\} \right) \\ &= \left\{f\left(\begin{bmatrix}r \\ s \\ 0\end{bmatrix}\right) : r, s \in \mathbb{R} \right\} \\ &= \left\{\begin{bmatrix}r \\ r + 2s \\ s \\ r+s\end{bmatrix}: r, s \in \mathbb{R}\right\} \end{aligned} \]

The image of \(S\) is the following:

\[ f(S) = \left\{f\left(\begin{bmatrix}1 \\ 0 \\ 0\end{bmatrix}\right), f\left(\begin{bmatrix}1 \\ 1 \\ 0\end{bmatrix}\right)\right\} = \left\{\begin{bmatrix}1 \\ 1 \\ 0 \\ 1\end{bmatrix}, \begin{bmatrix}1 \\ 3 \\ 1 \\ 2\end{bmatrix}\right\} \]

Its span is the following:

\[ \begin{aligned} \mathop{\operatorname{span}}(f(S)) &= \left\{ \lambda \begin{bmatrix}1 \\ 1 \\ 0 \\ 1\end{bmatrix} + \mu \begin{bmatrix}1 \\ 3 \\ 1 \\ 2\end{bmatrix}: \lambda, \mu \in \mathbb{R}\right\} \\ &= \left\{\begin{bmatrix}\lambda + \mu \\ (\lambda + \mu) + 2\mu \\ \mu \\ (\lambda + \mu) + \mu\end{bmatrix}: \lambda, \mu \in \mathbb{R}\right\} \\ &= \left\{\begin{bmatrix}r \\ r + 2s \\ s \\ r+s\end{bmatrix}: r, s \in \mathbb{R}\right\} \end{aligned} \]
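The agreement of the two sets can also be confirmed numerically, writing \(f\) as a matrix with respect to the standard bases:

```python
import numpy as np

A = np.array([[1, 0,  1],
              [1, 2,  0],
              [0, 1, -1],
              [1, 1,  1]], dtype=float)   # f as a matrix w.r.t. standard bases

S = np.array([[1.0, 1.0],
              [0.0, 1.0],
              [0.0, 0.0]])                # the two vectors of S, as columns

fS = A @ S                                # images of the vectors of S
print(fS.T.tolist())   # [[1.0, 1.0, 0.0, 1.0], [1.0, 3.0, 1.0, 2.0]]

# An arbitrary element of f(span(S)), here with r = 2, s = -1 ...
r, s = 2.0, -1.0
v = A @ np.array([r, s, 0.0])             # = (r, r+2s, s, r+s) = (2, 0, -1, 1)
# ... is a linear combination of the two vectors in f(S).
coeffs, *_ = np.linalg.lstsq(fS, v, rcond=None)
print(np.allclose(fS @ coeffs, v))        # True: v lies in span(f(S))
```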

Theorem: Linearity of Composition

If \(f: V \to U\) and \(g: f(V) \to W\) are linear transformations, then their composition \(g \circ f\) is also a linear transformation \(g \circ f: V\to W\).

Proof
\[ \begin{aligned}g\circ f(\lambda \mathbf{u} +\mu\mathbf{v}) &= g(f(\lambda \mathbf{u} +\mu\mathbf{v}))\\ &= g(\lambda f(\mathbf{u}) + \mu f(\mathbf{v})) \\ &= g(\lambda f(\mathbf{u})) + g(\mu f(\mathbf{v})) \\ &= \lambda g(f(\mathbf{u})) +\mu g(f(\mathbf{v}))\\ &= \lambda g\circ f(\mathbf{u})+\mu g\circ f (\mathbf{v})\end{aligned} \]
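For maps between coordinate spaces represented by matrices, composition corresponds to the matrix product. A quick numerical sketch with arbitrary random matrices:

```python
import numpy as np

rng = np.random.default_rng(1)
F = rng.standard_normal((3, 4))   # f: R^4 -> R^3 as a matrix
G = rng.standard_normal((2, 3))   # g: R^3 -> R^2 as a matrix
v = rng.standard_normal(4)

# Applying f and then g agrees with applying the single matrix G @ F.
print(np.allclose(G @ (F @ v), (G @ F) @ v))   # True
```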

Theorem: Linearity of Inverse Transformations

If \(T\) is a bijective linear transformation, then its inverse \(T^{-1}\) is also a bijective linear transformation.

Definition: Vector Space Isomorphism

Bijective linear transformations are known as vector space isomorphisms.

Definition: Automorphism

An automorphism is a bijective endomorphism, i.e. a bijective linear transformation from a vector space to itself.

Proof

Let \(f: V \to W\) be a bijective linear transformation and let \(\mathbf{w}_1, \mathbf{w}_2 \in W\). Since \(f\) is bijective, there exist \(\mathbf{v}_1, \mathbf{v}_2 \in V\) such that

\[ \mathbf{w}_1 = f(\mathbf{v}_1) \qquad \text{and} \qquad \mathbf{w}_2 = f(\mathbf{v}_2). \]

Furthermore,

\[ f^{-1}(\mathbf{w}_1 + \mathbf{w}_2) = f^{-1}(f(\mathbf{v}_1) + f(\mathbf{v}_2)) = f^{-1}(f(\mathbf{v}_1 + \mathbf{v}_2)) = \mathbf{v}_1 + \mathbf{v}_2 = f^{-1}(\mathbf{w}_1) + f^{-1}(\mathbf{w}_2). \]

Let \(\mathbf{w} \in W\) and \(\mathbf{v} \in V\) with \(\mathbf{w} = f(\mathbf{v})\). We have the following:

\[ f^{-1}(\lambda \mathbf{w}) = f^{-1}(\lambda f(\mathbf{v})) = f^{-1}(f(\lambda \mathbf{v})) = \lambda \mathbf{v} = \lambda f^{-1}(f(\mathbf{v})) = \lambda f^{-1}(\mathbf{w}) \]

Theorem: Basis Transformation

Let \((V, F)\) and \((W, F)\) be finite dimensional vector spaces, let \(\mathbf{v}_1, \dotsc, \mathbf{v}_n\) be a basis of \(V\) and let \(\mathbf{w}_1, \dotsc, \mathbf{w}_n \in W\) be arbitrary.

There exists exactly one linear transformation \(f: V \to W\) such that \(f(\mathbf{v}_i) = \mathbf{w}_i\) for all \(i \in \{1, \dotsc, n\}\). Moreover, \(f\) is bijective if and only if \(\mathbf{w}_1, \dotsc, \mathbf{w}_n\) is a basis of \(W\).

Proof

TODO

Theorem: Injectivity Condition for Linear Transformations

A linear transformation \(f: V \to W\) is injective if and only if \(\mathbf{0}_V\) is the only element of its kernel.

\[ \ker f = \{\mathbf{0}_V\} \]
Proof

TODO
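For a matrix map this criterion is easy to test numerically: the kernel is trivial exactly when the matrix has full column rank. A sketch with two arbitrarily chosen matrices:

```python
import numpy as np

A = np.array([[1.0, 0.0],
              [0.0, 2.0],
              [1.0, 1.0]])   # f: R^2 -> R^3

# ker f = {0} exactly when A has full column rank, i.e. f is injective.
print(np.linalg.matrix_rank(A) == A.shape[1])   # True

B = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])   # dependent columns: (2, -1) lies in the kernel
print(np.linalg.matrix_rank(B) == B.shape[1])   # False
```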

Theorem: Injectivity \(\implies\) Bijectivity for Equal Finite Dimensions

Let \(f: V \to W\) be a linear transformation.

If \(f\) is injective and the dimensions of \(V\) and \(W\) are finite and equal, then \(f\) is bijective.

Proof

TODO

Matrix Representations#

Theorem: Matrix Representation of a Linear Transformation

Let \((V,F,+,\cdot)\) and \((W,F,+,\cdot)\) be vector spaces and let \(B_V\) and \(B_W\) be ordered bases of \(V\) and \(W\), respectively.

If \(T: V \to W\) is a linear transformation, then there exists a unique matrix \({}_{B_W} [T]_{B_V}\in F^{\dim(W)\times \dim(V)}\) such that the coordinate vector \([T(\mathbf{v})]_{B_W}\) is equal to the product of \({}_{B_W} [T]_{B_V}\) and the coordinate vector \([\mathbf{v}]_{B_V}\) for every \(\mathbf{v} \in V\):

\[ [T(\mathbf{v})]_{B_W} = {}_{B_W} [T]_{B_V}\cdot [\mathbf{v}]_{B_V} \]

Warning: Dependence on the Choice of Bases

The coefficients of the matrix \({}_{B_W} [T]_{B_V}\) depend on the choice of \(B_V\) and \(B_W\), i.e. different ordered bases will make the coefficients of the matrix representation of \(T\) different.

Proof

TODO

Example: Matrix Representation of Differentiation

Let \(P_n\) be the vector space of all real polynomial functions of degree at most \(n\):

\[ P_n = \{f: \mathbb{R} \to \mathbb{R} \mid f \text{ is polynomial and } \deg (f) \le n\} \]

We know that \((1, x, x^2, x^3)\) is an ordered basis for \(P_3\) and \((1, x, x^2)\) is an ordered basis for \(P_2\). The matrix representation of the differentiation operator with respect to \(P_3\) and \(P_2\) is the following:

\[ {}_{P_2} [D]_{P_3} = \begin{bmatrix}0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3\end{bmatrix} \]

Consider \(f(x) = -2x^3 +x^2\) with \(D(f(x)) = -6x^2 + 2x\). We have:

\[ [f]_{P_3} = \begin{bmatrix}0 \\ 0 \\ 1 \\ -2\end{bmatrix} \qquad [D(f(x))]_{P_2} = \begin{bmatrix}0 \\ 2 \\ -6\end{bmatrix} \]

One can easily verify the following:

\[ \begin{aligned}[D(f(x))]_{P_2} &= {}_{P_2} [D]_{P_3} \cdot [f]_{P_3} \\ \begin{bmatrix}0 \\ 2 \\ -6\end{bmatrix} &= \begin{bmatrix}0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3\end{bmatrix} \begin{bmatrix}0 \\ 0 \\ 1 \\ -2\end{bmatrix}\end{aligned} \]
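The verification can be reproduced in NumPy:

```python
import numpy as np

D = np.array([[0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 3]], dtype=float)    # the matrix of D w.r.t. P_3, P_2

f_coords = np.array([0.0, 0.0, 1.0, -2.0])   # -2x^3 + x^2 w.r.t. (1, x, x^2, x^3)
print((D @ f_coords).tolist())               # [0.0, 2.0, -6.0], i.e. 2x - 6x^2
```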

Theorem: Input Basis Change

Let \((V,F,+,\cdot)\) and \((W,F,+,\cdot)\) be vector spaces, let \(B_V\) and \(B_V'\) be ordered bases of \(V\), let \(B_W\) be an ordered basis of \(W\) and let \(T: V \to W\) be a linear transformation.

If the matrix representation of \(T\) with respect to \(B_V\) and \(B_W\) is \({}_{B_W} [T]_{B_V}\), then its matrix representation with respect to \(B_V'\) is the product of \({}_{B_W} [T]_{B_V}\) with the matrix representation of the identity function of \(V\) with respect to \(B_V\) and \(B_V'\):

\[ {}_{B_W} [T]_{B_V'} = {}_{B_W} [T]_{B_V} \cdot {}_{B_V} [\operatorname{id}]_{B_V'} \]
Proof

TODO

Theorem: Output Basis Change

Let \((V,F,+,\cdot)\) and \((W,F,+,\cdot)\) be vector spaces, let \(B_V\) be an ordered basis of \(V\), let \(B_W\) and \(B_W'\) be ordered bases of \(W\) and let \(T: V \to W\) be a linear transformation.

If the matrix representation of \(T\) with respect to \(B_V\) and \(B_W\) is \({}_{B_W} [T]_{B_V}\), then its matrix representation with respect to \(B_W'\) is the product of the matrix representation of the identity function of \(W\) (with respect to \(B_W\) and \(B_W'\)) and \({}_{B_W} [T]_{B_V}\):

\[ {}_{B_W'} [T]_{B_V} = {}_{B_W'} [\operatorname{id}]_{B_W} \cdot {}_{B_W} [T]_{B_V} \]
Proof

TODO

Example

Let \(P_n\) be the vector space of all real polynomial functions of degree at most \(n\):

\[ P_n = \{f: \mathbb{R} \to \mathbb{R} \mid f \text{ is polynomial and } \deg (f) \le n\} \]

We know that \((1, x, x^2, x^3)\) is an ordered basis for \(P_3\) and \((1, x, x^2)\) is an ordered basis for \(P_2\). The matrix representation of the differentiation operator with respect to \(P_3\) and \(P_2\) is the following:

\[ {}_{P_2} [D]_{P_3} = \begin{bmatrix}0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3\end{bmatrix} \]

We want to find the matrix representation of the differentiation operator with respect to \(P_3\) and the ordered basis \(P_2' = (1, x+1, x^2 + x + 1)\). We have the following:

\[ {}_{P_2'} [D]_{P_3} = {}_{P_2'} [\operatorname{id}]_{P_2} \cdot {}_{P_2} [D]_{P_3} \]

Since \(\operatorname{id}\) is the identity function, we know that \({}_{P_2'} [\operatorname{id}]_{P_2}\) is just the inverse of \({}_{P_2} [\operatorname{id}]_{P_2'}\):

\[ {}_{P_2'} [\operatorname{id}]_{P_2} = ({}_{P_2} [\operatorname{id}]_{P_2'})^{-1} \]

Finding \({}_{P_2} [\operatorname{id}]_{P_2'}\) is easy using the usual algorithm:

\[ \begin{aligned} {}_{P_2} [\operatorname{id}]_{P_2'} &= \begin{bmatrix}\vert & \vert & \vert \\ [\operatorname{id}(1)]_{P_2} & [\operatorname{id}(x+1)]_{P_2} & [\operatorname{id}(x^2 + x + 1)]_{P_2} \\ \vert & \vert & \vert\end{bmatrix} \\ &= \begin{bmatrix}\vert & \vert & \vert \\ [1]_{P_2} & [x+1]_{P_2} & [x^2 + x + 1]_{P_2} \\ \vert & \vert & \vert\end{bmatrix} \\ &= \begin{bmatrix}1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1\end{bmatrix}\end{aligned} \]

Its inverse is:

\[ ({}_{P_2} [\operatorname{id}]_{P_2'})^{-1} = \begin{bmatrix}1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1\end{bmatrix}^{-1} = \begin{bmatrix}1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1\end{bmatrix} \]

We therefore have

\[ {}_{P_2'} [\operatorname{id}]_{P_2} = \begin{bmatrix}1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1\end{bmatrix} \]

and can finally calculate \({}_{P_2'} [D]_{P_3}\):

\[ {}_{P_2'} [D]_{P_3} = \begin{bmatrix}1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1\end{bmatrix} \begin{bmatrix}0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3\end{bmatrix} = \begin{bmatrix}0 & 1 & -2 & 0 \\ 0 & 0 & 2 & -3 \\ 0 & 0 & 0 & 3\end{bmatrix} \]
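The computation can be double-checked numerically:

```python
import numpy as np

M = np.array([[1, 1, 1],
              [0, 1, 1],
              [0, 0, 1]], dtype=float)       # the matrix of id w.r.t. P_2', P_2
D = np.array([[0, 1, 0, 0],
              [0, 0, 2, 0],
              [0, 0, 0, 3]], dtype=float)    # the matrix of D w.r.t. P_3, P_2

D_new = np.linalg.inv(M) @ D                 # the matrix of D w.r.t. P_3, P_2'
expected = np.array([[0, 1, -2, 0],
                     [0, 0, 2, -3],
                     [0, 0, 0, 3]], dtype=float)
print(np.allclose(D_new, expected))          # True
```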

Algorithm: Finding the Matrix Representation

Let \((V,F,+,\cdot)\) and \((W,F,+,\cdot)\) be vector spaces and let \(T: V \to W\) be a linear transformation.

We want to find the matrix representation \({}_{B_W}[T]_{B_V}\) with respect to two ordered bases of our choice: \(B_V = (\mathbf{b}_1^V,\cdots,\mathbf{b}_m^V)\) for \(V\) and \(B_W = (\mathbf{b}_1^W,\cdots,\mathbf{b}_n^W)\) for \(W\).

  1. Determine the effect of \(T\) on the elements of the input basis \(B_V\), i.e. determine \(\mathbf{w}_1, \dotsc, \mathbf{w}_m\) by applying \(T\) to \(\mathbf{b}_1^V,\cdots,\mathbf{b}_m^V\):
\[ \mathbf{w}_1 = T(\mathbf{b}_1^V) \qquad \cdots \qquad \mathbf{w}_m = T(\mathbf{b}_m^V) \]
  2. Determine the coordinate vectors \([\mathbf{w}_1]_{B_W},\cdots,[\mathbf{w}_m]_{B_W}\) of the resulting vectors with respect to the output basis \(B_W\).

    • The specific calculation depends on the nature of the vector space (e.g., solving a linear system of equations).
  3. Construct \({}_{B_W}[T]_{B_V}\) by using these coordinate vectors as the columns of the matrix:

\[ {}_{B_W}[T]_{B_V} = \begin{bmatrix}\vert & \vert & \vert \\ [\mathbf{w}_1]_{B_W} & \cdots & [\mathbf{w}_m]_{B_W} \\ \vert & \vert & \vert \end{bmatrix} = \begin{bmatrix}\vert & \vert & \vert \\ [T(\mathbf{b}_1^V)]_{B_W} & \cdots & [T(\mathbf{b}_m^V)]_{B_W} \\ \vert & \vert & \vert \end{bmatrix} \]
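The three steps can be sketched in code for maps between coordinate spaces, where step 2 amounts to solving a linear system. The helper below and its example map are illustrative assumptions, not part of the text:

```python
import numpy as np

def matrix_representation(T, basis_V, basis_W):
    """Matrix of a linear map T w.r.t. ordered bases given as lists of vectors."""
    BW = np.column_stack(basis_W)             # output basis vectors as columns
    cols = []
    for b in basis_V:
        w = T(np.asarray(b, dtype=float))     # step 1: apply T to each input basis vector
        cols.append(np.linalg.solve(BW, w))   # step 2: coordinates of w w.r.t. B_W
    return np.column_stack(cols)              # step 3: coordinate vectors as columns

# Hypothetical example: T(x, y) = (x + y, x - y) with nonstandard bases.
T = lambda v: np.array([v[0] + v[1], v[0] - v[1]])
B_V = [np.array([1.0, 0.0]), np.array([1.0, 1.0])]
B_W = [np.array([1.0, 1.0]), np.array([0.0, 1.0])]

M = matrix_representation(T, B_V, B_W)

# Sanity check: [T(v)]_{B_W} = M · [v]_{B_V} for an arbitrary v.
v = np.array([3.0, 2.0])
v_coords = np.linalg.solve(np.column_stack(B_V), v)       # [v]_{B_V}
w_coords = np.linalg.solve(np.column_stack(B_W), T(v))    # [T(v)]_{B_W}
print(np.allclose(M @ v_coords, w_coords))                # True
```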