Linear Transformations#
Definition: Linear Transformation
A linear transformation or vector space homomorphism from a vector space \((V,F,+_V,\cdot_{FV})\) to a vector space \((W,F,+_W,\cdot_{FW})\) is a function \(T: V \to W\) which has the following property for all \(\lambda, \mu \in F\) and all \(\mathbf{u}, \mathbf{v} \in V\):
\[T(\lambda \mathbf{u} +_V \mu \mathbf{v}) = \lambda T(\mathbf{u}) +_W \mu T(\mathbf{v}).\]
Example: Linearity of Identity
The identity function on any vector space \((V, F, +, \cdot)\) is linear.
Example: Linearity of Zero Function
Let \((V,F,+_V,\cdot_{FV})\) and \((W,F,+_W,\cdot_{FW})\) be vector spaces.
The function \(f: V \to W\) defined as
\[f(\mathbf{v}) = \mathbf{0}_W \quad \text{for all } \mathbf{v} \in V\]
is linear.
Example: Linear Real Function
Every real function \(f: \mathbb{R} \to \mathbb{R}\) defined as
\[f(x) = ax,\]
where \(a \in \mathbb{R}\), is linear.
However, every real function \(g: \mathbb{R} \to \mathbb{R}\) defined as
\[g(x) = ax + b,\]
where \(a \in \mathbb{R}\) and \(b \in \mathbb{R} \setminus \{0\}\), is not linear.
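This distinction can be spot-checked numerically. A minimal Python sketch, with arbitrarily chosen values \(a = 3\) and \(b = 5\):

```python
# f(x) = a*x satisfies additivity and homogeneity,
# while g(x) = a*x + b fails additivity whenever b != 0.
a, b = 3, 5
f = lambda x: a * x
g = lambda x: a * x + b

s, t, c = 2, 7, 4
assert f(s + t) == f(s) + f(t)      # additivity holds for f
assert f(c * s) == c * f(s)         # homogeneity holds for f
assert g(s + t) != g(s) + g(t)      # additivity fails for g: 32 != 37
```

The failure for \(g\) is exactly the extra constant: \(g(s+t)\) contains one copy of \(b\), while \(g(s)+g(t)\) contains two.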
Example: Linearity of Differentiation
Let \(C^{\infty}\) be the set of all smooth real functions. It can be shown that \(C^{\infty}\) is a vector space. The function which maps each \(f \in C^{\infty}\) to its derivative \(f'\) is linear.
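The linearity of differentiation can be spot-checked on polynomials (a subspace of \(C^\infty\)). A minimal sketch, representing a polynomial by its coefficient list (an illustrative encoding, not from the text):

```python
# deriv maps a0 + a1*x + a2*x^2 + ... (as a coefficient list) to its
# derivative; add and scale are the vector space operations.
def deriv(p):
    return [i * c for i, c in enumerate(p)][1:] or [0]

def add(p, q):
    n = max(len(p), len(q))
    p, q = p + [0] * (n - len(p)), q + [0] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def scale(k, p):
    return [k * c for c in p]

p = [1, 0, 2, 5]   # 1 + 2x^2 + 5x^3
q = [0, 3, 1]      # 3x + x^2
assert deriv(add(p, q)) == add(deriv(p), deriv(q))   # D(p + q) = D(p) + D(q)
assert deriv(scale(4, p)) == scale(4, deriv(p))      # D(k*p) = k*D(p)
```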
Definition: Kernel
Let \((V, F, +_V, \cdot_{FV})\) and \((W, F, +_W, \cdot_{FW})\) be vector spaces.
The kernel of a linear transformation \(T: V \to W\) is the set of all vectors \(\mathbf{v} \in V\) which the transformation sends to the zero vector in \(W\):
\[\ker T := \{\mathbf{v} \in V : T(\mathbf{v}) = \mathbf{0}_W\}.\]
Notation
The kernel of \(T\) is denoted \(\ker T\).
Theorem: Zero Vector to Zero Vector
Every linear transformation \(T: V \to W\) always transforms the zero vector of \(V\) to the zero vector of \(W\):
\[T(\mathbf{0}_V) = \mathbf{0}_W.\]
Proof
Since \(\mathbf{0}_V = 0 \cdot \mathbf{0}_V\), linearity gives
\[T(\mathbf{0}_V) = T(0 \cdot \mathbf{0}_V) = 0 \cdot T(\mathbf{0}_V) = \mathbf{0}_W.\]
Theorem: Subspace Preservation
Let \(f: V \to W\) be a linear transformation from the vector space \(V\) to the vector space \(W\).
If \(U\) is a subspace of \(V\), then \(f(U)\) is a subspace of \(W\).
If \(\tilde{U}\) is a subspace of \(W\), then its inverse image \(f^{-1}(\tilde{U})\) is a subspace of \(V\).
Proof
We need to prove two things:
- (I) If \(U\) is a subspace of \(V\), then \(f(U)\) is a subspace of \(W\).
- (II) If \(\tilde{U}\) is a subspace of \(W\), then the inverse image \(f^{-1}(\tilde{U})\) is a subspace of \(V\).
Proof of (I):
First, \(\mathbf{0}_W \in f(U)\): since \(U\) is a subspace, \(\mathbf{0}_V \in U\), and \(f(\mathbf{0}_V) = \mathbf{0}_W\) because \(f\) is linear.
Let \(\mathbf{w}_1, \mathbf{w}_2 \in f(U)\) and \(\mathbf{u}_1, \mathbf{u}_2 \in U\) be such that \(\mathbf{w}_1 = f(\mathbf{u}_1)\) and \(\mathbf{w}_2 = f(\mathbf{u}_2)\). Since \(\mathbf{u}_1 \in U\) and \(\mathbf{u}_2 \in U\), we know that \(\mathbf{u}_1 + \mathbf{u}_2 \in U\) and so \(f(\mathbf{u}_1 + \mathbf{u}_2) \in f(U)\). Since \(f\) is linear, we have
\[f(\mathbf{u}_1 + \mathbf{u}_2) = f(\mathbf{u}_1) + f(\mathbf{u}_2) = \mathbf{w}_1 + \mathbf{w}_2,\]
and so \(\mathbf{w}_1 + \mathbf{w}_2 \in f(U)\).
Similarly, let \(\mathbf{w} \in f(U)\) and \(\mathbf{u} \in U\) be such that \(\mathbf{w} = f(\mathbf{u})\). Since \(U\) is a subspace, we know that \(\lambda \mathbf{u} \in U\) for all \(\lambda \in F\) and so \(f(\lambda \mathbf{u}) \in f(U)\). Since \(f\) is linear, we have
\[f(\lambda \mathbf{u}) = \lambda f(\mathbf{u}) = \lambda \mathbf{w},\]
and so \(\lambda \mathbf{w} \in f(U)\) for all \(\lambda \in F\).
Proof of (II):
TODO
Example: Kernel is a Subspace
The kernel of a linear transformation \(f: V \to W\) is always a subspace of \(V\).
Example: Image is a Subspace
The image of a linear transformation \(f: V \to W\) is always a subspace of \(W\).
Theorem: Rank-Nullity Theorem
Let \(V\) and \(W\) be finite dimensional vector spaces and let \(f: V \to W\).
If \(f\) is linear, then the dimension of \(V\) is equal to the sum of \(f\)'s nullity and \(f\)'s rank:
\[\dim(V) = \underbrace{\dim(\ker f)}_{\text{nullity}} + \underbrace{\dim(f(V))}_{\text{rank}}.\]
Proof
Let \(\mathbf{v}_1,\dotsc,\mathbf{v}_k\) be a basis for \(\ker f\). We extend it to a basis \(\mathbf{v}_1,\dotsc,\mathbf{v}_k, \mathbf{v}_{k+1},\dotsc,\mathbf{v}_n\) for \(V\).
Since \(V\) is the span of \(\mathbf{v}_1, \dotsc, \mathbf{v}_n\), the image of \(f\) is the following:
\[f(V) = f(\mathop{\operatorname{span}}(\mathbf{v}_1, \dotsc, \mathbf{v}_n)).\]
Since the image of a span is the span of the image, we have the following:
\[f(V) = \mathop{\operatorname{span}}(f(\mathbf{v}_1), \dotsc, f(\mathbf{v}_n)).\]
However, since \(\mathbf{v}_1,\dotsc,\mathbf{v}_k \in \ker f\), we know that \(f(\mathbf{v}_1) = \cdots = f(\mathbf{v}_k) = \mathbf{0}_W\) and so \(\mathbf{v}_1,\dotsc,\mathbf{v}_k\) don't contribute anything to the above span:
\[f(V) = \mathop{\operatorname{span}}(\mathbf{0}_W, \dotsc, \mathbf{0}_W, f(\mathbf{v}_{k+1}), \dotsc, f(\mathbf{v}_n)).\]
Therefore, the image of \(f\) is the span of \(f(\mathbf{v}_{k+1}), \dotsc, f(\mathbf{v}_n)\):
\[f(V) = \mathop{\operatorname{span}}(f(\mathbf{v}_{k+1}), \dotsc, f(\mathbf{v}_n)).\]
It remains to show that \(f(\mathbf{v}_{k+1}), \dotsc, f(\mathbf{v}_n)\) are linearly independent. Suppose that
\[\lambda_{k+1} f(\mathbf{v}_{k+1}) + \cdots + \lambda_n f(\mathbf{v}_n) = \mathbf{0}_W\]
for some \(\lambda_{k+1}, \dotsc, \lambda_n \in F\). However, \(f\) is linear and so we have:
\[f(\lambda_{k+1}\mathbf{v}_{k+1} + \cdots + \lambda_n\mathbf{v}_n) = \mathbf{0}_W.\]
This means that \(\lambda_{k+1}\mathbf{v}_{k+1} + \cdots + \lambda_n\mathbf{v}_n \in \ker f\), i.e. it is a linear combination \(\mu_1\mathbf{v}_1 + \cdots + \mu_k\mathbf{v}_k\) of the basis vectors of \(\ker f\). But since \(\mathbf{v}_1,\dotsc,\mathbf{v}_{n}\) are linearly independent, this implies that \(\lambda_{k+1} = \cdots = \lambda_n = 0\). Therefore, \(f(\mathbf{v}_{k+1}), \dotsc, f(\mathbf{v}_n)\) are linearly independent and, since they are a spanning set of \(f(V)\), they are a basis for \(f(V)\). Therefore, the rank of \(f\) is \(n - k\) and we have
\[\dim(V) = n = k + (n - k) = \dim(\ker f) + \dim(f(V)).\]
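The theorem can be sanity-checked numerically. A minimal Python sketch, where the matrix \(A\) (a linear map \(\mathbb{R}^4 \to \mathbb{R}^3\), so \(\dim V = 4\)) and the elimination routine are illustrative choices:

```python
# Compute the rank by Gaussian elimination with exact arithmetic;
# the nullity is then dim(V) - rank, as the theorem states.
from fractions import Fraction

def rank(rows):
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0])):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(len(m)):
            if i != r and m[i][col] != 0:
                factor = m[i][col] / m[r][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return r

A = [[1, 2, 0, 1],
     [0, 1, 1, 0],
     [1, 3, 1, 1]]   # third row = first row + second row, so rank is 2
dim_V = 4
rk = rank(A)
nullity = dim_V - rk
assert rk == 2 and rk + nullity == dim_V
```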
Theorem: Span of Image is Image of Span
Let \(f: V \to W\) be a linear transformation from the vector space \(V\) to the vector space \(W\).
If \(S \subseteq V\), then the span of the image \(f(S)\) is equal to the image of the span of \(S\):
\[\mathop{\operatorname{span}}(f(S)) = f(\mathop{\operatorname{span}}(S)).\]
Proof
Since \(f\) is linear and \(\mathop{\operatorname{span}}(S)\) is a subspace of \(V\), we know that its image \(f(\mathop{\operatorname{span}}(S))\) is a subspace of \(W\). Since \(S \subseteq \mathop{\operatorname{span}}(S)\), we know that \(f(S) \subseteq f(\mathop{\operatorname{span}}(S))\). By the definition of span, if \(U\) is a subspace which contains a set \(A\), then \(\mathop{\operatorname{span}}(A) \subseteq U\); since \(f(\mathop{\operatorname{span}}(S))\) is a subspace which contains \(f(S)\), we have \(\mathop{\operatorname{span}}(f(S)) \subseteq f(\mathop{\operatorname{span}}(S))\).
Similarly, since \(f\) is linear and \(\mathop{\operatorname{span}}(f(S))\) is a subspace of \(W\), we know that the inverse image \(f^{-1}(\mathop{\operatorname{span}}(f(S)))\) is a subspace of \(V\). Since \(f(S) \subseteq \mathop{\operatorname{span}}(f(S))\), we know that \(S \subseteq f^{-1}(\mathop{\operatorname{span}}(f(S)))\). Again by the definition of span, \(\mathop{\operatorname{span}}(S) \subseteq f^{-1}(\mathop{\operatorname{span}}(f(S)))\) because \(f^{-1}(\mathop{\operatorname{span}}(f(S)))\) is a subspace which contains \(S\). Applying \(f\) to both sides gives \(f(\mathop{\operatorname{span}}(S)) \subseteq f(f^{-1}(\mathop{\operatorname{span}}(f(S)))) \subseteq \mathop{\operatorname{span}}(f(S))\).
We have shown that \(\mathop{\operatorname{span}}(f(S)) \subseteq f(\mathop{\operatorname{span}}(S))\) and \(f(\mathop{\operatorname{span}}(S)) \subseteq \mathop{\operatorname{span}}(f(S))\), i.e. \(\mathop{\operatorname{span}}(f(S)) = f(\mathop{\operatorname{span}}(S))\).
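Over the finite field \(\mathrm{GF}(2)\) every span is a finite set, so the theorem can be verified element by element. A sketch with an arbitrarily chosen map \(f: \mathrm{GF}(2)^3 \to \mathrm{GF}(2)^2\) (the matrix and the set \(S\) are illustrative, not from the text):

```python
from itertools import product

def f(v):                       # the matrix [[1,0,1],[0,1,1]] acting mod 2
    return ((v[0] + v[2]) % 2, (v[1] + v[2]) % 2)

def span(vectors, dim):
    # All linear combinations with coefficients in {0, 1}, computed mod 2.
    out = set()
    for coeffs in product([0, 1], repeat=len(vectors)):
        s = tuple(sum(c * v[i] for c, v in zip(coeffs, vectors)) % 2
                  for i in range(dim))
        out.add(s)
    return out

S = [(1, 0, 0), (1, 1, 0)]
lhs = span([f(v) for v in S], 2)        # span of the image
rhs = {f(v) for v in span(S, 3)}        # image of the span
assert lhs == rhs
```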
Example
The function \(f: \mathbb{R}^3 \to \mathbb{R}^4\) with
is linear.
Suppose
The span of \(S\) is
Therefore, \(f(\mathop{\operatorname{span}}(S))\) is
The image of \(S\) is the following:
Its span is the following:
Theorem: Linearity of Composition
If \(f: V \to U\) and \(g: f(V) \to W\) are linear transformations, then their composition \(g \circ f\) is also a linear transformation \(g \circ f: V\to W\).
Proof
For all \(\lambda, \mu \in F\) and all \(\mathbf{u}, \mathbf{v} \in V\) we have
\[(g \circ f)(\lambda\mathbf{u} + \mu\mathbf{v}) = g(f(\lambda\mathbf{u} + \mu\mathbf{v})) = g(\lambda f(\mathbf{u}) + \mu f(\mathbf{v})) = \lambda g(f(\mathbf{u})) + \mu g(f(\mathbf{v})) = \lambda (g \circ f)(\mathbf{u}) + \mu (g \circ f)(\mathbf{v}).\]
Theorem: Linearity of Inverse Transformations
If \(T\) is a bijective linear transformation, then its inverse \(T^{-1}\) is also a bijective linear transformation.
Definition: Vector Space Isomorphism
Bijective linear transformations are known as vector space isomorphisms.
Definition: Automorphism
An automorphism is a bijective endomorphism, i.e. a vector space isomorphism from a vector space to itself.
Proof
Let \(f: V \to W\) be a bijective linear transformation and let \(\mathbf{w}_1, \mathbf{w}_2 \in W\). Since \(f\) is bijective, there exist \(\mathbf{v}_1, \mathbf{v}_2 \in V\) such that
\[\mathbf{w}_1 = f(\mathbf{v}_1) \quad \text{and} \quad \mathbf{w}_2 = f(\mathbf{v}_2).\]
Furthermore,
\[f^{-1}(\mathbf{w}_1 + \mathbf{w}_2) = f^{-1}(f(\mathbf{v}_1) + f(\mathbf{v}_2)) = f^{-1}(f(\mathbf{v}_1 + \mathbf{v}_2)) = \mathbf{v}_1 + \mathbf{v}_2 = f^{-1}(\mathbf{w}_1) + f^{-1}(\mathbf{w}_2).\]
Let \(\mathbf{w} \in W\) and \(\mathbf{v} \in V\) with \(\mathbf{w} = f(\mathbf{v})\). We have the following for every \(\lambda \in F\):
\[f^{-1}(\lambda\mathbf{w}) = f^{-1}(\lambda f(\mathbf{v})) = f^{-1}(f(\lambda\mathbf{v})) = \lambda\mathbf{v} = \lambda f^{-1}(\mathbf{w}).\]
Theorem: Basis Transformation
Let \((V, F)\) and \((W, F)\) be finite dimensional vector spaces, let \(\mathbf{v}_1, \dotsc, \mathbf{v}_n\) be a basis of \(V\) and let \(\mathbf{w}_1, \dotsc, \mathbf{w}_n \in W\) be arbitrary.
There exists exactly one linear transformation \(f: V \to W\) such that \(f(\mathbf{v}_i) = \mathbf{w}_i\) for all \(i \in \{1, \dotsc, n\}\). Moreover, \(f\) is bijective if and only if \(\mathbf{w}_1, \dotsc, \mathbf{w}_n\) is a basis of \(W\).
Proof
TODO
Theorem: Injectivity Condition for Linear Transformations
A linear transformation \(f: V \to W\) is injective if and only if \(\mathbf{0}_V\) is the only element of its kernel.
Proof
TODO
Theorem: Injectivity \(\implies\) Bijectivity for Equal Finite Dimensions
Let \(f: V \to W\) be a linear transformation.
If \(f\) is injective and the dimensions of \(V\) and \(W\) are finite and equal, then \(f\) is bijective.
Proof
TODO
Matrix Representations#
Theorem: Matrix Representation of a Linear Transformation
Let \((V,F,+,\cdot)\) and \((W,F,+,\cdot)\) be vector spaces and let \(B_V\) and \(B_W\) be ordered bases of \(V\) and \(W\), respectively.
If \(T: V \to W\) is a linear transformation, then there exists a unique matrix \({}_{B_W} [T]_{B_V}\in F^{\dim(W)\times \dim(V)}\) such that the coordinate vector \([T(\mathbf{v})]_{B_W}\) is equal to the product of \({}_{B_W} [T]_{B_V}\) and the coordinate vector \([\mathbf{v}]_{B_V}\) for every \(\mathbf{v} \in V\):
\[[T(\mathbf{v})]_{B_W} = {}_{B_W}[T]_{B_V}\, [\mathbf{v}]_{B_V}.\]
Warning: Dependence on the Choice of Bases
The coefficients of the matrix \({}_{B_W} [T]_{B_V}\) depend on the choice of \(B_V\) and \(B_W\), i.e. different ordered bases will make the coefficients of the matrix representation of \(T\) different.
Proof
TODO
Example: Matrix Representation of Differentiation
Let \(P_n\) be the vector space of all real polynomial functions expressible via a polynomial of degree \(\le n\):
\[P_n := \{p: \mathbb{R} \to \mathbb{R} \mid p(x) = a_0 + a_1 x + \cdots + a_n x^n \text{ for some } a_0, \dotsc, a_n \in \mathbb{R}\}.\]
We know that \((1, x, x^2, x^3)\) is an ordered basis for \(P_3\) and \((1, x, x^2)\) is an ordered basis for \(P_2\). The matrix representation of the differentiation operator with respect to \(P_3\) and \(P_2\) is the following:
\[{}_{P_2}[D]_{P_3} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix}.\]
Consider \(f(x) = -2x^3 + x^2\) with \(D(f(x)) = -6x^2 + 2x\). We have:
\[[f]_{P_3} = \begin{pmatrix} 0 \\ 0 \\ 1 \\ -2 \end{pmatrix}, \qquad [D(f)]_{P_2} = \begin{pmatrix} 0 \\ 2 \\ -6 \end{pmatrix}.\]
One can easily verify the following:
\[{}_{P_2}[D]_{P_3}\, [f]_{P_3} = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix} \begin{pmatrix} 0 \\ 0 \\ 1 \\ -2 \end{pmatrix} = \begin{pmatrix} 0 \\ 2 \\ -6 \end{pmatrix} = [D(f)]_{P_2}.\]
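The verification can also be done in a few lines of Python. The matrix below has as columns the coordinates of \(D(1)\), \(D(x)\), \(D(x^2)\), \(D(x^3)\) in the basis \((1, x, x^2)\):

```python
# Apply the matrix of D to the coordinate vector of f(x) = -2x^3 + x^2.
D = [[0, 1, 0, 0],
     [0, 0, 2, 0],
     [0, 0, 0, 3]]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

f_coords = [0, 0, 1, -2]                   # -2x^3 + x^2 in basis (1, x, x^2, x^3)
assert matvec(D, f_coords) == [0, 2, -6]   # 2x - 6x^2 in basis (1, x, x^2)
```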
Theorem: Input Basis Change
Let \((V,F,+,\cdot)\) and \((W,F,+,\cdot)\) be vector spaces, let \(B_V\) and \(B_V'\) be ordered bases of \(V\), let \(B_W\) be an ordered basis of \(W\) and let \(T: V \to W\) be a linear transformation.
If the matrix representation of \(T\) with respect to \(B_V\) and \(B_W\) is \({}_{B_W} [T]_{B_V}\), then its matrix representation with respect to \(B_V'\) is the product of \({}_{B_W} [T]_{B_V}\) with the matrix representation of the identity function of \(V\) with respect to \(B_V\) and \(B_V'\):
\[{}_{B_W}[T]_{B_V'} = {}_{B_W}[T]_{B_V}\ {}_{B_V}[\operatorname{id}]_{B_V'}.\]
Proof
TODO
Theorem: Output Basis Change
Let \((V,F,+,\cdot)\) and \((W,F,+,\cdot)\) be vector spaces, let \(B_V\) be an ordered basis of \(V\), let \(B_W\) and \(B_W'\) be ordered bases of \(W\) and let \(T: V \to W\) be a linear transformation.
If the matrix representation of \(T\) with respect to \(B_V\) and \(B_W\) is \({}_{B_W} [T]_{B_V}\), then its matrix representation with respect to \(B_W'\) is the product of the matrix representation of the identity function of \(W\) (with respect to \(B_W\) and \(B_W'\)) and \({}_{B_W} [T]_{B_V}\):
\[{}_{B_W'}[T]_{B_V} = {}_{B_W'}[\operatorname{id}]_{B_W}\ {}_{B_W}[T]_{B_V}.\]
Proof
TODO
Example
Let \(P_n\) be the vector space of all real polynomial functions expressible via a polynomial of degree \(\le n\):
\[P_n := \{p: \mathbb{R} \to \mathbb{R} \mid p(x) = a_0 + a_1 x + \cdots + a_n x^n \text{ for some } a_0, \dotsc, a_n \in \mathbb{R}\}.\]
We know that \((1, x, x^2, x^3)\) is an ordered basis for \(P_3\) and \((1, x, x^2)\) is an ordered basis for \(P_2\). The matrix representation of the differentiation operator with respect to \(P_3\) and \(P_2\) is the following:
We want to find the matrix representation of the differentiation operator with respect to \(P_3\) and the ordered basis \(P_2' = (1, x+1, x^2 + x + 1)\). We have the following:
\[{}_{P_2'}[D]_{P_3} = {}_{P_2'}[\operatorname{id}]_{P_2}\ {}_{P_2}[D]_{P_3}.\]
Since \(\operatorname{id}\) is the identity function, we know that \({}_{P_2'} [\operatorname{id}]_{P_2}\) is just the inverse of \({}_{P_2} [\operatorname{id}]_{P_2'}\):
\[{}_{P_2'}[\operatorname{id}]_{P_2} = \left({}_{P_2}[\operatorname{id}]_{P_2'}\right)^{-1}.\]
Finding \({}_{P_2} [\operatorname{id}]_{P_2'}\) is easy using the usual algorithm — its columns are the coordinates of \(1\), \(x+1\) and \(x^2+x+1\) with respect to \((1, x, x^2)\):
\[{}_{P_2}[\operatorname{id}]_{P_2'} = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{pmatrix}.\]
Its inverse is:
\[\left({}_{P_2}[\operatorname{id}]_{P_2'}\right)^{-1} = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{pmatrix}.\]
We therefore have
\[{}_{P_2'}[\operatorname{id}]_{P_2} = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{pmatrix}\]
and can finally calculate \({}_{P_2'} [D]_{P_3}\):
\[{}_{P_2'}[D]_{P_3} = \begin{pmatrix} 1 & -1 & 0 \\ 0 & 1 & -1 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 2 & 0 \\ 0 & 0 & 0 & 3 \end{pmatrix} = \begin{pmatrix} 0 & 1 & -2 & 0 \\ 0 & 0 & 2 & -3 \\ 0 & 0 & 0 & 3 \end{pmatrix}.\]
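The basis-change computation above can be checked numerically. A minimal sketch: \(P\) is the matrix taking \(P_2'\)-coordinates to \(P_2\)-coordinates (its columns are \(1\), \(x+1\), \(x^2+x+1\) expressed in the basis \((1, x, x^2)\)), and the new matrix of \(D\) is \(P^{-1}\) times the old one:

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

P     = [[1, 1, 1], [0, 1, 1], [0, 0, 1]]
P_inv = [[1, -1, 0], [0, 1, -1], [0, 0, 1]]
I     = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
assert matmul(P, P_inv) == I            # P_inv really is the inverse of P

D = [[0, 1, 0, 0], [0, 0, 2, 0], [0, 0, 0, 3]]
assert matmul(P_inv, D) == [[0, 1, -2, 0], [0, 0, 2, -3], [0, 0, 0, 3]]
```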
Algorithm: Finding the Matrix Representation
Let \((V,F,+,\cdot)\) and \((W,F,+,\cdot)\) be vector spaces and let \(T: V \to W\) be a linear transformation.
We want to find the matrix representation \({}_{B_W}[T]_{B_V}\) with respect to two ordered bases of our choice: \(B_V = (\mathbf{b}_1^V,\cdots,\mathbf{b}_m^V)\) for \(V\) and \(B_W = (\mathbf{b}_1^W,\cdots,\mathbf{b}_n^W)\) for \(W\).
- Determine the effect of \(T\) on the elements of the input basis \(B_V\), i.e. determine \(\mathbf{w}_1, \dotsc, \mathbf{w}_m\) by applying \(T\) to \(\mathbf{b}_1^V, \dotsc, \mathbf{b}_m^V\):
\[\mathbf{w}_i := T(\mathbf{b}_i^V), \quad i \in \{1, \dotsc, m\}.\]
- Determine the coordinate vectors \([\mathbf{w}_1]_{B_W}, \dotsc, [\mathbf{w}_m]_{B_W}\) of the resulting vectors with respect to the output basis \(B_W\). The specific calculation depends on the nature of the vector space (e.g. solving a linear system of equations).
- Construct \({}_{B_W}[T]_{B_V}\) by using these coordinate vectors as the columns of the matrix:
\[{}_{B_W}[T]_{B_V} = \begin{pmatrix} [\mathbf{w}_1]_{B_W} & \cdots & [\mathbf{w}_m]_{B_W} \end{pmatrix}.\]
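The steps above can be sketched in Python for \(T = D\), the differentiation operator \(P_3 \to P_2\), with polynomials stored as coefficient lists in the monomial bases. Because the output basis here is the monomial basis, step 2 is immediate; for a general basis it would require solving a linear system:

```python
def deriv(p):
    return [i * c for i, c in enumerate(p)][1:] or [0]

input_basis = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

# Step 1: apply T to each input basis vector.
images = [deriv(b) for b in input_basis]

# Step 2: coordinates with respect to the output basis (1, x, x^2):
# pad each image to length 3.
coords = [img + [0] * (3 - len(img)) for img in images]

# Step 3: the coordinate vectors become the *columns* of the matrix.
matrix = [list(col) for col in zip(*coords)]
assert matrix == [[0, 1, 0, 0], [0, 0, 2, 0], [0, 0, 0, 3]]
```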