NOTE: These notes were converted from LaTeX to markdown algorithmically and may thus contain some notation errors. And obviously some errors may be present regardless because, well, they’re my notes :)
Determinants
Number Fields
A number field is any set $K$ of objects for which the arithmetic operations again yield elements of $K$. The operations satisfy the following axioms:
1. For every pair of numbers $\alpha$ and $\beta$ in $K$ there corresponds a single number $\alpha+\beta$ in $K$, the sum of $\alpha$ and $\beta$, such that:
    - (a) $\alpha+\beta=\beta+\alpha$ for all $\alpha,\beta\in K$;
    - (b) $(\alpha+\beta)+\gamma=\alpha+(\beta+\gamma)$ for all $\alpha,\beta,\gamma\in K$;
    - (c) there exists a number $0$ such that $0+\alpha=\alpha$ for all $\alpha\in K$;
    - (d) for every $\alpha\in K$ there exists a number (negative element) $\gamma\in K$ such that $\alpha+\gamma=0$, i.e. $\gamma=-\alpha$. We get subtraction by defining the difference $\beta-\alpha$ as $\beta+\gamma$.
2. For every pair of numbers $\alpha,\beta\in K$ there corresponds a unique number $\alpha\cdot\beta$ in $K$, called the product of $\alpha$ and $\beta$, such that:
    - (a) $\alpha\beta=\beta\alpha$;
    - (b) $(\alpha\beta)\gamma=\alpha(\beta\gamma)$;
    - (c) there exists a number $1$ such that $1\cdot\alpha=\alpha$ for all $\alpha\in K$;
    - (d) $\alpha(\beta+\gamma)=\alpha\beta+\alpha\gamma$;
    - (e) for every $\alpha\neq 0$ there exists a number $\gamma$ such that $\alpha\gamma=1$, i.e. $\gamma=\frac{1}{\alpha}$. We define the quotient $\frac{\beta}{\alpha}$ as $\beta\cdot\gamma$.
We define the natural numbers, $\mathbb{N}$, as the numbers $1,2,3,\ldots$. We define the integers $\mathbb{Z}$ as the natural numbers together with their negative counterparts and $0$. We define the rational numbers in a field $K$ as the set of all quotients $\frac{p}{q}$, where $p,q\in\mathbb{Z}$ and $q\neq 0$.
Examples of number fields:

- The field of rational numbers. Note that the integers do not form a number field, for they do not satisfy axiom 2(e).
- The field of real numbers $\mathbb{R}$.
- The field of complex numbers $\mathbb{C}$, consisting of the numbers $a+ib$ with $a,b\in\mathbb{R}$. We have
  $$(a_1+ib_1)+(a_2+ib_2)=(a_1+a_2)+i(b_1+b_2)$$
  and $a+i0=a$.
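For products, the distributive law together with $i^2=-1$ gives
$$(a_1+ib_1)(a_2+ib_2)=(a_1a_2-b_1b_2)+i(a_1b_2+a_2b_1).$$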
Systems of Linear Equations
In the most general case, such a system has the form
$$\begin{aligned}
a_{11}x_1+a_{12}x_2+\cdots+a_{1n}x_n&=b_1,\\
a_{21}x_1+a_{22}x_2+\cdots+a_{2n}x_n&=b_2,\\
&\;\;\vdots\\
a_{k1}x_1+a_{k2}x_2+\cdots+a_{kn}x_n&=b_k,
\end{aligned}$$
where $x_1,x_2,\ldots,x_n$ are the unknowns, $a_{11},a_{12},\ldots,a_{kn}$ are the coefficients and $b_1,b_2,\ldots,b_k$ are the constant terms. The first index of a coefficient indicates the number of the equation in which the coefficient appears, and the second index indicates the index of the unknown it multiplies.
A solution of the system is any set of numbers $c_1,c_2,\ldots,c_n\in K$ which, when substituted for the unknowns, turns every equation into an identity. A system with at least one solution is called compatible. If a system has a unique solution, it is called determinate. If it has at least two solutions, it is called indeterminate. The basic problems in studying a system are:

- to determine whether the system is compatible or incompatible;
- if the system is compatible, to find whether it is determinate or indeterminate;
- if the system is determinate, to find its unique solution;
- if the system is indeterminate, to describe the set of all solutions.
Determinants
Introduction
Suppose we have a square matrix, i.e. an array of $n^2$ numbers $a_{ij}\in K$ $(i,j=1,2,\ldots,n)$:
$$\begin{Vmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{Vmatrix}$$
The number of rows and columns of the matrix is called its order. Every $a_{ij}$ is called an element of the matrix. The first index indicates the row and the second index the column of $a_{ij}$. The elements $a_{11},a_{22},\ldots,a_{nn}$ form the principal diagonal of the matrix.
The Terms of the Determinant
A term of the determinant of the matrix is a product of $n$ elements that contains exactly one element from each row and each column:
$$a_{\alpha_1 1}\,a_{\alpha_2 2}\cdots a_{\alpha_n n}$$
Note that the factors of the term go from "left to right", i.e. they are ordered by columns, not by rows. The first element is in the first column and some row $\alpha_1$, the second element is in the second column and some row $\alpha_2$, etc. The numbers $\alpha_1,\alpha_2,\ldots,\alpha_n$ are all different and represent some permutation of the numbers $1,2,\ldots,n$. By an inversion in the sequence $\alpha_1,\alpha_2,\ldots,\alpha_n$ we mean an arrangement of two indices such that the larger index comes before the smaller index. We denote the total number of inversions by $N(\alpha_1,\alpha_2,\ldots,\alpha_n)$. For example, in the permutation $2,1,4,3$ there are two inversions ($2>1$ and $4>3$), so $N(2,1,4,3)=2$. In the permutation $4,3,1,2$ there are five inversions ($4>3$, $4>1$, $4>2$, $3>1$, $3>2$), so $N(4,3,1,2)=5$.
If the number of inversions is even, we put a plus sign before the term; if the number is odd, we put a minus sign. In other words, in front of every term we put the expression
$$(-1)^{N(\alpha_1,\alpha_2,\ldots,\alpha_n)}$$
The total number of terms in the determinant equals the total number of permutations of $1,2,\ldots,n$, namely $n!$.
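Since counting inversions is purely mechanical, here is a minimal Python sketch (the function names are my own) that counts them by brute force and returns the sign of a permutation:

```python
from itertools import permutations

def inversions(alpha):
    """Count the pairs (i, j) with i < j but alpha[i] > alpha[j]."""
    return sum(1 for i in range(len(alpha))
                 for j in range(i + 1, len(alpha))
                 if alpha[i] > alpha[j])

def sign(alpha):
    """(-1)**N(alpha): +1 for an even number of inversions, -1 for an odd one."""
    return (-1) ** inversions(alpha)

assert inversions((2, 1, 4, 3)) == 2            # 2 > 1 and 4 > 3
assert inversions((4, 3, 1, 2)) == 5            # 4>3, 4>1, 4>2, 3>1, 3>2
assert len(list(permutations(range(4)))) == 24  # 4! terms for n = 4
```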
The Determinant
The determinant $D$ of a matrix is defined as the algebraic sum of the $n!$ terms of the determinant:
$$D=\sum(-1)^{N(\alpha_1,\alpha_2,\ldots,\alpha_n)}\,a_{\alpha_1 1}a_{\alpha_2 2}\cdots a_{\alpha_n n},$$
where the sum ranges over all permutations $\alpha_1,\alpha_2,\ldots,\alpha_n$ of $1,2,\ldots,n$. Note that the terms of the determinant are mutually exclusive: no two terms consist of the same factors, and no term contains more than one element from any column. We denote the determinant $D$ by
$$D=\begin{vmatrix}
a_{11} & a_{12} & \cdots & a_{1n}\\
a_{21} & a_{22} & \cdots & a_{2n}\\
\vdots & \vdots & \ddots & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{vmatrix}
\quad\text{or}\quad
D=\det\|a_{ij}\|.$$
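The definition translates directly into code. A brute-force Python sketch (exponential in $n$, so for illustration only; the matrix is assumed to be a list of rows):

```python
from itertools import permutations

def det_by_definition(a):
    """Determinant as the signed sum over all n! permutations."""
    n = len(a)
    total = 0
    for alpha in permutations(range(n)):
        # N(alpha): number of inversions, which fixes the sign of the term
        n_inv = sum(1 for i in range(n) for j in range(i + 1, n)
                    if alpha[i] > alpha[j])
        term = 1
        for k in range(n):
            term *= a[alpha[k]][k]  # one element from column k, row alpha[k]
        total += (-1) ** n_inv * term
    return total

# 2x2 check: the definition reduces to ad - bc
assert det_by_definition([[1, 2], [3, 4]]) == 1 * 4 - 2 * 3
```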
We can determine the sign of a term more intuitively in geometric terms. Consider two elements of a term and draw a segment between them: if the segment goes downward from left to right, we call its slope positive, and if it goes upward, we call its slope negative (note that this is the exact opposite of the usual geometric convention). Now draw all the segments with negative slope (going upward) that join pairs of elements of $a_{\alpha_1 1},a_{\alpha_2 2},\ldots,a_{\alpha_n n}$. We put a plus sign before the term if the number of such segments is even, and a minus sign if it is odd. This is just another way of counting the number of inversions.
- **The transposition operation.** Interchanging the rows and columns of a determinant yields its transpose, which has the same value as the original determinant: $D^T=D$. From this we infer the equivalence of the rows and columns of a determinant: any property proved for columns holds for rows as well.
- **The antisymmetry property.** A determinant changes sign when two of its columns are interchanged. Suppose first that we interchange the adjacent columns $j$ and $j+1$. Every term keeps the same factors, but the segment joining its elements from columns $j$ and $j+1$ changes slope: if it was positive it is now negative, and vice versa, while all other segments are unaffected. Hence we get the same terms in the determinant, each with its sign flipped. If we interchange two nonadjacent columns $j$ and $k$ with $m$ columns in between, this amounts to $2m+1$ adjacent column interchanges, an odd number, so the sign again flips.
- **A determinant with two identical columns vanishes.** Interchanging the two identical columns would not change $D$. But as was just proved, a column interchange always changes the sign of the determinant. Hence $D=-D$, which implies $D=0$.
- **The linear property.** If all the elements of a column $j$ can be expressed as linear combinations of two columns, i.e.
  $$a_{ij}=\lambda b_i+\mu c_i\quad(i=1,2,\ldots,n)$$
  for some numbers $\lambda$ and $\mu$, then
  $$D=\lambda D_1+\mu D_2,$$
  where $D_1$ and $D_2$ are identical to $D$ except that the $j$th column of $D_1$ consists of the elements $b_i$ and the $j$th column of $D_2$ consists of the elements $c_i$. This extends to the general case $a_{ij}=\lambda b_i+\mu c_i+\cdots+\tau f_i$.
- **Any common factor of a column can be factored out of a determinant.** Write $D_j(x_i)$ for the determinant whose $j$th column consists of the elements $x_i$, all other columns being those of $D$. If $a_{ij}=\lambda b_i$, then by the linear property $D_j(a_{ij})=D_j(\lambda b_i)=\lambda D_j(b_i)$. In particular, if a column consists entirely of zeros, the determinant vanishes, because $D_j(0)=D_j(0\cdot 1)=0\cdot D_j(1)=0$.
- **The value of a determinant does not change when the elements of one column, multiplied by $\lambda$, are added to the elements of another column.** Consider $D_j(a_{ij}+\lambda a_{ik})$ for columns $j\neq k$ and $i=1,2,\ldots,n$. By the linear property,
  $$D_j(a_{ij}+\lambda a_{ik})=D_j(a_{ij})+\lambda D_j(a_{ik}).$$
  But $D_j(a_{ik})$ has two identical columns (columns $j$ and $k$ both consist of the elements $a_{ik}$), so it vanishes, and
  $$D_j(a_{ij})+\lambda D_j(a_{ik})=D_j(a_{ij})=D.$$
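For instance, on a $2\times 2$ determinant, adding $\lambda$ times the second column to the first leaves the value unchanged:
$$\begin{vmatrix} a+\lambda b & b\\ c+\lambda d & d\end{vmatrix}=(a+\lambda b)d-b(c+\lambda d)=ad-bc=\begin{vmatrix} a & b\\ c & d\end{vmatrix}.$$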
Cofactors and Minors
The cofactor of the element $a_{ij}$ in a determinant is defined in the following way. Consider the expansion
$$D=\sum(-1)^{N(\alpha_1,\alpha_2,\ldots,\alpha_n)}\,a_{\alpha_1 1}a_{\alpha_2 2}\cdots a_{\alpha_n n},$$
take all the terms on the right-hand side that contain the element $a_{ij}$, sum them up, and factor out $a_{ij}$. What is left is the cofactor $A_{ij}$ of $a_{ij}$. Since every term contains an element from column $j$, we have
$$D=a_{1j}A_{1j}+a_{2j}A_{2j}+\cdots+a_{nj}A_{nj}=\sum_{k=1}^{n}a_{kj}A_{kj}.$$
This is called the expansion of $D$ with respect to the $j$th column. We can write a similar expansion with respect to any row as well.
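The expansion gives a natural recursive algorithm. A Python sketch of expansion with respect to the first column (naive and exponential in $n$; `det_by_cofactors` is my own name):

```python
def det_by_cofactors(a):
    """Expand D with respect to the first column:
    D = sum over k of a[k][0] * A_k0, with A_k0 = (-1)**k * M_k0 (0-based)."""
    n = len(a)
    if n == 1:
        return a[0][0]
    total = 0
    for k in range(n):
        # the minor M_k0: delete row k and column 0
        minor = [row[1:] for i, row in enumerate(a) if i != k]
        total += (-1) ** k * a[k][0] * det_by_cofactors(minor)
    return total

# agrees with the triangular-determinant result below: 2 * 3 * 7
assert det_by_cofactors([[2, 0, 0], [5, 3, 0], [1, 4, 7]]) == 42
```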
If we delete a row and a column from a matrix of order $n$, we get a matrix of order $n-1$. The determinant of this matrix is called a minor of the original matrix. If we delete the $i$th row and the $j$th column of $D$, the resulting minor is denoted $M_{ij}(D)$. We have
$$A_{ij}=(-1)^{i+j}M_{ij}.$$
To see why, consider first the element $a_{11}$: from the sum of all the terms of $D$ containing $a_{11}$, with $a_{11}$ factored out, we get exactly $M_{11}$.
Indeed, since the determinant is a sum of signed products of elements taken from different rows and columns, we can write
$$D=a_{11}\,(\text{sum of all signed products from the remaining }(n-1)\times(n-1)\text{ matrix})+(\text{terms without }a_{11}).$$
The sum of all signed products from the remaining $(n-1)\times(n-1)$ matrix is exactly the determinant of that smaller matrix, i.e. the minor $M_{11}$. So we get $A_{11}=M_{11}$.
For arbitrary $i,j$, we perform $(i-1)+(j-1)=i+j-2$ adjacent interchanges of rows and columns (each of which changes the sign) to move $a_{ij}$ to the $(1,1)$ position without disturbing the relative order of the remaining rows and columns. We thus have
$$A_{ij}=(-1)^{i+j-2}M_{ij}=(-1)^{i+j}M_{ij}.$$
A determinant of the form
$$D_n=\begin{vmatrix}
a_{11} & 0 & \cdots & 0\\
a_{21} & a_{22} & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
a_{n1} & a_{n2} & \cdots & a_{nn}
\end{vmatrix},$$
with zeros everywhere above the principal diagonal, is called triangular. Expanding with respect to the first row, we see that $D_n$ equals $a_{11}$ times a triangular determinant of order $n-1$. We again expand $D_{n-1}$ to get $D_{n-1}=a_{22}D_{n-2}$. Continuing this process, we see that
$$D_n=\prod_{k=1}^{n}a_{kk}.$$
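For example, for order $3$, expanding twice along the first row:
$$\begin{vmatrix} a_{11} & 0 & 0\\ a_{21} & a_{22} & 0\\ a_{31} & a_{32} & a_{33}\end{vmatrix}=a_{11}\begin{vmatrix} a_{22} & 0\\ a_{32} & a_{33}\end{vmatrix}=a_{11}a_{22}a_{33}.$$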
Next consider the Vandermonde determinant
$$W(x_1,\ldots,x_n)=\begin{vmatrix}
1 & 1 & \cdots & 1\\
x_1 & x_2 & \cdots & x_n\\
x_1^2 & x_2^2 & \cdots & x_n^2\\
\vdots & \vdots & & \vdots\\
x_1^{n-1} & x_2^{n-1} & \cdots & x_n^{n-1}
\end{vmatrix}.$$
We can treat $W(x_1,\ldots,x_n)$ as a polynomial of degree $n-1$ in $x_n$. The polynomial vanishes whenever $x_n$ takes any of the values $x_1,x_2,\ldots,x_{n-1}$, since the determinant would then have two identical columns. We know from this that the polynomial is divisible by the product
$$(x_n-x_1)\cdots(x_n-x_{n-1})=\prod_{k=1}^{n-1}(x_n-x_k).$$
We thus have
$$W(x_1,\ldots,x_n)=a(x_1,\ldots,x_{n-1})\prod_{k=1}^{n-1}(x_n-x_k),$$
where $a(x_1,\ldots,x_{n-1})$ is the leading coefficient. If we expand the determinant with respect to the last column, we see that the leading coefficient (the cofactor of $x_n^{n-1}$) is just $W(x_1,\ldots,x_{n-1})$, because we get the exact same kind of determinant, just without the last column and row. From this recursion we get
$$W(x_1,\ldots,x_n)=\prod_{1\le j<i\le n}(x_i-x_j).$$
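A quick numeric sanity check of the product formula in Python (a sketch; the matrix is built with rows $x_k^{i}$, $i=0,\ldots,n-1$, as in the definition above):

```python
from itertools import permutations
from math import prod

def det(a):
    """Brute-force determinant via the permutation-sum definition."""
    n = len(a)
    return sum(
        (-1) ** sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        * prod(a[p[k]][k] for k in range(n))
        for p in permutations(range(n)))

xs = [2, 3, 5, 7]
n = len(xs)
W = [[x ** i for x in xs] for i in range(n)]  # Vandermonde matrix
expected = prod(xs[i] - xs[j] for i in range(n) for j in range(i))
assert det(W) == expected  # the product of (x_i - x_j) over j < i
```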
Cramer’s Rule
Suppose the numbers $c_1,c_2,\ldots,c_n$ form a solution of a square system of $n$ equations in $n$ unknowns:
$$a_{i1}c_1+a_{i2}c_2+\cdots+a_{in}c_n=b_i\quad(i=1,2,\ldots,n).\tag{18}$$
We multiply the first of the equations (18) by the cofactor $A_{11}$ of the element $a_{11}$ in the coefficient matrix, then we multiply the second equation by $A_{21}$, the third by $A_{31}$, and so on, and finally the last equation by $A_{n1}$. Adding all the equations so obtained, we get
$$(a_{11}A_{11}+a_{21}A_{21}+\cdots+a_{n1}A_{n1})c_1+\cdots+(a_{1n}A_{11}+a_{2n}A_{21}+\cdots+a_{nn}A_{n1})c_n=b_1A_{11}+b_2A_{21}+\cdots+b_nA_{n1}.\tag{19}$$
By Theorem 1.51, the coefficient of $c_1$ in (19) equals the determinant $D$ itself. By Theorem 1.52, the coefficients of all the other $c_j$ $(j\neq 1)$ vanish. The right-hand side of (19) is the expansion, with respect to its first column, of the determinant $D_1$ obtained from $D$ by replacing its first column with the numbers $b_1,b_2,\ldots,b_n$. Therefore (19) can be written in the form
$$Dc_1=D_1\implies c_1=\frac{D_1}{D}.$$
In an analogous manner we get $c_j=\frac{D_j}{D}$, where $D_j$ is the determinant obtained from $D$ by replacing its $j$th column with the numbers $b_1,b_2,\ldots,b_n$.
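Cramer’s rule is easy to state in code. A Python sketch (exact arithmetic via `Fraction`; `det` is any determinant routine, such as the brute-force one above):

```python
from itertools import permutations
from math import prod
from fractions import Fraction

def det(a):
    n = len(a)
    return sum(
        (-1) ** sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
        * prod(a[p[k]][k] for k in range(n))
        for p in permutations(range(n)))

def cramer(a, b):
    """Solve a determinate square system: c_j = D_j / D,
    where D_j is D with its j-th column replaced by the constants b."""
    d = det(a)
    if d == 0:
        raise ValueError("D = 0: the system is not determinate")
    return [Fraction(det([row[:j] + [b[i]] + row[j + 1:]
                          for i, row in enumerate(a)]), d)
            for j in range(len(a))]

# x1 + 2*x2 = 5, 3*x1 + 4*x2 = 6  =>  x1 = -4, x2 = 9/2
assert cramer([[1, 2], [3, 4]], [5, 6]) == [Fraction(-4), Fraction(9, 2)]
```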
Minors of Arbitrary Order. Laplace’s Theorem
If we select $k<n$ rows and the same number of columns of a square matrix of order $n$, the elements appearing at the intersections of these rows and columns form a square matrix of order $k$. The determinant of this matrix is called a minor of order $k$ of the original matrix. It is denoted by
$$M=M^{i_1,i_2,\ldots,i_k}_{j_1,j_2,\ldots,j_k},$$
where $i_1,i_2,\ldots,i_k$ are the numbers of the selected rows and $j_1,j_2,\ldots,j_k$ are the numbers of the selected columns. If, on the other hand, we delete the rows and columns that make up $M$ from the original matrix, the remaining elements form a square matrix of order $n-k$, whose determinant $\bar{M}$ is called the complementary minor of $M$. For example, if $M$ is of order $1$, i.e. is just some element $a_{ij}$, then $\bar{M}=M_{ij}$.
Linear Dependence between Columns
If one of the columns of the determinant $D$ is a linear combination of the other columns, then $D=0$: we can subtract that linear combination from the column without changing the value of the determinant, and we end up with a determinant having a column consisting entirely of zeros, so $D=0$. The converse is also true: if $D=0$, then (at least) one of its columns is a linear combination of the other columns.
The rank of a matrix $A$ is the integer $r$ such that:

1. the matrix $A$ has a minor of order $r$ (called the basis minor) which does not vanish;
2. every minor of $A$ of order $r+1$ and higher vanishes.

The columns containing the basis minor are called the basis columns.

Basis minor theorem. Any column of the matrix $A$ is a linear combination of its basis columns.
Suppose we have the following relation, expressing linear dependence of the columns $A_1,\ldots,A_m$ with some coefficient, say $\lambda_m$, nonzero:
$$\sum_{k=1}^{m-1}\lambda_k A_k+\lambda_m A_m=0.$$
From here we immediately see that
$$A_m=\sum_{k=1}^{m-1}\left(-\frac{\lambda_k}{\lambda_m}\right)A_k,$$
which shows that $A_m$ is a linear combination of the columns $A_1,A_2,\ldots,A_{m-1}$.
A determinant $D$ vanishes if and only if there is linear dependence between its columns. By the transposition operation, we likewise get: a determinant $D$ vanishes if and only if there is linear dependence between its rows.
Linear Spaces
Definitions
Linear spaces generalize the operations on vectors: they step away from the concreteness of the objects (directed line segments, etc.) without changing the properties of the operations on those objects. A set $\mathbf{K}$ is called a linear (or affine) space over a field $K$ if

- given any two elements $x,y\in\mathbf{K}$, there is a rule leading to a unique element $x+y\in\mathbf{K}$, called the sum of $x$ and $y$;
- given any element $x\in\mathbf{K}$ and any number $\lambda\in K$, there is a rule leading to a unique element $\lambda x\in\mathbf{K}$, called the product of the element $x$ and the number $\lambda$;
- these two rules obey the axioms listed below.

The elements of a linear space will be called vectors, even though their nature may differ somewhat from directed line segments; the geometric intuition attached to the word will prove helpful.
Axioms For the Addition Rule
1. $x+y=y+x$ for all $x,y\in\mathbf{K}$.
2. $(x+y)+z=x+(y+z)$ for all $x,y,z\in\mathbf{K}$.
3. There exists an element $0\in\mathbf{K}$ such that $x+0=x$ for all $x\in\mathbf{K}$.
4. For every $x\in\mathbf{K}$ there exists an element $y\in\mathbf{K}$ such that $x+y=0$.
Axioms For the Multiplication Rule
1. $1\cdot x=x$ for every $x\in\mathbf{K}$.
2. $(\alpha\beta)x=\alpha(\beta x)$ for all $\alpha,\beta\in K$ and every $x\in\mathbf{K}$.
3. $(\alpha+\beta)x=\alpha x+\beta x$ for every $x\in\mathbf{K}$ and all $\alpha,\beta\in K$.
4. $\alpha(x+y)=\alpha x+\alpha y$ for all $x,y\in\mathbf{K}$ and every $\alpha\in K$.
A linear space over the field $\mathbb{R}$ of real numbers will be called real and denoted by $\mathbf{R}$. A linear space over the field $\mathbb{C}$ of complex numbers will likewise be called complex and denoted by $\mathbf{C}$. If the nature of the elements $x,y,z,\ldots$ and the rules for operating on them are specified, we call the linear space concrete. Some examples of concrete spaces:

- The space $V_3$. The elements of this space are the free vectors studied in three-dimensional analytic geometry, each characterized by a length and a direction. We also have $V_2$ for two-dimensional vectors, $V_1$ for one-dimensional vectors, etc.
- The space $K_n$. An element of this space is any ordered $n$-tuple $x=(\xi_1,\xi_2,\ldots,\xi_n)$ of numbers from $K$. The numbers $\xi_1,\xi_2,\ldots,\xi_n$ are called the components of the element $x$. The operations of addition and multiplication by a number $\lambda\in K$ are performed component by component:
  $$x+y=(\xi_1+\eta_1,\ldots,\xi_n+\eta_n),\qquad \lambda x=(\lambda\xi_1,\ldots,\lambda\xi_n).$$
  If $K$ is the field of real numbers, we write $R_n$; if $K$ is the field of complex numbers, we write $C_n$.
- The space $R(a,b)$. An element of this space is any continuous real function $x=x(t)$ defined on the interval $a\le t\le b$. Correspondingly, we have $C(a,b)$, the space of all continuous complex-valued functions on the interval $a\le t\le b$.
Linear Dependence
Let $x_1,x_2,\ldots,x_k$ be vectors of the linear space $\mathbf{K}$ over a field $K$, and let $\alpha_1,\alpha_2,\ldots,\alpha_k$ be numbers from $K$. The vector
$$y=\alpha_1 x_1+\alpha_2 x_2+\cdots+\alpha_k x_k$$
is called a linear combination of the vectors $x_1,x_2,\ldots,x_k$, and the numbers $\alpha_1,\alpha_2,\ldots,\alpha_k$ are called the coefficients of the linear combination. There may exist a linear combination of the vectors $x_1,x_2,\ldots,x_k$ which equals the zero vector even though the coefficients are not all zero, i.e.
$$\alpha_1 x_1+\alpha_2 x_2+\cdots+\alpha_k x_k=0$$
for some set of coefficients that are not all zero. In that case the vectors are called linearly dependent. If the equality holds only when every $\alpha_i=0$ $(i=1,2,\ldots,k)$, the vectors are said to be linearly independent (over $K$).
Examples
- In $V_3$, linear dependence of two vectors means that they are parallel to the same straight line. Linear dependence of three vectors means that they are parallel to the same plane. Any four vectors are linearly dependent.
- Linear dependence of the vectors $x_1=x_1(t),x_2=x_2(t),\ldots,x_k=x_k(t)$ in the space $R(a,b)$ or $C(a,b)$ means that the functions $x_1(t),x_2(t),\ldots,x_k(t)$ satisfy a relation of the form
  $$\alpha_1 x_1(t)+\alpha_2 x_2(t)+\cdots+\alpha_k x_k(t)\equiv 0,$$
  where at least one of the constants $\alpha_i$ is nonzero. For example, the functions
  $$x_1(t)=\cos^2 t,\quad x_2(t)=\sin^2 t,\quad x_3(t)=1$$
  are linearly dependent, since the relation $x_1(t)+x_2(t)-x_3(t)=0$ holds identically. The functions $1,t,t^2,\ldots,t^k$ are linearly independent: a polynomial with coefficients not all zero has only finitely many roots, so it cannot vanish identically on an interval.
Bases, Components, Dimension
A system of linearly independent vectors $e_1,e_2,\ldots,e_n$ in a linear space $\mathbf{K}$ over a field $K$ is called a basis for $\mathbf{K}$ if, given any $x\in\mathbf{K}$, there exists an expansion
$$x=\xi_1 e_1+\xi_2 e_2+\cdots+\xi_n e_n\quad(\xi_j\in K,\ j=1,2,\ldots,n),$$
i.e. if every vector $x$ can be expressed as a linear combination of the basis vectors $e_1,e_2,\ldots,e_n$. Here the uniquely defined numbers $\xi_1,\xi_2,\ldots,\xi_n$ are called the components of the vector $x$ with respect to the basis $e_1,e_2,\ldots,e_n$.

When two vectors of a linear space $\mathbf{K}$ are added, their components (with respect to a fixed basis) are added; when a vector is multiplied by a number $\lambda$, all its components are multiplied by $\lambda$. Indeed, if $x=\sum_{k=1}^{n}\xi_k e_k$ and $y=\sum_{k=1}^{n}\eta_k e_k$, then
$$x+y=\sum_{k=1}^{n}(\xi_k+\eta_k)e_k,\qquad \lambda x=\sum_{k=1}^{n}(\lambda\xi_k)e_k.$$
If in a linear space $\mathbf{K}$ we can find $n$ linearly independent vectors while every $n+1$ vectors of the space are linearly dependent, then the number $n$ is called the dimension of the space $\mathbf{K}$, and the space $\mathbf{K}$ itself is called $n$-dimensional. A linear space in which we can find an arbitrarily large number of linearly independent vectors is called infinite-dimensional.
Subspaces
Suppose that a set $\mathbf{L}$ of elements of a linear space $\mathbf{K}$ has the following properties:

- (a) if $x,y\in\mathbf{L}$, then $x+y\in\mathbf{L}$;
- (b) if $x\in\mathbf{L}$ and $\lambda$ is an element of the field $K$, then $\lambda x\in\mathbf{L}$.

Then $\mathbf{L}$ is itself a linear space, and every set $\mathbf{L}\subset\mathbf{K}$ with properties (a) and (b) is called a linear subspace (or simply subspace) of the space $\mathbf{K}$. Examples:

- The set $\{0\}$ is the smallest subspace of $\mathbf{K}$, and $\mathbf{K}$ itself is the largest possible subspace of $\mathbf{K}$. These are called the trivial subspaces of $\mathbf{K}$.
- Consider the set $\mathbf{L}$ of all vectors $(\xi_1,\xi_2,\ldots,\xi_n)$ in the space $K_n$ whose components satisfy a system of linear equations of the form
  $$\begin{aligned}
  a_{11}\xi_1+a_{12}\xi_2+\cdots+a_{1n}\xi_n&=0,\\
  &\;\;\vdots\\
  a_{k1}\xi_1+a_{k2}\xi_2+\cdots+a_{kn}\xi_n&=0,
  \end{aligned}$$
  with all $a_{ij}\in K$. Such a system is called a homogeneous linear system. This kind of system is always compatible, since it obviously has the solution $x_1=x_2=\cdots=x_n=0$. Let $c_1^{(1)},c_2^{(1)},\ldots,c_n^{(1)}$ and $c_1^{(2)},c_2^{(2)},\ldots,c_n^{(2)}$ be two solutions of this system. Then the numbers
  $$c_k=c_k^{(1)}+c_k^{(2)}\quad(k=1,2,\ldots,n)$$
  also form a solution of the system. Similarly, for every fixed $\lambda\in K$, the numbers $\lambda c_1,\lambda c_2,\ldots,\lambda c_n$ form a solution of the system whenever the numbers $c_1,c_2,\ldots,c_n$ themselves constitute a solution. Thus the set $\mathbf{L}$ is a linear space in its own right; it is called the solution space of the system.
Every linear relation connecting the vectors $x,y,\ldots,z$ in a subspace $\mathbf{L}$ is also valid in the whole space $\mathbf{K}$, and conversely. In particular, the vectors $x,y,\ldots,z\in\mathbf{L}$ are linearly dependent in the subspace $\mathbf{L}$ if and only if they are linearly dependent in the space $\mathbf{K}$. For example, if every set of $n+1$ vectors is linearly dependent in $\mathbf{K}$, the same is true in the subspace $\mathbf{L}$. It follows that $\dim(\mathbf{L})\le\dim(\mathbf{K})$ whenever $\mathbf{L}\subset\mathbf{K}$.

Given a basis $f_1,f_2,\ldots,f_l$ of a subspace $\mathbf{L}$ (of dimension $l<n$), we can always choose additional vectors $f_{l+1},\ldots,f_n$ in the whole space $\mathbf{K}$ such that the system $f_1,f_2,\ldots,f_l,f_{l+1},\ldots,f_n$ is a basis for $\mathbf{K}$.
The vectors $g_1,\ldots,g_k$ are called linearly independent over the subspace $\mathbf{L}\subset\mathbf{K}$ if the relation
$$\alpha_1 g_1+\cdots+\alpha_k g_k\in\mathbf{L}\quad(\alpha_1,\ldots,\alpha_k\in K)$$
implies $\alpha_1=\cdots=\alpha_k=0$. So, essentially, no linear combination of these vectors can "fall back" into $\mathbf{L}$ unless the coefficients are all zero. Conversely, to be *linearly dependent over $\mathbf{L}$* is to have some nontrivial way of combining the vectors $g_1,\ldots,g_k$ to get a vector in $\mathbf{L}$.

The largest possible number of vectors of the space $\mathbf{K}$ which are linearly independent over the subspace $\mathbf{L}\subset\mathbf{K}$ is called the *dimension of $\mathbf{K}$ over $\mathbf{L}$*. This number always equals $\dim(\mathbf{K})-\dim(\mathbf{L})$.
A linear space $\mathbf{L}$ is the direct sum of given subspaces $\mathbf{L}_1,\ldots,\mathbf{L}_m\subset\mathbf{L}$ if:

1. For every $x\in\mathbf{L}$ there exists an expansion
   $$x=x_1+\cdots+x_m,\quad x_1\in\mathbf{L}_1,\ldots,x_m\in\mathbf{L}_m;$$
2. This expansion is unique, i.e.
   $$x=x_1+\cdots+x_m=y_1+\cdots+y_m\implies x_1=y_1,\ldots,x_m=y_m,$$
   given $x_j,y_j\in\mathbf{L}_j$ $(j=1,\ldots,m)$.

We can express any $n$-dimensional space $\mathbf{K}_n$ as the direct sum of two subspaces $\mathbf{L},\mathbf{M}\subset\mathbf{K}_n$: if $f_1,\ldots,f_l$ is a basis of $\mathbf{L}$, complete it to a basis $f_1,\ldots,f_l,f_{l+1},\ldots,f_n$ of $\mathbf{K}_n$ and take $\mathbf{M}$ to be the subspace spanned by $f_{l+1},\ldots,f_n$.
The dimension of the sum of two subspaces is equal to the sum of their
dimensions minus the dimension of their intersection.
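In symbols,
$$\dim(\mathbf{L}_1+\mathbf{L}_2)=\dim\mathbf{L}_1+\dim\mathbf{L}_2-\dim(\mathbf{L}_1\cap\mathbf{L}_2).$$
For example, two distinct planes through the origin in three-dimensional space (dimension $2$ each) intersect in a line (dimension $1$), and together they span the whole space: $2+2-1=3$.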
Classes and quotient spaces. Let $\mathbf{K}$ be a linear space and $\mathbf{L}\subset\mathbf{K}$ a subspace. The space $\mathbf{K}$ can be partitioned into classes, where two vectors $x,y\in\mathbf{K}$ lie in the same class if $x-y\in\mathbf{L}$; the set of all such classes is denoted $\mathbf{K}/\mathbf{L}$. Each class consists of vectors differing by elements of $\mathbf{L}$.

- Addition: for classes $X,Y\in\mathbf{K}/\mathbf{L}$, choose representatives $x\in X$, $y\in Y$. Then $X+Y$ is the class containing $x+y$.
- Scalar multiplication: for $\alpha\in K$ and $X\in\mathbf{K}/\mathbf{L}$, $\alpha X$ is the class containing $\alpha x$.

These operations are well defined (independent of the choice of representatives) and make $\mathbf{K}/\mathbf{L}$ a linear space, called the factor space or quotient space of $\mathbf{K}$ with respect to $\mathbf{L}$.

In $\mathbf{K}/\mathbf{L}$, the subspace $\mathbf{L}$ is collapsed to the zero class. Vectors differing by an element of $\mathbf{L}$ are treated as equivalent, ignoring variations within $\mathbf{L}$. For example, take $\mathbf{K}=\mathbb{R}^2$ and let $\mathbf{L}$ be the $x$-axis; then vectors like $(1,0)$ and $(3,0)$ are in the same class, as their difference $(2,0)$ lies in $\mathbf{L}$. Thus $\mathbf{K}/\mathbf{L}$ can be identified with the $y$-axis, a one-dimensional space, since the $x$-coordinate (variation along $\mathbf{L}$) is ignored.

The quotient space $\mathbf{K}/\mathbf{L}$ always has dimension $n-l$, where $n=\dim(\mathbf{K})$ and $l=\dim(\mathbf{L})$.
Linear Manifolds
Let $x,y,z,\ldots$ be a system of vectors of a linear space $\mathbf{K}$. The linear manifold spanned by $x,y,z,\ldots$, denoted $L(x,y,z,\ldots)$, is the set of all (finite) linear combinations
$$\alpha x+\beta y+\gamma z+\cdots$$
with coefficients $\alpha,\beta,\gamma,\ldots\in K$. Every subspace containing the vectors $x,y,z,\ldots$ also contains all their linear combinations. Consequently, $L(x,y,z,\ldots)$ is the smallest subspace containing these vectors.
Hyperplanes
Let $\mathbf{L}$ be a subspace of a linear space $\mathbf{K}$ and let $x_0\in\mathbf{K}$ be a fixed vector which does not belong to $\mathbf{L}$. Consider the set $\mathbf{H}$ of all vectors of the form
$$x=x_0+y,$$
where $y$ ranges over the whole subspace $\mathbf{L}$. Then $\mathbf{H}$ is called a hyperplane, namely the result of shifting the subspace $\mathbf{L}$ by the vector $x_0$. Note that in general a hyperplane is not itself a linear space.
Morphisms of Linear Spaces
Let $\omega$ be a rule which assigns to every given vector $x'$ of a linear space $\mathbf{K}'$ a vector $x''$ in a linear space $\mathbf{K}''$. Then $\omega$ is called a morphism (or linear operator) from $\mathbf{K}'$ to $\mathbf{K}''$ if the following two conditions hold:

- $\omega(x'+y')=\omega(x')+\omega(y')$ for every $x',y'\in\mathbf{K}'$;
- $\omega(\alpha x')=\alpha\,\omega(x')$ for every $x'\in\mathbf{K}'$ and every $\alpha\in K$.

Different kinds of morphisms:

- A morphism $\omega$ mapping the space $\mathbf{K}'$ onto the whole space $\mathbf{K}''$ is called an epimorphism.
- A morphism $\omega$ mapping $\mathbf{K}'$ onto part (or all) of $\mathbf{K}''$ in a one-to-one fashion ($x'\neq y'\implies\omega(x')\neq\omega(y')$) is called a monomorphism.
- A morphism $\omega$ mapping $\mathbf{K}'$ onto all of $\mathbf{K}''$ in a one-to-one fashion (both an epimorphism and a monomorphism) is called an isomorphism, and the spaces $\mathbf{K}'$ and $\mathbf{K}''$ are said to be isomorphic (more exactly, $K$-isomorphic).

The usual notation for a morphism is
$$\omega:\mathbf{K}'\to\mathbf{K}''.$$
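As a small illustration, here is a Python sketch (names mine) of a morphism of $R_2$ into itself, given by a $2\times 2$ matrix, with the two defining conditions checked numerically:

```python
def omega(x):
    """A linear operator on R_2: multiplication by the matrix [[2, 1], [0, 3]]."""
    return (2 * x[0] + 1 * x[1], 3 * x[1])

def add(x, y):
    return (x[0] + y[0], x[1] + y[1])

def scale(a, x):
    return (a * x[0], a * x[1])

x, y, a = (1.0, 2.0), (-3.0, 0.5), 4.0
assert omega(add(x, y)) == add(omega(x), omega(y))  # omega(x'+y') = omega(x') + omega(y')
assert omega(scale(a, x)) == scale(a, omega(x))     # omega(a x') = a omega(x')
```

This particular $\omega$ is one-to-one and onto (its determinant is nonzero), so it is in fact an isomorphism of $R_2$ with itself.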
Systems of Linear Equations
More on the Rank of a Matrix
Given a general matrix $\|a_{ij}\|$ with $n$ rows and $k$ columns, consider any $m$ chosen rows and $m$ chosen columns. The elements appearing at the intersections of these rows and columns form a square matrix of order $m$; its determinant is called a minor of order $m$ of the matrix $A$. The rank of $A$ is the largest $m$ for which some minor of order $m$ is nonvanishing.
The General Solution of a Linear System
Suppose we have a general (i.e. nonhomogeneous) system of linear equations
$$\begin{aligned}
a_{11}x_1+a_{12}x_2+\cdots+a_{1n}x_n&=b_1,\\
&\;\;\vdots\\
a_{k1}x_1+a_{k2}x_2+\cdots+a_{kn}x_n&=b_k,
\end{aligned}$$
and the corresponding homogeneous linear system (i.e. the same system with $b_1=b_2=\cdots=b_k=0$). By a general solution of the system is meant a set of expressions
$$x_j=f_j(a_{11},\ldots,a_{kn},b_1,\ldots,b_k,q_1,\ldots,q_s)\quad(j=1,\ldots,n),$$
where every $f_j$ is a function depending on the coefficients $a_{ij}$, the constants $b_i$ and certain undetermined parameters $q_1,\ldots,q_s$, such that
1. the quantities $x_j=c_j$ $(j=1,\ldots,n)$ obtained for arbitrary fixed parameters $q_1,\ldots,q_s\in K$ constitute a solution of the system;
2. every solution of the system can be obtained in this way by suitably choosing the parameters $q_1,\ldots,q_s\in K$.

The general solution of the nonhomogeneous system can be written as $x_0+y$, where $x_0$ is any particular solution of the nonhomogeneous system and $y$ ranges over the set of all solutions of the corresponding homogeneous system; the set of all vectors $x_0+y$ is exactly the set of all solutions of the nonhomogeneous system.
Say we know that the coefficient matrix of the system has rank $r$, with its basis minor $M$ (of order $r$) in the upper left-hand corner. Keeping the first $r$ unknowns on the left, we move the terms containing the remaining unknowns to the other side:
$$a_{i1}x_1+\cdots+a_{ir}x_r=b_i-a_{i,r+1}x_{r+1}-\cdots-a_{in}x_n\quad(i=1,\ldots,r).$$
We can give the unknowns $x_{r+1},\ldots,x_n$ arbitrary values $c_{r+1},\ldots,c_n$; the right-hand sides are then fixed numbers, and we can solve the resulting square system for $x_1,\ldots,x_r$ by Cramer’s rule, since its determinant is the basis minor $M\neq 0$. We get
$$c_j=\frac{M_j\!\left(b_i-a_{i,r+1}c_{r+1}-\cdots-a_{in}c_n\right)}{M}\quad(j=1,2,\ldots,r),$$
where $M_j(\cdot)$ denotes the determinant obtained from $M$ by replacing its $j$th column with the column of right-hand sides.
Geometric Properties of the Solution Space
As we have seen, the solutions of a homogeneous linear system with a coefficient matrix of rank $r$ form a linear solution space $\mathbf{L}$. The dimension of $\mathbf{L}$ is $n-r$, and the solution space is isomorphic to the space $K_{n-r}$.

Any system of $n-r$ linearly independent solutions of a homogeneous linear system of equations (which then forms a basis for the space of all solutions) is called a fundamental system of solutions. We can construct a fundamental system of solutions by using any basis of the space $K_{n-r}$: assigning each basis vector in turn as the values of the free unknowns determines one solution of the system.
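A sketch of the construction in Python for a concrete system (it assumes, as in the discussion above, that the basis minor sits in the upper left-hand corner; the helper name is mine):

```python
from fractions import Fraction

# Homogeneous system with n = 3 unknowns and coefficient matrix of rank r = 1:
#     x1 + 2*x2 - x3 = 0
# Basis minor: the 1x1 minor |1| in the corner. Free unknowns: x2, x3.

def solve_basic(c2, c3):
    """Solve for the basic unknown x1 in terms of the free unknowns x2, x3."""
    return -(Fraction(2) * c2 - c3)

# Assign the standard basis of K_{n-r} = K_2 to the free unknowns:
fundamental = [(solve_basic(1, 0), 1, 0),   # x2 = 1, x3 = 0  ->  (-2, 1, 0)
               (solve_basic(0, 1), 0, 1)]   # x2 = 0, x3 = 1  ->  ( 1, 0, 1)

# Every solution of the system is a linear combination of these n - r = 2 vectors.
assert fundamental == [(-2, 1, 0), (1, 0, 1)]
```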
Methods for Calculating the Rank of a Matrix
Given a matrix with $n$ rows and $k$ columns, formed from vectors $x_1,x_2,\ldots,x_k$ in the $n$-dimensional space $R_n$, we can calculate the rank of the matrix using the following elementary operations (see the sketch after this list):

- Permutation of columns. The dimension of the linear manifold spanned by the vectors is not affected by the order in which the vectors are written.
- Dividing out a nonzero common factor of the elements of a column. Factoring a number $\lambda\neq 0$ out of the $j$th vector replaces the system of vectors $x_1,x_2,\ldots,\lambda x_j,\ldots,x_k$ with $x_1,x_2,\ldots,x_j,\ldots,x_k$. This does not change the linear manifold spanned by the vectors, nor its dimension.
- Adding a multiple of one column to another column.
- Deletion of a column consisting entirely of zeros. Deleting the zero vector from the system does not change the linear manifold.
- Deletion of a column which is a linear combination of the other columns.
Using these operations, we can reduce any matrix $A$ to one of two standard forms, $A_1$ or $A_2$, with nonzero numbers $\alpha_1,\alpha_2,\ldots$ on the diagonal. The rank of $A_1$ is $k$, and its basis minor is in the upper left-hand corner. The rank of $A_2$ is $m$, and its basis minor is in the first $m$ rows.
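A Python sketch of rank computation in this spirit, using only column permutations and the addition of multiples of one column to another (exact arithmetic via `Fraction`; the function name is mine):

```python
from fractions import Fraction

def rank(a):
    """Rank of a matrix a (a list of n rows with k entries each),
    computed by elementary column operations."""
    cols = [[Fraction(row[j]) for row in a] for j in range(len(a[0]))]
    r = 0                                  # number of pivot columns found
    for i in range(len(a)):                # scan the rows, looking for pivots
        pivot = next((j for j in range(r, len(cols)) if cols[j][i] != 0), None)
        if pivot is None:
            continue                       # no pivot in this row
        cols[r], cols[pivot] = cols[pivot], cols[r]   # permute columns
        piv = cols[r]
        for j in range(len(cols)):
            if j != r and cols[j][i] != 0:
                factor = cols[j][i] / piv[i]
                # add a multiple of the pivot column to clear row i
                cols[j] = [x - factor * y for x, y in zip(cols[j], piv)]
        r += 1
    return r

assert rank([[1, 2, 3],
             [2, 4, 6],
             [1, 0, 1]]) == 2   # the third column is the sum of the first two
```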