Linear Equations
Theorem 2.2.2 also gives a useful way to describe the solutions to a system $A\mathbf{x} = \mathbf{b}$ of linear equations. There is a related system $A\mathbf{x} = \mathbf{0}$, called the associated homogeneous system, obtained from the original system by replacing all the constants by zeros. Suppose $\mathbf{x}_1$ is a solution to $A\mathbf{x} = \mathbf{b}$ and $\mathbf{x}_0$ is a solution to $A\mathbf{x} = \mathbf{0}$ (that is, $A\mathbf{x}_1 = \mathbf{b}$ and $A\mathbf{x}_0 = \mathbf{0}$). Then $\mathbf{x}_1 + \mathbf{x}_0$ is another solution to $A\mathbf{x} = \mathbf{b}$. Indeed, Theorem 2.2.2 gives
$$A(\mathbf{x}_1 + \mathbf{x}_0) = A\mathbf{x}_1 + A\mathbf{x}_0 = \mathbf{b} + \mathbf{0} = \mathbf{b}$$
This observation has a useful converse.
Theorem:
Suppose $\mathbf{x}_1$ is any particular solution to the system $A\mathbf{x} = \mathbf{b}$ of linear equations. Then every solution $\mathbf{x}_2$ to $A\mathbf{x} = \mathbf{b}$ has the form
$$\mathbf{x}_2 = \mathbf{x}_0 + \mathbf{x}_1$$
for some solution $\mathbf{x}_0$ of the associated homogeneous system $A\mathbf{x} = \mathbf{0}$.
Proof:
Suppose $\mathbf{x}_2$ is also a solution to $A\mathbf{x} = \mathbf{b}$, so that $A\mathbf{x}_2 = \mathbf{b}$. Write $\mathbf{x}_0 = \mathbf{x}_2 - \mathbf{x}_1$. Then $\mathbf{x}_2 = \mathbf{x}_0 + \mathbf{x}_1$ and, using Theorem 2.2.2, we compute
$$A\mathbf{x}_0 = A(\mathbf{x}_2 - \mathbf{x}_1) = A\mathbf{x}_2 - A\mathbf{x}_1 = \mathbf{b} - \mathbf{b} = \mathbf{0}$$
Hence $\mathbf{x}_0$ is a solution to the associated homogeneous system $A\mathbf{x} = \mathbf{0}$.
Note that gaussian elimination provides one such representation.
Example:
Express every solution to the following system as the sum of a specific solution plus a solution to the associated homogeneous system.
Solution:
Gaussian elimination gives the variables in terms of two arbitrary parameters $s$ and $t$. Hence the general solution can be written as a fixed column (obtained where $s = t = 0$) plus a combination of basic columns multiplied by $s$ and $t$. The fixed column is a particular solution, and the remaining combination gives all solutions to the associated homogeneous system. (To see why this is so, carry out the gaussian elimination again but with all the constants set equal to zero.)
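The decomposition in the Theorem above is easy to check numerically. The following sketch (Python with NumPy) uses a small made-up system, not the one from the example, since that example's entries are not reproduced here; it verifies that adding a homogeneous solution to a particular solution again solves $A\mathbf{x} = \mathbf{b}$.

```python
import numpy as np

# Hypothetical system Ax = b (not the one from the example above),
# chosen only to illustrate "particular + homogeneous" solutions.
A = np.array([[1.0, 2.0, -1.0],
              [2.0, 4.0, -2.0]])   # rank 1, so the solution set involves parameters
b = np.array([3.0, 6.0])

x_particular, *_ = np.linalg.lstsq(A, b, rcond=None)  # one particular solution
x_homog = np.array([2.0, -1.0, 0.0])                  # satisfies A x = 0 (checked below)

print(np.allclose(A @ x_homog, 0))                    # True: homogeneous solution
print(np.allclose(A @ (x_particular + x_homog), b))   # True: still solves Ax = b
```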
The following useful result is included with no proof.
Theorem :
The Dot Product
Definition 2.5 is not always the easiest way to compute a matrix-vector product because it requires that the columns of $A$ be explicitly identified. There is another way to find such a product which uses the matrix $A$ as a whole with no reference to its columns, and hence is useful in practice. The method depends on the following notion.
If $(a_1, a_2, \dots, a_n)$ and $(b_1, b_2, \dots, b_n)$ are two ordered $n$-tuples, their dot product is defined to be the number
$$a_1b_1 + a_2b_2 + \cdots + a_nb_n$$
obtained by multiplying corresponding entries and adding the results.
To see how this relates to matrix products, let $A$ denote a $3 \times 4$ matrix and let $\mathbf{x}$ be a $4$-vector. Writing
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{bmatrix} \quad \text{and} \quad \mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}$$
in the notation of Section 2.1, we compute
$$A\mathbf{x} = x_1\begin{bmatrix} a_{11} \\ a_{21} \\ a_{31} \end{bmatrix} + x_2\begin{bmatrix} a_{12} \\ a_{22} \\ a_{32} \end{bmatrix} + x_3\begin{bmatrix} a_{13} \\ a_{23} \\ a_{33} \end{bmatrix} + x_4\begin{bmatrix} a_{14} \\ a_{24} \\ a_{34} \end{bmatrix} = \begin{bmatrix} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 + a_{14}x_4 \\ a_{21}x_1 + a_{22}x_2 + a_{23}x_3 + a_{24}x_4 \\ a_{31}x_1 + a_{32}x_2 + a_{33}x_3 + a_{34}x_4 \end{bmatrix}$$
From this we see that each entry of $A\mathbf{x}$ is the dot product of the corresponding row of $A$ with $\mathbf{x}$. This computation goes through in general, and we record the result in Theorem 2.2.5.
Theorem 2.2.5 Dot Product Rule
Let $A$ be an $m \times n$ matrix and let $\mathbf{x}$ be an $n$-vector. Then each entry of the vector $A\mathbf{x}$ is the dot product of the corresponding row of $A$ with $\mathbf{x}$.
This result is used extensively throughout linear algebra.
If $A$ is $m \times n$ and $\mathbf{x}$ is an $n$-vector, the computation of $A\mathbf{x}$ by the dot product rule is simpler than using Definition 2.5 because the computation can be carried out directly with no explicit reference to the columns of $A$ (as in Definition 2.5). The first entry of $A\mathbf{x}$ is the dot product of row 1 of $A$ with $\mathbf{x}$. In hand calculations this is computed by going across row one of $A$, going down the column $\mathbf{x}$, multiplying corresponding entries, and adding the results. The other entries of $A\mathbf{x}$ are computed in the same way using the other rows of $A$ with the column $\mathbf{x}$.

In general, compute entry $i$ of $A\mathbf{x}$ as follows (see the diagram):
Go across row $i$ of $A$ and down the column $\mathbf{x}$, multiply corresponding entries, and add the results.
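As a quick computational illustration of the dot product rule, the following Python/NumPy sketch (with made-up entries) computes $A\mathbf{x}$ both row by row and as the column combination of Definition 2.5, and checks that the two results agree.

```python
import numpy as np

# Illustrative matrix and vector (hypothetical values, not taken from the text).
A = np.array([[2.0, -1.0, 3.0],
              [0.0,  4.0, 1.0]])
x = np.array([1.0, 2.0, -1.0])

# Dot product rule: entry i of Ax is (row i of A) . x
Ax_by_rows = np.array([A[i, :] @ x for i in range(A.shape[0])])

# Definition 2.5: Ax is the combination x1*a1 + ... + xn*an of the columns of A
Ax_by_cols = sum(x[j] * A[:, j] for j in range(A.shape[1]))

print(Ax_by_rows, Ax_by_cols, np.allclose(Ax_by_rows, A @ x))
```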
As an illustration, we rework Example 2.2.2 using the dot product rule instead of Definition 2.5.
Example:
If $A$ and $\mathbf{x}$ are as in Example 2.2.2, compute $A\mathbf{x}$.
Solution:
The entries of $A\mathbf{x}$ are the dot products of the rows of $A$ with $\mathbf{x}$. Of course, this agrees with the outcome in Example 2.2.2.
Example:
Write the following system of linear equations in the form $A\mathbf{x} = \mathbf{b}$.
Solution:
Write $A$ for the coefficient matrix, $\mathbf{x}$ for the column of variables, and $\mathbf{b}$ for the column of constants. Then the dot product rule shows that the entries of $A\mathbf{x}$ are the left sides of the equations in the linear system. Hence the system becomes $A\mathbf{x} = \mathbf{b}$ because matrices are equal if and only if corresponding entries are equal.
Example:
If $A$ is the $m \times n$ zero matrix, then $A\mathbf{x} = \mathbf{0}$ for each $n$-vector $\mathbf{x}$.
Solution:
For each $k$, entry $k$ of $A\mathbf{x}$ is the dot product of row $k$ of $A$ with $\mathbf{x}$, and this is zero because row $k$ of $A$ consists of zeros.
The Identity Matrix
For each $n \geq 1$, the identity matrix $I_n$ is the $n \times n$ matrix with 1s on the main diagonal and zeros elsewhere. The first few identity matrices are
$$I_1 = \begin{bmatrix} 1 \end{bmatrix}, \quad I_2 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad I_3 = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, \quad \dots$$
In Example 2.2.6 we showed that $I_3\mathbf{x} = \mathbf{x}$ for each $3$-vector $\mathbf{x}$ using Definition 2.5. The following result shows that this holds in general, and is the reason for the name.
Example:
For each $n \geq 1$ we have $I_n\mathbf{x} = \mathbf{x}$ for each $n$-vector $\mathbf{x}$ in $\mathbb{R}^n$.
Solution:
We verify the case $n = 4$. Given the $4$-vector $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}$, the dot product rule gives
$$I_4\mathbf{x} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}\begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix} = \mathbf{x}$$
In general, $I_n\mathbf{x} = \mathbf{x}$ because entry $k$ of $I_n\mathbf{x}$ is the dot product of row $k$ of $I_n$ with $\mathbf{x}$, and row $k$ of $I_n$ has $1$ in position $k$ and zeros elsewhere.
Example 2.2.12:
Let $A$ be any $m \times n$ matrix with columns $\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_n$. If $\mathbf{e}_j$ denotes column $j$ of the $n \times n$ identity matrix $I_n$, then $A\mathbf{e}_j = \mathbf{a}_j$ for each $j = 1, 2, \dots, n$.
Solution:
Write $\mathbf{e}_j = \begin{bmatrix} t_1 \\ t_2 \\ \vdots \\ t_n \end{bmatrix}$ where $t_j = 1$, but $t_i = 0$ for all $i \neq j$. Then Definition 2.5 gives
$$A\mathbf{e}_j = t_1\mathbf{a}_1 + \cdots + t_j\mathbf{a}_j + \cdots + t_n\mathbf{a}_n = \mathbf{a}_j$$
Example 2.2.12 will be referred to later; for now we use it to prove:
Theorem:
Let $A$ and $B$ be $m \times n$ matrices. If $A\mathbf{x} = B\mathbf{x}$ for all $\mathbf{x}$ in $\mathbb{R}^n$, then $A = B$.
Proof:
Write $A = \begin{bmatrix} \mathbf{a}_1 & \cdots & \mathbf{a}_n \end{bmatrix}$ and $B = \begin{bmatrix} \mathbf{b}_1 & \cdots & \mathbf{b}_n \end{bmatrix}$ in terms of their columns. It is enough to show that $\mathbf{a}_j = \mathbf{b}_j$ holds for all $j$. But we are assuming that $A\mathbf{e}_j = B\mathbf{e}_j$, which gives $\mathbf{a}_j = \mathbf{b}_j$ by Example 2.2.12.
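Here is a short sketch of the key fact used in this proof: multiplying by column $j$ of the identity matrix extracts column $j$ of $A$. The matrix entries below are hypothetical.

```python
import numpy as np

# Hypothetical 2x3 matrix used only to illustrate A e_j = (column j of A).
A = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])
identity = np.eye(3)

for j in range(3):
    e_j = identity[:, j]                  # column j of the identity matrix
    print(np.allclose(A @ e_j, A[:, j]))  # True: A e_j is column j of A
```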
We have introduced matrix-vector multiplication as a new way to think
about systems of linear equations. But it has several other uses as
well. It turns out that many geometric operations can be described using
matrix multiplication, and we now investigate how this happens. As a
bonus, this description provides a geometric “picture” of a matrix by
revealing the effect on a vector when it is multiplied by $A$. This “geometric view” of matrices is a fundamental tool in understanding them.
Matrix Multiplication
In Section 2.2 matrix-vector products were introduced. If $A$ is an $m \times n$ matrix, the product $A\mathbf{x}$ was defined for any $n$-column $\mathbf{x}$ in $\mathbb{R}^n$ as follows: If $A = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix}$ where the $\mathbf{a}_j$ are the columns of $A$, and if $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$, Definition 2.5 reads
$$A\mathbf{x} = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n \qquad (2.5)$$
This was motivated as a way of describing systems of linear equations with coefficient matrix $A$. Indeed every such system has the form $A\mathbf{x} = \mathbf{b}$ where $\mathbf{b}$ is the column of constants.
In this section we extend this matrix-vector multiplication to a way of multiplying matrices in general, and then investigate matrix algebra for its own sake. While it shares several properties of ordinary arithmetic, it will soon become clear that matrix arithmetic is different in a number of ways.
Definition 2.9 Matrix Multiplication
Let $A$ be an $m \times n$ matrix and let $B$ be an $n \times k$ matrix, and write $B = \begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_k \end{bmatrix}$ where $\mathbf{b}_j$ is column $j$ of $B$ for each $j$. The product matrix $AB$ is defined by
$$AB = A\begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_k \end{bmatrix} = \begin{bmatrix} A\mathbf{b}_1 & A\mathbf{b}_2 & \cdots & A\mathbf{b}_k \end{bmatrix}$$
Thus the product matrix $AB$ is given in terms of its columns $A\mathbf{b}_1, A\mathbf{b}_2, \dots, A\mathbf{b}_k$: Column $j$ of $AB$ is the matrix-vector product $A\mathbf{b}_j$ of $A$ and the corresponding column $\mathbf{b}_j$ of $B$. Note that each such product $A\mathbf{b}_j$ makes sense by Definition 2.5 because $A$ is $m \times n$ and each $\mathbf{b}_j$ is in $\mathbb{R}^n$ (since $B$ has $n$ rows). Note also that if $B$ is a column matrix, this definition reduces to Definition 2.5 for matrix-vector multiplication.
Given matrices $A$ and $B$ as above, Definition 2.9 and the above computation give
$$A(B\mathbf{x}) = (AB)\mathbf{x}$$
for all $\mathbf{x}$ in $\mathbb{R}^k$. We record this for reference.
Theorem:
Let $A$ be an $m \times n$ matrix and let $B$ be an $n \times k$ matrix. Then the product matrix $AB$ is $m \times k$ and satisfies
$$A(B\mathbf{x}) = (AB)\mathbf{x} \quad \text{for all } \mathbf{x} \text{ in } \mathbb{R}^k$$
Here is an example of how to compute the product of two matrices using Definition 2.9.
Example:
Compute $AB$ for the given matrices $A$ and $B$.
Solution:
The columns of $B$ are $\mathbf{b}_1$ and $\mathbf{b}_2$, so Definition 2.5 gives the matrix-vector products $A\mathbf{b}_1$ and $A\mathbf{b}_2$. Hence Definition 2.9 above gives $AB = \begin{bmatrix} A\mathbf{b}_1 & A\mathbf{b}_2 \end{bmatrix}$.
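The column-by-column description of $AB$ in Definition 2.9 translates directly into code. The following NumPy sketch (with made-up matrices) builds $AB$ one column at a time and compares it with the built-in product.

```python
import numpy as np

# Hypothetical matrices; Definition 2.9 builds AB one column at a time.
A = np.array([[1.0, 0.0, 2.0],
              [3.0, -1.0, 1.0]])        # 2 x 3
B = np.array([[2.0, 1.0],
              [0.0, 4.0],
              [-1.0, 3.0]])             # 3 x 2

# Column j of AB is the matrix-vector product A @ (column j of B).
AB_by_columns = np.column_stack([A @ B[:, j] for j in range(B.shape[1])])

print(AB_by_columns)
print(np.allclose(AB_by_columns, A @ B))   # agrees with the built-in product
```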
While Definition 2.9 is important, there is another way to compute the matrix product $AB$ that gives a way to calculate each individual entry. In Section 2.2 we defined the dot product of two $n$-tuples to be the sum of the products of corresponding entries. We went on to show (Theorem 2.2.5) that if $A$ is an $m \times n$ matrix and $\mathbf{x}$ is an $n$-vector, then entry $k$ of the product $A\mathbf{x}$ is the dot product of row $k$ of $A$ with $\mathbf{x}$.
This observation was called the “dot product rule” for matrix-vector
multiplication, and the next theorem shows that it extends to matrix
multiplication in general.
Theorem 2.3.2 Dot Product Rule
Let $A$ be an $m \times n$ matrix and let $B$ be an $n \times k$ matrix. Then the $(i,j)$-entry of $AB$ is the dot product of row $i$ of $A$ with column $j$ of $B$.
Proof:
Write $B = \begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_k \end{bmatrix}$ in terms of its columns. Then $A\mathbf{b}_j$ is column $j$ of $AB$ for each $j$. Hence the $(i,j)$-entry of $AB$ is entry $i$ of $A\mathbf{b}_j$, which is the dot product of row $i$ of $A$ with $\mathbf{b}_j$. This proves the theorem.
Thus to compute the $(i,j)$-entry of $AB$, proceed as follows (see the diagram):
Go across row $i$ of $A$, and down column $j$ of $B$, multiply corresponding entries, and add the results.
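Entry by entry, the product can be computed exactly as the rule above describes. This sketch (hypothetical entries again) fills in each $(i,j)$-entry as a row-column dot product.

```python
import numpy as np

# Hypothetical matrices; the (i, j)-entry of AB is (row i of A) . (column j of B).
A = np.array([[1.0, 2.0],
              [0.0, -1.0],
              [3.0, 1.0]])              # 3 x 2
B = np.array([[4.0, 1.0, 0.0],
              [2.0, -2.0, 5.0]])        # 2 x 3

m, n = A.shape[0], B.shape[1]
AB = np.zeros((m, n))
for i in range(m):
    for j in range(n):
        AB[i, j] = A[i, :] @ B[:, j]    # dot product of row i of A with column j of B

print(np.allclose(AB, A @ B))           # True
```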

Note that this requires that the rows of $A$ be the same length as the columns of $B$. The following rule is useful for remembering this and for deciding the size of the product matrix $AB$.
Compatibility Rule
Let $A$ and $B$ denote matrices. If $A$ is $m \times n$ and $B$ is $n' \times k$, the product $AB$ can be formed if and only if $n = n'$. In this case the size of the product matrix $AB$ is $m \times k$, and we say that $AB$ is defined, or that $A$ and $B$ are compatible for multiplication.

The diagram provides a useful mnemonic for remembering this. We adopt the following convention:
Whenever a product of matrices is written, it is tacitly assumed that the sizes of the factors are such that the product is defined.
To illustrate the dot product rule, we recompute the matrix product in the Example above.
Example:
Compute $AB$ by the dot product rule, where $A$ and $B$ are the matrices of the previous Example.
Solution:
Here the sizes of $A$ and $B$ are compatible, so the product matrix $AB$ is defined. Theorem 2.3.2 gives each entry of $AB$ as the dot product of the corresponding row of $A$ with the corresponding column of $B$. Of course, this agrees with the previous Example.
Example:
Compute the $(1,3)$- and $(2,4)$-entries of $AB$ for the given matrices $A$ and $B$. Then compute $AB$.
Solution:
The $(1,3)$-entry of $AB$ is the dot product of row 1 of $A$ and column 3 of $B$, computed by multiplying corresponding entries and adding the results.

Similarly, the $(2,4)$-entry of $AB$ involves row 2 of $A$ and column 4 of $B$.

Since $A$ is $2 \times 3$ and $B$ is $3 \times 4$, the product $AB$ is $2 \times 4$.
Example 2.3.5:
For the given matrices $A$ and $B$, compute $A^2$, $AB$, $BA$, and $B^2$ when they are defined.
Solution:
Here neither $A$ nor $B$ is square, so $A^2$ and $B^2$ are not defined. However, the compatibility rule shows that both $AB$ and $BA$ can be formed, and these are square matrices of different sizes.
Unlike numerical multiplication, matrix products $AB$ and $BA$ need not be equal. In fact they need not even be the same size, as Example 2.3.5 shows. It turns out to be rare that $AB = BA$ (although it is by no means impossible), and $A$ and $B$ are said to commute when this happens.
Example:
For the given matrices $A$ and $B$, compute $A^2$, $AB$, and $BA$.
Solution:
Here $A^2 = 0$, so $A^2 = 0$ can occur even if $A \neq 0$. Next, computing $AB$ and $BA$ shows that $AB \neq BA$, even though $AB$ and $BA$ are the same size.
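Here is a small NumPy check of the same phenomena using different (hypothetical) matrices: a nonzero matrix whose square is zero, and a pair of matrices that do not commute.

```python
import numpy as np

# Hypothetical 2x2 matrices (not the ones in the example above) illustrating
# that A^2 = 0 can hold with A != 0, and that AB and BA can differ.
A = np.array([[0.0, 1.0],
              [0.0, 0.0]])
B = np.array([[1.0, 0.0],
              [0.0, 2.0]])

print(A @ A)                          # the zero matrix, although A != 0
print(A @ B)                          # [[0, 2], [0, 0]]
print(B @ A)                          # [[0, 1], [0, 0]]  -- not equal to A @ B
print(np.array_equal(A @ B, B @ A))   # False: A and B do not commute
```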
Example 2.3.7:
If $A$ is any matrix, then $IA = A$ and $AI = A$, where $I$ denotes an identity matrix of a size so that the multiplications are defined.
Solution:
These both follow from the dot product rule as the reader should verify. For a more formal proof, write $A = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix}$ where $\mathbf{a}_j$ is column $j$ of $A$. Then Definition 2.9 and the identity example above give
$$IA = I\begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix} = \begin{bmatrix} I\mathbf{a}_1 & I\mathbf{a}_2 & \cdots & I\mathbf{a}_n \end{bmatrix} = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix} = A$$
If $\mathbf{e}_j$ denotes column $j$ of $I$, then $A\mathbf{e}_j = \mathbf{a}_j$ for each $j$ by Example 2.2.12. Hence Definition 2.9 gives:
$$AI = A\begin{bmatrix} \mathbf{e}_1 & \mathbf{e}_2 & \cdots & \mathbf{e}_n \end{bmatrix} = \begin{bmatrix} A\mathbf{e}_1 & A\mathbf{e}_2 & \cdots & A\mathbf{e}_n \end{bmatrix} = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix} = A$$
The following theorem collects several results about matrix multiplication that are used everywhere in linear algebra.
Theorem 2.3.3:
Assume that $a$ is any scalar, and that $A$, $B$, and $C$ are matrices of sizes such that the indicated matrix products are defined. Then:
1. $IA = A$ and $AI = A$ where $I$ denotes an identity matrix.
2. $A(BC) = (AB)C$.
3. $A(B + C) = AB + AC$.
4. $(B + C)A = BA + CA$.
5. $a(AB) = (aA)B = A(aB)$.
6. $(AB)^T = B^T A^T$.
Proof:
Condition (1) is Example 2.3.7; we prove (2), (4), and (6) and leave (3) and (5) as exercises.
2. If $C = \begin{bmatrix} \mathbf{c}_1 & \mathbf{c}_2 & \cdots & \mathbf{c}_k \end{bmatrix}$ in terms of its columns, then $BC = \begin{bmatrix} B\mathbf{c}_1 & B\mathbf{c}_2 & \cdots & B\mathbf{c}_k \end{bmatrix}$ by Definition 2.9, so
$$A(BC) = \begin{bmatrix} A(B\mathbf{c}_1) & A(B\mathbf{c}_2) & \cdots & A(B\mathbf{c}_k) \end{bmatrix} = \begin{bmatrix} (AB)\mathbf{c}_1 & (AB)\mathbf{c}_2 & \cdots & (AB)\mathbf{c}_k \end{bmatrix} = (AB)C$$
where $A(B\mathbf{c}_j) = (AB)\mathbf{c}_j$ holds for each $j$ by the theorem above.
4. We know (Theorem 2.2.2) that $(B + C)\mathbf{x} = B\mathbf{x} + C\mathbf{x}$ holds for every column $\mathbf{x}$. If we write $A = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix}$ in terms of its columns, we get
$$(B + C)A = \begin{bmatrix} (B + C)\mathbf{a}_1 & \cdots & (B + C)\mathbf{a}_n \end{bmatrix} = \begin{bmatrix} B\mathbf{a}_1 + C\mathbf{a}_1 & \cdots & B\mathbf{a}_n + C\mathbf{a}_n \end{bmatrix} = BA + CA$$
6. As in Section 2.1, write $A = [a_{ij}]$ and $B = [b_{ij}]$, so that $A^T = [a'_{ij}]$ and $B^T = [b'_{ij}]$ where $a'_{ij} = a_{ji}$ and $b'_{ij} = b_{ji}$ for all $i$ and $j$. If $c_{ij}$ denotes the $(i,j)$-entry of $B^T A^T$, then $c_{ij}$ is the dot product of row $i$ of $B^T$ with column $j$ of $A^T$. Hence
$$c_{ij} = b'_{i1}a'_{1j} + b'_{i2}a'_{2j} + \cdots + b'_{in}a'_{nj} = b_{1i}a_{j1} + b_{2i}a_{j2} + \cdots + b_{ni}a_{jn} = a_{j1}b_{1i} + a_{j2}b_{2i} + \cdots + a_{jn}b_{ni}$$
But this is the dot product of row $j$ of $A$ with column $i$ of $B$; that is, the $(j,i)$-entry of $AB$; that is, the $(i,j)$-entry of $(AB)^T$. This proves (6).
Property 2 in Theorem 2.3.3 is called the associative law of matrix multiplication. It asserts that the equation
$$A(BC) = (AB)C$$
holds for all matrices (if the products are defined). Hence this product is the same no matter how it is formed, and so is written simply as $ABC$. This extends: The product $ABCD$ of four matrices can be formed several ways, for example $(AB)(CD)$, $[A(BC)]D$, and $A[B(CD)]$, but the associative law implies that they are all equal and so are written as $ABCD$. A similar remark applies in general: Matrix products can be written unambiguously with no parentheses.
However, a note of caution about matrix multiplication must be taken: The fact that $AB$ and $BA$ need not be equal means that the order of the factors is important in a product of matrices. For example $ABC$ and $ACB$ may not be equal.
Warning:
If the order of the factors in a product of matrices is changed, the product matrix may change (or may not be defined). Ignoring this warning is a source of many errors by students of linear algebra!
Properties 3 and 4 in Theorem 2.3.3 are called distributive laws. They assert that
$$A(B + C) = AB + AC \quad \text{and} \quad (B + C)A = BA + CA$$
hold whenever the sums and products are defined. These rules extend to more than two terms and, together with Property 5, ensure that many manipulations familiar from ordinary algebra extend to matrices. For example
$$A(2B - 3C + D - 5E) = 2AB - 3AC + AD - 5AE$$
Note again that the warning is in effect: For example $A(B - C)$ need not equal $AB - CA$. These rules make possible a lot of simplification of matrix expressions.
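These laws are easy to spot-check numerically. The sketch below generates random matrices of compatible (arbitrarily chosen) sizes and verifies the associative law, the left distributive law, and the transpose rule from Theorem 2.3.3.

```python
import numpy as np

# Numerical sanity check of the associative, distributive, and transpose laws,
# using randomly generated matrices of compatible (hypothetical) sizes.
rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))
B = rng.standard_normal((3, 4))
C = rng.standard_normal((3, 4))
D = rng.standard_normal((4, 2))

print(np.allclose(A @ (B @ D), (A @ B) @ D))          # associative law
print(np.allclose(A @ (B + C), A @ B + A @ C))        # left distributive law
print(np.allclose((A @ B).T, B.T @ A.T))              # (AB)^T = B^T A^T
```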
Matrix Inverse
Three basic operations on matrices, addition, multiplication, and subtraction, are analogs for matrices of the same operations for numbers. In this section we introduce the matrix analog of numerical division.
To begin, consider how a numerical equation $ax = b$ is solved when $a$ and $b$ are known numbers. If $a = 0$, there is no solution (unless $b = 0$). But if $a \neq 0$, we can multiply both sides by the inverse $a^{-1}$ to obtain the solution $x = a^{-1}b$. Of course multiplying by $a^{-1}$ is just dividing by $a$, and the property of $a^{-1}$ that makes this work is that $a^{-1}a = 1$. Moreover, we saw earlier that the role that $1$ plays in arithmetic is played in matrix algebra by the identity matrix $I$. This suggests the following definition.
If $A$ is a square matrix, a matrix $B$ is called an inverse of $A$ if and only if
$$AB = I \quad \text{and} \quad BA = I$$
A matrix $A$ that has an inverse is called an invertible matrix.
Note that only square matrices have inverses. Even though it is plausible that nonsquare matrices $A$ and $B$ could exist such that $AB = I_m$ and $BA = I_n$, where $A$ is $m \times n$ and $B$ is $n \times m$, we claim that this forces $n = m$. Indeed, if $m < n$ there exists a nonzero column $\mathbf{x}$ such that $A\mathbf{x} = \mathbf{0}$ (by Theorem 1.3.1), so $\mathbf{x} = I_n\mathbf{x} = (BA)\mathbf{x} = B(A\mathbf{x}) = B\mathbf{0} = \mathbf{0}$, a contradiction. Hence $m \geq n$. Similarly, the condition $AB = I_m$ implies that $n \geq m$. Hence $m = n$ so $A$ is square.
Example:
Show that $B$ is an inverse of $A$ for the given matrices $A$ and $B$.
Solution:
Compute $AB$ and $BA$ directly; each product equals $I$. Hence $AB = I = BA$, so $B$ is indeed an inverse of $A$.
Example 2.4.2:
Show that the given matrix $A$ has no inverse.
Solution:
Let $B$ denote an arbitrary matrix of the appropriate size. Computing $AB$ shows that $AB$ has a row of zeros. Hence $AB$ cannot equal $I$ for any $B$.
The argument in Example 2.4.2 shows that no zero matrix has an inverse. But Example 2.4.2 also shows that, unlike arithmetic, it is possible for a nonzero matrix to have no inverse. However, if a matrix does have an inverse, it has only one.
Theorem:
If $B$ and $C$ are both inverses of $A$, then $B = C$.
Proof:
Since $B$ and $C$ are both inverses of $A$, we have $CA = I = AB$. Hence
$$B = IB = (CA)B = C(AB) = CI = C$$
If $A$ is an invertible matrix, the (unique) inverse of $A$ is denoted $A^{-1}$. Hence $A^{-1}$ (when it exists) is a square matrix of the same size as $A$ with the property that
$$AA^{-1} = I \quad \text{and} \quad A^{-1}A = I$$
These equations characterize $A^{-1}$ in the following sense:
Inverse Criterion: If somehow a matrix $B$ can be found such that $AB = I$ and $BA = I$, then $A$ is invertible and $B$ is the inverse of $A$; in symbols, $B = A^{-1}$.
This is a way to verify that the inverse of a matrix exists. Example 2.4.3 and Example 2.4.4 offer illustrations.
Example 2.4.3
For the given matrix $A$, show that $A^3 = I$ and so find $A^{-1}$.
Solution:
Computing $A^2$ directly and then $A^3 = A^2A$ gives $A^3 = I$, as asserted. This can be written as $A^2A = I = AA^2$, so it shows that $A^2$ is the inverse of $A$. That is, $A^{-1} = A^2$.
The next example presents a useful formula for the inverse of a $2 \times 2$ matrix when it exists. To state it, we define the determinant $\det A$ and the adjugate $\operatorname{adj} A$ of the matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$ as follows:
$$\det A = ad - bc \quad \text{and} \quad \operatorname{adj} A = \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$$
Example 2.4.4:
If $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, show that $A$ has an inverse if and only if $\det A \neq 0$, and in this case
$$A^{-1} = \frac{1}{\det A}\operatorname{adj} A$$
Solution:
For convenience, write $e = \det A = ad - bc$ and $B = \operatorname{adj} A = \begin{bmatrix} d & -b \\ -c & a \end{bmatrix}$. Then $AB = eI = BA$, as the reader can verify. So if $e \neq 0$, scalar multiplication by $\frac{1}{e}$ gives
$$A\left(\tfrac{1}{e}B\right) = I = \left(\tfrac{1}{e}B\right)A$$
Hence $A$ is invertible and $A^{-1} = \tfrac{1}{e}B$. Thus it remains only to show that if $A^{-1}$ exists, then $e \neq 0$.
We prove this by showing that assuming $e = 0$ leads to a contradiction. In fact, if $e = 0$, then $AB = eI = 0$, so left multiplication by $A^{-1}$ gives $A^{-1}AB = A^{-1}0$; that is, $IB = 0$, so $B = 0$. But this implies that $a$, $b$, $c$, and $d$ are all zero, so $A = 0$, contrary to the assumption that $A^{-1}$ exists.
As an illustration, for a specific matrix $A$ with $\det A \neq 0$ this formula shows that $A$ is invertible and produces $A^{-1} = \frac{1}{\det A}\operatorname{adj} A$, as the reader is invited to verify.
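The adjugate formula of Example 2.4.4 is simple enough to implement directly. The following sketch is a minimal Python version; the test matrix is hypothetical.

```python
import numpy as np

def inverse_2x2(A):
    """Inverse of a 2x2 matrix via the adjugate formula A^{-1} = (1/det A) adj A.

    A sketch of the formula from Example 2.4.4; raises an error when det A = 0.
    """
    a, b = A[0, 0], A[0, 1]
    c, d = A[1, 0], A[1, 1]
    det = a * d - b * c
    if det == 0:
        raise ValueError("det A = 0, so A is not invertible")
    adj = np.array([[d, -b],
                    [-c, a]])
    return adj / det

# Hypothetical example matrix
A = np.array([[2.0, 3.0],
              [1.0, 4.0]])
print(inverse_2x2(A))
print(np.allclose(A @ inverse_2x2(A), np.eye(2)))   # True: A A^{-1} = I
```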
Inverse and Linear systems
Matrix inverses can be used to solve certain systems of linear equations. Recall that a system of linear equations can be written as a single matrix equation
$$A\mathbf{x} = \mathbf{b}$$
where $A$ and $\mathbf{b}$ are known and $\mathbf{x}$ is to be determined. If $A$ is invertible, we multiply each side of the equation on the left by $A^{-1}$ to get
$$A^{-1}A\mathbf{x} = A^{-1}\mathbf{b}, \qquad I\mathbf{x} = A^{-1}\mathbf{b}, \qquad \mathbf{x} = A^{-1}\mathbf{b}$$
This gives the solution to the system of equations (the reader should verify that $\mathbf{x} = A^{-1}\mathbf{b}$ really does satisfy $A\mathbf{x} = \mathbf{b}$). Furthermore, the argument shows that if $\mathbf{y}$ is any solution, then necessarily $\mathbf{y} = A^{-1}\mathbf{b}$, so the solution is unique. Of course the technique works only when the coefficient matrix $A$ has an inverse. This proves Theorem 2.4.2.
Theorem 2.4.2
If the coefficient matrix $A$ of the system $A\mathbf{x} = \mathbf{b}$ is invertible, the system has the unique solution
$$\mathbf{x} = A^{-1}\mathbf{b}$$
Example:
Use Example 2.4.4 to solve the given system of two equations in the variables $x$ and $y$.
Solution:
In matrix form this is $A\mathbf{x} = \mathbf{b}$ where $A$ is the coefficient matrix, $\mathbf{x} = \begin{bmatrix} x \\ y \end{bmatrix}$, and $\mathbf{b}$ is the column of constants. Then $\det A \neq 0$, so $A$ is invertible and $A^{-1} = \frac{1}{\det A}\operatorname{adj} A$ by Example 2.4.4. Thus Theorem 2.4.2 gives
$$\mathbf{x} = A^{-1}\mathbf{b}$$
and the solution is read off as the values of $x$ and $y$.
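Computationally, Theorem 2.4.2 reads as follows; the system below is made up, not the one in the example. Note that in numerical practice one normally calls a solver rather than forming $A^{-1}$ explicitly.

```python
import numpy as np

# Hypothetical 2x2 system (not the one in the example); x = A^{-1} b as in Theorem 2.4.2.
A = np.array([[2.0, 3.0],
              [1.0, 4.0]])
b = np.array([7.0, 6.0])

x = np.linalg.inv(A) @ b      # x = A^{-1} b
print(x)
print(np.allclose(A @ x, b))  # True: x really does satisfy Ax = b

# In practice np.linalg.solve(A, b) is preferred; it avoids forming A^{-1} explicitly.
print(np.allclose(np.linalg.solve(A, b), x))
```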
An inversion method
If a matrix $A$ is $n \times n$ and invertible, it is desirable to have an efficient technique for finding the inverse.
Matrix Inversion Algorithm
If $A$ is an invertible (square) matrix, there exists a sequence of elementary row operations that carry $A$ to the identity matrix $I$ of the same size, and the same sequence carries $I$ to $A^{-1}$. In other words, row reduce the double matrix
$$\begin{bmatrix} A & I \end{bmatrix} \to \begin{bmatrix} I & A^{-1} \end{bmatrix}$$
where the row operations on $A$ and $I$ are carried out simultaneously.
Example:
Use the inversion algorithm to find the inverse of the given matrix $A$.
Solution:
Apply elementary row operations to the double matrix $\begin{bmatrix} A & I \end{bmatrix}$ so as to carry $A$ to $I$. First interchange rows 1 and 2. Next subtract a suitable multiple of row 1 from row 2, and subtract row 1 from row 3. Continue to reduced row-echelon form. Hence $A^{-1}$ is read off from the right-hand block, as is readily verified.
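The inversion algorithm is straightforward to code. The sketch below row reduces the double matrix $[A \; I]$ with partial pivoting; it is a bare-bones illustration under the assumption that $A$ is square and invertible, and the test matrix is hypothetical.

```python
import numpy as np

def invert_by_row_reduction(A):
    """Row reduce the double matrix [A | I] to [I | A^{-1}] (matrix inversion algorithm).

    A bare-bones sketch with partial pivoting; assumes A is square and invertible.
    """
    A = A.astype(float)
    n = A.shape[0]
    aug = np.hstack([A, np.eye(n)])                      # the double matrix [A  I]
    for col in range(n):
        pivot = col + np.argmax(np.abs(aug[col:, col]))  # choose a nonzero pivot
        if np.isclose(aug[pivot, col], 0.0):
            raise ValueError("matrix is not invertible")
        aug[[col, pivot]] = aug[[pivot, col]]            # interchange rows
        aug[col] /= aug[col, col]                        # make the pivot equal to 1
        for row in range(n):                             # clear the rest of the column
            if row != col:
                aug[row] -= aug[row, col] * aug[col]
    return aug[:, n:]                                    # right-hand block is A^{-1}

# Hypothetical test matrix
A = np.array([[0.0, 1.0, 1.0],
              [1.0, 2.0, 3.0],
              [1.0, 1.0, 1.0]])
print(np.allclose(A @ invert_by_row_reduction(A), np.eye(3)))   # True
```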
Given any $n \times n$ matrix $A$, Theorem 1.2.1 shows that $A$ can be carried by elementary row operations to a matrix $R$ in reduced row-echelon form. If $R = I$, the matrix $A$ is invertible (this will be proved in the next section), so the algorithm produces $A^{-1}$. If $R \neq I$, then $R$ has a row of zeros (it is square), so no system of linear equations $A\mathbf{x} = \mathbf{b}$ can have a unique solution. But then $A$ is not invertible by Theorem 2.4.2. Hence, the algorithm is effective in the sense conveyed in Theorem 2.4.3.
Theorem 2.4.3
If $A$ is an $n \times n$ matrix, either $A$ can be reduced to $I$ by elementary row operations or it cannot. In the first case, the algorithm produces $A^{-1}$; in the second case, $A^{-1}$ does not exist.
Properties of inverses
The following properties of an invertible matrix are used everywhere.
Example 2.4.7: Cancellation Laws
Let $A$ be an invertible matrix. Show that:
1. If $AB = AC$, then $B = C$.
2. If $BA = CA$, then $B = C$.
Solution:
Given the equation $AB = AC$, left multiply both sides by $A^{-1}$ to obtain $A^{-1}AB = A^{-1}AC$. Thus $IB = IC$, that is $B = C$. This proves (1) and the proof of (2) is left to the reader.
Properties (1) and (2) in Example 2.4.7 are described by saying that an invertible matrix can be "left cancelled" and "right cancelled", respectively. Note however that "mixed" cancellation does not hold in general: If $A$ is invertible and $AB = CA$, then $B$ and $C$ may not be equal, even if both are $2 \times 2$. Here is a specific example:
$$A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & -1 \\ 0 & 0 \end{bmatrix}$$
Here $A$ is invertible and $AB = CA = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$, but $B \neq C$.
Sometimes the inverse of a matrix is given by a formula. Example 2.4.4 is one illustration; Example 2.4.8 and Example 2.4.9 provide two more. The idea is the Inverse Criterion: If a matrix $B$ can be found such that $AB = I = BA$, then $A$ is invertible and $A^{-1} = B$.
Theorem 2.4.4
All the following matrices are square matrices of the same size.
1. $I$ is invertible and $I^{-1} = I$.
2. If $A$ is invertible, so is $A^{-1}$, and $(A^{-1})^{-1} = A$.
3. If $A$ and $B$ are invertible, so is $AB$, and $(AB)^{-1} = B^{-1}A^{-1}$.
4. If $A_1, A_2, \dots, A_k$ are all invertible, so is their product $A_1A_2 \cdots A_k$, and
$$(A_1A_2 \cdots A_k)^{-1} = A_k^{-1} \cdots A_2^{-1}A_1^{-1}$$
5. If $A$ is invertible, so is $A^k$ for any $k \geq 1$, and $(A^k)^{-1} = (A^{-1})^k$.
6. If $A$ is invertible and $a \neq 0$ is a number, then $aA$ is invertible and $(aA)^{-1} = \frac{1}{a}A^{-1}$.
7. If $A$ is invertible, so is its transpose $A^T$, and $(A^T)^{-1} = (A^{-1})^T$.
Proof:
1. This is an immediate consequence of the fact that $II = I$.
2. The equations $AA^{-1} = I = A^{-1}A$ show that $A$ is the inverse of $A^{-1}$; in symbols, $(A^{-1})^{-1} = A$.
3. This is Example 2.4.9.
4. Use induction on $k$. If $k = 1$, there is nothing to prove, and if $k = 2$, the result is property 3. If $k > 2$, assume inductively that $(A_1A_2 \cdots A_{k-1})^{-1} = A_{k-1}^{-1} \cdots A_2^{-1}A_1^{-1}$. We apply this fact together with property 3 as follows:
$$[A_1A_2 \cdots A_{k-1}A_k]^{-1} = [(A_1A_2 \cdots A_{k-1})A_k]^{-1} = A_k^{-1}(A_1A_2 \cdots A_{k-1})^{-1} = A_k^{-1}\left(A_{k-1}^{-1} \cdots A_2^{-1}A_1^{-1}\right)$$
So the proof by induction is complete.
5. This is property 4 with $A_1 = A_2 = \cdots = A_k = A$.
6. The reader is invited to verify it.
7. This is Example 2.4.8.
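The order reversal in properties 3 and 7 can be checked numerically; the matrices below are hypothetical.

```python
import numpy as np

# Numerical check of properties 3 and 7 of Theorem 2.4.4 on hypothetical matrices.
A = np.array([[1.0, 2.0],
              [3.0, 5.0]])     # det = -1, invertible
B = np.array([[2.0, 1.0],
              [1.0, 1.0]])     # det = 1, invertible

inv = np.linalg.inv
print(np.allclose(inv(A @ B), inv(B) @ inv(A)))   # (AB)^{-1} = B^{-1} A^{-1}
print(np.allclose(inv(A.T), inv(A).T))            # (A^T)^{-1} = (A^{-1})^T
```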
The reversal of the order of the inverses in properties 3 and 4 of Theorem 2.4.4 is a consequence of the fact that matrix multiplication is not commutative. Another manifestation of this comes when matrix equations are dealt with. If a matrix equation $B = C$ is given, it can be left-multiplied by a matrix $A$ to yield $AB = AC$. Similarly, right-multiplication gives $BA = CA$. However, we cannot mix the two: If $B = C$, it need not be the case that $AB = CA$, even if $A$ is invertible. For example, if $A = \begin{bmatrix} 1 & 1 \\ 0 & 1 \end{bmatrix}$ and $B = C = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$, then $AB = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$ while $CA = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}$.
Part 7 of Theorem 2.4.4 together with the fact that $(A^T)^T = A$ gives
Corollary 2.4.1
A square matrix $A$ is invertible if and only if $A^T$ is invertible.