Matrix Addition, Scalar Multiplication, and Transposition
A rectangular array of numbers is called a matrix (the plural is matrices), and the numbers are called the entries of the matrix. Matrices are usually denoted by uppercase letters: $A$, $B$, $C$, and so on.
Matrices come in various shapes depending on the number of rows and columns. In general, a matrix with $m$ rows and $n$ columns is referred to as an $m \times n$ matrix or as having size $m \times n$. A matrix of size $1 \times n$ is called a row matrix, whereas one of size $m \times 1$ is called a column matrix. Matrices of size $n \times n$ for some $n$ are called square matrices.
Each entry of a matrix is identified by the row and column in which it lies. The rows are numbered from the top down, and the columns are numbered from left to right. Then the $(i, j)$-entry of a matrix is the number lying simultaneously in row $i$ and column $j$.
A special notation is commonly used for the entries of a matrix. If $A$ is an $m \times n$ matrix, and if the $(i, j)$-entry of $A$ is denoted as $a_{ij}$, then $A$ is displayed as follows:
$$A = \begin{bmatrix} a_{11} & a_{12} & \cdots & a_{1n} \\ a_{21} & a_{22} & \cdots & a_{2n} \\ \vdots & \vdots & & \vdots \\ a_{m1} & a_{m2} & \cdots & a_{mn} \end{bmatrix}$$
This is usually denoted simply as $A = [a_{ij}]$. Thus $a_{ij}$ is the entry in row $i$ and column $j$ of $A$. For example, a $3 \times 4$ matrix in this notation is written
$$A = \begin{bmatrix} a_{11} & a_{12} & a_{13} & a_{14} \\ a_{21} & a_{22} & a_{23} & a_{24} \\ a_{31} & a_{32} & a_{33} & a_{34} \end{bmatrix}$$
It is worth pointing out a convention regarding rows and columns: Rows are mentioned before columns. For example:
- If a matrix has size $m \times n$, it has $m$ rows and $n$ columns.
- If we speak of the $(i, j)$-entry of a matrix, it lies in row $i$ and column $j$.
- If an entry is denoted $a_{ij}$, the first subscript $i$ refers to the row and the second subscript $j$ refers to the column in which $a_{ij}$ lies.
Two points $P(x_1, y_1)$ and $Q(x_2, y_2)$ in the plane are equal if and only if they have the same coordinates, that is $x_1 = x_2$ and $y_1 = y_2$. Similarly, two matrices $A$ and $B$ are called equal (written $A = B$) if and only if:
- They have the same size.
- Corresponding entries are equal.
If the entries of $A$ and $B$ are written in the form $A = [a_{ij}]$, $B = [b_{ij}]$, described earlier, then the second condition takes the following form:
$$[a_{ij}] = [b_{ij}] \quad \text{means} \quad a_{ij} = b_{ij} \ \text{for all } i \text{ and } j$$
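As a quick illustration (not part of the original text), the two conditions for equality can be checked directly in code; the NumPy matrices below are hypothetical examples.

```python
import numpy as np

def matrices_equal(A, B):
    # Two matrices are equal iff they have the same size
    # and corresponding entries are equal.
    return A.shape == B.shape and bool(np.all(A == B))

A = np.array([[1, 2], [3, 4]])        # hypothetical 2 x 2 matrix
B = np.array([[1, 2], [3, 4]])        # same size, same entries as A
C = np.array([[1, 2, 0], [3, 4, 0]])  # 2 x 3, so it can never equal A

print(matrices_equal(A, B))  # True
print(matrices_equal(A, C))  # False: different sizes
```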
Example 2.1.1:
Given a $2 \times 2$ matrix $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$, a $2 \times 3$ matrix $B$, and a $2 \times 2$ matrix $C$ whose entries are given numbers,
discuss the possibility that $A = B$, $B = C$, $A = C$.
Solution:
$A = B$ is impossible because $A$ and $B$ are of different sizes: $A$ is $2 \times 2$ whereas $B$ is $2 \times 3$. Similarly, $B = C$ is impossible. But $A = C$ is possible provided that corresponding entries are equal:
$A = C$ means that $a$, $b$, $c$, and $d$ equal the corresponding entries of $C$.
Matrix Addition
If $A$ and $B$ are matrices of the same size, their sum $A + B$ is the matrix formed by adding corresponding entries. If $A = [a_{ij}]$ and $B = [b_{ij}]$, this takes the form
$$A + B = [a_{ij} + b_{ij}]$$
Note that addition is not defined for matrices of different sizes.
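For concreteness, here is a minimal sketch of entrywise addition using NumPy; the two matrices are hypothetical examples, and any two matrices of the same size would do.

```python
import numpy as np

A = np.array([[1, 2, -1],
              [0, 5,  6]])   # hypothetical 2 x 3 matrix
B = np.array([[2, 0,  4],
              [1, 1, -3]])   # another 2 x 3 matrix

print(A + B)   # the entrywise sum, again a 2 x 3 matrix
# Note: strict matrix addition requires equal sizes; NumPy would instead
# broadcast or raise an error if the shapes were incompatible.
```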
Example 2.1.2:
If $A$ and $B$ are two matrices of the same size,
compute $A + B$.
Solution:
Example 2.1.3:
Find $a$, $b$, and $c$ if the given matrix equation holds.
Solution:
Add the matrices on the left side and compare entries with the matrix on the right.
Because corresponding entries must be equal, this gives three equations relating $a$, $b$, and $c$; solving these yields the values of $a$, $b$, and $c$.
If $A$, $B$, and $C$ are any matrices of the same size, then
$$A + B = B + A \qquad \text{(commutative law)}$$
$$A + (B + C) = (A + B) + C \qquad \text{(associative law)}$$
In fact, if $A = [a_{ij}]$ and $B = [b_{ij}]$, then the $(i, j)$-entries of $A + B$ and $B + A$ are, respectively, $a_{ij} + b_{ij}$ and $b_{ij} + a_{ij}$. Since these are equal for all $i$ and $j$, we get
$$A + B = B + A$$
The associative law is verified similarly.
The $m \times n$ matrix in which every entry is zero is called the zero matrix and is denoted as $0$ (or $0_{mn}$ if it is important to emphasize the size). Hence,
$$0 + X = X$$
holds for all $m \times n$ matrices $X$. The negative of an $m \times n$ matrix $A$ (written $-A$) is defined to be the $m \times n$ matrix obtained by multiplying each entry of $A$ by $-1$. If $A = [a_{ij}]$, this becomes $-A = [-a_{ij}]$. Hence,
$$A + (-A) = 0$$
holds for all matrices $A$ where, of course, $0$ is the zero matrix of the same size as $A$.
A closely related notion is that of subtracting matrices. If $A$ and $B$ are two $m \times n$ matrices, their difference $A - B$ is defined by
$$A - B = A + (-B)$$
Note that if $A = [a_{ij}]$ and $B = [b_{ij}]$, then
$$A - B = [a_{ij}] + [-b_{ij}] = [a_{ij} - b_{ij}]$$
is the matrix formed by subtracting corresponding entries.
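A short sketch of the negative and the difference, again with hypothetical NumPy matrices:

```python
import numpy as np

A = np.array([[3, -1], [2, 0]])   # hypothetical 2 x 2 matrices
B = np.array([[1,  1], [4, 2]])

print(-A)                                # negative: every entry times -1
print(A - B)                             # subtract corresponding entries
print(np.array_equal(A - B, A + (-B)))   # True: A - B = A + (-B)
```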
Example 2.1.4:
Solution:
Example 2.1.5: Solve
$$A + X = B$$
where $X$ is a matrix and $A$ and $B$ are given matrices of the same size.
We solve a numerical equation $a + x = b$ by subtracting the number $a$ from both sides to obtain $x = b - a$. This also works for matrices. To solve $A + X = B$,
simply subtract the matrix $A$
from both sides to get
$$X = B - A$$
The reader should verify that this matrix does indeed satisfy the original equation.
The solution in Example 2.1.5 solves the single matrix equation $A + X = B$ directly via matrix subtraction: $X = B - A$. This ability to work with matrices as entities lies at the heart of matrix algebra.
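The same one-line solution can be mirrored numerically; the matrices below are hypothetical stand-ins, not the ones from the example.

```python
import numpy as np

A = np.array([[1, 2], [-1, 3]])    # hypothetical given matrices
B = np.array([[2, 0], [ 5, 1]])

X = B - A                          # solve A + X = B by subtracting A
print(X)
print(np.array_equal(A + X, B))    # True: X satisfies the original equation
```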
It is important to note that the sizes of matrices involved in some calculations are often determined by the context. For example, if the sum $A + B$ is known to equal a particular $2 \times 2$ matrix,
then $A$ and $B$ must be the same size (so that $A + B$ makes sense), and that size must be $2 \times 2$ (so that the sum is $2 \times 2$). For simplicity we shall often omit reference to such facts when they are clear from the context.
Scalar Multiplication
In gaussian elimination, multiplying a row of a matrix by a number $k$ means multiplying every entry of that row by $k$.
More generally, if $A$ is any matrix and $k$ is any number, the scalar multiple $kA$ is the matrix obtained from $A$ by multiplying each entry of $A$ by $k$. If $A = [a_{ij}]$, this means $kA = [ka_{ij}]$.
The term scalar arises here because the set of numbers from which the entries are drawn is usually referred to as the set of scalars. We have been using real numbers as scalars, but we could equally well have been using complex numbers.
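A minimal sketch of a scalar multiple, with a hypothetical matrix and scalar:

```python
import numpy as np

k = 3                                # a scalar
A = np.array([[1, -2], [0, 4]])      # hypothetical matrix

print(k * A)    # every entry of A multiplied by k
print(0 * A)    # the zero matrix of the same size as A
print(-1 * A)   # the negative -A
```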
Example 2.1.6:
If $A$ and $B$ are matrices of the same size, compute combinations such as $5A$, $\frac{1}{2}B$, and $3A - 2B$.
Solution:
If $A$ is any matrix, note that $kA$ is the same size as $A$ for all scalars $k$. We also have
$$k0 = 0 \quad \text{and} \quad 0A = 0$$
because the zero matrix has every entry zero. In other words, $kA = 0$ if either $k = 0$ or $A = 0$. The converse of this statement is also true, as Example 2.1.7 shows.
Example 2.1.7:
If $kA = 0$, show that either $k = 0$ or $A = 0$.
Solution:
Write $A = [a_{ij}]$ so that $kA = 0$ means $ka_{ij} = 0$ for all $i$ and $j$. If $k = 0$, there is nothing to do. If $k \neq 0$, then $ka_{ij} = 0$ implies that $a_{ij} = 0$ for all $i$ and $j$; that is, $A = 0$.
Theorem 2.1.1:
Let $A$, $B$, and $C$ denote arbitrary $m \times n$ matrices where $m$ and $n$ are fixed. Let $k$ and $p$ denote arbitrary real numbers. Then
- $A + B = B + A$.
- $A + (B + C) = (A + B) + C$.
- There is an $m \times n$ matrix $0$, such that $0 + A = A$ for each $A$.
- For each $A$ there is an $m \times n$ matrix, $-A$, such that $A + (-A) = 0$.
- $k(A + B) = kA + kB$.
- $(k + p)A = kA + pA$.
- $(kp)A = k(pA)$.
- $1A = A$.
Proof:
Properties 1–4 were given previously. To check Property 5, let $A = [a_{ij}]$ and $B = [b_{ij}]$ denote matrices of the same size. Then $A + B = [a_{ij} + b_{ij}]$, as before, so the $(i, j)$-entry of $k(A + B)$ is
$$k(a_{ij} + b_{ij}) = ka_{ij} + kb_{ij}$$
But this is just the $(i, j)$-entry of $kA + kB$, and it follows that $k(A + B) = kA + kB$. The other properties can be similarly verified; the details are left to the reader.
The Properties in Theorem 2.1.1 enable us to do calculations with matrices in much the same way that
numerical calculations are carried out. To begin, Property 2 implies that the sum
$$(A + B) + C = A + (B + C)$$
is the same no matter how it is formed and so is written as $A + B + C$. Similarly, the sum
$$A + B + C + D$$
is independent of how it is formed; for example, it equals both $(A + B) + (C + D)$ and $A + [B + (C + D)]$. Furthermore, Property 1 ensures that, for example,
$$B + D + A + C = A + B + C + D$$
In other words, the order in which the matrices are added does not matter. A similar remark applies to sums of five (or more) matrices.
Properties 5 and 6 in Theorem 2.1.1 are called distributive laws for scalar multiplication, and they extend to sums of more than two terms. For example,
$$k(A + B - C) = kA + kB - kC \quad \text{and} \quad (k + p - q)A = kA + pA - qA$$
Similar observations hold for more than three summands. These facts, together with properties 7 and 8, enable us to simplify expressions by collecting like terms, expanding, and taking common factors in exactly the same way that algebraic expressions involving variables and real numbers are manipulated. The following example illustrates these techniques.
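These laws are easy to spot-check numerically; the sketch below verifies the two distributive laws on random matrices (the identities, not the particular random values, are the point).

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 3))   # random matrices of the same size
B = rng.standard_normal((2, 3))
k, p = 2.5, -1.5                  # arbitrary scalars

print(np.allclose(k * (A + B), k * A + k * B))   # Property 5
print(np.allclose((k + p) * A, k * A + p * A))   # Property 6
```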
Example 2.1.8:
Simplify the given expression, where $A$, $B$, and $C$ are all matrices of the same size.
Solution:
The reduction proceeds as though $A$, $B$, and $C$ were variables.
Transpose of a Matrix
Many results about a matrix $A$ involve the rows of $A$, and the corresponding result for columns is derived in an analogous way, essentially by replacing the word row by the word column throughout. The following definition is made with such applications in mind.
If $A$ is an $m \times n$ matrix, the transpose of $A$, written $A^T$, is the $n \times m$ matrix whose rows are just the columns of $A$ in the same order.
In other words, the first row of $A^T$ is the first column of $A$ (that is, it consists of the entries of column 1 in order). Similarly, the second row of $A^T$ is the second column of $A$, and so on.
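A quick sketch of transposition with a hypothetical matrix; the rows of the transpose are the columns of the original, in the same order.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6]])   # hypothetical 2 x 3 matrix

print(A.T)                   # its transpose, a 3 x 2 matrix
print(A.T[0])                # first row of A^T = first column of A
print(A.T[1])                # second row of A^T = second column of A
```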
Example 2.1.9:
Write down the transpose of each of the following matrices.
Solution:
If $A = [a_{ij}]$ is a matrix, write $A^T = [b_{ij}]$. Then $b_{ij}$ is the $j$th element of the $i$th row of $A^T$ and so is the $j$th element of the $i$th column of $A$. This means $b_{ij} = a_{ji}$, so the definition of $A^T$ can be stated as follows:
$$\text{If } A = [a_{ij}], \text{ then } A^T = [a_{ji}]. \tag{2.1}$$
This is useful in verifying the following properties of transposition.
Theorem 2.1.2:
Let $A$ and $B$ denote matrices of the same size, and let $k$ denote a scalar.
- If $A$ is an $m \times n$ matrix, then $A^T$ is an $n \times m$ matrix.
- $(A^T)^T = A$.
- $(kA)^T = kA^T$.
- $(A + B)^T = A^T + B^T$.
Proof:
Property 1 is part of the definition of $A^T$, and Property 2 follows from (2.1). As to Property 3: If $A = [a_{ij}]$, then $kA = [ka_{ij}]$, so (2.1) gives
$$(kA)^T = [ka_{ji}] = k[a_{ji}] = kA^T$$
Finally, if $B = [b_{ij}]$, then $A + B = [c_{ij}]$ where $c_{ij} = a_{ij} + b_{ij}$. Then (2.1) gives Property 4:
$$(A + B)^T = [c_{ij}]^T = [c_{ji}] = [a_{ji} + b_{ji}] = [a_{ji}] + [b_{ji}] = A^T + B^T$$
There is another useful way to think of transposition. If $A = [a_{ij}]$ is an $n \times n$ matrix, the elements $a_{11}, a_{22}, a_{33}, \dots, a_{nn}$ are called the main diagonal of $A$. Hence the main diagonal extends down and to the right from the upper left corner of the matrix $A$.
Thus forming the transpose of a matrix $A$ can be viewed as “flipping” $A$ about its main diagonal, or as “rotating” $A$ through $180^\circ$ about the line containing the main diagonal. This makes Property 2 in Theorem 2.1.2 transparent.
Example 2.1.10:
Solve for the matrix $A$ in the given equation involving $A^T$.
Solution:
Using Theorem 2.1.2, expand the transpose on the left side of the equation. Hence the equation becomes one involving $A$ itself (with no transposes), and $A$ can then be isolated by subtracting matrices and multiplying by a suitable scalar on both sides, exactly as in Example 2.1.5.
Note that Example 2.1.10 can also be solved by first transposing both sides, then solving for $A^T$, and finally transposing again to obtain $A = (A^T)^T$. The reader should do this.
One of the matrices in Example 2.1.9 has the property that it equals its transpose. Such matrices are important; a matrix $A$ is called symmetric if $A = A^T$. A symmetric matrix $A$ is necessarily square (if $A$ is $m \times n$, then $A^T$ is $n \times m$, so $A = A^T$ forces $n = m$). The name comes from the fact that these matrices exhibit a symmetry about the main diagonal. That is, entries that are directly across the main diagonal from each other are equal.
For example, $\begin{bmatrix} a & b & c \\ b' & d & e \\ c' & e' & f \end{bmatrix}$ is symmetric when $b' = b$, $c' = c$, and $e' = e$.
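A symmetry check is a one-liner; the matrices below are hypothetical examples.

```python
import numpy as np

def is_symmetric(A):
    # A matrix is symmetric iff it is square and equals its transpose.
    return A.shape[0] == A.shape[1] and np.array_equal(A, A.T)

S = np.array([[1, 2, 3],
              [2, 5, 4],
              [3, 4, 9]])    # hypothetical symmetric matrix

print(is_symmetric(S))                           # True
print(is_symmetric(np.array([[1, 2], [3, 4]])))  # False: entries across the diagonal differ
```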
Example 2.1.11:
If $A$ and $B$ are symmetric $n \times n$ matrices, show that $A + B$ is symmetric.
Solution:
We have $A^T = A$ and $B^T = B$, so, by Theorem 2.1.2, we have $(A + B)^T = A^T + B^T = A + B$. Hence $A + B$ is symmetric.
Example 2.1.12:
Suppose a square matrix $A$ satisfies $A = 2A^T$. Show that necessarily $A = 0$.
Solution:
If we iterate the given equation, Theorem 2.1.2 gives
$$A = 2A^T = 2\left(2A^T\right)^T = 2\left[2\left(A^T\right)^T\right] = 4A$$
Subtracting $A$ from both sides gives $3A = 0$, so $A = \tfrac{1}{3}(3A) = \tfrac{1}{3}(0) = 0$.
Matrix-Vector Multiplication
Up to now we have used matrices to solve systems of linear equations by manipulating the rows of the augmented matrix. In this section we introduce a different way of describing linear systems that makes more use of the coefficient matrix of the system and leads to a useful way of “multiplying” matrices.
Vectors
It is a well-known fact in analytic geometry that two points in the plane with coordinates $(a_1, a_2)$ and $(b_1, b_2)$ are equal if and only if $a_1 = b_1$ and $a_2 = b_2$. Moreover, a similar condition applies to points $(a_1, a_2, a_3)$ in space. We extend this idea as follows.
An ordered sequence $(a_1, a_2, \dots, a_n)$ of real numbers is called an ordered $n$-tuple. The word “ordered” here reflects our insistence that two ordered $n$-tuples are equal if and only if corresponding entries are the same. In other words,
$$(a_1, a_2, \dots, a_n) = (b_1, b_2, \dots, b_n) \quad \text{if and only if} \quad a_1 = b_1,\ a_2 = b_2,\ \dots,\ a_n = b_n$$
Thus the ordered $2$-tuples and $3$-tuples are just the ordered pairs and triples familiar from geometry.
Let $\mathbb{R}$ denote the set of all real numbers. The set of all ordered $n$-tuples from $\mathbb{R}$ has a special notation:
$$\mathbb{R}^n \text{ denotes the set of all ordered } n\text{-tuples of real numbers.}$$
There are two commonly used ways to denote the $n$-tuples in $\mathbb{R}^n$: as rows $(r_1, r_2, \dots, r_n)$ or columns $\begin{bmatrix} r_1 \\ r_2 \\ \vdots \\ r_n \end{bmatrix}$;
the notation we use depends on the context. In any event they are called vectors or $n$-vectors and will be denoted using bold type such as $\mathbf{x}$ or $\mathbf{v}$. For example, an $m \times n$ matrix $A$ will be written as a row of columns:
$$A = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix} \quad \text{where } \mathbf{a}_j \text{ denotes column } j \text{ of } A \text{ for each } j$$
If $\mathbf{x}$ and $\mathbf{y}$ are two $n$-vectors in $\mathbb{R}^n$, it is clear that their matrix sum $\mathbf{x} + \mathbf{y}$ is also in $\mathbb{R}^n$, as is the scalar multiple $k\mathbf{x}$ for any real number $k$. We express this observation by saying that $\mathbb{R}^n$ is closed under addition and scalar multiplication. In particular, all the basic properties in Theorem 2.1.1 are true of these $n$-vectors. These properties are fundamental and will be used frequently below without comment. As for matrices in general, the $n \times 1$ zero matrix is called the zero $n$-vector in $\mathbb{R}^n$ and, if $\mathbf{x}$ is an $n$-vector, the $n$-vector $-\mathbf{x}$ is called the negative of $\mathbf{x}$.
Of course, we have already encountered these $n$-vectors in Section 1.3 as the solutions to systems of linear equations with $n$ variables. In particular we defined the notion of a linear combination of vectors and showed that a linear combination of solutions to a homogeneous system is again a solution. Clearly, a linear combination of $n$-vectors in $\mathbb{R}^n$ is again in $\mathbb{R}^n$, a fact that we will be using.
Matrix-Vector Multiplication
Given a system of linear equations, the left sides of the equations depend only on the coefficient matrix $A$ and the column $\mathbf{x}$ of variables, and not on the constants. This observation leads to a fundamental idea in linear algebra: We view the left sides of the equations as the “product” $A\mathbf{x}$ of the matrix $A$ and the vector $\mathbf{x}$. This simple change of perspective leads to a completely new way of viewing linear systems, one that is very useful and will occupy our attention throughout this book.
To motivate the definition of the “product” $A\mathbf{x}$, consider first the following system of two equations in three variables:
$$\begin{aligned} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 &= b_1 \\ a_{21}x_1 + a_{22}x_2 + a_{23}x_3 &= b_2 \end{aligned} \tag{2.2}$$
and let $A = \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix}$, $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$, $\mathbf{b} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$ denote the coefficient matrix, the variable matrix, and the constant matrix, respectively. The system (2.2) can be expressed as a single vector equation
$$\begin{bmatrix} a_{11}x_1 + a_{12}x_2 + a_{13}x_3 \\ a_{21}x_1 + a_{22}x_2 + a_{23}x_3 \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$$
which in turn can be written as follows:
$$x_1 \begin{bmatrix} a_{11} \\ a_{21} \end{bmatrix} + x_2 \begin{bmatrix} a_{12} \\ a_{22} \end{bmatrix} + x_3 \begin{bmatrix} a_{13} \\ a_{23} \end{bmatrix} = \begin{bmatrix} b_1 \\ b_2 \end{bmatrix}$$
Now observe that the vectors appearing on the left side are just the columns
$$\mathbf{a}_1 = \begin{bmatrix} a_{11} \\ a_{21} \end{bmatrix}, \quad \mathbf{a}_2 = \begin{bmatrix} a_{12} \\ a_{22} \end{bmatrix}, \quad \mathbf{a}_3 = \begin{bmatrix} a_{13} \\ a_{23} \end{bmatrix}$$
of the coefficient matrix $A$. Hence the system (2.2) takes the form
$$x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + x_3\mathbf{a}_3 = \mathbf{b} \tag{2.3}$$
This shows that the system (2.2) has a solution if and only if the constant matrix $\mathbf{b}$ is a linear combination of the columns of $A$, and that in this case the entries of the solution are the coefficients $x_1$, $x_2$, and $x_3$ in this linear combination.
Moreover, this holds in general. If $A$ is any $m \times n$ matrix, it is often convenient to view $A$ as a row of columns. That is, if $\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_n$ are the columns of $A$, we write
$$A = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix}$$
and say that $A$ is given in terms of its columns.
Now consider any system of linear equations with $m \times n$ coefficient matrix $A$. If $\mathbf{b}$ is the constant matrix of the system, and if
$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$$
is the matrix of variables then, exactly as above, the system can be written as a single vector equation
$$x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n = \mathbf{b} \tag{2.4}$$
Example 2.2.1:
Write the given system of linear equations in the form given in (2.4).
Solution:
As mentioned above, we view the left side of (2.4) as the product of the matrix $A$ and the vector $\mathbf{x}$. This basic idea is formalized in the following definition:
Definition 2.5: Matrix-Vector Multiplication
Let $A = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix}$ be an $m \times n$ matrix, written in terms of its columns $\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_n$. If
$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$$
is any $n$-vector, the product $A\mathbf{x}$ is defined to be the $m$-vector given by:
$$A\mathbf{x} = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n$$
In other words, if $A$ is $m \times n$ and $\mathbf{x}$ is an $n$-vector, the product $A\mathbf{x}$ is the linear combination of the columns of $A$ where the coefficients are the entries of $\mathbf{x}$ (in order).
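The definition can be coded directly: form the linear combination of the columns of $A$ with the entries of $\mathbf{x}$ as coefficients, and compare it with NumPy's built-in product. The matrix and vector below are hypothetical examples.

```python
import numpy as np

A = np.array([[2, -1, 0],
              [1,  3, 4]])      # hypothetical 2 x 3 matrix
x = np.array([1, 2, -1])        # a 3-vector

# Ax as a linear combination of the columns of A, coefficients taken from x
Ax = sum(x[j] * A[:, j] for j in range(A.shape[1]))

print(Ax)                            # the resulting 2-vector
print(np.array_equal(Ax, A @ x))     # True: agrees with the usual product
```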
Note that if $A$ is an $m \times n$ matrix, the product $A\mathbf{x}$ is only defined if $\mathbf{x}$ is an $n$-vector and then the vector $A\mathbf{x}$ is an $m$-vector because this is true of each column $\mathbf{a}_j$ of $A$. But in this case the system of linear equations with coefficient matrix $A$ and constant vector $\mathbf{b}$ takes the form of a single matrix equation
$$A\mathbf{x} = \mathbf{b}$$
The following theorem combines Definition 2.5 and equation (2.4) and summarizes the above discussion. Recall that a system of linear equations is said to be consistent if it has at least one solution.
Theorem 2.2.1:
- Every system of linear equations has the form $A\mathbf{x} = \mathbf{b}$ where $A$ is the coefficient matrix, $\mathbf{b}$ is the constant matrix, and $\mathbf{x}$ is the matrix of variables.
- The system $A\mathbf{x} = \mathbf{b}$ is consistent if and only if $\mathbf{b}$ is a linear combination of the columns of $A$.
- If $\mathbf{a}_1, \mathbf{a}_2, \dots, \mathbf{a}_n$ are the columns of $A$ and if $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$, then $\mathbf{x}$ is a solution to the linear system if and only if $x_1, x_2, \dots, x_n$ are a solution of the vector equation
$$x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + \cdots + x_n\mathbf{a}_n = \mathbf{b}$$
A system of linear equations in the form $A\mathbf{x} = \mathbf{b}$ as in (1) of Theorem 2.2.1 is said to be written in matrix form. This is a useful way to view linear systems, as we shall see.
Theorem 2.2.1 transforms the problem of solving the linear system $A\mathbf{x} = \mathbf{b}$ into the problem of expressing the constant matrix $\mathbf{b}$ as a linear combination of the columns of the coefficient matrix $A$. Such a change in perspective is very useful because one approach or the other may be better in a particular situation; the importance of the theorem is that there is a choice.
Example 2.2.2:
Given a matrix $A$ and a vector $\mathbf{x}$,
compute $A\mathbf{x}$.
Solution:
By Definition 2.5, $A\mathbf{x}$ is the linear combination of the columns of $A$ whose coefficients are the entries of $\mathbf{x}$, in order.
Example 2.2.3:
Given columns $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$, and $\mathbf{a}_4$ in $\mathbb{R}^3$, write the linear combination $x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + x_3\mathbf{a}_3 + x_4\mathbf{a}_4$ in the form $A\mathbf{x}$ where $A$ is a matrix and $\mathbf{x}$ is a vector.
Solution:
Here the column of coefficients is
$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \\ x_4 \end{bmatrix}$$
Hence Definition 2.5 gives
$$A\mathbf{x} = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + x_3\mathbf{a}_3 + x_4\mathbf{a}_4$$
where $A = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \mathbf{a}_3 & \mathbf{a}_4 \end{bmatrix}$ is the matrix with $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$, and $\mathbf{a}_4$ as its columns.
Let $A = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \mathbf{a}_3 & \mathbf{a}_4 \end{bmatrix}$ be the matrix given in terms of its columns
$\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$, and $\mathbf{a}_4$.
In each case below, either express the given vector $\mathbf{b}$ as a linear combination of $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$, and $\mathbf{a}_4$, or show that it is not such a linear combination. Explain what your answer means for the corresponding system of linear equations.
1. A first choice of $\mathbf{b}$.
2. A second choice of $\mathbf{b}$.
Solution:
By Theorem 2.2.1, $\mathbf{b}$ is a linear combination of $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$, and $\mathbf{a}_4$ if and only if the system $A\mathbf{x} = \mathbf{b}$ is consistent (that is, it has a solution). So in each case we carry the augmented matrix of the system to reduced form.
1. Here the reduction shows that the system is inconsistent (the reduced augmented matrix has a row whose only nonzero entry is a leading $1$ in the last column), so the system has no solution in this case. Hence $\mathbf{b}$ is not a linear combination of $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$, and $\mathbf{a}_4$.
2. Now the reduction shows that the system is consistent.
Thus $\mathbf{b}$ is a linear combination of $\mathbf{a}_1$, $\mathbf{a}_2$, $\mathbf{a}_3$, and $\mathbf{a}_4$ in this case. In fact the general solution expresses $x_1$, $x_2$, $x_3$, and $x_4$ in terms of two arbitrary parameters $s$ and $t$. Hence
$$\mathbf{b} = x_1\mathbf{a}_1 + x_2\mathbf{a}_2 + x_3\mathbf{a}_3 + x_4\mathbf{a}_4$$
for any choice of $s$ and $t$. Different choices of $s$ and $t$ give different ways of writing $\mathbf{b}$ as a linear combination of the columns; for example, taking $s = 0 = t$ gives one such expression, whereas taking $s = 1 = t$ gives another.
Taking $A$ to be the $m \times n$ zero matrix, we have $0\mathbf{x} = \mathbf{0}$ for all $n$-vectors $\mathbf{x}$ by Definition 2.5 because every column of the zero matrix is zero. Similarly, $A\mathbf{0} = \mathbf{0}$ for all $m \times n$ matrices $A$ because every entry of the zero vector is zero.
Example 2.2.6:
If $I = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$, show that $I\mathbf{x} = \mathbf{x}$ for any vector $\mathbf{x}$ in $\mathbb{R}^3$.
Solution:
If $\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix}$ then Definition 2.5 gives
$$I\mathbf{x} = x_1\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} + x_2\begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} + x_3\begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} x_1 \\ x_2 \\ x_3 \end{bmatrix} = \mathbf{x}$$
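Numerically, multiplying by the identity matrix does indeed return the vector unchanged; the vector below is a hypothetical example.

```python
import numpy as np

I = np.eye(3)                    # the 3 x 3 identity matrix
x = np.array([5.0, -2.0, 7.0])   # hypothetical 3-vector

print(I @ x)                     # [ 5. -2.  7. ], i.e. Ix = x
print(np.array_equal(I @ x, x))  # True
```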
The matrix $I$ in Example 2.2.6 is called the identity matrix, and we will encounter such matrices again in the future. Before proceeding, we develop some algebraic properties of matrix-vector multiplication that are used extensively throughout linear algebra.
Theorem 2.2.2:
Let $A$ and $B$ be $m \times n$ matrices, and let $\mathbf{x}$ and $\mathbf{y}$ be $n$-vectors in $\mathbb{R}^n$. Then:
- $A(\mathbf{x} + \mathbf{y}) = A\mathbf{x} + A\mathbf{y}$.
- $A(a\mathbf{x}) = a(A\mathbf{x}) = (aA)\mathbf{x}$ for all scalars $a$.
- $(A + B)\mathbf{x} = A\mathbf{x} + B\mathbf{x}$.
Proof:
We prove (3); the other verifications are similar and are left as exercises. Let $A = \begin{bmatrix} \mathbf{a}_1 & \mathbf{a}_2 & \cdots & \mathbf{a}_n \end{bmatrix}$ and $B = \begin{bmatrix} \mathbf{b}_1 & \mathbf{b}_2 & \cdots & \mathbf{b}_n \end{bmatrix}$ be given in terms of their columns. Since adding two matrices is the same as adding their columns, we have
$$A + B = \begin{bmatrix} \mathbf{a}_1 + \mathbf{b}_1 & \mathbf{a}_2 + \mathbf{b}_2 & \cdots & \mathbf{a}_n + \mathbf{b}_n \end{bmatrix}$$
If we write
$$\mathbf{x} = \begin{bmatrix} x_1 \\ x_2 \\ \vdots \\ x_n \end{bmatrix}$$
Definition 2.5 gives
$$(A + B)\mathbf{x} = x_1(\mathbf{a}_1 + \mathbf{b}_1) + x_2(\mathbf{a}_2 + \mathbf{b}_2) + \cdots + x_n(\mathbf{a}_n + \mathbf{b}_n) = (x_1\mathbf{a}_1 + \cdots + x_n\mathbf{a}_n) + (x_1\mathbf{b}_1 + \cdots + x_n\mathbf{b}_n) = A\mathbf{x} + B\mathbf{x}$$
Theorem 2.2.2 allows matrix-vector computations to be carried out much as in ordinary arithmetic. For example, for any $m \times n$ matrices $A$ and $B$ and any $n$-vectors $\mathbf{x}$ and $\mathbf{y}$, we have:
$$A(2\mathbf{x} - 5\mathbf{y}) = 2A\mathbf{x} - 5A\mathbf{y} \quad \text{and} \quad (3A - 7B)\mathbf{x} = 3A\mathbf{x} - 7B\mathbf{x}$$
We will use such manipulations throughout the book, often without mention.
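As with the earlier properties, these rules can be spot-checked on random data; once again, the identities are what matter, not the particular random values.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 4))   # random 3 x 4 matrices
B = rng.standard_normal((3, 4))
x = rng.standard_normal(4)        # random 4-vectors
y = rng.standard_normal(4)
a = 2.0                           # an arbitrary scalar

print(np.allclose(A @ (x + y), A @ x + A @ y))   # A(x + y) = Ax + Ay
print(np.allclose(A @ (a * x), a * (A @ x)))     # A(ax) = a(Ax)
print(np.allclose((A + B) @ x, A @ x + B @ x))   # (A + B)x = Ax + Bx
```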