Rant: Matrices Are Not Arrays of Numbers

The following is an excerpt from a current work of mine. I thought I’d share it here, as some people have told me they enjoyed it.

As I’ll stress repeatedly, a matrix represents a linear map between two vector spaces. Writing it in the form of an {m \times n} matrix is merely a very convenient way to see the map concretely. But it obfuscates the fact that this map is, well, a map, not an array of numbers.

If you took high school precalculus, you saw everything done in terms of matrices. To any typical high school student, a matrix is an array of numbers. No one is sure what exactly these numbers represent, but they’re told how to magically multiply these arrays to get more arrays. They’re told that the matrix

\displaystyle \left( \begin{array}{cccc} 1 & 0 & \dots & 0 \\ 0 & 1 & \dots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \dots & 1 \\ \end{array} \right)

is an “identity matrix”, because multiplying any other matrix by it leaves that matrix unchanged. Then they’re told that the determinant is some magical combination of these numbers formed by this weird multiplication rule. No one knows what this determinant does, other than the fact that {\det(AB) = \det A \det B}, and something about areas and row operations and Cramer’s rule.

Then you go into linear algebra in college, and you do more magic with these arrays of numbers. You’re told that two matrices {T_1} and {T_2} are similar if

\displaystyle T_2 = ST_1S^{-1}

for some invertible matrix {S}. You’re told that the trace of a matrix {\text{Tr } T} is the sum of the diagonal entries. Somehow this doesn’t change if you look at a similar matrix, but you’re not sure why. Then you define the characteristic polynomial as

\displaystyle p_T = \det (XI - T).

Somehow this also doesn’t change if you take a similar matrix, but now you really don’t know why. And then you have the Cayley-Hamilton Theorem in all its black magic: {p_T(T)} is the zero map. Out of curiosity you Google the proof, and you find some ad-hoc procedure which still leaves you with no idea why it’s true.

This is terrible. Who gives a — about {T_2 = ST_1S^{-1}}? Only if you know that the matrices are linear maps does this make sense: {T_2} is just {T_1} rewritten with a different choice of basis.
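
To spell this out, here is a short sketch of the standard argument. The vector space {V}, the map {T : V \rightarrow V}, the two bases, and the coordinate columns {y} are notation introduced just for this illustration. Suppose {T_1} is the matrix of {T} in a basis {e_1, \dots, e_n}, {T_2} is its matrix in a second basis {f_1, \dots, f_n}, and {S} is the matrix that converts {e}-coordinates to {f}-coordinates. If a vector has {f}-coordinates {y}, then its {e}-coordinates are {S^{-1}y}, applying {T} gives {e}-coordinates {T_1S^{-1}y}, and converting back to {f}-coordinates gives

\displaystyle T_2 y = S T_1 S^{-1} y \quad \text{for every } y, \qquad \text{so} \qquad T_2 = ST_1S^{-1}.

The invariance of the trace and of the characteristic polynomial then takes one line each, using {\text{Tr}(AB) = \text{Tr}(BA)} and {\det(AB) = \det A \det B}:

\displaystyle \text{Tr}(ST_1S^{-1}) = \text{Tr}(T_1S^{-1}S) = \text{Tr } T_1, \qquad \det(XI - ST_1S^{-1}) = \det\left(S(XI - T_1)S^{-1}\right) = \det(XI - T_1).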

In my eyes, this mess is evil. Linear algebra is the study of linear maps, but it is taught as the study of arrays of numbers, and no one knows what these numbers mean. And for a good reason: the numbers are meaningless. They are a highly convenient way of encoding the map, but they are not the main objects of study, any more than the dates of events are the main objects of study in history.
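
A small sanity check can make the point concrete. The sketch below is plain Python under assumptions chosen purely for illustration: matrices are 2×2 lists of rows with `Fraction` entries, and the helper names (`matmul`, `inverse_2x2`, `trace`, `det_2x2`, `cayley_hamilton_residual`) as well as the particular `T1` and `S` are made up here. It rewrites one map in a second basis and checks that the entries change while the trace and determinant do not. For a 2×2 matrix the characteristic polynomial is {X^2 - (\text{Tr } T)X + \det T}, so it is unchanged too, and the last function checks Cayley-Hamilton in this special case.

```python
from fractions import Fraction as F


def matmul(A, B):
    """Multiply two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inverse_2x2(S):
    """Invert a 2x2 matrix with Fraction entries (assumed invertible)."""
    (a, b), (c, d) = S
    det = a * d - b * c
    if det == 0:
        raise ValueError("S must be invertible")
    return [[d / det, -b / det], [-c / det, a / det]]


def trace(A):
    """Sum of the diagonal entries of a 2x2 matrix."""
    return A[0][0] + A[1][1]


def det_2x2(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]


def cayley_hamilton_residual(A):
    """Return A^2 - Tr(A) A + det(A) I, i.e. p_A(A) for a 2x2 matrix."""
    t, d = trace(A), det_2x2(A)
    A2 = matmul(A, A)
    identity = [[1, 0], [0, 1]]
    return [[A2[i][j] - t * A[i][j] + d * identity[i][j] for j in range(2)]
            for i in range(2)]


# Hypothetical example: one map written in two different bases.
T1 = [[F(2), F(1)], [F(0), F(3)]]
S = [[F(1), F(1)], [F(1), F(2)]]  # an arbitrary invertible change of basis
T2 = matmul(matmul(S, T1), inverse_2x2(S))

print("T1 =", T1)
print("T2 =", T2)  # different entries, same map
print("traces:", trace(T1), trace(T2))
print("determinants:", det_2x2(T1), det_2x2(T2))
print("p_T(T) for T2:", cayley_hamilton_residual(T2))
```

The two arrays share no obvious pattern entry by entry, yet every quantity that belongs to the map itself agrees. That is only numerical evidence for one example, of course, not a proof.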

When I took Math 55a as a freshman at Harvard, I got the exact opposite treatment: we did all of linear algebra without writing down a single matrix. During all this time I was quite confused. What’s wrong with a basis? I didn’t appreciate until later that this approach was the morally correct way to treat the subject: it made it clear what was happening.

Throughout this project, I’ve tried to strike a balance between these two approaches, using matrices to illustrate the maps and to simplify proofs, but writing theorems and definitions in their morally correct form. I hope that this gives the “right” definitions while remaining concrete enough to be digested. But I would just like to say for the record that, if I had to pick between the high school approach and the 55a approach, I would pick 55a in a heartbeat.

5 thoughts on “Rant: Matrices Are Not Arrays of Numbers”

  1. To push in the other direction, I think that some mathematicians seem to eschew the concrete approach so much that they do not know about certain important structural theorems that are easy to state in coordinates, such as the LU decomposition, Cholesky decomposition, QR factorization, etc. There are also pretty important results which can be stated in a basis-free way but are more natural with a basis, such as the Cauchy interlacing theorem and Perron-Frobenius theorem. I would definitely agree with you that the basis-free perspective is extremely powerful, but I think that choosing a basis can be pretty useful as well, beyond simply being useful for concreteness.

    (Note that I also spend at least some of my time arguing to machine learning people that they should be thinking of tensors in terms of their universal property, so I don’t think I’m really arguing for one perspective over the other, just that both perspectives when used in their mature form are extremely useful, and it’s easy to overlook this in pure math unless you do combinatorics. This is related to the Mental Modes and Frameworks class that Nisan and I taught at SPARC; when I actually do linear algebra I often flip back and forth rapidly between coordinates and no coordinates.)

    1. I fully agree with your comments, particularly the last bit; being able to move back and forth between the two views just seems strictly better than rigidly conforming to one alone.

      I’m also a little embarrassed to admit I haven’t heard of any of the theorems that you’ve named. :) Guess that means I get to learn something new today.

  2. I must admit I’m stunned to think of learning linear algebra without seeing any grids-of-numbers. To an extent that just reflects feeling like the way I learned things must be the natural way. But it seems to me like there’s (for example) an understanding of a contraction mapping that you’d get from a couple examples that’s good for reassuring the intuitive feeling that you’re reasoning rightly. That’s as a supplement, mind, the way you might work out whether you’re proving a number theory question correctly by seeing what it makes of ’15’, but that’s still a use.

    1. Fully agree. :) I think this was a big part of why I had no clue what was happening in Math 55a for many weeks; it wasn’t until I started writing concrete matrices on my own time that things started making sense!

      Indeed, I think most of higher math has a problem of having *not enough* concrete examples, and it’s rare for me to advocate that something isn’t taught abstractly enough. Linear algebra just happens to be the exception, because the “dumbed-down” version that’s taught to non-math majors (which is most of what I’m complaining about) completely misses the essence of what’s going on.
