The point of this post is to show from whence the weird definition of matrix multiplication comes, and why it simply MUST be the way it is. Actually matrices don’t appear in this post, just the underlying equations they represent. We’re dealing with spaces of finite dimension at this point (infinite dimensional spaces come later). Such spaces have a basis — meaning a collection of elements (**basis vectors**) which are enough to describe every element of the space UNIQUELY, as a linear combination.

To make things a bit more concrete, think of good old 3 dimensional space with basis vectors **E1** = (1, 0, 0) aka **i**, **E2** = (0, 1, 0) aka **j**, and **E3** = (0, 0, 1) aka **k**. Every point in this space is uniquely described as a1 * **E1** + a2 * **E2** + a3 * **E3** — i.e. a linear combination of the 3 basis vectors. You can also think of each point as a vector from the origin (0, 0, 0) to the point (a1, a2, a3). Once you establish what the basis is, each vector is specified by its (unique) triple of numerical coordinates (a1, a2, a3). Choose a different basis and you get a different set of coordinates, but you always get no more and no less than 3 coordinates — that’s what dimension is all about. Note that the combination of basis vectors is linear (no powers greater than 1).
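A minimal sketch of this in code (Python, purely for illustration): the coordinates of a point are exactly the coefficients in its linear combination of the standard basis vectors.

```python
# The standard basis of 3 dimensional space.
E1, E2, E3 = (1, 0, 0), (0, 1, 0), (0, 0, 1)

def combine(a1, a2, a3):
    # a1*E1 + a2*E2 + a3*E3, computed componentwise.
    return tuple(a1 * e1 + a2 * e2 + a3 * e3
                 for e1, e2, e3 in zip(E1, E2, E3))

print(combine(2, -1, 5))  # (2, -1, 5): the coefficients ARE the coordinates
```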

So now we’re going to consider several spaces, namely A, B and C of dimensions 3, 4 and 5. Their basis vectors are the set {**A1**, **A2**, **A3**} for A, {**B1**, **B2**, **B3**, **B4**} for B — fill in the dots for C.

What does a **linear** transformation from A to B look like? Because of the way things have been set up, there is really no choice at all.

Consider any vector of A — it must be of the form a1 * **A1** + a2 * **A2** + a3 * **A3**, i.e. a linear combination of the basis vectors {**A1**, **A2**, **A3**} — where the { } notation means set. For any given vector in A, a1, a2 and a3 are uniquely determined. Sorry to stress this so much but uniqueness is crucial.

Similarly, any vector of C must be of the form c1 * **C1** + c2 * **C2** + c3 * **C3** + c4 * **C4** + c5 * **C5**. Go back and fill in the dots for B.

Any linear function T from A to B must satisfy

T (**X** + **Y**) = T(**X**) + T(**Y**)

where **X** and **Y** are vectors in A and T(**X**), T(**Y**) are vectors in B. So what? A lot. We only have to worry about what T does to **A1**, **A2** and **A3**. Why? Because the {**Ai**} are basis vectors, and because of the second thing a linear function must satisfy

T (number * **X**) = number * (T(**X**)), so combining both properties

T (a1 * **A1** + a2 * **A2** + a3 * **A3**) = a1 * T(**A1**) + a2 * T(**A2**) + a3 * T(**A3**)

All we have to worry about is what T does to the 3 basis vectors of A. Everything else follows easily enough.
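As a sketch (the numbers for the three images below are made up, just for illustration): specifying T(**A1**), T(**A2**) and T(**A3**) is all the data a linear transformation from A to B needs.

```python
# Made-up numbers: T is pinned down by T(A1), T(A2), T(A3).
# Each image is a vector in B, written in the basis {B1, B2, B3, B4}.
T_images = [
    (1, 0, 2, 0),   # T(A1) = 1*B1 + 0*B2 + 2*B3 + 0*B4
    (0, 3, 0, 1),   # T(A2)
    (1, 1, 0, 0),   # T(A3)
]

def apply_T(a):
    # T(a1*A1 + a2*A2 + a3*A3) = a1*T(A1) + a2*T(A2) + a3*T(A3)
    return tuple(sum(ai * img[j] for ai, img in zip(a, T_images))
                 for j in range(4))

print(apply_T((1, 1, 1)))  # (2, 4, 2, 1): just the sum of the three images
```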

So what is T(**A1**) ? Well, it’s a vector in B. Since B has a basis T(**A1**) is a unique linear combination of them. Now the nomenclature will shift a bit. I’m going to write T(A1) as follows.

T(**A1**) = AB11 * **B1** + AB12 * **B2** + AB13 * **B3** + AB14 * **B4**

AB signifies that the function is from space A to space B; the numbers after AB are to be taken as subscripts. Terms of art: linear functions between vector spaces are usually called **linear transformations**. When the vector spaces on either end of the transformation are the same, the linear transformation is called a **linear operator** (or **operator** for short). Sound familiar? An example of a linear operator in 3 dimensional space would just be a rotation of the coordinate axes, leaving the origin fixed. For why the origin has to be fixed if the transformation is to be linear, see the first post in the series.
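Here’s a minimal sketch of such an operator (a rotation about the z axis, angle in radians; the helper is hypothetical, not from the post):

```python
import math

def rotate_z(v, theta):
    # Rotates (x, y, z) by angle theta about the z axis.
    # The origin maps to itself, as a linear operator must.
    x, y, z = v
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta),
            z)

print(rotate_z((1, 0, 0), math.pi / 2))  # approximately (0, 1, 0)
```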

Fill in the dots for T(**A2**) = AB21 * **B1** + . . .

T(**A3**) = AB31 * **B1** + . . .

Now for a blizzard of (similar and pretty simple) algebra. Consider the linear transformation from B to C. Call the transformation S. I’m going to stop putting the Bi’s and Ci’s in bold — you know they are basis vectors. Also, in what follows, to get the equations to line up on top of each other you might have to make the characters smaller (say by holding down the Command and the minus key at the same time — in the Apple world).

S(B1) = BC11 * C1 + BC12 * C2 + BC13 * C3 + BC14 * C4 + BC15 * C5

S(B2) = BC21 * C1 + BC22 * C2 + BC23 * C3 + BC24 * C4 + BC25 * C5

S(B3) = BC31 * C1 + BC32 * C2 + BC33 * C3 + BC34 * C4 + BC35 * C5

S(B4) = BC41 * C1 + BC42 * C2 + BC43 * C3 + BC44 * C4 + BC45 * C5

It’s pretty simple to compute S(T(A1)) by plugging the expressions for S(Bi) into T(A1).

Recall that T(A1) = AB11 * B1 + AB12 * B2 + AB13 * B3 + AB14 * B4

So, applying S and using its linearity, we get

S(T(A1)) = AB11 * ( BC11 * C1 + BC12 * C2 + BC13 * C3 + BC14 * C4 + BC15 * C5 ) +

AB12 * ( BC21 * C1 + BC22 * C2 + BC23 * C3 + BC24 * C4 + BC25 * C5 ) +

AB13 * ( BC31 * C1 + BC32 * C2 + BC33 * C3 + BC34 * C4 + BC35 * C5 ) +

AB14 * ( BC41 * C1 + BC42 * C2 + BC43 * C3 + BC44 * C4 + BC45 * C5 )

So now we have a linear transformation from space A to space C, just by simple substitution. Do you see the pattern yet? If not, just collect the coefficients of each of {C1, C2, C3, C4, C5}. It’s easy to do as they are all lined up above each other. If we write

S(T(A1)) = AC11 * C1 + AC12 * C2 + AC13 * C3 + AC14 * C4 + AC15 * C5

you can see that AC13 = AB11 * BC13 + AB12 * BC23 + AB13 * BC33 + AB14 * BC43. This is a sum of 4 terms, each of the form AB1x * BCx3, where x runs from 1 to 4.
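The whole substitution can be sketched in code. All the numbers below are made up; nothing here is specific to these values. AB[i][j] plays the role of the text’s AB(i+1)(j+1), since Python indexes from 0.

```python
# T: A -> B needs 3 x 4 numbers, S: B -> C needs 4 x 5 (made-up values).
AB = [[1, 0, 2, 1],
      [0, 1, 1, 0],
      [3, 0, 0, 2]]
BC = [[1, 0, 0, 1, 0],
      [0, 2, 0, 0, 1],
      [1, 0, 3, 0, 0],
      [0, 1, 0, 2, 0]]

def compose(AB, BC):
    # AC[i][k] = sum over j of AB[i][j] * BC[j][k] -- exactly the
    # substitution carried out above, for every basis vector at once.
    return [[sum(AB[i][j] * BC[j][k] for j in range(len(BC)))
             for k in range(len(BC[0]))]
            for i in range(len(AB))]

AC = compose(AB, BC)
print(AC[0][2])  # AC13 = AB11*BC13 + AB12*BC23 + AB13*BC33 + AB14*BC43 = 6
```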

This should look very familiar if you know the formula for matrix multiplication. If not, don’t sweat it — I’ll discuss matrices next time, but you’ve basically just seen them (they’re just a compact way of representing the above equations). Linear transformations between (appropriately dimensioned) vector spaces can always be mushed together (combined) like this. Why? (1) all finite dimensional vector spaces have a basis, with all that goes with it, and (2) linear transformations are a very special type of function (according to an instructor in a graduate algebra course — the only type of function mathematicians understand completely).

It is the very simple algebra of combining linear transformations between finite dimensional vector spaces that makes matrix multiplication exactly what it is. It simply can’t be anything else. Now you know. Quantum mechanics is written in this language, the syntax of which is the linear transformation, the representation the matrix. Remarkably, when Heisenberg formulated quantum mechanics this way, he knew nothing about matrices. A Hilbert trained mathematician and physicist (Max Born) had to tell him what he was really doing. So much for the notion that physicists shoehorn our view of the world into a mathematical mold. Amazingly, the mathematics always seems to get there first (Newton excepted).