The point of this post is to show whence the weird definition of matrix multiplication comes, and why it simply MUST be the way it is. Actually, matrices don’t appear in this post, just the underlying equations they represent. We’re dealing with spaces of finite dimension at this point (infinite dimensional spaces come later). Such spaces have a basis — meaning a collection of elements (**basis vectors**) which are enough to describe every element of the space UNIQUELY, as a linear combination.

To make things a bit more concrete, think of good old 3 dimensional space with basis vectors **E1** = (1,0,0) aka **i**, **E2** = (0,1,0) aka **j**, and **E3** = (0,0,1) aka **k**. Every point in this space is uniquely described as a1 * **E1** + a2 * **E2** + a3 * **E3** — i.e. a linear combination of the 3 basis vectors. You can also think of each point as a vector from the origin (0,0,0) to the point (a1,a2,a3). Once you establish what the basis is, each vector is specified by its (unique) triple of numerical coordinates (a1, a2, a3). Choose a different basis and you get a different set of coordinates, but you always get no more and no less than 3 coordinates — that’s what dimension is all about. Note that the combination of basis vectors is linear (no powers greater than 1).
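To see this in action, here is a minimal sketch (the function name and values are just for illustration, not from any library) showing that a point's coordinates are exactly the coefficients in its expansion over the standard basis:

```python
# Standard basis of 3-dimensional space, as in the post.
E1 = (1, 0, 0)
E2 = (0, 1, 0)
E3 = (0, 0, 1)

def combine(a1, a2, a3):
    """Return a1*E1 + a2*E2 + a3*E3 as a coordinate triple."""
    return tuple(a1 * e1 + a2 * e2 + a3 * e3
                 for e1, e2, e3 in zip(E1, E2, E3))

print(combine(2, -1, 5))  # (2, -1, 5) -- the coefficients ARE the coordinates
```

With the standard basis the coefficients and the coordinates coincide; with a different basis the same point would get a different (but still unique) triple.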

So now we’re going to consider several spaces, namely A, B and C of dimensions 3, 4 and 5. Their basis vectors are the set {**A1**, **A2**, **A3**} for A, {**B1**, **B2**, **B3**, **B4**} for B — fill in the dots for C.

What does a **linear** transformation from A to B look like? Because of the way things have been set up, there is really no choice at all.

Consider any vector of A — it must be of the form a1 * **A1** + a2 * **A2** + a3 * **A3**, i.e. a linear combination of the basis vectors {**A1**, **A2**, **A3**} — where the { } notation means set. For any given vector in A, a1, a2 and a3 are uniquely determined. Sorry to stress this so much but uniqueness is crucial.

Similarly any vector of C must be of the form c1 * **C1** + c2 * **C2** + c3 * **C3** + c4 * **C4** + c5 * **C5**. Go back and fill in the dots for B.

Any linear function T from A to B must satisfy

T (**X** + **Y**) = T(**X**) + T(**Y**)

where **X** and **Y** are vectors in A and T(**X**), T(**Y**) are vectors in B. So what? A lot. We only have to worry about what T does to **A1**, **A2** and **A3**. Why? Because the {**Ai**} are basis vectors, and because of the second thing a linear function must satisfy

T(number * **X**) = number * T(**X**), so combining both properties

T (a1 * **A1** + a2 * **A2** + a3 * **A3**) = a1 * T(**A1**) + a2 * T(**A2**) + a3 * T(**A3**)

All we have to worry about is what T does to the 3 basis vectors of A. Everything else follows easily enough.
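Here is a small sketch of that idea. The coefficients stored for each T(Ai) are made up for illustration; the point is that once they are fixed, T of any vector of A follows mechanically from linearity:

```python
# Each T(Ai) is recorded as its coordinate 4-tuple in the basis {B1, B2, B3, B4}.
T_of_basis = [
    (1, 0, 2, 0),   # T(A1)
    (0, 1, 0, 3),   # T(A2)
    (4, 0, 0, 1),   # T(A3)
]

def apply_T(a):
    """T(a1*A1 + a2*A2 + a3*A3) = a1*T(A1) + a2*T(A2) + a3*T(A3)."""
    return tuple(sum(ai * Ti[j] for ai, Ti in zip(a, T_of_basis))
                 for j in range(4))

print(apply_T((1, 1, 1)))  # (5, 1, 2, 4) -- just the sum of the three rows
```

Knowing 3 tuples of 4 numbers pins down T on all of (infinitely many vectors of) A.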

So what is T(**A1**)? Well, it’s a vector in B. Since B has a basis, T(**A1**) is a unique linear combination of B’s basis vectors. Now the nomenclature will shift a bit. I’m going to write T(**A1**) as follows.

T(**A1**) = AB11 * **B1** + AB12 * **B2** + AB13 * **B3** + AB14 * **B4**

AB signifies that the function is from space A to space B; the numbers after AB are to be taken as subscripts. Terms of art: linear functions between vector spaces are usually called **linear transformations**. When the vector spaces on either end of the transformation are the same, the linear transformation is called a **linear operator** (or **operator** for short). Sound familiar? An example of a linear operator in 3 dimensional space would just be a rotation of the coordinate axes, leaving the origin fixed. For why the origin has to be fixed if the transformation is to be linear, see the first post in the series.
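To make the rotation example concrete, here is a short sketch (the angle and test vectors are arbitrary) numerically checking that a rotation about the z-axis satisfies the additivity property T(**X** + **Y**) = T(**X**) + T(**Y**):

```python
import math

def rotate_z(v, theta):
    """Rotate the vector v about the z-axis by angle theta (radians)."""
    x, y, z = v
    return (x * math.cos(theta) - y * math.sin(theta),
            x * math.sin(theta) + y * math.cos(theta),
            z)

theta = 0.7
X = (1.0, 2.0, 3.0)
Y = (-4.0, 0.5, 1.0)

# Compare T(X + Y) with T(X) + T(Y), up to floating-point round-off.
lhs = rotate_z(tuple(a + b for a, b in zip(X, Y)), theta)
rhs = tuple(a + b for a, b in zip(rotate_z(X, theta), rotate_z(Y, theta)))
print(all(abs(a - b) < 1e-12 for a, b in zip(lhs, rhs)))  # True
```

The same check with the homogeneity property T(number * **X**) = number * T(**X**) would also pass; a rotation that did NOT fix the origin would fail both.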

Fill in the dots for T(**A2**) = AB21 * **B1** + . . .

T(**A3**) = AB31 * **B1** + . . .

Now for a blizzard of (similar and pretty simple) algebra. Consider the linear transformation from B to C. Call the transformation S. I’m going to stop putting the Bi’s and Ci’s in bold; you know they are basis vectors. Also, in what follows, to get the equations to line up on top of each other you might have to make the characters smaller (say by holding down the Command and the minus key at the same time — in the Apple world).

S(B1) = BC11 * C1 + BC12 * C2 + BC13 * C3 + BC14 * C4 + BC15 * C5

S(B2) = BC21 * C1 + BC22 * C2 + BC23 * C3 + BC24 * C4 + BC25 * C5

S(B3) = BC31 * C1 + BC32 * C2 + BC33 * C3 + BC34 * C4 + BC35 * C5

S(B4) = BC41 * C1 + BC42 * C2 + BC43 * C3 + BC44 * C4 + BC45 * C5

It’s pretty simple to apply S to T(A1), plugging in the expression for each S(Bi).

Recall that T(A1) = AB11 * B1 + AB12 * B2 + AB13 * B3 + AB14 * B4

So we get

S(T(A1)) = AB11 * ( BC11 * C1 + BC12 * C2 + BC13 * C3 + BC14 * C4 + BC15 * C5 ) +

AB12 * ( BC21 * C1 + BC22 * C2 + BC23 * C3 + BC24 * C4 + BC25 * C5 ) +

AB13 * ( BC31 * C1 + BC32 * C2 + BC33 * C3 + BC34 * C4 + BC35 * C5 ) +

AB14 * ( BC41 * C1 + BC42 * C2 + BC43 * C3 + BC44 * C4 + BC45 * C5 )

So now we have a linear transformation from space A to space C, just by simple substitution. Do you see the pattern yet? If not, just collect the coefficients of {C1, C2, C3, C4, C5}. It’s easy to do as they are all above each other. If we write

S(T(A1)) = AC11 * C1 + AC12 * C2 + AC13 * C3 + AC14 * C4 + AC15 * C5

you can see that AC13 = AB11 * BC13 + AB12 * BC23 + AB13 * BC33 + AB14 * BC43. This is a sum of 4 terms, each of the form AB1x * BCx3, where x runs from 1 to 4.
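The pattern can be checked numerically. Here is a minimal sketch (the coefficient values are made up for illustration) verifying that composing the two transformations produces coefficients ACij equal to the sum over x of ABix * BCxj — exactly the matrix multiplication rule:

```python
# Row i of AB holds the coefficients of T(Ai) in the basis of B (3 x 4).
AB = [[1, 0, 2, 1],
      [0, 3, 1, 0],
      [2, 1, 0, 4]]
# Row x of BC holds the coefficients of S(Bx) in the basis of C (4 x 5).
BC = [[1, 2, 0, 0, 1],
      [0, 1, 3, 0, 0],
      [2, 0, 1, 1, 0],
      [1, 1, 0, 2, 2]]

# Collect terms: ACij = sum over x of ABix * BCxj (3 x 5).
AC = [[sum(AB[i][x] * BC[x][j] for x in range(4)) for j in range(5)]
      for i in range(3)]

# AC13 in the post's (1-based) notation: AB11*BC13 + AB12*BC23 + AB13*BC33 + AB14*BC43
print(AC[0][2])  # 2
```

The nested sum in building `AC` is the whole content of matrix multiplication; everything else is bookkeeping.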

This should look very familiar if you know the formula for matrix multiplication. If not, don’t sweat it; I’ll discuss matrices next time, but you’ve basically just seen them (they’re just a compact way of representing the above equations). Linear transformations between (appropriately dimensioned) vector spaces can always be mushed together (combined) like this. Why? (1) all finite dimensional vector spaces have a basis, with all that goes with them and (2) linear transformations are a very special type of function (according to an instructor in a graduate algebra course — the only type of function mathematicians understand completely).

It is the very simple algebra of combining linear transformations between finite dimensional vector spaces that makes matrix multiplication exactly what it is. It simply can’t be anything else. Now you know. Quantum mechanics is written in this language, the syntax of which is the linear transformation, the representation the matrix. Remarkably, when Heisenberg formulated quantum mechanics this way, he knew nothing about matrices. A Hilbert-trained mathematician and physicist (Max Born) had to tell him what he was really doing. So much for the notion that physicists shoehorn our view of the world into a mathematical mold. Amazingly, the mathematics always seems to get there first (Newton excepted).

## Comments

I find it amazing that matrices were not part of Heisenberg’s education but were part of Born’s. This is in spite of the fact that the kind of German education Heisenberg received was probably the most rigorous then around.

Well, Born was Hilbert’s right hand man at Göttingen, spending years there as his personal assistant.

Einstein was right about the shortcomings of Quantum Mechanics, and so therefore String Theory is also the incorrect approach. As an alternative to Quantum Theory there is a new theory that describes and explains the mysteries of physical reality, while not disrespecting the value of Quantum Mechanics as a tool to explain the role of quanta in our universe. This theory states that there is also a classical explanation for the paradoxes such as EPR and the Wave-Particle Duality. The theory is called the Theory of Super Relativity. It is a philosophical attempt to reconnect the physical universe to realism and deterministic concepts. It explains the mysterious.

You may be interested in using Latex to format the math in your posts (see http://en.blog.wordpress.com/2007/02/17/math-for-the-masses/ for more information).

Yggdrasil: Thanks for the tip, I’ve been back and forth with support about this. I’m amazed at how helpful they are and how quickly they get back. I’m even more amazed that it’s all for free — hopefully, they’re making money some way or other.

Here’s the latest correspondence with them.

I went to http://en.support.wordpress.com/latex/ where they told me how to get mathematical symbols into my post using Latex. So I put the example shown into the end of the post “Diprivan and Versed in AM means no Linear Algebra in PM” of 6 Jan 10 by editing it. And that’s exactly what appeared in the post, not the mathematical symbolism which it codes. Any idea what went wrong?

Hi There,

Fixed — it’s best to write the Latex code directly in the HTML editor, not the visual editor. This will make sure no HTML code is thrown in the mix.

At this point my priority is to crank out the rest of the Linear Algebra survival guides in sequence, and quickly (since there seems to be a fair amount of interest in them). I’ve never done HTML before and think it would take a while to get up to speed. Perhaps when they’re all done I’ll go back and redo the posts in the HTML editor so they are more readable. Until then, my apologies for the way it looks.

It’s interesting that if you are learning “practical” quantum mechanics such as quantum chemistry, you can get away with a lot using almost no linear algebra. One has only to take a look at popular QC texts like Levine, Atkins, or Pauling and Wilson; almost no LA there.