Category Archives: Linear Algebra Survival Guide for Quantum Mechanics

The pleasures of enough time

One of the joys of retirement is the ability to take the time to fully understand the math behind statistical mechanics and thermodynamics (on which large parts of chemistry are based — cellular biophysics as well). I’m going through some biophysics this year reading “Physical Biology of the Cell” 2nd Edition and “Molecular Driving Forces” 2nd Edition. Back in the day, what with other courses, research, career plans and hormones to contend with, there just wasn’t enough time.

To really understand the derivation of the Boltzmann equation, you must understand Lagrange multipliers, which requires an understanding of the gradient and where it comes from. To understand the partition function you must understand change of variables in an integral, and to understand that you must understand why the determinant of the Jacobian matrix of a set of independent vectors is the volume multiplier you need.

These were all math tools whose use was fairly simple and which didn’t require any understanding of where they came from. What a great preparation for a career in medicine, where we understood very little of why we did the things we did, not because of lack of time but because the deep understanding of the systems we were mucking about with simply didn’t (and doesn’t) exist. It was intellectually unsatisfying, but you couldn’t argue with the importance of what we were doing. Things are better now with the accretion of knowledge, but if we really understood things perfectly we’d have effective treatments for cancer and Alzheimer’s. We don’t.

But in the pure world of math, whether a human creation or existing outside of us all, this need not be accepted.

I’m not going to put page after page of derivation of the topics mentioned in the second paragraph, but will mention a few things worth knowing which might help you when you’re trying to learn about them, and point you to books (with page numbers) that I’ve found helpful.

Let’s start with the gradient. If you remember it at all, you know that it’s a way of taking a continuous real valued function of several variables and making a vector of it. The vector has the miraculous property of pointing in the direction of greatest change in the function. How did this happen?

The most helpful derivation I’ve found was in Thomas’ textbook of calculus (9th Edition, pp. 957 and following). Yes Thomas — the same book I used as a freshman 60 years ago! Like most living things that have aged, it’s become fat. Thomas is now up to the 13th edition.

The simplest example of a continuous real valued function is a topographic map. Thomas starts with the directional derivative — which is how the function height(north, east) changes in the direction of a vector whose absolute value is 1. That’s the definition — to get something you can actually calculate, you need to know the chain rule, and how to put a path on the topo map. The derivative of the real valued function in the direction of a unit vector turns out to be the dot product of the gradient vector and any vector at that point whose absolute value is 1. The unit vector can point any direction but the value of the derivative (the dot product) will be greatest when the unit vector points in the direction of the gradient vector. That’s where the magic comes from. If you’re slightly shaky on linear algebra, vectors and dot products — here’s a (hopefully explanatory) link to some basic linear algebra — This is the first in a series — just follow the links.
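To make the dot product story concrete, here’s a short numpy sketch (the height function f(x, y) = x^2 + 3y^2 and the point are my own toy choices, not Thomas’): sample the directional derivative in every direction and watch the maximum land in the direction of the gradient.

```python
import numpy as np

# Toy height function f(x, y) = x^2 + 3*y^2 and its gradient
def grad_f(p):
    x, y = p
    return np.array([2 * x, 6 * y])

p = np.array([1.0, 1.0])
g = grad_f(p)

# Directional derivative in the direction of unit vector u is just grad . u
def directional_derivative(p, u):
    u = u / np.linalg.norm(u)
    return grad_f(p) @ u

# Sweep unit vectors around the circle; the dot product peaks along the gradient
angles = np.linspace(0, 2 * np.pi, 1000)
derivs = [directional_derivative(p, np.array([np.cos(t), np.sin(t)])) for t in angles]
t_best = angles[int(np.argmax(derivs))]
u_best = np.array([np.cos(t_best), np.sin(t_best)])

print(np.allclose(u_best, g / np.linalg.norm(g), atol=1e-2))   # points along the gradient
```

The maximum value of the derivative is |grad f| itself, since u · g = |g| cos(angle) is largest when the angle between u and the gradient is zero.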

The discussion of Lagrange multipliers (which is essentially the relation between two gradients — one of a function, the other of a constraint) in Dill pp. 68 -> 72 is only fair, and I did a lot more work to understand it (which can’t be reproduced here).
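Here is the whole Lagrange multiplier machine in a few lines of sympy (a toy problem of my own, not Dill’s): maximize f = xy on the constraint x + y = 10 by demanding that the gradient of f be a multiple (lambda) of the gradient of the constraint.

```python
import sympy as sp

x, y, lam = sp.symbols('x y lam', real=True)
f = x * y           # function to maximize (toy example)
g = x + y - 10      # constraint g = 0

# Lagrange condition: grad f = lam * grad g, together with the constraint itself
eqs = [sp.diff(f, x) - lam * sp.diff(g, x),
       sp.diff(f, y) - lam * sp.diff(g, y),
       g]
sol = sp.solve(eqs, [x, y, lam], dict=True)
print(sol[0])   # x = y = 5, lam = 5: the square maximizes the product
```

At the solution the two gradients, (y, x) and (1, 1), are parallel — exactly the geometric picture behind the multiplier.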

For an excellent discussion of wedge product and why the volume multiplier in an integral must be the determinant of the Jacobian — see Callahan Advanced Calculus p. 41 and exercise 2.15 p. 61, the latter being the most important. It explains why things work this way in 2 dimensions. The exercise takes you through the derivation step by step asking you to fill in some fairly easy dots. Even better is  exercise 2.34 on p. 67 which proves the same thing for any collection of n independent vectors in R^n.

The Jacobian is just the determinant of a square matrix of partial derivatives, something familiar from linear algebra. The entries are just the coefficients of the vectors at a given point. But in integrals we’re changing dx and dy to something else — dr and dTheta when you go to polar coordinates. Why a matrix here? Because if differential calculus is about anything, it is about linearization of nonlinear functions, which is why you can use a matrix of derivatives (the Jacobian matrix) for dx and dy.
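As a sanity check on the Jacobian-as-volume-multiplier idea, here’s the polar coordinate case in numpy (the map and the sample point are my own choices): the determinant comes out to r, the familiar factor in dx dy = r dr dTheta.

```python
import numpy as np

# Polar map T(r, theta) = (r cos theta, r sin theta); its matrix of partial derivatives
def jacobian(r, theta):
    return np.array([[np.cos(theta), -r * np.sin(theta)],
                     [np.sin(theta),  r * np.cos(theta)]])

r, theta = 2.0, 0.7    # an arbitrary point
J = jacobian(r, theta)
print(np.isclose(np.linalg.det(J), r))   # det J = r: dx dy becomes r dr dtheta
```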

Why is this important for statistical mechanics? Because one of the integrals you must evaluate is that of exp(-ax^2) from -infinity to +infinity, and the switch to polar coordinates is the way to do it. You also must evaluate integrals of this type to understand the kinetic theory of ideal gases.
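A quick numeric check of the Gaussian integral itself (the value of a is my own arbitrary choice): a simple Riemann sum over a wide interval matches the closed form sqrt(pi/a) that the polar coordinate trick produces.

```python
import numpy as np

# Check that the integral of exp(-a x^2) over the whole line equals sqrt(pi/a)
a = 2.5                                    # arbitrary positive constant
x = np.linspace(-10.0, 10.0, 200001)       # wide enough that the tails are negligible
dx = x[1] - x[0]
numeric = np.exp(-a * x**2).sum() * dx     # simple Riemann sum
print(np.isclose(numeric, np.sqrt(np.pi / a)))   # True
```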

Not necessary in this context, but one of the best discussions of the derivative in its geometric context I’ve ever seen is on pp. 105 –> 106 of Callahan’s book.

So these are some pointers and hints, not a full discussion — I hope it makes the road easier for you, should you choose to take it.


How formal tensor mathematics and the postulates of quantum mechanics give rise to entanglement

Tensors continue to amaze. I never thought I’d get a simple mathematical explanation of entanglement, but here it is. Explanation is probably too strong a word, because it relies on the postulates of quantum mechanics, which are extremely simple but which lead to extremely bizarre consequences (such as entanglement). As Feynman famously said, ‘no one understands quantum mechanics’. Despite that, it’s never made a prediction not confirmed by experiment, so the theory is correct even if we don’t understand ‘how it can be like that’. 100 years of correct predictions are not to be sneezed at.

If you’re a bit foggy on just what entanglement is — have a look at the link. Even better, read the book by Zeilinger referred to in the link (if you have the time).

Actually you don’t even need all the postulates of quantum mechanics (as given in the book “Quantum Computation and Quantum Information” by Nielsen and Chuang). No differential equations. No Schrodinger equation. No operators. No eigenvalues. What could be nicer for those thirsting for knowledge? Such a deal ! ! ! Just 2 postulates and a little formal mathematics.

Postulate #1 “Associated to any isolated physical system is a complex vector space with inner product (that is, a Hilbert space) known as the state space of the system. The system is completely described by its state vector, which is a unit vector in the system’s state space”. If this is unsatisfying, see the explication of this on p. 80 of Nielsen and Chuang (where the postulate appears).

Because the linear algebra underlying quantum mechanics seemed to be largely ignored in the course I audited, I wrote a series of posts called Linear Algebra Survival Guide for Quantum Mechanics. The first should be all you need, but there are several more.

Even though I wrote a post on tensors, showing how they were a way of describing an object independently of the coordinates used to describe it, I didn’t even discuss another aspect of tensors — multilinearity — which is crucial here.

Start by thinking of a simple tensor as a vector in a vector space. The tensor product is just a way of combining vectors in vector spaces to get another (and larger) vector space. So the tensor product isn’t a product in the sense that multiplication of two objects (real numbers, complex numbers, square matrices) produces another object of exactly the same kind.

So mathematicians use a special symbol for the tensor product — a circle with an x inside. I’m going to use something similar ‘®’ because I can’t figure out how to produce the actual symbol. So let V and W be the quantum mechanical state spaces of two systems.

Their tensor product is just V ® W. Mathematicians can define things any way they want. A crucial aspect of the tensor product is that it is multilinear. So if v and v’ are elements of V, then v + v’ is also an element of V (because two vectors in a given vector space can always be added). Similarly w + w’ is an element of W if w and w’ are. Adding to the confusion when trying to learn this stuff is the fact that all vectors are themselves tensors.

Multilinearity of the tensor product is what you’d think

(v + v’) ® (w + w’) = v ® (w + w’ ) + v’ ® (w + w’)

= v ® w + v ® w’ + v’ ® w + v’ ® w’

You get all 4 tensor products in this case.
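In coordinates, numpy’s kron plays the role of ® for column vectors, and the four-term expansion above can be checked directly (the random vectors are my own sketch, not from the post):

```python
import numpy as np

rng = np.random.default_rng(0)
v, vp = rng.standard_normal(2), rng.standard_normal(2)   # v and v' in V
w, wp = rng.standard_normal(3), rng.standard_normal(3)   # w and w' in W

# np.kron plays the role of the tensor product on coordinate vectors
lhs = np.kron(v + vp, w + wp)
rhs = np.kron(v, w) + np.kron(v, wp) + np.kron(vp, w) + np.kron(vp, wp)
print(np.allclose(lhs, rhs))   # True: all 4 tensor products appear
```

Note that the result lives in a 2 × 3 = 6 dimensional space — the “larger vector space” mentioned above.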

This brings us to Postulate #2 (actually #4 in the book, on p. 94 — we don’t need the other two — I told you this was fairly simple)

Postulate #2 “The state space of a composite physical system is the tensor product of the state spaces of the component physical systems.”

Where does entanglement come in? Patience, we’re nearly done. One now must distinguish simple and non-simple tensors. Each of the 4 tensor products in the sum on the last line is simple, being the tensor product of two vectors.

What about v ® w’ + v’ ® w ?? It isn’t simple because there is no way to get this by itself as simple_tensor1 ® simple_tensor2. So it’s called a compound tensor. (v + v’) ® (w + w’) is a simple tensor because v + v’ is just another single element of V (call it v”) and w + w’ is just another single element of W (call it w”).

So the tensor product of (v + v’) ® (w + w’) — the elements of the two state spaces can be understood as though V has state v” and W has state w”.

v ® w’ + v’ ® w can’t be understood this way. The full system can’t be understood by considering V and W in isolation, i.e. the two subsystems V and W are ENTANGLED.
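The simple/compound distinction can be checked mechanically: reshape a tensor in V ® W into a matrix, and it is simple exactly when that matrix has rank 1 (this is the Schmidt rank; the particular basis vectors below are my own choice, not from the post).

```python
import numpy as np

v,  vp = np.array([1.0, 0.0]), np.array([0.0, 1.0])   # basis of a 2-dimensional V
w,  wp = np.array([1.0, 0.0]), np.array([0.0, 1.0])   # basis of W

# Reshape a tensor in V (x) W into a 2 x 2 matrix; it is simple exactly when rank = 1
simple    = np.kron(v + vp, w + wp).reshape(2, 2)        # = v'' tensor w''
entangled = (np.kron(v, wp) + np.kron(vp, w)).reshape(2, 2)

print(np.linalg.matrix_rank(simple))      # 1 -- a product state
print(np.linalg.matrix_rank(entangled))   # 2 -- not expressible as one simple tensor
```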

Yup, that’s all there is to entanglement (mathematically at least). The paradoxes of entanglement, including Einstein’s ‘spooky action at a distance’, are left for you to explore — again Zeilinger’s book is a great source.

But how can it be like that, you ask? Feynman said not to start thinking these thoughts, and if he didn’t know, do you expect a retired neurologist to tell you? Please.

An old year’s resolution

One of the things I thought I was going to do in 2012 was learn about relativity.   For why, see the link.  Also my cousin’s new husband wrote a paper on a new way of looking at it.  I’ve been putting him off as I thought I should know the old way first.

I knew that general relativity involved lots of math such as manifolds and the curvature of space-time.  So rather than read verbal explanations, I thought I’d learn the math first.  I started reading John M. Lee’s two books on manifolds.  The first involves topological manifolds, the second involves manifolds with extra structure (smoothness) permitting calculus to be done on them.  Distance is not a topological concept, but is absolutely required for calculus — that’s what the smoothness is about.

I started with “Introduction to Topological Manifolds” (2nd. Edition) by John M. Lee.  I’ve got about 34 pages of notes on the first 95 pages (25% of the text), and made a list of the definitions I thought worth writing down — there are 170 of them. Eventually I got through a third of its 380 pages of text.  I thought that might be enough to help me read his second book “Introduction to Smooth Manifolds” but I only got through 100 of its 600 pages before I could see that I really needed to go back and completely go through the first book.

This seemed endless, and would probably take 2 more years.  This shouldn’t be taken as a criticism of Lee — his writing is clear as a bell.  One of the few criticisms of his books is that they are so clear, you think you understand what you are reading when you don’t.

So what to do?  A prof at one of the local colleges, James J. Callahan, wrote a book called “The Geometry of Spacetime” which concerns special and general relativity.  I asked if I could audit the course on it he’d been teaching there for decades.  Unfortunately he said “been there, done that” and had no plans ever to teach the course again.

Well, for the last month or so, I’ve been going through his book.  It’s excellent, with lots of diagrams and pictures, and wide margins for taking notes.  A symbol table would have been helpful, as would answers to the excellent (and fairly difficult) problems.

This also explains why there have been no posts in the past month.

The good news is that the only math you need for special relativity is calculus and linear algebra.  Really nothing more.  No manifolds.  At the end of the first third of the book (about 145 pages) you will have a clear understanding of

1. time dilation — why time slows down for moving objects

2. length contraction — why moving objects shrink

3. why two observers looking at the same event can see it happening at different times.

4. the Michelson Morley experiment — but the explanation of it in the Feynman lectures on physics 15-3, 15-4 is much better

5. The Kludge Lorentz used to make Maxwell’s equations obey the Galilean principle of relativity (e.g. Newton’s first law)

6. How Einstein derived Lorentz’s kludge purely by assuming the velocity of light was constant for all observers, never mind how they were moving relative to each other.  Reading how he did it, is like watching a master sculptor at work.

Well, I’ll never get through the rest of Callahan by the end of 2012, but I can see doing it in a few more months.  You could conceivably learn linear algebra by reading his book, but it would be tough.  I’ve written some fairly simplistic background linear algebra for quantum mechanics posts — you might find them useful.

One of the nicest things was seeing clearly what it means for different matrices to represent the same transformation, and why you should care.  I’d seen this many times in linear algebra, but never appreciated how simple reflection through an arbitrary line through the origin can be when you (1) rotate the line onto the x axis (through minus arctan(y/x) radians), (2) change the y coordinate to -y — an incredibly simple matrix — and (3) rotate it back to the original angle.

That’s why any two n x n matrices X and Y represent the same linear transformation if they are related by an invertible matrix Z in the following way:  X = Z^-1 * Y * Z
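Here’s a quick numpy check of the reflection recipe just described (the angle is my own arbitrary choice): conjugating the y -> -y flip by a rotation reproduces the standard closed form for reflection across a line at angle theta.

```python
import numpy as np

theta = 0.6   # arbitrary angle of the line through the origin

def rot(t):
    return np.array([[np.cos(t), -np.sin(t)],
                     [np.sin(t),  np.cos(t)]])

flip_x = np.array([[1.0,  0.0],    # reflection across the x axis: y -> -y
                   [0.0, -1.0]])

# rotate the line down to the x axis, flip, rotate back: Z^-1 * Y * Z with Z = rot(-theta)
refl = rot(theta) @ flip_x @ rot(-theta)

# closed form for reflection across a line at angle theta, for comparison
direct = np.array([[np.cos(2 * theta),  np.sin(2 * theta)],
                   [np.sin(2 * theta), -np.cos(2 * theta)]])
print(np.allclose(refl, direct))   # True
```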

Merry Christmas and Happy New Year (none of that Happy Holidays crap for me)

Willock pp. 51 – 104

This is a continuation of my notes, as I read “Molecular Symmetry” by David J. Willock.  As you’ll see, things aren’t going particularly well.  Examples of concepts are great once they’ve been defined, but in this book it’s examples first, definitions later (if ever).

p. 51 — Note all the heavy lifting  required to produce an object with only (italics) C4 symmetry (figure 3.6).  First,  you need 4 objects in a plane (so they rotate into each other), separated by 90 degrees.  That’s far from enough, as there are multiple planes of symmetry for 4 objects in a plane (I count 5 — how many do you get?).  So you need another 4 objects in a plane parallel to the first.  These objects must be a different distance from the symmetry axis, otherwise the object will have a C2 axis of symmetry, midway between the two planes. Lastly, no object in the second plane can lie on a line parallel to the axis of symmetry which contains an object in the first plane — i.e. the two groups of 4 must be staggered relative to each other.    It’s even more complicated for S4 symmetry.

p. 51 — The term classes of operation really hasn’t been defined (except by example).   Also this is the first example of (the heading of) a character table — which hasn’t been defined at this point.

p. 52 — Note H2O2 has C2 symmetry because it is not (italics) planar.   Ditto for 1,2-(S,S)-dimethylcyclopropane (more importantly this is true for disulfide bonds between cysteines forming cystines — a way of tying parts of proteins to each other).

p. 55 — Pay attention to the nomenclature: Cnh means that an axis of degree n is present along with a horizontal plane of symmetry.  Cnv means that, instead, a vertical plane of symmetry is present (along with the Cn axis)

p. 57 — Make sure you understand why C4h doesn’t  have vertical planes of symmetry.

p. 59 — A bizarre pedagogical device — defining groups whose first letter is D by something they are not (italics) — which itself (cubic groups) is at present undefined.  

Willock then regroups by defining what Dn actually is.

It’s a good exercise to try to construct the D4 point group yourself. 

p. 61 — “It does form a subgroup” — If subgroup was ever defined, I missed it.  Subgroup is not in the index (neither is group !).  Point group is in the index, and point subgroup is as well appearing on p. 47 — but point subgroup isn’t defined there.  

p. 62 — Note the convention — the Z direction is perpendicular to the plane of a planar molecule.

p. 64 — Why are linear molecules called Cinfinity ? — because any rotation around the axis of symmetry (the molecule itself) leaves the molecule unchanged, and there are an infinity of such rotations.

p. 67 — Ah,  the tetrahedron embedded in a cube — exactly the way an organic chemist should think of the sp3 carbon bonds.  Here’s a mathematical problem for you.  Let the cube have sides of 1, the bonds as shown in figure 3.27, the carbon in the very center of the cube — now derive the classic tetrahedral bond angle — answer at the end of this post. 

p. 67 — 74 — The discussions of symmetries in various molecules are exactly why you should have the conventions for naming them down pat.

p. 75 — in the second paragraph affect should be effect (at least in American English)

p. 76 — “Based on the atom positions alone we cannot tell the difference between the C2 rotation and the sigma(v) reflection, because either operation swaps the positions of the hydrogen atoms.”   Do we ever want to actually do this (for water that is)? Hopefully this will turn out to be chemically relevant. 

p. 77 — Note that the definition of character refers to the effect of a symmetry operation on one of an atom’s orbitals (not its position).  Does this only apply to atoms whose position is not (italics) changed by the symmetry operation?  Very important to note that the character is -1 only on reversal of the orbital — later on, non-integer characters will be seen.  Note also that each symmetry operation produces a character (number) for each orbital, so there are (number of symmetry operations) * (number of orbitals) characters in a character table.

p. 77 – 78 — Note that the naming of the orbitals is consistent with what has gone on before.  p(z) is in the plane of the molecule because that’s where the axis of rotation is.

Labels are introduced for each of the possible standard sets of characters (but standard set really isn’t defined).  A standard set (of sets of characters??) is an irreducible representation for the group.

Is one set of characters an irreducible representation by itself or is it a bunch of them? The index claims that this is the definition of irreducible representation, but given the ambiguity about what a standard set of characters actually is (italics), we don’t really know what an irreducible representation actually is.   This is definition by example, a pedagogical device foreign to math, but possibly a good one — we’ll see.  But at this point, I’m not really clear what an irreducible representation actually is.

p. 77 — In a future edition, it would be a good idea to label the x, y and z axes (and even perhaps draw in the px, py and pz orbitals), and, if possible, put figure 4.2 on the same page as table 4.2.  Eventually things get figured out but it takes a lot of page flipping.

p. 79 — Further tightening of the definition of a representation — it’s one row of a character table.

p. 79 — Nice explanation of orbital phases, but do electrons in atoms know or care about them?

p. 80 — Note that the x-y axes are rotated 90 degrees in going from figure 4.4a to figure 4.4b  (why?).   Why talk about d orbitals? Because they’re empty in H2O but possibly not in other molecules with C2v symmetry.

p. 80 — Affect should be effect (at least in American English)

p. 81 — B1 x B2 = A2 doesn’t look like a sum to me.  If you actually summed them you’d get 2 for E, -2 for C2, and 0 for the other two.  It does look like the product though.

pp. 81 – 82 — Far from sure what is going on in section 4.3

p.82 — Table 4.4b does look like multiplication of the elements of B1 by itself. 

p. 82 — Not sure when basis vectors first made their appearance, possibly here.  I slid over this on first reading since basis vectors were quite familiar to me from linear algebra (see the linear algebra category).  But again, the term is used here without really being defined.  Probably so as not to confuse, the first basis vectors shown are at 90 degrees to each other (x and y), but later on (p. 85) they don’t have to be — the basis vectors point along the 3 hydrogens of ammonia.

p. 83 — Very nice way to bring in matrices, but it’s worth noting that each matrix stands for just one symmetry operation.  But each matrix lets you see what happens to all (italics) the basis vectors you’ve chosen.

p. 84 — Get very clear in your mind that when you see an expression of the form

symmetry_operation1 symmetry_operation2 

juxtaposed to each other — that you do symmetry_operation2  FIRST.

p. 87  — Notice that the term character is acquiring a second meaning here — it no longer is the effect of a symmetry operation on one of an atom’s orbitals (not the atom’s position), it’s the effect of a symmetry operation on a whole set of basis elements.

p. 88 — Notice that in BF3, the basis vectors no longer align with the bonds (as they did in NH3), meaning that you can choose the basis vectors any way you want.  

p. 89 — Figure 4.9 could be markedly improved.  One must distinguish between two types of lines (interrupted and continuous), and two types of arrowheads (solid and barbed), making for confusion in the diagrams where they all appear together (and often superimposed).

Given the orbitals as combinations of two basis vectors, the character of a symmetry operation and a basis vector acquires yet another meaning — how much of the original orbital is left after the symmetry operation.

p. 91 — A definition of irreducible representations as the ‘simplest’ symmetry behavior.  Simplest is not defined.  Also for the first time it is noted that symmetries can be of orbitals or vibrations.  We already know they can be of the locations of the atoms in a molecule.  

Section 4.8 is extremely confusing.

p. 92 — We now find out what was going on with the character sum of 2 on p. 81 — the sums were 2 and 0 because the representations were reducible.


p. 93 (added 29 Jan ’12) — We later find out (p. 115) that the number of irreducible representations of a point group is the number of classes.  The index says that class is defined as an ‘equivalent set of operations’ — but how two distinct operations are equivalent is never defined, just illustrated.

p. 100 — Great to have the logic behind the naming of the labels used for irreducible representations (even if they are far from intuitive)

p. 101 — There is no explanation of the difference between basis vector and basis function. 

All in all, a very difficult chapter to untangle.  I’m far from sure I understand pp. 92 – 100.  However, hope lies in future chapters and I’ll push on.  I think it would be very difficult to learn from this book (so far) if you were totally unfamiliar with symmetry.

Answer to the problem on p. 67.  Let the sides of the cube be of length 1.  The bonds are all the same length, so the carbon must be in the center of the cube.  Any two of the bond ends lie at opposite corners of a face of the cube, so they are sqrt(2) apart.   Now drop a perpendicular from the middle of this line to the carbon in the center; it has length 1/2.  So we have a right triangle with legs 1/2 and (sqrt(2))/2, and half the bond angle is arctan(sqrt(2)).  Arctan(sqrt(2)) is 54.7356 degrees, giving the angle as 109.47 degrees.
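The same answer drops out of a few lines of numpy (the coordinates are my own choice of alternating cube corners, matching the tetrahedron-in-a-cube picture):

```python
import numpy as np

# Carbon at the center of a unit cube, bonds to four alternating corners
center = np.array([0.5, 0.5, 0.5])
corners = np.array([[0, 0, 0], [1, 1, 0], [1, 0, 1], [0, 1, 1]], dtype=float)

b1, b2 = corners[0] - center, corners[1] - center
cos_angle = b1 @ b2 / (np.linalg.norm(b1) * np.linalg.norm(b2))
print(round(np.degrees(np.arccos(cos_angle)), 2))   # 109.47
```

The cosine comes out to exactly -1/3, which is the textbook value for the tetrahedral angle.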

Linear Algebra survival guide for Quantum Mechanics -IX

The heavy lifting is pretty much done.  Now for some fairly spectacular results, and then back to reading Clayden et. al.  To make things concrete, let Y be a 3 dimensional vector with complex coefficients c1, c2 and c3.  The coefficients multiply a set of basis vectors (which exist, since all finite and infinite dimensional vector spaces have a basis).  The glory of abstraction is that we don’t actually have to worry about what the basis vectors actually are, just that they exist.  We are free to use their properties, one of which is orthogonality (I may not have proved this — you should if I haven’t). So the column vector is

c1
c2
c3
and the corresponding row vector (the conjugate transpose)  is 

c1*  c2*  c3*

Next, I’m going to write a corresponding Hermitian matrix M as follows, where Aij is an arbitrary complex number.

A11  A12   A13

A21  A22  A23

A31  A32  A33

Now form the product of the row vector with M (row vector on the left):

                   A11  A12  A13
c1*  c2*  c3*  ×   A21  A22  A23   =   X  Y  Z
                   A31  A32  A33

The net effect is to form another row vector with 3 components.   All we need for what I want to prove  is an explicit formula for  X

X =  c1*(A11) + c2*(A21) + c3*(A31)

When we  multiply the row vector obtained by the column vector on the right we get

c1 [ c1*(A11) + c2*(A21) + c3*(A31) ] + c2 [ Y ] + c3 [ Z ]  — which by assumption must be a real number 

Next, form the product of M with the column vector (column vector on the right):

A11  A12  A13       c1        X'
A21  A22  A23   ×   c2   =    Y'
A31  A32  A33       c3        Z'

This time all we need is X’  which is c1(A11) + c2(A12) + c3(A13)

When we multiply the column vector obtained by the row vector on the left we get

c1* [  c1(A11) + c2(A12) + c3(A13) ] + c2* Y’ + c3* Z’ — the same number as 

c1 [ c1*(A11) + c2*(A21) + c3*(A31) ] + c2 [ Y ] + c3 [ Z ]

Notice that c1, c2, c3 can each be any of the infinite number of complex numbers, without disturbing the equality. The ONLY way this can happen is if

c1*[c1(A11)] = c1[c1*(A11)]  — this is obviously true

and c1*[c2(A12)] = c1[c2*(A21)]  — something fishy

and c1*[c3(A13)] = c1[c3*(A31)]  — ditto

The last two equalities look a bit strange.  If you go back to LASGFQM – II , you will see that c1*(c2) does NOT equal c1(c2*).  However 

c1*(c2) does equal [ c1 (c2*) ]*.  They aren’t the same, but at least each is the complex conjugate of the other. This means that to make

c1*[c2(A12)] = c1[c2*(A21)],      A12 = A21* or  A12* = A21, which is the same thing.

So just by following the postulate of quantum mechanics about the type of linear transformation (called Hermitian) which can result in a measurement, we find that the matrix representing the linear transformation, the Hermitian matrix, has the property that Mij  = Mji*  (the first letter is the row index and the second is the column index).  This also means that the diagonal elements of any Hermitian matrix are real.  Now when I first bumped up against Hermitian matrices they were DEFINED this way, making them seem rather magical.  Hermitian matrices are in fact natural, and they do just what quantum mechanics wants them to do. 
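A small numpy sketch of the two facts just derived (the random matrix is my own construction): a matrix forced to equal its conjugate transpose has a real diagonal, and the ‘measurement’ c† M c comes out real for any complex coefficients c.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
M = A + A.conj().T    # forcing M to equal its conjugate transpose makes it Hermitian

print(np.allclose(M, M.conj().T))        # Mij = Mji*
print(np.allclose(np.diag(M).imag, 0))   # so the diagonal elements are real

c = rng.standard_normal(3) + 1j * rng.standard_normal(3)   # any complex coefficients
expectation = c.conj() @ M @ c
print(np.isclose(expectation.imag, 0))   # the 'measurement' c-dagger M c is real
```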

Some more nomenclature:  Mij  = Mji* means that a Hermitian matrix equals its conjugate transpose   (which is another even more obscure way to define them). The conjugate transpose of a matrix is called the adjoint.  This means that the row vector as we’ve defined it is the adjoint of the column vector.  This  also  is why Hermitian matrices are called self-adjoint.   

That’s about it. Hopefully when you see this stuff in the future, you won’t be just mumbling incantations.   But perhaps you are wondering, where are the eigenvectors, where are the eigenvalues in all this?  What happened to the Schrodinger equation beloved in song and story?   That’s for the course you’re taking, but briefly and without explanation, the basis vectors I’ve been talking about (without explicitly describing them) all result as follows:

Any Hermitian operator times wavefunction = some number times same wavefunction.  [1]

Several points:  many Hermitian operators change one wave function into another, so [ 1 ] doesn’t always hold.

IF [1] does hold the wavefunction is called an eigenfunction, and  ‘some number’ is the eigenvalue.  

There is usually a set of eigenfunctions for a given Hermitian operator — these are the basis functions (basis vectors of the infinite dimensional Hilbert space) of the vector space I was describing.  You find them by finding solutions of the Schrodinger equation H Psi = E Psi — that’s for your course, but at least now you know the lingo.   Hopefully, these last few words are  less frustrating than the way Tom Wolfe ended “The Bonfire of the Vanities” years ago — the book just stopped rather than ended.
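For a finite dimensional version of [1], numpy’s eigh (its eigensolver for Hermitian matrices) shows the eigenvalues are real and the eigenvectors form an orthonormal basis (the 2 x 2 matrix is my own toy example):

```python
import numpy as np

# A 2 x 2 Hermitian 'operator'; eigh is numpy's eigensolver for Hermitian matrices
M = np.array([[2.0, 1 - 1j],
              [1 + 1j, 3.0]])
eigenvalues, eigenvectors = np.linalg.eigh(M)

print(eigenvalues.dtype)   # float64: the eigenvalues of a Hermitian matrix are real
# The eigenvectors are orthonormal -- the 'basis vectors' of the vector space
print(np.allclose(eigenvectors.conj().T @ eigenvectors, np.eye(2)))
# Operator times eigenvector = eigenvalue times the same eigenvector, as in [1]
v, lam = eigenvectors[:, 0], eigenvalues[0]
print(np.allclose(M @ v, lam * v))
```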

I thought the course I audited was excellent, but we never even got into bonding.  Nonetheless, I think the base it gave was quite solid and it’s time to find out.  Michelle Francl recommended “Modern Quantum Chemistry” by Attila (yes Attila ! ) Szabo and Neil Ostlund as the next step.  You can’t beat the price as it’s a Dover paperback.  I’ve taken a brief look at “Molecular Quantum Mechanics” by Atkins and Friedman — it starts with the postulates and moves on from there.  Plenty of pictures and diagrams, but no idea how good it is.  Finally, 40 years ago I lived across the street from a physics grad student (whose name I can’t recall), and the real hot stuff back then was a book by Prugovecki called “Quantum Mechanics in Hilbert Space”.  Being a pack rat, I still have it. We’ll see.

One further point.  I sort of dumped on Giancoli’s book on Physics, which I bought when the course was starting up in 9/09 — pretty pictures and all that.  Having been through the first 300 pages or so (all on mechanics), I must say it’s damn good.  The pictures are appropriate, the diagrams well thought out, the exposition clear and user friendly without being sappy.

Time to delve.  

Amen Selah

Linear Algebra survival guide for Quantum Mechanics – VIII

Quantum mechanics has never made an incorrect prediction.  What does it predict? Numbers basically, and real numbers at that.  When you read a dial, or measure an energy in a spectrum, you get a (real) number.  Imaginary currents exist, but I don’t know if you can measure them (I’ll ask the EE who just married into the family this weekend).   So couple the real number output of a measurement with the postulate of quantum mechanics that tells you how to get it, and out pop Hermitian matrices.

A variety of equivalent postulate systems for QM exist  (Atkins uses 5, our instructor used 4).  All of them say that the state of the system is described by a wavefunction  (which we’re going to think of as a vector, since we’re in linear algebra land).  In LASGFQM – V the  equivalence of the integral of a function and a vector in infinite dimensional space was explained.  LASGFQM – VII explained why every linear transformation can be represented by a matrix, and why every matrix represents a linear transformation.

An operator is just a linear transformation of a vector space to itself.  This means that if we’re dealing with a finite dimensional vector space, the matrix representing the operator will be square.   Recalling the rules for matrix multiplication (LASGFQM – IV), this means that you can do things like this 

              x  x  x
y  y  y   ×   x  x  x   =   xy  xy  xy
              x  x  x

and things like this

x  x  x       z        xz
x  x  x   ×   z   =    xz
x  x  x       z        xz

Of course, way back at the beginning it was explained why, in the inner product of a vector V with itself, one copy has to be the complex conjugate V* (so that the inner product of a vector with itself is a real number), and in LASGFQM – VI it was explained why multiplying a row vector by a column vector gives a number.  Here it is:

            z
            z
            z
y  y  y     yz                a 1 x 1 matrix, i.e. a single number

So given that < V | V > really means < V* | V > to physicists, the inner product can be regarded as just another form of matrix multiplication, with the row vector being the conjugate transpose of the column vector.    

If you reverse the order of multiplication (column vector first, row vector second), you get an n x n matrix, not a number.  It should be pretty clear by now that you can multiply all 3 matrices together (row vector, n x n matrix, column vector) as long as you keep the order correct.  After all this huffing and puffing, you wind up with — drum roll — a number, which is in general complex because the vectors of quantum mechanics have complex coefficients (another one of the postulates).

We’re at a fairly high level of abstraction here.  We haven’t chosen a basis, but all vector spaces have one (even infinite vector spaces).   We’ll talk about them in the next (and probably final) post.

Call the column vector Y, the row vector X, and the matrix M.  We have X M Y = some number.  It should be clear that it doesn't matter which two matrices we multiply together first, e.g. (X M) Y = X (M Y).
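Here's a quick numerical check (in Python with NumPy; my own illustration, not anything from the course) that the grouping doesn't matter, and that the result of row vector times matrix times column vector is a single complex number:

```python
import numpy as np

# A row vector, a 3 x 3 matrix, and a column vector, all with complex
# entries.  The concrete numbers are made up purely for illustration.
row = np.array([[1 - 1j, 2, 3j]])             # 1 x 3
M   = np.array([[1 + 1j, 0, 2],
                [3, 1 - 2j, 0],
                [0, 1, 4 + 1j]])              # 3 x 3
col = np.array([[2 + 1j], [1], [1 - 1j]])     # 3 x 1

# It doesn't matter which two you multiply together first.
left_first  = (row @ M) @ col   # a 1 x 1 "matrix" -- a single complex number
right_first = row @ (M @ col)
```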

Recall that differentiation and integration are linear operators, so they can be represented by matrices.  The wavefunction is represented by a column vector.  Various things you want to know (kinetic energy, position) are represented by linear operators in QM.  

Here's the postulate: for a given wavefunction Y, any measurement on it (given by a linear operator M) always yields a REAL number, given by the conjugate transpose of Y, times M, times Y (the column vector).

You have to accept the postulate (because it works ! ! !)  as the QM instructor  said many times.   Don’t ask how it can be like that (Feynman).   

This postulate is all that it takes to make the linear transformation M a very special one — i.e. a Hermitian matrix, with all sorts of interesting properties.  Hermite described these matrices in 1855, long before QM.  I've tried to find out what he was working on without success.  More about the properties of Hermitian matrices next time, but to whet your appetite: if an element of M is written Mij, where i is the row and j is the column, and Mij is a complex number, then Mji is the complex conjugate of Mij.  Believe it or not, this all follows from the postulate.
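A small numerical sketch (Python with NumPy, my own construction) of the two properties just mentioned: Mji is the complex conjugate of Mij, and the measurement number comes out real.

```python
import numpy as np

# Any complex matrix A plus its own conjugate transpose is Hermitian;
# the entries of A here are arbitrary, chosen at random for illustration.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 3)) + 1j * rng.standard_normal((3, 3))
M = A + A.conj().T            # Hermitian: M[j, i] == conj(M[i, j])

# Conjugate transpose of Y, times M, times Y -- real for ANY vector Y.
Y = rng.standard_normal(3) + 1j * rng.standard_normal(3)
measurement = Y.conj() @ M @ Y
```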

Linear Algebra survival guide for Quantum Mechanics – VII

In linear algebra all the world's a matrix (even vectors).  Everyone (except me in the last post) numbers matrix elements by the following subscript convention — the row always comes first, then the column (mnemonic: Roman Catholic).  Similarly, matrix size is always written a x b, where a is the number of rows and b the number of columns.  Vectors in quantum mechanics are written both ways, as column vectors (n x 1) or as row vectors (1 x n).

Vectors aren’t usually called matrices, but matrices they are when it comes to multiplication. Vectors can be multiplied by a matrix (or multiply a matrix) using the usual matrix multiplication rules.  That’s one reason the example in LASGFQM – VI was so tedious — I wanted to show how matrices of different sizes could be multiplied together.  The order of the matrices is crucial.  The first matrix A must have the same number of columns  that the second matrix (B) has rows — otherwise it just doesn’t work.  The product matrix has the number of rows of matrix A and the columns of matrix B.  

So  it is possible to form  A B where A is 3 x 4 and B is 4 x 5 giving a 3 x 5 matrix, but B A makes no sense.  If you get stuck use the Hubbard method of writing them out (see the last post).  Here is a 3 x 3 matrix (A) multiplying a 3 x 1 matrix (vector B)




              B11
              B21
              B31
A11 A12 A13   A11*B11 + A12*B21 + A13*B31   — this is a single number
A21 A22 A23   A21*B11 + A22*B21 + A23*B31   — ditto
A31 A32 A33   A31*B11 + A32*B21 + A33*B31

AB is just another 3 x 1 vector.  So the matrix just transforms one 3 dimensional vector into another.
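The same multiplication in Python with NumPy (my own sketch, not part of the original post): it works one way and fails the other.

```python
import numpy as np

A = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])       # 3 x 3 matrix
B = np.array([[1], [0], [2]])   # 3 x 1 column vector

AB = A @ B                       # another 3 x 1 column vector

# B A is impossible: (3 x 1) times (3 x 3) -- 1 column vs 3 rows.
try:
    B @ A
    BA_possible = True
except ValueError:
    BA_possible = False
```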

You should draw a similar diagram and see why B A is impossible.  What about C (1 x 3, a row vector) times D (3 x 3)?  You get CD, a 1 x 3 matrix (row vector), back.

              D11 D12 D13
              D21 D22 D23
              D31 D32 D33
C11 C12 C13   CD11 CD12 CD13                   What is CD12?

Suppose we get concrete and make B the column vector whose entries are 1, 0, 0:

              1
              0
              0
A11 A12 A13   A11
A21 A22 A23   A21
A31 A32 A33   A31

The first time I saw this, I didn't understand it.  I thought the mathematicians were going back to the old Cartesian system of standard orthonormal vectors.  They weren't doing this at all.  Recall that we're in a vector space and the column vector is really the 3 coefficients multiplying the 3 basis vectors (which are not specified).  So you don't have to mess around with choosing a basis; the result is true for ALL bases of a 3 dimensional vector space.  The power of abstraction.  The first column of A shows what the first basis vector goes to (in general), the second column shows what the second basis vector goes to.  Back in LASGFQM – IV, it was explained why any linear transformation (call it T) of a basis vector (call it C1) to another vector space must look like this

T(C1) =  t11 * D1 + t12 * D2 + . ..   for however many basis vectors vector space D has.

Well, in the above example we're going from a 3 dimensional vector space to another, and the first column of matrix A tells us what basis vector #1 is going to.  This is why every linear transformation can be represented by a matrix and every matrix represents a linear transformation.  Sometimes abstraction saves a lot of legwork.

A more geometric way to look at all this is to regard an  n x n matrix multiplying an n x 1 vector as moving it around in n dimensional space (keeping one end fixed at the origin — see below).  So 

1  0  0 

0  1  0 

0  0  2

just multiplies the third basis vector by 2 leaving the other two alone.  
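The same matrix in NumPy (my own illustration):

```python
import numpy as np

M = np.diag([1.0, 1.0, 2.0])      # the matrix above
v = np.array([3.0, -1.0, 5.0])    # coordinates in some (unspecified) basis

w = M @ v   # third coordinate doubled, the other two left alone
```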

The notation is consistent.  Recall that any linear transformation must leave the zero vector unchanged (see LASGFQM – I for a proof).  Given the rules for multiplying a matrix times a vector, this happens with a column vector which is all zeros.

The geometrically inclined can start thinking about what the possible linear transformations can do to three dimensional space (leaving the origin fixed).  Rotations about the origin are one possibility, expansion or contraction along a single basis vector are two more, projections down to a 2 dimensional plane or a 1 dimensional line are two more.  There are others (particularly when we’re in a vector space with complex numbers for coefficients — e.g. all of quantum mechanics). 
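Two of those possibilities sketched numerically (Python with NumPy, my own examples): a rotation about the third axis, and a projection down to a plane.  Both, being linear, leave the origin fixed.

```python
import numpy as np

theta = np.pi / 2   # rotate 90 degrees about the third basis vector
R = np.array([[np.cos(theta), -np.sin(theta), 0],
              [np.sin(theta),  np.cos(theta), 0],
              [0,              0,             1]])

P = np.diag([1.0, 1.0, 0.0])   # squash everything onto the 1-2 plane

v = np.array([1.0, 0.0, 4.0])
rotated   = R @ v   # the first basis vector rotates onto the second
projected = P @ v   # the third coordinate disappears

# Both transformations leave the zero vector unchanged.
zero_stays = R @ np.zeros(3)
```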

Up next time, eigenvectors, adjoints, and (hopefully) Hermitian operators.  That will be about it.  The point of these posts (which are far more extensive than I thought they would be when I started out) is to show you how natural the language of linear algebra is, once you see what’s going on under the hood.  It is not to teach quantum mechanics, which I’m still learning to see how it is used in chemistry.  QM is far from natural (although it describes the submicroscopic world — whether it can ever describe the world we live in is another question), but, if these posts are any good at all, you should be able to understand the language in which QM is expressed.

Linear Algebra survival guide for Quantum Mechanics – VI

Why is linear algebra like real estate?   Well, in linear algebra the 3 most important things are notation, notation, notation.  I’ve shown how two sequential linear transformations can be melded into one, but you’ve seen nothing about the matrix representation of a linear transformation.  

Here's the playing field from LASGFQM – IV again.  There are 3 vector spaces, A, B and C of dimensions 3, 4, and 5, with bases {A1, A2, A3}, {B1, B2, B3, B4} and {C1, C2, C3, C4, C5}.  Then there is linear transformation T which transforms A into B, and linear transformation S which transforms B into C.

We have T(A1) = AB11 * B1 + AB12 * B2 + AB13 *B3 + AB14*B4

S(B1) = BC11 *C1 + BC12 *C2 + BC13 *C3 + BC14 * C4 + BC15 * C5
S(B2) = BC21 *C1 + BC22 *C2 + BC23 *C3 + BC24 * C4 + BC25 * C5
S(B3) = BC31 *C1 + BC32 *C2 + BC33 *C3 + BC34 * C4 + BC35 * C5
S(B4) = BC41 *C1 + BC42 *C2 + BC43 *C3 + BC44 * C4 + BC45 * C5

To see the symmetry of what is going on you may have to make the print size smaller so the equations don’t slop over the linebreak. 

So after some heavy lifting we eventually arrived at: 

T(A1) = AB11 * ( BC11 * C1  +  BC12 * C2  +  BC13 * C3   +   BC14 * C4   +   BC15 * C5 ) +

                AB12 * ( BC21 * C1  +  BC22 * C2  +  BC23 * C3   +   BC24 * C4   +   BC25 * C5 ) +

                AB13 * ( BC31 * C1  +  BC32 * C2  +  BC33 * C3   +   BC34 * C4   +   BC35 * C5 ) +

                 AB14 * ( BC41 * C1  +  BC42 * C2  +  BC43 * C3   +   BC44 * C4   +   BC45 * C5 )

So that 

S(T(A1)) = (AB11 * BC11 + AB12 * BC21 + AB13 * BC31 + AB14 * BC41) C1  +  

         (AB11 * BC12 + AB12 *BC22 + AB13 * BC32 + AB14 * BC42)  C2 + 

   etc. etc. 

All very open and above board, and obtained just by plugging the B's in terms of the C's into the A's in terms of the B's to get the A's in terms of the C's.

Notice that what we could call AC11 is just AB11 * BC11 + AB12 * BC21 + AB13 * BC31 + AB14 * BC41, and AC12 is just AB11 * BC12 + AB12 * BC22 + AB13 * BC32 + AB14 * BC42.  We need another 13 such sums to be able to express a vector in A (which is a unique linear combination of A1, A2, A3 because the three of them are a basis) in terms of the 5 C basis vectors.  It's dreary but it can be done, and you just saw part of it.

You don’t want to figure this out all the time.  So represent T as a rectangular array with 4 rows and 3 columns

AB11   AB21  AB31
AB12   AB22  AB32
AB13   AB23  AB33
AB14   AB24  AB34

Represent S as a rectangular array with 5 rows and 4 columns 

BC11   BC21   BC31  BC41
BC12   BC22  BC32  BC42
BC13   BC23  BC33  BC43
BC14   BC24  BC34  BC44
BC15   BC25  BC35  BC45

Now plunk the array of AB’s on top of (and to the right) of the array of BC’s

                                                 AB11   AB21  AB31
                                                AB12   AB22  AB32
                                                AB13   AB23  AB33
                                                AB14   AB24  AB34
BC11   BC21   BC31  BC41  AC11
BC12   BC22  BC32  BC42
BC13   BC23  BC33  BC43
BC14   BC24  BC34  BC44
BC15   BC25  BC35  BC45

Recall that (after much tedious algebra) we obtained that

AC11 was just AB11 * BC11 + AB12 * BC21 + AB13 * BC31 + AB14 * BC41

But AC11 is just as if the first row of the BC array were a vector and the first column of the AB array were also a vector and you formed the dot product.  Well, they are, and you did just that to find element AC11 of the array representing the linear transformation from A to C.  Do this 14 more times to get all 15 possible combinations of 3 As and 5 Cs and you get an array of numbers with 5 rows and 3 columns.  This is the AC matrix, and this is why matrix multiplication is the way it is.
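A numerical check of the same idea (Python with NumPy, my own sketch, using the standard row-column convention rather than the post's admittedly backwards but consistent labeling): composing the two transformations one after the other agrees with applying the single product matrix.

```python
import numpy as np

# Made-up integer matrices, chosen at random for illustration.
rng = np.random.default_rng(1)
T = rng.integers(-3, 4, size=(4, 3))   # A (dim 3) -> B (dim 4)
S = rng.integers(-3, 4, size=(5, 4))   # B (dim 4) -> C (dim 5)

a = rng.integers(-3, 4, size=3)        # an arbitrary vector in A

two_steps = S @ (T @ a)   # first T, then S
one_step  = (S @ T) @ a   # the 5 x 3 product matrix, applied once
```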

Note: we have multiplied a 5 row times 4 column array by a 4 row 3 column array.  Recall that you can only form the inner product of vectors with the same numbers of components (e.g. they have to be in vector spaces of the same dimension).  

We have T: A to B (dimension 3 to dimension 4)

                 S: B to C (dimension 4 to dimension 5)

This is written as ST (convention has it that the transformation on the right is always done first — this takes some getting used to, but at least everyone follows it, so it's like medical school — the appendix is on the right, just remember it).  Notice that TS makes absolutely no sense.  S takes you to a vector space of dimension 5, then T tries to start with a different vector space.  This is why when multiplying arrays (matrices) the number of columns of the matrix on the left must match the number of rows of the matrix on the right (or on the top as I've drawn it — thanks to John and Barbara Hubbard and their great book on Vector Calculus).  If the two matrices are rectangular (as we have here), only one way of multiplication is possible.

More notation, and an apology.  Matrix T is a 4 row by 3 column matrix — this is always written as a 4 x 3 matrix.  Similarly for the coefficients of each element which I have in some way screwed up (but at least I did so consistently).  Invariably the matrix element (just a number) in the 3rd column of the fourth row is written element43 — If you look at what I’ve written everything is bassackwards.  Sorry, but the principles are correct. The mnemonic for the order of the coefficients is Roman Catholic (row column), a nonscatological mnemonic for once. 

That’s a lot of tedium, but it does explain why matrix multiplication is the way it is.  Notice a few other things.  The matrices you saw were 4 x 3 and 5 x 4, but 3 x 1 matrices are possible as well.  Such matrices are called column vectors.  Similarly 1 x 3 matrices exist and are called row vectors.  So what do you get if you multiply a 1 x 3 vector by a 3 x 1 vector?  

You get a 1 x 1 matrix, or a number.  This is another way to look at the inner product of two vectors.  Usually vectors are written as column vectors (n x 1), with n rows and 1 column.  The 1 x n row vector is known as the transpose of the column vector.

That’s plenty for now.  Hopefully the next post will be more interesting.  However, physics needs to calculate things and see if the numbers they get match up with experiment.  This means that they must choose a basis for each vector space, and express each vector as an array of coefficients of that basis.  Mathematicians avoid this where possible, just using the properties of vector space bases to reason about linear transformations, and the properties of various linear transformations to reason about bases.  You’ll see the power of this sort of thinking in the next post.  If you ever study differentiable manifolds you’ll see it in spades.

Linear Algebra survival guide for Quantum Mechanics – V

We’ve established a pretty good base camp for the final assault.  It’s time to acclimate to the altitude, look around and wax a bit philosophic.  What’s happened to the integrals and derivatives in all of this?  A vector is a vector and its components can be differentiated, but linear algebra never talks about integrating vectors.  During the QM course, I was constantly bombarding the instructor with questions about things I didn’t understand.  Finally, he said that he wished the students were asking those sorts of questions.  I told him they were just doing what most people do on their first exposure to QM — trying to survive.  That’s certainly the way I was the first time around QM.  True for calculus as well.    I quickly learned to ignore what a Riemann integral really is — the limit of an infinite sum of products.  Cut the baloney, to integrate something just find the antiderivative.  We all know that.   Well, that’s pretty much true for continuous functions and the problems you meet in Calculus I.  

Well, you're not in Kansas anymore, and to understand why an infinite dimensional vector is like an integral, you've got to go back to Riemann's definition of the integral of a function.  You start with some finite interval (infinite intervals come later).  Then you chop it up into many (say 100) smaller nonoverlapping but contiguous subintervals (each of which has a finite nonzero length).  Then you pick one value of the function in each of the intervals (which can't be infinite or the process fails), multiply it by the length of each subinterval, and form the sum of all 100 products (which is just a number after all).  Then you chop each of the subintervals into subsubintervals and repeat the process, obtaining a new number.  If the sequence of numbers approaches a limit as the process proceeds, then the integral exists and is a number.  Purists will note that I've skipped all sorts of analysis, such as requiring that each interval be a compact (closed and bounded) set of real numbers, and that the function be continuous on the intervals so that it reaches a maximum and a minimum on each, and noting that, if the integral exists, the sums of the maxima times the interval lengths and the sums of the minima times the interval lengths approach each other, etc.  Parenthetically, the best analysis book I've met is "Understanding Analysis" by Stephen Abbott.

As you subdivide, the lengths of the sub-sub-...-subintervals get smaller and smaller (and of course more numerous).  What if you call each of the subintervals a dimension rather than an interval, and the value of the function the coefficient of the vector on that dimension?  Then as the number of subintervals increases, the plot of the function values you've chosen for each interval gets closer and closer to the function itself, so that plotting a high dimension vector looks just like the continuous function you started with.  This is why an infinite dimensional vector looks like the integral of a function (and why quantum mechanics uses them).
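The subdivision process is easy to watch numerically.  A minimal sketch (Python with NumPy, my own example, using f(x) = x² on [0, 1], whose integral is exactly 1/3):

```python
import numpy as np

def riemann_sum(f, a, b, n):
    """Left Riemann sum: one function value per subinterval, times its length."""
    edges = np.linspace(a, b, n + 1)
    width = (b - a) / n
    return np.sum(f(edges[:-1]) * width)

# More subintervals -> the sum of products creeps toward the integral (1/3),
# and the vector of sampled values looks more and more like the function.
coarse = riemann_sum(lambda x: x**2, 0.0, 1.0, 100)
fine   = riemann_sum(lambda x: x**2, 0.0, 1.0, 100_000)
```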

Now imagine a linear transformation of this vector into another vector in the same infinite dimensional space, and you're almost to what quantum mechanics means by an operator.  Inner products of infinite dimensional vectors can be defined (with just a minor bit of heavy lifting).  Just multiply the coefficients of the vectors in each dimension together and form their sum.  This sum needn't blow up.  Let the nth coefficient of vector #1 be 1/2^n, and that of vector #2 be 1/3^n.  The sum of even an infinite number of such products is finite.  This implies that, to be of use in QM, the coefficients of any of its infinite vectors must form a convergent series.
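The 1/2^n, 1/3^n example, truncated far enough out that the tail is negligible (plain Python, my own illustration): each product is 1/6^n, and the geometric series (1/6)/(1 − 1/6) sums to exactly 1/5.

```python
# Inner product of two "infinite dimensional" vectors, coefficient by
# coefficient: (1/2**n) * (1/3**n) = 1/6**n, summed over n = 1, 2, 3, ...
# Truncating at n = 59 leaves a tail far below floating-point precision.
terms = [(1 / 2**n) * (1 / 3**n) for n in range(1, 60)]
inner_product = sum(terms)
```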

Now, what if some (or all) of the coefficients are complex numbers?  No problem, because of the way inner product of vectors with complex coefficients was defined in the second post of the series.  The inner product of (even an infinite dimensional ) complex vector with itself is guaranteed to be a real number.  You’re almost in the playing field of QM, e.g. Hilbert space — an infinite dimensional space with an inner product defined on it.  The only other thing needed for Hilbert space is something called completeness, something I don’t understand well enough to explain, but it means something like plugging up the holes in the space, in the same way that the real numbers plug the holes in the rational numbers. 

Certainly not in Kansas anymore, and apparently barely in Physics either.  It's time to respond to Wavefunction's comment on the last post: "It's interesting that if you are learning "practical" quantum mechanics such as quantum chemistry, you can get away with a lot without almost any linear algebra. One has to only take a look at popular QC texts like Levine, Atkins, or Pauling and Wilson; almost no LA there."  So what's the point of all these posts?

It's back to Feynman and another of his famous quotes: "I think I can safely say that nobody understands quantum mechanics."  This from 1965.  A look at Louisa Gilder's recent book "The Age of Entanglement" should convince you that, on a deep level, no one still does.  Feynman also warns us not to start thinking 'how can it be like that' (so did the instructor in the QM course).  So why all this verbiage?

Because all QM follows from a few simple postulates, and these postulates are written in linear algebra.  Hopefully at the end of this, you’ll understand the language in which QM is written, so any difficulty will be with the underlying structure of QM (which is plenty), not the way QM is expressed (or why it is expressed the way it is).

Next up, vector and matrix notation and what the adjoint is, and why it’s important.  If you begin thinking hard about the inner product of two different complex vectors (even the finite ones) you’ll see that usually a complex number will result.  How does QM avoid this (since all measurable values must be real — one of the postulates). Adjoints and Hermitian operators are the way out.  There’s still some pretty hard stuff ahead.

Linear Algebra survival guide for Quantum Mechanics – IV

 The point of this post is to show from whence  the weird definition of matrix multiplication comes, and why it simply MUST be the way it is. Actually matrices don’t appear in this post, just the underlying equations they represent.   We’re dealing with spaces of finite dimension at this point (infinite dimensional spaces come later).  Such spaces have a basis — meaning a collection of elements (basis vectors) which are enough to describe every element of the space UNIQUELY, as a linear combination.  

To make things a bit more concrete, think of good old 3 dimensional space with basis vectors E1 = (1,0,0) aka i, E2 = (0,1,0) aka j, and E3 = (0,0,1) aka k.  Every point in this space is uniquely described as a1 * E1 + a2 * E2 + a3 * E3 — e.g. a linear combination of the 3 basis vectors.  You can also think of each point as a vector from the origin (0,0,0) to the point (a1,a2,a3).  Once you establish what the basis is, each vector is specified by its (unique) triple of numerical coordinates (a1, a2, a3).  Choose a different basis and you get a different set of coordinates, but you always get no more and no less than 3 coordinates — that's what dimension is all about.  Note that the combination of basis vectors is linear (no powers greater than 1).

So now we’re going to consider several spaces, namely A, B and C of dimensions 3, 4 and 5.  Their basis vectors are the set {A1, A2, A3 } for A,  {B1, B2, B3, B4 } for B — fill in the dots for C. 

What does a linear transformation from A to B look like?   Because of the way things have been set up, there is really no choice at all. 

Consider any vector of A — it must be of the form a1 * A1  +  a2 * A2  +  a3 * A3, e.g. a linear combination of the basis vectors {A1, A2, A3} — where the { } notation means set.  For any given vector in A, a1, a2, and a3 are uniquely determined.  Sorry to stress this so much, but uniqueness is crucial.

Similarly any vector of C must be of the form  c1 * C1 + c2 * C2 + c3 * C3 + c4 * C4 + c5 * C5.  Go back and fill in the dots for B. 

Any linear function T from A to B must satisfy

T (X + Y) = T(X) + T(Y)

where X and Y are vectors in A and T(X), T(Y) are vectors in B.  So what?  A lot.  We only have to worry about what T does to A1, A2 and A3.  Why ? ?  Because the {Ai} are  basis vectors, and because of the second thing a linear function must satisfy

T ( number * X) = number * (T ( X))  so combining both properties

T (a1 * A1 + a2 * A2 + a3 * A3) = a1 * T(A1) + a2 * T(A2) + a3 * T(A3)

All we have to worry about is what T does to the 3 basis vectors of A.  Everything else follows easily enough.
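A quick check of this in NumPy (my own example; T here is an arbitrary made-up map from dimension 3 to dimension 4): T of a linear combination of basis vectors equals the same combination of T of the basis vectors.

```python
import numpy as np

T = np.array([[1, 2, 0],
              [0, 1, 3],
              [4, 0, 1],
              [2, 2, 2]])        # a linear transformation from dim 3 to dim 4

A1, A2, A3 = np.eye(3)           # the three basis vectors of A
a1, a2, a3 = 2.0, -1.0, 5.0      # coordinates of an arbitrary vector in A

lhs = T @ (a1 * A1 + a2 * A2 + a3 * A3)
rhs = a1 * (T @ A1) + a2 * (T @ A2) + a3 * (T @ A3)
```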

So what is T(A1)?  Well, it's a vector in B.  Since B has a basis, T(A1) is a unique linear combination of the basis vectors of B.  Now the nomenclature will shift a bit.  I'm going to write T(A1) as follows.

T(A1)  =  AB11 * B1  +  AB12 * B2  +  AB13 * B3  +  AB14 * B4

AB signifies that the function is from space A to space B; the numbers after AB are to be taken as subscripts.  Terms of art: linear functions between vector spaces are usually called linear transformations.  When the vector spaces on either end of the transformation are the same, the linear transformation is called a linear operator (or operator for short).  Sound familiar?  An example of a linear operator in 3 dimensional space would just be a rotation of the coordinate axes, leaving the origin fixed.  For why the origin has to be fixed if the transformation is to be linear, see the first post in the series.

Fill in the dots for T(A2) = AB21 * B1 + . . . 

T(A3) = AB31 * B1 + . . . 

Now for a blizzard of (similar and pretty simple) algebra.  Consider the linear transformation from B to C.  Call the transformation S.  I'm going to stop putting the Bi's and Ci's in bold; you know they are basis vectors.  Also, in what follows, to get the equations to line up on top of each other you might have to make the characters smaller (say by holding down the Command and the minus key at the same time — in the Apple world).

S(B1)  =  BC11 * C1  +  BC12 * C2  +  BC13 * C3   +   BC14 * C4   +   BC15 * C5

S(B2)  =  BC21 * C1  +  BC22 * C2  +  BC23 * C3   +   BC24 * C4   +   BC25 * C5
S(B3)  =  BC31 * C1  +  BC32 * C2  +  BC33 * C3   +   BC34 * C4   +   BC35 * C5
S(B4)  =  BC41 * C1  +  BC42 * C2  +  BC43 * C3   +   BC44 * C4   +   BC45 * C5

It’s pretty simple to plug S(Bi) into T(A1). 

Recall that T(A1) = AB11 * B1  +  AB12 * B2  + AB13 * B3  + AB14 * B4

So we  get

T(A1) = AB11 * ( BC11 * C1  +  BC12 * C2  +  BC13 * C3   +   BC14 * C4   +   BC15 * C5 ) +

                AB12 * ( BC21 * C1  +  BC22 * C2  +  BC23 * C3   +   BC24 * C4   +   BC25 * C5 ) +

                AB13 * ( BC31 * C1  +  BC32 * C2  +  BC33 * C3   +   BC34 * C4   +   BC35 * C5 ) +

                 AB14 * ( BC41 * C1  +  BC42 * C2  +  BC43 * C3   +   BC44 * C4   +   BC45 * C5 )

So now we have a linear transformation of space A to space C, just by simple substitution.   Do you see the pattern yet? If not just collect terms of A1 in terms of {C1, C2, C3, C4, C5}.  It’s easy to do as they are all above each other.  If we write

S(T(A1))  = AC11 * C1  +  AC12 * C2  +  AC13 * C3  + AC14 * C4  +  AC15 * C5 

you can see that AC13 = AB11 * BC13 + AB12 * BC23 + AB13 * BC33 + AB14 * BC43.  This is the sum of 4 terms, each of the form AB1x * BCx3, where x runs from 1 to 4.

This should look very familiar if you know the formula for matrix multiplication.  If not don’t sweat it, I’ll discuss matrices next time, but you’ve basically  just seen them (they’re just a compact way of representing the above equations).   Linear transformations between (appropriately dimensioned) vector spaces can always be mushed together (combined) like this.  Why? (1) all finite dimensional vector spaces have a basis, with all that goes with them  and (2) linear transformations are a very special type of function (according to an instructor in a graduate algebra course — the only type of function mathematicians understand completely).  
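The entrywise pattern can be spelled out in NumPy (my own sketch, using standard row-column indexing, so "AC13" corresponds to row 1, column 3 of the product): each entry of the combined matrix is a sum of one term per dimension of the middle space B.

```python
import numpy as np

# Made-up integer matrices, chosen at random for illustration.
rng = np.random.default_rng(2)
T = rng.integers(-2, 3, size=(4, 3))   # A (dim 3) -> B (dim 4)
S = rng.integers(-2, 3, size=(5, 4))   # B (dim 4) -> C (dim 5)
ST = S @ T                              # A -> C, a 5 x 3 matrix

# Row 1, column 3 of the product (0-indexed: [0, 2]) really is a sum of
# 4 terms, one per basis vector of the middle space B.
entry_1_3 = sum(S[0, k] * T[k, 2] for k in range(4))
```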

It is the very simple algebra of combining linear transformations between finite dimensional vector spaces that makes matrix multiplication exactly what it is.  It simply can’t be anything else.  Now you know.   Quantum mechanics is written in this language, the syntax of which is the linear transformation, the representation the matrix.  Remarkably, when Heisenberg formulated quantum mechanics this way, he knew nothing about matrices.  A Hilbert trained mathematician and physicist (Max Born) had to tell him what he was really doing.  So much for the notion that physicists shoehorn our view of the world into a mathematical mold.  Amazingly, the mathematics always seems to get there first (Newton excepted). 

