Category Archives: Math

The many ways the many tensor notations can confuse you

This post is for the hardy autodidacts attempting to learn tensors on their own. If you use multiple sources, you’ll find that they define the same terms used to describe tensors in diametrically opposed ways, so that just when you thought you knew what terms like covariant and contravariant tensor meant, another source defines them completely differently, leading you to wonder about (1) your intelligence and (2) your sanity.

Tensors involve vector spaces and their bases. This post assumes you know what they are. If you don’t understand how a vector can be expressed in terms of coordinates relative to a basis, pick up any book on linear algebra.

Tensors can be defined by the way their elements transform under a change of coordinate basis. This is where the terms covariant and contravariant come from. By the way, when Einstein says that physical quantities must transform covariantly, he means they transform like tensors do (even contravariant tensors).

True enough, but this approach doesn’t help you understand the term tensor product or the weird ® notation (where there is an x within the circle) used to describe it.

The best way to view tensors (from a notational point of view) is to look upon them as functions which take finite Cartesian products (https://en.wikipedia.org/wiki/Cartesian_product) of vectors and covectors and produce a single real number.

To understand what a covector (aka dual vector) is, you must understand the inner product (aka dot product).

The definition of the inner product (dot product) of a vector V with itself, written < V | V >, probably came from the notion of vector length. Given the standard basis in two-dimensional space, E1 = (1,0) and E2 = (0,1), every vector V can be written as x * E1 + y * E2 (x is known as the coefficient of E1). Vector length is given by the good old Pythagorean theorem as SQRT[ x^2 + y^2 ]. The dot product (inner product) of V with itself is just x^2 + y^2, without the square root.

In 3 dimensions the distance of a point (x, y, z) from the origin is SQRT[ x^2 + y^2 + z^2 ]. The definition of vector length (or distance) extends easily (by analogy) to n dimensions, where the length of V is SQRT[ x1^2 + x2^2 + . . . + xn^2 ] and the dot product of V with itself is x1^2 + x2^2 + . . . + xn^2. Length is always a non-negative real number.

The definition of inner product also extends to the dot product of two different vectors V and W, where V = v1 * E1 + v2 * E2 + . . . + vn * En and W = w1 * E1 + . . . + wn * En; that is, < V | W > = v1 * w1 + v2 * w2 + . . . + vn * wn. Again always a real number, but not always positive, as any of the v’s and w’s can be negative.
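If you like to see things in code, here is a minimal sketch of these definitions (my own example, with made-up coefficients, using Python and numpy):

```python
import numpy as np

V = np.array([3.0, 4.0])           # V = 3 * E1 + 4 * E2
print(np.sqrt(np.dot(V, V)))       # length = SQRT[3^2 + 4^2] = 5.0

W = np.array([1.0, -2.0])
print(np.dot(V, W))                # v1*w1 + v2*w2 = 3 - 8 = -5.0, a real number
```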

So, if you hold W constant, you can regard it as a function on the vector space in which V and W reside, one which takes any V and produces a real number. You can regard V the same way if you hold it constant instead.

Now, with some of the complications which mathematicians love, you can regard the set of functions { W } operating on a vector space as a vector space itself. Functions can be added (by adding their results) and can be multiplied by a real number (a scalar). The set of functions { W } regarded as a vector space is called the dual vector space.

Well, if { W } along with function addition and scalar multiplication is a vector space, it must have a basis. Everything I’ve ever read about tensors involves finite dimensional vector spaces. So assume the vector space A is n dimensional, where n is a positive integer, and call its basis vectors the ordered set a1, . . . , an. The dual vector space (call it B) is also n dimensional, with another basis, the ordered set b1, . . . , bn.

The bi are chosen so that their dot product with the elements of A’s basis gives the Kronecker delta: if i = j then < bi | aj > = 1, and if i ≠ j then < bi | aj > = 0. This can be done by a long and horrible process (back in the day before computer algebra systems) called Gram-Schmidt orthonormalization. Assume it can be done. If you’re a true masochist, have a look at https://en.wikipedia.org/wiki/Gram–Schmidt_process.

Notice what we have here. Any particular element of the dual space B (a real valued function operating on A), call it f, can be written down as f1 * b1 + . . . + fn * bn. It will take any vector in A (written g1 * a1 + . . . + gn * an) and give you f1 * g1 + . . . + fn * gn, which is a real number. Basically, any element (say bj) of the basis of the dual space B just looks at a vector in A and picks out the coefficient of aj (when it forms the dot product with the vector in A).
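To make the coefficient-picking concrete, here is a small sketch (my own; it uses the standard orthonormal basis, where the dual basis vectors look exactly like the originals):

```python
import numpy as np

# With the standard basis a1..an of A, the dual basis b1..bn satisfies
# <bi | aj> = Kronecker delta, and bj picks out the coefficient of aj.
n = 3
a = np.eye(n)                       # rows are the basis vectors a1, a2, a3
b = np.eye(n)                       # for an orthonormal basis the duals coincide
print(np.dot(b[0], a[0]), np.dot(b[0], a[1]))   # 1.0 0.0  (Kronecker delta)

g = 7 * a[0] - 2 * a[1] + 5 * a[2]  # the vector g1*a1 + g2*a2 + g3*a3
print(np.dot(b[2], g))              # 5.0 ; b3 picks out the coefficient of a3
```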

Now (at long last) we can begin to look at the contrary ways tensors are described. The most fruitful way is to view them as products of individual dot products between a vector and a dual vector.

Have a look at https://luysii.wordpress.com/2014/12/08/tensors/. To summarize: the whole point of tensor use in physics is that tensors describe physical quantities which are ‘out there’ independently of the coordinates used to describe them. A hot dog has a certain length independently of its description in inches or centimeters. Change your viewpoint and its coordinates in space will change as well (the hot dog doesn’t care about this). Tensors are a way to accomplish this.

It’s too good to pass up: the length of the hot dog stays the same no matter how many times you (noninvasively) measure it. This is completely different from the situation in quantum mechanics, and is one of the reasons that quantum mechanics has never been unified with general relativity (which is a theory of gravity based on tensors).

Remember that the dot product concerns < dual vector V | vector W >. If you change the basis of vector W (so that W has different coordinates), the basis of dual vector V must also change (to keep the dot product the same). A choice must be made as to which of the two concurrent basis changes is fundamental (actually neither is, as both change together).

Mathematics has chosen the basis of vector W as the fundamental one.

When you change the basis of W, the coefficients of W must change in the opposite way (to keep the vector length constant). The coefficients of W are therefore said to change contravariantly. What about the coefficients of V? The basis of V changes oppositely to the basis of W (e.g. contravariantly), so the coefficients of V must change oppositely to that, e.g. the same way the basis of W changes, which is to say covariantly. Confused? Nonetheless, that’s the way they are named.

Vectors and covectors and other mathematical entities, such as differentials, metrics and gradients, are labelled as covariant or contravariant by the way their numerical coefficients change with a change in basis.

So the coefficients of vector W transform contravariantly, and the coefficients of dual vector V transform covariantly. This is true even though the coefficients of V and of W each transform contravariantly (i.e. oppositely) to the way their own basis transforms.

An immense source of confusion.

As mentioned above, one can regard vectors and dual vectors as real valued functions on elements of a vector space. So (adding to the confusion) vectors and dual vectors are both tensors. Vectors are contravariant tensors, and dual vectors are covariant tensors.

Now we form Cartesian products of vectors W (now called V) and covectors (hereafter called V* to keep them straight).

We get something like this: V x V x V x V* x V*, a Cartesian product of 3 contravariant vectors and 2 dual vectors.

To get a real number out of them we form the tensor product V* ® V* ® V* ® V ® V, where each V* operates on the corresponding V in the Cartesian product (and each V on the corresponding V*), producing a real number each time. All the real numbers produced are then multiplied together to give the result.

Why not just call V* ® V* ® V* ® V ® V a product? Well, each V and V* is an n dimensional vector space, and the tensor product V ® V is an n^2 dimensional space (and V* ® V* ® V* ® V ® V is an n^5 dimensional vector space). When we form the product of two numbers (real or complex) we just get another number of exactly the same species (real or complex). The tensor product of two n dimensional vector spaces is not another n dimensional space, hence the need for the adjective modifying the word product. The dot product nomenclature is much the same: the dot product of two vectors is not another vector but a real number.
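A quick numerical sketch of the dimension count (my own example; np.kron produces the coefficients of a tensor product in the product basis):

```python
import numpy as np

# The tensor product of two n dimensional spaces is n^2 dimensional.
n = 3
v = np.arange(1.0, 4.0)       # a vector: [1, 2, 3]
w = np.arange(4.0, 7.0)       # another vector: [4, 5, 6]
print(np.kron(v, w).shape)    # (9,) ; n^2 = 9 coefficients, not n

# while a dual vector applied to a vector collapses to a single real number
f = np.array([1.0, 0.0, -1.0])
print(np.dot(f, v))           # -2.0
```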

Here is yet another source of confusion. What we really have is a tensor product V* ® V* ® V* ® V ® V operating on a Cartesian product of vectors and covectors (tensors themselves) V x V x V x V* x V* to produce a real number.

Tensors can be named by their operands, making this a 3 contravariant, 2 covariant tensor: a (3, 2) tensor.

Other books name them by their operator (e.g. the tensor product), making it a 3 covariant, 2 contravariant tensor: a (2, 3) tensor.

If you don’t get this settled when you switch books, you’ll think you don’t really understand what contravariant and covariant mean (when in fact you do). Mercifully, one constancy in notation is that the contravariant number always comes first (or on top) and the covariant number second (or on the bottom).

Hopefully this is helpful.  I wish I’d had this spelled out when I started.

What is schizophrenia really like?

The recent tragic death of John Nash and his wife warrants reposting the following written 11 October 2009

“I feel that writing to you there I am writing to the source of a ray of light from within a pit of semi-darkness. It is a strange place where you live, where administration is heaped upon administration, and all tremble with fear or abhorrence (in spite of pious phrases) at symptoms of actual non-local thinking. Up the river, slightly better, but still very strange in a certain area with which we are both familiar. And yet, to see this strangeness, the viewer must be strange.”

“I observed the local Romans show a considerable interest in getting into telephone booths and talking on the telephone and one of their favorite words was pronto. So it’s like ping-pong, pinging back again the bell pinged to me.”

Could you paraphrase this? Neither can I, and when, as a neurologist, I had occasion to see schizophrenics, the only way to capture their speech was to transcribe it verbatim. It can’t be paraphrased, because it makes no sense, even though it’s reasonably grammatical.

What is a neurologist doing seeing schizophrenics? That’s for shrinks isn’t it? Sometimes in the early stages, the symptoms suggest something neurological. Epilepsy for example. One lady with funny spells was sent to me with her husband. Family history is important in just about all neurological disorders, particularly epilepsy. I asked if anyone in her family had epilepsy. She thought her nephew might have it. Her husband looked puzzled and asked her why. She said she thought so because they had the same birthday.

It’s time for a little history. The board which certifies neurologists, is called the American Board of Psychiatry and Neurology. This is not an accident as the two fields are joined at the hip. Freud himself started out as a neurologist, wrote papers on cerebral palsy, and studied with a great neurologist of the time, Charcot at la Salpetriere in Paris. 6 months of my 3 year residency were spent in Psychiatry, just as psychiatrists spend time learning neurology (and are tested on it when they take their Boards).

Once a month, a psychiatrist friend and I would go to lunch, discussing cases that were neither psychiatric nor neurologic but a mixture of both. We never lacked for new material.

Mental illness is scary as hell. Society deals with it the same way that kids deal with their fears, by romanticizing it, making it somehow more human and less horrible in the process. My kids were always talking about good monsters and bad monsters when they were little. Look at Sesame street. There are some fairly horrible looking characters on it which turn out actually to be pretty nice. Adults have books like “One flew over the Cuckoo’s nest” etc. etc.

The first quote above is from a letter John Nash wrote to Norbert Wiener in 1959. All this, and much much more, can be found in “A Beautiful Mind” by Sylvia Nasar. It is absolutely the best description of schizophrenia I’ve ever come across. No, I haven’t seen the movie, but there’s no way it can be more accurate than the book.

Unfortunately, the book is about a mathematician, which immediately turns off 95% of the populace. But that is exactly its strength. Nash became ill much later than most schizophrenics — around 30 when he had already done great work. So people saved what he wrote, and could describe what went on decades later. Even better, the mathematicians had no theoretical axe to grind (Freudian or otherwise). So there’s no ego, id, superego or penis envy in the book, just page after page of description from well over 100 people interviewed for the book, who just talked about what they saw. The description of Nash at his sickest covers 120 pages or so in the middle of the book. It’s extremely depressing reading, but you’ll never find a better description of what schizophrenia is actually like — e.g. (p. 242) She recalled that “he kept shifting from station to station. We thought he was just being pesky. But he thought that they were broadcasting messages to him. The things he did were mad, but we didn’t really know it.”

Because of his previous mathematical achievements, people saved what he wrote: the second quote above is from a letter written in 1971 and kept by the recipient for decades; the first quote is from a letter written 12 years before that.

There are a few heartening aspects of the book. His wife Alicia is a true saint, and stood by him and tried to help as best she could. The mathematicians also come off very well, in their attempts to shelter him and to get him treatment (they even took up a collection for this at one point).

I was also very pleased to see rather sympathetic portraits of the docs who took care of him. No 20/20 hindsight is to be found. They are described as doing the best for him that they could given the limited knowledge (and therapies) of the time. This is the way medicine has been and always will be practiced — we never really know enough about the diseases we’re treating, and the therapies are almost never optimal. We just try to do our best with what we know and what we have.

I actually ran into Nash shortly after the book came out. The Princeton University Store had a fabulous collection of math books back then — several hundred at least, most of them over $50, so it was a great place to browse, which I did whenever I was in the area. Afterwards, I stopped in a coffee shop in Nassau Square and there he was, carrying a large disheveled bunch of papers with what appeared to be scribbling on them. I couldn’t bring myself to speak to him. He had the eyes of a hunted animal.

Read Einstein

Devoted readers of this blog (assuming there are any) know that I’ve been studying relativity for some time — for why see https://luysii.wordpress.com/2011/12/31/some-new-years-resolutions/.

Probably some of you have looked at writings about relativity, and have seen equations containing terms like ( 1 – v^2/c^2)^1/2. You need a lot of math for general relativity (which is about gravity), but to my surprise not so much for special relativity.

Back in the early 50’s we were told not to study Calculus before reaching 18, as it was simply too hard for the young brain and would harm it, the way lifting something too heavy could bring on a hernia. That all changed after Sputnik in ’57 (but too late for me).

I had a similar timidity about approaching anything written by Einstein himself. But somehow I began looking at his book “Relativity” to clear up a few questions I had. The Routledge paperback edition (which I got in England) cost me all of 13 pounds. Routledge is a branch of a much larger publisher, Taylor and Francis.

The book is extremely accessible. You need almost no math to read it. No linear algebra, no calculus, no topology, no manifolds, no differential geometry, just high school algebra.

You will see a great mind at work in terms you can understand.

Some background. Galileo had a theory of relativity, which basically said that there was no absolute position, and that motion was only meaningful relative to another object. Not much algebra was available to him, and later Galilean relativity came to be taken to mean that the equations of physics should look the same to people in unaccelerated motion relative to each other.

Newton’s laws worked out quite well this way, but in the late 1800’s Maxwell’s equations for electromagnetism did not. This was recognized as a problem by physicists, so much so that some of them even wondered if the Maxwell equations were correct. In 1895 Lorentz figured out a way (purely by trying different equations out) to transform the Maxwell equations so they looked the same to two observers in relative motion to each other. It was a classic kludge (before there even were kludges).

The equation to transform the x coordinate of observer 1 to the x’ of observer 2 looks like this

x’ = (x – v*t) / (1 – v^2/c^2)^(1/2)

t = time, v = the constant velocity of the two observers relative to each other, c = velocity of light

Gruesome, no?

All Lorentz knew was that it made Maxwell’s equations transform properly from x to x’.
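If you want to play with the equation, here is a minimal sketch in Python (my own code, with units chosen so that c = 1):

```python
import math

# Lorentz transformation x' = (x - v*t) / (1 - v^2/c^2)^(1/2)
def x_prime(x, t, v, c=1.0):
    return (x - v * t) / math.sqrt(1.0 - (v * v) / (c * c))

print(x_prime(x=1.0, t=0.0, v=0.5))    # 1.1547... at half the speed of light
print(x_prime(x=1.0, t=0.0, v=0.001))  # 1.0000005 ; for v << c, nearly the
                                       # Galilean answer x - v*t
```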

What you will see on pp. 117 – 123 of the book is Einstein deriving the Lorentz equation from
1. the constancy of the velocity of light for both observers, regardless of whether they are moving relative to each other
2. the fact that, as judged from observer 1, the length of a rod at rest relative to observer 2 is the same as the length of that rod at rest relative to observer 1 as judged from observer 2. Tricky to state, but this just means that the rod is out there and has a length independent of who is measuring it.

To follow his derivation you need only high school algebra. That’s right — no linear algebra, no calculus, no topology, no manifolds, no differential geometry. Honest to God.

It’s a good idea to have figure 2 from p. 34 in front of you

The derivation isn’t particularly easy to follow, but the steps are quite clear, and you will have the experience of Einstein explaining relativity to you in terms you can understand. Like reading the Origin of Species, it’s fascinating to see a great mind at work.

Enjoy

Why we imperfectly understand randomness the way we do.

The cognoscenti think the average individual is pretty dumb when it comes to probability and randomness. Not so, says a fascinating recent paper [ Proc. Natl. Acad. Sci. vol. 112 pp. 3788 – 3792 ’15 ] http://www.pnas.org/content/112/12/3788.abstract. The average joe (this may mean you), when asked to write down a random series of fifty or so heads and tails, never puts in enough runs of heads or runs of tails. This leads to the gambler’s fallacy: the belief that if an honest coin gives a run of say 5 heads, the next result is more likely to be tails.

There is a surprising amount of structure lurking within purely random sequences, such as the tosses of a fair coin where the probability of heads is exactly 50%. Even in such a series, the waiting time for two heads (HH) or two tails (TT) to appear is significantly longer than for an alternation (HT or TH). On average 6 tosses are required for HH or TT to appear, while only 4 on average are needed for HT or TH.
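A short simulation (my own sketch, not from the paper) reproduces the 6 versus 4 waiting times:

```python
import random

# Estimate the mean number of tosses until a two-toss pattern first appears.
def mean_wait(pattern, trials=200_000):
    total = 0
    for _ in range(trials):
        prev, tosses = random.choice("HT"), 1
        while True:
            cur = random.choice("HT")
            tosses += 1
            if prev + cur == pattern:
                break
            prev = cur
        total += tosses
    return total / trials

print(mean_wait("HH"))   # about 6.0
print(mean_wait("HT"))   # about 4.0
```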

This is why Joe SixPack never puts in enough runs of Hs or Ts.

Why should the wait be longer for HH or TT when, at each toss, H and T are equally likely? The mean time between successive occurrences of a pattern (the recurrence time) is the same for HH and TT as for HT and TH; it is the waiting time for the first occurrence that differs. The variance is also different, because occurrences of HH and TT are bunched in time (repetitions overlap), while those of HT and TH are spread evenly.

It gets worse for longer repetitions, which can build on each other. HHH contains two instances of HH, while alternations do not overlap this way. Repetitions bunch together, as noted earlier. We are very good at perceiving waiting times, and this is probably why we think repetitions are less likely and soon to break up.

The paper goes a lot farther, constructing a neural model based on the way our brains integrate information over time when processing sequences of events. It takes into consideration our perceptions of mean time AND waiting time; we average the two. This produces the best fitting bias gain parameter for an existing Bayesian model of randomness.

See, you’re not as dumb as they thought you were.

Another reason for our behavior comes from neuropsychology and physiological psychology. We have ways to watch the electrical activity of your brain and find out when you perceive something as different. It’s called mismatch negativity (see http://en.wikipedia.org/wiki/Mismatch_negativity for more detail). It’s a brain potential peaking 0.1 – 0.25 seconds after a deviant tone or syllable.

Play 5 middle C’s in a row, followed by a D, then C’s again. The potential doesn’t occur after any of the C’s, just after the D. This has been applied to the study of infant perception long before they can speak.

It has shown us that Asian and Western newborn infants both hear ‘r’ and ‘l’ quite well (showing mismatch negativity to a sudden ‘r’ or ‘l’ in a sequence of other sounds). If the Asian infant never hears people speaking words with r and l in them for 6 months, it loses mismatch negativity to them (and clinical perception of them). So our brains are literally ‘tuned’ to understand the language we hear.

So we are more likely to notice the T after a run of H’s, or an H after a run of T’s. We are also likely to notice just how long it has been since it last occurred.

This is part of a more general phenomenon — the ability of our brains to pick up and focus on changes in stimuli. Exactly the same phenomenon explains why we see edges of objects so well — at least here we have a solid physiologic explanation — surround inhibition (for details see — http://en.wikipedia.org/wiki/Lateral_inhibition). It happens in the complicated circuitry of the retina, before the brain is involved.

Philosophers should note that this destroys the concept of the pure (e.g. uninterpreted) sensory percept — information is being processed within our eyes before it ever gets to the brain.

Update 31 Mar — I wrote the following to the lead author

” Dr. Sun:

Fascinating paper. I greatly enjoyed it.

You might be interested in a post from my blog (particularly the last few paragraphs). I didn’t read your paper carefully enough to see if you mention mismatch negativity, P300 and surround inhibition. If not, you should find this quite interesting.

Luysii

And received the following back in an hour or two

“Hi, Luysii- Thanks for your interest in our paper. I read your post, and find it very interesting, and your interpretation of our findings is very accurate. I completely agree with you making connections to the phenomenon of change detection and surround inhibition. We did not spell it out in the paper, but in the supplementary material, you may find some relevant references. For example, the inhibitory competition between HH and HT detectors is a key factor for the unsupervised pattern association we found in the neural model.

Yanlong”

Nice ! ! !

How formal tensor mathematics and the postulates of quantum mechanics give rise to entanglement

Tensors continue to amaze. I never thought I’d get a simple mathematical explanation of entanglement, but here it is. Explanation is probably too strong a word, because it relies on the postulates of quantum mechanics, which are extremely simple but which lead to extremely bizarre consequences (such as entanglement). As Feynman famously said, ‘no one understands quantum mechanics’. Despite that, it’s never made a prediction not confirmed by experiment, so the theory is correct even if we don’t understand ‘how it can be like that’. 100 years of correct predictions are not to be sneezed at.

If you’re a bit foggy on just what entanglement is — have a look at https://luysii.wordpress.com/2010/12/13/bells-inequality-entanglement-and-the-demise-of-local-reality-i/. Even better; read the book by Zeilinger referred to in the link (if you have the time).

Actually you don’t even need all the postulates of quantum mechanics (as given in the book “Quantum Computation and Quantum Information” by Nielsen and Chuang). No differential equations. No Schrodinger equation. No operators. No eigenvalues. What could be nicer for those thirsting for knowledge? Such a deal ! ! ! Just 2 postulates and a little formal mathematics.

Postulate #1: “Associated to any isolated physical system is a complex vector space with inner product (that is, a Hilbert space) known as the state space of the system. The system is completely described by its state vector, which is a unit vector in the system’s state space.” If this is unsatisfying, see the explication on p. 80 of Nielsen and Chuang (where the postulate appears).

Because the linear algebra underlying quantum mechanics seemed to be largely ignored in the course I audited, I wrote a series of posts called Linear Algebra Survival Guide for Quantum Mechanics. The first should be all you need. https://luysii.wordpress.com/2010/01/04/linear-algebra-survival-guide-for-quantum-mechanics-i/ but there are several more.

Even though I wrote a post on tensors, showing how they are a way of describing an object independently of the coordinates used to describe it, I didn’t discuss another aspect of tensors, multilinearity, which is crucial here. The post itself can be viewed at https://luysii.wordpress.com/2014/12/08/tensors/

Start by thinking of a simple tensor as a vector in a vector space. The tensor product is just a way of combining vectors in vector spaces to get another (and larger) vector space. So the tensor product isn’t a product in the sense that multiplication of two objects (real numbers, complex numbers, square matrices) produces another object of the exactly same kind.

So mathematicians use a special symbol for the tensor product — a circle with an x inside. I’m going to use something similar ‘®’ because I can’t figure out how to produce the actual symbol. So let V and W be the quantum mechanical state spaces of two systems.

Their tensor product is just V ® W. Mathematicians can define things any way they want. A crucial aspect of the tensor product is that it is multilinear. So if v and v’ are elements of V, then v + v’ is also an element of V (because two vectors in a given vector space can always be added). Similarly, w + w’ is an element of W if w and w’ are. Adding to the confusion when trying to learn this stuff is the fact that all vectors are themselves tensors.

Multilinearity of the tensor product is what you’d think

(v + v’) ® (w + w’) = v ® (w + w’ ) + v’ ® (w + w’)

= v ® w + v ® w’ + v’ ® w + v’ ® w’

You get all 4 tensor products in this case.
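A quick numerical check of that expansion (my own sketch; np.kron stands in for ® on the coefficient vectors):

```python
import numpy as np

# Multilinearity: (v + v') (x) (w + w') expands into all 4 tensor products.
v, vp = np.array([1.0, 2.0]), np.array([3.0, -1.0])
w, wp = np.array([0.5, 1.5, -2.0]), np.array([1.0, 0.0, 4.0])

lhs = np.kron(v + vp, w + wp)
rhs = np.kron(v, w) + np.kron(v, wp) + np.kron(vp, w) + np.kron(vp, wp)
print(np.allclose(lhs, rhs))   # True
```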

This brings us to Postulate #2 (actually #4 in the book, on p. 94; we don’t need the other two. I told you this was fairly simple).

Postulate #2 “The state space of a composite physical system is the tensor product of the state spaces of the component physical systems.”

http://planetmath.org/simpletensor

Where does entanglement come in? Patience, we’re nearly done. One must now distinguish simple from non-simple tensors. Each of the 4 tensor products in the sum on the last line is simple, being the tensor product of two vectors.

What about v ® w’ + v’ ® w? It isn’t simple, because there is no way to write it as simple_tensor1 ® simple_tensor2, so it’s called a compound tensor. By contrast, (v + v’) ® (w + w’) is a simple tensor, because v + v’ is just another single element of V (call it v”) and w + w’ is just another single element of W (call it w”).

So the simple tensor (v + v’) ® (w + w’) can be understood as though the two subsystems are in definite states: V has state v” and W has state w”.

v ® w’ + v’ ® w can’t be understood this way. The full system can’t be understood by considering V and W in isolation, e.g. the two subsystems V and W are ENTANGLED.
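One concrete way to test simplicity (a sketch of my own, not from Nielsen and Chuang): write a tensor in V ® W as its matrix of coefficients relative to the product basis; the tensor is simple exactly when that matrix has rank 1.

```python
import numpy as np

# np.outer(a, b) is the coefficient matrix of the simple tensor a (x) b.
v, vp = np.array([1.0, 0.0]), np.array([0.0, 1.0])
w, wp = np.array([1.0, 0.0]), np.array([0.0, 1.0])

simple = np.outer(v + vp, w + wp)               # (v + v') (x) (w + w')
compound = np.outer(v, wp) + np.outer(vp, w)    # v (x) w' + v' (x) w

print(np.linalg.matrix_rank(simple))     # 1 : factors; V and W separable
print(np.linalg.matrix_rank(compound))   # 2 : cannot factor, i.e. ENTANGLED
```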

Yup, that’s all there is to entanglement (mathematically at least). The paradoxes entanglement including Einstein’s ‘creepy action at a distance’ are left for you to explore — again Zeilinger’s book is a great source.

But how can it be like that, you ask? Feynman said not to start thinking these thoughts, and if he didn’t know, do you expect a retired neurologist to tell you? Please.

Tensors

Anyone wanting to understand the language of general relativity must eventually tackle tensors. The following is what I wished I’d known about them before I started studying them on my own.

First, mathematicians and physicists describe tensors so differently, that it’s hard to even see that they’re talking about the same thing (one math book of mine says exactly that). Also mathematicians basically dump on the physicists’ way of doing tensors.

My first experience with tensors was years ago when auditing a graduate abstract algebra course. The instructor prefaced his first lecture by saying that tensors were the hardest thing in mathematics. Unfortunately right at that time my father became ill and I had to leave the area.

I’ll write a bit more about the mathematical approach at the end.

The physicist’s way of looking at tensors is actually a philosophical position. It basically says that there is something out there; two people viewing that something from different perspectives are seeing the same thing, and how they numerically describe it, while important, is irrelevant to the thing itself (ding an sich if you want to get fancy). What a tensor tries to capture is how one view of the object can be transformed into another without losing the object in the process.

This is a bit more subtle than using different measuring scales (Fahrenheit vs. centigrade). That salt shaker sitting there looks a bit different to everyone present at the table. Relative to themselves, they’d all use different numbers to describe its location, height and width. Depending on distance it would subtend different visual angles. But it’s out there; it has but one height, and no one around the table would disagree.

You’re tall and see it from above, while your child sees it at eye level. You measure the distances from your eye to its top and to its bottom, subtract them, and get the height. So does your child. You get the same number.

The two of you have actually used two distinct vectors in two different coordinate systems. To transform your view into that of your child’s you have to transform your coordinate system (whose origin is your eye) to the child’s. The distance numbers to the shaker from the eye are the coordinates of the shaker in each system.

So the position of the bottom of the shaker (e.g. the vector describing it) actually has two parts:
1. The coordinate system of the viewer
2. The distances measured by each viewer (the components, or coefficients, of the vector).

To shift from your view of the salt shaker to that of your child’s you must change both the coordinate system and the distances measured in each. This is what tensors are all about. So the vector from the top to the bottom of the salt shaker is what you want to keep constant. To do this the coordinate system and the components must change in opposite ways. This is where the terms covariant and contravariant and all the indices come in.

What is taken as the basic change is that of the coordinate system (the basis vectors if you know what they are). In the case of the vector to the salt shaker the components transform the opposite way (as they must to keep the height of the salt shaker the same). That’s why they are called contravariant.

The use of the term contravariant vector is terribly confusing, because every vector has two parts (the coefficients and the basis) which transform oppositely. There are mathematical objects whose components (coefficients) transform the same way as the original basis vectors — these are called covariant (the most familiar is the metric, a bilinear symmetric function which takes two vectors and produces a real number). Remember it’s the way the coefficients of the mathematical object transform which determines whether they are covariant or contravariant. To make things a bit easier to remember, contRavariant coefficients have their indices above the letter (R for roof), while covariant coefficients have their indices below the letter. The basis vectors (when written in) always have the opposite position of their indices.

Another trap: the usual notation for a vector skips the basis vectors entirely, so the most familiar example (x, y, z) or (x^1, x^2, x^3) is really x^1 * e_1 + x^2 * e_2 + x^3 * e_3, where e_1 is (1,0,0), etc.

So the crucial thing about tensors is the way they transform from one coordinate system to another.
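To see contravariance numerically, here is a tiny sketch (my own example): put the basis vectors in the columns of a matrix B; the coefficients of a fixed vector v are then solve(B, v), and stretching the basis shrinks the coefficients.

```python
import numpy as np

v = np.array([2.0, 3.0])          # the object 'out there'; it never changes

B = np.eye(2)                     # columns are the basis vectors e_1, e_2
print(np.linalg.solve(B, v))      # coefficients [2. 3.]

B2 = 2 * B                        # double every basis vector ...
print(np.linalg.solve(B2, v))     # ... and the coefficients halve: [1. 1.5]
                                  # (they transform contravariantly)
```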

There is a far more abstract way to define tensors, as the space through which multilinear maps on products of vector spaces factor (the universal property). I don’t think you need it for relativity (I hope not). If you want to see a very concrete approach to this admittedly abstract business, I recommend “Differential Geometry of Manifolds” by Stephen Lovett, pp. 381 – 383.

An even more abstract definition of tensors (seen in the graduate math course) is to define them on modules, not vector spaces. Modules are just like vector spaces except that their scalars come from a ring, rather than a field like the real or complex numbers. The difference is that, unlike in a field, nonzero elements need not have inverses.

I hope this is helpful to some of you

Maryam Mirzakhani

“The universal scientific language is broken English.” So sayeth Don Voet 50+ years ago when we were graduate students. He should know, as his parents were smart enough to get the hell out of the Netherlands before WWII. I met them and they told me that there was some minor incident there involving Germans who promptly went bananas. They decided that this wasn’t the way a friendly country behaved and got out. Just about everyone two generations back in my family was an immigrant, so I heard a lot of heavily accented (if not broken) English growing up.

Which (at last) brings us to Maryam Mirzakhani, a person probably not familiar to chemists, but a brilliant mathematician who has just won the Fields Medal (the Nobel of mathematics). Born in Teheran and educated through college there, she came to Harvard for her PhD, and has remained here ever since and is presently a full prof. at Stanford.

Why she chose to stay here isn’t clear. The USA has picked up all sorts of brains from the various European upheavals and petty hatreds (see https://luysii.wordpress.com/2013/10/27/hitlers-gifts-and-russias-gift/). Given the present and past state of the middle East, I’ve always wondered if we’d scooped up any of the talent originating there. Of course, all chemists know of E. J. Corey, a Lebanese Christian, but he was born here 86 years ago. Elias Zerhouni former director of the NIH, was born in Algeria. That’s about all I know at this level of brilliance and achievement. I’m sure there are others that I’ve missed. Hopefully more such people are already here but haven’t established themselves as yet. This is possible, given that they come from a region without world class scientific institutions. Hitler singlehandedly destroyed the great German departments of Mathematics and Physics and the USA (and England) picked up the best of them.

Given the way things are going presently, the USA may shortly acquire a lot of Muslim brains from Europe. All it will take is a few random beheadings of Europeans in their home countries by the maniacs of ISIS and their ilk. Look what Europeans did to a people who did not physically threaten them during WWII. Lest you think this sort of behavior was a purely German aberration, try Googling Quisling and Marshal Petain. God knows what they’ll do when they are actually threatened. Remember, less than 20 years ago, the Europeans did nothing as Muslims were being slaughtered by Serbs in Kosovo.

Not to ignore the awful other side of the coin, the religious cleansing of the middle East of Christians by the larger Muslim community. The politically correct here have no love of Christianity. However, the continued passivity of American Christians is surprising. Whatever happened to “Onward Christian Soldiers” which seemed to be sung by all at least once a week in the grade school I attended 60+ years ago.

These are very scary times.

Two math tips

Two of the most important theorems in differential geometry are Gauss’s Theorema Egregium and the inverse function theorem. Basically the Theorema Egregium says that you don’t need to look at a two dimensional surface (say the surface of a walnut) from outside (e.g. from the way it sits in 3 dimensional space) to understand its shape. All the information is contained in the surface itself.

The inverse function theorem (InFT) is used over and over. If you have a continuously differentiable function from Euclidean space U of finite dimension n to Euclidean space V of the same dimension, and its derivative at a point x of U is invertible, then there exists another function, defined near the image of x, to get you back from V to U.

Even better, once you’ve proved the inverse function theorem, the proof of another important theorem (the implicit function theorem, aka the ImFT) is quite simple. The ImFT lets you know, given f(x, y, . . .) –> R (e.g. a real valued function) and an equation f(x, y, . . .) = k, whether you can express one variable (say x) in terms of the others. Sometimes it’s difficult to solve such an equation for x explicitly; consider arctan(e^(x + y^2) * sin(xy) + ln x) = k. What is important to know in such a case is whether it’s even possible.
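Here is a minimal sketch of the testable condition in the two variable case (my own example, using sympy): for f(x, y) = k, if the partial derivative of f with respect to x is nonzero at a point on the level set, then x is locally a function of y there.

```python
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 + y**2                 # the level set f = 1 is the unit circle

fx = sp.diff(f, x)              # the partial derivative the ImFT asks about
print(fx.subs({x: 1, y: 0}))    # 2 : nonzero, so x = x(y) exists near (1, 0)
print(fx.subs({x: 0, y: 1}))    # 0 : no guarantee at the top of the circle
                                # (x = +/- sqrt(1 - y^2) is not a function there)
```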

The proofs of both are tricky. In particular, the proof of the inverse function theorem is an existence proof. You may not be able to write down the function from V to U even though you’ve just proved that it exists. So using the InFT to prove the implicit function theorem is also nonconstructive.

At some point in your mathematical adolescence, you should sit down and follow these proofs. They aren’t easy and they aren’t short.

Here’s where to go. Both can be found in books by James J. Callahan, emeritus professor of mathematics at Smith College in Northampton, Mass. The proof of the InFT is to be found on pages 169 – 174 of his “Advanced Calculus, A Geometric View”, which is geometric, with lots of pictures. What’s good about this proof is that it’s broken down into some 13 steps. Be prepared to meet a lot of functions and variables.

Just the statement of the InFT involves the functions f, f^-1, df, df^-1, the spaces U^n, R^n, and the variables a, q, B

The proof of the InFT involves the functions g, phi, dphi, h, dh, N, most of which are vector valued (N is real valued)

Then there are the geometric objects U^n, R^n, Wa, Wfa, Br, Br/2

Vectors a, x, u, delta x, delta u, delta v, delta w

Real number r

That’s just to get you through step 8 of the 13 step proof, which proves the existence of the inverse function (aka f^-1). The rest involves proving properties of f^-1 such as continuity and differentiability. I must confess that just proving existence of f^-1 was enough for me.

The proof of the implicit function theorem for two variables — e.g. f(x, y) = k takes less than a page (190).

The proof of the Theorema Egregium is to be found in his book “The Geometry of Spacetime”, pp. 258 – 262, in 9 steps. Be prepared for fewer functions, but many more symbols.

As to why I’m doing this please see https://luysii.wordpress.com/2011/12/31/some-new-years-resolutions/

Help wanted

Just about done with special relativity. It is simply marvelous to see how everything follows from the constancy of the speed of light — time moving more slowly for a moving object (relative to an object standing still in its own frame of reference), a moving object shrinking (ditto), the increase in mass which occurs as an object begins to approach the speed of light, and how this leads to the equivalence of mass and energy. Special relativity is even sufficient to show how a gravitational field will bend light — although to really understand this, general relativity is required.

The one fly in the intellectual ointment is the Minkowski metric for the spacetime of special relativity. In all the sources I’ve been able to find, it appears ad hoc, or is defined analogously to the Euclidean metric. I’d love to see an argument for why this metric (time coordinate positive, space coordinates negative) must follow from the constancy of the speed of light. It is clear that the Minkowski metric is preserved under the hyperbolic transformations of spacetime, but likely others are as well. Why this particular metric and not something else?
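For reference, the quantity the Minkowski metric computes is the interval between two events (signs as above). One fact that at least constrains the choice, though it doesn’t fully answer my question: the constancy of c forces s^2 = 0 for any light ray in every inertial frame, so whatever metric one picks must make the light cone observer independent, and the hyperbolic (Lorentz) transformations are precisely the linear maps preserving this form.

```latex
s^2 = (c\,\Delta t)^2 - (\Delta x)^2 - (\Delta y)^2 - (\Delta z)^2
```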

Consider the determinant function of an n by n matrix. It has a god awful mathematical form involving a sum of n! terms. Yet all you need to get the (unique) formula are a few postulates: the determinant of the identity matrix is 1, the determinant is a linear function of each of its rows (or columns), interchanging any two rows reverses the sign of the determinant, etc. This basically determines the (unique) formula for the determinant. I’d really like to see the Minkowski metric come out of something like that.

Can anyone out there shed light on this or give me a link?

A Mathematical Near Death Experience

As I’ve alluded to from time to time, I’m trying to learn relativity — not the popularizations, of which there are many, but the full Monty as it were, with all the math required. I’ve been at it a while as the following New Year’s Resolution of a few years ago will show.

“Why relativity? It’s something I’ve always wanted to understand at a deeper level than the popularizations of it (reading the sacred texts in the original so to speak). I may have enough background in math, to understand how to study it. Topology is something I started looking at years ago as a chief neurology resident, to get my mind off the ghastly cases I was seeing.

I’d forgotten about it, but a fellow ancient alum, mentioned our college president’s speech to us on opening day some 55 years ago. All the high school guys were nervously looking at our neighbors and wondering if we really belonged there. The prez told us that if they accepted us that they were sure we could do the work, and that although there were a few geniuses in the entering class, there were many more people in the class who thought they were.

Which brings me to our class relativist. I knew a lot of the physics majors as an undergrad, but not this guy. The index of the new book on Hawking by Ferguson has multiple entries about his work with Hawking (which is ongoing). Another physicist (now a semi-famous historian) felt validated when the guy asked him for help with a problem. He never tooted his own horn, and seemed quite modest at the 50th reunion. As far as I know, one physics self-proclaimed genius (and class valedictorian) has done little work of any significance. Maybe at the end of the year I’ll be able to read the relativist’s textbook on the subject. Who knows? It’s certainly a personal reason for studying relativity. Maybe at the end of the year I’ll be able to ask him a sensible question.”

Well that year has come and gone, but I’m making progress, going through a book with a mathematical approach to the subject written by a local retired math prof (who shall remain nameless). The only way to learn any math or physics is to do the problems, and he was kind enough to send me the answer sheet to all the problems in his book (which he worked out himself).

I am able to do most of the problems, and usually get the right answer, but his answers are far more elegant than mine. It is fascinating to see the way a professional mathematician thinks about these things.

The process of trying to learn something which everyone says is hard, is actually quite existential for someone now 76. Do I have the mental horsepower to get the stuff? Did I ever? etc. etc.

So when I got to one problem and the prof’s answer, I was really quite upset. My answer appeared fairly straightforward and simple, yet his answer required a long derivation. Even though we both came out with the same thing, I was certain that I’d missed something really basic which required all the work he put in.

One of the joys of reading math these days (at least math books written by someone who is still alive) is that you can correspond with them. Mathematicians are so used to being dumped on by presumably intellectual people, that they’re happy to see some love. Response time is usually under a day. So I wrote him the following

“Along those lines, you do a lot of heavy lifting in your answer to 3a in section 4.3. Why not just say the point you are trying to find in R’s world is the image under M of the point (h.h) in G’s world and apply M to get t and z.”

Now usually any mathematician I EMail about their books gets back quickly — my sardonic wife says that it’s because they don’t have much to do.

For days, I heard nothing. I figured that he was trying to figure out a nice way to tell me to take up watching sports or golf, and that relativity was a mountain my intellect couldn’t climb. True existential gloom set in. Then I got the following back.

“You are absolutely right about the question; what you propose is elegant and incisive. I can’t figure out why I didn’t make the simple direct connection in the text itself, because I went to some pains to structure everything around the map M. But all that was fifteen or more years ago, and I have no notes about my thinking as I was writing.”

A true mathematical (and existential) near death experience.
