Anyone wanting to understand the language of general relativity must eventually tackle tensors. The following is what I wish I’d known about them before I started studying them on my own.
First, mathematicians and physicists describe tensors so differently that it’s hard to even see that they’re talking about the same thing (one math book of mine says exactly that). Mathematicians also basically dump on the physicists’ way of doing tensors.
My first experience with tensors was years ago, when I audited a graduate abstract algebra course. The instructor prefaced his first lecture by saying that tensors were the hardest thing in mathematics. Unfortunately, right at that time my father became ill and I had to leave the area.
I’ll write a bit more about the mathematical approach at the end.
The physicist’s way of looking at tensors is actually a philosophical position. It says that there is something out there, and that two people viewing it from different perspectives are seeing the same thing. How each of them describes it numerically, while important, is irrelevant to the thing itself (Ding an sich if you want to get fancy). What a tensor tries to capture is how one view of the object can be transformed into another without losing the object in the process.
This is a bit more subtle than using different measuring scales (Fahrenheit vs. centigrade). That salt shaker sitting there looks a bit different to everyone present at the table. Relative to themselves they’d all use different numbers to describe its location, height and width. Depending on distance it would subtend different visual angles. But it’s out there, it has but one height, and no one around the table would disagree.
You’re tall and see it from above, while your child sees it at eye level. You measure the distances from your eye to its top and to its bottom, subtract them and get the height. So does your child. You get the same number.
The two of you have actually used two distinct vectors in two different coordinate systems. To transform your view into your child’s you have to transform your coordinate system (whose origin is your eye) to the child’s. The distance numbers to the shaker from the eye are the coordinates of the shaker in each system.
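Here is a small sketch of the salt shaker measurement in Python. All the positions are numbers I made up for illustration (think centimeters in some shared room frame), and I'm assuming the simplest case, where you and your child use parallel axes and differ only in where the origin (the eye) sits. Rotated axes are not handled here.

```python
# Hypothetical positions in a shared room frame (x, y, z), in centimeters.
shaker_top = (50, 30, 85)
shaker_bottom = (50, 30, 75)
adult_eye = (0, 0, 120)
child_eye = (100, 60, 80)


def subtract(p, q):
    """Component-wise difference p - q of two 3-tuples."""
    return tuple(a - b for a, b in zip(p, q))


def as_seen_from(eye, point):
    """Coordinates of `point` in a system whose origin is `eye`."""
    return subtract(point, eye)


# Each viewer gets different numbers for the top and the bottom...
adult_top = as_seen_from(adult_eye, shaker_top)
adult_bottom = as_seen_from(adult_eye, shaker_bottom)
child_top = as_seen_from(child_eye, shaker_top)
child_bottom = as_seen_from(child_eye, shaker_bottom)

# ...but the vector from bottom to top comes out the same for both.
adult_height_vector = subtract(adult_top, adult_bottom)
child_height_vector = subtract(child_top, child_bottom)

print(adult_top, child_top)
print(adult_height_vector, child_height_vector)
```

The individual coordinates disagree, but the difference of the two position vectors does not depend on whose eye is the origin.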
So the position of the bottom of the shaker (i.e. the vector describing it) actually has two parts:
1. The coordinate system of the viewer
2. The distances measured by each (the components or the coefficients of the vector).
To shift from your view of the salt shaker to your child’s you must change both the coordinate system and the distances measured in each. This is what tensors are all about. The vector from the top to the bottom of the salt shaker is what you want to keep constant. To do this the coordinate system and the components must change in opposite ways. This is where the terms covariant and contravariant and all the indices come in.
What is taken as the basic change is that of the coordinate system (the basis vectors if you know what they are). In the case of the vector to the salt shaker the components transform the opposite way (as they must to keep the height of the salt shaker the same). That’s why they are called contravariant.
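To make "transform the opposite way" concrete, here is a minimal two-dimensional sketch. The matrix `A`, the vector (4, 2) and the helper names are my own choices for illustration, not anything standard. The new basis vectors are built from the old ones using `A`, and the components are converted using the inverse of `A`. Exact fractions are used so the comparison is not blurred by rounding.

```python
from fractions import Fraction


def mat_vec(m, v):
    """Multiply a 2x2 matrix by a 2-component list."""
    return [sum(m[i][j] * v[j] for j in range(2)) for i in range(2)]


def inverse_2x2(m):
    """Exact inverse of a 2x2 matrix with nonzero determinant."""
    (a, b), (c, d) = m
    det = Fraction(a * d - b * c)
    return [[d / det, -b / det], [-c / det, a / det]]


def combine(components, basis):
    """The actual vector: sum over i of components[i] * basis[i]."""
    return [sum(components[i] * basis[i][k] for i in range(2)) for k in range(2)]


old_basis = [[1, 0], [0, 1]]

# Hypothetical change of basis: column j of A holds the new basis
# vector number j written in terms of the old basis vectors.
A = [[2, 1],
     [0, 1]]
new_basis = [[A[0][j], A[1][j]] for j in range(2)]  # [[2, 0], [1, 1]]

old_components = [4, 2]

# The basis changed by A, so the components change by the inverse of A.
new_components = mat_vec(inverse_2x2(A), old_components)

print(new_components)
print(combine(old_components, old_basis))
print(combine(new_components, new_basis))
```

The first new basis vector is twice as long as the old one, and the matching component shrinks to compensate. Both calls to `combine` give back the same vector, which is the whole point.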
The use of the term contravariant vector is terribly confusing, because every vector has two parts (the coefficients and the basis) which transform oppositely. There are mathematical objects whose components (coefficients) transform the same way as the original basis vectors — these are called covariant (the most familiar is the metric, a bilinear symmetric function which takes two vectors and produces a real number). Remember it’s the way the coefficients of the mathematical object transform which determines whether they are covariant or contravariant. To make things a bit easier to remember, contRavariant coefficients have their indices above the letter (R for roof), while covariant coefficients have their indices below the letter. The basis vectors (when written in) always have the opposite position of their indices.
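For the covariant case, here is a sketch using the metric. I'm assuming the ordinary dot product as the metric in the old basis, plus a made-up change-of-basis matrix `A` whose columns are the new basis vectors written in the old basis. The function names are hypothetical. The metric's components pick up one factor of `A` per lower index, which is the same way the basis vectors change.

```python
def transform_metric(g, A):
    """g'_ij = sum over k, l of A[k][i] * A[l][j] * g[k][l]."""
    n = len(g)
    return [[sum(A[k][i] * A[l][j] * g[k][l]
                 for k in range(n) for l in range(n))
             for j in range(n)]
            for i in range(n)]


def apply_metric(g, u, v):
    """The real number the metric assigns to the pair (u, v)."""
    n = len(g)
    return sum(g[i][j] * u[i] * v[j] for i in range(n) for j in range(n))


def mat_vec(m, v):
    """Multiply a square matrix by a list of components."""
    n = len(v)
    return [sum(m[i][j] * v[j] for j in range(n)) for i in range(n)]


g_old = [[1, 0], [0, 1]]   # assumed: ordinary dot product in the old basis
A = [[2, 1],
     [0, 1]]               # hypothetical change of basis

g_new = transform_metric(g_old, A)

# Two vectors, given by their (contravariant) components in each basis.
u_old, u_new = [4, 2], [1, 2]
w_old, w_new = [1, 3], [-1, 3]

# Sanity check: applying A to the new components recovers the old ones,
# i.e. the new components are the inverse of A applied to the old ones.
assert mat_vec(A, u_new) == u_old
assert mat_vec(A, w_new) == w_old

print(g_new)
print(apply_metric(g_old, u_old, w_old), apply_metric(g_new, u_new, w_new))
```

The components of the metric and of the vectors all change, but the number the metric produces from the two vectors does not.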
Another trap: the usual notation for a vector skips the basis vectors entirely, so the most familiar example (x, y, z) or (x^1, x^2, x^3) is really
x^1 * e_1 + x^2 * e_2 + x^3 * e_3, where e_1 is (1,0,0), etc.
So the crucial thing about tensors is the way they transform from one coordinate system to another.
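Written out in symbols, the transformation rules look like this. The matrix A (my own notation, not from any particular book) expresses the new basis vectors in terms of the old ones, repeated indices are summed over, and g stands for the metric.

```latex
\begin{align}
  e'_j    &= A^{i}{}_{j}\, e_i
          && \text{(basis vectors: the reference change)} \\
  v'^{\,j} &= (A^{-1})^{j}{}_{i}\, v^{i}
          && \text{(contravariant components: inverse of } A\text{)} \\
  g'_{ij} &= A^{k}{}_{i}\, A^{l}{}_{j}\, g_{kl}
          && \text{(covariant components: one } A \text{ per lower index)} \\
  v'^{\,j} e'_j
          &= (A^{-1})^{j}{}_{i}\, A^{k}{}_{j}\, v^{i} e_k
           = \delta^{k}{}_{i}\, v^{i} e_k
           = v^{i} e_i
          && \text{(the vector itself is unchanged)}
\end{align}
```

The last line is the bookkeeping behind "opposite ways": the A from the basis and the inverse of A from the components cancel.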
There is a far more abstract way to define tensors: as the object through which multilinear maps on a product of vector spaces factor. I don’t think you need it for relativity (I hope not). If you want to see a very concrete approach to this admittedly abstract business, I recommend “Differential Geometry of Manifolds” by Stephen Lovett pp. 381 – 383.
An even more abstract definition of tensors (seen in the graduate math course) is to define them on modules, not vector spaces. Modules are just like vector spaces, except that their scalars come from a ring rather than a field like the real or the complex numbers. The difference is that in a ring, unlike in a field, the nonzero elements need not have multiplicative inverses.
I hope this is helpful to some of you.
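If the ring versus field distinction is unfamiliar, here is a quick sketch using arithmetic modulo n as the example (my choice of illustration, not the one from the course). The hypothetical helper `units_mod` lists the elements that have a multiplicative inverse.

```python
def units_mod(n):
    """Nonzero elements a of the integers mod n with some b where a*b = 1 (mod n)."""
    return [a for a in range(1, n)
            if any((a * b) % n == 1 for b in range(1, n))]


# Integers mod 7 form a field: every nonzero element has an inverse.
print(units_mod(7))

# Integers mod 6 form a ring but not a field: 2, 3 and 4 have no inverse.
print(units_mod(6))
```

The ordinary integers are another ring that is not a field: there is no integer you can multiply 2 by to get 1.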