Tag Archives: dual vector

Tensors yet again

In the grad school course on abstract algebra I audited a decade or so ago, the instructor began the discussion about tensors by saying they were the hardest thing in mathematics. Unfortunately I had to drop this section of the course due to a family illness. I’ve written about tensors before and their baffling notation and nomenclature. The following is yet another way to look at them which may help with their confusing terminology.

First, this post will assume you have a significant familiarity with linear algebra. I’ve written a series of posts on the subject if you need a brush up — pretty basic — here’s a link to the first post — https://luysii.wordpress.com/2010/01/04/linear-algebra-survival-guide-for-quantum-mechanics-i/
All of them can be found here — https://luysii.wordpress.com/category/linear-algebra-survival-guide-for-quantum-mechanics/.

Here’s another attempt to explain them — which will give you the background on dual vectors you’ll need for this post — https://luysii.wordpress.com/2015/06/15/the-many-ways-the-many-tensor-notations-can-confuse-you/

To the physicist, tensors really represent a philosophical position — e.g. there are shapes and processes external to us which are real, and independent of the way we choose to describe them mathematically. E. g. describing them by locating their various parts and physical extents in some sort of coordinate system. That approach is described here — https://luysii.wordpress.com/2014/12/08/tensors/

Zee in one of his books defines tensors as something that transforms like a tensor (honest to god). Neuenschwander in his book says “What kind of a definition is that supposed to be, that doesn’t tell you what it is that is changing.”

The following approach may help — it’s from an excellent book which I’ve not completely gotten through — “An Introduction to Tensors and Group Theory for Physicists” by Nadir Jeevanjee.

He says that tensors are just functions that take a bunch of vectors and return a number (either real or complex). It’s a good idea to keep the volume tensor (which takes 3 vectors and returns a real number) in mind while reading further. The tensor function just has one other constraint — it must be multilinear (https://en.wikipedia.org/wiki/Multilinear_map). Amazingly, it turns out that this is all you need.
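To make "multilinear" concrete, here is a minimal Python sketch (mine, not Jeevanjee's) that treats the volume tensor as an ordinary function of three vectors and checks linearity one slot at a time. The helper names (volume, add, scale, is_linear_in_slot) and the sample vectors are made up for illustration.

```python
def volume(u, v, w):
    """Signed volume spanned by u, v, w in R^3: (u cross v) dot w."""
    cross = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    return sum(c * wi for c, wi in zip(cross, w))


def add(a, b):
    """Componentwise vector addition."""
    return tuple(x + y for x, y in zip(a, b))


def scale(c, a):
    """Multiply every component of a by the number c."""
    return tuple(c * x for x in a)


def is_linear_in_slot(slot, args, extra, c):
    """Check T(.., c*a + x, ..) == c*T(.., a, ..) + T(.., x, ..) for one slot."""
    combined = list(args)
    combined[slot] = add(scale(c, args[slot]), extra)
    swapped = list(args)
    swapped[slot] = extra
    return volume(*combined) == c * volume(*args) + volume(*swapped)


# Arbitrary integer test vectors (integers keep the comparison exact).
u, v, w = (1, 2, 3), (0, 1, 4), (5, 6, 0)
x = (2, -1, 7)
results = [is_linear_in_slot(slot, (u, v, w), x, 3) for slot in range(3)]
print(volume(u, v, w), results)
```

A check on a handful of vectors is of course not a proof, but it shows what the multilinearity constraint demands: linearity in each argument separately, with the others held fixed.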

Tensors are named by the number of vectors (written V) and dual vectors (written V*) they massage to produce the number. This is fairly weird when you think of it. We don’t name sin (x) by x because this wouldn’t distinguish it from the zillion other real valued functions of a single variable.

So an (r, s) tensor is named by the ordered array of its operands — (V, …V,V*, …,V*) with r V’s first and s V* next in the array. The array tells you what the tensor function must be.

How can Jeevanjee get away with this? Amazingly, multilinearity is all you need. Recall that the great thing about the linearity of any function or operator on a vector space is that ALL you need to know is what the function or operator does to the basis vectors of the space. The effect on ANY vector in the vector space then follows by linearity.

Going back to the volume tensor whose operand is (V, V, V) and the vector space for all 3 V’s (R^3), how many combinations of basis vectors are there for V x V x V ? There are 3 basis vectors for each V, meaning that there are 3^3 = 27 possible triples of basis vectors. You probably remember the formula for the volume enclosed by 3 vectors (call them u, v, w). The 3 components of u are u1, u2 and u3.

The volume tensor calculates volume by ( U crossproduct V ) dot product W.
Writing the calculation out

Volume = u1*v2*w3 – u1*v3*w2 + u2*v3*w1 – u2*v1*w3 + u3*v1*w2 – u3*v2*w1. What about the other 21 combinations of basis vectors? They are all zero, but they are all present in the tensor.
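The 27 components can be listed by brute force. The sketch below (an illustration, with names of my own choosing) feeds every triple of standard basis vectors to the volume function, counts the nonzero results, and then rebuilds the volume of arbitrary vectors from the components alone, which is the "all you need is the basis" point made above.

```python
from itertools import product


def volume(u, v, w):
    """Signed volume spanned by u, v, w in R^3: (u cross v) dot w."""
    cross = (
        u[1] * v[2] - u[2] * v[1],
        u[2] * v[0] - u[0] * v[2],
        u[0] * v[1] - u[1] * v[0],
    )
    return sum(c * wi for c, wi in zip(cross, w))


# Standard basis of R^3 (indices 0, 1, 2 stand for the 1, 2, 3 in the text).
E = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]

# One component per ordered triple of basis vectors: 3^3 = 27 of them.
components = {
    (i, j, k): volume(E[i], E[j], E[k])
    for i, j, k in product(range(3), repeat=3)
}
nonzero = {idx: val for idx, val in components.items() if val != 0}


def volume_from_components(u, v, w):
    """Rebuild the tensor's value from its components using multilinearity."""
    return sum(
        components[i, j, k] * u[i] * v[j] * w[k]
        for i, j, k in product(range(3), repeat=3)
    )


print(len(components), len(nonzero))
print(sorted(nonzero.items()))
print(volume_from_components((1, 2, 3), (0, 1, 4), (5, 6, 0)))
```

The six nonzero entries are the +1 and -1 coefficients of the six terms in the formula above; the other 21 are the zeros.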

While any tensor manipulating two vectors can be expressed as a square matrix, clearly the volume tensor with 27 components can not be. So don’t confuse tensors with matrices (as I did).

Note that the formula for volume implicitly used the usual standard orthogonal coordinates for R^3. What would it be in spherical coordinates? You’d have to use a change of basis matrix to (r, theta, phi). Actually you’d have to have 3 of them, as basis vectors in V x V x V are 3-place arrays. This gives the horrible subscript and superscript notation of matrices by which tensors are usually defined. So rather than memorizing how tensors transform you can derive things like

T_i’^j’ = (A^k_i’) * (A^j’_l) * T_k^l where _ before a letter means subscript and ^ before a letter means superscript, A^k_i’ and A^j’_l are change of basis matrices (inverses of each other), and the Einstein summation convention is used. Note that the change of basis formula for tensor components for the volume tensor would have 3 such matrices, not two as I’ve shown.
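The derivation takes only a few lines once the tensor is viewed as a multilinear function. As a sketch, assume the new basis vectors are written in terms of the old ones as e_{i'} = A^k_{i'} e_k and the new dual basis vectors as e^{j'} = A^{j'}_l e^l (this index placement is my assumption, chosen to match the formula above). The components of T are just its values on basis vectors, so:

```latex
\begin{aligned}
T_{i'}{}^{j'} &= T(e_{i'},\, e^{j'}) \\
              &= T(A^{k}{}_{i'}\, e_k,\; A^{j'}{}_{l}\, e^{l}) \\
              &= A^{k}{}_{i'}\, A^{j'}{}_{l}\; T(e_k,\, e^{l})
                 \qquad \text{(multilinearity)} \\
              &= A^{k}{}_{i'}\, A^{j'}{}_{l}\; T_k{}^{l}
\end{aligned}
```

For the volume tensor, which eats three vectors and no dual vectors, the same argument gives three factors of the same kind:

```latex
\epsilon_{i'j'k'} = A^{l}{}_{i'}\, A^{m}{}_{j'}\, A^{n}{}_{k'}\; \epsilon_{lmn}
```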

One further point. You can regard a dual vector as a function that takes a vector and returns a number, so a dual vector is a (1,0) tensor. Similarly you can regard vectors as functions that take dual vectors and return a number, so they are (0,1) tensors. So, actually vectors and dual vectors are tensors as well.

The distinction between describing what a tensor does (e.g. its function) and what its operands actually are caused me endless confusion. You write a tensor operating on a dual vector as a (0, 1) tensor, but a dual vector is itself a (1,0) tensor when considered as a function.

None of this discussion applies to the tensor product, which is an entirely different (but similar) story.

Hopefully this helps.

The many ways the many tensor notations can confuse you

This post is for the hardy autodidacts attempting to learn tensors on their own. If you use multiple sources, you’ll find that they define the same terms used to describe tensors in diametrically opposed ways, so that just when you thought you knew what terms like covariant and contravariant tensor meant, another source defines them completely differently, leading you to wonder about (1) your intelligence and (2) your sanity.

Tensors involve vector spaces and their bases. This post assumes you know what they are. If you don’t understand how a vector can be expressed in terms of coordinates relative to a basis, pick up any book on linear algebra.

Tensors can be defined by the way their elements transform under a change of coordinate basis. This is where the terms covariant and contravariant come from. By the way when Einstein says that physical quantities must transform covariantly, he means they transform like tensors do (even contravariant tensors).

True enough, but this approach doesn’t help you understand the term tensor product or the weird ® notation (where there is an x within the circle) used to describe it.

The best way to view tensors (from a notational point of view) is to look on them as functions which take finite Cartesian products (https://en.wikipedia.org/wiki/Cartesian_product) of vectors and covectors and produce a single real number.

To understand what a covector (aka dual vector) is, you must understand the inner product (aka dot product).

The definition of inner product (dot product) of a vector V with itself written < V | V >, probably came from the notion of vector length. Given the standard basis in two dimensional space E1 = (1,0) and E2 = (0,1) all vectors V can be written as x * E1 + y * E2 (x is known as the coefficient of E1). Vector length is given by the good old Pythagorean theorem as SQRT[ x^2 + y^2]. The dot product (inner product) is just x^2 + y^2 without the square root.

In 3 dimensions the distance of a point (x, y, z) from the origin is SQRT [x^2 + y^2 + z^2]. The definition of vector length (or distance) easily extends (by analogy) to n dimensions where the length of V is SQRT[x1^2 + x2^2 + . . . . + xn^2] and the dot product is x1^2 + x2^2 + . . . . + xn^2. Length is always a non-negative real number.

The definition of inner product also extends to the dot product of two different vectors V and W where V = v1 * E1 + v2 * E2 + . . . + vn * En, W = w1 * E1 + . . . + wn * En, e.g. < V | W > = v1 * w1 + v2 * w2 + . . . + vn * wn. Again always a real number, but not always positive as any of the v’s and w’s can be negative.

So, if you hold W constant you can regard it as a function on the vector space in which V and W reside which takes any V and produces a real number. You can regard V the same way if you hold it constant.

Now with some of the complications which mathematicians love, you can regard the set of functions { W } operating on a vector space, as a vector space itself. Functions can be added (by their results) and can be multiplied by a real number (a scalar). The set of functions { W } regarded as a vector space is called the dual vector space.

Well if { W } along with function addition and scalar multiplication is a vector space, it must have a basis. Everything I’ve ever read about tensors involves finite dimensional vector spaces. So assume the vector space A is n dimensional where n is a positive integer, and call its basis vectors the ordered set a1, . . . , an. The dual vector space (call it B) is also n dimensional with another basis the ordered set b1, . . . , bn.

The bi are chosen so that their dot product with elements of A’s basis equals the Kronecker delta, e.g. if i = j then < bi | aj > = 1. If i doesn’t equal j then < bi | aj > = 0. If A’s basis is orthonormal, the bi can simply be the ai themselves, and any basis can be made orthonormal by a long and horrible process (back in the day before computer algebra systems) called Gram Schmidt orthonormalization. For a basis that isn’t orthonormal, the bi come instead from inverting the matrix whose columns are the aj. Assume this can be done. If you’re a true masochist have a look at https://en.wikipedia.org/wiki/Gram–Schmidt_process.

Notice what we have here. Any particular element of the dual space B (a real valued function operating on A), call it f, can be written down as f1 * b1 + . . . + fn * bn. It will take any vector in A (written g1 * a1 + . . . + gn * an) and give you f1 * g1 + . . . + fn * gn which is a real number. Basically any element (say bj) of the basis of dual space B just looks at a vector in A and picks out the coefficient of aj (when it forms the dot product with the vector in A).
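Here is a small two dimensional sketch of the above, under assumptions of my own: the basis a1, a2 is a made-up, deliberately non-orthogonal one, and the dual basis is found by inverting the 2 x 2 matrix whose columns are a1 and a2 (the rows of the inverse are b1 and b2). Exact fractions are used so the Kronecker delta check comes out exactly 0 or 1.

```python
from fractions import Fraction


def dot(x, y):
    """Ordinary dot product of two coordinate tuples."""
    return sum(p * q for p, q in zip(x, y))


def dual_basis_2d(a1, a2):
    """Rows of the inverse of the matrix whose columns are a1 and a2."""
    det = a1[0] * a2[1] - a2[0] * a1[1]
    b1 = (a2[1] / det, -a2[0] / det)
    b2 = (-a1[1] / det, a1[0] / det)
    return b1, b2


# Hypothetical non-orthogonal basis of R^2.
a1 = (Fraction(1), Fraction(0))
a2 = (Fraction(1), Fraction(1))
b1, b2 = dual_basis_2d(a1, a2)

# Kronecker delta table: < bi | aj > for i, j in {1, 2}.
delta = [[dot(b, a) for a in (a1, a2)] for b in (b1, b2)]

# A vector with coefficients g1 = 3, g2 = 5 relative to a1, a2.
g = tuple(3 * p + 5 * q for p, q in zip(a1, a2))
picked = (dot(b1, g), dot(b2, g))  # each bj picks out the coefficient of aj

# A dual vector with coefficients f1 = 2, f2 = 7 relative to b1, b2.
f = tuple(2 * p + 7 * q for p, q in zip(b1, b2))
print(delta, picked, dot(f, g), 2 * 3 + 7 * 5)
```

The last two printed numbers agree: applying f to g gives f1 * g1 + f2 * g2, even though the basis a1, a2 is not orthogonal.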

Now (at long last) we can begin to look at the contrary way tensors are described. The most fruitful way is to look at them as the product of individual dot products between a vector and a dual vector.

Have a look at https://luysii.wordpress.com/2014/12/08/tensors/. To summarize: the whole point of tensor use in physics is that they describe physical quantities which are ‘out there’ independently of the coordinates used to describe them. A hot dog has a certain length independently of its description in inches or centimeters. Change your viewpoint and its coordinates in space will change as well (the hot dog doesn’t care about this). Tensors are a way to accomplish this.

It’s too good to pass up, but the length of the hot dog stays the same no matter how many times you (non invasively) measure it. This is completely different than the situations in quantum mechanics, and is one of the reasons that quantum mechanics has never been unified with general relativity (which is a theory of gravity based on tensors).

Remember the dot product concerns  < dual vector — V | vector — W > . If you change the basis of vector  W (so vector W has different coordinates) the basis of dual vector   V must also change (to keep the dot product the same). A choice must be made as to which of the two concurrent basis changes is fundamental (actually neither is as they both are).

Mathematics has chosen the basis of vector W as fundamental.

When you change the basis of W, the coefficients of W must change in the opposite way (to keep the vector length constant). The coefficients of W are said to change contravariantly. What about the coefficients of V? The basis of V changes oppositely to the basis of W (e.g. contravariantly), so the coefficients of V must change differently from this e.g. the same way the basis of W changes, e.g. covariantly. Confused? Nonetheless, that’s the way they are named.

Vectors and covectors and other mathematical entities such as differentials, metrics and gradients are labelled as covariant or contravariant by the way their numerical coefficients change with a change in basis.

So the coefficients of vector W transform contravariantly, and the coefficients of dual vector V transform covariantly. This is true even though the coefficients of V and W always transform contravariantly (e. g. oppositely) to the way their basis transforms.
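A numerical sketch may help here. Assume (my choice, purely for illustration) a two dimensional space and a change of basis matrix P whose column j lists the coefficients of the new basis vector number j in terms of the old basis. Vector coefficients then transform with the inverse of P (contravariantly), dual vector coefficients with P itself (covariantly), and the number the dual vector produces from the vector does not change.

```python
from fractions import Fraction as F

# Hypothetical change of basis: a1' = 2*a1, a2' = a1 + a2.
P = [[F(2), F(1)],
     [F(0), F(1)]]


def inverse_2x2(m):
    """Inverse of a 2 x 2 matrix given as a list of rows."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def transform_vector_coefficients(g, P):
    """Contravariant: new coefficients = P^-1 applied to the old ones."""
    P_inv = inverse_2x2(P)
    return [sum(P_inv[i][j] * g[j] for j in range(2)) for i in range(2)]


def transform_covector_coefficients(f, P):
    """Covariant: new coefficients mix with P itself, just as the basis does."""
    return [sum(f[i] * P[i][j] for i in range(2)) for j in range(2)]


def pair(f, g):
    """Dual vector applied to vector, in matching basis and dual basis."""
    return sum(fi * gi for fi, gi in zip(f, g))


g_old = [F(3), F(5)]   # coefficients of vector W in the old basis
f_old = [F(2), F(7)]   # coefficients of dual vector V in the old dual basis

g_new = transform_vector_coefficients(g_old, P)
f_new = transform_covector_coefficients(f_old, P)
print(g_new, f_new, pair(f_old, g_old), pair(f_new, g_new))
```

You can check the vector by hand: -1 * a1' + 5 * a2' = -2 * a1 + 5 * a1 + 5 * a2 = 3 * a1 + 5 * a2, the same vector as before.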

An immense source of confusion.

As mentioned above, one can regard vectors and dual vectors as real valued functions on elements of a vector space. So (adding to the confusion) vectors and dual vectors are both tensors. Vectors are contravariant tensors, and dual vectors are covariant tensors.

Now we form Cartesian products of vectors W (now called V) and covectors V (hereafter called V* to keep them straight).

We get something like this V x V x V x V* x V*, a Cartesian product of 3 contravariant vectors and 2 dual vectors.

To get a real number out of them we form the tensor product V* ® V* ® V* ® V ® V, where the first V* operates on the first V to produce a real number, the second operates . . . and the last V* operates on the last V to produce a real number. All real numbers produced are multiplied together to produce the result.
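The recipe above can be written as a short Python sketch. Everything here is an assumption made for illustration: the space is two dimensional, vectors and dual vectors are given by their coefficients in a basis and its dual basis (so a dual vector acting on a vector is just the sum of products of coefficients), and the function name elementary_tensor is my own. It builds only a single product of this kind; a general element of the tensor product space is a sum of such products.

```python
from math import prod


def pair(covector, vector):
    """Dual vector applied to vector, both as coefficient tuples."""
    return sum(c * x for c, x in zip(covector, vector))


def elementary_tensor(covectors, vectors):
    """Build f1 (x) f2 (x) ... (x) x1 (x) x2 ... as a function.

    The result takes len(covectors) vectors followed by len(vectors)
    dual vectors and multiplies the individual pairings together.
    """
    def T(*args):
        if len(args) != len(covectors) + len(vectors):
            raise ValueError("wrong number of arguments")
        vector_args = args[:len(covectors)]
        covector_args = args[len(covectors):]
        left = prod(pair(f, v) for f, v in zip(covectors, vector_args))
        right = prod(pair(h, x) for h, x in zip(covector_args, vectors))
        return left * right
    return T


# Three dual vectors and two vectors making up V* (x) V* (x) V* (x) V (x) V.
T = elementary_tensor(
    covectors=[(1, 0), (0, 1), (1, 1)],
    vectors=[(2, 1), (1, 3)],
)

# An element of V x V x V x V* x V*: three vectors, then two dual vectors.
v1, v2, v3 = (3, 4), (5, 6), (1, 2)
h1, h2 = (1, 2), (0, 1)
value = T(v1, v2, v3, h1, h2)

# Doubling one argument doubles the result (linearity in that slot).
doubled = T((6, 8), v2, v3, h1, h2)
print(value, doubled)
```

The five individual pairings here are 3, 6, 3, 4 and 3, and the tensor's value is their product.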

Why not just call V* ® V* ® V* ® V ® V a product? Well each V and V* is an n dimensional vector space, and the tensor product V ® V is an n^2 dimensional space (and V* ® V* ® V* ® V ® V is an n^5 dimensional vector space). When we form the product of two numbers (real or complex) we just get another number of the same species (real or complex). The tensor product of two n dimensional vector spaces is not another n dimensional space, hence the need for the adjective modifying the name product. The dot product nomenclature is much the same, the dot product of two vectors is not another vector, but a real number.

Here is yet another source of confusion. What we really have is a tensor product V* ® V* ® V* ® V ® V operating on a Cartesian product of vectors and covectors (tensors themselves) V x V x V x V* x V* to produce a real number.

Tensors can either be named by their operands, making this a 3 contravariant 2 covariant tensor, i.e. a (3, 2) tensor.

Other books name them by their operator (e.g. the tensor product), making it a 3 covariant 2 contravariant tensor, i.e. a (2, 3) tensor.

If you don’t get this settled when you switch books you’ll think you don’t really understand what contravariant and covariant mean (when in fact you do). Mercifully, one constant in notation is that the contravariant number always comes first (or on top) and the covariant number second (or on bottom).

Hopefully this is helpful.  I wish I’d had this spelled out when I started.