Tag Archives: Barrett O’Neill

Axiomatize This !

“Analyze This” is a very funny 1999 sendup of the Mafia and psychiatry with Robert De Niro and Billy Crystal.  For some reason the diagram on p. 7 of Barrett O’Neill’s book “Elementary Differential Geometry” (revised 2nd edition, 2006) made me think of it.

O’Neill’s book was highly recommended by the wonderful “Visual Differential Geometry and Forms” by Tristan Needham, as “the single most clear-eyed, elegant and (ironically) modern treatment of the subject available”, with the aside “present company excepted!”

So O’Neill starts by defining a point  as an ordered triple of real numbers.  Then he defines R^3 as a set of such points along with the ability to add them and multiply them by another real number.

O’Neill then defines a tangent vector (written v_p) as a pair of points (p and v) in R^3, where p is the point of application (aka the tail of the tangent vector) and v is its vector part (which puts the tip of the tangent vector at p + v).

All terribly abstract but at least clear and unambiguous until he says: “We shall always picture v_p as the arrow from point p to the point p + v”.

The picture is a huge leap and impossible to axiomatize (hence “Axiomatize This”).   Actually the (mental) picture came first and gave rise to all these definitions and axioms.

The picture is figure 1.1 on p. 7.  It’s a stick figure of a box shaped like an orange crate sitting in a drawing of R^3 with 3 orthogonal axes (none of which is or can be axiomatized).  p sits at one vertex of the box, and p + v at another.  An arrow is drawn from p to p + v (with a barb at p + v) which is then labeled v_p.  Notice also that the point v appears nowhere in the diagram.

What the definitions and axioms are trying to capture is our intuition of what a (tangent) vector really is.

So on p. 7 what are we actually doing?  We’re looking at a plane in visual R^3 with a bunch of ‘straight’ lines on it.  Photons from that plane go to our (nearly) spherical eye, whose retina clearly is not a plane.  My late good friend Peter Dodwell, psychology professor at Queen’s University in Ontario, told me that the retinal image actually preserves angles of the image (i.e. it’s conformal).  A million nerve fibers from each eye go back to our brain (don’t try to axiomatize them).   The information each fiber carries is far more processed than that of a single pixel (retinal photoreceptor) but that’s another story, and perhaps one that could be axiomatized with a lot of work.

100 years ago Wilder Penfield noted that blood flowing through a part of the brain which was active looked red rather than blue (because it contained more oxygen).  That’s the way the brain appears to work.  Any part of the brain doing something gets more blood flow than it needs, so it can’t possibly suck out all the oxygen the blood carries.  Decades of work by zillions of researchers have gone into the mechanisms by which this happens.  We know a lot more, but still not enough.

Today we don’t have to open the skull as Penfield did, but just do a special type of Magnetic Resonance Imaging (MRI) called functional MRI (fMRI) to watch changes in vessel oxygenation (or lack of it) as conscious people perform various tasks.

When we look at that simple stick figure on p. 7, roughly half of our brain lights up on fMRI, to give us the perception that that stick figure really is something in 3 dimensional space (even though it isn’t).  Axiomatizing that would require us to know what consciousness is (which we don’t) and trace it down to the activity of billions of neurons and trillions of synapses between them.

So what O’Neill is trying to do is tie down the magnificent Gulliver which is our perception of space with Lilliputian strands of logic.

You’ve got to admire mathematicians for trying.

What does (∂h/∂x)dx + (∂h/∂y)dy + (∂h/∂z)dz really mean?

We’ve all seen (∂h/∂x)dx + (∂h/∂y)dy + (∂h/∂z)dz many times and have used it to calculate without understanding what the various symbols actually mean.  In my case, it’s just another example of mouthing mathematical incantations without understanding them, something I became very good at at a young age.  See https://luysii.wordpress.com/2022/06/27/the-chinese-room-argument-understanding-math-and-the-imposter-syndrome/ for the gory details.

And now, within a month of my 85th birthday, I finally understand what’s going on, having read only the first 25 pages of “Elementary Differential Geometry” (revised second edition, 2006) by Barrett O’Neill.

I was pointed to it by the marvelous Visual Differential Geometry by Tristan Needham, about which I’ve written 3 posts — this link has references to the other two — https://luysii.wordpress.com/2022/03/07/visual-differential-geometry-and-forms-q-take-3/

He describes O’Neill’s book as follows.  “First published in 1966, this trail-blazing text pioneered the use of Forms at the undergraduate level.  Today more than a half-century later, O’Neill’s work remains, in my view the single most clear-eyed, elegant and (ironically) modern treatment of the subject available — present company excepted! — at the undergraduate level”

It took a lot of work to untangle the notation (typical of all works in Differential Geometry). There is an old joke “differential geometry is the study of properties that are invariant under change of notation” which is funny because it is so close to the truth (John M. Lee)

So armed with no more than calculus 101, knowing what a vector space is,  and a good deal of notational patience, the meaning of (∂h/∂x)dx + (∂h/∂y)dy + (∂h/∂z)dz (including what dx, dy and dz really are) should be within your grasp.

We begin with R^3, the set of triples of real numbers (a_1, a_2, a_3), where _ means that 1, 2, 3 are taken as subscripts.  All 3 components of a triple can be multiplied by a real number c, giving (c*a_1, c*a_2, c*a_3).  Pairs of triples can be added.  This makes R^3 into a vector space (which O’Neill calls Euclidean 3-space), the elements of which are triples (which O’Neill calls points).  Interestingly, O’Neill doesn’t call these triples vectors.  His vectors are pairs of points p = (p_1, p_2, p_3) and v = (v_1, v_2, v_3), and we’ll see why shortly.

A tangent vector to R^3 at the point p is written v_p and is defined as an ordered pair of points (p, v) where

p is the point of application of v_p (aka the tail of v_p)

v is the vector part of v_p (the tip of v_p sits at p + v)

It is visualized as an arrow whose tail is at p and whose tip (barb) is at  p + v (remember you are allowed to add points).  In the visualization of v_p, v does not appear.

The tangent space of R^3 at p is written T_pR^3 and is the set of vectors (p, v) such that p is constant and v varies over all possible points.

Each p in R^3 has its own tangent space, and tangent vectors in different tangent spaces can’t be added.
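For readers who think better in code, here is a minimal Python sketch of these definitions.  The class name TangentVector and its methods tip and scale are made up for illustration and are not O’Neill’s notation.  The one design choice worth noting is that addition refuses to combine vectors with different points of application.

```python
from dataclasses import dataclass
from typing import Tuple

Point = Tuple[float, float, float]  # a point of R^3: an ordered triple of reals


@dataclass(frozen=True)
class TangentVector:
    """A tangent vector v_p, stored as the ordered pair of points (p, v)."""
    p: Point  # point of application (the tail of the arrow)
    v: Point  # vector part

    def tip(self) -> Point:
        # The arrow is pictured as running from p to p + v.
        return tuple(pi + vi for pi, vi in zip(self.p, self.v))

    def __add__(self, other: "TangentVector") -> "TangentVector":
        # Addition only makes sense inside a single tangent space T_pR^3.
        if self.p != other.p:
            raise ValueError("tangent vectors at different points cannot be added")
        return TangentVector(self.p, tuple(a + b for a, b in zip(self.v, other.v)))

    def scale(self, c: float) -> "TangentVector":
        # Scalar multiplication acts on the vector part and leaves p alone.
        return TangentVector(self.p, tuple(c * a for a in self.v))


v_p = TangentVector(p=(1.0, 1.0, 0.0), v=(2.0, 0.0, 3.0))
w_p = TangentVector(p=(1.0, 1.0, 0.0), v=(0.0, 1.0, 1.0))
print(v_p.tip())       # (3.0, 1.0, 3.0)
print((v_p + w_p).v)   # (2.0, 1.0, 4.0)
```

Note that v itself, (2.0, 0.0, 3.0), is never the tip of the arrow unless p happens to be the origin.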

Next up: functions.

A real-valued function on R^3 is written

f :  R^3 –> R^1 (the real numbers)

f : (a_1, a_2, a_3) |—> c (some real number)

This is typical of the way functions are written in more advanced math, with the first line giving the domain (R^3) of the function and the set its values land in (R^1, strictly the codomain rather than the range), and the second line giving what happens to a typical element of the domain on application of the function to it.

O’Neill assumes that all the functions on domain R^3 have continuous partial derivatives of all orders.  Such functions are called smooth, differentiable (in O’Neill’s usage) or C^infinity.  Take your pick, since here they all mean the same thing.

The assumption of differentiability means that you have some mechanism for seeing how close two points are to each other.  He doesn’t say it until later, but this assumes the usual distance given by the Pythagorean theorem, which you’ll know if you’ve taken calc. 101.

For mental visualization it’s better to think of the function as going from R^2 (x and y variables, i.e. the Euclidean plane) to the real numbers.  This is the classic topographic map, which tells how high the ground is at each point.

Now at last we’re getting close to (∂h/∂x)dx + (∂h/∂y)dy + (∂h/∂z)dz.

So now you’re on a ridge ascending to the summit of your favorite mountain.  The height function tells you how high you are where you’re standing (call this point p), but what you really want to know is which way to go to get to the peak.  You want to find a direction in which height is increasing.   Enter the directional derivative (of the height function).  Clearly height drops off on either side of the ridge and increases or decreases along the ridge.   Equally clearly there is no single directional derivative here (as there would be for a function g : R^1 –> R^1).  The directional derivative depends on p (where you are) and v (the direction you choose).  This is why O’Neill defines tangent vectors by two points (p and v).

So the directional derivative requires two functions

the height function h : R^3 –> R^1

the direction function f : t |–> p + t*v where t is in R^1.  This gives a line through p going in direction v

So the directional derivative of  h at p is

d/dt  (h (p + t*v)) | _t = 0  ; differentiate h (p + t*v) with respect to t, then set t equal to zero

Causing me a lot of confusion, O’Neill names the directional derivative v_p[h], which gives you no sense that a derivative of anything is involved.  In the notation used here, his equation is

v_p[h] = d/dt  (h (p + t*v)) | _t = 0

Notice that changing p (say to the peak of the mountain) changes the directional derivatives.  At the peak no direction takes you any higher (for a smooth height function every directional derivative there is zero).  This is why O’Neill defines tangent vectors using two points (p, v).
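To see the dependence on both p and v in numbers, here is a rough Python sketch.  The height function hill is a made-up ridge on R^2 (not anything from O’Neill), and the derivative in t is only approximated, by a central difference with a small step eps.

```python
def directional_derivative(h, p, v, eps=1e-6):
    """Approximate v_p[h] = d/dt h(p + t*v) at t = 0 by a central difference."""
    ahead = h(*[pi + eps * vi for pi, vi in zip(p, v)])
    behind = h(*[pi - eps * vi for pi, vi in zip(p, v)])
    return (ahead - behind) / (2 * eps)


def hill(x, y):
    """Hypothetical height function: a ridge along the x axis, peak at (0, 0)."""
    return 100.0 - x**2 - 10.0 * y**2


p = (-2.0, 0.5)  # a little off the ridge line, west of the peak

along = directional_derivative(hill, p, (1.0, 0.0))    # toward the peak: about 4
across = directional_derivative(hill, p, (0.0, 1.0))   # away from the ridge line: about -10
at_peak = directional_derivative(hill, (0.0, 0.0), (1.0, 0.0))  # about 0

print(along, across, at_peak)
```

Same function, same spot, different v: one direction climbs and the other drops.  Move p to the peak and the derivative vanishes.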

Now a few more functions and the chain rule and we’re about done.

x :    R^3                    –>  R^1

x : (v_1, v_2, v_3 ) |–>  v_1

similarly y : R^3 –> R^1 picks out the y coordinate of (v_1, v_2, v_3), i.e. v_2, and z : R^3 –> R^1 picks out v_3

Let’s look at p + t*v in coordinate form, remembering what p and v look like in coordinates

p + t*v  = ( p_1 + t * v_1, p_2 + t * v_2, p_3 + t * v_3)

Remember that we defined f (t) = p + t*v

so df/dt = d( p + t*v )/dt

expanding

df/dt = d( p_1 + t * v_1, p_2 + t * v_2, p_3 + t * v_3)/dt = (v_1, v_2, v_3)

Let’s be definite about what h : R^3 –> R^1 actually is

h : (x, y, z) |–> x^2 * y^3 * z^4  which has three variables, meaning you must use partial derivatives

so ∂h/∂x = 2 x * y^3 * z^4,  ∂h/∂y = 3 x^2 * y^2 * z^4,  ∂h/∂z = 4 x^2 * y^3 * z^3
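As a sanity check on the partial derivatives of h = x^2 * y^3 * z^4, here is a small Python sketch comparing the formulas from the power rule (∂h/∂x = 2x y^3 z^4, ∂h/∂y = 3x^2 y^2 z^4, ∂h/∂z = 4x^2 y^3 z^3) against a numerical estimate.  The helper partial and the sample point p = (1, 2, 1) are my own choices for illustration.

```python
def h(x, y, z):
    return x**2 * y**3 * z**4


def grad_h(x, y, z):
    """Partials of h, each found by holding the other two variables fixed."""
    return (2 * x * y**3 * z**4,
            3 * x**2 * y**2 * z**4,
            4 * x**2 * y**3 * z**3)


def partial(f, p, i, eps=1e-5):
    """Central-difference estimate of the i-th partial derivative of f at p."""
    up = list(p)
    down = list(p)
    up[i] += eps
    down[i] -= eps
    return (f(*up) - f(*down)) / (2 * eps)


p = (1.0, 2.0, 1.0)
exact = grad_h(*p)                               # (16.0, 12.0, 32.0)
numeric = [partial(h, p, i) for i in range(3)]   # should agree to several decimals
print(exact, numeric)
```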

Look at v_p[h] = d/dt  (h (p + t*v)) | _t = 0 again

It’s really v_p[h] = d/dt (h (f (t))) |_t = 0

so it’s time for the chain rule

d/dt (h (f (t))) = (dh/df ) * (df/dt)

dh/df in coordinates is really

(∂h/∂x, ∂h/∂y,∂h/∂z)

df/dt in coordinates is really

(v_1, v_2, v_3)

The multivariable chain rule multiplies these two triples component by component and adds up the results (a dot product), so the answer is a single number, not a triple

d/dt (h (f (t))) = ∂h/∂x * v_1 +  ∂h/∂y * v_2 +  ∂h/∂z * v_3

I left one thing out.  The |_t = 0

At t = 0, f (t) is just p, so the partial derivatives are all evaluated at p.  Plugging in the numbers, what you get is

v_p[h] = ∂h/∂x * v_1 +  ∂h/∂y * v_2 +  ∂h/∂z * v_3
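Here is a quick numerical check of that formula in Python, a sketch using the same h = x^2 * y^3 * z^4.  The point p = (1, 2, 1) and direction v = (3, 1, 2) are arbitrary choices of mine.  The left side differentiates h(p + t*v) in t directly (approximately, by a central difference), and the right side uses the partials at p times the components of v.

```python
def h(x, y, z):
    return x**2 * y**3 * z**4


def grad_h(x, y, z):
    """(dh/dx, dh/dy, dh/dz) for h = x^2 * y^3 * z^4."""
    return (2 * x * y**3 * z**4,
            3 * x**2 * y**2 * z**4,
            4 * x**2 * y**3 * z**3)


def f(t, p, v):
    """The line t |-> p + t*v through p in direction v."""
    return tuple(pi + t * vi for pi, vi in zip(p, v))


p = (1.0, 2.0, 1.0)
v = (3.0, 1.0, 2.0)

# Left side: d/dt h(f(t)) at t = 0, estimated by a central difference in t.
eps = 1e-5
lhs = (h(*f(eps, p, v)) - h(*f(-eps, p, v))) / (2 * eps)

# Right side: partials evaluated at p, times the components of v, then summed.
rhs = sum(dh_i * v_i for dh_i, v_i in zip(grad_h(*p), v))  # 16*3 + 12*1 + 32*2 = 124

print(lhs, rhs)
```

The two numbers agree up to the error of the finite difference, which is the chain rule doing its job.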

We need one more definition. Recall that the tangent space of R^3 at p is written T_pR^3 and is the set of vectors (p, v) such that p is constant and v varies over all possible points.

The set of all tangent vectors at all points of R^3 (the union of all the tangent spaces) is written TR^3

Finally on p. 24 O’Neill defines what you’ve all been waiting for :  dh

dh : TR^3 –> R^1

dh : v_p  |–>  v_p[h] = ∂h/∂x * v_1 +  ∂h/∂y * v_2 +  ∂h/∂z * v_3

One last bit of manipulation — what is dx (and dy and dz)?

we know that  the function x is defined as follows

x :    R^3                    –>  R^1

x : (v_1, v_2, v_3 ) |–>  v_1

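so dx applied to v_p is v_p[x] = ∂x/∂x * v_1 + ∂x/∂y * v_2 + ∂x/∂z * v_3 = 1 * v_1 + 0 * v_2 + 0 * v_3

which is just  v_1 (similarly dy gives v_2 and dz gives v_3)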

so at (very) long last we have

dh : TR^3 –> R^1

dh : v_p  |–>  v_p[h] = ∂h/∂x * dx(v_p) +  ∂h/∂y * dy(v_p) +  ∂h/∂z * dz(v_p)

which, with the v_p left unwritten, is dh = (∂h/∂x)dx + (∂h/∂y)dy + (∂h/∂z)dz

Remember ∂h/∂x, ∂h/∂y,  ∂h/∂z are all evaluated at p = (p_1, p_2, p_3)

So it’s a (fairly) simple matter to apply dh to any point p in R^3 and any direction  (v_1, v_2, v_3) in R^3 to get the directional derivative
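To make the “dh eats a tangent vector” viewpoint concrete, here is a Python sketch.  The function differential is my own name for the operation f |–> df, implemented with a central difference, so everything is approximate.  It shows dx picking out v_1, the partials of h appearing as dh applied to the unit vectors at p, and the two sides of the formula agreeing.

```python
def differential(f, eps=1e-5):
    """Return df: the map sending a tangent vector (p, v) to v_p[f]."""
    def df(p, v):
        ahead = f(*[pi + eps * vi for pi, vi in zip(p, v)])
        behind = f(*[pi - eps * vi for pi, vi in zip(p, v)])
        return (ahead - behind) / (2 * eps)  # central difference in t at t = 0
    return df


def h(a, b, c):
    return a**2 * b**3 * c**4


def x(a, b, c):
    return a  # first coordinate function


def y(a, b, c):
    return b  # second coordinate function


def z(a, b, c):
    return c  # third coordinate function


dh, dx, dy, dz = (differential(g) for g in (h, x, y, z))

p = (1.0, 2.0, 1.0)
v = (3.0, 1.0, 2.0)

# The partial derivatives of h at p are dh applied to the unit vectors at p.
h_x = dh(p, (1.0, 0.0, 0.0))
h_y = dh(p, (0.0, 1.0, 0.0))
h_z = dh(p, (0.0, 0.0, 1.0))

lhs = dh(p, v)
rhs = h_x * dx(p, v) + h_y * dy(p, v) + h_z * dz(p, v)

print(dx(p, v), dy(p, v), dz(p, v))  # close to 3, 1, 2
print(lhs, rhs)                      # both close to 124
```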

Amen. Selah.

How to study math by yourself far away from an academic center

“Differential geometry is the study of things that are invariant under a change of notation.”   Sad but true, and not original as it appears in the introduction to two different differential geometry books I own.

Which brings me to symbol tables and indexes in math books. If you have a perfect mathematical mind and can read math books straight through understanding everything and never need to look back in the book for a symbol or concept you’re not clear on, then you don’t need them.  I suspect most people aren’t like that.  I’m not.

Even worse is failing to understand something (say the connection matrix) and trying to find another discussion in another book.  If you go to an older book (most of which do not have symbol tables) the notation will likely be completely different and you have to start back at ground zero.  This happened when I tried to find what a connection form was, finding the discussion in one book rather skimpy.  I found it in O’Neill’s book on elementary differential geometry, but the notation was completely different and I had to read page after page to pick up the lingo until I could understand his discussion (which was quite clear).

Connections are important, and they underlie gauge theory and a lot of modern physics.

Good math books aren’t just theorem proof theorem proof, but have discussions about why you’d want to know something etc. etc.  Even better are discussions about why things are the way they are.  Tu’s book on Differential geometry is particularly good on this, showing (after a careful discussion of why the directional derivative is the way it is) how the rather abstract definition of a connection on a manifold arises by formalizing the properties of the directional derivative and using them to define the connection.

Unfortunately, he presents curvature in a very ad hoc fashion, and I’m back to starting at ground zero in another book (older and without a symbol table).

Nonetheless I find it very helpful when taking notes to always start by listing what is given.  Then comes a statement of the theorem, with quantifiers like “for all i in {1, …, n}” put in front.  In particular, if a concept is defined, put how the concept is written in the definition.

e.g.

Given X, Y smooth vector fields

def:  Lie Bracket (written [ X, Y ] ) ::= DxY – DyX

with maybe a link to a page in your notes where Dx is defined
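A note like that can also be checked numerically.  Below is a rough Python sketch, assuming we are in plain Euclidean R^3 where DxY at p is the derivative of Y along the line p + t*X(p) at t = 0.  The two vector fields X and Y are made up for the example, and the derivative is approximated by a central difference.

```python
def covariant_derivative(X, Y, p, eps=1e-5):
    """DxY at p in Euclidean R^3: d/dt Y(p + t*X(p)) at t = 0 (central difference)."""
    xp = X(*p)
    ahead = Y(*[pi + eps * xi for pi, xi in zip(p, xp)])
    behind = Y(*[pi - eps * xi for pi, xi in zip(p, xp)])
    return tuple((a - b) / (2 * eps) for a, b in zip(ahead, behind))


def lie_bracket(X, Y, p):
    """[X, Y] at p, computed as DxY - DyX."""
    dxy = covariant_derivative(X, Y, p)
    dyx = covariant_derivative(Y, X, p)
    return tuple(a - b for a, b in zip(dxy, dyx))


# Hypothetical smooth vector fields on R^3, chosen so the answer is easy by hand.
def X(x, y, z):
    return (y, 0.0, 0.0)


def Y(x, y, z):
    return (0.0, x, 0.0)


p = (1.0, 2.0, 3.0)
bracket = lie_bracket(X, Y, p)
print(bracket)  # by hand: DxY = (0, y, 0), DyX = (x, 0, 0), so roughly (-1, 2, 0) at p
```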

So before buying a math book, look to see how thorough the index is, and whether it has a symbol table.