How to think about two tricky theorems and other matters

I’m continuing to plow through classic differential geometry en route to studying manifolds en route to understanding relativity. The following thoughts might help someone just starting out.

Derivatives of one variable functions are fairly easy to understand. Plot y = f(x) and measure the slope of the curve. That’s the derivative.

So why do you need a matrix to find the derivative of a function of more than one variable? Imagine standing on the side of a mountain. The slope depends on the direction you look, so whatever gives you the slope(s) of a mountain just has to be more complicated than a single number. It must be something that operates on the direction you’re looking (i.e., a vector).

Another point to remember about derivatives is that they basically take something that looks bumpy (like a mountain), look very closely at it under a magnifying glass, and flatten it out (i.e., linearize it). Anything linear comes under the rubric of linear algebra — about which I wrote a lot, because it underlies quantum mechanics; for details see the 9 articles I wrote about it at https://luysii.wordpress.com/category/linear-algebra-survival-guide-for-quantum-mechanics/.

Any linear transformation of a vector (and the direction of your gaze on the side of a mountain is but one example) can be represented by a matrix of numbers. That is why, to find the slope in the direction of a vector, the vector must be multiplied by a matrix (the Jacobian, if you want to get technical).
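A minimal numerical sketch of the idea (the "mountain" h(x, y) = exp(-(x² + y²)) is my own made-up example, not anything from the books discussed here): multiplying the Jacobian — here just the gradient row vector — by a direction vector gives the slope in that direction, which you can check against a direct finite-difference estimate.

```python
import numpy as np

# A made-up "mountain": height as a function of position (for illustration only)
def h(p):
    x, y = p
    return np.exp(-(x**2 + y**2))

def gradient(p):
    # The Jacobian of a scalar function of two variables is a 1 x 2 matrix:
    # the gradient row vector (dh/dx, dh/dy), computed here by hand.
    x, y = p
    g = np.exp(-(x**2 + y**2))
    return np.array([-2 * x * g, -2 * y * g])

p = np.array([0.5, 0.3])                  # where you're standing
direction = np.array([1.0, 1.0])
direction /= np.linalg.norm(direction)    # unit vector: where you're looking

# Slope in that direction = Jacobian (gradient) times the direction vector
slope = gradient(p) @ direction

# Sanity check: walk a tiny step in that direction and measure the rise
eps = 1e-6
fd = (h(p + eps * direction) - h(p)) / eps
print(slope, fd)    # the two agree closely
```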

Now on to the two tricky theorems — the Inverse Function Theorem and the Implicit Function Theorem. I’ve been plowing through a variety of books on differential geometry (Banchoff & Lovett, McInerney, do Carmo, Kreyszig, Thorpe) and they all refer you for proofs of both to an analysis book. They are absolutely crucial to differential geometry, so it’s surprising that none of these books prove them. They all involve linear transformations (because derivatives are linear) from an arbitrary real vector space R^n — whose elements are ordered n-tuples of real numbers — to another real vector space R^m. So they must inherently involve matrices, which quickly gets rather technical.

To keep your eye on the ball let’s go back to y = f(x), where x and y are real numbers. The reals have the lovely property that between any two of them there lies another, and between those two another and another, so there is no smallest real number greater than 0. Suppose at a point x the derivative isn’t zero but some positive number a, to keep it simple (a negative number would work as well); then f is increasing at x. If the derivative is continuous at x (which it usually is), then there is a delta greater than zero such that the derivative is greater than zero on the open interval (x – delta, x + delta). This means that f is strictly increasing over that interval, so it is one-to-one there, and therefore there is a function x = g(y) undoing it. This is called an inverse function.
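A small sketch of this one-variable case (the function f is my own choice for illustration): because f is strictly increasing, its inverse g exists and can be computed by bisection, even though no explicit formula for g can be written down — exactly the situation the theorem describes.

```python
import math

# f(x) = x + exp(x) is strictly increasing (f'(x) = 1 + e^x > 0 everywhere),
# so it has an inverse g, even though g has no closed form.
def f(x):
    return x + math.exp(x)

def g(y, lo=-10.0, hi=10.0, tol=1e-12):
    # Bisection: shrink [lo, hi] until it pins down the x with f(x) = y.
    # Monotonicity is what guarantees this search works.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) < y:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

x = 1.234
y = f(x)
print(g(y))    # recovers x, without ever writing g down explicitly
```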

Now you’re ready for the inverse function theorem — and the conditions are the analogues of the ones above: the derivative (now the Jacobian matrix) should be invertible at the point and continuous there — and an inverse function exists locally. The trickiness and the mountains of notation come from the fact that the function is now from R^n to R^n, with n any positive integer (the dimensions have to match for an inverse to exist; maps from R^n to R^m are where the implicit function theorem comes in).

It’s important to know that, although the inverse and implicit functions are shown logically to exist, almost never can they be written down explicitly. The implicit function theorem follows from the inverse function theorem with even more notation involved, but this is the basic idea behind them.

A few other points on differential geometry. Much of it involves surfaces, and they are defined in 3 ways. The easiest way to understand two of them takes you back to the side of a mountain. Now you’re standing on it halfway up, wondering which would be the best way to get to the top, so you whip out your topographic map, which has lines of constant elevation on it. This brings us to the first way to define a surface. Assume the mountain is given by the function z = f (x, y) — every point (x, y) on the earth has a height above it where the land stops and the sky begins (z) — so the function is a parameterization of the surface. Another way to define a surface in space is by level sets: set z equal to some height — call it z’ — and define the level set as the collection of two-dimensional points (x, y) such that f (x, y) = z’. These are the lines of constant elevation (the contour lines) on the mountain. Differential geometry takes a broad view of surfaces — yes, a contour line of f (x, y) is considered a surface, just as a surface of constant temperature around the sun is a level set of f (x, y, z). The third way to define a surface is implicitly, by f (x1, x2, …, xn) = 0. This is where the implicit function theorem comes in — it tells you when some of the variables are, locally, functions of the others.
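A tiny sketch of that last point on the simplest example I know (my own choice, not from any of the books above): the unit circle x² + y² – 1 = 0. Near any point where ∂f/∂y ≠ 0, y really is a function of x, and you can solve for it numerically even while pretending not to know the formula.

```python
# The unit circle as an implicit surface: f(x, y) = 0
def f(x, y):
    return x**2 + y**2 - 1

def solve_y(x, y_guess, steps=50):
    # Newton's method in y alone. It works precisely because
    # df/dy = 2y != 0 away from the two points (+-1, 0) --
    # the implicit function theorem's nondegeneracy condition.
    y = y_guess
    for _ in range(steps):
        y = y - f(x, y) / (2 * y)
    return y

# Near the point (0.6, 0.8) on the circle, y is implicitly a function of x.
y = solve_y(0.6, y_guess=1.0)
print(y)    # matches sqrt(1 - 0.6**2) = 0.8
```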

Well, I hope this helps when you plunge into the actual details.

For the record — the best derivations of these theorems are in Apostol, Mathematical Analysis, 1957 third printing, pp. 138 – 148. The development is leisurely and quite clear. I bought the book in 1960 for $10.50. The second edition came out in ’74 — you can now buy it for $76.00 from Amazon — proving you should never throw out your old math books.

An old year’s resolution

One of the things I thought I was going to do in 2012 was learn about relativity.   For why see http://luysii.wordpress.com/2012/09/11/why-math-is-hard-for-me-and-organic-chemistry-is-easy/.  Also my cousin’s new husband wrote a paper on a new way of looking at it.  I’ve been putting him off as I thought I should know the old way first.

I knew that general relativity involved lots of math such as manifolds and the curvature of space-time.  So rather than read verbal explanations, I thought I’d learn the math first.  I started reading John M. Lee’s two books on manifolds.  The first involves topological manifolds, the second involves manifolds with extra structure (smoothness) permitting calculus to be done on them.  Distance is not a topological concept, but is absolutely required for calculus — that’s what the smoothness is about.

I started with “Introduction to Topological Manifolds” (2nd. Edition) by John M. Lee.  I’ve got about 34 pages of notes on the first 95 pages (25% of the text), and made a list of the definitions I thought worth writing down — there are 170 of them. Eventually I got through a third of its 380 pages of text.  I thought that might be enough to help me read his second book “Introduction to Smooth Manifolds” but I only got through 100 of its 600 pages before I could see that I really needed to go back and completely go through the first book.

This seemed endless, and would probably take 2 more years.  This shouldn’t be taken as a criticism of Lee — his writing is clear as a bell.  One of the few criticisms of his books is that they are so clear, you think you understand what you are reading when you don’t.

So what to do?  A prof at one of the local colleges, James J. Callahan, wrote a book called “The Geometry of Spacetime” which concerns special and general relativity.  I asked if I could audit the course on it he’d been teaching there for decades.  Unfortunately he said “been there, done that” and had no plans ever to teach the course again.

Well, for the last month or so, I’ve been going through his book.  It’s excellent, with lots of diagrams and pictures, and wide margins for taking notes.  A symbol table would have been helpful, as would answers to the excellent (and fairly difficult) problems.

This also explains why there have been no posts in the past month.

The good news is that the only math you need for special relativity is calculus and linear algebra.  Really nothing more.  No manifolds.  At the end of the first third of the book (about 145 pages) you will have a clear understanding of

1. time dilation — why time slows down for moving objects

2. length contraction — why moving objects shrink

3. why two observers looking at the same event can see it happening at different times.

4. the Michelson-Morley experiment — but the explanation of it in the Feynman Lectures on Physics, sections 15-3 and 15-4, is much better

5. the kludge Lorentz used to make Maxwell’s equations obey the Galilean principle of relativity (that the laws of physics are the same for all observers moving at constant velocity relative to each other)

6. how Einstein derived Lorentz’s kludge purely by assuming the velocity of light is constant for all observers, never mind how they are moving relative to each other. Reading how he did it is like watching a master sculptor at work.
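To make items 1 and 6 concrete, here is a small numerical sketch (my own, not from Callahan) of the Lorentz transformation — nothing but the calculus and linear algebra promised above: a 2 x 2 matrix acting on (t, x), in units where c = 1, which preserves the spacetime interval t² – x².

```python
import math

def lorentz(v):
    # Boost matrix for velocity v (units where c = 1), acting on (t, x)
    gamma = 1 / math.sqrt(1 - v**2)
    return [[gamma, -gamma * v],
            [-gamma * v, gamma]]

def apply(m, event):
    t, x = event
    return (m[0][0] * t + m[0][1] * x,
            m[1][0] * t + m[1][1] * x)

v = 0.6                       # 60% of light speed; gamma = 1.25
L = lorentz(v)

# Time dilation: a clock at rest at x = 0 ticks from t = 0 to t = 1;
# the moving observer assigns that tick a longer duration, gamma * 1.
t_moving, _ = apply(L, (1.0, 0.0))
print(t_moving)               # 1.25

# Invariance of the interval: t^2 - x^2 is the same for both observers.
event = (3.0, 2.0)
t2, x2 = apply(L, event)
print(event[0]**2 - event[1]**2, t2**2 - x2**2)   # both 5.0
```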

Well, I’ll never get through the rest of Callahan by the end of 2012, but I can see doing it in a few more months.  You could conceivably learn linear algebra by reading his book, but it would be tough.  I’ve written some fairly simple background posts on linear algebra for quantum mechanics — you might find them useful: https://luysii.wordpress.com/category/linear-algebra-survival-guide-for-quantum-mechanics/

One of the nicest things was seeing clearly what it means for different matrices to represent the same transformation, and why you should care. I’d seen this many times in linear algebra, but it finally clicked on seeing how simple reflection through an arbitrary line through the origin becomes when you (1) rotate the line onto the x axis (through minus arctan(y/x) radians), (2) flip the y coordinate to –y with an incredibly simple matrix, and (3) rotate everything back to the original angle.

That’s why two n x n matrices X and Y represent the same linear transformation (in different bases) if they are related by an invertible matrix Z in the following way:  X = Z^-1 * Y * Z
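The three-step reflection above can be checked directly (a numerical sketch of my own; the angle is arbitrary): rotate-flip-rotate-back is exactly a similarity transform of the simple flip matrix, and it reproduces the textbook closed form for reflection across a line at angle theta.

```python
import numpy as np

theta = 0.7    # angle the mirror line makes with the x axis (any value works)

def rot(a):
    # Rotation of the plane by angle a, counterclockwise
    return np.array([[np.cos(a), -np.sin(a)],
                     [np.sin(a),  np.cos(a)]])

flip = np.array([[1.0,  0.0],     # the "incredibly simple matrix":
                 [0.0, -1.0]])    # reflection across the x axis

# (1) rotate the line onto the x axis, (2) flip, (3) rotate back.
# Note rot(-theta) = rot(theta)^-1, so this is a similarity transform of flip.
reflect = rot(theta) @ flip @ rot(-theta)

# Closed form for reflection across a line at angle theta through the origin
closed_form = np.array([[np.cos(2 * theta),  np.sin(2 * theta)],
                        [np.sin(2 * theta), -np.cos(2 * theta)]])
print(np.allclose(reflect, closed_form))   # True
```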

Merry Christmas and Happy New Year (none of that Happy Holidays crap for me)
