I’m continuing to plow through classic differential geometry en route to studying manifolds en route to understanding relativity. The following thoughts might help someone just starting out.
Derivatives of one variable functions are fairly easy to understand. Plot y = f(x) and measure the slope of the curve. That’s the derivative.
So why do you need a matrix to find the derivative for more than one variable? Imagine standing on the side of a mountain. The slope depends on the direction you look. So something giving you the slope(s) of a mountain just has to be more complicated. It must be something that operates on the direction you’re looking (e.g. a vector).
Another point to remember about derivatives is that they basically take something that looks bumpy (like a mountain), look very closely at it under a magnifying glass and flatten it out (e.g. linearize it). Anything linear comes under the rubric of linear algebra — about which I wrote a lot, because it underlies quantum mechanics — for details see the 9 articles I wrote about it in — https://luysii.wordpress.com/category/linear-algebra-survival-guide-for-quantum-mechanics/.
Any linear transformation of a vector (of which the direction of your gaze on the side of a mountain is but one) can be represented by a matrix of numbers, which is why to find a slope in the direction of a vector it must be multiplied by a matrix (the Jacobian if you want to get technical).
Now on to the two tricky theorems — the Inverse Function Theorem and the Implicit Function theorem. I’ve been plowing through a variety of books on differential geometry (Banchoff & Lovett, McInenery, DoCarmo, Kreyszig, Thorpe) and they all refer you for proofs of both to an analysis book. They are absolutely crucial to differential geometry, so it’s surprising that none of these books prove them. They all involve linear transformations (because derivatives are linear) from an arbitrary real vector space R^n — elements are ordered n-tuples of real numbers to to another real vector space R^m. So they must inherently involve matrices, which quickly gets rather technical.
To keep your eye on the ball let’s go back to y = f(x). Y and x are real numbers. They have the lovely property that between any two real numbers there lies another, and between those two another and another. So there is no smallest real number greater than 0. If there is a point x at which the derivative isn’t zero but some positive number a to keep it simple (but a negative number would work as well), then y is increasing at x. If the derivative is continuous at a (which it usually is) then there is a delta greater than zero such that the derivative is greater than zero in the open interval (x – delta, x + delta). This means that y = f(x) is always increasing over that interval. This means that there is a one to one function y = g(x) defined over the same interval. This is called an inverse function.
Now you’re ready for the inverse function theorem — but the conditions are the same — the derivative at a point should be greater than zero and continuous at that point — and an inverse function exists. The trickiness and the mountains of notation come from the fact that the function is from R^n to R^m where n and m are any positive integers.
It’s important to know that, although the inverse and implicit functions are shown logically to exist, almost never can they be written down explicitly. The implicit function theorem follows from the inverse function theorem with even more notation involved, but this is the basic idea behind them.
A few other points on differential geometry. Much of it involves surfaces, and they are defined 3 ways. The easiest way to understand two of them takes you back to the side of a mountain. Now you’re standing on it half way up and wondering which would be the best way to get to the top. So you whip out your topographic map which has lines of constant elevation on it. This brings to the first way to define a surface. Assume the mountain is given by the function z = f (x, y) — every point on the earth has a height above it where the land stops and the sky beings (z) — so the function is a parameterization of the surface. Another way to define a surface in space is by level sets: put z equal to some height — call it z’ and define the surface as the set of two dimensional points (x, y) such that f (x, y ) = z’. These are the lines of constant elevation (e.g. the contour lines) – on the mountain. Differential geometry takes a broad view of surfaces — yes a curve on f (x, y) is considered a surface, just as a surface of constant temperature around the sun is a level set on f(x,y,z). The third way to define a surface is by f (x1, x2, …, xn) = 0. This is where the implicit function theorem comes in if some variables are in fact functions of others.
Well, I hope this helps when you plunge into the actual details.
For the record — the best derivation of these theorems are in Apostol Mathematical Analysis 1957 third printing pp. 138 – 148. The development is leisurely and quite clear. I bought the book in 1960 for $10.50. The second edition came out in ’74 — you can now buy it for 76.00 from Amazon — proving you should never throw out your old math books.