Tensors

Anyone wanting to understand the language of general relativity must eventually tackle tensors. The following is what I wished I’d known about them before I started studying them on my own.

First, mathematicians and physicists describe tensors so differently, that it’s hard to even see that they’re talking about the same thing (one math book of mine says exactly that). Also mathematicians basically dump on the physicists’ way of doing tensors.

My first experience with tensors was years ago when auditing a graduate abstract algebra course. The instructor prefaced his first lecture by saying that tensors were the hardest thing in mathematics. Unfortunately right at that time my father became ill and I had to leave the area.

I’ll write a bit more about the mathematical approach at the end.

The physicist’s way of looking at tensors actually is a philosophical position. It basically says that there is something out there, that two people viewing that something from different perspectives are seeing the same thing, and that how they numerically describe it, while important, is irrelevant to the thing itself (Ding an sich if you want to get fancy). What a tensor tries to capture is how one view of the object can be transformed into another without losing the object in the process.

This is a bit more subtle than using different measuring scales (Fahrenheit vs. centigrade). That salt shaker sitting there looks a bit different to everyone present at the table. Relative to themselves they’d all use different numbers to describe its location, height and width. Depending on distance it would subtend different visual angles. But it’s out there and has but one height and no one around the table would disagree.

You’re tall and see it from above, while your child sees it at eye level. You measure the distances from your eye to its top and to its bottom, subtract them and get the height. So does your child. You get the same number.

The two of you have actually used two distinct vectors in two different coordinate systems. To transform your view into that of your child’s you have to transform your coordinate system (whose origin is your eye) to the child’s. The distance numbers to the shaker from the eye are the coordinates of the shaker in each system.

So the position of the bottom of the shaker (e.g. the vector describing it) actually has two parts
1. The coordinate system of the viewer
2. The distances measured by each (the components or the coefficients of the vector).

To shift from your view of the salt shaker to that of your child’s you must change both the coordinate system and the distances measured in each. This is what tensors are all about. So the vector from the top to the bottom of the salt shaker is what you want to keep constant. To do this the coordinate system and the components must change in opposite ways. This is where the terms covariant and contravariant and all the indices come in.

What is taken as the basic change is that of the coordinate system (the basis vectors if you know what they are). In the case of the vector to the salt shaker the components transform the opposite way (as they must to keep the height of the salt shaker the same). That’s why they are called contravariant.

The use of the term contravariant vector is terribly confusing, because every vector has two parts (the coefficients and the basis) which transform oppositely. There are mathematical objects whose components (coefficients) transform the same way as the original basis vectors — these are called covariant (the most familiar is the metric, a bilinear symmetric function which takes two vectors and produces a real number). Remember it’s the way the coefficients of the mathematical object transform which determines whether they are covariant or contravariant. To make things a bit easier to remember, contRavariant coefficients have their indices above the letter (R for roof), while covariant coefficients have their indices below the letter. The basis vectors (when written in) always have the opposite position of their indices.

Another trap: the usual notation for a vector skips the basis vectors entirely, so the most familiar example (x, y, z) or (x^1, x^2, x^3) is really
x^1 * e_1 + x^2 * e_2 + x^3 * e_3, where e_1 is (1,0,0), etc. etc.

So the crucial thing about tensors is the way they transform from one coordinate system to another.
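
To make the "opposite ways" business concrete, here is a minimal Python sketch in two dimensions. The change of basis matrix A, the component values and all the function names are my own illustrative assumptions, not anything taken from a textbook. It changes the basis with A, changes the vector's components with the inverse of A (contravariant), changes the metric's components with A itself (covariant), and checks that the vector and its length come out the same.

```python
# Sketch: a vector stays put while its basis and its components change in opposite ways.
# All names and numbers here are illustrative assumptions.

def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def transpose(P):
    """Swap rows and columns."""
    return [list(col) for col in zip(*P)]

def inverse_2x2(P):
    """Inverse of a 2 x 2 matrix (assumes a nonzero determinant)."""
    (a, b), (c, d) = P
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Columns of E are the old basis vectors e_1 and e_2 (the standard basis).
E = [[1.0, 0.0],
     [0.0, 1.0]]

# A is an arbitrary invertible change of basis: new basis = old basis times A.
A = [[2.0, 1.0],
     [0.0, 3.0]]
E_new = matmul(E, A)

# Contravariant components transform with the INVERSE of A.
v_old = [[4.0], [6.0]]                  # column of components x^1, x^2
v_new = matmul(inverse_2x2(A), v_old)

# The geometric vector (basis times components) is the same in both descriptions.
vector_old = matmul(E, v_old)
vector_new = matmul(E_new, v_new)

# Covariant components (here the metric) transform with A itself: g' = A^T g A.
g_old = [[1.0, 0.0],
         [0.0, 1.0]]                    # Euclidean metric in the old basis
g_new = matmul(matmul(transpose(A), g_old), A)

def squared_length(g, v):
    """v^T g v for a column of components v."""
    return matmul(matmul(transpose(v), g), v)[0][0]

print(vector_old, vector_new)
print(squared_length(g_old, v_old), squared_length(g_new, v_new))
```

The components change from (4, 6) to (1, 2), yet the two printed vectors agree, and so do the two squared lengths. That agreement is the whole point of the index gymnastics.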

There is a far more abstract way to define tensors: the tensor product is the object through which every multilinear map on a product of vector spaces factors. I don’t think you need it for relativity (I hope not). If you want to see a very concrete approach to this admittedly abstract business, I recommend “Differential Geometry of Manifolds” by Stephen Lovett pp. 381 – 383.

An even more abstract definition of tensors (seen in the graduate math course) is to define them on modules, not vector spaces. Modules are just like vector spaces except that their scalars come from a ring, rather than a field like the real or the complex numbers. The difference is that, unlike in a field, the nonzero elements of a ring need not have multiplicative inverses.

I hope this is helpful to some of you

Maryam Mirzakhani

“The universal scientific language is broken English.” So sayeth Don Voet 50+ years ago when we were graduate students. He should know, as his parents were smart enough to get the hell out of the Netherlands before WWII. I met them and they told me that there was some minor incident there involving Germans who promptly went bananas. They decided that this wasn’t the way a friendly country behaved and got out. Just about everyone two generations back in my family was an immigrant, so I heard a lot of heavily accented (if not broken) English growing up.

Which (at last) brings us to Maryam Mirzakhani, a person probably not familiar to chemists, but a brilliant mathematician who has just won the Fields Medal (the Nobel of mathematics). Born in Teheran and educated through college there, she came to Harvard for her PhD, and has remained here ever since and is presently a full prof. at Stanford.

Why she chose to stay here isn’t clear. The USA has picked up all sorts of brains from the various European upheavals and petty hatreds (see http://luysii.wordpress.com/2013/10/27/hitlers-gifts-and-russias-gift/). Given the present and past state of the middle East, I’ve always wondered if we’d scooped up any of the talent originating there. Of course, all chemists know of E. J. Corey, a Lebanese Christian, but he was born here 86 years ago. Elias Zerhouni former director of the NIH, was born in Algeria. That’s about all I know at this level of brilliance and achievement. I’m sure there are others that I’ve missed. Hopefully more such people are already here but haven’t established themselves as yet. This is possible, given that they come from a region without world class scientific institutions. Hitler singlehandedly destroyed the great German departments of Mathematics and Physics and the USA (and England) picked up the best of them.

Given the way things are going presently, the USA may shortly acquire a lot of Muslim brains from Europe. All it will take is a few random beheadings of Europeans in their home countries by the maniacs of ISIS and their ilk. Look what Europeans did to a people who did not physically threaten them during WWII. Lest you think this sort of behavior was a purely German aberration, try Googling Quisling and Marshal Petain. God knows what they’ll do when they are actually threatened. Remember, less than 20 years ago, the Europeans did nothing as Muslims were being slaughtered by Serbs in Kosovo.

Not to ignore the awful other side of the coin, the religious cleansing of the middle East of Christians by the larger Muslim community. The politically correct here have no love of Christianity. However, the continued passivity of American Christians is surprising. Whatever happened to “Onward Christian Soldiers” which seemed to be sung by all at least once a week in the grade school I attended 60+ years ago.

These are very scary times.

Two math tips

Two of the most important theorems in differential geometry are Gauss’s Theorem egregium and the Inverse function theorem. Basically the theorem egregium says that you don’t need to look at the shape of a two dimensional surface (say the surface of a walnut) from outside (e.g. from the way it sits in 3 dimensional space) to understand its shape. All the information is contained in the surface itself.

The inverse function theorem (InFT) is used over and over. If you have a continuously differentiable function from Euclidean space U of finite dimension n to Euclidean space V of the same dimension, and certain properties of its derivative are present at a point x of U, then there exists another function, defined near the image of x, to get you back from space V to U.

Even better, once you’ve proved the inverse function theorem, proof of another important theorem (the implicit function theorem aka the ImFT) is quite simple. The ImFT lets you know, given f(x, y, ...) –> R (e.g. a real valued function), whether you can express one variable (say x) in terms of the others. Again sometimes it’s difficult to solve such an equation for x in terms of y: consider arctan(e^(x + y^2) * sin(xy) + ln x) set equal to a constant. What is important to know in this case is whether it’s even possible.

The proofs of both are tricky. In particular, the proof of the inverse function theorem is an existence proof. You may not be able to write down the function from V to U even though you’ve just proved that it exists. So using the InFT to prove the implicit function theorem is also nonconstructive.

At some point in your mathematical adolescence, you should sit down and follow these proofs. They aren’t easy and they aren’t short.

Here’s where to go. Both can be found in books by James J. Callahan, emeritus professor of Mathematics at Smith College in Northampton Mass. The proof of the InFT is to be found on pages 169 – 174 of his “Advanced Calculus, A Geometric View”, which is geometric, with lots of pictures. What’s good about this proof is that it’s broken down into some 13 steps. Be prepared to meet a lot of functions and variables.

Just the statement of the InFT involves functions f, f^-1, df, df^-1, spaces U^n, R^n, variables a, q, B

The proof of the InFT involves functions g, phi, dphi, h, dh, N, most of which are vector valued (N is real valued)

Then there are the geometric objects U^n, R^n, Wa, Wfa, Br, Br/2

Vectors a, x, u, delta x, delta u, delta v, delta w

Real number r

That’s just to get you through step 8 of the 13 step proof, which proves the existence of the inverse function (aka f^-1). The rest involves proving properties of f^-1 such as continuity and differentiability. I must confess that just proving existence of f^-1 was enough for me.

The proof of the implicit function theorem for two variables — e.g. f(x, y) = k takes less than a page (190).

The proof of the Theorem Egregium is to be found in his book “The Geometry of Spacetime” pp. 258 – 262 in 9 steps. Be prepared for fewer functions, but many more symbols.

As to why I’m doing this please see http://luysii.wordpress.com/2011/12/31/some-new-years-resolutions/

Help wanted

Just about done with special relativity. It is simply marvelous to see how everything follows from the constancy of the speed of light — time moving more slowly for a moving object (relative to an object standing still in its own frame of reference), a moving object shrinking (ditto), the increase in mass which occurs as an object begins to approach the speed of light, and how this leads to the equivalence of mass and energy. Special relativity is even sufficient to show how a gravitational field will bend light — although to really understand this, general relativity is required.

The one fly in the intellectual ointment is the Minkowski metric for the space time of special relativity. In all the sources I’ve been able to find, it appears ad hoc, or is defined analogously to the euclidean metric. I’d love to see an argument why this metric (time coordinates positive, space coordinates negative) must follow from the constancy of the speed of light. It is clear that the Minkowski metric is preserved under the hyperbolic transformation of space-time, but likely others are as well. Why this particular metric and not something else.
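
This doesn't answer the question, but it may help to see the preservation claim in numbers. Below is a minimal Python sketch, assuming one space dimension and units where the speed of light is 1; the function names and the sample event are made up for illustration. It applies a hyperbolic transformation (a boost) to an event and compares the Minkowski quantity t^2 - x^2 with the Euclidean quantity t^2 + x^2 before and after.

```python
import math

def boost(t, x, rapidity):
    """Hyperbolic rotation of the event (t, x), in units where c = 1."""
    ch, sh = math.cosh(rapidity), math.sinh(rapidity)
    return ch * t - sh * x, -sh * t + ch * x

def minkowski(t, x):
    """Time coordinate positive, space coordinate negative."""
    return t * t - x * x

def euclidean(t, x):
    """The ordinary sum of squares, for comparison."""
    return t * t + x * x

# An arbitrary sample event and an arbitrary rapidity (both assumptions).
t, x = 3.0, 1.0
t2, x2 = boost(t, x, rapidity=0.7)

print(minkowski(t, x), minkowski(t2, x2))   # agree up to rounding
print(euclidean(t, x), euclidean(t2, x2))   # do not agree

# A light ray (x = t) is sent to another light ray (x' = t').
lt, lx = boost(1.0, 1.0, rapidity=0.7)
print(lt, lx)
```

The algebra behind the first line of output is just cosh^2 - sinh^2 = 1. Any constant multiple of t^2 - x^2 survives the boost equally well, so the check by itself doesn't single out one metric.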

Consider the determinant function of an n by n matrix. It has a god awful mathematical form involving the sum of n! terms. Yet all you need to get the (unique) formula are a few postulates: the determinant of the identity matrix is 1, the determinant is a linear function of each of its rows (or its columns), interchanging any two rows of the matrix reverses the sign of the determinant, etc. etc. This basically determines the (unique) formula of the determinant. I’d really like to see the Minkowski metric come out of something like that.
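
For the determinant side of the analogy, here is a short Python sketch of the n! term formula (the sum over permutations), followed by spot checks of the postulates on one example. The 3 by 3 matrix M and the helper names are arbitrary choices of mine. Checking one matrix proves nothing, of course; it only shows the formula and the postulates agreeing.

```python
from itertools import permutations

def sign(perm):
    """+1 for an even permutation, -1 for an odd one (found by counting inversions)."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det(M):
    """Sum over all n! permutations of signed products, one entry per row and column."""
    n = len(M)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for row, col in enumerate(perm):
            term *= M[row][col]
        total += term
    return total

identity = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
M = [[2, 0, 1], [1, 3, 2], [1, 1, 4]]          # arbitrary example matrix

swapped = [M[1], M[0], M[2]]                    # interchange the first two rows
scaled = [[5 * a for a in M[0]], M[1], M[2]]    # multiply the first row by 5

print(det(identity))            # postulate: the identity has determinant 1
print(det(M), det(swapped))     # postulate: a row swap flips the sign
print(det(M), det(scaled))      # linearity in a row: scaling a row scales the determinant
```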

Can anyone out there shed light on this or give me a link?

A Mathematical Near Death Experience

As I’ve alluded to from time to time, I’m trying to learn relativity — not the popularizations, of which there are many, but the full Monty as it were, with all the math required. I’ve been at it a while as the following New Year’s Resolution of a few years ago will show.

“Why relativity? It’s something I’ve always wanted to understand at a deeper level than the popularizations of it (reading the sacred texts in the original so to speak). I may have enough background in math, to understand how to study it. Topology is something I started looking at years ago as a chief neurology resident, to get my mind off the ghastly cases I was seeing.

I’d forgotten about it, but a fellow ancient alum, mentioned our college president’s speech to us on opening day some 55 years ago. All the high school guys were nervously looking at our neighbors and wondering if we really belonged there. The prez told us that if they accepted us that they were sure we could do the work, and that although there were a few geniuses in the entering class, there were many more people in the class who thought they were.

Which brings me to our class relativist. I knew a lot of the physics majors as an undergrad, but not this guy. The index of the new book on Hawking by Ferguson has multiple entries about his work with Hawking (which is ongoing). Another physicist (now a semi-famous historian) felt validated when the guy asked him for help with a problem. He never tooted his own horn, and seemed quite modest at the 50th reunion. As far as I know, one physics self-proclaimed genius (and class valedictorian) has done little work of any significance. Maybe at the end of the year I’ll be able to read the relativist’s textbook on the subject. Who knows? It’s certainly a personal reason for studying relativity. Maybe at the end of the year I’ll be able to ask him a sensible question.”

Well that year has come and gone, but I’m making progress, going through a book with a mathematical approach to the subject written by a local retired math prof (who shall remain nameless). The only way to learn any math or physics is to do the problems, and he was kind enough to send me the answer sheet to all the problems in his book (which he worked out himself).

I am able to do most of the problems, and usually get the right answer, but his answers are far more elegant than mine. It is fascinating to see the way a professional mathematician thinks about these things.

The process of trying to learn something which everyone says is hard, is actually quite existential for someone now 76. Do I have the mental horsepower to get the stuff? Did I ever? etc. etc.

So when I got to one problem and the prof’s answer I was really quite upset. My answer appeared fairly straightforward and simple, yet his answer required a long derivation. Even though we both came out with the same thing, I was certain that I’d missed something really basic which required all the work he put in.

One of the joys of reading math these days (at least math books written by someone who is still alive) is that you can correspond with them. Mathematicians are so used to being dumped on by presumably intellectual people, that they’re happy to see some love. Response time is usually under a day. So I wrote him the following

“Along those lines, you do a lot of heavy lifting in your answer to 3a in section 4.3. Why not just say the point you are trying to find in R’s world is the image under M of the point (h.h) in G’s world and apply M to get t and z.”

Now usually any mathematician I EMail about their books gets back quickly — my sardonic wife says that it’s because they don’t have much to do.

For days, I heard nothing. I figured that he was trying to figure out a nice way to tell me to take up watching sports or golf, and that relativity was a mountain my intellect couldn’t climb. True existential gloom set in. Then I got the following back.

“You are absolutely right about the question; what you propose is elegant and incisive. I can’t figure out why I didn’t make the simple direct connection in the text itself, because I went to some pains to structure everything around the map M. But all that was fifteen or more years ago, and I have no notes about my thinking as I was writing.”

A true mathematical (and existential) near death experience.

How to think about two tricky theorems and other matters

I’m continuing to plow through classic differential geometry en route to studying manifolds en route to understanding relativity. The following thoughts might help someone just starting out.

Derivatives of one variable functions are fairly easy to understand. Plot y = f(x) and measure the slope of the curve. That’s the derivative.

So why do you need a matrix to find the derivative for more than one variable? Imagine standing on the side of a mountain. The slope depends on the direction you look. So something giving you the slope(s) of a mountain just has to be more complicated. It must be something that operates on the direction you’re looking (e.g. a vector).

Another point to remember about derivatives is that they basically take something that looks bumpy (like a mountain), look very closely at it under a magnifying glass and flatten it out (e.g. linearize it). Anything linear comes under the rubric of linear algebra — about which I wrote a lot, because it underlies quantum mechanics — for details see the 9 articles I wrote about it in — https://luysii.wordpress.com/category/linear-algebra-survival-guide-for-quantum-mechanics/.

Any linear transformation of a vector (of which the direction of your gaze on the side of a mountain is but one) can be represented by a matrix of numbers, which is why to find a slope in the direction of a vector it must be multiplied by a matrix (the Jacobian if you want to get technical).
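
Here is the mountain in a few lines of Python, as a sketch only. The height function is a made-up mountain and the point and directions are arbitrary assumptions. For a real valued function of two variables the Jacobian is a 1 by 2 matrix, and multiplying it by a direction vector gives the slope in that direction. A crude finite difference is included as a sanity check.

```python
def height(x, y):
    """A made-up mountain, purely for illustration."""
    return 10.0 - x**2 - 2.0 * y**2

def jacobian(x, y):
    """The 1 x 2 matrix of partial derivatives of height, worked out by hand."""
    return [-2.0 * x, -4.0 * y]

def slope(point, direction):
    """Jacobian (a row) times the direction (a column)."""
    J = jacobian(*point)
    return J[0] * direction[0] + J[1] * direction[1]

def numerical_slope(point, direction, h=1e-6):
    """Finite difference approximation: take a tiny step and see how height changes."""
    x, y = point
    dx, dy = direction
    return (height(x + h * dx, y + h * dy) - height(x, y)) / h

here = (1.0, 1.0)
for direction in [(1.0, 0.0), (0.0, 1.0), (0.6, 0.8)]:
    print(direction, slope(here, direction), numerical_slope(here, direction))
```

Same spot on the mountain, three directions of gaze, three different slopes, all produced by the one matrix.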

Now on to the two tricky theorems: the Inverse Function Theorem and the Implicit Function theorem. I’ve been plowing through a variety of books on differential geometry (Banchoff & Lovett, McInerney, do Carmo, Kreyszig, Thorpe) and they all refer you for proofs of both to an analysis book. They are absolutely crucial to differential geometry, so it’s surprising that none of these books prove them. They all involve linear transformations (because derivatives are linear) from an arbitrary real vector space R^n (whose elements are ordered n-tuples of real numbers) to another real vector space R^m. So they must inherently involve matrices, which quickly gets rather technical.

To keep your eye on the ball let’s go back to y = f(x). Both y and x are real numbers. They have the lovely property that between any two real numbers there lies another, and between those two another and another. So there is no smallest real number greater than 0. If there is a point x at which the derivative isn’t zero but some positive number a to keep it simple (but a negative number would work as well), then y is increasing at x. If the derivative is continuous at x (which it usually is) then there is a delta greater than zero such that the derivative is greater than zero in the open interval (x – delta, x + delta). This means that y = f(x) is always increasing over that interval, so f is one to one there. This means that there is a function x = g(y), defined on the image of that interval, which undoes f. This is called an inverse function.

Now you’re ready for the inverse function theorem, and the conditions are essentially the same: the derivative at a point should be nonzero (in higher dimensions, the matrix of the derivative should be invertible) and continuous at that point, and then an inverse function exists near that point. The trickiness and the mountains of notation come from the fact that the function is now from R^n to R^n (and, for the implicit function theorem, from R^n to R^m) where n and m are any positive integers.

It’s important to know that, although the inverse and implicit functions are shown logically to exist, almost never can they be written down explicitly. The implicit function theorem follows from the inverse function theorem with even more notation involved, but this is the basic idea behind them.
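
A one variable Python sketch of "exists but can't be written down": take f(x) = x + e^x, an example of my own choosing. Its derivative 1 + e^x is positive everywhere, so by the argument above an inverse exists, yet there is no obvious formula for it. Bisection finds values of the inverse numerically, relying only on f being increasing. The search bracket and tolerance are arbitrary assumptions.

```python
import math

def f(x):
    """Increasing everywhere, since the derivative 1 + e^x is always positive."""
    return x + math.exp(x)

def inverse_f(y, lo=-50.0, hi=50.0, tol=1e-12):
    """Find x with f(x) = y by bisection; valid for y between f(lo) and f(hi)."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if f(mid) < y:
            lo = mid        # the answer lies to the right of mid
        else:
            hi = mid        # the answer lies at or to the left of mid
    return (lo + hi) / 2.0

x = inverse_f(3.0)
print(x, f(x))                  # f(x) comes back as 3 up to rounding
print(inverse_f(f(0.5)))        # round trip returns roughly 0.5
```

The existence proof tells you the loop has something to converge to; it says nothing about a closed form.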

A few other points on differential geometry. Much of it involves surfaces, and they are defined 3 ways. The easiest way to understand two of them takes you back to the side of a mountain. Now you’re standing on it half way up and wondering which would be the best way to get to the top. So you whip out your topographic map which has lines of constant elevation on it. This brings us to the first way to define a surface. Assume the mountain is given by the function z = f (x, y): every point on the earth has a height above it where the land stops and the sky begins (z), so the function is a parameterization of the surface. Another way to define a surface in space is by level sets: put z equal to some height, call it z’, and define the surface as the set of two dimensional points (x, y) such that f (x, y ) = z’. These are the lines of constant elevation (e.g. the contour lines) on the mountain. Differential geometry takes a broad view of surfaces: yes, a curve on f (x, y) is considered a surface, just as a surface of constant temperature around the sun is a level set on f(x,y,z). The third way to define a surface is by f (x1, x2, …, xn) = 0. This is where the implicit function theorem comes in if some variables are in fact functions of others.

Well, I hope this helps when you plunge into the actual details.

For the record, the best derivation of these theorems is in Apostol Mathematical Analysis 1957 third printing pp. 138 – 148. The development is leisurely and quite clear. I bought the book in 1960 for $10.50. The second edition came out in ’74; you can now buy it for $76.00 from Amazon, proving you should never throw out your old math books.

An old year’s resolution

One of the things I thought I was going to do in 2012 was learn about relativity.   For why see http://luysii.wordpress.com/2012/09/11/why-math-is-hard-for-me-and-organic-chemistry-is-easy/.  Also my cousin’s new husband wrote a paper on a new way of looking at it.  I’ve been putting him off as I thought I should know the old way first.

I knew that general relativity involved lots of math such as manifolds and the curvature of space-time.  So rather than read verbal explanations, I thought I’d learn the math first.  I started reading John M. Lee’s two books on manifolds.  The first involves topological manifolds, the second involves manifolds with extra structure (smoothness) permitting calculus to be done on them.  Distance is not a topological concept, but is absolutely required for calculus — that’s what the smoothness is about.

I started with “Introduction to Topological Manifolds” (2nd. Edition) by John M. Lee.  I’ve got about 34 pages of notes on the first 95 pages (25% of the text), and made a list of the definitions I thought worth writing down — there are 170 of them. Eventually I got through a third of its 380 pages of text.  I thought that might be enough to help me read his second book “Introduction to Smooth Manifolds” but I only got through 100 of its 600 pages before I could see that I really needed to go back and completely go through the first book.

This seemed endless, and would probably take 2 more years.  This shouldn’t be taken as a criticism of Lee — his writing is clear as a bell.  One of the few criticisms of his books is that they are so clear, you think you understand what you are reading when you don’t.

So what to do?  A prof at one of the local colleges, James J. Callahan, wrote a book called “The Geometry of Spacetime” which concerns special and general relativity.  I asked if I could audit the course on it he’d been teaching there for decades.  Unfortunately he said “been there, done that” and had no plans ever to teach the course again.

Well, for the last month or so, I’ve been going through his book.  It’s excellent, with lots of diagrams and pictures, and wide margins for taking notes.  A symbol table would have been helpful, as would answers to the excellent (and fairly difficult) problems.

This also explains why there have been no posts in the past month.

The good news is that the only math you need for special relativity is calculus and linear algebra.  Really nothing more.  No manifolds.  At the end of the first third of the book (about 145 pages) you will have a clear understanding of

1. time dilation — why time slows down for moving objects

2. length contraction — why moving objects shrink

3. why two observers looking at the same event can see it happening at different times.

4. the Michelson Morley experiment — but the explanation of it in the Feynman lectures on physics 15-3, 15-4 is much better

5. The Kludge Lorentz used to make Maxwell’s equations obey the Galilean principle of relativity (e.g. Newton’s first law)

6. How Einstein derived Lorentz’s kludge purely by assuming the velocity of light was constant for all observers, never mind how they were moving relative to each other.  Reading how he did it, is like watching a master sculptor at work.
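
Items 1 and 2 in the list above both come down to a single factor, usually written gamma. Here is a minimal Python sketch using the standard textbook formulas (not quoted from Callahan's book). The speed of 0.6 times the speed of light and the variable names are just illustrative assumptions.

```python
import math

def gamma(v, c=1.0):
    """The Lorentz factor for speed v, with c the speed of light (default units: c = 1)."""
    return 1.0 / math.sqrt(1.0 - (v / c) ** 2)

v = 0.6                     # 60% of the speed of light (an arbitrary choice)
proper_time = 1.0           # one tick as measured on the moving clock
rest_length = 1.0           # length of a rod measured in its own rest frame

dilated_time = gamma(v) * proper_time        # that tick as timed by the stationary observer
contracted_length = rest_length / gamma(v)   # the rod as measured by the stationary observer

print(gamma(v), dilated_time, contracted_length)
```

At this speed gamma is 1.25, so the moving clock's tick takes 1.25 ticks of yours and the moving rod measures 0.8 of its rest length.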

Well, I’ll never get through the rest of Callahan by the end of 2012, but I can see doing it in a few more months.  You could conceivably learn linear algebra by reading his book, but it would be tough.  I’ve written some fairly simplistic background linear algebra for quantum mechanics posts — you might find them useful. https://luysii.wordpress.com/category/linear-algebra-survival-guide-for-quantum-mechanics/

One of the nicest things was seeing clearly what it means for different matrices to represent the same transformation, and why you should care.  I’d seen this many times in linear algebra, but never as clearly as in seeing how simple reflection through an arbitrary line through the origin can be when you (1) rotate the line to the x axis by arctan(y/x) radians (2) change the y coordinate to – y by an incredibly simple matrix (3) rotate it back to the original angle.

That’s why two n x n matrices X and Y represent the same linear transformation (in different bases) if they are related by an invertible matrix Z in the following way  X = Z^-1 * Y * Z
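
The three steps can be sketched in Python as follows. The line through the origin and the point (3, 4) is an arbitrary example, and the helper names are mine. Z rotates the line down to the x axis, Y is the incredibly simple flip, and Z^-1 rotates back, so the reflection is X = Z^-1 * Y * Z. The result is compared with the standard closed form for a reflection through a line at angle theta.

```python
import math

def matmul(P, Q):
    """Multiply two matrices given as lists of rows."""
    return [[sum(P[i][k] * Q[k][j] for k in range(len(Q)))
             for j in range(len(Q[0]))] for i in range(len(P))]

def rotation(theta):
    """Counterclockwise rotation by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def apply(M, p):
    """Apply a 2 x 2 matrix to a point (x, y)."""
    return (M[0][0] * p[0] + M[0][1] * p[1], M[1][0] * p[0] + M[1][1] * p[1])

# The line through the origin and the point (3, 4): an arbitrary example.
x, y = 3.0, 4.0
theta = math.atan2(y, x)        # arctan(y/x), with the quadrant handled for you

Z = rotation(-theta)            # (1) rotate the line onto the x axis
Y = [[1.0, 0.0],
     [0.0, -1.0]]               # (2) change the y coordinate to -y
Z_inv = rotation(theta)         # (3) rotate back to the original angle

X = matmul(matmul(Z_inv, Y), Z)

# Closed form for the same reflection, for comparison.
direct = [[math.cos(2 * theta), math.sin(2 * theta)],
          [math.sin(2 * theta), -math.cos(2 * theta)]]

print(X)
print(direct)
print(apply(X, (3.0, 4.0)))     # a point on the line stays where it is
print(apply(X, (-4.0, 3.0)))    # a point perpendicular to the line is sent to its negative
```

Y and X look nothing alike as arrays of numbers, but they are the same reflection seen from two coordinate systems.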

Merry Christmas and Happy New Year (none of that Happy Holidays crap for me)

The New Clayden pp. 1029 – 1068

p. 1034 — “Small amounts of radicals are formed in many reactions in which the products are actually formed by simple ionic processes.”  Interesting — how ‘small’ is small?  

p. 1036 — A very improbable mechanism (but true) given in the last reaction involving breaking benzene aromaticity and forming a cyclopropene ring to boot.  

p. 1043 — Americans should note that gradient (as in Hammett’s rho constant) means slope (or derivative if the plot of substituents vs. sigma for a particular reaction isn’t a straight line).  However we are talking log vs. log plots, and you can fit an elephant onto a log log plot.  It’s worth remembering why logarithms are necessary in the first place.  Much of interest to chemists (equilibrium constants, reaction rates) is exponential in free energy (of products vs. reactants in the first case, of transition state vs. reactants in the second).
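
To see how steep that exponential is, here is a small Python sketch using the standard relation K = exp(-deltaG/RT). The relation and the gas constant are textbook values rather than anything from Clayden's page, and the function names and sample energies are my own assumptions. At room temperature every 1.36 kcal/mol or so of free energy is another factor of 10 in the equilibrium constant, which is why the log is the natural thing to plot.

```python
import math

R = 1.987e-3    # gas constant in kcal / (mol K)
T = 298.0       # room temperature in Kelvin

def equilibrium_constant(delta_g):
    """K from the free energy difference (products minus reactants), in kcal/mol."""
    return math.exp(-delta_g / (R * T))

def log10_K(delta_g):
    """The same thing on a log scale: linear in the free energy difference."""
    return -delta_g / (R * T * math.log(10.0))

for delta_g in [0.0, -1.364, -2.728, -4.092]:
    print(delta_g, equilibrium_constant(delta_g), log10_K(delta_g))
```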

p. 1044 — Optimally I shouldn’t have to remember that a positive rho (for reaction value) means electrons flow toward the aromatic ring in the rate determining step, but should gut it out from the electron withdrawing or pushing effects on the transition state, and how this affects sigma (by remembering what equilibrium constant is over what for sigma, and rho), but this implies a very high working memory capacity (which I don’t have unfortunately).  I think mathematicians do, which is why I’m so slow at it.  They have to keep all sorts of definitions in working memory at once to come up with proofs (and I do to follow them).  

If you don’t know what working memory is, here’s a link — http://en.wikipedia.org/wiki/Working_memory.  

Here are a few literature references 

        [ Proc. Natl. Acad. Sci. vol. 106 pp. 21017 – 21018 ’09 ] This one is particularly interesting to me as it states that differences among people in working memory capacity are thought to reflect a core cognitive ability, because they strongly predict performance in fluid intelligence, reading, attentional control etc. etc.  This may explain why you have to have a certain sort of smarts to be a mathematician (the sort that helps you on IQ tests).  

       [ Science vol. 323 pp. 800 – 802 ’09 ] Intensive training on working memory tasks can improve working memory capacity, and reduce cognitively related clinical symptoms.  The improvements have been associated with an increase in brain activity in parietal and frontal regions. 

I think there are some websites which will train working memory (and claim to improve it).  I may give them a shot. 

Unrelated to this chapter, but worth bringing to the attention of the cognoscenti reading this, is Science vol. 337 pp. 1648 – 1651 ’12, as there is some fascinating looking organometallic chemistry in it.  This is a totally new development since the early 60’s and I look forward to reading the next chapter on Organometallic chemistry.   Hopefully orbitals and stereochemistry will be involved there, as they are in this paper.  Fig 1 C has a uranium atom bound to 3 oxygens and 3 nitrogens, and also by dotted bonds to H and C.

p. 1050 — The unspoken assumption about the kinetic isotope effect is that the C-D and C-H bonds have the same strength (since the curve of potential energy vs. atomic separation is the same for both). This is probably true, but why?    Also, there is no explanation of why the maximum kinetic isotope effect is 7.1.  So I thought I’d look and see what the current Bible of physical organic chemistry had to say about it. 

Anslyn and Dougherty (p. 422 –> ) leave the calculation of the maximum isotope effect (at 298 Kelvin) as an exercise.  They also assume that the force constant is the same.  Exercise 1 (p. 482) says one equation used to calculate kinetic isotope effects is given below — you are asked to derive it 

kH/kD = exp[ hc (vbarH – vbarD) / (2kT) ], and then in problem #2 plug in a stretching frequency for C-H of 3000 cm^-1 to calculate the isotope effect at 298 Kelvin, coming up with 6.5
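
Plugging numbers into that equation is easy enough in Python, so here is a sketch. It is not the derivation the exercise asks for. Two things are my assumptions and not from the book: the C-D frequency is obtained from the C-H one by treating the bond as a harmonic oscillator with the same force constant (so frequency goes as one over the square root of the reduced mass), and the masses are rounded to 12, 1 and 2.

```python
import math

H = 6.62607015e-34      # Planck constant, J s
C = 2.99792458e10       # speed of light in cm/s, to match wavenumbers in cm^-1
K_B = 1.380649e-23      # Boltzmann constant, J/K

def reduced_mass(m1, m2):
    """Reduced mass of a two body oscillator, in the same units as m1 and m2."""
    return m1 * m2 / (m1 + m2)

def isotope_effect(vbar_h, temperature=298.0):
    """kH/kD from the equation above, given the C-H stretch in cm^-1."""
    # ASSUMPTION: same force constant for C-H and C-D, rounded atomic masses.
    mu_h = reduced_mass(12.0, 1.0)
    mu_d = reduced_mass(12.0, 2.0)
    vbar_d = vbar_h * math.sqrt(mu_h / mu_d)
    exponent = H * C * (vbar_h - vbar_d) / (2.0 * K_B * temperature)
    return math.exp(exponent)

kie = isotope_effect(3000.0)
print(kie)
```

With these assumptions the number comes out a bit under 7, in the same neighborhood as the 6.5 above. The exact figure depends on what C-D frequency you assume.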

Far from satisfying.  I doubt that the average organic chemist reading Anslyn and Dougherty could solve it.  Perhaps I could have  done it back in ’61 when I had the incredible experience of auditing E. B. Wilson’s course on Statistical Mechanics while waiting to go to med school (yes he’s the Wilson of Pauling and Wilson).   More about him when I start reading Molecular Driving Forces. 

On another level, it’s rather surprising that mass should have such an effect on reaction rates.  Bonds are about the distribution of charge, and the electrical force between two protons is about 10^36 times stronger than the gravitational force between them. 

p. 1052 — Entropy is a subtle concept (particularly in bulk thermodynamics), but easy enough to measure there.    Organic chemists have a very intuitive concept of it as shown here.

p. 1054 — Very slick explanation of the inverse isotope effect.  

Again out of context — but more chemistry seems to be appearing in Nature and Science these days.   A carbon coordinated to 6 iron atoms ( yes six ! ! ! ) exists in an enzyme essential for life itself — the plant enzyme nitrogenase which reduces N2 to usable ammonia equivalents for formation of amino acids, nucleotides.   No mention seems to be made about just how unusual this is.  See Science vol. 337 pp. 1672 – 1675 ’12. 

p. 1061 — The trapping of the benzyne intermediate by a Diels Alder is clever and exactly what I was trying to do years ago in a failed PhD project — see https://luysii.wordpress.com/2012/10/04/carbenes-and-a-defense-of-pre-meds-and-docs/

p. 1064 — In the mechanism of attack on epichlorohydrin, the reason for the preference of attack on the epoxide isn’t given — it’s probably both steric and kinetic, steric because attack on the ring is less hindered — the H’s are splayed out, and kinetic, because anything opening up a strained ring should have a lower energy transition state. 

The New Clayden pp. 931 – 969

p. 935 — I don’t understand why neighboring group participation is less common using 4 membered rings than it is using  3 and 5 membered rings.  It may be entropy and the enthalpy of strain balancing out.  I think they’ve said this elsewhere (or in the previous edition).   Actually — looking at the side bar, they did say exactly that in Ch. 31.  

As we used to say, when scooped in the literature — at least we were thinking well.

p. 935 — “During the 1950’s and 1960’s, this sort of question provoked a prolonged and acrimonious debate”  – you better believe it.  Schleyer worked on norbornane, but I don’t think he got into the dust up.  Sol Winstein (who Schleyer called solvolysis Sol) was one of the participants along with H. C. Brown (HydroBoration Brown).

p. 936 — The elegance of Cram’s work.  Reading math has changed the way I’m reading organic chemistry.  What you want in math is an understanding of what is being said, and subsequently an ability to reconstruct a given proof.  You don’t have to have the proof at the tip of your tongue ready to spew out, but you should be able to reconstruct it given a bit of time.   The hard thing is remembering the definitions of the elements of a proof precisely, because precise they are and quite arbitrary in order to make things work properly.  It’s why I always leave a blank page next to my notes on a proof — to contain the definitions I’ve usually forgotten (or not remembered precisely).

I also find it much easier to remember mathematical definitions if I write them out as logical statements (as opposed to reading them as sentences).  This means using ==> for implies, | for such that, upside down A for ‘for all’, backwards E for ‘there exists’, etc.  There’s too much linguistic fog in my mind when I read them as English sentences.
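To illustrate the style, here is how one familiar definition, continuity of a function at a point, might be written out this way. The choice of example and the notation (metric spaces X and Y with distances d_X and d_Y) are assumptions made for illustration only.

```latex
% f : X -> Y is a map between metric spaces (X, d_X) and (Y, d_Y); a is a point of X.
% "f is continuous at a" written as a logical statement:
\[
\forall \varepsilon > 0 \;\; \exists \delta > 0 \;\Big|\;
\forall x \in X :\; d_X(x, a) < \delta \;\Longrightarrow\; d_Y\big(f(x), f(a)\big) < \varepsilon
\]
```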

So just knowing some general principles will be enough to reconstruct Cram’s elegant work described here.  There’s no point in trying to remember it exactly (although there used to be for me).   I think this is where beginning students get trapped: at first it seems that you can remember it all.  But then the inundation starts.  What should save them is understanding and applying the principles, which are relatively few.  Again, this is similar to what happens in medicine, and it's why passing organic chemistry sets up the premed for this style of thinking.

p. 938 – In the example of the Payne rearrangement, why doesn’t OH attack the epoxide rather than deprotonating the primary alcohol (which is much less acidic than OH itself)?

p. 955 – Although the orbitals in the explanation of why stereochemistry is retained in 1,2 migrations are called  molecular orbitals (e.g. HOMO, LUMO) they look awfully like atomic orbitals just forming localized bonds between two atoms to me.  In fact the whole notion of molecular orbital has disappeared in most of the explanations (except linguistically).  The notions of 50 years ago retain their explanatory power.  

p. 956 — How did Eschenmoser ever think of the reaction bearing his name?  Did he stumble into it by accident? 

p. 956 — The starting material for the synthesis of juvenile hormone looks nothing like it.  I suppose you could say it’s the disconnection approach writ large, but the authors don’t take the opportunity.   The use of fragmentation to control double bond stereochemistry is extremely clever.   This is really the first stuff in the book that I think I’d have had trouble coming up with.  The fragmentation syntheses at the end of the chapter are elegant and delicious.

On a more philosophical note, the use of stereochemistry and orbitals to make molecules is exactly what I mean by explanatory power.  Anti-syn periplanar is a very general concept, which I doubt was brought into being to explain the stereochemistry of fragmentation reactions (yet it does).  It appears over and over throughout the book in various guises.

Urysohn’s Lemma

“Now we come to the first deep theorem of the book, a theorem that is commonly called the “Urysohn lemma”.  . . .  It is the crucial tool used in proving a number of important theorems. . . .  Why do we call the Urysohn lemma a ‘deep’ theorem?  Because its proof involves a really original idea, which the previous proofs did not.  Perhaps we can explain what we mean this way:  By and large, one would expect that if one went through this book and deleted all the proofs we have given up to now and then handed the book to a bright student who had not studied topology, that student ought to be able to go through the book and work out the proofs independently.  (It would take a good deal of time and effort, of course, and one would not expect the student to handle the trickier examples.)  But the Urysohn lemma is on a different level.  It would take considerably more originality than most of us possess to prove this lemma.”

The above quote is  from  one of the standard topology texts for undergraduates (or perhaps the standard text) by James R. Munkres of MIT. It appears on  page 207 of 514 pages of text.  Lee’s text book on Topological Manifolds gets to it on p. 112 (of 405).  For why I’m reading Lee see https://luysii.wordpress.com/2012/09/11/why-math-is-hard-for-me-and-organic-chemistry-is-easy/.

Well it is a great theorem, and the proof is ingenious, and understanding it gives you a sense of triumph that you actually did it, and a sense of awe about Urysohn, a Russian mathematician who died at 26.   Understanding Urysohn is an esthetic experience, like a Dvorak trio or a clever organic synthesis [ Nature vol. 489 pp. 278 – 281 ’12 ].

Clearly, you have to have a fair amount of topology under your belt before you can even tackle it, but I’m not even going to state or prove the theorem.  It does bring up some general philosophical points about math and its relation to reality (e.g. the physical world we live in and what we currently know about it).

I’ve talked about the large number of extremely precise definitions to be found in math (particularly topology).  Actually what topology is about, is space, and what it means for objects to be near each other in space.  Well, physics does that too, but it uses numbers — topology tries to get beyond numbers, and although precise, the 202 definitions I’ve written down as I’ve gone through Lee to this point don’t mention them for the most part.

Essentially topology reasons about our concept of space qualitatively, rather than quantitatively.  In this, it resembles philosophy which uses a similar sort of qualitative reasoning to get at what are basically rather nebulous concepts — knowledge, truth, reality.   As a neurologist, I can tell you that half the cranial nerves, and probably half our brains are involved with vision, so we automatically have a concept of space (and a very sophisticated one at that).  Topologists are mental Lilliputians trying to tack down the giant Gulliver which is our conception of space with definitions, theorems, lemmas etc. etc.

Well one form of space anyway.  Urysohn talks about normal spaces.  Just think of a closed set as a Russian Doll with a bright shiny surface.  Remove the surface, and you have a rather beat up Russian doll — this is an open set.  When you open a Russian doll, there’s another one inside (smaller but still a Russian doll).  What a normal space permits you to do (by its very definition), is insert a complete Russian doll of intermediate size, between any two Dolls.
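In symbols, the doll picture corresponds roughly to the following standard reformulation of normality. The letters A, U and V are illustrative notation: A is a closed set (a doll with its surface), U is an open set containing it (a larger doll without its surface), and the closure of V is the complete intermediate doll.

```latex
% X is a normal space; A, U, V are subsets of X.
\[
A \subset U,\ A \text{ closed},\ U \text{ open}
\;\Longrightarrow\;
\exists\, V \text{ open such that } A \subset V \subset \overline{V} \subset U .
\]
```

Applying the same statement again to A and V, or to the closure of V and U, gives yet another doll in between, which is where the regress in the next paragraph comes from.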

This all sounds quite innocent until you realize that between any two Russian dolls an infinite number of concentric Russian dolls can be inserted.  Where did they get a weird idea like this?  From the number system of course.  Between any two distinct rational numbers p/q and r/s where p, q, r and s are whole numbers, you can  always insert a new one halfway between.  This is where the infinite regress comes from.
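The halfway trick is easy to try out with exact rational arithmetic. The sketch below uses Python's standard `fractions` module. The function names and the starting pair 1/3 and 1/2 are arbitrary choices for illustration.

```python
from fractions import Fraction


def midpoint(a, b):
    """Return the rational number halfway between rationals a and b."""
    return (a + b) / 2


def nested(lo, hi, steps):
    """Repeatedly insert a midpoint between lo and the last value inserted."""
    inserted = []
    for _ in range(steps):
        hi = midpoint(lo, hi)   # new rational strictly between lo and old hi
        inserted.append(hi)
    return inserted


# Five successive "dolls" squeezed in between 1/3 and 1/2.
dolls = nested(Fraction(1, 3), Fraction(1, 2), 5)
print(dolls)
```

Every value stays an exact fraction strictly between the two starting numbers, and nothing stops the loop from running as many steps as you like.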

For mathematics (and particularly for calculus) even this isn’t enough.  The square root of two isn’t a rational number (one of the great Euclid proofs), but you can get as close to it as you wish using rational numbers.  So there are an infinite number of non-rational (irrational) numbers between any two rational numbers.  In fact that’s how the real numbers (the rationals together with the irrationals) are defined: essentially by fiat, that any set of real numbers bounded above has a least upper bound (think 1, 1.4, 1.41, 1.414, . . . closing in on the square root of 2).
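The sequence 1, 1.4, 1.41, 1.414 can be generated with nothing but whole-number arithmetic. This is a minimal sketch and the helper name `sqrt2_truncation` is hypothetical. Each term is a rational number whose square is still less than 2, so the sequence keeps climbing without ever reaching a rational that squares to exactly 2.

```python
from fractions import Fraction
from math import isqrt


def sqrt2_truncation(digits):
    """Largest rational with `digits` decimal places whose square is <= 2."""
    scale = 10 ** digits
    # isqrt gives the integer part of sqrt(2 * scale^2), i.e. floor(sqrt(2) * scale).
    return Fraction(isqrt(2 * scale * scale), scale)


approx = [sqrt2_truncation(d) for d in range(4)]
for a in approx:
    # Show the rational, its decimal form, and how far its square falls short of 2.
    print(a, float(a), float(2 - a * a))
```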

What does this skullduggery have to do with space?  It says essentially that space is infinitely divisible, and that you can always slice and dice it as finely as you wish.  This is the calculus of Newton and the relativity of Einstein.  It clearly is right, or we wouldn’t have GPS systems (which actually require a relativistic correction).

But it’s clearly wrong as any chemist knows.  Matter isn’t infinitely divisible.  Just go down 10 orders of magnitude from the visible and you get the hydrogen atom, which can’t be split into smaller and smaller hydrogen atoms (although it can be split).

It’s also clearly wrong as far as quantum mechanics goes — while space might not be quantized, there is no reasonable way to keep chopping it up once you get down to the elementary particle level.  You can’t know where they are and where they are going exactly at the same time.

This is exactly one of the great unsolved problems of physics: bringing relativity, with its infinitely divisible space, together with quantum mechanics, where the very meaning of space becomes somewhat blurry (if you can’t know exactly where anything is).

Interesting isn’t it?
