That was a somewhat daunting task, so I recruited Matt Goldman as a co-author (Matt is currently working in the chief economist’s office at Microsoft, and has done some great work on optimal behavior in basketball), and together we wrote a review that you can read here. The chapter will appear in the upcoming *Handbook of Statistical Methods for Design and Analysis in Sports*, one of the Chapman & Hall/CRC Handbooks of Modern Statistical Methods.

Of course, it’s likely that you don’t want to read some academic-minded book chapter. Luckily, however, I was given the opportunity to write a short summary of the chapter for the blog Nylon Calculus (probably the greatest of the ultra-nerdy basketball blogs right now).

You can read it here.

A few things you might learn:

- An optimal team is one where everyone’s *worst* shots have the same quality
- An optimal strategy does not generally lead to the largest expected margin of victory
- NBA players are shockingly good at their version of the Secretary Problem

(Long-time readers of this blog might see some familiar themes from these blog posts.)

]]>

In a column today, Zach considered the following question: How valuable is it to try and get offensive rebounds?

Obviously, getting an offensive rebound has great value, since it effectively gives your team another chance to score. But if you send too many players to “crash the boards” in pursuit of a rebound, you leave yourself wide open to a fast-break opportunity by the opposing team.

So there is some optimization to be done here. One needs to weigh the benefit of increased offensive rebounding against the cost of worse transition defense.

In recent years, the consensus opinion in the NBA seems to have shifted away from a focus on offensive rebounding and towards playing it safe against fast-breaks. In his analysis, however, Zach toys with the idea that some teams might find a strategic advantage in pursuing the opposite strategy, and putting a lot of resources into offensive rebounding.

This might very well be true, but my suspicions were raised when Zach made the following comments:

There may be real danger in banking too much on offensive rebounds. And that may be especially true for the best teams. Good teams have good offenses, and good offenses make almost half their shots. If the first shot is a decent bet to go in, perhaps the risk-reward calculus favors getting back on defense. This probably plays some role in explaining why good teams appear to avoid the offensive glass: because they’re good, not because offensive rebounding is on its face a bad thing.

…

Bad teams have even more incentive to crash hard; they miss more often than good teams!

Zach is right, of course, that a team with a high shooting percentage is less likely to get an offensive rebound. But it is also true that offensive rebounds are more valuable for teams that score more effectively. For example, if your team scores 1.2 points per possession on average, then an offensive rebound is more valuable to you than it is to a team that only scores 0.9 points per possession, since the rebound is effectively granting you one extra possession.

Put another way: both the team that shoots 100% and the team that shoots 0% have no incentive to improve their offensive rebounding. The first has no rebounds to collect, and the second has nothing to gain by grabbing them. I might have naively expected that a team making half its shots, as Zach mentions in his comment above, has the *very most* incentive to improve its offensive rebounding!

So let’s put some math to the problem, and try to answer the question: how much do you stand to gain by improving your offensive rebounding? By the end of this post I’ll present a formula to answer this question, along with some preliminary statistical results.

The starting point is to map out the possible outcomes of an offensive possession. For a given shot attempt, there are two possibilities: make or miss. If the shot misses, there are also two possibilities: a defensive rebound, or an offensive rebound. If the team gets the offensive rebound, then they get another shot attempt, as long as they can avoid committing a turnover before the attempt. Let’s say that a team’s shooting percentage is *p*, their offensive rebound rate is *r*, and the turnover rate is *t*.

Mapping out all these possibilities in graphical form gives a diagram like this:

The paths through this tree (left to right) that end in x’s result in zero points. The path that ends in o results in some number of points. Let’s call that number *v* (it should be between 2 and 3, depending on how often your team shoots 3’s). The figures written in italics at each branch represent the probabilities of following the given branch. So, for example, the probability that you will miss the first shot and then get another attempt is $(1-p)\,r\,(1-t)$.

Of course, once you take another shot, the whole tree of possibilities is repeated. So the full diagram of possible outcomes is something like this:

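Now there are many possible sequences of outcomes. If you want to know the expected number of points scored, which I’ll call *F*, you just need to sum up the probability of ending at a green circle and multiply by *v*. This is

$$
\begin{aligned}
F &= vp + vp\,(1-p)\,r\,(1-t) + vp\,\left[(1-p)\,r\,(1-t)\right]^2 + \dots \\
  &= \frac{vp}{1 - (1-p)\,r\,(1-t)}
\end{aligned}
$$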

(A little calculus was used to get that last line.)

So now if you want to know how much you stand to gain by improving your offensive rebounding, you just need to look at how quickly the expected number of points scored, *F*, increases with the offensive rebound rate, *r*. This is the derivative *dF/dr*, which I’ll call the “Value of Improved Offensive Rebounding”, or VIOR (since basketball nerds love to make up acronyms for their “advanced stats”). It looks like this:

$$
\text{VIOR} = \frac{dF}{dr} = \frac{vp\,(1-p)(1-t)}{\left[1 - (1-p)\,r\,(1-t)\right]^2}
$$

Here’s how to interpret this stat: VIOR is the number of points per 100 possessions by which your scoring will increase for every percent improvement of the offensive rebound rate *r*.
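For anyone who wants to play with the numbers, here is a minimal Python sketch of the possession tree described above. It assumes that each shot goes in with probability *p*, that a miss is rebounded by the offense with probability *r*, and that the rebound turns into a new shot attempt with probability 1 − *t*. The function names and the input values are illustrative assumptions; they are not taken from the table in this post.

```python
def expected_points(p, r, t, v):
    """Expected points per possession for the possession tree above."""
    # Probability of a miss, then an offensive rebound, then no turnover.
    retry = (1.0 - p) * r * (1.0 - t)
    # Closed form of the geometric series v*p*(1 + retry + retry**2 + ...).
    return v * p / (1.0 - retry)


def vior(p, r, t, v):
    """Derivative dF/dr of expected_points with respect to r."""
    a = (1.0 - p) * (1.0 - t)
    return v * p * a / (1.0 - a * r) ** 2


# Illustrative inputs only (assumed, roughly league-like values).
p, r, t, v = 0.45, 0.23, 0.13, 2.2

F = expected_points(p, r, t, v)
slope = vior(p, r, t, v)

# Cross-check the closed form against the first 50 terms of the series.
retry = (1.0 - p) * r * (1.0 - t)
series = sum(v * p * retry**n for n in range(50))

# Cross-check the derivative with a centered finite difference.
h = 1e-6
numeric = (expected_points(p, r + h, t, v) - expected_points(p, r - h, t, v)) / (2 * h)

print(f"F    = {F:.3f} points per possession")
print(f"VIOR = {slope:.3f} points per 100 possessions per percentage point of r")
```

Because *dF/dr* is measured in points per possession per unit of *r*, multiplying by 100 possessions and by 0.01 (one percentage point) leaves the number unchanged, which is why the raw derivative can be read directly as VIOR. With these assumed inputs the value comes out near 0.6.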

Of course, VIOR only tells you how much the *offense* improves, and thus it cannot by itself tell you whether it’s worthwhile to improve your offensive rebounding. For that you need to understand how much your *defense* suffers for each incremental improvement in offensive rebounding rate. That’s a problem for another time.

But still, for curiosity’s sake, we can take a stab at estimating which current NBA teams would benefit the most, offensively, from improving their offensive rebounding. Taking some data from basketball-reference, I get the following table:

In this table the teams are sorted by their VIOR score (i.e., by how much they would benefit from improved offensive rebounding). Columns 2-5 list the relevant statistics for calculating VIOR.

The ordering of teams seems a bit scattered, with good teams near the top and the bottom of the list, but there are a few trends that come out if you stare at the numbers long enough.

- First, teams that have a lower turnover rate tend to have a higher VIOR. This seems somewhat obvious: rebounds are only valuable if you don’t turn the ball over right after getting them.
- Teams that shoot more 3’s also tend to have a higher VIOR. This is presumably because shooting more 3’s allows you to maintain a high scoring efficiency (so that an additional possession is valuable) while still having lots of missed shots out there for you to collect.
- The teams that would most benefit from improved offensive rebounding are generally the teams that are *already the best at offensive rebounding*. This seems counterintuitive, but it comes out quite clearly from the logic above. If your team is already good at offensive rebounding, then grabbing one more offensive rebound buys you more than one additional shot attempt, on average.

(Of course, it is also true that a team with a high offensive rebounding rate might find it especially difficult to improve that rate.)

Looking at the league-average level, the takeaway is this: an NBA team generally improves on offense by about 0.62 points per 100 possessions for each percentage point increase in its offensive rebound rate. This means that if NBA teams were to improve their offensive rebounding from 23% (where it is now) to 30% (where it was a few years ago), they would generally score about 4.3 points more per 100 possessions.

So now the remaining question is this: are teams saving more than 4.3 points per 100 possessions by virtue of their improved transition defense?

**Footnotes:**

- There are, of course, plenty of ways that you can poke holes in the logic above. For example: is the shooting percentage *p* really a constant, independent of whether you are shooting after an offensive rebound or before? Is the turnover rate *t* a constant? The offensive rebound rate *r*? If you want to allow for all of these things to vary situationally, then you’ll need to draw a much bigger tree. I’m not saying that these kinds of considerations aren’t important (this is a very preliminary analysis, after all), only that I haven’t thought deeply about them.

- Much of the logic of this post was first laid out by Brian Tung after watching game 7 of the 2010 NBA finals, where the Lakers shot 32.5% but rebounded 42% of their own misses.
- I know I’ve made this point before, but I will never get over how useful calculus is. I use it every day of my life, and it helps me think about essentially every topic.

]]>

I don’t mean to brag, but if you’ve been following this sequence of posts on the topic of particles and fields, then I’ve sort of taught you the secret to modern physics.

The secret goes like this:

Everything arises from fields, and fields arise from everything.

…

Go ahead.

You can indulge in a good eye-roll over the new-agey sound of that line.

(And over the braggadocio of the author.)

But eye-rolling aside, that line actually does refer to a very profound idea in physics. Namely, that the most fundamental object in nature is the *field*: a continuous, space-filling entity that has a simple mathematical structure and supports “undulations” or “ripples” that act like physical particles. (I offered a few ways to visualize fields in this post and this post.) To me, it is the most mind-blowing fact of modern physics that what we call *particles* are really just “ripples” or “defects” on some infinite field.

But the miraculousness of fields isn’t just limited to fundamental particles. Fields also emerge at much higher levels of reality, as composite objects made from the motion of many active and jostling things. For example, one can talk about a “field” made from a large collection of electrons, atoms, molecules, cells, or even people. The “particles” in these fields are ripples or defects that move through the crowd. It is one of the miracles of science that essentially any sufficiently large group of interacting objects gives rise to simple collective excitations that behave like independent, free-moving particles.

Maybe this discussion seems excessively esoteric to you. I can certainly understand that objection. But the truth is that the basic paradigm of particles and fields is so generic and so powerful that one can apply it to just about any level of nature.

So we might as well use it to talk about something awesome.

Let’s talk about swords.

* * *

A sword, of course, is a solid piece of metal, and that means that if you look at it under sufficiently high magnification it will look something like this:

The little balls in this picture represent atoms (say, iron atoms), and in a solid metal they generally sit in a nice, periodic arrangement. (The lines in the drawing are just there to illustrate the orderliness of this arrangement.) The positions of the atoms will constitute our *field*.

Now let’s ask the question: how strong is a sword? How much force can you apply on it before the sword deforms or breaks?

To make the question more specific, let’s suppose that you swing your sword directly into a sharp surface (like, I don’t know, another sword). At the point of impact there will be a force that tries to push one plane of atoms in such a way that it slides across the neighboring plane. This kind of force is called *shear*.

How big does the shear force have to be before your sword breaks? Looking at the picture above, one very natural answer to this question might come to mind. Namely, that the breaking force should be equal to the repulsive force between two neighboring atoms multiplied by the number of atoms in a given plane.

This answer is very natural, but also very wrong. In fact, if you use that answer to make an estimate of a sword’s breaking force, you’ll find that even a laughably puny “sword” with a 1 millimeter cross section would withstand multiple tons of force before it broke. Since we do not live in a world where people go confidently into battle with millimeter-thick swords (and since, relatedly, you are probably capable of deforming a steel paper clip with your bare hands), there must be something wrong with this answer.
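To get a feel for the size of that naive estimate, here is a rough Python sketch. It is not the calculation used in this post: it swaps in Frenkel’s classic ideal-shear-strength argument, which says that sliding one perfect atomic plane over another takes a shear stress of roughly G/(2π), where G is the shear modulus. The shear modulus of iron (about 82 GPa) and the 1 mm × 1 mm cross section are assumed values chosen for illustration.

```python
import math

# Assumed inputs (illustrative, not taken from the post).
SHEAR_MODULUS_IRON_PA = 82e9   # textbook shear modulus of iron, about 82 GPa
CROSS_SECTION_M2 = 1e-6        # a 1 mm x 1 mm cross section
G_ACCEL = 9.81                 # m/s^2, converts newtons to kilograms-force

# Frenkel estimate: ideal shear strength of a defect-free crystal.
ideal_shear_stress_pa = SHEAR_MODULUS_IRON_PA / (2 * math.pi)

# Force needed to shear the whole cross section at once.
breaking_force_n = ideal_shear_stress_pa * CROSS_SECTION_M2
breaking_force_tonnes = breaking_force_n / G_ACCEL / 1000.0

print(f"ideal shear stress: {ideal_shear_stress_pa / 1e9:.1f} GPa")
print(f"breaking force:     {breaking_force_n:.0f} N "
      f"(about {breaking_force_tonnes:.1f} tonnes-force)")
```

Under these assumptions the force lands at more than a tonne for a millimeter-scale piece of metal, which is the same absurdly large order of magnitude as the estimate above.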

To understand what went wrong, we need to think about the *particles* in our field.

Remember that a *particle* is basically just a defect in a field. And if your field is a crystal of iron atoms, then there is one particular kind of defect that is especially relevant. This defect is called a *dislocation*, and it looks like this:

A dislocation is a place where the lattice planes don’t line up with each other. This failure of alignment produces stress in the nearby regions of the crystal (illustrated by the orangeish area), as atoms are forced into positions that are slightly closer or slightly further from their neighbors than they would prefer. Notice, however, that there is no easy way to eliminate all that stress. Moving atoms around locally just shifts the position of the dislocation, and the stress remains the same.

Of course, you should also keep in mind that the dislocations are not little points. That orange region of stress is not just a single point-like region where the lattice planes are mismatched. In a thick piece of metal, the dislocations are actually long lines of mismatched atoms.

(In this picture, that line of dislocation extends into the screen.)

Consequently, our “particles” in this field are better drawn as long, stringy lines that extend through the metal. I’ll draw them like this:

As it turns out, these dislocations have a serious implication for the strength of our hypothetical sword. When a dislocation is present, all you need to do to deform the sword is to move the dislocation from one side to the other. Like this:

In contrast to the Herculean effort required to make two atomic planes slip against each other, moving a dislocation is easy, since you are only displacing a small number of atoms at a time. One analogy is that moving a dislocation is something like trying to move a very heavy carpet across the floor. Dragging the whole thing may be prohibitively difficult, but if you make a wrinkle or a roll in the carpet, you can simply push that wrinkle to shift the position of the carpet.

This is also why your puny hands are capable of bending a paper clip: when you bend the clip, you are in fact just pushing dislocations from one side of the material to the other.

So what can you do if you want a sword that doesn’t bend or break easily?

You might think that the answer is to be extremely fastidious in preparing or choosing your metal, with the goal of having as few dislocations as possible. But this turns out to be a fool’s errand. Even a small number of dislocations enable the material to deform, and new dislocations can always enter the metal from either edge (as in the animated gif above).

The correct strategy, as it turns out, is to make *more* dislocations. And to make them as disordered as possible.

The crucial idea behind this strategy is that dislocations can’t really move through each other. When two dislocations are brought together, the stress in the crystal builds up intensely around them.

Such a stress build-up leads to a strong repulsive force that pushes the dislocations back apart, and thereby prevents them from moving through each other.

So now if you have two dislocations aligned in different directions, they can get caught on each other in a way that prevents each of them from slipping past the other.

In fact, this kind of dislocation tangling is one of the most important reasons for all that hammering during the process of metal forging.

When the mighty smith stands at work over his anvil (the muscles of his brawny arms as strong as iron bands), his effort is largely going into creating a tangled knot of dislocations inside the metal. Such a tangle keeps the metal strong by pinning the dislocations in place, and prevents the metal from deforming under future stresses. (This part of the process is also known as *work hardening*, or *strain hardening*.)

In this way, the value of the blacksmith is not that he’s strong enough to deform crystalline steel (he’s not). It’s just that he’s pretty good at making a tangled mess of dislocations. And tangled dislocations make good swords.

I guess you could call him an applied field theorist.

One footnote is in order: I learned a great deal about forging from this excellent article written by the renowned blade/swordsmith Kevin Cashen.

I also stole the rug picture from his website, and I hope he doesn’t mind.

]]>

But when you’re a graduate student or postdoc struggling to make a career in physics, that quote rarely feels true. Instead, you are usually made to feel like productivity and technical ability are the qualities that will make or break your career.

But Leo Kadanoff was someone who made that quote feel true.

Leo Kadanoff, one of the true giants in theoretical physics during the last half century, passed away just a few days ago. Kadanoff’s work was marked by its depth of thought and its relentless creativity. I’m sure that over the next week many people will be commemorating his life and his career.

But I thought it might be worth telling a brief story about my own memories of Leo Kadanoff, however minor they may be.

During the early part of 2014, I had started playing with some ideas that were well outside my area of expertise (if, indeed, I can be said to have any such area). I thought these projects were pretty cool, but I was tremendously unconfident about them. My lack of confidence was actually pretty justified: I was highly ignorant about the fields to which these projects properly belonged, and I had no reason to think that anyone else would find them interesting. I was also working in the sort-of-sober environment of the Materials Science Division at Argonne National Laboratory, and I was afraid that at any moment my bosses would tell me to shape up and do real science instead of nonsense.

In a moment of insecure hubris (and trust me, that combination of emotions makes sense when you’re a struggling scientist), I wrote an email to Leo Kadanoff. I sent him a draft of a manuscript (which had already been rejected twice without review) and asked him whether he would be willing to give me any comments. The truth is that the work really had no connection to Kadanoff, or to any of his past or present interests. I just knew that he was someone who had wide-ranging interests and a history of creativity, and I wasn’t sure who else to write to.

His reply to me was remarkable. He told me that he found the paper interesting, and that I should come give a seminar and spend a day at the University of Chicago. I quickly took him up on the offer, and he slotted me into his truly remarkable seminar series called “Computations in Science.” (“The title is old,” he said, “inherited from the days when that title would bring in money. We are closer to ‘concepts in science’ or maybe ‘all things considered.’ ”)

When I arrived for the seminar, Kadanoff was the first person to meet me. He had just arrived himself, by bike. Apparently at age 77 he still rode his bike to work every day. We had a very friendly conversation for an hour or so, in which we talked partly about science and partly about life, and in which he gave me a brief guide to theater and music in the city of Chicago. (At one point I mentioned that my wife was about to start medical residency, which is notorious for its long and stressful hours. He sympathized, and said “I am married to a woman who has been remarkably intelligent all of her life, except during her three years of residency.”)

When it came time for the talk to start, I was more than a little nervous. Kadanoff stood in front of the room to introduce me.

“Brian is a lot like David Nelson …” he began.

[And here my eyes got wide. David Nelson is another giant in theoretical physics, well-respected and well-liked by essentially everyone. So I was bracing myself for some outrageous compliment.]

“… he grew up in a military family.”

I don’t think that line was meant as a joke. But somehow it put me in a good mood, and the rest of the day went remarkably well. The seminar was friendly, and the audience was enthusiastic and critical (another combination of emotions that goes very well together in science). In short, it was a beautiful day for me, and I basked in the atmosphere that surrounded Kadanoff in Chicago. It seemed to me a place where creativity and inquisitiveness were valued intensely, and I found it immensely energizing and inspiring.

Scientists love to tell the public about how their work is driven by the joy of discovery and the pleasure of figuring things out. But rarely does it feel so directly true as it did during my visit to the University of Chicago.

On the whole, the truth is that I didn’t know Leo Kadanoff that well. My interactions with him didn’t extend much beyond one excellent day, a few emails, and a few times where I was in the audience of his talks. But when Kadanoff was around, I really felt like science and the profession of scientist lived up to their promise.

It’s pretty sad to think that I will probably never get that exact feeling again.

Take a moment, if you like, and listen to Kadanoff talk about his greatest work. It starts with comic books.

]]>

Let’s start with a big question: why does science work?

Writ large, science is the process of identifying and codifying the rules obeyed by nature. Beyond this general goal, however, science has essentially no specificity of topic. It attempts to describe natural phenomena on all scales of space, time, and complexity: from atomic nuclei to galaxy clusters to humans themselves. And the scientific enterprise has been so successful at each and every one of these scales that at this point its efficacy is essentially taken for granted.

But, by just about any *a priori* standard, the extent of science’s success is extremely surprising. After all, the human brain has a very limited capacity for complex thought. We humans tend to think (consciously) only about simple things in simple terms, and we are quickly overwhelmed when asked to simultaneously keep track of multiple independent ideas or dependencies.

As an extreme example, consider that human thinking struggles to describe even individual atoms with real precision. How is it, then, that we can possibly have good science about things that are made up of many atoms, like magnets or tornadoes or eukaryotic cells or planets or animals? It seems like a miracle that the natural world can contain patterns and objects that lie within our understanding, because the individual constituents of those objects are usually far too complex for us to parse.

You can call this occurrence the “miracle of emergence”. I don’t know how to explain its origin. To me, it is truly one of the deepest and most wondrous realities of the universe: that simplicity continuously emerges from the teeming of the complex.

But in this post I want to try and present the nature of this miracle in one of its cleanest and most essential forms. I’m going to talk about quasiparticles.

Let’s talk for a moment about the most world-altering scientific development of the last 400 years: electronics.

When humans learned how to harness the flow of electric current, it completely changed the way we live and our relationship to the natural world. In the modern era, the idea of electricity is so fundamental to our way of living that we are taught about it within the first few years of elementary school. That teaching usually begins with pictures that look something like this:

That is, we are given the image of electrons as little points that flow like a river through some conducting material. This image more or less sticks around, with relatively little modification, all the way through a PhD in physics or electrical engineering.

But there is a dirty secret behind that image: it doesn’t make any sense.

And an even deeper secret: it isn’t *electrons* that carry electric current. Instead, the current is carried by much larger and more nuanced objects called “electrons”.

Let me explain.

To see the problem with the standard picture, think for a moment about what metals are actually made of. Let’s even take the simplest possible example of a metal: metallic lithium, with three electrons per atom. A single lithium atom looks something like this:

Those points are meant to show the probability density for the electrons inside the atom. Two of the electrons (the yellow points) are closely bound to the nucleus, while the third (blue points) is more loosely bound. The exact arrangement of electrons around the nucleus is actually a difficult question, because the three electrons are continually pushing on each other (via very strong electric forces) as they orbit around the nucleus. So the precise structure of even this simple atom does not have an easy solution.

The situation gets exponentially messier, though, when you bring a whole bunch of atoms together to make a block of lithium. Inside that block, the atoms are packed together very tightly, something like this:

This picture may look tidy, but consider it from the point of view of an electron traveling through the metal. Such an electron has no hope for a smooth and simple trajectory. Instead, it gets continuously buffeted around by the enormous forces coming from the other nearby electrons and from the nuclei. So, for example, if you injected an electron into one side of a piece of lithium, it would absolutely *not* just sail smoothly across to the other side. It would quickly adopt a completely chaotic trajectory, and any information about its initial direction or speed would be lost.

(Of course, this is all to say nothing about how messy things are in a more typical metal like copper. In copper each atom has 29 electrons swirling around it in complicated orbits, rather than just 3. I can’t even draw you a good picture of a copper atom.)

So thinking about individual electrons is hard – much too hard to be useful for any simple human reasoning. As it turns out, if you want to make any headway thinking about electric current, it actually makes sense to forget about the electrons’ individuality and just imagine clouds of probability density around each nucleus. Now, when you inject an electron into one side of the metal, it just adds some probability to the electron clouds on that side. And you can imagine that over time this probability moves on down the line, like so:

So this is how electric current is really carried. Not by free-sailing electrons, but by waves of probability density that are themselves made from the swirling, chaotic trajectories of many different electrons.

Messy, right? Well, now comes the miracle.

The key insight is that you can think about all those swirling, chaotic electron trajectories as a *quantum field* of electric charge, conceptually similar to the quantum fields out of which the fundamental particles arise. And now you can ask the question: what do the ripples on that field look like?

The answer: they look almost identical to real electrons.

In fact, those waves of electric charge density look so similar to “bare” electrons flying through free space that we even call them “electrons”. But they are not *electrons* as God made them. These “electrons” are instead an emergent concept: a collective movement of many jumbled and densely-packed God-given electrons, all pushing on each other and flying around at millions of miles per hour in chaotic trajectories.

But the emergent wave, that so-called “electron”, is startling in its simplicity. It moves through the crystal in straight lines and with a constant speed, like a ghost that can travel through walls. It carries with it the exact same charge as a single electron (and the exact same quantum-mechanical spin). It has the same type of kinetic energy, $\frac{1}{2}mv^2$, and for all the world behaves like a naked electron moving through empty space. In fact, the only way you could tell, from a distance, that the “electron” is not really an *electron*, is that its mass is different. The “electron” that emerges from the sea of chaotic electrons feels either a bit heavier, or a bit lighter, than a bare electron. (And sometimes it is as much as 100 times heavier or lighter, depending on the details of the atomic orbitals and the atom spacing.)

This discovery – that the fundamental emergent excitation from a soup of electrons looks and acts *just like a real, solitary electron* – was one of the great triumphs of 20th century physics. The theory of these excitations is called *Fermi liquid theory*, and was pioneered by the near-mythological Soviet physicist Lev Landau. Landau called these emergent waves “quasiparticles”, because they behave just like free, unimpeded particles, even though the electrons from which they are made are very much neither free nor unimpeded.

To my mind, this discovery emphasizes the essence of what is beautiful about physical science. Complexity, by itself, has no inherent beauty. But there is something beautiful about observing a very simple thing emerging from an environment that initially appears to be a complicated mess. It gives the same fundamental pleasure as, say, watching waves roll onto the seashore (or, to a lesser extent, seeing people in a stadium do the wave).

A good deal of physics (especially condensed matter physics, my own specialty) is built on the pattern that Landau set for us. Its practitioners spend much of their time combing through the physical universe in search of quasiparticles, those little miracles that allow us to understand the whole, even though we have no hope of understanding the sum of its parts.

And at this point, we have amassed a fair collection of them. I’ll give you a few examples, in table form:

Each item in this list has its own illustrious history and deserves to have its story told on its own. But what they all have in common are the traits that make them quasiparticles. They all move through their host material in simple straight lines, as if they were freely traveling through empty space. They are all stable, meaning that they live for a long time without decaying back into the field from which they arose. They all come in discrete, indivisible units. And they all have simple laws determining how they interact with each other.

They are, in short, particles. It’s just that we understand what they’re made of.

One could ask, finally, *why* it is that these quasiparticles exist in the first place. Why do we keep finding simple emergent objects wherever we look?

As I alluded to in the beginning, there is no really satisfying answer to that question. But there is a hint. All of these objects have a *mathematically* simple structure. It somehow seems to be a rule of nature that if you can write a simple and aesthetically pleasing equation, somewhere in nature there will be a manifestation of the equation you wrote down. And the simpler and more beautiful the equation you can manage to write, the more manifestations you will find.

So physicists have slowly learned that one of the best pathways to discovery is through mathematical *parsimony*. We write the simplest equation that we think could possibly describe the object we want to understand. And, in no small number of instances, nature finds a way to realize exactly that equation.

Of course, the big question remains: why should nature care about man’s mathematics, or his sense of beauty? How can the same sorts of simple equations keep appearing at every scale of nature that we look for them? How is it that math, seemingly an invention by feeble human brains, is capable of transcending so thoroughly the understanding of its creators?

These questions seem to have no good answers. But they are, to me, continually awe-inspiring. And they swirl around the heart of the mystery of why science is possible.

]]>

Specifically, this problem:

The problem was written into this year’s Higher Maths exam in Scotland, and has since been the source of much angst for Scottish high schoolers and many Twitter jokes for everyone else.

As with most word problems, I’m sure that what confounded people was making the translation between a verbal description of the problem and a set of equations. The actual math problem that needs to be solved is pretty standard for a calculus class. It just comes down to finding the minimum of the function $T(x)$ (which is one of the things that calculus is absolutely most useful for). In practical terms, that means taking the derivative and setting it equal to zero.

But it turns out that there is a more clever way to solve the problem that doesn’t require you to know any calculus or take any derivatives. It has fewer technical steps (and therefore comes with a smaller chance of screwing up your calculation somewhere along the way), but more steps of logical thinking. And it goes like this.

The problem is essentially asking you to find the path of shortest time for a crocodile moving from one point to another. If the crocodile were just walking on land, this would be easy: the quickest route is always a straight line. The tricky part is that the crocodile has to move partly on land and partly on water, and the water section is slower than the land section.

If you want to know the speed of the crocodile on land and on water, you can pretty much read them off directly from the problem statement. The problem gives you the equation $T(x) = 5\sqrt{36 + x^2} + 4(20 - x)$. The quantity $(20 - x)$ in that equation represents the path length for the on-land section, and the quantity $\sqrt{36 + x^2}$ is the path length for the water section. (That square root is the length of the hypotenuse of a triangle with side lengths $x$ and $6$; apparently the river is 6 meters wide.) Since $\text{time} = \text{distance}/\text{speed}$, this means that the on-land speed is $10/4$ m/s, and the water speed is $10/5$ m/s (remember that the units of time were given in tenths of a second, and not seconds, although that doesn’t matter for the final answer).

That was just interpreting the problem statement. Now comes the clever part.

The trick is to realize that the problem of “find the shortest time path across two areas with different speed” is not new. It’s something that nature does continually whenever light passes from one medium to another:

I’m talking, of course, about Fermat’s principle: any time you see light go from one point to another, you can be confident that it took the shortest time path to get there. And when light goes from one medium to another one where it has a different speed, it bends. (Like in the picture above: light moves slower through the glass, so the light beam bends inward in order to cross through the glass more quickly.)

The bending of light is described by Snell’s law:

$$\frac{\sin\theta_1}{v_1} = \frac{\sin\theta_2}{v_2},$$

where $v_1$ and $v_2$ are the speeds in regions 1 and 2, and $\theta_1$ and $\theta_2$ are the angles that the light makes with the surface normal.

Since our crocodile is solving the exact same problem as a light ray, it follows that its motion is described by the *exact same equation*. Which means this:

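Here, $v_1 = 10/4$ m/s is the crocodile speed on land, and $v_2 = 10/5$ m/s is its speed in the river. The sine of $\theta_1$ is $1$ (the land leg runs along the riverbank, at a right angle to the normal), and the sine of $\theta_2$ is $x/\sqrt{36 + x^2}$.

So in the end the fastest path for the crocodile is the one that satisfies

$$\frac{1}{10/4} = \frac{x/\sqrt{36 + x^2}}{10/5}.$$

If you solve that equation (square both sides and rearrange), you’ll get the correct answer: $x = 8$ m.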
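For anyone who wants the "square both sides and rearrange" step spelled out, here is one way it can go. This is a sketch that assumes the exam's time function is $T(x) = 5\sqrt{36 + x^2} + 4(20 - x)$, so that the land and water speeds are $10/4$ and $10/5$ m/s:

```latex
\begin{aligned}
\frac{1}{10/4} &= \frac{x/\sqrt{36 + x^2}}{10/5} \\[4pt]
\frac{x}{\sqrt{36 + x^2}} &= \frac{4}{5} \\[4pt]
25\,x^2 &= 16\,(36 + x^2) \\[4pt]
9\,x^2 &= 576 \\[4pt]
x &= 8~\text{m}
\end{aligned}
```

Only the positive root is kept, since $x$ is a distance along the bank.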

So knowing some basic optics will give you a quick solution to the crocodile problem.
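If you would rather let a computer check the claim, the following is a minimal sketch that minimizes the travel time by brute numerical search and compares the result with the Snell's-law prediction. It assumes the exam's time function is $T(x) = 5\sqrt{36 + x^2} + 4(20 - x)$; the names `travel_time` and `golden_section_min` are made up for this illustration.

```python
import math

# Assumed form of the exam's time function (in tenths of a second):
# swim the hypotenuse across a 6 m wide river, then walk the remaining (20 - x) m.
def travel_time(x):
    return 5 * math.sqrt(36 + x**2) + 4 * (20 - x)

def golden_section_min(f, lo, hi, tol=1e-9):
    """Locate the minimum of a unimodal function on [lo, hi] without derivatives."""
    inv_phi = (math.sqrt(5) - 1) / 2
    a, b = lo, hi
    while b - a > tol:
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if f(c) < f(d):
            b = d          # minimum lies in [a, d]
        else:
            a = c          # minimum lies in [c, b]
    return (a + b) / 2

# Route 1: minimize the time directly (the "calculus" answer, done numerically).
x_best = golden_section_min(travel_time, 0.0, 20.0)

# Route 2: Snell's law. The land leg runs along the bank, so the sine of its
# angle to the normal is 1, and sin(theta_water) = v_water / v_land.
v_land, v_water = 10 / 4, 10 / 5                      # metres per second
sin_water = v_water / v_land
x_snell = 6 * sin_water / math.sqrt(1 - sin_water**2)  # x = 6 * tan(theta_water)

print(round(x_best, 4), round(x_snell, 4))
```

Both routes should land on the same value of $x$, which is the whole point: the crocodile and the light ray are solving the same minimization.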

This happy coincidence brings to mind a great Richard Feynman quote: “Nature uses only the longest threads to weave her patterns, so each small piece of her fabric reveals the organization of the entire tapestry.” It turns out that this particular tapestry had both light rays and crocodiles in it.

By the way, this post has a very cool footnote. It turns out that ants frequently have to solve a version of this same problem: they need to make an efficient trail from the anthill to some food source, but the trail passes over different pieces of terrain that have different walking speeds.

And, as it turns out, ants understand how to follow Fermat’s principle too! (original paper here)

]]>

You are invited to teach a class to a group of highly-motivated high school students. It can be about absolutely any topic, and can last for as little as 5 minutes or as long as 9 hours.

What topic would you choose for your class?

As it happens, this is not just a hypothetical question for me at the moment. In November MIT is hosting its annual MIT Splash event, and the call for volunteer teachers is almost exactly what is written in the quote above. Students, staff, and faculty from all over MIT are invited to teach short courses on a topic of their choosing, and the results are pretty wild.

A few of my favorite courses from last year:

- The History of Video Game Music
- How to Create a Language
- Cryptography for People Without a Computer
- Build a Mini Aeroponic Farm
- Calculating Pi With a Coconut
- Advanced Topics in Murder

So now I would like to turn to you, dear blog readers, for help.

What should I teach about? Please let me know, in the comments, what you think about either of these two questions:

- If you were a high school student, what kind of class would you want to go to?
- If you were in my position, what kind of class would you want to teach?

The two ideas that come to mind immediately are:

- Quantum Mechanics with middle school math

Use Algebra 1 – level math to figure out answers to questions like: What is wave/particle duality? How big is an atom? How do magnets work? What is quantum entanglement?

- The Math Behind Basketball Strategy

Learn about some of the difficult strategic decisions that basketball teams are faced with, and see how they can be described with math. Then solve a few of them yourself!

Imagining yourself as a high school student, which of those two sounds better to you? Any suggestions for alternative ideas or refinements?

**UPDATE:** You can find my courses listed on the MIT Splash catalog here. Thanks for all your helpful comments, everyone! This should be a lot of fun.

]]>

* *

I was 12 years old when I first encountered this quote by Samuel Beckett:

“Every word is like an unnecessary stain on silence and nothingness.”

That quote impressed me quite a bit at the time. It appeared to my young self to be simultaneously profound, important, and impossible to understand. Now, nineteen years later, I’m still not sure I understand what Beckett meant by that short sentence. But I nonetheless find that its dark Zen has worked itself into me indelibly.

The Beckett quote comes to mind in particular as I sit down to write again about quantum field theory (QFT). QFT, to recap, is the science of describing particles, the most basic building blocks of matter. QFT concerns itself with how particles move, how they interact with each other, how they arise from nothingness, and how they disappear into nothingness again. As a framing idea or motif for QFT, I can’t resist presenting an adaptation of Beckett’s words as they might apply to the idea of particles and fields:

“Every particle is an unnecessary defect in a smooth and featureless field.”

Of course, it is not my intention to depress anyone with existential philosophy. But in this post I want to introduce, in a pictorial way, the idea of particles as *defects*. The discussion will allow me to draw some fun pictures, and also to touch on some deeper questions in physics like “what is the difference between matter and antimatter?”, “what is meant by *rest mass energy*?”, “what are *fermions* and *bosons*?”, and “why does the universe have matter instead of nothing?”

***

Let’s start by imagining that you have screwed up your zipper.

A properly functioning zipper, in the pictorial land of this blog post, looks like this:

But let’s say that your zipper has become dysfunctional, perhaps because of an overly hasty zip, and now looks more like this:

This zipper is in a fairly unhappy state. There is a zipping defect right in the middle of it: the two teeth above the letter “B” have gotten twisted around each other, and now all the zipper parts in the neighborhood of that pair are bending and bulging with stress. You could relieve all that stress, with a little work, by pulling the two B teeth back around each other.

But maybe you don’t want to fix it. You could, instead, push the two teeth labeled “A” around each other, and similarly for the two “C” teeth. Then the zipper would end up in a state like this:

Now you may notice that your zipping error looks not like one defect, but like two defects that have become separated from each other. The first defect is a spot where two upper teeth are wedged between an adjacent pair of lower teeth (this is centered more or less around the number “1”). The second defect is a spot where two lower teeth are wedged between two upper teeth (“2”).

You can continue the process of moving the defects away from each other, if you want. Just keep braiding the teeth on the outside of each defect around each other. After a long while of this process, you might end up with something that looks like this:

In this picture the two types of defects have been moved so far from each other that you can sort of forget that they came from the same place. You can now describe them independently, if you want, in terms of how hard it is to move them around and how much stress they create in the zipper. If you ever bring them back together again, though, the two defects will eliminate each other, and the zipper will be healed.

My contention in this post is that what we call *particles* and *antiparticles* are something like those zipper defects. Empty space (the *vacuum*) is like an unbroken zipper, with all the teeth sewed up in their proper arrangement. In this sense, empty space can be called “smooth”, or “featureless”, but it cannot really be said to have *nothing in it*. The zipper is in it, and with the zipper comes the potential for creating pairs of equal and opposite defects that can move about as independent objects. The potential for defects, and all that comes with them, is present in the zipper itself.

Like the zipper, the quantum fields that pervade all of space encode within themselves the potential for particles and antiparticles, and dictate the rules of how they behave. Creating those particles and antiparticles may be difficult, just as moving two teeth around each other in the zipper can be difficult, and such creation results in lots of “stress” in the field. The total amount of stress created in the field is the analog of the *rest mass energy* of a particle (as defined by Einstein’s famous $E = mc^2$, which says that a particle with large mass takes a lot of energy to create). Once created, the particles and antiparticles can move away from each other as independent objects, but if they ever come back together all of their energy is released, and the field is healed.

Since the point of this post is to be “picture book”, let me offer a couple more visual analogies for particles and antiparticles. While the zipper example is more or less my own invention, the following examples come from actual field theory.

Imagine now a long line of freely-swinging pendulums, all affixed to a central axle. And let’s say that you tie the ends of adjacent pendulums together with elastic bands. Perhaps something like this:

In its rest state, this *field* will have all its pendulums pointing downward. But consider what would happen if someone were to grab one of the pendulums in the middle of the line and flip it around the axle. This process would create two defects, or “kinks”, in the line of pendulums. One defect is a 360 degree clockwise flip around the axis, and the other is a 360 degree counterclockwise flip. Something like this:

As with the zipper, each of these kinks represents a sort of frustrating state for the field. The universe would prefer for all those pendulums to be pulled downward with gravity, but when there is a kink in the line this is impossible. Consequently, there is a large (“rest mass”) energy associated with each kink, and this energy can only be released when two opposite kinks are brought together.

(By the way, the defects in this “line of pendulums” example are examples of what we call *solitons*. Their motion is described by the so-called sine-Gordon equation. You can go on YouTube and watch a number of videos of people playing with these kinds of things.)
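To make the kink a little more concrete, here is a rough numerical sketch. It uses the standard static kink solution of the sine-Gordon equation, $\phi(x) = 4\arctan(e^x)$, in units where the elastic-band stiffness and the pendulum weight are both set to 1 (these units, and the function names, are assumptions made for the illustration). It checks that the pendulum angle winds through one full turn along the line, and that the kink carries a finite lump of energy.

```python
import math

def kink(x):
    """Static kink profile: the pendulum angle rises from 0 to 2*pi along the line."""
    return 4 * math.atan(math.exp(x))

def energy_density(x, h=1e-5):
    # Twist energy stored in the elastic bands plus the gravitational energy
    # of a pendulum lifted away from pointing straight down.
    slope = (kink(x + h) - kink(x - h)) / (2 * h)
    return 0.5 * slope**2 + (1 - math.cos(kink(x)))

# Trapezoidal integration of the energy density over a window that is
# much wider than the kink itself.
n, lo, hi = 40000, -20.0, 20.0
dx = (hi - lo) / n
energy = sum(energy_density(lo + i * dx) for i in range(n + 1)) * dx
energy -= 0.5 * dx * (energy_density(lo) + energy_density(hi))

# Number of full turns the pendulums make from one end of the line to the other.
winding = (kink(hi) - kink(lo)) / (2 * math.pi)

print(f"winding = {winding:.6f}, kink energy = {energy:.4f}")
```

In these units the integral can also be done by hand and gives exactly 8, which plays the role of the kink's "rest mass" energy. The energy is concentrated within a few units of the kink's center, which is why the kink behaves like a localized object.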

In case you’re starting to worry that these kinds of particle-antiparticle images are only possible in one dimension, let me assuage your fears by offering one more example, this time in two dimensions. Consider a field that is made up of arrows pointing in the 2D plane. These arrows have no preferred direction that they like to point in, but each arrow likes to point in the same direction as its neighbors. In other words, there is an energetic cost to having neighboring arrows point in different directions. Consequently, the lowest energy arrangement for the field looks something like this:

If an individual arrow is wiggled slightly out of alignment with its neighbors, the situation can be righted easily by nudging it back into place. But it is possible to make big defects in the field that cannot be fixed without a painful, large-scale rearrangement. Like this vortex:

Or this configuration, which is called an *antivortex*:

The reason for the name *antivortex* is that a vortex and antivortex are in a very exact sense opposite partners to each other, meaning that they are created from the vacuum in pairs and they can destroy each other when brought together. Like this:

(A wonky note: this “field of arrows” is what one calls a *vector field*, as opposed to a *scalar field*. What I have described is known as the XY model. It will make an appearance in my next post as well.)
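One way to see why a vortex cannot be combed away, and why a vortex and an antivortex can cancel, is to count how many times the arrows rotate as you walk once around a closed loop. The sketch below does this for three hand-built arrow fields; the field definitions and the name `winding_number` are illustrative assumptions, not anything taken from the XY model literature.

```python
import math

def vortex(x, y):
    # Arrow direction circulates around the origin.
    return math.atan2(y, x) + math.pi / 2

def antivortex(x, y):
    # Arrows rotate the opposite way as you go around the origin.
    return -math.atan2(y, x)

def pair(x, y, a=1.0):
    # A vortex centered at (+a, 0) superposed with an antivortex at (-a, 0).
    return math.atan2(y, x - a) - math.atan2(y, x + a)

def winding_number(field, cx=0.0, cy=0.0, radius=3.0, steps=720):
    """Count net full turns of the arrows around a circle centered at (cx, cy)."""
    total = 0.0
    for i in range(steps):
        t0 = 2 * math.pi * i / steps
        t1 = 2 * math.pi * (i + 1) / steps
        a0 = field(cx + radius * math.cos(t0), cy + radius * math.sin(t0))
        a1 = field(cx + radius * math.cos(t1), cy + radius * math.sin(t1))
        # Wrap each small change in angle into [-pi, pi) before adding it up.
        total += (a1 - a0 + math.pi) % (2 * math.pi) - math.pi
    return round(total / (2 * math.pi))

print(winding_number(vortex))                            # loop around a vortex
print(winding_number(antivortex))                        # loop around an antivortex
print(winding_number(pair))                              # loop around both members of a pair
print(winding_number(pair, cx=1.0, cy=0.0, radius=0.5))  # loop around the vortex only
```

The count is an integer, so small wiggles of the arrows cannot change it. A loop that encloses both members of a pair sees a net count of zero, which is the sense in which the two defects can annihilate and leave a smooth field behind.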

***

Now that you have some pictures, let me use them as a backdrop for some deeper and more general ideas about QFT.

The first important idea that you should remember is that in a *quantum* field, nothing is ever allowed to be at rest. All the pieces that make up the field are continuously jittering back and forth: the teeth of the zipper are rattling around and occasionally twisting over each other; the pendulums are swinging back and forth, and on rare occasion swinging all the way over the axle; the arrows are shivering and occasionally making spontaneous vortex-antivortex pairs. In this way the vacuum is never quiet. In fact, it is completely correct to say that from the vacuum there are always spontaneously arising particle-antiparticle pairs, although these usually annihilate each other quickly after appearing. (Which is not to say that they never make their presence felt.)

If you read the previous post, you might also notice a big difference between the way I talk about fields here and the way I talked about them before. The previous post employed much more pastoral language, going on about gentle “ripples” in an infinite quantum “mattress”. But this post uses the harsher imagery of “unhappy defects” that cannot find rest. (Perhaps you should have expected this, since the last post was “A children’s picture book”, while this one is “Samuel Beckett’s Guide”.) But the two types of language were actually chosen to reflect a fundamental dichotomy of the fields of nature.

In particular, the previous post was really a description of what we call *bosonic* fields (named after the great Indian physicist Satyendra Bose). A bosonic field houses quantized ripples that we call particles, but it admits no concept of antiparticles. All excitations of a bosonic field are essentially the same as all others, and these excitations can blend with each other and overlap and interfere and, generally speaking, happily coincide at the same place and the same time. Equivalently, one can say that bosonic particles are the same as bosonic antiparticles. In a bosonic field with many excitations, all particles are one merry slosh and there is literally no way of saying how many of them you have. In the language of physics, we say that for bosonic fields the particle number is “not conserved”.

The pictures presented in this post, however, were of *fermionic* fields (after the Italian Enrico Fermi). The particles of fermionic fields, called fermions, are very different objects from bosons. For one thing, there is no ambiguity about their number: if you want to know how many you have, you just need to count how many “kinks” or “vortices” there are in your field (their number *is* conserved). Fermions also don’t share space well with each other: there is really no way to put two kinks or two vortices on top of each other, since they each have hard “cores”. These properties of fermions, together, imply that they are much more suitable for making solid, tangible matter than bosons are. You don’t have to worry about a bunch of fermions constantly changing their number or collapsing into a big heap. Consequently, it is only fermions that make up atoms (electrons, protons, and neutrons are all fermions), and it is only fermions that typically get referred to as *matter*.

Of course, bosonic fields still play an important role in nature. But they appear mostly in the form of so-called *force carriers*. Specifically, bosons are usually seen only when they mediate interactions between fermions. This mediation is basically a process in which some fermion slaps the sloshy sea of a bosonic field, and thereby sets a wave in motion that ends up hitting another fermion. It is in this way that our fermionic atoms get held together (or pushed apart), and fermionic matter abides.

Finally, you might be bothered by the idea that particles and antiparticles are always created together, and are therefore seemingly always on the verge of destruction. It is true, of course, that a single particle by itself is perfectly stable. But if every particle is necessarily created together with the agent of its own destruction, an antiparticle, then why isn’t any given piece of matter subject to being annihilated at any moment? Why do the solid, matter-y things that we see around us persist for so long? Why isn’t the world plagued by randomly-occurring atomic blasts?

In other words, where are all the antiparticles?

The best I can say about this question is that it is one of the biggest puzzles of modern physics. (It is often, boringly, called the “baryon asymmetry” problem; I might have called it the “random atomic bombs” problem.) To use the language of this post, we somehow ended up in a universe, or at least a neighborhood of the universe, where there are more “kinks” than “antikinks”, or more “vortices” than “antivortices”. This observation brings up a rabbit hole of deep questions. For example, does it imply that there is some asymmetry between matter and antimatter that we don’t understand? Or are we simply lucky enough to live in a suburb of the universe where one type of matter predominates over the other? How unlikely would that have to be before it seems *too* unlikely to swallow?

And, for that matter, are we even allowed to use the fact of our own existence as evidence for a physical law? After all, if matter were exactly as common as antimatter, then no one would be around to ask the question.

And perhaps Samuel Beckett would have preferred it that way.

]]>

You can read the post here.

As a teaser, I’ll give you my preferred picture of a Cooper pair:

which I think is an upgrade over the typical illustrations.

]]>

This may surprise you, but the answer to the question in the title leads to some profound quantum mechanical phenomena that fall under the umbrella of the Berry or geometric phase, the topic I’ll be addressing here. There are also classical manifestations of the same kind of effects, a prime example being the Foucault pendulum.

Let’s now return to the original question. Ultimately, whether or not one returns to oneself after walking in a closed loop depends on what you are and where you’re walking, as I’ll describe below. The geometric phase is easier to visualize classically, so let me start there. Let’s consider a boy, Raj, who is pictured below:

Now, Raj, for whatever reason, wants to walk in a rectangle (a closed loop). But he has one very strict constraint: he can’t turn/twist his body while he’s walking. So if Raj starts his walk in the top right corner of the rectangle (pictured below) and then walks forward normally, he has to start side-stepping when he reaches the bottom right corner to walk to the left. Similarly, on the left side of the rectangle, he is constrained to walk backwards, and on the top side of the rectangle, he must side-step to the right. The arrow in the diagram below is supposed to indicate the direction which Raj faces as he walks. When Raj returns to the top right corner, he ends up in exactly the same place, facing exactly the same direction, as when he started. Not very profound at all! But now let’s consider the case where Raj is not walking on a plane, but walking on the surface of a sphere:

Again, the arrow is supposed to indicate the direction that Raj is facing. This time, Raj starts his trek at the north pole, heads to Quito, Ecuador on the equator, then continues his walk along the equator and heads back up to the north pole. Notice that on this journey, even though Raj obeys the non-twisting constraint, he ends up facing a different direction when he returns to the north pole! Even though he has returned to the same position, something is slightly different. We call this difference anholonomy.
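Raj's walk on the sphere can be simulated in a few lines. The sketch below is a rough illustration, not a rigorous treatment: it carries a "facing" vector along the pole, equator, pole route in many small steps, and at every step simply projects the vector back onto the local tangent plane, which is one simple way to impose the no-twisting constraint. The function names and the choice of a unit sphere are assumptions made for the example.

```python
import math

def transport(v, points):
    """Carry the tangent vector v along points on the unit sphere without twisting."""
    for p in points:
        dot = sum(vc * pc for vc, pc in zip(v, p))
        v = [vc - dot * pc for vc, pc in zip(v, p)]   # drop the part sticking out of the surface
        norm = math.sqrt(sum(vc * vc for vc in v))
        v = [vc / norm for vc in v]                   # keep the arrow at unit length
    return v

def meridian(lon, colat_start, colat_end, n):
    """Points along a line of longitude, parametrized by angle from the north pole."""
    pts = []
    for i in range(1, n + 1):
        t = colat_start + (colat_end - colat_start) * i / n
        pts.append((math.sin(t) * math.cos(lon), math.sin(t) * math.sin(lon), math.cos(t)))
    return pts

def equator(lon_start, lon_end, n):
    pts = []
    for i in range(1, n + 1):
        lon = lon_start + (lon_end - lon_start) * i / n
        pts.append((math.cos(lon), math.sin(lon), 0.0))
    return pts

def holonomy_angle(delta_lon, n=2000):
    """Angle between Raj's initial and final facing directions at the north pole."""
    v = [1.0, 0.0, 0.0]   # at the pole, facing down the meridian he is about to walk
    path = (meridian(0.0, 0.0, math.pi / 2, n)           # pole -> equator
            + equator(0.0, delta_lon, n)                 # along the equator
            + meridian(delta_lon, math.pi / 2, 0.0, n))  # equator -> pole
    v = transport(v, path)
    return math.atan2(v[1], v[0])

# Walk a quarter of the way around the equator before heading back north.
print(math.degrees(holonomy_angle(math.pi / 2)))
```

For this particular kind of route, the mismatch angle comes out equal to the stretch of longitude covered along the equator, so a quarter-turn around the equator leaves Raj facing a quarter-turn away from where he started. Setting `delta_lon` to zero (straight down and straight back up) gives no mismatch at all.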

Why did anholonomy result in the spherical case and not the flat rectangle case? Amazingly, it turns out to be related to the hairy ball theorem (don’t ask me how it got its name). Crudely speaking, the hairy ball theorem states that if you have a ball covered with hairs, you can’t comb the hairs straight without leaving at least one little bald spot or tuft. In the image below you can see a little tuft on both the top and bottom of the sphere:

The sphere which Raj traversed had a little bald spot at the north pole, leading to the anholonomy.

Now before moving on to the Foucault pendulum, I want to explicitly state the items that were critical in obtaining the anholonomy for the spherical case: (i) The object that is transported must have a direction (i.e. be a vector); (ii) The object must be transported on a surface on which one cannot properly comb hair.

Now, how does the Foucault pendulum get tied up in all this? Well, the Foucault pendulum, swinging in Paris, does not oscillate in the same plane after the earth makes a full 24-hour rotation. This difference in angle between the original and the next-day plane of oscillation is also an anholonomy, except the earth rotates instead of the pendulum taking a walk. Check out this great animation from Wikipedia below:

If the pendulum were at the north (south) pole, its plane of oscillation would rotate through a full turn during one 24-hour rotation of the earth ($\theta = +(-)2\pi$), so the pendulum would come back to itself. If the pendulum were at the equator, it would not change its oscillation plane at all ($\theta = 0$). Now, depending on the latitude in the northern hemisphere where a Foucault pendulum is set up, the pendulum will make an angle between $0$ and $2\pi$ after the earth makes its daily rotation. The north pole case, in light of the animation above, can be inferred from this image by imagining the earth rotating about its vertical axis:
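To put numbers on the in-between latitudes: the standard textbook result, which is not derived in this post and is quoted here as an outside fact, is that the swing plane turns by $2\pi \sin(\text{latitude})$ per day. A small sketch, with a hypothetical function name and an approximate latitude for Paris:

```python
import math

def foucault_rotation_per_day(latitude_deg):
    """Daily rotation of the swing plane, in radians, at a given latitude."""
    return 2 * math.pi * math.sin(math.radians(latitude_deg))

# Approximate latitudes in degrees north; Paris is taken as about 48.85.
for name, lat in [("north pole", 90.0), ("Paris", 48.85), ("equator", 0.0)]:
    angle = foucault_rotation_per_day(lat)
    print(f"{name:10s}: {math.degrees(angle):6.1f} degrees per day")
```

The formula reproduces both limits described above (a full turn at the pole, nothing at the equator), and for Paris it gives a bit more than three quarters of a turn per day, which is why the pendulum there does not return to its original plane after 24 hours.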

The concept of anholonomy in the quantum mechanical case can actually be pictured quite similarly to the classical case described above. In quantum mechanics, we describe particles using a wavefunction, which in a very basic sense is also a vector. The vector does not exist in “real space” but in what physicists refer to as “Hilbert space”. Nonetheless, the geometrical game I played with classical anholonomy can also be played in this abstract “Hilbert space”. The main difference is that *the anholonomy angle becomes a phase factor in the quantum realm*. The correspondence is as follows:

$\theta$ (Classical) $\rightarrow$ $e^{i\theta}$ (Quantum)

The expression on the right is precisely the Berry/geometric phase. In the quantum case, regarding the two criteria above, we already have (i) a “vector” in the form of the wavefunction — so all we need is (ii) an appropriate surface on which hair cannot be combed straight. It turns out that in quantum mechanics, there are many ways to do this, but the most famous is undoubtedly the case of the Aharonov-Bohm effect.

In the Aharonov-Bohm experiment, one prepares a beam of electrons, splits the beam, passes the two halves on either side of a solenoid, and recombines them on the other side. A schematic of this experimental setup is shown below:

In the image, **B** labels the magnetic field and **A** is the vector potential. While many readers are probably familiar with the magnetic field, the vector potential may not be as household a concept. The vector potential was originally thought up by Maxwell, and considered to be a mathematical oddity. He realized that one could obtain **B** by measuring the curl of **A** at each point in space, but **A** was not given any physical meaning.

Now, it doesn’t immediately seem like this experiment would give one a geometric phase, especially considering the fact that the magnetic field *outside* the solenoid is *zero*. But let’s take a look at the pattern for the vector potential outside the solenoid (top-down view):

Interestingly, the vector potential outside the solenoid looks like it would have a tuft in the center! Criterion number (ii) may therefore potentially be met. The one question left to be answered is this: can the vector potential actually “rotate” the electron wavefunction (or “vector”)? The answer to that question deserves a post to itself, and perhaps Brian or I can fill that hole in the future, but the answer seems to be emphatically in the affirmative.

The equation describing the relationship between the anholonomy angle and the vector potential is:

$$\theta = \alpha \oint \mathbf{A} \cdot d\mathbf{l},$$

where $\theta$ is the rotation (or anholonomy) angle, the integral is over the closed loop of the electron path, and $\alpha$ is just a proportionality constant.

The way to think about the equality is this: $d\mathbf{l}$ is an infinitesimal “step” that the electron takes, much in the way that Raj took steps earlier. At each step the wavefunction is rotated a little compared to the previous step by an amount $\alpha \mathbf{A} \cdot d\mathbf{l}$, dictated by the vector potential. When I add up all the little rotations caused by $\mathbf{A}$ over the entire path of the electrons, I get the integral around the closed loop.
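The "add up all the little rotations" picture is easy to try numerically. The sketch below assumes an idealized, infinitely thin solenoid along the z-axis, for which the vector potential outside circulates around the axis with magnitude flux/(2πr); the flux is set to 1 in arbitrary units, the proportionality constant is left out, and the function names are hypothetical.

```python
import math

def vector_potential(x, y, flux=1.0):
    """Vector potential outside an ideal thin solenoid on the z-axis.

    It circulates around the axis with magnitude flux / (2 * pi * r).
    """
    r2 = x * x + y * y
    return (-flux * y / (2 * math.pi * r2), flux * x / (2 * math.pi * r2))

def loop_integral(cx, cy, radius, flux=1.0, steps=20000):
    """Sum A . dl over small steps around a circle centered at (cx, cy)."""
    total = 0.0
    for i in range(steps):
        t0 = 2 * math.pi * i / steps
        t1 = 2 * math.pi * (i + 1) / steps
        x0, y0 = cx + radius * math.cos(t0), cy + radius * math.sin(t0)
        x1, y1 = cx + radius * math.cos(t1), cy + radius * math.sin(t1)
        # Evaluate A at the midpoint of each little step.
        ax, ay = vector_potential((x0 + x1) / 2, (y0 + y1) / 2, flux)
        total += ax * (x1 - x0) + ay * (y1 - y0)
    return total

print(loop_integral(0.0, 0.0, 1.0))   # small loop around the solenoid
print(loop_integral(0.0, 0.0, 5.0))   # much larger loop around the solenoid
print(loop_integral(3.0, 0.0, 1.0))   # loop that does not enclose the solenoid
```

Any loop that encircles the solenoid should give the same total, equal to the enclosed flux, no matter how large the loop is, while a loop that misses the solenoid gives essentially zero. That is the sense in which the accumulated rotation depends only on whether the path goes around the "tuft".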

Now that we have the anholonomy angle, we need to use the classical-to-quantum correspondence from above. This gives us a phase difference of $\theta$ (a relative phase factor of $e^{i\theta}$) between the electrons that go to the right and left of the solenoid. Whenever there is a non-zero phase difference, one should always be able to measure it using an interference experiment, and this is indeed the case here.

An experiment consisting of an electron beam fired at a double-slit interference setup coupled with a solenoid demonstrates this interference effect most clearly. On the setup to the left, the usual interference pattern is set up due to the path length difference of the electrons. On the setup to the right, the entire pattern is shifted because of the extra phase factor from the anholonomy angle.

Again, let me emphasize that there is no magnetic field in the region where the electrons travel. This effect is due purely to the geometric effect of the anholonomy angle, a.k.a. the Berry phase, and the geometric effect arises in relation to the swirly tuft of hair!

So next time you’re taking a long walk, think about how much the earth has rotated while you’ve been walking and whether you really end up where you started — chances are that something’s just a little bit different.

]]>