# The simplest derivation of the Pythagorean theorem

Sometimes I am amazed by the permanence of mathematical discovery. Math, it seems to me, is unique among the creative intellectual pursuits (science, art, engineering) in the seemingly unlimited lifetime of its innovations.

For example, Aristotle was a brilliant natural philosopher, as much a genius as just about any modern scientist, and he advanced (what would become) physics tremendously during the 4th century BC. But by now his theory of the five elements is completely unnecessary for anyone to learn. While it produced an important advancement in our thinking, it has been replaced by more correct physical theories. Thus, Aristotle suffered the same fate that meets seemingly every scientist or inventor eventually: further discoveries made him obsolete.

Pythagoras, on the other hand, who lived roughly 200 years before Aristotle, is someone whose major contribution to mathematics is still used every day. I literally could not do my job without the Pythagorean theorem, and neither could just about any scientist or engineer. Unlike nearly all other kinds of innovations, it has very much not been replaced.

What’s important to notice is not just that Pythagoras’s result is still important, but that the *type of reasoning* that leads to his result is still important. Put simply, a good scientist or engineer needs to be capable of understanding and reproducing a derivation of the 2500-year-old Pythagorean theorem, not just because the theorem is important, but because that level of logical thinking is necessary for his/her job.

So in this post I think it’s worth sharing my own favorite derivation of the Pythagorean theorem. This derivation is the simplest one I know of, and it doesn’t require any tremendous geometric cleverness (like a tangram puzzle) or complicated diagrams. Instead, it relies only on a very basic use of scaling arguments.

Scaling arguments are among the simplest and most powerful tools in theoretical physics. They allow you to reach remarkably concrete conclusions about a problem even when you know essentially no details about the system in question. The key idea is to imagine scaling the system up or down in size, and then saying something about how it should change as you do so.

For example, suppose you don’t know anything about triangles except that they have an area. Since area is measured in units of length squared, you can immediately say that if you take some triangle and make its lengths $k$ times bigger, then its area must get $k^2$ times larger.

In other words, if a given triangle has area $A$, then the same triangle magnified two times must have area $4A$.

Meanwhile, all the side lengths of the bigger triangle are exactly two times longer than for the smaller one.

What all this means is that, for a given triangle, the area is proportional to the square of any one of its side lengths. I know this because as I make the triangle $k$ times bigger, the side lengths all get $k$ times longer, and the area gets $k^2$ times bigger. So if I want I can write

$$\text{area} = (\text{something}) \times (\text{side length})^2.$$
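The scaling claim is easy to check numerically. The sketch below is only an illustration: the triangle's coordinates are arbitrary, the helper names are made up for this example, and the longest side is used as the reference side length (any side would work equally well).

```python
import math

def triangle_area(p, q, r):
    """Area of the triangle with vertices p, q, r (shoelace formula)."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2

def longest_side(p, q, r):
    """Length of the longest side of the triangle p, q, r."""
    return max(math.dist(p, q), math.dist(q, r), math.dist(r, p))

def scale(points, k):
    """Magnify a list of points by the factor k about the origin."""
    return [(k * x, k * y) for x, y in points]

def something(points):
    """The ratio area / (side length)^2 for this particular shape."""
    return triangle_area(*points) / longest_side(*points) ** 2

# An arbitrary triangle, chosen only for illustration.
triangle = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]

# Doubling every length should multiply the area by four.
area_ratio = triangle_area(*scale(triangle, 2)) / triangle_area(*triangle)

# The "something" should not change, whatever the magnification.
ratios = [something(scale(triangle, k)) for k in (1, 2, 3.5, 10)]

print(area_ratio)
print(ratios)
```

The printed ratios agree up to floating-point rounding, which is all the scaling argument claims: the "something" depends on the shape of the triangle, not on its size.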

The “something” in that equation depends on the angles in the triangle, but for now let’s assume that I am more or less completely ignorant about triangles and I can’t tell you what it is. Luckily enough for ignorant me, it turns out I don’t need to know what the “something” is in order to prove the Pythagorean theorem.

The key trick is to divide a right triangle into two smaller triangles that have exactly the same shape as the original. That is, take any right triangle and draw one line (the altitude from the right angle to the hypotenuse) so that it gets divided into two smaller triangles.

You can tell that the two newly-created triangles are just scaled-down versions of the original one, because they have all the same angles: each one shares an acute angle with the original and has a right angle where the altitude meets the hypotenuse. This means that the area of the original triangle can be written as the sum of the areas of two smaller triangles of exactly the same shape.
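The decomposition can also be checked numerically. This is a minimal sketch under assumed values: a right triangle with legs 3 and 4 placed with its right angle at the origin, and hypothetical helper names. It drops the altitude onto the hypotenuse and compares area / (hypotenuse)^2 for the original triangle and for the two pieces.

```python
import math

def area(p, q, r):
    """Area of the triangle with vertices p, q, r (shoelace formula)."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (r[0] - p[0]) * (q[1] - p[1])) / 2

def altitude_foot(c, a, b):
    """Foot of the perpendicular dropped from point c onto the line through a and b."""
    abx, aby = b[0] - a[0], b[1] - a[1]
    t = ((c[0] - a[0]) * abx + (c[1] - a[1]) * aby) / (abx**2 + aby**2)
    return (a[0] + t * abx, a[1] + t * aby)

# Assumed example: legs of length 3 and 4, right angle at C.
leg_a, leg_b = 3.0, 4.0
C = (0.0, 0.0)
B = (leg_a, 0.0)
A = (0.0, leg_b)
D = altitude_foot(C, A, B)  # where the altitude meets the hypotenuse AB

big = area(A, B, C)      # original triangle, hypotenuse AB
small_1 = area(C, B, D)  # piece whose hypotenuse is the leg CB
small_2 = area(C, A, D)  # piece whose hypotenuse is the leg CA

# area / (hypotenuse)^2 for each of the three triangles
something_big = big / math.dist(A, B) ** 2
something_1 = small_1 / math.dist(C, B) ** 2
something_2 = small_2 / math.dist(C, A) ** 2

print(big, small_1 + small_2)
print(something_big, something_1, something_2)
```

The two pieces add up to the original area, and all three ratios come out the same (up to rounding), as expected for triangles that differ only in size.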

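Finally, to prove the Pythagorean theorem, we just have to invoke the one equation in this post for each triangle, using the hypotenuse as the side length. Call the legs of the original right triangle $a$ and $b$ and its hypotenuse $c$. The two smaller triangles then have hypotenuses $a$ and $b$, and since their areas add up to the area of the original, this gives:

$$(\text{something}) \times c^2 = (\text{something}) \times a^2 + (\text{something}) \times b^2.$$

Since all the triangles have the same shape, all the “something”s are also the same, which means

$$c^2 = a^2 + b^2.$$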

Not bad, eh?

I don’t know whether you found the above proof “aesthetic,” but I certainly did. And it’s a pretty nice feeling to think that an insight had by someone more than 2,500 years ago can still feel beautiful to someone like me. And even more remarkably, that my life (and professional career) continue to profit from it.

## Footnote

I learned the proof above from Leonid Levitov. As it happens, he presented it during a talk about atomic collapse!

UPDATE: A number of readers have pointed out that they learned this argument from Migdal’s wonderful book *Qualitative Methods in Quantum Theory* (which is probably where Levitov learned it also).
