If you guys want to see something awful: this paper. From the twitter discussion: For those who aren't normally exposed to mathematics writing: it normally isn't this dense. Maybe as incomprehensible to nonexperts, but not this dense.
So today on "Mathematicians are sometimes trolls", courtesy of Vladimir Arnol'd: Consider the following problem: given a right triangle of hypotenuse length 10, and altitude from the right angle to the hypotenuse of length 6, find the area of the triangle. For those unfamiliar with the terminology, the hypotenuse is the longest side of the right triangle, and the altitude in question is the shortest path from the right angled vertex to the hypotenuse. So the altitude forms a right angle with the hypotenuse. When given to a group of Russian students, they failed to compute the area, whereas a group of American students each instantly came up with the answer of 30, which is indeed what the formula for the area of a triangle gives in this situation. However, this is not an outcome that favors the American students. Why not? Spoiler Such a triangle can't exist. The altitude of a right triangle is at most half the length of the hypotenuse. The implication is that the American students simply applied the formula for the area of a triangle without actually applying any thought to the geometry of the question.
Clifford Algebras Let's talk about an alternative to the octonions. This will take a little bit of linear algebra, but hopefully not too much. Consider R^{n}. We want to define a way to multiply vectors. We have several things called "products" for vectors, but let's make a new one, called the Clifford product. We do this as follows: The product of a vector and a scalar is just the usual result: a[b, c,...] = [ab, ac,...]. The product of a vector with another vector is a little more tricky. Let's look at some examples. First is n = 1. We'll now form Cl_{1}(R)^{-}. So our only vectors are of the form [a]. Boring. But the product is [a][d] = -ad. Now we say that we can add scalars and vectors, so that a + is a Clifford element. Extending the product by the distributive law and the associative law, we get that Cl_{1}(R)^{-} looks like C. Now let's look at n = 2, i.e. Cl_{2}(R)^{-}. We say that [1,0][1,0] = -1, [0,1][0,1] = -1, and [1,0][0,1] + [0,1][1,0] = 0. That allows us to define [a,b][c,d] = -[ac + bd] + (ad-bc)[1,0][0,1]. [1,0][0,1] doesn't simplify any further; we call it a bivector. Note that multiplying 3 vectors together doesn't get us anything new. So we have four total basis objects: 1, [1,0], [0,1] and [1,0][0,1]. The latter three all square to -1, and anticommute: for any two of them, the product of the two in one order yields negative the product in the other order. So we've ended up with the quaternions again: Cl_{2}(R)^{-} = H. What about n = 3? What is Cl_{3}(R)^{-}? We now have 8 basic pieces, 1, [1,0,0], [0,1,0], [0,0,1], the bivectors we can make from the pairs of the basis vectors, and the trivector [1,0,0][0,1,0][0,0,1]. Well, it's not the octonions. For one thing, we're insisting on associativity. The octonions aren't associative. Also, our algebra isn't a division algebra: (1 + [1,0,0][0,1,0][0,0,1])(1-[1,0,0][0,1,0][0,0,1]) = 0, so we can't always divide. But it's associative! Which isn't bad. So what is this weird vector multiplication deal good for? Well, another name for the Clifford product is the geometric product. [1, 0][0, 1] defines a square, or at least an area. In particular, given two vectors [a, b] and [c, d], [a, b][c, d] = -ab-cd+(ad - bc)[1,0][0,1], so extracting the bivector part gives the area of the parallelogram that has the two vectors as sides. There's a sign, saying whether [a,b] is clockwise or counterclockwise from [c,d]. If we had vectors [a,b,c] and [d,e,f], the product would be a little weirder to interpret, but it's still made up of areas of parallelograms. [1,0,0][0,1,0][0,0,1] is a cube, or at least a volume, and the trivector part of [a,b,c][d,e,f][g,h,i] is the volume of the resulting parallelipiped times [1,0,0][0,1,0][0,0,1] along with a sign. But there's probably also a vector part to the product. In general, for k less than the dimension of your vector space R^{n}, the k-vector part of the product of k vectors is the k-volume of the resulting k-dimensional parallelipiped, along with a sign, along with lower-dimensional stuff. In general, Cl_{n}(R)^{-} is 2^{n} dimensional. We can also make variations, such as Cl_{n}(R)^{+}, where we have that, for a standard basis vector e_{i} = [0,0,...,0,1,0,...,0], e_{i}^{2} = 1, or Cl_{n}(C), where we take our vectors to be in C^{n}. For example, Cl_{1}(R)^{+} has [a][d] = ad. This is not equivalent to C because (1 + [1])(1 - [1]) = 0. Similarly, Cl_{1}(C)^{-} is not the quaternions, in particular because everything in it commutes. So we get a variety of algebraic structures based on vector spaces with geometric properties. We have some fun facts about these Clifford algebras. Notably, Cl_{n+8}(R)^{-} = Mat_{16}(Cl_{n}(R)^{-}), i.e. 16 by 16 matrices whose entries are elements of Cl_{n}(R)^{-}. Similarly, Cl_{n+8}(R)^{+} = Mat_{16}(Cl_{n}(R)^{+}), and Cl_{n+2}(C)^{-} = Mat_{2}(Cl_{n}(C)^{-}). This is sometimes called Bott periodicity, because of an analogy to a relationship between n-dimensional and n+8 dimensional topological objects. For those who have delved too far into physics, this is where we get spinors from. Cl_{2n} can be written as 2^{n} by 2^{n}-dimensional complex-valued matrices, and you can embed so(n) into Cl_{n}, and then you get that weird half-angle behavior from the corresponding groups. Also those two dots on the branching end of the D_{k} graph correspond to representations derived from Clifford algebras. There's also a version of the Fourier transform that uses Clifford algebras, recently popular in computer vision.[/c]
Question: Wait, huh? Cl_{2n}? not n? But Cl_{1}, or C, can be described by 2 by 2 matrices, while Cl_{2}, or H, needs 4 by 4 matrices to be described, right? I'd imagine Cl_{3} should be able to be described by 8 by 8 matrices, right?
Oh, sorry: complex-valued matrices. I'll fix that. Also the 2n-1 case also becomes 2^{n} by 2^{n} matrices. The example to keep in mind is the Dirac spinor, where the relevant Clifford algebra is Cl_{4}(R) (technically Cl_{1,3}(R)) and the spinors live in 4 dimensions.
There's a trick for constructing matrix versions of Clifford algebras. Take P, Q and R to be the three matrices [i, 0] [0,-i] [0,-1] [1, 0] [0, 1] [1, 0] Then you form Kronecker products, which I'll denote by x: e_{1} becomes P x I x I x...x I, e_{2} becomes Q x I x I x...x I, e_{3} = R x P x I x...x I, e_{4} = R x Q x I x...x I, e_{5} = R x R x P x I x...x I, e_{6} = R x R x Q x I x...x I, etc. Note that for 2n + 1, you can actually write your matrices as block diagonal matrices with 2 blocks. For Cl_{1}(R)^{-}, we get e_{1} becoming [i, 0] [0,-i] and thus splits into blocks. For Cl_{3}(R)^{-}, we get e_{1}, e_{2} and e_{3} becoming [i, 0, 0, 0] [0,-i, 0, 0] [0, 0, i, 0] [0, 0, 0,-i] [0,-1, 0, 0] [1, 0, 0, 0] [0, 0, 0,-1] [0, 0, 1, 0] [0, i, 0, 0] [i, 0, 0, 0] [0, 0, 0,-i] [0, 0,-i, 0] each of which split into two 2-by-2 blocks. This gives us the electron spin stuff.
Ramsey's theorem So suppose there's a party, one of those fancy high-society ones where people show up more because they're expected to than because they or anyone else actually wants them to be there. And maybe they know some people at this party. We're going to assume for simplicity that if person A knows person B, then person B knows person A. Conversely, if person A doesn't know person B, then person B doesn't know person A. The question we now ask is: how big can this party be if we have neither three people who all know one another nor three people who all don't know one another? So for instance, we can imagine a party of three people (pretty awful party, to be honest) where person A and person B know each other, and person B and person C know each other, but person A and person C don't know each other. Or alternatively Can we have a party of four people where no three all know one another and no three all don't know one another? How about five? Six? Spoiler: Answer The answer is 5, as you could probably figure out on your own with sufficient time and a bit of doodling. If you have 6 people, either you'll have three people who all know one another and/or three people who don't. We call a set of three who either all know one another or all don't know one another a 3-clique, although the term "clique" sounds a bit weird for a group of people who don't know each other. What if we ask about 4-cliques? I.e. for a group of four people who either all know one another or all don't know one another. How large a party can we have with no 4-cliques? Here the answer is probably not solvable without some serious computation, so I'll just tell you that it's 17. You can have 17 people without any 4-cliques, but you can't have 18. We call these numbers "Ramsey numbers", i.e. R(m, n) is the largest number of people you can have without having m people who all know one another or n people who all don't know one another. So R(3, 3) is in the spoiler above, R(4, 4) is 17. Mathematicians have, via arduous labor, determined that R(5, 5) is between 43 and 49. Those don't look like terribly large numbers to be unsure about, but here we are. Let's get a bit closer to how mathematicians view this. The question of whether we must have cliques in a group of k people can be viewed as saying: consider k points on the plane, and draw a line between every pair of points. We call this a complete graph on k points. Now color the lines either red or blue. A red m-clique is then a set of m points such that the lines joining those points to one another are all red, i.e. if we just look at those points and the lines joining them, we have a complete graph on m points that is entirely red. Similarly a blue n-clique is a set of n points such that the lines joining those points to one another are all blue. If we have k points, we end up with k(k-1)/2 lines between them, each of which can be red or blue. So if we do this the brue-force way and check every possible coloring one by one, we end up with 2^{k(k-1)/2} colorings to check. There are k!/m!(k-m)! sets of m points that might form a red clique, and we have to check each of those sets for each coloring, and there are k!/n!(k-n)! sets of n points that might form a blue clique. So suppose we look at 43. There are 2^{43(42)/2} = 2^{903} colorings and 962,598 sets of 5 points that must be checked to be a red or blue clique. Note that 2^{10} is about 10^{3}, so we're looking at about 10^{276} checks that need to be done if we're doing this by brute force. That's a lot of work! And it only gets worse from there. Paul Erdős, famous itinerant mathematician and pacifist, is remarked as having said: "Erdős asks us to imagine an alien force, vastly more powerful than us, landing on Earth and demanding the value of R(5, 5) or they will destroy our planet. In that case, he claims, we should marshal all our computers and all our mathematicians and attempt to find the value. But suppose, instead, that they ask for R(6, 6). In that case, he believes, we should attempt to destroy the aliens." -Joel Spencer Notably R(6, 6) is somewhere between 102 and 165. Ramsey's theorem states that R(m, n) is finite for any finite m and n; eventually you're going to have enough points that you'll need to have a red m-clique or a blue n-clique or both. In fact, it goes further: suppose instead of just two colors, we allowed more colors. Then Ramsey's theorem says that for any number of colors and any size clique for each color, there is a finite number such that you'll eventually have at least one single-colored clique of the corresponding size. Ramsey Numbers are part of a general area of mathematics called Ramsey Theory, whose basic idea is an extrapolation of Ramsey's theorem: if you have enough stuff, then no matter how random the stuff is some part of it has to exhibit order. In our case, the stuff is points and the colored lines between them, and the ordered parts are the cliques. You may have heard of Graham's Number, once touted as the largest finite number used in a mathematical proof*; it was an upper bound for a problem in Ramsey Theory. Note that a 2-clique is just two points and the the line joining them, so Ramsey's theorem talks about getting m-cliques out of 2-cliques. But we can consider n-cliques and look for m-cliques made from those n-cliques. This is the focus of the strengthened finite Ramsey's theorem: "For any finite positive n, k and m, we can find N such that if we each color n-element subset of S = {1,2,...,N} with k colors, then we can find a subset Y of S with at least m elements such that all n-element subsets of Y have the same color, and the number of elements of Y is at least the smallest element of Y." That last condition is kind of weird, and if we drop it then we can prove this theorem from the regular Ramsey's theorem. However, with that condition, while it is provable in second-order arithmetic, if it were provable in Peano arithmetic then the strengthened finite Ramsey's theorem would prove the consistency of Peano arithmetic. So by Godel's Incompleteness theorem, it can't be proven in Peano arithmetic. This is quite unlike the original statements that we saw for incompleteness, which were pretty awful workarounds for self-reference and which I didn't even bother to spell out fully because it would be too unwieldy. The strengthened finite Ramsey's theorem is just a purely arithmetic statement about the positive integers. In fact, while the least value of N for a given n, k and m can be computed from those three parameters, as a function of those three parameters it grows so fast that Peano arithmetic can't even prove that the function is defined for all values of n, k and m. This gives a glimpse into a weird correlation, in which an axiomatic system's ability to prove the existence of large infinite sets is somehow related to its ability to describe large finite numbers. There are actually three levels: large uncountable cardinals, large countable ordinals, and large finite numbers. The abilities to describe/prove the existence of each of the three types are highly correlated for reasons that I don't understand at all. *It wasn't actually used in the proof; it was a simplified explanation, describing a much larger number than the one actually used in the proof, given to Martin Gardner, popularized in Gardner's column in Scientific American. It (the larger, simplified one) has since been surpassed by other large numbers in proofs.
http://www.satirev.org/harvard/math-department-releases-classification-safe-spaces#.WEd9FvUo7Js Will possibly explain the jokes at some point.
Pffffahahahaha, I get maybe two thirds of the jokes and it's hilarious. The bit about the precalc textbook was a cheap shot, though.
omfg that was hilarious. The Mobius strip one, jesus. Orientations, Boundaries, Limits, Norms, JESUS CHRIST Holy fuck I love this. So fucking much.
The Hairy Ball theorem* Since @budgie mentioned learning about this theorem today, everyone else gets to learn about it today. Anyway, suppose you have a circle. At a point on the circle, you draw a line tangent to the circle at that point. Starting at the point on the circle, which we'll call the "basepoint", you can go some length in either direction along the segment. Pick a direction and call the point you end up at the "endpoint". Now you want to do this for each point on the circle: from each point, go some length along a tangent line in some direction. Moreover, you want to pick these lengths and directions so that the endpoints are "continuous", in the sense that for basepoints that are near each other, the corresponding endpoints should be near to each other. A fairly simple way to do it is just to have the direction always be clockwise and always go one unit, or always counterclockwise and always go one unit. Another way to look at it is that if you just look at all the endpoints, they should form a circle somewhat larger than the original circle. We call such a set of continuous choices a vector field on the circle, and because the lengths are always nonzero we call this a nonvanishing vector field. The Hairy Ball theorem refers to what happens when we try to do something similar for a sphere. Instead of tangent lines we have tangent planes, so now for each basepoint there are lots of directions to pick from rather than just two. So for each point on the sphere, you pick a length and a direction along the tangent plane and go that length in that direction to get an endpoint. And again you want to pick your lengths and directions so that the collection of endpoints you get form a shape without any edges or holes; maybe not a perfect sphere, but not obviously missing anything. Again we call such a choice of lengths and directions a vector field on the sphere. Anyway, the Hairy Ball theorem says that if you do this, then the vector field has to vanish somewhere: there is at least one point on the sphere where the chosen length for that point is 0. The name is because if you suppose the sphere is covered in hair and you tried to comb the hair so that it all lay flat, you couldn't do it in a continuous manner unless there's a bald spot. This is an interesting theorem because it's topological; if you deform the sphere in a way that doesn't give it any sharp corners or sharp edges and doesn't tear or glue anything, and then again do this endpoint-picking process to your new shape, again you'll have to have some endpoint on the shape itself. And it's not a matter of dimension, either. Consider, say, a torus, i.e. the surface of a donut. Like the circle, you can just pick your directions to go clockwise around the donut hole, and pick your lengths to all be 1, and you'll have a nonvanishing vector field, but the torus has dimension 2. Moreover, if you take several donuts and glue them together and take the surface of the result, and again try to make a vector field, it has to vanish somewhere. There's a easy way to determine whether a given, nice-enough shape can have a nonvanishing vector field, called the Euler characteristic. A quick, dirty explanation of this: Take a bunch of points. Take a bunch of curves that start and end at those points. Take a bunch of sheets and glue them in so that the boundaries of the sheets completely attach to the points and curves we drew earlier; no sheet can have any edges unattached. Take a bunch of 3-volumes and glue them in so that the boundaries of those volumes completely attach to the points and curves and sheets we drew earlier. Etc. We call the n-dimensional pieces that we glue in the n-cells of the object M. Now we define the Euler characteristic χ(M) to be the number of even-dimensional cells minus the number of odd dimensional cells. So consider the circle. We start with a single 0-cell (a point), and a single 1-cell that starts and ends at that point, and we get a circle. So the circle has Euler characteristic 0. The sphere we can get by taking a 0-cell, and a single 1-cell that starts and ends at that point, and two 2-cells that we turn into domes that we glue to the circle formed by the point and the curve. So the sphere has Euler characteristic 3 - 1 = 2. The torus we can get by taking a single 0-cell, and two 1-cells, and a single 2-cell, and being clever about gluing in the 2-cell, so it has Euler characteristic 0. If you glue n donuts together and take the surface of the result, you can construct that surface from a 0-cell, 2n 1-cells, and a single 2-cells, so we get Euler characteristic 2 - 2n. And the objects for which we can make a nonvanishing vector field have Euler characteristic 0. In fact, the Euler characteristic gives more information; it says what has to happen near the points where the vector field vanishes if it does (well, in aggregate). Of note is the fact that an n-dimensional sphere has Euler characteristic 0 if n is odd and 2 if n is even. You can see this by saying that you have an (n-1)-dimensional "equator" which is an (n-1)-sphere, and then you glue two n-cells. If n is even, then we get that χ(S^{n}) = χ(S^{n-1}) + 2, whereas if n is odd, we get that χ(S^{n}) = χ(S^{n-1}) - 2. Starting with the fact that for the circle S^{1}, χ(S^{1}) = 0, we get that the Euler characteristics of the spheres alternate between 0 and 2. So that means that we should be able to find a nonvanishing vector field on the 3-sphere, and indeed we can! Consider the quaternions. They form a 4-dimensional space, so we can just take the unit-length quaternions and we get a 3-sphere. Now pick an imaginary quaternion, something of the form q = ai + bj + ck. Then for every point p on our 3-sphere, we pick our endpoint to be p + pq. Because q is imaginary, pq is perpendicular to p, and thus p + pq lies on the tangent space to the 3-sphere at p. Because the function p -> p + pq is a polynomial, the result is continuous, and so we get a nonvanishing vector field. Even better, we can pick three different nonvanishing vector fields, say, p + pi, p + pj, p + pk, so that at each point p, pi, pj and pk are all linearly independent (over the real numbers) and hence form a basis of the tangent space at p. We say that an n-dimensional object is parallelizable if it has n nonvanishing vector fields such that at each point p, the corresponding vectors form a basis of the tangent space. This is stronger than simply having a single nonvanishing vector field. Every odd-dimensional sphere has a nonvanishing vector field. But the only spheres that have parallelizations are those in dimensions 0**, 1, 3 and 7, which you'll note as being 1 - 1, 2 - 1, 4 - 1, and 8 - 1. And indeed, the normed division algebras do give us parallelizations via basically the same trick. Showing that you can't do it for any other spheres is hard. *I'm pretty sure this was named deliberately. ** For a 0-dimensional object, the definition of parallelizable then asks for 0 nonvanishing vector fields, which the 0-dimensional object is always happy to provide.