Math(s)! Currently: Summer of Math Exposition

Exohedron · Dec 30, 2018

I really like Wikipedia's proof of why the determinant is multiplicative: first you show that any n-linear, alternating function F from kⁿ to k is determined by the value it takes on the identity matrix I_n (viewed as n vectors), and thus that the determinant is the unique such function that takes the value 1 at I_n, and that for any other n-linear alternating functions F, F applied to M is just F(I_n)det(M) by linearity in the space of multilinear functions.
Then you note that the function d_A that takes a matrix M to det(AM) is such a function, and takes the value det(A) at M = I_n, and so in general is just det(A)det(M).

Step one takes a little effort, though, mostly to convince the reader that the bookkeeping works out.

Exohedron · Jan 19, 2019 at 1:27 AM

A slight divergence from pure math stuff, because I attempted to explain some of this to my parents recently:

Digital Signatures

You want to send someone a letter, and you want them to be certain that you sent the letter. Conversely, you don't want people sending things under your name.
If we have a physical letter, the traditional method is to append a signature or a seal, some symbol or sign that your intended recipient can recognize as yours and that can't be produced by anyone else.
But forgery is a thing, duplicating someone's handwriting or copying a seal. Most of us don't have to worry about that on the physical plane, because that takes effort that aren't worth it for most people.
But there are types of forgery that do pay well. For instance, art forgery; learning someone's painting style, techniques, materials well enough to claim that you've "found" a lost piece.

In the digital realm, we send a lot of messages, important ones, and it's very easy to copy bits. So we need a good way of signing things that is more than just appending a bunch of bits containing your name.

Fortunately, we have some decent public key encryption systems.
For a recap: in a public key system, you have some algorithm E that takes a key K_E and a message p and produces an encrypted string c, and a different algorithm D that takes a different key K_D and the encrypted string c and produces the message p. This is different from more traditional, symmetric key cryptography, wherein the two keys are the same. In public key cryptography, because the encryption key is distinct from the decryption key, you can tell people the encryption key, and they can send you encrypted messages using the algorithm E and the key K_E, but as long as you keep the decryption key secret, nobody can read those encrypted messages except for you, even if they know the algorithm D.

So how does that help with signing things?
A digital signature works similarly to public key encryption. You, the signer, have a key K_S and a key K_V, and you release K_V to the public. There is an algorithm S which takes a message m and your key K_S, and produces a signature h, and an algorithm V that takes a message m, a supposed signature h, and a key K_V, and verifies whether the signature would be produced by S given the message and the key K_S.
Note that m also gets transmitted. The reason for this is that we want the signature to depend on the message in some fashion. Otherwise a forger could take a valid signature from one message and paste it onto a fake message, like ripping a wax seal from one letter and gluing it onto a fake letter. At the very least, the verifier wants to know that not only did they get a message from the correct person, they got the correct message from that person. This is called authentication, and is slightly different from identifying. But it's still important, because someone might intercept a legitimate message from you, mess with the contents of the message without touching the signature, and then send it onward, and you find yourself promising money to people you didn't mean to.
Also, for the verifier's benefit, a message with a valid signature can't be repudiated by the sender, because nobody else could have produced a valid signature for that message; hence you can't deny having signed and therefore presumably having sent that message. But this is usually considered a less important aspect compared to identification and authentication.

Having a public key encryption system actually directly helps with making a digital signature scheme. Your standard public key setup promises that
p = D(K_D, E(K_E, p)))
so that decrypting with the proper decryption key undoes an encryption with the proper encryption key. But a lot of them also have the property that
c = E(K_E, D(K_D, c))
i.e. that encrypting with the proper encryption key undoes a decryption with the proper decryption key. This is not always the case, but it is true for a lot of public key encryption systems, in particular RSA.
So, to sign a message, you take the message m, you decrypt it using K_D to get some "decrypted" h, and you use h as your signature. The verifier, knowing the corresponding key K_E, encrypts h and sees if it matches m.
Voila, a signature. If a would-be forger doesn't know your decryption key K_D, they won't be able to "decrypt" m to get h, and so won't be able to produce a valid signature. Moreover, if they don't have the correct m, then the encrypted signature won't match. So we get identification and authentication.

Okay, so I'm leaving a step out. We don't want to decrypt the entire message, because messages can be long but your signature should be short. So we cheat a little. Instead of decrypting the message, we hash the message, using a complicated function that takes in messages and produces short strings of nonsense. Then we decrypt the hashed message to get a short signature. Then the verifier compares the encrypted signature with the hash of the message.
The danger here is if two messages generate the same hash in a predictable manner. Fortunately, we also have decent hash functions.

So that's what your computer is doing under the hood every time it tries to send a message that needs to be trustworthy. It hashes the message, turns the hash into a signature, and then sends the message and the signature; conversely, every time it receives a signed message, it processes the signature, hashes the message, and checks if it got matching results. If the signature corresponds to the message, great. If not, someone is trying to pull a fast one. Or something got garbled during transmission; either way, don't trust the message.

Now, I'm sure some of you are thinking "wait, isn't RSA going to be insecure when we have quantum computers? Aren't all of our public key systems currently in use going to be insecure against quantum computation? Doesn't that mean that our digital signature algorithms will be insecure?" And the answer is yes.

So in the future, possibly the near future, we are going to need both encryption algorithms that will stand up against a quantum computer, and digital signature schemes that will stand up against a quantum computer. It seems, from what I can tell, that signature schemes are slightly easier to come up with than encryption algorithms, due to how they handle what each person needs to know, but it's still a hard problem.

Exohedron · Jan 26, 2019

For reasons, I am currently liveblogging a math paper in my bijou blogette here. The paper, Homological Tools for the Quantum Mechanic, by Tom Mainiero, is about detecting factorizability of tensors via cohomology, available on the arXiv here.
If you want to see someone reacting in uninformative ways to what is really a small book (126 pages) then you can follow the tag #MathPaperLiveBlog.*

* I don't know if you can actually follow in a meaningful fashion other than watching the thread and/or searching for "#MathPaperLiveBlog" in the search bar.

Exohedron · Feb 6, 2019 at 10:47 PM

Today I learned that a map between Poisson manifolds that preserves the Poisson structure is called an ichthyomorphism.

Exohedron · Feb 25, 2019 at 10:52 PM

Given that I seem to be an algebraic differential geometer who was unfortunately mostly raised by analytic differential geometers, the discovery that torsion can be explained in terms of exterior derivatives applied to differential forms makes me very happy. I don't know what is meant by "twisting" in a differential-geometric sense, but the discrepancy between a covariant exterior derivative and the standard exterior derivative makes sense as a thing.

Exohedron · Mar 28, 2019

I learned a great π fact today.

π is the minimum value of π

On the plane, we can define what are called "L^p" norm, by which we mean the following: first, pick a real number p between 1 and ∞.
Given a vector with coordinates (x, y), we say that the L^p norm of (x, y) is (|x|^p + |y|^p)^1/p.
So the usual Euclidean norm is the L² norm, and the L¹ norm is just |x| + |y|.
There's also what we call the L^∞ norm, which gives max(|x|, |y|).
We can define the L^p-distance between two points (x₁, y₁) and (x₂, y₂) as the L^p norm of the vector difference between the two points, i.e.
d_p((x₁, y₁), (x₂, y₂)) = (|x₁ - x₂|^p + |y₁ - y₂|^p)^1/p

Now we can define a unit L^p circle to be the set of points of L^p-distance 1 from a given point. We can also define a notion of L^p-length of a curve using calculus and substituting in p where we normally would see 2 in the formula.

So now we can ask "What is the L^p-length of a unit L^p-circle?" We can divide that value by 2 and call the resulting number π_p.
So for instance, π₁ is 4, a fact that I'll let you puzzle out on your own. π₂ is the familiar 3.14... π_∞ also ends up being 4.

So here's the fun π fact: π, by which we mean π₂, is the minimum value of π_p with regards to all values of p between 1 and ∞.

Exohedron · Mar 21, 2019 at 9:06 PM

Given a polynomial with complex coefficients, we can consider the polynomial both as a bunch of coefficients but also as a function that takes complex numbers to complex numbers. Note that the product of any two polynomials is again a polynomial, and that if two polynomials aren't 0 everywhere, then their product isn't 0 everywhere.
This also works with real numbers, talking about polynomials in terms of either their coefficients or their values.

Something funny happens when we look at a finite field. Suppose that we look at a field F_p with p elements for some prime number p. Then Fermat's little theorem states that x^p = x for any x in F_p. We have that x is a nonzero polynomial, and x^p-1 - 1 is a nonzero polynomial since evaluating it at x = 0 gives -1. But the product of these two functions gives x^p - x, which always takes on the value 0 for any x in F_p.
So we two nonzero polynomials that appear to multiply to something that always evaluates to 0?
The problem that we have here is that we're very, very far away from being algebraically closed, and so the nonzero-ness of x^p - x is hidden from us. If we move to a field extension of F_p, then we get that it is no longer true that x^p - x always evaluates to 0; for elements of the extension that aren't in F_p, x^p - x yields a nonzero value.
However, there are still lurking issues. Suppose that we extend F_p by adding in roots of some irreducible degree k polynomial. Then we end up with the field F_p^k. So now most elements don't satisfy x^p - x = 0. However, everything satisfies x^{p^k} - x = 0, and so now we have that the product of x and x^{p^k-1} - 1 always evaluates to 0.
So we have to move to an even bigger extension, and at this point we might as well go all the way to the algebraic closure of F_p just in case.

Anyway, I was originally going to talk about the group structure of CP^∞ in terms of rational functions and roots and poles, and how that totally doesn't work for F_pP^∞ due to the problem of hidden nonzeros except I kind of lost the thread. I'm still learning how weird algebraic geometry is when you're so far away from being algebraically closed.

Exohedron · Mar 24, 2019

More fun with modular arithmetic.

Newton's method and Hensel's lemma

Suppose that you wanted to find a solution to the equation x³ + 2x + 1 = 0 mod 2873. Don't ask me why.
How would you find one? Well, first thing you do, as you do in all questions about modular arithmetic, is you factor the modulus.
2873 = 13² * 17.
According to the Chinese Remainder theorem, if you can find a solution modulo 13² and find a solution modulo 17, you can combine them into a solution modulo 2873.
But you can't use the Chinese Remainder theorem to combine a solution modulo 13 with a solution modulo 13 to get a solution modulo 13².
So what do you do?

Let's take a digression into the real numbers.

Suppose you have a function f and you want to find a point x such that f(x) = 0.
For some particular functions we have formulas for finding such points, but for most functions we don't.
But we can find approximate solutions in some cases.
Suppose we guess a point x₀. f(x₀) probably isn't 0, and probably isn't close enough to 0. Can we make a better guess? Well, assume that f is linear. If f has slope m, then we have that
f(x₀ - f(x₀)/m) = 0
So we set
x = x₀ - f(x₀)/m
And we're done!
But what if f is not linear? Well, then we just assume that f is approximately linear, i.e. differentiable. Then the slope of the line that approximates f at the point x₀ is f '(x₀).
So we get that we should look at
x₁ = x₀ - f(x₀)/f '(x₀)
and this might be a better approximation to x.
So we compute f(x₁), and if it isn't close enough to 0 then we set
x₂ = x₁ - f(x₁)/f '(x₁)
And so on.
This is called Newton's method for finding zeros of a function.
In general this will serve us well, unless we accidentally have that f ' = 0 at one of these points. If that happens, we pick another starting point.
Newton's method requires a decent enough function, by which we mean has continuous second derivative and has a good Taylor approximation. It also requires that we don't pick a really bad x₀. Also it requires that a solution actually exist; otherwise we might get all sorts of nonsense.

Now back to modular arithmetic.

Suppose we have the function f(x) = x³ + 2x + 1, and we want to find a solution to f(x) = 0 not in the real numbers, but in the integers modulo 13².
We could try finding one directly, either through testing each possible number or just using the cubic formula, but let's try to use a slightly different tack.
Suppose that we had a solution to f(x) = 0 mod 13². Then since 13 divides 13², x mod 13 should be a solution to f(x) = 0 mod 13. So let's find x mod 13.
Modulo 13, we get that the only solution is x = 2.
So now let's try to lift that to a solution modulo 13². We know x mod 13 is 2, so our actual solution modulo 13² is going to be 2 + 13t for some t. Let's plug that into f and see what we get:
f(2 + 13t) = (2 + 13t)³ + 2(2 + 13t) + 1
= 8 + 3*2²*13t + 3*2*(13t)² + (13t)³ + 2*2 + 2*13t + 1
= 13 + 3*2²*13t + 3*2*13²t² + 13³t³ + 2*13t
Since we're working modulo 13², the t² and t³ terms vanish, since both have coefficients that are divisible by 13². So we're left with
13 + 3*2²*13t + 2*13t = 13 + (3*2² + 2)*13t mod 13²
3*2² + 2 = 14, so we get
13 + 14*13t = 13 + 13t mod 13²
So t = -1, and our actual solution is 2 - 13 = -11 = 158 mod 13².
And you can check that this is an actual solution to f(x) = 0 mod 13².

Note what happened: the terms with higher powers of t vanished due to the modulus, and we were left with a linear equation in t. Solving the linear equation allowed us to get from a solution mod 13 to a solution mod 13².
And what linear equation did we end up with? Well, we were looking at solving
8 + 2*2 + 1 + (3*2² + 2)*13t = 0
which we can read as
f(2) + f '(2)*13t = 0
So we wanted that 13t = -f(2)/f '(2).
That term should look somewhat familiar. Indeed, it's the correction term from Newton's method. Our initial "approximate solution" was 2, and then we corrected by -13, which is indeed what we would get from computing -f(2)/f '(2) modulo 13².

So why does this work? The answer is that we are doing the same thing that we did above, approximating our function by a linear one and solving that to get a more accurate answer.
When we computed f(2 + 13t) and got rid of the terms that were divisible by 169, we got rid of anything that was divisible by t² and that gave us the linear approximation f(2) + f '(2)*13t.
So what's the notion of accuracy at work here? The p-adic one!
Recall that the p-adic absolute value for an integer is given by 1 over the highest power of p that divides that integer, and that the p-adic distance between two integers is 1 over the highest power of p that divides their difference.
When we're working mod p, we only get accuracy to within 1/p, Good enough to get initial approximations. Then when we move to mod p², we're asking for accuracy within 1/p², and the correction term -f(x)/f '(x) gives us that extra accuracy.
Note that this also works for moving from mod p² to mod p³: again we take the solution x₀ that works mod p², and we look for a solution of f(x₀ + p²t) = 0 mod p³, and we end up with
f(x₀) + f '(x₀)p²t = 0 mod p³
because all of the terms that involve higher powers of t also involve at least p³.
And so we get
p²t = -f(x₀)/f '(x₀)
and that gets us the solution mod p³.
So if we wanted to find a solution to x³ + 2x + 1 = 0 mod 13³, we would compute
158 - f(158)/f '(158) = 158 - 1014/196 = 158 - 1014 = 1341
which you can check does indeed solve f(x) = 0 modulo 13³.
And of course, finding a solution mod p^k is just repeating this process.

This is called Hensel's lemma, which also happens to tell us that for each solution to f(x) mod p, there is exactly one corresponding solution mod p², mod p³, mod p⁴ and so on, given by the corresponding correction terms. But it's really just Newton's method for p-adic integers.

So that's how we can solve polynomials over composite moduli: first we factor into powers of distinct primes, and then we solve the polynomial over each prime factor, and then we use Hensel's lemma to turn solutions modulo primes into solutions modulo powers of primes, and then we use the Chinese Remainder Theorem to put those solutions together over the original modulus.

Exohedron · Mar 28, 2019 at 10:55 PM

I actually really like the Miller-Rabin test for primality. It's simple to implement and easy to understand why it works with only a little bit of number theory. In contrast, the AKS test, while not awful to implement, is a much harder beast to prove. On the other hand, it is deterministically polynomial time, which is pretty good, unlike the Miller-Rabin test which is only probabilistically polynomial.

syntheme · Mar 29, 2019 at 10:22 PM

Isn't there a limit to the number you'd have to check assuming the generalized Riemann hypothesis?

Exohedron · Mar 29, 2019 at 10:39 PM

Yeah, if the Generalized Riemann Hypothesis is true, we get that there's a deterministic, polytime version of the Miller-Rabin test called the Miller test, where you only have to check up to 2(log n)². GRH gives us that there's a set of elements that generates (Z/nZ)^x where all of them are smaller than 2(log n)², so since the strong liars are all in a proper subgroup, checking all the generators means we must at some point test something that's outside the subgroup and thus give a witness of compositeness.

However, unlike the Miller-Rabin test, I don't think I understand how the stuff with the GRH works.

Exohedron · Mar 30, 2019 at 11:16 PM

Today's Math Fact:

A signature in the logic sense is a set of function symbols and relation symbols, each with specified arity (number of inputs). A theory is a collection of sentences using those symbols and any other logical grammar necessary to create actual sentences, without any free variables. A model of a theory is a collection of sets and a map from the relation symbols to relations on those sets and from the function symbols to functions on those sets, with the proper arities, such that the sentences in the theory are all satisfied. Sometimes we'll specify a theory just from a set of axioms and generate the rest via the rules of logical inference.

The Löwenheim-Skolem theorem says that any countable first-order theory with an infinite model has a model of every infinite cardinality.

For instance, the natural numbers using the Peano axioms is a countable first-order theory, and there is an infinite model, i.e. the natural numbers that we're used to. So, by Löwenheim-Skolem, there is a model of the Peano axioms that is the same size as the real numbers.
Conversely, if we consider the set of real numbers and all the true statements we can make about them just in terms of +, *, 0 and 1, we get what is called the theory of real-closed fields, and by Löwenheim-Skolem there is a model of this that is countable.

The fun math fact is that Thoralf Skolem didn't think much of this result, because he didn't believe in uncountable sets, and so for him the theorem was a tautology; if there's only one infinite cardinal then obviously anything with a model of that size has a model of every infinite cardinal.

Those familiar with set theory might be a bit confused at this point. Löwenheim-Skolem implies that ZF set theory, which is a countable theory that has an infinite model, must also have a countable model. But doesn't Cantor's diagonalization argument imply the existence of an uncountable set in ZF set theory? This is called Skolem's paradox.
Fortunately, Skolem, being a serious logician, did have a resolution, which was to be very, very careful about what it means to be countable.
In particular, the axioms of ZF and Cantor diagonalization imply that there is some set which is infinite and does not have a 1-1 correspondence with the natural numbers. The trick is then to construct a system in which there are sets that have the cardinality of the natural numbers, at least if you're allowed to view the system from the outside, but for which any possible correspondence fails to be a set; since everything in a model of set theory has to be a set, we get that we don't have a 1-1 correspondence to the natural numbers. It's kind of like how we can "prove" the Godel ststement of a formal system, but only by going to a meta-system.
So it turns out that "countable" and "uncountable" aren't absolute properties of a set, but are rather dependent on what model you're working in.

Anyway, Skolem eventually went even further and became a finitist.

Exohedron · Apr 7, 2019 at 10:01 PM

I keep thinking about writing up an introduction to homology/cohomology/homological algebra and I'm not entirely sure why. It's not even an area I'm super interested in beyond deRham, group and Lie algebra cohomology, which are all basically the same thing anyway.

Exohedron · Apr 13, 2019 at 5:42 PM

Why do I ever bother to talk about anything other than combinatorial group theory?

Consider the set of regular polygons. For each number greater than 2, we have a regular polygon, and a unique one, ignoring translations and rotations and dilations and whatnot.

Now let's go up a dimension and talk about regular solids, i.e. the Platonic solids. What do we mean by regular? We mean that all the faces are regular polygons and the faces are all identical and all of the corners are identical as well.
We suddenly get a drastic decrease in variety, in that there are only 5 Platonic solids.

If we go up another dimension, we can ask about regular hypersolids, where by regular we mean that the 3-dimensional faces are regular polyhedrons that are all identical and the vertices are all identical.
We end up with 6 regular hypersolids.

If we go up another dimension, we can ask about regular hyper- nevermind. The general term is "n-polytope". A polygon is a 2-polytope, a solid is a 3-polytope, a hypersolid is a 4-polytope, etc. So we can ask about regular 5-polytopes. How many are there? What about regular 6-polytopes?

The answer ties back to the Dynkin Diagrams I talked about pretty far back in this thread. Like, page 4 or 5 or something.

Anyway, a refresher:
A_n looks like a row of n dots, each connected to the next by a line marked 3.
BC_n looks like A_n except the last line is marked with a 4 instead of a 3.
D_n looks like a row of n-2 dots, each connected to the next by a line marked 3, and then two more dots, each connected by a line marked 3 to the last dot in the row.
E_n looks like a row of n-1 dots, each connected to the next by a line marked 3, and then another dot connected to the third dot in the row by a line marked 3. We have to restrict n to be at least 6 and at most 8.
F₄ looks like A₄ except the middle line is marked with a 4 instead of a 3.
G₂ looks like a 2 dots connected by a line marked 6.
H_n looks A_n except the last line is marked with a 5 instead of a 3. We have to restrict n to be at least 2 and at most 4.
I₂(k) looks like two dots connected with a line marked k. We have to restrict to k at least 3.

You might note that this list is kind of redundant in places. A₂ is I₂(3), A₃ is D₃, B₂ is I₂(4), G₂ is I₂(6), and H₂ is I₂(5).

It turns out, for reasons I'll get into later, that the diagrams that correspond to regular n-polytopes are the ones that are just rows of n dots. We call these "linear" Dynkin diagrams. So not the D_n or E_n.
Also because of the redundancies, we'll only look at A_n, BC_n and H_n for n > 2, and ignore G₂, in favor of I₂(k).

In fact, regular polytopes correspond to the endpoints of the linear Dynkin diagrams, so if the two endpoints of the Dynkin diagram look the same, then they yield the same polytope.

So for A_n, the diagram is symmetric, and so we only get one polytope from it. For n = 2 we get a triangle, and for n = 3 we get a tetrahedron, and for higher n we get what is called a simplex, which you get by taking n+1 equidistant points in Rⁿ and connecting all pairs by lines and then filling in all triples by triangles and all quartets by tetrahedrons etc.

For BC_n, the endpoint connected to the next dot by a line marked 4 yields a cube or hypercube of the appropriate dimension. The other end point yields what is called an "orthoplex", which is the dual of the hypercube in the same way that the octahedron is the dual of the cube: replace the n-1 dimensional faces by points and the n-2 dimensional faces by lines and so on. For BC₄, we get the usual hypercube and what is called the 16-cell, which is made of tetrahedrons.
This also explains why, for instance, the two endpoints of A₃ don't yield distinct polyhedra, because the tetrahedron is its own dual.

For F₄ we get a 4-dimensional object called a 24-cell, made of octahedrons.

For H₃ the endpoint connected to the next dot by a line marked 5 yields a dodecahedron, and the other endpoint yields the icosahedron.
H₄, the end point connected by a line marked 5 yields the 120-cell, made of dodecahedrons, and other gives the 600-cell, made of tetrahedrons.

For I₂(k) you can probably guess that I₂(k) yields a k-gon, again self-dual so it doesn't matter which endpoint we pick.

Counting gives a few numbers: for n = 2, we have an infinite set because we can let k be any number greater than 2.
For n = 3, we have one polyhedron from A₃, two from BC₃, and two from H₃, for a total of 5 regular solids.
For n = 4, we have one from A₄, two from BC₄, one from F₄ and two from H₄, for a total of 6 regular hypersolids.
For higher n, we have one from A_n, two from BC_n, and then no more, because we've run out of Fs and Hs. So for n greater than 4, there are only 3 regular n-polytopes.
So our pattern is infinity, 5, 6, 3, 3, 3,....

A certain type of person now has a few questions. Why this correspondence? What do the other dots in the linear Dynkin diagrams mean? What do the non-linear Dynkin diagrams mean?

I'll cover that in another post, because it gets kind of complicated.

Exohedron · Apr 16, 2019 at 11:43 PM

A flag in mathematics, namely in algebraic geometry and combinatorics, is a bunch of things that are contained in other things.

Let's give an example: take a cube.
A flag for a cube might be a vertex, and an edge containing that vertex, and a face containing that edge.
There are 48 such flags for a cube: there are eight vertices, and each vertex is contained in three edges, and each edge is contained in two faces.
Notably, the cube also has 48 symmetries, in the forms of rotations and reflections, because given a vertex you can put it into any of 8 places, and once you've done that you can pick an edge from that vertex and put it in any of three positions, and once you've done that you can take a face containing that edge and put it in one of two positions.
And indeed, given two flags there is a unique symmetry of the cube that moves one flag to the other.
So the idea that we have here is that the set of flags correspond to the set of symmetries of the object in question.

So let's look at an icosahedron. What does a flag look like here? It looks like a vertex contained in an edge contained in a triangular face. How many flags are there? There are 12 vertices, and five edges off of each vertex, and two faces off of each edge, so there are a total of 120 flags. And there are 120 symmetries of the icosahedron.

A flag doesn't have to look like a 0-dimensional thing contained in a 1-dimensional thing in a 2-dimensional thing. It could look like a 0-dimensional thing contained in a 2-dimensional thing; we call this a partial flag because it doesn't have a 1-dimensional piece.
It could, however, look like a 0-dimensional thing contained in a 2-dimensional thing and also contained in a different 2-dimensional thing.
For instance, consider the cuboctahedron; you get this by taking a cube and cutting off all of the corners so that the square faces become smaller square faces; you end up with a shape that has 6 square faces and 8 triangular faces; now each vertex is contained in four edges, and each edge is contained in a single triangular face and a single square face.
So a flag could be a vertex and an edge containing the vertex and a square face containing the edge and a triangular face containing the edge. But now we have too much information; if we only knew the square and the triangle, we could figure out the edge. So we can make a flag that's just the vertex, the square and the triangle, and the edge is implied.
How many such flags does the cuboctahedron have? Well, pick a vertex; there are 12 vertices. Pick a square containing that vertex; there are two such squares. And now pick a triangle that also contains that vertex and shares an edge with the square; there are two such triangles. So we get 12*2*2 = 48, same as the cube, which is probably to be expected since we got the cuboctahedron by slicing bits off of the cube.

Now let's look at some diagrams.

First let's look at the diagram A₀. This is a row of no dots, and I'm going to say that it corresponds to a vertex, being 0-dimensional. It has one flag, containing the vertex. Well say that #A₀ = 1
Let's look at the diagram A₁, which looks like a single dot, *. I'm going to say that it corresponds to an edge, being 1-dimensional. An edge has two flags, one for each vertex. We'll say that #A₁ = 2.
Let's look at the diagram A₂, which looks like two dots in a row connected by a line marked 3. I'm going to say that it corresponds to a triangle, and has six flags. We'll say that #A₂ = 6.
Let's also look at the diagram BC₂, which looks like two dots in a row, connected by a line marked 4. I'm going to say that it corresponds to a square, and has eight flags. We'll say that #BC₂ = 8.

Now let's look at BC₃, which looks like three dots in a row, the first two connected by a line marked 4 and the last two connected by a line marked 3. Now consider the last dot. If we delete it we get a diagram that looks like BC₂, which I said corresponded to a square.
If instead we deleted the second and third dots, we're left with A₁, which is an edge, and if we delete all three dots we get A₀, which is a vertex.
So our flags for the cube are what we get by deleting dots and looking at the remaining diagrams.

What if we flip it around? If we delete the first dot only, we get the diagram A₂, corresponding to a triangle; if we delete the first and second dots we get A₁, for an edge, and if we delete all three dots we get A₀, for a vertex. And these are the flags for an octahedron.

But wait, you say, the icosahedron also had flags that were made up of a vertex contained in an edge contained in a triangle, so why should we be so confident that we got an octahedron this time around?

Because we haven't used all the information in the diagram yet!

So consider our diagram BC₃ again. We said that deleting dots from the right got us pieces of the flag for a cube. What does deleting dots from the left tell us about the cube?
Well, suppose we deleted the first dot. Then we get the diagram A₂, i.e. the triangle.
Notice that if you take a cube and consider a vertex, that vertex is contained in 3 edges and each of those edges is contained in 2 faces, so that vertex is contained in six flags. 6 = #A₂. In fact, recalling the relation between flags and symmetries, we get that there are 6 symmetries of the cube that keep a particular vertex in place. We call this the "stabilizer" of the vertex.

Now suppose you delete the first and second dots from BC₃. You end up with A₁, and #A₁ = 2. Only now instead of two vertices contained in the single edge, we have two faces containing the single edge. Similarly, for each edge there are 2 symmetries of the cube that keep all the points of the edge in place.

And finally, we deleting all three dots yields A₀, where #A₀ = 1, and indeed, a face is contained in one...cube. I guess? More importantly, if we insist on keeping all of the points of a face in place, we can't actually move the cube at all.

So for the cube we assign the left-most dot to the vertices, because deleting it and everything to the right of it yields A₀, and then the dots to the right of the first dot tell us that there are 6 symmetries of the cube will keep the vertex in place.
The dot in the middle corresponds to the edges, since the dots to the left of it yield A₁, and the dot to the right tells us that there are two symmetries of the cube that fix all of the points of the edge.
The dot on the right corresponds to the faces, since the dots to the left of it give us a BC₂, i.e. a square, and the lack of dots to the right of it tell us that keeping the face in place locks the entire cube in place.

Now let's look at the octahedron. We'll flip our diagram over so that the 3 is between the first left two dots and the 4 is between the right two dots.
Our leftmost dot corresponds to the vertices since there's nothing to the left of it, and the dots to the right form a BC₂. So there are eight symmetries of the shape that keep the vertex in place; this is how we determine that we're getting an octahedron, since for an icosahedron there would be 10 symmetries.

Now what happens when we instead fold the diagram over so that the middle dot is to the left and the other two dots are to the right? What do we get?
Well now the left-most dot is still the vertex dot, but now the dots to the right of it are two disconnected dots. So we get an A₁ and another A₁ of symmetries, i.e. 2*2 symmetries.
Okay, what about the other dots? Well, for the dot connected to the left-most dot by a 3, the rest of the dots yield a BC₂, so it corresponds to a square, and there are no dots to the right of it so no symmetries leave it unmoved. For the dot connected to the left-most dot by a 4, the rest of the dots yield an A₂, so it corresponds to a triangle, and there are no dots to the right of it so no symmetries leave it unmoved.

So we have that each vertex is contained in a square and in a triangle. Moreover, the A₁ + A₁ set of symmetries that fixes a vertex indicates that each vertex must be contained in two squares and two triangles. This should sound kind of familiar: it's the cuboctahedron! The symmetries that leave a vertex fixed swap the two squares, or swap the two triangles, or swap the squares and swap the triangles, or leave everything in place.
So you see what happens when we pick something other than an endpoint of the diagram to be the vertex point: we end up with something that has multiple types of faces, because we have multiple types of right-most dots.
We'll also get similar results if we looked at diagrams that weren't linear: again we'd get multiple right-most points, and hence our shape wouldn't be regular.

Except! What if the two right-most dots gave the same shape when deleted?

Let's look at A₃, which is three dots in a row connected by lines marked 3. Let's fold the diagram so that the middle dot is on the left and the other two dots are on the right.
Now the two dots on the right both yield A₂ when deleted, i.e. triangles. Deleting the dot to the left yields A₁+A₁ again. So now we have that each vertex is contained in two triangles, and also in two other triangles. Sound familiar? That's an octahedron.
But now something slightly odd is going on, because according to the diagram, we no longer have a symmetry that will fix a vertex in place and rotate the octahedron by 90 degrees; there is no such motion in A₁+A₁. So now the eight triangular faces of the octahedron have been sundered into two sets of four. Identical sets, yes, and if you held an octahedron in your hand you'd be able to swap one set for the other, but according to the diagram you can't.
So that's what happens when the two right-most dots give the same shape: they tell you the shapes of the faces, but they lie about the symmetries because they think that you can't swap the faces even when the faces are the same shape.

So that's why the regular polytopes all correspond to endpoints of the linear Dynkin diagrams: from a Dynkin diagram you can read off a polytope by picking one dot to be the vertex dot, and if you don't pick an endpoint of a linear Dynkin diagram you either get a shape whose faces are different shapes or the diagram lies to you about how symmetrical the polytope is.

Exohedron · Apr 20, 2019

"1+1 = 2?"

During grad school, I was trying to explain homotopy type theory to some friends of mine in the department lounge, and, as is the habit of many people who give impromptu lectures, was writing somewhat uninformative, contextless tidbits on the board.
The most clear one was the question:

"1 + 1 = 2?"

And of course, this is immediately noticed by people who were in the lounge but hadn't been listening to the conversation, and who I like to think normally respect me as a mathematician.

So what the hell was I talking about?

Consider a pair of sets A and B. If we are only allowed to talk about maps to and from our sets but aren't allowed to talk about individual elements, how do we define the disjoint union of the sets?
Now consider the disjoint union of A and B, which we'll denote by A+B. We have a function i_l: A -> A+B, and a function i_r: B -> A+B, the usual inclusion maps. Moreover, for any set X and any functions f: A -> X and g: B -> X, we can uniquely define (f+g): A+B -> X such that
(f+g)i_l = f
(f+g)i_r = g
So the disjoint union allows us to combine functions with the same range set.
If we were allowed to talk about elements, all of this follows by looking at an element of A+B and asking "Is it from A? Then do the function from A to the set. Is it from B? Then do the function from B to the set."
But if we can't talk about elements, then this will have to do.

Note that the number of elements in A+B is the number of elements in A plus the number of elements in B.
Well, wait, we said that we aren't allowed to talk about elements. Fine.
We call a set T "terminal" if for any set S, there is exactly one function from S to T. Notably, all one-element sets are terminal, and indeed, the only terminal sets have exactly one element. Moreover, as sets, all the terminal sets are isomorphic, because there's exactly one function between any two terminal sets, and that function has to be invertible because there's exactly one function going the other way.
So we have a notion of 1, as any terminal set. We also can say that 1+1 has two elements, in the sense that there are two functions from 1 to 1+1, which, if we were allowed to talk about elements, would send the single element of 1 to one of the elements of 1+1, or to the other element.

While we're here, we can also define "0" as an initial object. We say that a set I is "initial" if for any set S, there is exactly one function from I to S. It turns out that the set I has to be the empty set, so "0" is a pretty good moniker for it.

Okay, let's abstract a bit, to categories. Instead of talking about sets, we'll talk about more generic "objects", and instead of functions, we'll talk about more generic "morphisms". We can still talk about terminal objects, in that there is always exactly one morphism from any object to a terminal object. We can talk about initial objects, in that there is exactly one morphism from an initial object to any object. And we can talk about "disjoint unions", except we call them coproducts, in the following sense:
Suppose that we have two objects A and B. Then we define A+B to be an object such that there are morphisms i_l: A -> A+B and i_r: B -> A+B and such that for any object X and any pair of morphisms f: A -> X and g: B ->X, we get a unique morphism (f+g):A+B -> X such that
(f+g)i_l = f
(f+g)i_r = g

While we're here, we can also define the product. It looks like the coproduct with all of the arrows backwards, like so:
Suppose that we have two objects A and B. Then we define A*B to be an object such that there are morphisms p_l: A*B -> A and p_r: A*B -> B and such that for any object X and any pair of morphisms f: X -> A and g: X -> B, we get a unique morphism (f*g): X -> A*B such that
p_l(f*g) = f
p_r(f*g) = g
And note that for sets, the cardinality of A*B is the cardinality of A times the cardinality of B.

For certain categories, we also have a notion of exponentiation: A^B, by which we mean all of the morphism from B to A. Note that if A and B are sets, then the cardinality of A^B is exactly the cardinality of A raised to the cardinality of B.

Suppose we have a category where products, coproducts and exponentials are always defined. Then we automatically get initial and terminal objects as well, which we'll denote by 0 and 1.
All of these things obey what Tarski called the "high-school algebra" axioms:
x+y = y+x
(x+y)+z = x+(y+z)
x*1 = x
x*y = y*x
(x*y)*z = x*(y*z)
x*(y+z) = (x*y)+(x*z)
1^x = 1
x¹ = x
x^y+z = x^y*x^z
(x*y)^z = x^z*y^z
(x^y)^z = x^y*z
and indeed, you might recognize these from high school. Note that we might need to replace those equalities with isomorphisms, but eh, close enough.

We also have a notion of "how many points does an object X have", again by saying "how many morphisms are there from a terminal object to X?"

Anyway, so we've defined a notion of 1 in a more general setting, and 1+1 in a more general setting, and the notion of number of points in a more general setting.

Of course, all of this can be demonstrated with a simple example:
Suppose our objects are pairs of sets, and our morphisms are pairs of functions. Then our terminal object looks like (1,1), and we have that (1,1)+(1,1) = (1+1,1+1).
So far so good, right?
Except now we have that there are four morphisms from (1,1) to (1+1,1+1). Oops!

So it turns out that the axioms given above, familiar from high-school algebra, are insufficient to actually tell you the size of 1+1.

Exohedron · May 12, 2019

My favorite fact about harmonic polynomials.

Suppose you have a function P in n variables x₁, ..., x_n. We define the Laplacian Δ as the operator that sends P to
Δ(P) = ∂²P/∂x₁² + ... + ∂²/∂x_n²
In other words, you take the second derivative of P in each direction, and add them together.

The Laplacian is a particularly interesting operator for a number of reasons. It shows up all over the place in differential equations, since it gives us a nice way to talk about wave-like behavior or about diffusion of heat or particles or generally any case where the behavior of an individual is described by averaging the behavior of its neighbors.

But as always I don't care about analysis, I want to talk about algebra.

We say that P is harmonic if Δ(P) = 0.
My favorite fact about harmonic functions goes as follows:

Let C be the polynomial x₁² + ... + x_n².
Then for any twice-differentiable function* in n variables with real coefficients, there is a unique way to write P as a (possibly infinite) series of terms C^kP_k where P_k is harmonic.

So for instance, let's suppose that n = 3, so that
C = x₁² + x₂² + x₃²
Suppose that
P = 3x₁⁴ - 4x₁²x₂² + 3x₁²x₃² + x₂⁴ + x₂²x₃² + x₃⁴
Then we have that
P₀ = x₁⁴ - 6x₁²x₂²+x₂⁴
P₁ = x₁²-x₂²
P₂ = 1
and you can check that each of these is harmonic.

So why do I like this fact? Because I'm a representation theorist, and this is about representations of the orthogonal group SO(n).
If we take our space Rⁿ and apply a rotation to it, then we can ask how this rotation affects our functions P and C and how it affects our operator Δ. It turns out that any rotation will yield the same C, because C is just the Euclidean distance which our rotations preserve by definition, and it will also preserve Δ, because we can get Δ by taking C and replacing the variables with the corresponding partial derivatives.
So if we write R for the rotation, then applying R to P and then applying Δ is the same as applying Δ to P and then applying R:
Δ(R(P)) = R(Δ(P))
In particular, if P is harmonic, then so is R(P) for any rotation R.
So this fact about harmonic functions is particularly nice since it allows us to write any function in terms of an invariant C and harmonic functions, which themselves are defined by a rotationally-invariant operator.

Indeed, this allows us to write functions on spheres in a particularly useful way, because the unit sphere in Rⁿ is what we get by setting C to 1. Conversely, if we have a function that we've only defined on the unit sphere, we can extend it to a harmonic function on all of Rⁿ in a unique way, by extending it in any way at all, writing the result in terms of C and harmonic functions, and then setting all of instances of C in the expression to 1.

There's some other fun you can have with certain other groups, where by "certain other" I mean Coxeter groups and simple Lie groups, because I never talk about any other groups (I really ought to talk about other groups at some point).

* You might need the function to be analytic. I'm not actually sure. I know that it's true for polynomials and therefore true for analytic functions, but I am not sure for more general cases.

Exohedron · May 14, 2019 at 6:48 PM

deRham cohomology is best cohomology (fight me)

There's an old theorem of Poincare that says that if you have a vector field v defined everywhere on the plane, and the curl of v is 0 everywhere, then there is some function f such that v is the gradient of f. Moreover, if you have a region R of the plane that doesn't have any holes in it, and v is defined everywhere on R and the curl of v is 0 everywhere on R, then there is some function f defined on R such that v is the gradient of f.
The converse is pretty clear just by writing out definitions and expanding terms and being careful with your bookkeeping.

The part about R not having any holes in it is pretty important. Consider the region that's the entire plane except the origin. Let v be the vector field given by
v(x, y) = (-y/(x²+y²), x/(x²+y²))
v isn't defined at the origin, but that's outside our region.
If I've gotten my numbers right, then we have that the curl of v is 0 everywhere in our region. But you can check that v is not the gradient of any function!

Suppose that v is in fact the gradient of a function f. If you start at a point in the plane, and evaluate f there, you get some value f(x,y). And just like regular integration will tell you how f changes as you move in terms of its derivative, the integral along a curve will tell you how f changes as you move around in the plane in terms of the gradient:
f(end) - f(start) = ∫_C grad(f)·t ds
where t is the unit tangent vector to the curve C at the point, and ds is your arclength element.
In particular, if you start and end at the same point, i.e. your curve is a loop, then integrating the gradient around the loop should yield no change.

So let's look at v on the unit circle. For (x,y) on the unit circle, we get v(x,y) = (-y, x). Also, our tangent vector to the unit circle is (-y, x).
∫_C v·t ds = ∫_C v·t ds = ∫_C (y²+x² ds = ∫_C 1 ds = 2π
which is not 0!
So we get that v can't be the gradient of a function!

And indeed, there is a sense in which this can be used to detect holes:
If you have a region R in the plane and someone hands you a vector field defined on R whose curl is 0 but which isn't the gradient of a function on R, then you know by Poincare's theorem that your region has to have a hole
If someone hands you a loop and the integral of the vector field around that loop isn't 0, then you know that the vector field isn't the gradient of a function

In fact, in order to tell you anything interesting, the loop has to go around a hole! Otherwise we can just apply Poincare's theorem to the region inside the loop, and so the integral around the loop would have to be 0.

So let's pull away from individual vector fields and functions, and look at a general statement: Let R be a connected region on the plane and consider all of the vector fields on our region R that have vanishing curl. Call this set Z¹(R). Note that it's a vector space, since the curl is a linear operation on vector fields. It's a very big (infinite dimensional in fact) vector space, but we're not interested in it on its own.
Let B¹(R) be the vector space of all vector fields that are gradients of functions on R. Again, very big vector space. Also it's a subspace of Z¹(R), since the curl of a gradient is always 0.
The question is: is there anything in Z¹(R) that isn't in B¹(R). Alternatively, is
H¹(R) = Z¹(R)/B¹(R)
nontrivial as a vector space?
If R has a hole in it, then H¹(R) does have something nontrivial in it, and thus is at least one dimensional.
Claim: The dimension of H¹(R) is the number of holes in R.

Now let's boost by a dimension. Here's where the fun starts.

Suppose you have a region in 3-dimensional space. We can define functions and vector fields in this region.

Consider a vector field v. We can again define the curl of v, and again we get the statement that if our region has no holes in it, then every vector field whose curl is 0 everywhere is the gradient of a function. And again we can say:
Define Z¹(R) as the space of vector fields with vanishing curl, and B¹(R) as the space of vector fields that are gradients of functions, and define H¹(R) = Z¹(R)/B¹(R)
And if H¹(R) is nontrivial, then we have a hole.
But not just missing isolated points or pockets. No, this indicates a tunnel that goes all the way through our space, like the hole of a donut. H¹ asks what happens when we take a rubber band around our supposed hole and let it go: for an empty pocket the rubber band will just slide around it and shrink to nothing, but around an tunnel all the way through the space, the band can't avoid the tunnel.
But what about empty pockets? How do we detect them?
Well, suppose you have a vector field v whose divergence is 0 everywhere. If our region has no holes in it, then every vector field whose divergence is everywhere 0 is the curl of a vector field.
So we define Z²(R) as the space of vector fields with vanishing divergence, and B²(R) as the space of vector fields that are curls of vector fields, and define H²(R) = Z²(R)/B²(R)
And now the dimension of H²(R) counts pockets rather than tunnels. H² asks if you have a balloon and you let it shrink, can it shrink to 0 or will it get caught on a patch of missing space.

This is the beginning of what is called deRham cohomology, which is the nicest cohomology but only works in spaces that have enough derivatives defined on them. But we have this idea that we can count holes of various types by taking some operations z and b such that zb = 0, and looking at the space Z of things that z sends to 0, and the space B of things that are the result of applying b to stuff, and looking at H = Z/B.
If we're a bit more careful about what kinds of derivative-like operators we're talking about and switch from vector fields to what are called "differential forms", we can extend this idea to any number of dimensions. If we have a region of four-dimensional space, what kind of holes can we have, and what should we compute to detect these holes? But that requires a bit more machinery.
We can even extend it to things that aren't subsets of Rⁿ. We just need appropriate notions of derivative and differential form.

On the other hand, because Z and B are so large, I guess it's not great if you want to actually compute stuff. But who cares about that kind of thing?

Exohedron · May 23, 2019 at 9:36 PM

I'm starting to appreciate generating functions more. I mean, I already liked the idea, but now that I'm occasionally using them to actually do stuff I'm really seeing how powerful and elegant they can be.

Exohedron · May 25, 2019 at 11:58 PM

I just remembered that we can tie the harmonic function stuff and the deRham cohomology stuff together:
There is a natural way to extend the Laplacian Δ to act on differential forms, and using that extension we get that for each w in Hⁿ(R) there is a unique element z in Zⁿ(R) such that Δ(z) = 0 and w is the image of z under the natural quotient map from []: Zⁿ(R) -> Hⁿ(R).

The interaction of the harmonic structure and the differential structure is kind of interesting. For instance, if you have an element a in Z^m(R) and an element b in Zⁿ(R), then ab is an element of Z^m+n(R), but it is not necessarily the case that the product of two harmonic things is harmonic; there is a harmonic differential form that corresponds to it, since we can take [ab] in H^m+n(R) and ask for the corresponding harmonic element of Z^m+n(R), but it probably isn't going to be ab even if a and b are harmonic.

Log in or Sign up

Math(s)! Currently: Summer of Math Exposition

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

syntheme Active Member

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Log in or Sign up

Math(s)! Currently: Summer of Math Exposition

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

syntheme Active Member

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Exohedron Doesn't like words

Useful Searches