Hey, I've been thinking a bit about quaternions (when am I not lol), and I have a bit of a problem. Quaternions are supposed to "extend" complex numbers, right? So a number "a+bi" should behave in the same way as "a+bi+0j+0k", right? But you can write complex numbers as re^(iθ). I was wondering if you could do the same for quaternions in some way... but I came across a contradiction: if exponentiation by definition turns sums into products, then doesn't that mean that e^(i+j) can't equal e^(j+i), even though addition of quaternions should be commutative?

EDIT: I just realized that (i+j)^2 = -2, so yeah, let me figure out the expansion first.
Yeah, there's some fun to be had when you do exponentiation with noncommutative things. We can define exponentiation via Taylor series: exp(A) = I + A + (1/2)A^2 + (1/3!)A^3 + ... And we also have a logarithm, again via Taylor: log(1 + A) = A - (1/2)A^2 + (1/3)A^3 - ... And this is one way to move from Lie algebras to Lie groups and back that doesn't use any geometry.

So what does exp(A+B) do? exp(A+B) = I + (A+B) + (1/2)(A+B)^2 + (1/3!)(A+B)^3 + ... Expand a bit: I + A + B + (1/2)(A^2 + AB + BA + B^2) + (1/3!)(A^3 + A^2B + ABA + BA^2 + AB^2 + BAB + B^2A + B^3) + ...

Contrast with exp(A)exp(B): (I + A + (1/2)A^2 + (1/3!)A^3 + ...)(I + B + (1/2)B^2 + (1/3!)B^3 + ...) Expanding out the stuff of degree 3 and lower gives I + A + B + (1/2)A^2 + AB + (1/2)B^2 + (1/3!)A^3 + (1/2)A^2B + (1/2)AB^2 + (1/3!)B^3 + ...

And you can see that these are slightly different, starting from the degree 2 terms and up, because in the product version all the As are on the left and the Bs are on the right, whereas in the exponentiated sum they're interleaved, and for noncommutative things the interleaving is important. So we do get that exp(A + B) is different from exp(A)exp(B).

There's a somewhat ugly formula called the Baker-Campbell-Hausdorff(-Dynkin) formula that gives log(exp(A)exp(B)). It starts as A + B + (1/2)[A, B] + (1/12)([A, [A, B]] + [B, [B, A]]) + ..., where [A, B] denotes the commutator AB - BA. All of the higher-degree terms are built out of brackets of A and B, so when A and B commute, log(exp(A)exp(B)) = A + B, but for noncommutative stuff, including quaternions and non-abelian Lie algebras and so on, we get a bunch of other terms. So now we can see another way that the bracket structure of a Lie algebra reflects the noncommutativity of the corresponding Lie group.
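If you want to see the discrepancy numerically, here's a quick sketch in Python (my own arrangement, using the standard representation of i, j, k as 2x2 complex matrices; the 0.1 scaling just keeps the higher-order BCH terms small):

```python
# exp(A+B) vs exp(A)exp(B) for quaternions, with i, j, k represented
# as 2x2 complex matrices. Requires numpy and scipy.
import numpy as np
from scipy.linalg import expm, logm

i = np.array([[1j, 0], [0, -1j]])
j = np.array([[0, 1], [-1, 0]], dtype=complex)
k = np.array([[0, 1j], [1j, 0]])

A, B = 0.1 * i, 0.1 * j

lhs = expm(A + B)        # exponential of the sum
rhs = expm(A) @ expm(B)  # product of the exponentials
print(np.max(np.abs(lhs - rhs)))  # nonzero: the two really differ

# First BCH correction: log(exp(A)exp(B)) = A + B + (1/2)[A, B] + ...
bracket = A @ B - B @ A
print(np.max(np.abs(logm(rhs) - (A + B + 0.5 * bracket))))  # much smaller
```

The first residual is of size roughly (1/2)[A, B], while the second only sees the degree-3-and-up BCH terms.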
There's a geometric way of thinking about this. Starting at 1 on the 3-sphere and going a step in direction A is like multiplying by exp(A). Then the above discussion says that if you have two directions, A and B, then going a step in direction A and then a step in direction B doesn't give you the same thing as going a step in direction B and then in direction A, and that both are different from going half a step in direction A and then a step in direction B and then another half step in direction A and so on. And algebraically we say that the structure is noncommutative, but geometrically we'd say that our 3-sphere is curved. Contrast with a flat space, where it doesn't matter which order or how you interleave going in directions A and B, because everything ends up being vector addition which is commutative. And indeed, for certain types of Lie groups there's a natural embedding as matrices, and there's a natural geometry on that embedding, and the curvature of the geometry can be computed from the Lie brackets and almost vice-versa. Curvature is noncommutativity of movement (there is in fact a formal way of saying this that makes this statement essentially true).
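The "noncommutativity of movement" point doesn't need anything fancy to observe; plain rotation matrices already show it. A tiny numpy sketch (my own example, nothing canonical about the angle):

```python
# Rotating about the x axis and then the y axis is not the same as
# rotating about y and then x: the difference matrix is nonzero.
import numpy as np

def rot_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def rot_y(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

t = 0.3
print(rot_x(t) @ rot_y(t) - rot_y(t) @ rot_x(t))
```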
Just a bit of a ramble about mathematical biology: Math and physics have grown up a lot together over the past few centuries, and so now theoretical physics is very much a mathematical science. Not just a hard science, for whatever that term may mean, but a mathematical one: everything is supposed to be modeled and there are all sorts of principles and axioms and all that fun stuff that allow mathematicians to abstract away all the physics, do some math, and then add the physics back in.

Not so much for biology. Sure, biostatistics is going to interesting places, but mathematical models for biology are generally not considered to be nearly as accurate as models for physics. And there are a bunch of reasons for this. One reason is that it's easier to think of physical systems reductively, and in particular, in isolation. You can imagine a lone electron in the vacuum, and talk about it as if it were the only thing in existence, and you can talk about a bunch of particles in the vacuum and again talk as if they were the only things in existence. You can consider a closed system from a physical standpoint and say everything there is to say about that system. And then you can take that information about the closed systems and turn it into information about the systems put together.

Contrast with biology: if you take, say, a wolf in a vacuum, you quickly end up with a dead wolf. And you can say a lot about the dead wolf, but it doesn't tell you much about a live wolf that isn't in a vacuum. Biological systems are not closed, almost by definition. If it eats, if it breathes, if it photosynthesizes, if it excretes, it's not closed; it interacts with its environment in a way that can't be built up from the thing in isolation. Fortunately, there are some mathematical models of certain kinds of interactive behavior, but unlike in physics, there's no general framework for making, thinking about, and relating such models of open systems, and so there's no way to do the kind of complicated syntheses and analyses of such models that theoretical physics thrives on.

At least, no framework yet. There are people working on this. Smart people, who are getting some very cool ideas in how to talk about open systems from a mathematical standpoint. Things look pretty basic at the moment, but pretty promising as well. And if I had been born maybe a hundred years later, maybe I would have an interest in mathematical biology rather than mathematical physics. I like the idea of biology, but the mathematical framework isn't there yet and so I have no good way to think about it that isn't mostly memorization and rough analogies (whose roughness I can't measure). But maybe in a hundred years. Or fifty years; who knows how fast things will progress. Of course, by that point mathematical physics will probably be completely bonkers.
Addendum to my last post: here's one mathematician's experience at an ecology conference and the differences he found between the cultures. Cross-posted in the science thread.
I'm going to wander into combinatorics and number-theory-ish stuff for a bit, because I want to eventually talk about one of my favorite ideas in mathematics: the field with one element. But first: finite fields.

A brief review of modular arithmetic: consider the days of the week. Let's say that Monday is day 1. Then Tuesday is day 2, and Wednesday is day 3, and so on, until we reach day 7, which is Sunday. Adding a day gets us to...day 1, Monday! So we go from 7 to 1. Thus we say that the days of the week match the usual integers modulo 7, because the days of the week think that 7 is 0. And now we can do our usual arithmetic with these numbers. 1 + 6 = 7 = 0 mod 7, and 3 * 4 = 12 = 5 mod 7, and so on; every time we compute something, we just divide by 7, take the remainder, and write "mod 7" at the end. Moreover, we get a field, in that everything has an additive inverse (mod 7) and everything other than 0 has a multiplicative inverse mod 7, by which we mean that if x is not 0, then there is some y such that x * y is 1 plus a multiple of 7. So now we have a field with 7 elements. And indeed, we can replace 7 by other numbers and do modular arithmetic, and if we had replaced 7 by another prime number p then we would get another field, this one with p elements. For convenience, we'll denote it by F_p.

We can also define fields whose sizes are powers of prime numbers. For instance, suppose we wanted to make a field of size 49. We start with F_7 and consider all the elements of the form a + b√(-1) with a and b in F_7. We again get a field, and it has 49 elements since we have 7 choices for each of a and b. It turns out that for any prime p and any positive integer n, there is exactly one field of size p^n (up to isomorphism). Often, we write q = p^n and thus the corresponding field is F_q.

So what about the field with one element, F_1? Unfortunately, one of the axioms of a field is that it must have an additive identity 0, and it must have a multiplicative identity 1, and these two elements must be distinct. Which means we need at least two elements. So there can't be a field with only one element. And this axiom is important, because a lot of nice things we can say about fields completely fail if we allow the additive identity to be the same as the multiplicative identity. And while you can do anything you want in mathematics, you can usually only do one thing that you want at a time except in very special circumstances, and this is not one of those circumstances.

So why do I talk about it at all? Because like many things in mathematics, the field with one element fails to exist in a way that makes it almost exist. Consider the polynomial 1 + q + q^2 + ... + q^n. We'll call this polynomial a "q-number", and write it as [n+1]_q for short. Notably, when we set q to 1 the polynomial evaluates to n+1. q-numbers act a little differently from regular numbers. For instance, it is not the case that [1]_q + [1]_q = [2]_q: the left side is 1 + 1 = 2, while the right side is 1 + q. Similarly, it is not the case that [a]_q[b]_q = [ab]_q, at least not when q is not equal to 1. We now define a q-factorial: [n]_q! = [n]_q[n-1]_q...[1]_q. In other words, just like a regular factorial, but with qs. We similarly define a q-binomial: (n, k)_q = [n]_q!/([k]_q! [n-k]_q!).

What do these count? Well, recall the notion of a projective space from a little while ago. We said that for a field F, we could make a projective space FP^n by taking F^n and appending points at infinity, or by taking sets of n+1 coordinates and then taking ratios.
In either case, if F is finite in size then we can count things. And we have fields that are finite in size: our fields F_q. So suppose we have F_qP^n. How many points does it have? It ends up with [n+1]_q points. We can also talk about higher-dimensional structures in F_qP^n. A "line" in F_qP^n can be viewed as a plane in F_q^(n+1) that passes through (0, 0, 0, ..., 0); since we only care about ratios, the two dimensions of the plane drop by one to give us a projective line. So how many planes through the origin are there in F_q^(n+1)? There are (n+1, 2)_q. How many planes are there in F_qP^n? As many as there are 3-dimensional volumes in F_q^(n+1) that pass through the origin, i.e. (n+1, 3)_q. And so on. We have (n+1, k+1)_q k-dimensional pieces of F_qP^n. So the q-binomials count the number of pieces of F_qP^n of a given dimension.

So what do the regular binomials count? Consider a set of n+1 elements. How many subsets of size k+1 does it have? (n+1, k+1). So we might want to say that an n-dimensional projective space over F_1 is just a set of n+1 elements. Alternatively, there is a family of objects called simplices: the 0-simplex is a point, the 1-simplex is a line segment, the 2-simplex is a triangle, the 3-simplex is a tetrahedron, etc. If you look at the number of k-dimensional pieces that aren't in the middle of a larger-dimensional piece, we get that the n-simplex has (n+1, k+1) k-dimensional pieces: the tetrahedron has 4 vertices, 6 edges, 4 faces and 1 3-dimensional piece. So here's another interpretation of a projective space over the field with one element. Furthermore, the geometric properties match up to a large extent. If you take two projective planes and glue them along a projective line, you end up with an analogue of what you get when you glue two triangles along an edge. So the analogy is fairly precise.

Anyway, F_1 doesn't exist, but things that can be built off of F_1 do exist, and act as if they were actually built using F_1.
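All of these counts are small enough to check directly. Here's a sketch in Python/sympy (the helper names are my own, nothing standard): it builds the q-binomial as an honest polynomial in q, then brute-force counts the points and lines of the projective plane over F_2, which should come out to [3]_2 = 7 and (3, 2)_2 = 7.

```python
# q-binomials as polynomials, plus a brute-force count in F_2 P^2.
from itertools import product
from sympy import symbols, expand, cancel

# Sanity check that F_7 is a field: every nonzero x has an inverse.
for x in range(1, 7):
    assert x * pow(x, -1, 7) % 7 == 1  # pow(x, -1, m) needs Python 3.8+

q = symbols('q')

def q_number(n):
    # [n]_q = 1 + q + ... + q^(n-1); evaluates to n at q = 1
    return sum(q**i for i in range(n))

def q_factorial(n):
    result = 1
    for i in range(1, n + 1):
        result *= q_number(i)
    return result

def q_binomial(n, k):
    # (n, k)_q = [n]_q! / ([k]_q! [n-k]_q!), always a polynomial
    return expand(cancel(q_factorial(n) / (q_factorial(k) * q_factorial(n - k))))

print(q_binomial(3, 2))             # q**2 + q + 1
print(q_binomial(3, 2).subs(q, 2))  # 7

# Points of F_2 P^2: nonzero vectors in F_2^3 (over F_2 the only
# nonzero scalar is 1, so there are no ratios to quotient by).
points = [v for v in product(range(2), repeat=3) if any(v)]
print(len(points))                  # 7 = [3]_2

# Lines: 2-dim subspaces of F_2^3. Count ordered independent pairs
# (over F_2, independent = distinct and nonzero), then divide by the
# 3 * 2 = 6 ordered bases each such subspace has.
pairs = sum(1 for u in points for v in points if u != v)
print(pairs // 6)                   # 7 = (3, 2)_2
```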
So I mentioned in TIL that I have a favorite theorem, and that I used to have a different favorite theorem. I'm going to attempt to explain both of them.

Generalized Stokes' Theorem

Back when I fancied myself a differential geometer, my favorite theorem was the generalized Stokes' theorem, whose short version roughly goes: the fundamental theorem of calculus works in any number of dimensions.

More details: Consider a function f that takes in a position in space and returns a number. For instance, current temperature: there is (to some extent) a value that we can call the "temperature" at each point on the surface of the Earth. We would call such a function a "0-form", in that it takes in a 0-dimensional object, a point, and returns a number. We extend this function to negations of points. Given a point p, we don't really define what -p is, but we do say that f(-p) = -f(p). We also don't define q + -p, but we do say that f(q + -p) = f(q) - f(p). Don't treat these statements as meaningful quite yet.

Now consider a function that takes in a path and returns a number. So for instance, suppose you're walking through a muddy field; you can ask how much mud you accumulate as you go from point p to point q. Note that this depends on the path that you take; if you take a really long path, you'll probably end up with more mud than if you took a shorter path, even if the endpoints are the same. Given a path s from p to q, we can consider going backwards along the path, from q to p. For our mud function, you're probably going to pick up the same amount of mud in both directions. But there are some functions that will give negative results if you go backwards: if we write -s for the path that goes from q backwards along s to p, we want to say g(-s) = -g(s). We call the direction that the path goes the "orientation" of the path, so -s has the opposite orientation from s. We call our path-to-number functions 1-forms. For those of us who have taken calculus, this is pretty familiar: you swap the limits of an integral and you get the negative of the original integral.

The fundamental theorem of calculus says that there is an operator d from 0-forms to 1-forms such that, for our path s from p to q, if f is a 0-form, then df(s) = f(q) - f(p), which by our notion of generalized function gives us f(q + -p).* The idea is that df only depends on the endpoints of the path we give it: not just the pair p and q, but rather the sum of q and -p.

Okay, now let's go up in dimension. Consider a surface S that has some edges but that doesn't go off to infinity. Now we can consider the boundaries of such a surface. Note that, say, a sphere has no boundaries, and neither does the surface of a cube, because you can always keep going no matter where you are; you might have a bit of a jolt going over the edge of the cube-surface, but you can still keep going. In contrast, if you have a disk, then when you reach the edge of the disk you've really hit an edge.

Just as we distinguished a path from p to q from the same set of points traversed in the opposite direction, we need to distinguish something about our surfaces. We say that we have an orientation on S. Suppose you pick a point on S and a direction from that point. Slicing S perpendicular to that direction yields a path, and the orientation of S then gives an orientation of the path. Note that if you start at the same point and pick the opposite direction, slicing gives you the same path, but the same orientation of S now gives the opposite orientation on the path.
Note that not all surfaces have orientations; for instance, Mobius strips and Klein bottles don't have orientations. We'll stick to surfaces with orientations for now, and denote by -S the same surface as S but with the opposite orientation. Now consider a function h that takes a surface with orientation and spits out a number such that h(-S) = -h(S). We call h a 2-form.

If a point is on the edge of S, then the orientation of the surface gives us an orientation on the boundary: it's the one that we get from picking the direction that goes off of S. We denote the union of all of the boundary pieces of S, with the orientations given by the orientation of S, by δ(S). Note that δ(-S) = -δ(S). Compare to the case of a path s, where δ(s) is q + -p. Here the inherited orientation is toward q, which we take to be positive, and away from p, which we take to be negative. Now Stokes' theorem as you see it in multivariable calculus classes is the statement that there is an operator d from 1-forms to 2-forms such that for a 1-form g, dg(S) = g(δ(S)).

Let V be a 3-dimensional object with an orientation. Suppose you pick a point on V and a direction from that point. Slicing V perpendicular to that direction yields a surface, and the orientation of V then gives an orientation of the surface. Note that if you start at the same point and pick the opposite direction, slicing gives you the same surface, but the same orientation of V now gives the opposite orientation on the surface. Again, not all 3-dimensional objects have orientations, but we'll assume that V does have one for now and denote by -V the same 3-dimensional object as V, but with the opposite orientation. Now consider a function f that takes a 3-dimensional object with orientation and spits out a number such that f(-V) = -f(V). We call f a 3-form. If a point is on the edge of V, then the orientation of V gives us an orientation on the boundary: it's the one that we get from picking the direction that goes off of V. We denote the union of all of the boundary pieces of V, with the orientations given by the orientation of V, by δ(V). Note that δ(-V) = -δ(V). Then we say that there is an operator d from 2-forms to 3-forms such that for a 2-form h, dh(V) = h(δ(V)).

I think at this point we can switch to the general case. Generalized Stokes' theorem: Consider an n-dimensional object M with orientation. We can consider the boundary δ(M), and the generalized Stokes' theorem says that there exists an operator d from (n-1)-forms to n-forms such that for an (n-1)-form f, df(M) = f(δ(M)). Moreover, Stokes' theorem tells us what d looks like, and for all n it looks essentially the same. Note that I haven't said what it looks like at all, because that would require a bit more machinery than I really care to give here.

For those of us who have taken vector calculus but not differential geometry, you might object that d doesn't look the same for dimensions 1, 2 and 3; for dimension 1 we have the gradient, for dimension 2 we have the curl for Stokes' and Green's theorems, and for dimension 3 we have the divergence for the Divergence theorem. Mostly, that's an artifact of the way that vector calculus is taught: in terms of vector fields in R^3. A slight change of language to covectors and differential forms yields a uniform expression for d.

There is also a bunch more information contained in d and δ. For instance, ddf = 0 for all forms f, in that we get the function that always returns 0 no matter what you put into it.
Conversely, δ(δ(M)) is the empty set; if M is a ball in 3-space, then δ(M) is the 2-sphere enclosing said ball, and δ(δ(M)) is empty. This ties d and δ to (co)homology, which is a powerful tool that can give us a lot of information about the geometry and topology of an object via finite amounts of algebraic data. For instance, you can distinguish between a sphere and a torus by looking at 1-forms on each; for a sphere, if df = 0, then f = dg for some 0-form g, but for a torus that's not the case; conversely, every loop drawn on the sphere that doesn't cross itself is the boundary of some chunk of the sphere, but that's not true for the torus.

Anyway, I liked this theorem a lot because it links geometry/topology and calculus in a fairly concrete way (well, if I were to describe d more explicitly it would be concrete) and was a massive generalization of something I was familiar with at the time that I learned it. Also, d and δ play nicely with maps between manifolds, so we can prove things in R^n and then paste pieces together.

*We need some constraints on f, in particular differentiability. So we'll just assume that everything can be differentiated.
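Since the low-dimensional cases are just multivariable calculus, dg(S) = g(δ(S)) can be checked numerically. Here's a sketch (numpy/scipy; the particular 1-form is an arbitrary one I picked) verifying Green's theorem on the unit square:

```python
# Check dg(S) = g(boundary(S)) for a 1-form g = P dx + Q dy on the
# unit square, where dg corresponds to (dQ/dx - dP/dy) dx dy.
from scipy.integrate import quad, dblquad

P = lambda x, y: -y**2   # an arbitrary smooth 1-form
Q = lambda x, y: x * y

# g(boundary): integrate P dx + Q dy counterclockwise around the square
bottom = quad(lambda x: P(x, 0), 0, 1)[0]
right  = quad(lambda y: Q(1, y), 0, 1)[0]
top    = -quad(lambda x: P(x, 1), 0, 1)[0]  # traversed right to left
left   = -quad(lambda y: Q(0, y), 0, 1)[0]  # traversed top to bottom
print(bottom + right + top + left)          # 1.5

# dg(S): integrate dQ/dx - dP/dy = y - (-2y) = 3y over the interior.
# scipy's dblquad wants the integrand as f(y, x).
print(dblquad(lambda y, x: 3 * y, 0, 1, 0, 1)[0])  # 1.5 again
```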
My favorite theorem at the moment goes by the name of Pontryagin duality. Like with generalized Stokes' theorem, there is a version that is more familiar to those who haven't taken graduate-level courses in mathematics: the Fourier transform.

The Fourier transform is cool because it takes a wave, for instance a sound, and transforms the description of the wave from amplitude-as-a-function-of-time to amplitude-as-a-function-of-frequency. For instance, if someone were to press a bunch of keys on a piano at the same time, the result that comes out is a soundwave that we usually view as a wiggly line where the independent variable is time. The piano translates a bunch of pitches into a soundwave. The Fourier transform allows us to go backwards, to translate that soundwave into pitches.

So what's going on mathematically? The key object in Pontryagin duality is the circle of unit-length complex numbers, which I'll call S^1. Consider an abelian group G, by which I mean a group for which ab = ba for all a and b in the group. For instance, the positive real numbers form an abelian group, because multiplication of real numbers is commutative. Now we want to consider the set of homomorphisms from G to S^1, i.e. all the maps χ from G to S^1 such that χ(a)χ(b) = χ(ab), and such that the map doesn't "jump" except when G has a hole. We call such maps "characters" and denote the set of all characters by G^; the ^ is supposed to be on top of the G but I'm too lazy to figure out how to do that.

Now suppose you had a complex-valued function* of G, call it f. The claim is that the function can be turned into a complex-valued function of G^. How? As an example, let G be (Z_6, +), the integers modulo 6 with addition as the group operation. Then a function of G is just a list of 6 numbers. What is G^? It turns out that G^ also has 6 elements: for k between 0 and 5 inclusive, the map n -> e^(2πikn/6) is an element of G^, and these are all of them. We want to turn a function f of G into a function f^ of G^, which we define by the following:

f^(χ) = (1/√|G|) sum_(g in G) f(g)χ(g) = (1/√6)( f(0)χ(0) + f(1)χ(1) + ... + f(5)χ(5) )

Before anyone gets the wrong idea, it is not always the case that G^ looks like G. If we took G to be (S^1, *), the complex unit circle with complex multiplication as the group operation, then G^ ends up being the integers, based on how many times a homomorphism wraps G around S^1. In particular, our homomorphisms are all of the form χ_n(q) = q^n. In this case, we get that our f^ functions look like

f^(χ) = (1/√|S^1|) int_(g in S^1) f(g)χ(g) dg, where |S^1| = int_(g in S^1) dg

Most engineers and physicists will be more used to seeing

f^(w) = (1/√(2π)) int_0^(2π) f(t)e^(iwt) dt

Anyway, we want to distinguish between G and G^, even when they look the same, because we want to be able to talk about the cases where they're different. Now the interesting part is that G^ is also an abelian group. How do we multiply two elements, χ and φ? We define χφ by saying that (χφ)(g) = χ(g)φ(g), and you can check that this gives another homomorphism in G^. You can also check that the other group axioms hold. The point is that we can then ask what G^^ is. We can guess for (Z_6, +) that (Z_6, +)^^ = (Z_6, +), since (Z_6, +)^ = (Z_6, +) anyway. But what about (S^1, *)^^? In particular, (S^1, *)^ = (Z, +), so (S^1, *)^^ = (Z, +)^ = ? It turns out that (S^1, *)^^ = (Z, +)^ = (S^1, *). This is generally true of any abelian locally-compact topological group.
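Here's that (Z_6, +) example as a few lines of numpy (my own variable names; f is just an arbitrary list of 6 numbers):

```python
# The transform on G = (Z_6, +): characters are chi_k(n) = e^(2*pi*i*k*n/6),
# and f^(chi_k) = (1/sqrt(6)) * sum over n of f(n) chi_k(n).
import numpy as np

G = 6
f = np.array([1.0, 2.0, 0.0, -1.0, 3.0, 5.0])  # an arbitrary function on Z_6

def chi(k, n):
    return np.exp(2j * np.pi * k * n / G)

f_hat = np.array([sum(f[n] * chi(k, n) for n in range(G)) for k in range(G)])
f_hat /= np.sqrt(G)
print(np.round(f_hat, 3))  # a function on G^, one value per character
```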
Nevermind what the "locally-compact" and "topological" bits mean if you don't already know; they're complicated things that we don't need to care about here beyond the ability to "integrate" in a nice fashion over the group. For discrete groups like Z or Z_6, integration is just summation, so I'll write int from now on.

Since G^^ = G, we should be able to turn f^ into a function f^^ via roughly the same method as before. And since G^^ = G, we might expect that f^^ = f. But there's a slight twist! In particular, if we were to just do things naively, we'd write

f^^(g) = (1/√|G^|) int_(χ in G^) f^(χ)g(χ) dχ

But what is g(χ)? If we just take g(χ) to be χ(g), then we end up with f^^(g) = f(g^(-1)), not f(g)! Those of you familiar with Fourier stuff will recall that the inverse Fourier transform has an additional minus sign in it to deal with this issue: instead of taking g(χ) = χ(g), we instead go with g(χ) = χ(g)^(-1). This is the other reason why we want to distinguish between G and G^, because otherwise we'd get that funny inversion when we try to build f^^.

Anyway, that's Pontryagin duality: for G an abelian locally-compact topological group, G^ is also an abelian locally-compact topological group, and we have a "Fourier transform" from complex-valued functions on G to complex-valued functions on G^, and we have an "inverse Fourier transform" that looks like the Fourier transform with a sign flipped.**

Now for the engineers in the audience, you might be wondering about Fourier series, as those involve sines and cosines, not unit complex numbers. Except that sines and cosines do involve unit complex numbers, since sin(t) = (e^(it) - e^(-it))/(2i) and cos(t) = (e^(it) + e^(-it))/2. If f is a real-valued function of the unit circle, i.e. a periodic function of the real numbers, then f^(χ_(-n)) is the complex conjugate of f^(χ_n), and hence separating out the real and imaginary parts gives us that f^(χ_n)χ_n(t) + f^(χ_(-n))χ_(-n)(t) = a_n cos(nt) + b_n sin(nt) for some a_n and b_n. Hence the usual Fourier series in terms of sines and cosines.

Perhaps interestingly, if we drop the locally-compact topological condition, then the notion of the Fourier transform via integration falls apart. If we instead drop the abelian condition, then we can still define G^, but then it usually fails to be a group at all. However, there still is enough information to make an "inverse Fourier transform" in some cases.

* We need some somewhat technical conditions on what kinds of functions so that everything is well-defined and doesn't go off to infinity accidentally.

** Note that if we don't flip the sign, then we at least still get that f^^^^ = f. This has led some people, including Professor John Baez from whom I steal material shamelessly, to comment that the Fourier transform is like a second-quantized version of multiplication by i. I may someday explain what this means.
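The twist above is easy to watch happen in the same Z_6 example (again numpy, my own arrangement): applying the transform twice with the naive g(χ) = χ(g) hands back f flipped, while using χ(g)^(-1), which for unit complex numbers is just the conjugate, recovers f itself.

```python
# Naive double transform gives f(-g); conjugating in the second pass
# (the actual inverse transform) gives f back.
import numpy as np

G = 6
f = np.array([1.0, 2.0, 0.0, -1.0, 3.0, 5.0])
chi = lambda k, n: np.exp(2j * np.pi * k * n / G)

def transform(h):
    return np.array([sum(h[n] * chi(k, n) for n in range(G))
                     for k in range(G)]) / np.sqrt(G)

def inverse(h):
    # chi(g)^(-1) = conjugate of chi(g), since |chi(g)| = 1
    return np.array([sum(h[k] * chi(k, n).conjugate() for k in range(G))
                     for n in range(G)]) / np.sqrt(G)

print(np.round(transform(transform(f)).real, 10))  # [1, 5, 3, -1, 0, 2] = f(-g)
print(np.round(inverse(transform(f)).real, 10))    # [1, 2, 0, -1, 3, 5] = f
```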
There's another way to look at Pontryagin duality that I wish I had come across in my math education, because it's kind of cool and not really emphasized enough, perhaps because people think it's too applied or whatever nonsense.

So suppose you have a polynomial in one variable, call it P. If you know the coefficients of P, then you can pick a number t and evaluate P(t). Now suppose I have a polynomial, and I don't tell you the coefficients, but if you give me a number t I will tell you what P(t) is, for as many different values of t as you want. Unfortunately, no matter how many values of P(t) I tell you, if I don't give you any information other than those evaluations then you won't be able to figure out the coefficients of P without asking me an infinite number of questions. Alas. However, if I tell you the degree of P, then you know how many evaluations you need. In particular, if P is something of the form

P(x) = p_n x^n + p_(n-1) x^(n-1) + ... + p_1 x + p_0

then asking me for the value of P at n+1 different values t_0, t_1, ..., t_(n-1), t_n will give you enough information to figure out P. But doing so can be kind of annoying, because it involves some linear algebra, and not all sets of {t_i} are as nice to deal with as others.

So which values should we use? Well, since we're talking about Pontryagin duality, let's use values in S^1. And in fact, since we want n+1 different values of t, let's use (n+1)-st roots of unity, i.e. numbers of the form e^(2πik/(n+1)). We call this set G_(n+1). Why use this set? Because (e^(2πik/(n+1)))^m is e^(2πikm/(n+1)), and so we don't need to do a lot of computation of powers, just some shuffling. Also, (G_(n+1), *) is an abelian locally-compact topological group, so Pontryagin duality deals with a bunch of the linear algebra for us. The maps x -> x^m form G_(n+1)^ as m wanders from 0 to n, so we can view the map m -> p_m as a function of G_(n+1)^. Hence Pontryagin duality tells us exactly how to compute the coefficients of P from the evaluations.

Similarly, suppose we had a polynomial in two variables, P(x, y) = p_(n1,n2) x^(n1) y^(n2) + ... Then we just need to take the group G = {(a, b) | a in G_(n1+1) and b in G_(n2+1)} and its characters G^, which are all of the form (a, b) -> a^(m1) b^(m2), and apply Pontryagin duality to tell us how to find the coefficients from the evaluations.

Anyway, this is called the Discrete Fourier Transform by engineers and people working in signal processing and the like, and it's liked because it doesn't require any integration, and a lot of the necessary pieces can be precomputed before you feed in any data, as long as you've not underestimated the degree of your polynomial (if you've overestimated, you'll get a bunch of 0s for the higher-degree coefficients, but the technique itself still works).

To be honest, I found character theory to be incredibly boring before I found out about Pontryagin duality, but that's because characters take a lot of the geometric fun out of the representations of nonabelian groups, although there's still a bit of combinatorial charm in the case of finite nonabelian groups.
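Numpy's FFT routines implement exactly this coefficients-from-evaluations trick, up to normalization conventions. A quick sketch with a polynomial I made up:

```python
# Recover a polynomial's coefficients from its values at roots of unity.
import numpy as np

coeffs = np.array([4.0, -1.0, 0.0, 2.0])  # P(x) = 4 - x + 2x^3, lowest degree first
n = len(coeffs)

# Evaluate P at the n-th roots of unity e^(2*pi*i*k/n)
roots = np.exp(2j * np.pi * np.arange(n) / n)
values = np.array([np.polyval(coeffs[::-1], r) for r in roots])

# np.fft.fft(values)[m] = sum over k of values[k] e^(-2*pi*i*k*m/n) = n * p_m
recovered = np.fft.fft(values) / n
print(np.round(recovered.real, 10))  # [4, -1, 0, 2]
```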
Banananananach-Tarski Paradox*

*nananananananana**

There's an old mathematician's joke along the lines of "What's an anagram of Banach-Tarski? Banach-Tarski Banach-Tarski." If this doesn't make sense at the moment, hopefully it will by the end. Note that I said "make sense", not "be funny"; some mathematicians are good comedians, some are not.

Anyway! One of the Great Debates in mathematics (well, not really) is about the Axiom of Choice. The idea is simple: given a bunch of nonempty sets, you can choose one thing from each set, and make a set out of those chosen things. This sounds pretty reasonable, right? And it's very useful: instead of having to explicitly pick a representative from each set, you just say "Axiom of Choice!" and boom, you have exactly one element from each set to act as a stand-in for that set. But there are problems. Most of the time, the issue hinges on the word "bunch", because if you allow for large enough bunches, you can get weird things. For instance, it allows you to have your cake and eat it too.

Suppose you have a set sitting in three-dimensional space. Rotating the set and sliding the set around (translation) without changing the relative positions of the points of the set shouldn't change the size of the set, where here I'm taking "size" to mean volume. Keep this in mind.

Now suppose you have a solid ball. You can rotate this ball around any axis that passes through the center of the ball, and as a set of points it remains unchanged, although the individual points move around. Certainly it doesn't change the volume of the ball. Pick a weird angle, i.e. some irrational multiple of π radians. I'll just use θ to denote, say, arccos(1/3). Let A be a rotation by θ around the x axis, with B being a rotation by θ in the other direction, and let C be a rotation by θ around the y axis, with D being a rotation by θ in the other direction.

Claim: if you never have A next to B and never have C next to D, then no sequence of As, Bs, Cs or Ds yields the same rotation as any other sequence. I'm not going to prove this, but it basically hinges on θ being an irrational multiple of π. The rules about what can follow what get rid of the obvious redundancies, since AB is just rotating and then rotating back immediately, and similarly for BA, CD, and DC.

Let F be the set of all finite sequences of As, Bs, Cs, and Ds, with the above rules about what can be next to what, along with the empty sequence e that just means "don't rotate at all". Note: due to notation, we'll actually be writing our sequences backwards, so when we say BC, we mean "do C, then do B". So the start of a sequence is on the right and the end is on the left. Let F_A be the subset of F that ends in A, and similarly for F_B, F_C, and F_D. Then the union of these four sets and the singleton {e} make up all of F.

However! Consider AF_B, by which we mean take the elements of F_B and apply A to them. So BCBDB becomes ABCBDB, which is equal to CBDB due to AB canceling out. Since anything in F_B always ends in a B (on the left), there's always something to cancel. Moreover, for anything in F_B, the second-to-last letter in the sequence isn't an A, so AF_B consists of everything that doesn't end in an A. Hence the union of AF_B and F_A is all of F. Similarly, the union of CF_D and F_C is all of F.

Now consider the surface of the ball. If we pick a point p on the surface of the ball, we can look at F(p), by which we mean the points of the form f(p) where f is in F. But this is only a countable subset of the surface.
So pick another point q and look at F(q). But the union of the sets F(p) and F(q) is still only countable, so keep going. If we accept the Axiom of Choice, we can pick a subset T of the surface of the ball so that every point on the surface lies in F(t) for exactly one t in T. Then extend T so that for each point t in T, we also include all the points "below" t leading to the center of the ball. Ignore the center itself for the moment.

Now we split the ball a second way. First, we take T. We also take F_A(T), by which we mean the points you get by starting with some t in T, applying anything in F unless it ends in B, and then applying A. Similarly, we can look at F_B(T), F_C(T), and F_D(T), coming from sequences that end in B, C and D respectively. Since every sequence is either empty or ends with one of A, B, C or D, the sets T, F_A(T), F_B(T), F_C(T), and F_D(T) make up the entire surface, and are also completely disjoint: no element in one set is also in some other set.

Now define T_1 as the union of F_A(T) with T and S, where S is the union of B(T) and BB(T) and BBB(T), etc. Let T_2 be F_B(T) minus anything from S. Let T_3 be F_C(T), and let T_4 be F_D(T). Now note that A(T_2), i.e. the set of points of the form A(t) for t in T_2, contains all of the points from T_2, T_3 and T_4. Similarly, C(T_4) contains all the points from T_1, T_2 and T_4. So by rotating the sets T_2 and T_4, we've gained an extra copy of everything. A little bit of translating our new pieces around then yields two copies of the original ball, the same size as the original ball, despite the fact that all we've done is rotations and translations, neither of which changes the sizes of anything.

Of course, we're missing the center point, which we never duplicated, but there are ways to get a single extra point without changing any volumes or anything; after all, a point has no volume, so we can basically take it from anywhere we want. There are a few other troublesome points, but they also add up to a set with total volume 0, so who cares.

So that's the Banach-Tarski paradox: given a ball, we can make two balls of the same size using non-size-changing operations.

So what happened here? Well, the problem is in our set T. The Axiom of Choice tells us that T should exist, but it doesn't guarantee it any nice properties besides being a set. And indeed, it doesn't have many nice properties. One thing it lacks is a meaningful size. Nor do our pieces T_1, T_2, T_3, or T_4 have meaningful sizes. So while it's all well and good to say that rotations and translations don't change size, we're sidestepping that altogether. We say that these sets are nonmeasurable, and that's one of the main arguments against the Axiom of Choice: if we have the Axiom of Choice, we can make nonmeasurable sets, and that allows us to do nonsense like this.

Of course there are mathematicians who love nonsense like this. Most mathematicians simply don't care; they either don't care about measure at all or can safely assume that everything they look at is measurable. But it's always nice to be able to point to places where seemingly nice things go horribly wrong.

**nanananananananananana
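The combinatorial heart of the argument, that F_A together with A(F_B) covers all of F, happens entirely at the level of words, so that part can actually be checked by machine (the geometry and the choice of T, of course, cannot). A sketch in Python, with my own encoding of words as strings whose leftmost letter is the "end":

```python
# Reduced words in A, B, C, D (B = A^(-1), D = C^(-1)): check that
# every word either ends in A or is A applied to a word ending in B.
from itertools import product

INVERSE = {'A': 'B', 'B': 'A', 'C': 'D', 'D': 'C'}

def reduced_words(max_len):
    words = {''}  # the empty sequence e
    for length in range(1, max_len + 1):
        for w in product('ABCD', repeat=length):
            # reduced: no letter sits next to its inverse
            if all(w[i] != INVERSE[w[i + 1]] for i in range(length - 1)):
                words.add(''.join(w))
    return words

def apply(letter, word):
    # prepend a rotation, cancelling if it meets its own inverse
    if word and word[0] == INVERSE[letter]:
        return word[1:]
    return letter + word

F = reduced_words(5)
F_A = {w for w in F if w.startswith('A')}  # "ends in A", leftmost = end
F_B = {w for w in F if w.startswith('B')}
A_F_B = {apply('A', w) for w in F_B}

# Up to our length cutoff (prepending A can shorten a word, so compare
# against words of length <= 4), F_A and A(F_B) together cover everything:
short = {w for w in F if len(w) <= 4}
print(short <= (F_A | A_F_B))  # True
```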
Isn't "size" pretty broken as a set concept, anyway? I remember a lesson from an undergrad course about how there's no way to define length for an arbitrary set of real numbers, even though they're all on a line and intuitively we should be able to measure the length of line segments. Or did that argument also require the Axiom of Choice? ETA, I thought the joke was funny. *shameface*
Yeah, that was another example of the Axiom of Choice making things difficult. Any explicitly constructible set can be assigned a measure. You need something nonconstructive, like the Axiom of Choice, to make things like the set of representatives of the equivalence classes of R/Q.
Because they're so simple to define, I'm going to define a groupoid here, and then everyone who understands the explanation will know a thing that most mathematicians don't even hear about until graduate school. Which is also to say: don't feel bad if you don't get it; most mathematicians don't get it until graduate school, if that.

A groupoid is a way to generalize the notion of a group to situations where things are allowed to change, but only in a reversible manner. Think of two sets {a, b, c} and {1, 2, 3}. We have bijections between the two sets, i.e. maps from the first set to the second that send each of a, b and c to different numbers, or maps from the second set to the first that send each of 1, 2 and 3 to different letters. So the map f that sends a to 1, b to 2 and c to 3 is a bijection, and the map g that sends a to 2, b to 3 and c to 1 is a bijection. We also have a map f^(-1) that sends 1 to a, 2 to b and 3 to c, and g^(-1) that sends 1 to c, 2 to a and 3 to b. We also have a map fg^(-1) that sends 1 to c and then to 3, 2 to a and then to 1, and 3 to b and then to 2. And so on. We end up with twelve maps total: six from {a, b, c} to {1, 2, 3}, and six in the other direction, and for each map in one direction we have a map in the other direction that reverses the first map. Finally, we stick in twelve more maps: six from {a, b, c} to itself and six from {1, 2, 3} to itself, such that a, b and c each get sent to different letters, and 1, 2 and 3 each get sent to different numbers.

So note that we have something like a group, except instead of maps from one set to itself, we're looking at maps between two different sets. So we have multiple identity maps, one for each set, and composition doesn't always make sense, because what is fg? g(a) is 2, but we can't apply f to 2; we can only apply f to a, b or c. But when composition does make sense, the resulting composite is again a map that we're considering. Also, for every map there is an inverse map, and the composite of a map and its inverse (which always makes sense) is an identity map for some set. And finally, when composition of three maps makes sense, the composition is associative. That's a groupoid. Usually instead of "set" we say "object" and instead of "map" we say "morphism", because we want to remove anything that might be too distracting or ping our intuition in the wrong directions.

Here's another example of a groupoid: consider a bunch of islands and bridges between the islands. You don't really care about what's on each island, you just care about how they're connected by bridges, because you're Euler or something. An object here is an island. A morphism is a path from one island to another, possibly passing through islands along the way. Also, if you cross a bridge and then immediately backtrack over it, that crossing (and the backtracking) doesn't count as part of the path; if you cross over a bridge and then a second, and then backtrack across both bridges, those two crossings (and two backtrackings) don't count, and so on. However, if you start on an island and make a loop around back to that island without ever backtracking along any bridge, that counts as a path that isn't just staying on the starting island.

If you're more of a programmer or an engineer, think of a machine and its possible states, and the processes that lead your machine from one state to another but only in ways that can be reversed by another process.
So a groupoid is like a group, except you're allowed to end up somewhere different than where you started, as long as you can get back to your starting place. Note: it is not generally the case that you can get from any object of a groupoid to any other object of the groupoid; you could have pairs of islands with no bridges between them at all. You can have the groupoid whose objects are {a, b, c} and {1, 2, 3, 4}, and there are no bijections between those two sets; the groupoid would only have morphisms from {a, b, c} to itself and {1, 2, 3, 4} to itself and no morphisms between them.

Anyway, that's what a groupoid is. Hopefully that isn't too bad. Which is entirely the wrong way to talk about groupoids, because groupoids are fundamentally part of category theory, whose main lesson is that what a thing is is unimportant; the important bit is how that thing relates to other things of the same type, and I can't explain how groupoids relate to one another in a single post. Nor can I really explain why we should care about groupoids, except to say that the island-bridges example is essentially the first reason: to be able to talk about paths in a collection of points connected by lines in a sufficiently general but precise way.
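Since we're being programmer-friendly: the two-sets example above fits in a few lines of Python (a sketch with my own class names, not any standard library):

```python
# The groupoid of bijections between {a, b, c} and {1, 2, 3}:
# morphisms know their source and target objects, and composition is
# only defined when the objects match up.
class Morphism:
    def __init__(self, source, target, mapping):
        self.source, self.target = source, target
        self.mapping = dict(mapping)

    def __matmul__(self, other):
        # f @ g means "do g, then f", like f(g(x))
        if other.target != self.source:
            raise TypeError("composition undefined: objects don't match")
        return Morphism(other.source, self.target,
                        {x: self.mapping[y] for x, y in other.mapping.items()})

    def inverse(self):
        return Morphism(self.target, self.source,
                        {v: k for k, v in self.mapping.items()})

letters, numbers = frozenset('abc'), frozenset({1, 2, 3})
f = Morphism(letters, numbers, {'a': 1, 'b': 2, 'c': 3})
g = Morphism(letters, numbers, {'a': 2, 'b': 3, 'c': 1})

print((f @ g.inverse()).mapping)  # {2: 1, 3: 2, 1: 3}: the map fg^(-1) above
print((f @ f.inverse()).mapping)  # the identity map on {1, 2, 3}
try:
    f @ g                         # what is fg? undefined, as in the post
except TypeError as e:
    print(e)
```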
@Exohedron howdy, d'you think i could get some help with a differential calculus problem? :O i get about halfway through and then get an integral that, according to the internet, is solved with some gamma integration thing that has not even been touched upon in class and is completely foreign to me, leading me to believe that im doing something incorrect along the way Spoiler: problem if ur interested solve the following IVP: y' - (x^2)y = 7xy y(1) = 1 then from that problem i did this thing i tried to do that last integral using integration by parts but that didnt work, so i googled it and got ??? who knows help would be appreciated, also anyone else who has done differential calc ur input will be v appreciate as well ;o;
The thread for help with math problems is here, but I think the best strategy for that problem is separation of variables. Have you covered that in class yet? It's the one where you rewrite y' as dy/dx and then get all the x and dx terms on one side and the y and dy terms on the other, and then integrate both sides.
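And once you've done the integration by hand, sympy will happily check the final answer for you (a quick sketch; dsolve does the whole problem at once, so no peeking until you've tried it yourself):

```python
# Check the IVP y' - x^2 y = 7xy, y(1) = 1 via sympy.
import sympy as sp

x = sp.symbols('x')
y = sp.Function('y')

ode = sp.Eq(y(x).diff(x) - x**2 * y(x), 7 * x * y(x))
print(sp.dsolve(ode, y(x), ics={y(1): 1}))
# Should be equivalent to y(x) = exp(x**3/3 + 7*x**2/2 - 23/6)
```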
oh whoops my bad maybe ill crosspost there, and yes ive covered that! ill give that a shot, i hadnt thought of it that way because the rest of the problem set was all the stuff with the rho substitution :P edit: IT TOTALLY WORKS i feel kinda silly for not looking for that, thanks!!
Diffeqs are an area of math that I consider Hard, in that there's usually no general formula for solving problems and it's really hard to figure out which problems the formulas we do have will work for. For instance, if we had y' - x^2y = 7xy^2, the problem instantly becomes significantly harder.