Möbius, Euler and Dirichlet series

 trapjaw_ant_amazonAn ant in our community has taken a trip on a hyperbolic tree, which on the large scale looks like a pair of pants. In his struggle to understand the combinatorics of very long geodesics at its surface, he is encountering monstruous looking formulas involving arithmetical beasts. Let’s help him identify those creatures.

Our journey starts with Eulers totient function $\phi(n)$ which counts the number of positive intergers $m$ smaller and coprime with $n$.

This function satisfies pleasing formulas like $\sum_{d|n}{\phi(d)}=n$ where the sum is on all positive divisors of $n$. This can be seen by writting down all the fractions between $\frac{1}{n}$ and $\frac{n}{n}$ with denominator $n$, and then reducing them. Partitionning them by the values $d$ of their denominators, we shall get $\phi(d)$ fractions with lower denominator $d$. Can you make this rigorous ?

Another nice formula is $\phi(n)=n\prod_{p|n}{(1-p^{-1})}$ where the product is on prime factors of $n$. This can be understood working on the finite probability space $[1,n]$ with uniform distribution, and noticing that being coprime with $n$ is equivalent to being divisible by none of its prime divisors. So the probability of picking a coprime element with $n$ can be computed in two ways. It is by definition $\frac{\phi(n)}{n}$. And it is also the product over $n$’s prime divisors of the probabilities of not being divisible by $p$. But the probability of being divisible by $p$ is the number of multiples of $p$ in $[1,n]$ divided by $n$. Putting all this together yields the formula. One must just be carefull that we have supposed independance of the events “being divisible by $p$” for each prime divisor of $p$. Why is this true ?

This so called Euler totient function is quite a fascinating insect. The tree we are on is full of them, we might encounter some more later on…


The next creature on our path is Riemann’s zeta function $\zeta(s)=\sum_{n>0}\frac{1}{n^s}$. It is named after Riemann because he was interested in its behaviour on the complex plane but we shall only be considering $s$ real. Euler had already found a famous product formula: $$\zeta(s)=\prod_{p}{\sum_{k \geq 0}{p^{-ks}}} = \prod_{p}{\frac{1}{(1-p^{-s})}}$$

The second equality is just rewritting the geometric sums, but the first equality is crucial, and it is essentially equivalent to Gauss’s therem of unique decomposition of intergers into a product of prime powers. A good way to get the feeling of how it works is to try and expand the first few terms on the right side:

$$ (1+2^{-s}+2^{-2s}+\dots) \cdot (1+3^{-s}+3^{-2s}+\dots) \cdot (1+5^{-s}+5^{-2s}+\dots) \cdot \dots $$

Ignoring the $s$ for now, you have to choose one term in each parenthesis (the power of the prime). If the sequence of powers stagnates to zero then you get a term which is the inverse of the intger with associated prime power decomposition (to the power $s$). If the sequence doesn’t stagnate with zero powers then you get the inverse of infinity, which is zero. It is actually more subtle than this and one should be carefull about the convergence of series, but if $s$ has real part greater than $1$ then one can prove the product formula along those lines.

Before we introduce the next creature on our path, lets refer to our pocket bestiary which will enable us to better understand the features of all the Tree’s inhabitants.

dirichlet series

Arithmetical functions are maps on the positive integers and we shall suppose thay take integer or real values. The Euler totient function is one of them, but you can define many other intersting ones like the constant function equal to one denoted $\mathbb{1}$, or the function which counts the number of divisors of an integer: $\sigma_{0}(n)=\sum_{d|n}{d^{0}}$. All those functions $f$ satisfy a tremenduous amount of curious identities and a good way of combining all those identities is to look at their Dirichlet series: $F(s)=\sum_{n>0}{\frac{f(n)}{n^{s}}}$. Note that Riemann’s zeta function is the Dirichlet series of $\mathbb{1}$, and that the Dirichlet series of the arithmetic function $\delta(n)$ which returns $1$ if $n=1$ and $0$ otherwise is the constant function of $s$ equal to $\Delta(s)=1$.

The sum of Dirichlet series simply corresponds to the sum of the associated arithmetic functions. The product of Dirichlet series $H=F.G$ induces an convoluted product on the corresponding arithmetical functions:  $h(n)=(f\star g)(n)=\sum_{d|n}{f(d).g(n/d)}$. This at first seemingly awkward operation will enable to unify the variety of number theoretic identities. The philosophy is the following: a polynomial identity (like a product) on the Dirichlet series gives a convolution identity in the arithmetical functions. And as we shall see, it is sometimes straigthforward to get such formulas on the series, whereas the corresponding equality in terms of arithmetic functions is rather unnatural.

Let’s call an arithmetic function $f$ multiplicative if $f(mn)=f(m)f(n)$ when $m \wedge n =1$ (meaning $m$ and $n$ are coprime) and strongly multiplicative if this is true even without the coprimality assumption. In the latter case, one has a factorisation of its Dirichlet series into an Euler product, but in the former only the first equality holds:

$$F(s)=\prod_{p}{\sum_{k \geq 0}{\frac{f(p^{k})}{p^{-ks}}}} = \prod_{p}{\frac{1}{(1-f(p).p^{-s})}}$$

But this Euler product formula is also a way to define multiplicative functions, and obtain immediately its generating series. One just has to choose any values on the powers of primes and then declare it’s multiplicative. We can now introduce ourselves to the next monster on our path.


Möbius function $\mu$ can be defined on the powers of prime numbers: $\mu(p^{k})$ is one if $k$ is zero or one, and zero otherwise, and then declaring it to be multiplicative. One has immediately access to its Dirichlet function $M$ by the afforementionned process and we discover that it is the inverse of Riemann’s zeta function ! So $M.\zeta=1$ which translates to $\sum_{d|n}{\mu(d)}=\delta$.

The same process gives Möbius inversion formula. Suppose two arithmetic functions $f$ and $g$ are related by $g(n)=\sum_{d|n}{f(d)}$ then we can inverse this formula to get $f(n)=\sum_{d|n}{g(d).\mu(n/d)}$.



We are now up on a high branch of our tree, so lets admire the view and look back upon the trail we made. Remember the first formula about Euler’s totient function. Can you spot a convolution product to find that its Dirichlet series is $\frac{\zeta(s-1)}{\zeta(s)}$ ? What does that say on the eventual (strong) multiplicativity of $\phi$ ?

Here are a few questions to conclude. Can you prove that the Dirichlet series of $\sigma_{0}$ is $\zeta^{2}$ ? Can you find the generating series of $\sigma_{k}(n)=\sum_{d|n}{d^{k}}$ ?


Impossible Geodesics on a Pair of Pants

The ways in which a geodesic can intersect itself on a pair of pants is restricted in the sense of what polygons can be formed by the intersections. This post will review the geometry background Professor Anthony Phillips provided and the example we covered.

Background Geometry. The main idea behind the impossible geodesics is based on the fact that, in hyperbolic space, the interior angles of a triangle sum up to less than π (this is shown in example 4). We will prove this as a consequence of the Gauss-Bonnet Theorem. The wikipedia page doesn’t discuss what happens when our boundary is piecewise smooth and consists of curves C1….Cn. In that case the theorem looks somewhat complicated but we will review the terminology necessary and reduce this equation to a much nicer one for the pair of pants:

iθi +  ∑iCikg(s)ds +  ∫∫dA = 2πχ(R)

Where R is our region, θi  is the angle between the curves Ci and Ci+1 at the point where they meet (analogous to the exterior angle of a polygon), ∫Cikg(s)ds is the total geodesic curvature over the curve Ci, and ∫∫dA is the total curvature of R.

The curvature is a measure of how a surface bends, but the formal definition gets particularly cumbersome and we won’t need it. A patch of a surface with positive curvature will resemble a sphere, of zero curvature the flat plane, and of negative curvature a saddle shape. Note that since the metric we consider on the pair of pants is one of constant negative curvature of -1, we have ∫∫dA = -Area(R).

A triangulation of R is family of triangles on R such that their union is R and the intersection of any two triangles is either empty, a common vertex, or a common edge. The Euler characteristic is the number χ(R) = F – E + V.  Triangulating the disk we find that χ(R) = 1, so the right side of the equation becomes 2π.

Intuitively speaking, the geodesic curvature measure how much a curve on a surface fails to be a geodesic. A bit more precisely, a curve on a surface in 3 has an acceleration vector, which can be written as a linear combination of the tangent plane basis and the normal vector to the surface. If the curve is a geodesic, then the acceleration is contained in the normal component. The geodesic curvature is a measure, vaguely speaking, of how much of the acceleration is on the tangent plane (i.e. how much of it is on the actual surface). A geodesic has no tangential acceleration, so kg(s) = 0 for every s if the curve is a geodesic. Since our geodesic configuration is (supposedly) a geodesic, ∫Cikg(s)ds = 0 for every i. So, the Gauss-Bonnet Theorem for a region of constant curvature and piecewise geodesic boundary becomes:

iθi + K·Area(R) = 2π

For a more precise treatment and formal definitions of curvature, geodesic curvature, Euler characteristic on surfaces, see Do Carmo’s Differential Geometry of Curves and Surfaces.

Example 1. Consider an n-sided polygon in the euclidean plane. The plane has constant K = 0, so the Gauss-Bonnet Theorem says ∑iθi = 2π, i.e. the sum of the exterior angles of a polygon is 2π, which is a quicker theorem of euclidean geometry.

Example 2. Consider the upper hemisphere of a unit sphere, which has constant curvature (intuitive by symmetry). The boundary of this region is the equator, which is a great circle and so a geodesic. The boundary has only one piece so there are no turning angles, i.e. ∑iθi. The surface area of the hemisphere is 2π, so Gauss-Bonnet tells us K = 1. (In fact, the curvature of the sphere is a constant R-2). If we didn’t know that the equator was a geodesic, but knew that the curvature was 1, we could use the same idea to show that ∫kg(s)ds  = 0, showing the equator is in fact a geodesic.

Example 3. Consider the spherical cap of height h in a sphere of radius 1, measured from the top and it’s boundary C. We want to compute the geodesic curvature of C. The sum of the turning angles is again 0, since we have one smooth curve. The area of the cap is 2πh. The geodesic curvature at any point of C is constant, say k, by symmetry, so ∫Ckg(s)ds = kCds = k·Circumference(C). If the radius of C is r, then Circumference(C) = 2π(2h-h2)1/2.  Solving for k in the Gauss-Bonnet Theorem, we get k = (1-h)/(2h-h2)1/2.

Example 4. Suppose our region has a constant negative curvature K = -1. Consider a triangle with sides which are geodesics. Gauss-Bonnet tells us that ∑iθi – Area(R) = 2π, so the sum of the turning angles must be greater than 2π. The sum of the interior angles is 3π – (θ123) < π since the sum of the exterior ones is greater than 2π.

Impossible Geodesic on the Pair of Pants. Consider the pair of pants and the curve abAB drawn below. Assume that this curve is a geodesic.abAB

Cutting the pair of pants on the left and right—between the top boundary and the boundary labeled a, and the top boundary and the boundary labeled b—we obtain a diagram that looks as follows:


The sides a and A are identified, as are the sides b and B. The geodesic triangle in the middle borders a geodesic hexagon—the angles between them are labeled. Recall that our metric on the pair of pants is of constant curvature K = -1. Applying the Gauss-Bonnet Theorem to a geodesic n-gon, like in example 4, we get that the sum of the interior angles is less than π(n-2). For the hexagon we get that the sum of the interior angles must be less than 4π. Now we compute

Sum of interior angles in triangle = α+ β + γ < π

Sum of interior angles in hexagon = 2(π – α) + 2(π – β) + 2(π – γ)
= 6π – 2(α+ β + γ)
> 4π since α+ β + γ < π, a contradiction.

Thus, a geodesic curve represented by abAB cannot look as we drew it (it can’t intersect itself such that it creates this figure).

Problem. Can we classify the impossible geodesic configurations?


Divisibility of Curve Words on a Pair of Pants

The table displaying the number of curves of word length w and self-intersection number k provides some insights into the behavior of the words. An easily noticed fact is that many entries in the table are divisible by 3. The following is a summary of some discussions between Christopher, Moshe, and myself. All curve words are taken to be non-power words.

Intuitively, we can begin accounting for the ternary nature of these numbers by noting that rotating our pair of pants by 2π/3, call this operation r, will produce another curve of the same word length with the same number of self-intersections, unless the new curve is a cyclic permutation of the original. That is, a curve w fails to produce a new curve when r(w)=w, i.e. w is a fixed point of r (here = denotes equivalence up to cyclic permutation). Algebraically, r corresponds to mapping a → b, b → c, c → a, A → B, B → C, C → A.

We will now introduce the notion of fixed words of r, their importance is made clear in Proposition 2. Let our word be w = x1 .… xnIf w is a fixed point, then the map r only tells us r(w) = w but doesn’t tell us where each letter  xk was mapped to. However, since applying r yields cyclic permutation of w, we can describe the map by a permutation, σ on the indices of the letters in the word. That is, σ acts on the set of fixed words by σ(w) = xσ(1) .… xσ(n)We can describe this permutation even more precisely. To obtain a cyclic permutation we shift each index by some i, for 0 ≤ i ≤ n, and then mod out by n. Note that such a permutation is an element of a subgroup of Sn generated by permutation k → (k + 1)(mod n). Call this subgroup S̄n. Since σ(w) = r(w) for any w, and r is a rotation by 2π/3, we get that σ3 = r3=1. With this background we can begin drawing some conclusions.

Proposition 1. σ(w) preserves the word length and intersection number of w.

Proof. First note that if the word length/intersection number was not preserved, it could only be reduced, since σ doesn’t add any letters—it could only, hypothetically, change them in such a way which creates relations which cancel out. Now suppose |σ(w)| < |w|, then |σ3(w )|<|w|, but σ3(w ) = w, so their lengths must be equal. Contradiction. An analogous argument holds for the self-intersection number.

Definition.  An orbit of a word w is the set Orb(w) = {σn(w): n ∈ ℤ}. The size of an orbit, |Orb(w)|, is the size of the set given by the orbit.

Proposition 2. For any word w of length n, w has an orbit of size 1 or 3.

Proof. Suppose some w had an orbit of size 2, i.e. σ2(w ) = w and σ(w) ≠ w. Then applying σ again would yield σ3(w ) = σ(w) ⇒ σ2(w ) = w, a contradiction. If does not have an orbit of size 1, then it must have an orbit of size 3, as for some integer 0 ≤ m < 3 we can write σn(w ) =  σ3q+m(w ) = σm3q(w)) = σm(w). Since 2 was excluded, can only be 1, so σ(w)=w. 

The above proposition shows us that to account for entries in the table of words, we can start by considering the nature of words with orbit size 1, i.e. of the fixed words. To account for the non-divisibility by 3 of some of these entries, one can show that the fixed points are not a multiple of 3.

Proposition 3. If w is a fixed word of σ, then the length of w is a multiple of 3.

Proof. By the last remark, σ generates a subgroup of order 3 in S̄n. The group S̄n itself has order n, since any permutation in it is fully determined by where 1 is mapped to, and it could be mapped to other numbers. By LaGrange’s Theorem, 3 divides n. 

This allows us to focus our search on words of lengths that are a multiple of 3, since only those will have fixed points. In fact, we can also say that the number of self-intersections for a of a certain word length is a multiple of 3: for a word of orbit 3, each self intersection is mapped to a new one by σ. For a fixed word, this intersection must be in the center of the pair of pants (or must be homotopic to such an intersection), since the intersection for a fixed word is not moved by σ. Such an intersection is a triple point (i.e. a single point intersection which splits into a triple intersection when perturbed). Thus, it is a multiple of 3. An algebraic proof of this fact is a corollary of the next proposition.

Proposition 4. Let be a fixed word. Then there are two canonical forms of fixed words: w = v.σ(v).σ(v).σ2(v) and w = v2(v).σ(v) for some word v. Here . denotes word concatenation.

Proof. Let w be a fixed word. We know that the length of is a multiple of 3, so we can write w = x1…xnxn+1…x2nx2n+1…x3nWe will reconstruct our word by applying σ to v = x1…xn 3 times. In this discussion we will normalize σ(w) by cyclically permuting it to look exactly like w. This is because we will only consider where v is being sent by σ inside w, which isn’t changed by the indices shifting. Since is a fixed word, σ(v) is a string of letters contained somewhere in w. Furthermore, σ(v) cannot intersect v, because if it did then σ acts by k → k + i for 0 < i < n. Then σ3(w ) acts by k → k + 3i, but 3i < n, so we would need more than 3 iterations to reach the identity, but σ3=1. In particular, n and 2n are the only shifts (mod n) by which σ would have order 3. (Shifts by multiples of 3n would be the identity, and any others > n would be equal to n or 2n (mod n) by an argument similar to the orbit one in Proposition 2). Thus, after 2 iterations, v, σ(v), and σ2(v) fill up the whole word. A shift of n corresponds to the form v.σ(v).σ(v).σ2(v) and a shift of 2corresponds to v2(v).σ(v), which is verified by seeing where the first letter is mapped.

Corollary 5. The self-intersection number of a fixed word is divisible by 3.

We have discussed fixed points as a means of classifying which entries in the table are divisible by 3. Namely it is enough to show that the set of words of orbit 1 is not divisible by 3. We have also shown that fixed words (words of orbit one) fall into two canonical forms—this may help determine what words belong to those two classes which may help counting them. It is also natural to ask whether these types of arguments hold for other number (regarding divisibility of chart entries). Here we formalized a geometric picture, but for other rotations this isn’t so clear.

The Formal Definition and “Movie” Definition of Homotopy

For homotopy of paths on surfaces we have an intuitive “movie” of deforming one curve on a surface into another curve on the surface where both of these curves have the same base point. The difference between a movie and a picture is that a movie requires time and a picture does not. The formal definition of homotopy uses an interval. This interval in the formal definition is the “time” in our movie definition

Crochetting a simple curve

Suppose you are fast asleep in your favorite couch recuperating from a long day of labour when suddenly your sister (or maybe your daughter) switches on the television to watch “My little poney” program. Of course you start moaning but she begs you to make her a poney tail and since you aren’t in a mood for a tantrum you complacently pull out a long piece of thread to fulfill her wish. But as soon as the circular rubber band encircles a strand of hair, she swaps channel to “Pippy longstocking” and asks for a nice braid instead. You’re about to leave her do it herself when you realize this is an opportunity to have a little fun finding an easy way to make a long simple curve.

color braid

Choose one strand out of the thread, and make two from the hair inside it. Now repetedly apply the braid group transformation T= [sigma_2].[sima_1 inverse] suggested by the following image.


You have only three braids so i ranges from one to two but this setting could be generalised to larger braid groups.

The braid group of order n can be defined as the mapping class group of the sphere with n punctures. Recall that the mapping class group of a punctured surface is the isotopy classes of homeomorphisms of that surface which globally preserves the set of punctures.

It can easily be represented as the group of transformations acting on the configurations of n strands in space (taken up to isotopy). It is generated by the [sigma_i] transformations pictured above (reminiscent of the Dehn twists) and here’s a complete set of relations.


Now, let’s not loose the thread of our discussion. Perform a series of T transformations, say 12 of them, and then ease the the initial rubber band you forgot in her hair (to make the poney tail) down accros the braid. Well this gives you Thurston’s simple curve !

Why 12 ? Well notice that in our case, the first time our three strands come back in the same position is after 3 transformations. We therefore call T^3 a pure braid, which means that the induced action on the order on the strands is the identity. In fact, we have an exact sequence 1 —-> Pn –f–> Bn –g–> Sn —-> 1. The second morphism g corresponds to taking the induced action on the order of the braids and the kernel is by definition the set of pure braids. I chose 12 just because I thought it nice to have a pure braid and because I was imagining a long braid (which would give a long simple curve).

Besides, looking at the shape of the curve after three iterations of T might also suggest a recursive construction of Thurston’s simple curve.

Now if you sister (or daughter) has a tough character I would advise you to crochet this with your own wollen pieces of thread. You wouldn’t want to wake up one morning tasting her cold revenge.

spiderwick hair

Problem One: Average Word Length on a Pair of Pants

Let out surface be a pair of pants and call it S. Consider the three curves a,b, and c and the relation abC=1 between them, where C = c^{-1} and 1 is the identity (are there any other relations?).

The relation between the 3 curves is abc=1, so following them in order according to their orientation will produce a curve that can be shrunk to a point.


Our alphabet consists of {a, b, c, A, B, C}, but with the relation abc=1 we can rewrite words which contain abc—the word abAB can be rewritten more optimally with by CAB. 

Let W_L = {cyclic reduced words of L letters on S} and W’_{L} = {cyclic reduced words of L letters on S with the relation abc=1}. Define the average of the word lengths for some set of words, W, to be averagewordlength where SIN(w) is the words self intersection number

We know that asymptotically A(W) = 1/9 * L^2, but adding the relation abc=1 complicates things, and it isn’t known what A(W’_L) is.

Problem: Find A(W’_L).

The same can be asked about the variance of a set of words, where the value for the set with the relation is unknown.

Iterating Thurston’s simple curve

Here’s an image suggesting an iterative method to draw Thurston’s simple curve. The curve is in grey while the colors hint to a recursive process.

What is it ?

Can you implement it on a computer to save Dennis the effort of drawing 5 iterations ? 🙂


Blog at WordPress.com.

Up ↑