Our auto-discovery of the Möbius and Mertens sequences

Recently, we were explaining to our friend the Möbius and the Mertens functions and their relationship to the prime number distribution. We also heard with some wonder from a physicist of a theoretical model where multiparticle states behave as bosons or fermions in way predicated by the Möbius function. This prompted us to put down the tale of our autodiscovery of the Möbius and Mertens sequences in our youth. On one hand that journey gave us some satisfaction that we, in a very limited way, were on our own able to recapitulate the journey of the great minds. But, at the same time, it also brought us down to earth showing how little we were able to do on our own following the rediscovery of their results, thus, telling us how the truly great and pedestrian are distinct.

As we have described before, we began our exploration of sequences by hand and computer inspired by Hofstadter’s book (GEB). One of the sequences that we conceived in course of this exploration was a one-seeded sequence which goes thus in words: We start with 1 as the first term. Then for every integer $k$ from 1 to $n-1$ we test if it divides $n=2,3,4...$. For each $n$, we then take the negative of the sum of all the $f[k]$, where $k$ satisfies this divisibility condition. This becomes the value of the nth term of the sequence $f[n]$

$f[1]=1$
$\displaystyle f[n]= -\sum_{k} f[k]$ where $n \mod k \equiv 0, 1 \le k < n$

We were at first surprised to see that this sequence takes only 3 values -1, 0, 1. The first 25 terms are: 1, -1, -1, 0, -1, 1, -1, 0, 0, 1, -1, 0, -1, 1, 1, 0, -1, 0, -1, 0, 1, 1, -1, 0, 0

Figure 1. A plot of the first 2000 terms (the red short lines are $f[n]=0$).

The sequence (Figure 1) at first looked almost random but after looking more closely we began to discern a pattern. First we asked where do the 0s come? After analyzing their locations it became clear that they came wherever $n$ is a perfect square of a prime number or a multiple of such a perfect square. Thus, $f[n]$ where $n$ is 4, 8, 9, 12, 16, 18, 20, 24, 25… are all 0s. We then asked where the -1s and 1s come. It was immediately clear that when $n$ is a prime $f[n]=-1$. Why this should be so is obvious from the way we defined our sequence. We also noticed that whenever a number has a odd number of prime factors then $f[n]=-1$. Thus $n=30=2 \times 3 \times 5; \; f[30]=-1$ or $n=42=2 \times 3 \times 7; \; f[42]=-1$. For all the remaining values of $n$ we get $f[n]=1$. These values of $n$ have a even number of prime factors. Thus, for $n=6=2 \times 3, 21=3 \times 7$ we get $f[n]=1$. One learns from these patterns how the $f[n]$ values for a given $n$ can be understood on the basis of the sequence definition.

These patterns also directed us to render them graphically as a square matrix with its total number of cells amounting to a certain perfect square $n$ where each cell takes one of three colors based on the value of $f[n]$. By playing with different dimensions we arrived at $n=125 \times 125$ for capturing the above-described structure of this sequence (Figure 2). The 5 vertical lines are generated by the square number 25 and its multiples. The square number 4 and its multiples create the $135^o$ one-square cross-lines. 9 and its multiples create the $45^o$ cross-lines. When multiples of 9 and 4 come adjacent to each other e.g. (8,9), (27,28), (44,45), (63,64), (80,81), they create the broad 4-square cross-lines. They have an alternating prime number separation of 19 and 17.

Figure 2. $f[n]$ for $n=125 \times 125$ with the first row from bottom being $n=1:125$. Brown: -1; light yellow: 0; green: 1

This pattern led us to ask the question as to what fraction of the area of a square, like the above one, is one of the three colors. Experimentally, it is obvious that the 0s occupy most area, while the remainder is evenly split between -1 and 1. Figure 2 helped us to arrive at approximate value for the area occupied by the zeros. For the first row $n=1:125$, counting repeat numbers only once, we get: 31 numbers containing 4 as a factor; 10 with 9; 4 with 25; 2 with 49 and 1 with 121. This adds to 48. Thus for the first row of figure 2 the fraction of 0s is $\tfrac{48}{125}$. Given that these smaller squares set the basic patterns in figure 2, and as we climb up the matrix we add additional perfect squares (e.g. $13^2=169$ between $n=126:250$) and their multiples, we added 1 more to the numerator to arrive at the fraction of 0s $\approx \tfrac{49}{125}= 0.392$. The experimentally determined value oscillates in the vicinity of that value with a tendency for convergence with increasing $n$ (Figure 3).

Figure 3. The gray line in the top panel is $1-\tfrac{6}{\pi^2}$ while in the bottom panel it is $\tfrac{3}{\pi^2}$

This value to which the fraction of 0s converges immediately caught our eye because of a book we had just borrowed from the local library on elementary number theory. It informed us about the masterful discovery of the great Leonhard Euler: What is the probability that any two positive integers chosen at random are mutually prime (i.e. have GCD=1 or are coprime)? The solution to this problem is related in turn to what was called the Basel problem after the hometown of Euler and the Bernoullis. The Basel problem asks the sum of the infinite series:

$1+\dfrac{1}{4}+\dfrac{1}{9}+\dfrac{1}{16}+\dfrac{1}{25}...= \displaystyle \sum_{n=1}^\infty \dfrac{1}{n^2}$

The 28 year old Euler showed that this sum is $\tfrac{\pi^2}{6}$, thus, surprisingly, bringing in the number $\pi$. Now the answer to the original question is the inverse of this number. Thus, the probability that two randomly chosen integers are coprime is $\tfrac{6}{\pi^2} \approx \tfrac{76}{125}$. This, result, even more miraculously, connects $\pi$ to the relatively prime numbers (Figure 4).

Figure 4. The matrix of relative primality of pairs of integers for $n=1:100$. The area in red to the total area of the square converges to $\tfrac{6}{\pi^2}$

That approximate value we list above immediately led us to realize that our fraction of the area under 0s in figure 2 or, more precisely, the asymptotic fraction of 0s in our above sequence $f$ should be exactly $1-\tfrac{6}{\pi^2}$. This also gives the fraction of numbers containing perfect squares in $1:n$. Conversely, the fraction of -1s and 1s in our sequence are $\tfrac{3}{\pi^2}$. This gives the fraction of square-free numbers, which are products or even or odd number of primes in $1:n$ Thus, an exploration of a sequence simply out of the curiosity of checking out patterns emerging from divisibility led us to a deep link to primality and $\pi$.

Given that the fraction of -1s and 1s are approximately the same we wondered what might be the trends in the sum of the above series $f$. Accordingly we defined new series thus:

$f1[n]=\displaystyle \sum_{k=1}^n f[k]$

Figure 5 shows a plot of $f1[n]$ for $n=1:30000$

Figure 5

It is clear that $f1[n]$ oscillates between negative and positive values in a what looks like random walk. However, we do notice that as $n$ increases the extreme values reached by $f1[n]$ are greater (Figure 6).

Figure 6

We wondered what this rate of increase might approximated by. But that was where we hit the end of our explorations. Those days the internet was not yet available to us and Riemann hypothesis was not the kind of thing that was widely known in lay circles. We knew of it and the famed Riemann $\zeta$ function but nothing much of the intricacies of this hypothesis. It was then that in a city teeming with Meccan demons we briefly borrowed a dense tome on the number theory from a distant relative of ours who was belonged to a clan with multi-generational mathematical talent. It was there that we learned that our auto-discovered sequence $f$ was something that was first described by the prince of the mathematics, Carl Gauss, in his late teens or early twenties and published in his famed Disquisitiones Arithmeticae. However, it came be known after a student of his, August Möbius, who rediscovered and studied it more than 30 years later. This Möbius function can be defined thus:

$\mu(n)=\begin{cases} 0; \textrm{n contains square of a prime}\\ 1; n=1\\ (-1)^k; \textrm{n is product of k distinct primes} \end{cases}$

This $\mu(n)$ has a relationship to the Riemann $\zeta$:

$\displaystyle \sum_{n=1}^\infty \dfrac{\mu(n)}{n^s}= \dfrac{1}{\zeta(s)}$ where $s \in \mathbb{C}$

The fraction of 0s in our sequence $f$ can be expressed as converging to $1-\tfrac{1}{\zeta(2)}$ and the fraction of 1s and -1s as converging to $\tfrac{1}{2\zeta(2)}$

Another interesting property of $\mu(n)$ is seen when it is included in a series of Lambert; we get this interesting sum:

$\displaystyle \sum_{n=1}^\infty \mu(n) \cdot \dfrac{x^n}{1-x^n}=x$ where $|x| <1$

The summatory sequence which we described above as $f1$ turned out to be a function discovered by Mertens $\textrm{M}(x)$ in the late 1800s. Mertens, after computing the first $10^5$ values by hand, boldly conjectured that $\textrm{M}(x)$ will for all $n$ lie inside the parabola $y=\sqrt{x}$, shown in blue in Figure 4. But this conjecture was shown to be false in our times by Odlyzko and te Riele. However, the first value for which $\textrm{M}(x)$ will out grow the parabola is so large that it is way outside our current computational reach. But weaker expressions related to this conjecture are shown to be equivalent to the Riemann hypothesis, which we will assume the reader is familiar with:

$M(x)=O(x^{1/2+\epsilon})$ where $\epsilon >0$

To briefly explain the big $O$ notation for an unfamiliar reader: Let there be two functions $f(x)$ and $g(x)$ whose behavior we are comparing as $x \to \infty$. In our current case $f(x)$ is $M(x)$ and $g(x)=x^{1/2+\epsilon}$.
If $|f(x)| \le K\cdot |g(x)|$ for all $x \ge x_0$, where $K$ is a positive real constant and $x_0$ is some real number, then we may state that $f(x)=O(g(x))$.

The link to the prime distribution also comes out in a nice relationship discovered by Mertens between $\textrm{M}(x)$ with the functions defined by the famed Russian scientist Chebyshev in his study of this problem (figure 7). The first of Chebyshev function $\Theta(n)$ is defined thus:

$\Theta(n)= \displaystyle \sum_{p \le n} \log(p)$ where $p \le n$ are all primes less than or equal to $n$

The second Chebyshev function $\psi$ is defined thus:

$\psi(n)=\displaystyle \sum_{p \le n} \left \lfloor \log_p(n) \right \rfloor \log(p)$ where $p \le n$ are all primes less than or equal to $n$

Figure 7. The top panels show the growth of the two Chebyshev functions. The bottom left panel shows the convergence $\tfrac{n}{\Theta(n)} \to 1$. The right bottom panel shows the fluctuations of $\psi(n)-n$ reminiscent of $\textrm{M}(x)$.

This $\psi(n)$ is related to the $\textrm{M}(x)$ thus:

$\psi(n)=\displaystyle \sum_{k=1}^n \textrm{M}\left (\left \lfloor \dfrac{n}{k} \right \rfloor \right ) \cdot \log(k)$

We first computationally demonstrated this relationship for ourselves when we learned that our $f1$ was none other than $\textrm{M}(x)$. Thus, with just a household computer, working knowledge of a modern computer language, and some school and elementary college level numeracy one can at least get a glimpse of some of the deepest issues in mathematics. Visualizing some of these, even at a very low level as we have done, gives you a mystical experience — i.e. a glimpse of some of deep mysteries of existence.