# Category Archives: Number Theory # Primes below N, the sieve of Eratosthenes in Scala

I have been practicing coding tests again. One nice exercise I saw the other day was a sample from a HackerRank test: find the number of prime numbers less than N. Or tell how many primes are below a given limit. You can easily change the problem to pick the list of the said primes.

It is such a classic problem that I decided to dedicate myself to write a nice solution, because I couldn’t find a really good “functional style” implementation in Scala, strictly with immutable collections, and using `foldLeft` if possible. And there is also an important detail about this problem that is worth talking about. The most well known solution is not actually the “best” one, and the difference isn’t even that that big. So it really is a subject we should all be studying a little bit more carefully.

This article discusses four different solutions to the problem. First is the most usual solution, the so-called unfaithful sieve. Then we will show the most “cult” solution to this problem, the sieve of Eratosthenes. These both can be written recursively, but the first can actually be implemented with a fold. We will then show a quite different implementation of the sieve, using a map to store known composite numbers, that I came up with myself, even though I doubt is new. We finish by looking at implementations of the two algorithms using mutable collections.

# Looking for squarefree numbers

A friend proposed an interesting problem the other day: to find out how many numbers are there below N with at most one of each prime as a factor? These numbers are called square-free numbers, because they are not divisible by any square number. If a number is divisible by any larger power of some prime p^n, it also must be divisible by p^2, and so square-freedom implies having each prime factor only once in the factorization.

I thought the problem was very interesting, first of all because this is an interesting set of numbers to think about. But the problem of finding these numbers below a limit is particularly interesting because it falls into a constraint-satisfaction problem with binary variables. And that means we can perform a graph search on a binary tree to look for these numbers. Though that was cool because a search on a binary tree is something simple, and therefore it should be easy to write a program to solve the problem. At the same time it is not obvious at first that this is the solution, so it ends up being an interesting problem.

I am not sure there is no other smarter way to find these numbers, but here are two implementations of a depth-first search to look for these numbers. Notice that the list of primes is given. The first implementation is in Python:

```from operator import mul
import string
primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61]
def dfs(n, i):
m = n * primes[i]
if m < 60:
print m
dfs(m, i+1)
dfs(n, i+1)
dfs(1,0)
```

This code uses a recursive function, yes indeed. As an exercise to the reader you can implement it with using a queue.

Now, depth-first searches always make me think of Prolog. I always think “Gosh, I would like to see that in Prolog, because depth-first searches is Prolog’s thing”. So I implemented it in Prolog too, as a little exercise to dust off my Prolog references. Here it goes:

```dfs(X, X, []).
dfs(X, M, [_|L]) :- dfs(X, M, L).
dfs(X, M, [N|L]) :- P is N * M, P < 61, dfs(X, P, L).
squarefree(X) :- dfs(X, 1, [2, 3, 5, 7, 11, 13, 17,
19, 23, 29, 31, 37, 41, 43, 47, 53, 59]).
:- setof(X, squarefree(X), Z), writeln(Z), halt.
```

Prolog can be awkward sometimes, I confess. But after studying that Python code above, you will see that what Prolog will do is pretty much the same thing. In the Python script we have a list of numbers from where we take values to calculate a product. The numbers are either accepted or discarded from the product. This creates the different combinations of numbers, that must then be tested to pass the n <= 60 restriction. A new number is accepted into the product when the code enters the first recursive call to dfs. In the second call the new number from the list is discarded, and the product so far is used directly as an argument to the recursive call to the dfs function. When the accumulated product surpasses the limit (60), the function returns, stopping the search on that branch.

So at each recursive call we have a pointer to the rest of the primes list, and an accumulated product of factors. The head of the list is either accumulated in the product, or discarded from each state from the search. If the product reaches the limit no more calls are made, the search ends in that branch.

What the first three lines in the Prolog code do is to describe these recursion rules. A predicate dfs(_, M, L) with an accumulated product M and remaining list L implies the same predicate with a new factor in the head of an expanded list, either with the same accumulated product, or removing that factor from the product. When Prolog tries to prove the initial predicate, with the complete list of prime numbers, it goes on popping the list head, then producing predicates with the reduced list, and with the accumulated product either receiving or not the factor that was taken from the list. This goes on until the product surpasses the limit, because it is necessary to prove the third predicate.

The first predicate allows us to "retrieve" the solutions, the accumulated products. The X variable provides the "way back" to the beginning of the inference process. The fourth line is just to make the code look nicer, it is a simple predicate implied by the dfs predicate, that causes all the search. The last line is the query that asks Prolog to give us all the values it can find that make the squarefree predicate true.

The crazy part is this "retrieving the solution" trick. I am not sure there is a better way to write the code that doe snot involve that. Send me comments if you find one!…

The output from the code is

```\$ swipl -s squarefree.P -q
[1,2,3,5,6,7,10,11,13,14,15,17,19,21,22,23,26,29,30,31,33,34,35,37,38,39,41,42,43,46,47,51,53,55,57,58,59]
```