Archive for the ‘Project Euler’ Category

Project Euler Problem #8 (R)

Saturday, July 19th, 2008

Problem #8 on the Project Euler problem list requires finding the greatest product of five consecutive digits in a large string of integers.

For this problem, I copied and pasted the large number into a file, and read it in using scan(). As there were line breaks in the copied and pasted numbers, I used paste() with the collapse argument to restore it to a single line. From there, it was a matter of taking 5-character substrings, converting them to numbers, and using prod() to calculate the product of each group of 5 numbers.

gprod <- function() {
 # Read number as string
 snum <- scan(file="number.dat", what=character(0),
  strip.white=TRUE, quiet=TRUE)
 # Concatenate lines into a single line
 snum <- paste(snum, collapse="", sep="")
 mp <- 0
 for (i in 1:(nchar(snum)-4)) {
  s <- substr(snum, i, i+4)
  m <- prod(as.numeric(strsplit(s,"")[[1]]))
  if (m > mp) {
   mp <- m
}
 }
 mp
}

Project Euler Problem #7 (R)

Friday, July 18th, 2008

Problem 7 on the Project Euler site is the following challenge:

By listing the first six prime numbers: 2, 3, 5, 7, 11, and 13, we can see that the 6th prime is 13.

What is the 10001st prime number?

The routine below returns a list of primes, each successive prime being calculated by testing none of the currently known primes divide the current integer counter value evenly. If not, the current integer value is added to the list of known primes.

nprime <- function(n) {
        if (n == 1) {
                return (2)
        }
        psofar <- c()
        psofar[1] <- 2
        j <- 2
        p <- 1
        i <- 2
        while (p < n)  {
                if ( all((i %% psofar) > 0) ) {
                        psofar[j] <- i
                        j <- j + 1
                        p <- p + 1
                }
                i <- i + 1
        }
        psofar
}

To return the 10001st prime, we invoke it like so:

>tail(psofar(10001),1)

Project Euler Problem #6 (R)

Friday, July 18th, 2008

Problem 6 in the Project Euler list asks us to:

Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.

The literal translation of this is:

sumdiff <- function(n) {
    sum(c(1:n))^2-sum(c(1:n)^2)
}

However, this is another problem (like Problem #1) that has a closed-form solution, and so can run in constant time.

The closed form expression for the sum of the squares is

tex:\sum_{i=1}^{n}i^2 = \frac{1}{6}n(2n+1)(n+1)

The closed form expression for the square of the sum is

tex:\left(\sum_{i=1}^{n}i\right)^2 = \left(\frac{n(n+1)}{2}\right)^2

So if we subtract and simplify, we end up with the solution

tex:\frac{1}{12}n(3n^3+2n^2-3n-2).

The solution is shown below:

sumdiff2 <- function(n) {
    #(1/4)*(n^2)*(1+n)^2 - (1/6)*n*(1+n)*(1+2*n)
    (1/12)*(n)*(-2-3*n+2*n^2+3*n^3)
}

We can calculate the answer as shown:

> sumdiff2(100)
[1] 25164150

Needless to say, closed form solutions are preferable where possible.

Project Euler Problem #5 (R)

Friday, July 18th, 2008

Problem 5 on the Project Euler list is the following:

2520 is the smallest number that can be divided by each of the numbers from 1 to 10 without any remainder.

What is the smallest number that is evenly divisible by all of the numbers from 1 to 20?

This is asking us to compute the lowest common multiple (lcm) from 1 to 20. The lcm of two integers a and b is defined as tex:\frac{ab}{\mbox{gcd}(a,b)}

In order to compute the gcd, we can use a simple algorithm, like a Euclidian reduction:

# Iterative Euclidean algorithm
gcd <- function(a,b) {
    t <- 0
    while (b != 0) {
        t <- b
        b <- a %% b
        a <- t
    }
    a
}

Defining the lcm then becomes:

lcm <- function(a,b) {
    abs(a * (b/(gcd(a,b))))
}

Now we have defined pretty much everything we need to solve the problem. We can write an iterative lcm routine that calculates the least common denominator from 1 to x:

lcm_ <- function(xs) {
    l <- 1
    for (i in 1:(length(xs))) {
        l <- lcm(l,xs[i])
    }
    l
}

And we can calculate the answer:

> lcm_(c(1:20))
[1] 232792560

However, I also experimented with writing a reduce() function (similar to Haskell’s foldr operation). We can write reduce() as follows:

reduce <- function(op,x,xs) {
    y <- x
    for (i in 1:(length(xs)-1)) {
        y <- do.call(op, list(y, xs[i]))
    } 
    y
}

And then invoke:

>reduce(lcm, 1, c(1:20))
[1] 232792560

Note that R actually has inbuilt Map/Reduce operations. Using the native Reduce, this is:

> Reduce(f=lcm, x=c(1:20), right=1)
[1] 232792560

Project Euler Problem #4 (R)

Friday, July 18th, 2008

Problem 4 in on Project Euler’s list is the following:

Find the largest palindrome made from the product of two 3-digit numbers.

We can attack this as follows: firstly, we know that for an n-digit number, the minimum n-digit number is tex:10^{n-1} and the maximum is tex:9\sum_{i=0}^{n-1}10^{i}. So we need the cartesian product of all numbers in this range, which we can then test for the palindrome property, and finally filter for the maximum element.

Testing for the palindrome property requires that we check each digit in the number for equality on either side. We do this first by calculating the number of digits in the product (using tex:\lfloor\log_{10}(x)\rfloor) and then just comparing the digits by using the modulus and shifting. Note a couple of points in the R code below:

  • %/% is integer division;
  • Note the vectorization at work in statements like (p %/% 10^x): this will divide p by successive powers of 10, where the powers are defined in the x vector;
  • Note the use of the logical all() function to test whether all palindromic contenders are equal.
  • pal < - function(N) {
     palindromes <- c()
     min_no <- 1*10^(N-1)
     max_no <- sum(9*10^(seq(0,N-1)))
     n <- 1
     for (i in seq(max_no,min_no,-1)) {
      for (j in seq(i,min_no,-1)) {
       p <- i * j
       digits <- floor(log10(p))
       x <- seq(0, digits %/% 2)
        if (all( ((p %/% 10^x) %% 10)==((p %/% 10^(digits-x)) %% 10) )) {
         palindromes[n] <- p
         n <- n + 1
        }
      }
     }
     max(palindromes)
    }

Project Euler Problem #3 (R)

Thursday, July 17th, 2008

The third problem in Project Euler is an interesting one:

The prime factors of 13195 are 5, 7, 13 and 29.

What is the largest prime factor of the number 600851475143 ?

In order to compute prime factors, we can use a sieving method. The simplest and oldest sieving algorithm is the sieve of Erastothenes, which we can implement in R as follows:

sieve <- function(n) {
    if (n <= 1) {
        return(numeric(0))
    }
    if (n == 2) {
        return(c(2))
    }

    a <- seq(2,n)
    b <- c()
    b[1] <- 2

    i <- 1
    while (length(a) > 0) {
        b[i] <- a[1]
        i <- i + 1
        a <- a[!(a %% a[1] == 0)]
    }
    b
}

So now we need to compute the largest prime factor of 600851475143. An upper bound for the factorisation is tex:p \leq \lfloor\sqrt{x}\rfloor, so we can halt our search at this point (alternatively, we may be able to begin our search here and work downwards). Let’s do it the obvious (and inefficient) way, starting at 1 and working up to tex:\lfloor\sqrt{x}\rfloor, building up a list of prime factors along the way:

pfactor <- function(x) {
    factors <- c()
    j <- 1
    primes <- sieve( floor(sqrt(x)) )
    for (prime in primes) {
        if (x %% prime == 0) {
            factors[j] <- prime
            j <- j + 1
    }
    }
    factors
}

Now to compute the maximum factor, we just take the maximum of this list:

max( pfactor(600851475143) )

Project Euler Problem #2 (R)

Tuesday, June 17th, 2008

Here is a solution for Project Euler’s Problem #2 in R, which is stated as:

Each new term in the Fibonacci sequence is generated by adding the previous two terms. By starting with 1 and 2, the first 10 terms will be:

1, 2, 3, 5, 8, 13, 21, 34, 55, 89, …

Find the sum of all the even-valued terms in the sequence which do not exceed four million.

Firstly ,we will define a function to calculate the Fibonacci numbers below N, where N in this case is 4*10^6:

fib < - function(n) {
x <- c(length=10)
x[1] <- 1; x[2] <- 2
for (i in 3:n) {
y <- x[i-1]+x[i-2];
if (y > n){ break }
else { x[i] < - y }
}
x
}

Next, we just calculate the sum of the even-valued problems, in a similar manner to the way we solved problem #1:

f < - fib(4E6)
sum(f[f %% 2 ==0])

Project Euler Problem #1 (R)

Thursday, June 12th, 2008

Here is a solution for Project Euler’s problem #1 in R. The problem is expressed as:

If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9. The sum of these multiples is 23.

Find the sum of all the multiples of 3 or 5 below 1000.

As usual with Project Euler questions, there is an obvious way, and a less obvious, but much more efficient way. In this case, the obvious way is:

x < - seq(1,999)
sum(x[x %% 3 ==0 | x %% 5 == 0])

Which very concisely returns the correct answer, 233168. However, if we use the following intuition:

tex:S_N = S_3 + S_5 - (S_{3,5})

i.e. the sum of all numbers divisible by 3 or 5 is the sum of all numbers divisible by 3, plus the sum of all numbers divisible by 5, minus the sum of all numbers divisible by 3 and 5 (as we have double counted them), then we get the correct answer.

Since tex:S_n  = \frac{n(a_1 + a_n)}{2}, where tex:n = \lfloor \frac{N}{n} \rfloor, and tex:a_n = a_1n, the last piece of the puzzle is what to use for tex:S_{3,5}. This is straightforward, we just use the lowest common multiple of 3 and 5, which in this case is 15. Hence, the R representation of this is:

sum(333*((3+333*3)/2),199*((5+199*5)/2)-66*((15+66*15)/2))