Scripting Maven Deployments with Ruby

Recently I had to import a number of legacy projects into a Maven-style structure. I knocked up the following script to make the task easier to repeat. In outline, it does the following:

  • Tries to parse the filenames of jars in the current directory.
  • Searches Nexus to see if it can locate the artifact.
  • If it finds the artifact in Nexus, it can use the GAV (groupId, artifactId, version) parameters for that artifact; otherwise, it generates a GAV stanza for inclusion in the POM.
  • For each artifact that could not be located in Nexus, it generates a mvn deploy:deploy-file command to upload the dependency.
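
The first and third steps can be sketched in isolation. This is a minimal illustration rather than the script itself: the helper names parse_jar_name and dependency_stanza are hypothetical, the pattern is a simplified variant of the one the script uses, and the 1.0 default version simply mirrors the script's fallback.

```ruby
# Simplified filename pattern (an assumption for this sketch): a name,
# optionally followed by "-<dotted version>", then ".jar".
JAR_PATTERN = /\A(.+?)(?:-(\d+(?:\.\d+)*))?\.jar\z/

# Hypothetical helper: split a jar filename into artifact and version.
# Returns nil when the name does not look like a jar file.
def parse_jar_name(filename)
  match = JAR_PATTERN.match(filename)
  return nil unless match
  { "artifact" => match[1], "version" => match[2] || "1.0" }
end

# Hypothetical helper: render a POM dependency stanza for a GAV triple.
def dependency_stanza(group, artifact, version)
  <<~XML
    <dependency>
      <groupId>#{group}</groupId>
      <artifactId>#{artifact}</artifactId>
      <version>#{version}</version>
    </dependency>
  XML
end

gav = parse_jar_name("commons-lang-2.4.jar")
# Like the script, derive a group from the artifact name when none is known.
group = gav["artifact"].tr("-", ".")
puts dependency_stanza(group, gav["artifact"], gav["version"])
```

Building the stanza in a helper that receives the final GAV values avoids the deferred-interpolation trick, at the cost of generating it only after the Nexus lookup has finished.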

Error handling is non-existent in this file, so caveat emptor!
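
As an illustration of what minimal error handling for the search step might look like, here is a sketch that parses a response body defensively. The helper name parse_search_results is hypothetical, and the XML layout in SAMPLE_RESPONSE is an assumption inferred from the element names the script queries, not a documented Nexus schema.

```ruby
require 'rexml/document'

# Assumed response shape, based on the elements the script reads.
SAMPLE_RESPONSE = <<~XML
  <search-results>
    <totalCount>1</totalCount>
    <data>
      <artifact>
        <groupId>commons-lang</groupId>
        <artifactId>commons-lang</artifactId>
        <version>2.4</version>
        <repoId>central</repoId>
      </artifact>
    </data>
  </search-results>
XML

# Hypothetical helper: extract candidate GAVs from a search response body.
# Returns an empty array instead of raising when the body is missing,
# malformed, or contains incomplete artifact entries.
def parse_search_results(body)
  return [] if body.nil? || body.strip.empty?
  doc = REXML::Document.new(body)
  candidates = REXML::XPath.match(doc, "//data/artifact").map do |ar|
    {
      "group"    => ar.elements["groupId"]&.text,
      "artifact" => ar.elements["artifactId"]&.text,
      "version"  => ar.elements["version"]&.text
    }
  end
  candidates.reject { |gav| gav.values.any?(&:nil?) }
rescue REXML::ParseException
  []
end

parse_search_results(SAMPLE_RESPONSE).each do |gav|
  puts "#{gav["group"]}:#{gav["artifact"]}:#{gav["version"]}"
end
```

An empty result can then be treated the same way as "not found in Nexus", so the jar falls through to the upload list.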

#!/usr/bin/ruby

require 'net/http'
require 'rexml/document'
include REXML

# Nexus server
host="nexus-server"
port=8080

# Could also use GAV parameters in query string
url="http://#{host}:#{port}/nexus/service/local/data_index?q="

print "<dependencies>","\n"

# Open file for dependencies entries
depsFile=File.open("dependencies.xml",'w')
depsFile.write("<dependencies>\n")

# Track unprocessed jars
errors=[]

# Jars to be uploaded to Nexus
uploads=[]

for jarFile in Dir.glob("*.jar")
    if jarFile =~ /([A-Za-z\-\.0-9]+?)(?:\-)?(\d+(?:\.\d+)*)?\.jar$/
        artifact=$1
        version=$2

        if artifact =~ /(\S+)\.(.*)/
            group=$1
            artifact=$2
        else
            group=artifact.gsub("\-", "\.")
        end    

        if version.nil?
            version="1.0"
        end

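        # Create a default stanza template. The single-quoted lines are left
        # uninterpolated here and expanded by the eval further down, once the
        # final group, artifact and version are known.
        stanza = "<dependency>\n"
        stanza << '  <groupId>#{group}</groupId>\n'
        stanza << '  <artifactId>#{artifact}</artifactId>\n'
        stanza << '  <version>#{version}</version>\n'
        stanza << "  <packaging>jar</packaging>\n"
        stanza << "</dependency>\n\n"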

        puts "Searching Nexus for #{artifact}"
        query="#{url}#{artifact}"

        resp = Net::HTTP.get_response(URI.parse(query))
        respDoc = REXML::Document.new(resp.body)
        XPath.each(respDoc, "//search-results/totalCount") do |count|
            matches = Integer(count.text)
            if matches > 0

                idx=1; artifactMap = {}
                XPath.each( count, "//data/artifact") do |ar|
                    artifactId=ar.elements["artifactId"].text
                    artifactGroupId=ar.elements["groupId"].text
                    artifactVersion=ar.elements["version"].text
                    artifactRepo=ar.elements["repoId"].text
                    puts "#{idx}. #{artifactGroupId}:#{artifactId}:#{artifactVersion} (repo:#{artifactRepo})"
                    artifactMap[idx]={"group"=>artifactGroupId, "artifact"=>artifactId, "version"=>artifactVersion}
                    idx+=1
                end

                puts "Found #{matches} possible match(es) for #{group}:#{artifact}:#{version}"
                puts "Enter artifact number, or enter to use generated artifact:"
                num=gets.chomp

                if !num.empty?
                    num=Integer(num)
                    entry=artifactMap[num]
                    group=entry["group"];artifact=entry["artifact"];version=entry["version"]
                else
                    uploads.push( {"artifact"=>artifact,"group"=>group,"version"=>version,"file"=>jarFile} )
                end

            else
                uploads.push( {"artifact"=>artifact,"group"=>group,"version"=>version,"file"=>jarFile} )
            end
            stanza = eval('"' + stanza + '"')
            depsFile.write(stanza)
        end
    else
        errors << jarFile
    end
end

depsFile.write("</dependencies>\n")
depsFile.close

puts "Finished processing.\nCheck the file dependencies.xml for generated XML"

# Generate the upload script
File.open("upload_deps.sh", "w") { |f|
    f.puts "#!/bin/bash\n\n"
    uploads.each do |dep|
        # Send the file to the correct repo
        repo="http://#{host}:#{port}/nexus/content/repositories/"
        if dep["version"] =~ /(.*)SNAPSHOT/
            repo << "snapshots"
        else
            repo << "releases"
        end

        line="mvn deploy:deploy-file -Dfile=#{dep["file"]} -DartifactId=#{dep["artifact"]} -DgroupId=#{dep["group"]} -Dversion=#{dep["version"]} -Dpackaging=jar -DrepositoryId=nexus -DgeneratePom=true -Durl=#{repo}"

        f.puts "echo \"Uploading #{dep["file"]}...\""
        f.puts line
    end
}

puts <<HERE
The file upload_deps.sh contains Maven commands to upload dependencies not currently in Nexus.

This assumes that you have a stanza in $MVN_HOME/conf/settings.xml as follows:

<server>
    <id>nexus</id>
    <username>$USERNAME</username>
    <password>$PASSWORD</password>
</server>

Where the appropriate Nexus deployment credentials are in <username> and <password>.
HERE

if !errors.empty?
    puts "The following jar files could not be processed and need to be manually defined:"
    errors.each do |error|
        puts error
    end
end
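
One thing the generated upload_deps.sh does not do is quote its arguments, so a jar filename containing a space would break the command. The following is a sketch of one way to guard against that using Ruby's standard Shellwords module; the helper name deploy_command and the sample dependency are hypothetical, and the repository URL simply reuses the host and port from the script.

```ruby
require 'shellwords'

# Hypothetical helper: build one deploy:deploy-file command line with each
# property value shell-escaped. repo_base is assumed to end with a slash.
def deploy_command(dep, repo_base)
  repo = repo_base + (dep["version"].include?("SNAPSHOT") ? "snapshots" : "releases")
  properties = {
    "file"         => dep["file"],
    "artifactId"   => dep["artifact"],
    "groupId"      => dep["group"],
    "version"      => dep["version"],
    "packaging"    => "jar",
    "repositoryId" => "nexus",
    "generatePom"  => "true",
    "url"          => repo
  }
  args = properties.map { |key, value| "-D#{key}=#{Shellwords.escape(value)}" }
  "mvn deploy:deploy-file #{args.join(' ')}"
end

dep = { "file" => "my lib-1.0.jar", "artifact" => "my-lib",
        "group" => "my.lib", "version" => "1.0" }
puts deploy_command(dep, "http://nexus-server:8080/nexus/content/repositories/")
```

Only the values are escaped, so the -Dname= prefixes stay readable in the generated script.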

GSoC Commons-Net SSH Concluded

Today is officially “pencils-down” day for the Google Summer of Code Project 2009. I have been a mentor this year for the Apache Commons-Net SSH project, which aims to add SSH and SCP support to Commons-Net.

The project has been a great success, mainly down to the student, Shikhar, who has put in tremendous work to build a fully functional SSH/SCP client (with thanks to the Mina SSHD project, whose codebase the effort was originally based on). All of the goals for the project have been met, and some extra ones accomplished too.

I had great help and input from Chico, another Apache committer, throughout the project, and so it’s been a great experience all round. This will form the basis of a release, but for now, the code is hosted on Google Code at: http://code.google.com/p/commons-net-ssh/

Headless R / X11 and Cygwin/X

Running R on a Linux server in headless mode (i.e. producing graphics without X Windows running) can be tricky. Some people recommend using a virtual X framebuffer. However, in my opinion the best approach is to use the R interface to Cairo. This allows R to produce PNG graphics in headless mode, and also produces very nice-looking graphs. I configured R as follows (after downloading and building pixman-0.15.18 and cairo-1.8.8):

./configure --with-gnu-ld --with-x --with-cairo

This will produce an R binary with cairo support that can be run non-interactively and produce graphical output – very useful for running automated statistical reports.

You can check that Cairo support is enabled by checking the return value of the capabilities() function:

> capabilities()
    jpeg      png     tiff    tcltk      X11     aqua http/ftp  sockets
    TRUE     TRUE    FALSE    FALSE    FALSE    FALSE     TRUE     TRUE
  libxml     fifo   cledit    iconv      NLS  profmem    cairo
    TRUE     TRUE     TRUE     TRUE     TRUE    FALSE     TRUE

Finally, some notes on connecting X11 clients using Cygwin (which I always forget how to do). On the server, check /etc/ssh/sshd_config for the line

X11Forwarding yes

And then run a local X server:

XWin -clipboard -emulate3buttons -multiwindow

Once this is running, from an xterm, run ssh, passing in the -X argument to enable X forwarding.

ssh -X -l username myserver

X11-based applications can then be run from this session.

Compiling The kdb/R interface on Win32

I have been playing with the kdb/R interface from kx.com, and had some problems installing with Cygwin gcc. It may be possible to get this to work with Cygwin gcc + a Win32 threads library, but in the meantime I installed MinGW, and it works perfectly. Here are the steps (basically as per the kx docs):

1. Download c.o from here: http://kx.com/q/w32/
2. gcc -c base.c -I. -I "${R_HOME}/include/"
3. gcc -Wl,--export-all-symbols -shared -o qserver.dll c.o base.o ${R_HOME}/bin/R.dll -lws2_32

The resulting qserver.dll can be loaded via dyn.load(), and then (just using the qserver.R supplied by kx) from within R:

source("qserver.R")
conn <- open_connection("server", 12345)
result <- execute(conn, "select avg bid by sym from fx_quote")
x <- as.data.frame(mapply(FUN=c, result))
> head(x, 10)
         V1                V2
1      AUD= 0.792402880224811
2    AUD=D2 0.791632149468651
3   AUD=EBS 0.790402776387278
4  AUDCHF=R  0.85955071021153
5  AUDJPY=R  75.0707755671935
6      BRL=  1.97194091379422
7      CAD=  1.15980648929715
8    CAD=D2  1.15962545479939
9   CAD=EBS  1.14104373919176
10 CADJPY=R  81.6389284332255

London UseR Group Talk – Slides

The inaugural London UseR event was a great success, with a lot of interesting people and a very constructive networking atmosphere!

I gave a (slightly disjointed) talk on concurrency and the bigmemory package in R (more on that later this year at UseR! 2009 in France).

The slides are here.

Project Euler Problem #28

Problem 28 on the Project Euler website asks for the sum of the numbers on both diagonals of a 1001×1001 clockwise spiral. This was an interesting one: the relationship between the numbers on the diagonals is easy to deduce, but expressing it succinctly in R took a little bit of tweaking. I’m sure it could be compressed even further.

# Problem 28
spiral.size <- function(n) {
        stopifnot(n %% 2 ==1)
        
        if (n==1) {
                return(1)
        }
        sum(cumsum(rep(2*seq(1,floor(n/2)), rep(4,floor(n/2))))+1)+1
}

spiral.size(1001)
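
The relationship between the diagonal entries can be written out explicitly. This is a derivation sketch added for illustration; the ring index k and the symbols m and S(n) are notation introduced here. Ring k of the spiral has side length 2k+1, its top-right corner holds (2k+1)^2, and the other three corners each step back by 2k, which is the increment the cumsum call accumulates four times per ring.

```latex
\[
\sum_{j=0}^{3}\bigl[(2k+1)^2 - 2kj\bigr] = 4(2k+1)^2 - 12k = 16k^2 + 4k + 4
\]
\[
S(n) = 1 + \sum_{k=1}^{m}\bigl(16k^2 + 4k + 4\bigr), \qquad m = \frac{n-1}{2}
\]
```

For the 5×5 spiral given in the problem statement, m = 2 and S(5) = 1 + 24 + 76 = 101, which agrees with the stated example.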

Project Euler Problem #22

Problem 22 on Project Euler provides a text file containing a large number of comma-delimited names and asks us to sort them alphabetically, then calculate the sum of each name’s alphabetical score multiplied by its position in the sorted list. This is made slightly easier by the presence of the predefined LETTERS variable in R.

problem22 <- function() {
        # The problem asks for positions in the alphabetically sorted list
        namelist <- sort(scan(file="c:/temp/names.txt", sep=",", what="", na.strings=""))
        sum(unlist(
                lapply(namelist, 
                        function(Z) which(namelist==Z) * sum(match(unlist(strsplit(Z,"")), LETTERS)))))
}
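
For comparison, the same calculation can be sketched in Ruby, working on an in-memory list rather than the names file. The helper names name_value and total_name_score are hypothetical, and the sketch assumes names made up only of the upper-case letters A to Z.

```ruby
# Alphabetical value of a name: A=1, B=2, ..., Z=26.
def name_value(name)
  name.each_char.sum { |ch| ch.ord - "A".ord + 1 }
end

# Sum of value * 1-based position over the alphabetically sorted list.
def total_name_score(names)
  scores = names.sort.each_with_index.map do |name, index|
    name_value(name) * (index + 1)
  end
  scores.sum
end

puts name_value("COLIN")           # 3 + 15 + 12 + 9 + 14 = 53
puts total_name_score(%w[BOB AL])  # AL: 13 * 1, BOB: 19 * 2, total 51
```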

Project Euler Problem #15

Problem 15 on Project Euler asks us to find the number of distinct routes between the top left and bottom right corners in a 20×20 grid, with no backtracking allowed.

I originally saw this type of problem tackled in the book Notes On Introductory Combinatorics, by George Polya amongst others. This book is hard to find now, but it is a really clear intro to combinatoric math.

The solution can be paraphrased as follows: on a 20×20 grid, every route from the top left to the bottom right consists of exactly 40 moves, since we must travel 20 squares across and 20 squares down, one square per move. Exactly half of these moves are left-to-right, and the other half are top-to-bottom. A route is fully determined by which 20 of the 40 moves go right, so the total number of distinct routes is the number of ways to choose 20 moves from the 40 required. So we need the combinatoric construct n-choose-k, or how many ways k items can be selected from n total items. This is written $\binom{n}{k}$.

In R, calculating $\binom{40}{20}$ is just:

choose(40, 20)
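
Written out as a formula (a standard identity, added here for illustration), the count for an n×n grid is:

```latex
\[
\binom{2n}{n} = \frac{(2n)!}{n!\,n!},
\qquad
\binom{40}{20} = \frac{40!}{20!\,20!}
\]
```

As a quick check against the 2×2 example in the problem statement, $\binom{4}{2} = \frac{4!}{2!\,2!} = 6$ routes.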