August 17th, 2010
Here is an example of a custom Checkstyle rule that catches the following situation: an exception is caught and an error message is logged, but the exception itself (and thus its stack trace) is never passed to the logger. This is not generally a problem, unless the exception tends to wrap an underlying cause (i.e. one or more nested exceptions), in which case the cause is lost. The rule is designed to catch instances like:
try {}
catch (Exception e) {
logger.error("blah");
}
With the assumption that the following is better:
try {}
catch (Exception e) {
logger.error("blah", e);
}
Here is the code for the custom check:
package au.com.national.efx.build;
import java.util.ArrayList;
import java.util.List;
import com.puppycrawl.tools.checkstyle.api.Check;
import com.puppycrawl.tools.checkstyle.api.DetailAST;
import com.puppycrawl.tools.checkstyle.api.TokenTypes;
/**
* Check that attempts to catch instances of the following:
* <code>
* catch (Exception e) { logger.error("foo"); }
* </code>
*
* with the assumption that the following is preferable:
*
* <code>
* catch (Exception e) { logger.error("foo", e); }
* </code>
* @author rwinston
*
*/
public class SwallowedExceptionInLoggerCheck extends Check {
@Override
public int[] getDefaultTokens() {
return new int[] { TokenTypes.LITERAL_CATCH };
}
/**
* Get ident of exception
* Try to find it in the logger error() parameter list
*/
@Override
public void visitToken(DetailAST aAST) {
super.visitToken(aAST);
final DetailAST parameterDef = aAST.findFirstToken(TokenTypes.PARAMETER_DEF);
final DetailAST ident = parameterDef.findFirstToken(TokenTypes.IDENT);
final String exceptionIdent = ident.getText();
final DetailAST slist = aAST.findFirstToken(TokenTypes.SLIST); // Find '{'
// Find all method calls within catch block
final List<DetailAST> variables = findChildASTsOfType(slist, TokenTypes.METHOD_CALL);
try {
for (DetailAST methodCall : variables) {
DetailAST dot = methodCall.findFirstToken(TokenTypes.DOT);
// Unqualified calls such as foo() have no DOT child; skip them
if (dot == null) continue;
// I'm assuming the last child will be the method name called
DetailAST lastIdent = dot.getLastChild();
if (lastIdent.getText().equals("error")) {
// Ok, now check that the ELIST contains an IDENT matching
// the exception name
DetailAST elist = methodCall.findFirstToken(TokenTypes.ELIST);
boolean exceptionInParameterList = false;
for (DetailAST identAST : findChildASTsOfType(elist, TokenTypes.IDENT)) {
if (identAST.getText().equals(exceptionIdent))
exceptionInParameterList = true;
}
if (!exceptionInParameterList) {
log(methodCall, "error() method does not contain caught Exception as a parameter");
}
}
}
} catch (Exception e) { e.printStackTrace(); }
}
/**
* Recursively traverse an expression tree and return all
* ASTs matching a specific token type
* @param parent
* @param type
* @return
*/
private List<DetailAST> findChildASTsOfType(DetailAST parent, int type) {
List<DetailAST> children = new ArrayList<DetailAST>();
DetailAST child = parent.getFirstChild();
while (child != null) {
if (child.getType() == type)
children.add(child);
else {
children.addAll(findChildASTsOfType(child, type));
}
child = child.getNextSibling();
}
return children;
}
}
Posted in Coding | No Comments »
August 16th, 2010
This is a solution for problem 21 on the Project Euler website. It consists of finding the sum of all the amicable numbers under 10000. This was pretty easy to solve, but the solution could probably be improved quite a bit.
Solution #1 in R is as follows (it calculates the proper divisors of each number using prop.divs, and then adds up the sequence of amicable numbers in the main function).
prop.divs <- function(x) {
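if (x == 1) return (0)  # 1 has no proper divisors
divs <- integer(30)
j <- 1
divs[j] <- 1
j <- j + 1
# seq_len(...)[-1] is empty when x < 4, whereas 2:1 would count down
for (i in seq_len(floor(x/2))[-1]) {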
if ((x %% i) == 0) {
divs[j] <- i
j <- j + 1
}
}
sum(divs[1:(j-1)])
}
problem.21 <- function(N) {
s <- 0
for (i in 2:N) {
da <- prop.divs(i)
if (da == i) next
db <- prop.divs(da)
if ( db==i ) {
s <- s + da + db
}
}
s/2
}
The s/2 is needed because each amicable pair is counted twice during the calculation, once from each of its two members.
This gives the correct answer, but the implementation is a bit naive. I remembered coming across an article about prime factors and proper divisors on PlanetMath a while ago, and this seemed like potentially a more efficient way to calculate the factors involved. Specifically, the sum of proper divisors of a number n can be given by:
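$$s(n) = \prod_{i=1}^{k} \frac{p_i^{a_i+1} - 1}{p_i - 1} - n, \qquad n = p_1^{a_1} p_2^{a_2} \cdots p_k^{a_k},$$
where the p_i are the distinct prime factors of n and the a_i are their exponents.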
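As a quick sanity check of the prime-factor approach, here is the arithmetic for the best-known amicable pair, 220 and 284. This is a worked illustration and not part of the original solution; it writes s(n) for the sum of proper divisors of n, computed as the product of (p^(a+1) - 1)/(p - 1) over the prime powers p^a dividing n, minus n itself.

```latex
\begin{align*}
220 &= 2^2 \cdot 5 \cdot 11, &
s(220) &= \frac{2^3-1}{2-1}\cdot\frac{5^2-1}{5-1}\cdot\frac{11^2-1}{11-1} - 220
        = 7 \cdot 6 \cdot 12 - 220 = 284,\\
284 &= 2^2 \cdot 71, &
s(284) &= \frac{2^3-1}{2-1}\cdot\frac{71^2-1}{71-1} - 284
        = 7 \cdot 72 - 284 = 220.
\end{align*}
```

Each number maps to the other, which is exactly the condition the main loop tests for.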
The second attempt at this problem looked like the following:
prime.sieve <- function(n) {
a <- seq.int(1,n)
p <- 1
M <- as.integer(sqrt(n))
while ((p <- p + 1) <= M) {
if (a[p] != 0)
a[seq.int(p*p, n, p)] <- 0
}
a[a>1 & a>0]
}
sum.proper.divisors <- function(x) {
primes <- prime.sieve( x )
primes <- primes[ x %% primes == 0]
geo.sum <- numeric(length(primes))
i <- 1
for (prime in primes) {
n <- x
curr <- 0
while (n %% prime == 0) {
curr <- curr + 1
n <- n %/% prime
}
geo.sum[i] <- ( (prime^(curr+1) - 1)/(prime - 1) )
i <- i + 1
}
prod(geo.sum)-x
}
problem.21_2 <- function(N) {
s <- 0
for (i in 2:N) {
da <- sum.proper.divisors(i)
if (da == i) next
db <- sum.proper.divisors(da)
if (db==i) s <- s + da +db
}
s/2
}
This also gives the correct answer, and runs roughly four times faster (about 25 seconds of user time versus 104):
> system.time(problem.21(10000))
user system elapsed
103.943 0.511 106.978
> system.time(problem.21_2(10000))
user system elapsed
24.834 0.160 26.565
Posted in Coding, Project Euler, R | No Comments »
August 16th, 2010
Recently I had to import a number of legacy projects into a Maven-type structure. I knocked up the following script to make the task easier for repeated applications. Basically what it does is the following:
- Tries to parse the filenames of jars in the current directory;
- Does a Nexus search to see if it can locate the artifact;
- If it finds the artifact in Nexus, it uses the GAV parameters for that artifact; otherwise, it generates a GAV stanza for inclusion in the POM;
- For each artifact that could not be located in Nexus, it generates a mvn deploy:deploy-file command to upload the dependency.
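The filename-parsing step can be tried out in isolation. The following is only a minimal sketch, assuming jar names of the form name-version.jar: the pattern JAR_NAME and the helper parse_jar_name are hypothetical names introduced for illustration and are not taken from the script, although the fallback to version 1.0 mirrors what the script does.

```ruby
# Hypothetical helper, separate from the script: split a jar file name into
# an artifact name and an optional trailing version such as 2.4 or 1.0-SNAPSHOT.
JAR_NAME = /\A(.+?)(?:-(\d+(?:\.\d+)*(?:-SNAPSHOT)?))?\.jar\z/

def parse_jar_name(file_name)
  match = JAR_NAME.match(file_name)
  return nil if match.nil? # not a .jar file name
  # Fall back to version 1.0 when the name carries no version
  { "artifact" => match[1], "version" => match[2] || "1.0" }
end

["commons-lang-2.4.jar", "log4j.jar", "foo-1.0-SNAPSHOT.jar"].each do |name|
  puts "#{name} -> #{parse_jar_name(name).inspect}"
end
```

Anchoring the pattern at both ends and escaping the dot before jar keeps names such as myjar.txt from matching by accident.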
Error handling is non-existent in this file, so caveat emptor!
#!/usr/bin/ruby
require 'net/http'
require 'rexml/document'
include REXML
# Nexus server
host="nexus-server"
port=8080
# Could also use GAV parameters in query string
url="http://#{host}:#{port}/nexus/service/local/data_index?q="
print "<dependencies>","\n"
# Open file for dependencies entries
depsFile=File.open("dependencies.xml",'w')
depsFile.write("<dependencies>\n")
# Track unprocessed jars
errors=[]
# Jars to be uploaded to Nexus
uploads=[]
for jarFile in Dir.glob("*.jar")
if jarFile =~ /([A-Za-z\-\.0-9]+?)(?:\-)?(\d+(?:\.\d+)*)?.jar/
artifact=$1
version=$2
if artifact =~ /(\S+)\.(.*)/
group=$1
artifact=$2
else
group=artifact.gsub("\-", "\.")
end
if version.nil?
version="1.0"
end
# Create a default stanza template
stanza = "<dependency>\n"
stanza << ' <groupId>#{group}</groupId>\n'
stanza << ' <artifactId>#{artifact}</artifactId>\n'
stanza << ' <version>#{version}</version>\n'
stanza << " <packaging>jar</packaging>\n"
stanza << "</dependency>\n\n"
puts "Searching Nexus for #{artifact}"
query="#{url}#{artifact}"
resp = Net::HTTP.get_response(URI.parse(query))
respDoc = REXML::Document.new(resp.body)
XPath.each(respDoc, "//search-results/totalCount") do |count|
matches = Integer(count.text)
if matches > 0
idx=1; artifactMap = {}
XPath.each( count, "//data/artifact") do |ar|
artifactId=ar.elements["artifactId"].text
artifactGroupId=ar.elements["groupId"].text
artifactVersion=ar.elements["version"].text
artifactRepo=ar.elements["repoId"].text
puts "#{idx}. #{artifactGroupId}:#{artifactId}:#{artifactVersion} (repo:#{artifactRepo})"
artifactMap[idx]={"group"=>artifactGroupId, "artifact"=>artifactId, "version"=>artifactVersion}
idx+=1
end
puts "Found #{matches} possible match(es) for #{group}:#{artifact}:#{version}"
puts "Enter artifact number, or enter to use generated artifact:"
num=gets.chomp
if !num.empty?
num=Integer(num)
entry=artifactMap[num]
group=entry["group"];artifact=entry["artifact"];version=entry["version"]
else
uploads.push( {"artifact"=>artifact,"group"=>group,"version"=>version,"file"=>jarFile} )
end
else
uploads.push( {"artifact"=>artifact,"group"=>group,"version"=>version,"file"=>jarFile} )
end
stanza = eval('"' + stanza + '"')
depsFile.write(stanza)
end
else
errors << jarFile
end
end
depsFile.write("</dependencies>\n")
depsFile.close
puts "Finished processing.\nCheck the file dependencies.xml for generated XML"
# Generate the upload script
File.open("upload_deps.sh", "w") { |f|
f.puts "#!/bin/bash\n\n"
uploads.each do |dep|
# Send the file to the correct repo
repo="http://#{host}:#{port}/nexus/content/repositories/"
if dep["version"] =~ /(.*)SNAPSHOT/
repo << "snapshots"
else
repo << "releases"
end
line="mvn deploy:deploy-file -Dfile=#{dep["file"]} -DartifactId=#{dep["artifact"]} -DgroupId=#{dep["group"]} -Dversion=#{dep["version"]} -Dpackaging=jar -DrepositoryId=nexus -DgeneratePom=true -Durl=#{repo}"
f.puts "echo \"Uploading #{dep["file"]}...\""
f.puts line
end
}
puts <<HERE
The file upload_deps.sh contains Maven commands to upload dependencies not currently in Nexus.
This assumes that you have a stanza in $MVN_HOME/conf/settings.xml as follows:
<server>
<id>nexus</id>
<username>$USERNAME</username>
<password>$PASSWORD</password>
</server>
Where the appropriate Nexus deployment credentials are in <username> and <password>.
HERE
if !errors.empty?
puts "The following jar files could not be processed and need to be manually defined:"
errors.each do |error|
puts error
end
end
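# The Nexus lookup can be exercised without a server. What follows is a
# rough sketch only: SAMPLE_RESPONSE is a hand-written document whose element
# names are taken from the XPath expressions used in the script (it is an
# assumption, not a captured Nexus response), and nexus_candidates is a
# hypothetical helper name.

```ruby
require 'rexml/document'

# Hand-written sample shaped after the XPath expressions in the script;
# an assumption, not a captured Nexus response.
SAMPLE_RESPONSE = <<~XML
  <search-results>
    <totalCount>2</totalCount>
    <data>
      <artifact>
        <groupId>commons-lang</groupId>
        <artifactId>commons-lang</artifactId>
        <version>2.4</version>
        <repoId>central</repoId>
      </artifact>
      <artifact>
        <groupId>commons-lang</groupId>
        <artifactId>commons-lang</artifactId>
        <version>2.3</version>
        <repoId>central</repoId>
      </artifact>
    </data>
  </search-results>
XML

# Hypothetical helper: return one hash per <artifact> element in the response.
def nexus_candidates(xml)
  doc = REXML::Document.new(xml)
  REXML::XPath.match(doc, "//data/artifact").map do |ar|
    {
      "group"    => ar.elements["groupId"].text,
      "artifact" => ar.elements["artifactId"].text,
      "version"  => ar.elements["version"].text,
      "repo"     => ar.elements["repoId"].text
    }
  end
end

# Print a numbered list of candidates, in the same style as the script
nexus_candidates(SAMPLE_RESPONSE).each_with_index do |c, idx|
  puts "#{idx + 1}. #{c["group"]}:#{c["artifact"]}:#{c["version"]} (repo:#{c["repo"]})"
end
```

Returning plain hashes keeps the parsing separate from the interactive prompt, so the selection logic can be tested against canned XML.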
Posted in Coding | No Comments »
August 17th, 2009
Today is officially “pencils-down” day for the Google Summer of Code Project 2009. I have been a mentor this year for the Apache Commons-Net SSH project, which aims to add SSH and SCP support to Commons-Net.
The project has been a great success, mainly down to the super work performed by the student, Shikhar, who has put in tremendous work to get a fully functional SSH/SCP client built (with thanks to the efforts of the Mina SSHD project, whose codebase we originally based the effort on). All of the goals for the project have been ticked, and some extra ones accomplished too.
I had great help and input from Chico, another Apache committer, throughout the project, and so it’s been a great experience all round. This will form the basis of a release, but for now, the code is hosted on googlecode at: http://code.google.com/p/commons-net-ssh/
Posted in Coding | No Comments »
August 17th, 2009
Running R on a Linux server in headless mode (i.e. producing graphics without an X server running) can be tricky. Some people recommend using a virtual X framebuffer. However, I’ve found that the best approach (at least in my opinion) is to use the R interface to Cairo. This allows R to produce PNG graphics in headless mode, and also produces very nice-looking graphs. I configured R as follows (after downloading and building pixman-0.15.18 and cairo-1.8.8):
./configure --with-gnu-ld --with-x --with-cairo
This will produce an R binary with cairo support that can be run non-interactively and produce graphical output – very useful for running automated statistical reports.
You can check that Cairo support is enabled by checking the return value of the capabilities() function:
> capabilities()
jpeg png tiff tcltk X11 aqua http/ftp sockets
TRUE TRUE FALSE FALSE FALSE FALSE TRUE TRUE
libxml fifo cledit iconv NLS profmem cairo
TRUE TRUE TRUE TRUE TRUE FALSE TRUE
Finally, some notes on connecting X11 clients using Cygwin (which I always forget how to do). On the server, check /etc/ssh/sshd_config for the line
X11Forwarding yes
And then run a local X server:
XWin -clipboard -emulate3buttons -multiwindow
Once this is running, from an xterm, run ssh, passing in the -X argument to enable X forwarding.
ssh -X -l username myserver
X11-based applications can then be run from this session.
Posted in Coding, R | No Comments »
July 15th, 2009
I have been playing with Lilypond a little more, and here are some themes that I have transcribed. They are all from John Carpenter movies: the first is the theme from “Halloween”, which is an interesting 5/4 time piano riff; the second is the synth theme from “Escape From New York”; and the last is the ending credits music of “The Thing” (known as “Humanity Part II” on the soundtrack album). The last one is a bit messy as I couldn’t get Lilypond to hold the ties across multiple bars properly.

This is still not 100% finished – there are some omissions. Plus, my transcription may be incorrect in places.
Here is the PDF:
carpenter.pdf
And the Lilypond source:
carpenter
Posted in Music | No Comments »
July 15th, 2009
Last week I presented a short talk at the 2009 UseR conference. The conference was the usual mix of varied topics (even more varied than usual this year) and a lot of interesting discussions.
Here are the slides.
Incidentally, Mango Solutions have set up a website for the London R user group meetings here: http://www.londonr.org/
Posted in Coding | No Comments »
July 14th, 2009
I have been playing with the kdb/R interface from kx.com, and had some problems installing with Cygwin gcc. It may be possible to get this to work with Cygwin gcc + a Win32 threads library, but in the meantime I installed MinGW, and it works perfectly. Here are the steps (basically as per the kx docs):
1. Download c.o from here: http://kx.com/q/w32/
2. gcc -c base.c -I. -I "${R_HOME}/include/"
3. gcc -Wl,--export-all-symbols -shared -o qserver.dll c.o base.o ${R_HOME}/bin/R.dll -lws2_32
The resulting qserver.dll can be loaded via dyn.load(), and then (just using the qserver.R supplied by kx) from within R:
source("qserver.R")
conn <- open_connection("server", 12345)
result <- execute(conn, "select avg bid by sym from fx_quote")
x <- as.data.frame(mapply(FUN=c, result))
> head(x, 10)
V1 V2
1 AUD= 0.792402880224811
2 AUD=D2 0.791632149468651
3 AUD=EBS 0.790402776387278
4 AUDCHF=R 0.85955071021153
5 AUDJPY=R 75.0707755671935
6 BRL= 1.97194091379422
7 CAD= 1.15980648929715
8 CAD=D2 1.15962545479939
9 CAD=EBS 1.14104373919176
10 CADJPY=R 81.6389284332255
Posted in Coding | No Comments »
July 2nd, 2009
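Here is a handy way to get awk to preprocess a line and add a timestamp (put here as I will probably forget how to do this again straight away!). Note that the shell expands the date command once, before awk starts, so every line of input gets the same timestamp.
echo "foo,bar" | awk '{x="'"`date +%Y%m%d%H%M%S%N`"'"; printf "%s,%s\n",x,$0 }'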
Posted in Coding | No Comments »
April 2nd, 2009
The inaugural London UseR event was a great success, with a lot of interesting people and a very constructive networking atmosphere!
I gave a (slightly disjointed) talk on concurrency and the bigmemory package in R (more on that later this year at UseR! 2009 in France).
The slides are here.
Posted in Coding, R | No Comments »