Author: Rory Winston

Coding

Deducing the JDK Version of a .jar File

By Rory Winston
February 16, 2011

Here is a little script that uses Python to examine the contents of a jar file (specifically, the first .class file it comes across), read the major version byte, and map it to a JDK version. It may be useful if you have a bunch of jars compiled by different JDKs and want to figure out which is which.

[sourcecode language="python"]#!/usr/bin/env python3

import zipfile
import sys
import re

# Matches any archive entry whose name ends in ".class"
class_pattern = re.compile(r"/?\w*\.class$")

for arg in sys.argv[1:]:
    print('%s:' % arg, end=' ')
    jar = zipfile.ZipFile(arg, "r")

    for entry in jar.filelist:
        if class_pattern.search(entry.filename):
            data = jar.read(entry.filename)
            # Byte 7 is the low byte of the big-endian major version field
            maj_version = data[7]
            if maj_version == 45:
                print("JDK 1.1")
            elif maj_version == 46:
                print("JDK 1.2")
            elif maj_version == 47:
                print("JDK 1.3")
            elif maj_version == 48:
                print("JDK 1.4")
            elif maj_version == 49:
                print("JDK 5.0")
            elif maj_version == 50:
                print("JDK 6.0")
            else:
                print("unknown (major version %d)" % maj_version)
            # Only the first class file in each jar is examined
            break
[/sourcecode]
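The mapping in the script stops at JDK 6. As a rough sketch of how it could be generalised: a class file starts with a 4-byte magic number (0xCAFEBABE) followed by a 2-byte minor and a 2-byte major version, all big-endian, and from Java 5 onwards the major version is the release number plus 44. The helper names below (`class_file_version`, `jdk_name`) are my own for illustration and are not part of the original script.

```python
import struct

# Major versions for the pre-Java-5 releases, which used the "1.x" naming.
LEGACY_NAMES = {45: "JDK 1.1", 46: "JDK 1.2", 47: "JDK 1.3", 48: "JDK 1.4"}


def class_file_version(data):
    """Return (major, minor) from the first 8 bytes of a .class file."""
    # Header layout: u4 magic, u2 minor_version, u2 major_version (big-endian).
    magic, minor, major = struct.unpack(">IHH", data[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    return major, minor


def jdk_name(major):
    """Map a class file major version to a JDK release name."""
    if major in LEGACY_NAMES:
        return LEGACY_NAMES[major]
    if major >= 49:
        # Java 5 is major version 49, Java 6 is 50, and so on.
        return "JDK %d" % (major - 44)
    return "unknown (major version %d)" % major


# Header of a class compiled for Java 6: magic, minor 0, major 50.
header = bytes([0xCA, 0xFE, 0xBA, 0xBE, 0x00, 0x00, 0x00, 0x32])
print(jdk_name(class_file_version(header)[0]))
```

Checking the magic number first also guards against an entry that merely has a .class suffix but is not a real class file.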

Coding R

Gdb Macros for R

By Rory Winston
September 6, 2010

When debugging R interactively, one hurdle to navigate is unwrapping SEXP objects to get at the inner data. Gdb has some useful macro functionality that allows you to wrap useful command sequences in reusable chunks. I recently put together the following macro that attempts to extract and print some useful info from a SEXP object.

It can be used as follows. For instance, given a SEXP called “e”:

(gdb) dumpsxp e
Type: LANGSXP (6)
Function:Type: SYMSXP (1)
"<-"
Args:Type: LISTSXP (2)
(SYMSXP,LISTSXP)

We can see that e is a LANGSXP, and the operator is “<-”. A function call has two components: here we can see the function itself (the SYMSXP) and the function arguments (the LISTSXP).

Some knowledge of LANGSXP structure is useful here. For instance, if we know that for a LANGSXP, CAR(x) gives us the function and CDR(x) gives us the arguments, we can view the components individually.

To see the first component:

(gdb) dumpsxp CAR(e)
Type: SYMSXP (1)
"<-"

The arguments are given by the CDR of e. We can then crack open the list and view the function arguments, recursively looking through the pairlist until we get to the end:

(gdb) dumpsxp CDR(e)
Type: LISTSXP (2)
(SYMSXP,LISTSXP)
(gdb) dumpsxp CADR(e)
Type: SYMSXP (1)
"x"
(gdb) dumpsxp CADDR(e)
Type: LANGSXP (6)
Function:Type: SYMSXP (1)
"sin"
Args:Type: LISTSXP (2)
(REALSXP,NILSXP)
(gdb) dumpsxp CADDDR(e)
Type: NILSXP (0)

The NILSXP tells us that we’ve got to the end of the list.
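
To make the CAR/CDR accessors concrete, here is a toy Python model of the call `x <- sin(...)` as a chain of cons cells. It only mimics the shape of a pairlist; it is not how R lays out SEXPs in memory, and the `Cons` class and helper functions are invented for illustration. The argument to `sin` is a placeholder value, since the session above only shows that it is a REALSXP.

```python
class Cons:
    """A toy cons cell: car holds a value, cdr points at the rest of the list."""

    def __init__(self, car, cdr=None):
        self.car = car
        self.cdr = cdr  # None plays the role of R_NilValue (NILSXP)


def car(cell):
    return cell.car


def cdr(cell):
    return cell.cdr


def cadr(cell):
    # CADR(x) is CAR(CDR(x)): the first argument of a call
    return car(cdr(cell))


def caddr(cell):
    # CADDR(x) is CAR(CDR(CDR(x))): the second argument of a call
    return car(cdr(cdr(cell)))


def to_list(cell):
    """Walk the cdr chain until the nil terminator, collecting each car."""
    items = []
    while cell is not None:
        items.append(cell.car)
        cell = cell.cdr
    return items


# sin(1.0) as a call: function symbol first, then the argument list.
# The value 1.0 is a placeholder; the session only shows a REALSXP there.
inner = Cons("sin", Cons(1.0))
# x <- sin(1.0): the operator, then the two arguments "x" and the inner call
e = Cons("<-", Cons("x", Cons(inner)))

print(car(e))             # the function: <-
print(cadr(e))            # first argument: x
print(to_list(caddr(e)))  # second argument, itself a call: ['sin', 1.0]
```

Each extra "D" in the accessor name is one more step along the cdr chain, which is why the session above moves from CADR to CADDR to CADDDR.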

The GDB macro is below. Put it in your .gdbinit to automatically load it when gdb starts up.

[sourcecode language="c"]
define dumpsxp
  # gdb's "return" pops a stack frame of the program being debugged rather
  # than leaving the macro, so the body is guarded with if/else instead.
  if $arg0==0
    printf "uninitialized variable\n"
  else
    set $sexptype=TYPEOF($arg0)

    # Typename
    printf "Type: %s (%d)\n", typename($arg0), $sexptype

    # SYMSXP
    if $sexptype==1
      # CHAR(PRINTNAME(x))
      print_char PRINTNAME($arg0)
    end

    # LISTSXP
    if $sexptype==2
      printf "(%s,%s)\n", typename(CAR($arg0)), typename(CDR($arg0))
    end

    # CLOSXP
    if $sexptype==3
      dumpsxp BODY($arg0)
    end

    # PROMSXP
    # Promises contain pointers to value, expr and env
    # tmp = eval(tmp, rho);
    if $sexptype==5
      printf "Promise under evaluation: %d\n", PRSEEN($arg0)
      printf "Expression: "
      dumpsxp ($arg0)->u.promsxp.expr
      # Expression: (CAR(chain))->u.promsxp.expr
    end

    # LANGSXP
    if $sexptype==6
      printf "Function:"
      dumpsxp CAR($arg0)
      printf "Args:"
      dumpsxp CDR($arg0)
    end

    # SPECIALSXP
    if $sexptype==7
      printf "Special function: %s\n", R_FunTab[($arg0)->u.primsxp.offset].name
    end

    # BUILTINSXP
    if $sexptype==8
      printf "Function: %s\n", R_FunTab[($arg0)->u.primsxp.offset].name
    end

    # CHARSXP
    if $sexptype==9
      # Cast the pointer (not the dereferenced struct) before using ->
      printf "length=%d\n", ((VECTOR_SEXPREC *) ($arg0))->vecsxp.length
      #print_veclen $arg0
      print_char $arg0
    end

    # LGLSXP
    if $sexptype==10
      set $lgl=*LOGICAL($arg0)
      if $lgl > 0
        printf "TRUE\n"
      end
      if $lgl == 0
        printf "FALSE\n"
      end
    end

    # INTSXP
    if $sexptype==13
      printf "%d\n", *(INTEGER($arg0))
    end

    # REALSXP
    if $sexptype==14
      print_veclen $arg0
      print_double $arg0
    end

    # STRSXP
    if $sexptype==16
      print_veclen $arg0
      set $i=LENGTH($arg0)
      set $count=0
      while ($count < $i)
        printf "Element #%d:\n", $count
        dumpsxp STRING_ELT($arg0,$count)
        set $count = $count + 1
      end
    end

    # VECSXP
    if $sexptype==19
      print_veclen $arg0
    end

    # RAWSXP
    if $sexptype==24
      print_veclen $arg0
    end
  end
end

define print_veclen
  printf "Vector length=%d\n", LENGTH($arg0)
end

define print_char
  # this may be a bit dodgy, as I am not using the aligned union
  printf "\"%s\"\n", (const char*)((VECTOR_SEXPREC *) ($arg0)+1)
end

define print_double
  # Dereference the data pointer so %g receives the first element, not an address
  printf "%g\n", *(double*)((VECTOR_SEXPREC *) ($arg0)+1)
end
[/sourcecode]

Coding

Custom Checkstyle Rule Example

By Rory Winston
August 17, 2010

Here is an example of a custom Checkstyle rule that catches the following situation: sometimes an Exception can be caught and an error message logged, but the underlying exception (and thus the stack trace) may not be logged at all. This is not generally a problem, unless the exception tends to wrap an underlying cause (i.e. one or more nested exceptions), in which case the information about the cause is lost. The rule is designed to catch instances like:

[sourcecode language="java"]
try {}
catch (Exception e) {
    logger.error("blah");
}
[/sourcecode]

With the assumption that the following is better:

[sourcecode language="java"]
try {}
catch (Exception e) {
    logger.error("blah", e);
}
[/sourcecode]

Here is the code for the custom check:

[sourcecode language="java"]
package au.com.national.efx.build;

import java.util.ArrayList;
import java.util.List;

import com.puppycrawl.tools.checkstyle.api.Check;
import com.puppycrawl.tools.checkstyle.api.DetailAST;
import com.puppycrawl.tools.checkstyle.api.TokenTypes;

/**
 * Check that attempts to catch instances of the following:
 * <code>
 * catch (Exception e) { logger.error("foo"); }
 * </code>
 *
 * with the assumption that the following is preferable:
 *
 * <code>
 * catch (Exception e) { logger.error("foo", e); }
 * </code>
 * @author rwinston
 *
 */
public class SwallowedExceptionInLoggerCheck extends Check {

    @Override
    public int[] getDefaultTokens() {
        return new int[] { TokenTypes.LITERAL_CATCH };
    }

    /**
     * Get ident of exception
     * Try to find it in logger error parameter list
     */
    @Override
    public void visitToken(DetailAST aAST) {
        super.visitToken(aAST);

        final DetailAST parameterDef = aAST.findFirstToken(TokenTypes.PARAMETER_DEF);
        final DetailAST ident = parameterDef.findFirstToken(TokenTypes.IDENT);
        final String exceptionIdent = ident.getText();

        final DetailAST slist = aAST.findFirstToken(TokenTypes.SLIST); // Find '{'
        // Find all method calls within catch block
        final List<DetailAST> variables = findChildASTsOfType(slist, TokenTypes.METHOD_CALL);

        try {
            for (DetailAST methodCall : variables) {
                DetailAST dot = methodCall.findFirstToken(TokenTypes.DOT);

                // Unqualified calls such as foo() have no DOT child; skip them
                if (dot == null) {
                    continue;
                }

                // I'm assuming the last child will be the method name called
                DetailAST lastIdent = dot.getLastChild();

                if (lastIdent.getText().equals("error")) {
                    // Ok, now check that the ELIST contains an IDENT matching
                    // the exception name
                    DetailAST elist = methodCall.findFirstToken(TokenTypes.ELIST);
                    boolean exceptionInParameterList = false;
                    for (DetailAST identAST : findChildASTsOfType(elist, TokenTypes.IDENT)) {
                        if (identAST.getText().equals(exceptionIdent))
                            exceptionInParameterList = true;
                    }

                    if (!exceptionInParameterList) {
                        log(methodCall, "error() method does not contain caught Exception as a parameter");
                    }
                }
            }
        } catch (Exception e) { e.printStackTrace(); }

    }

    /**
     * Recursively traverse an expression tree and return all
     * ASTs matching a specific token type
     * @param parent the node whose descendants are searched
     * @param type the token type to match
     * @return the matching descendants, in traversal order
     */
    private List<DetailAST> findChildASTsOfType(DetailAST parent, int type) {
        List<DetailAST> children = new ArrayList<DetailAST>();

        DetailAST child = parent.getFirstChild();
        while (child != null) {
            if (child.getType() == type)
                children.add(child);
            else {
                children.addAll(findChildASTsOfType(child, type));
            }
            child = child.getNextSibling();
        }
        return children;
    }

}

[/sourcecode]
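
One property of `findChildASTsOfType` worth spelling out is that it stops descending once it finds a matching node, so a match nested inside another match is not returned. The following Python sketch mirrors that traversal on a toy tree. The `Node` class and the string token names are stand-ins I made up, not Checkstyle's API, and the tree is only a rough approximation of the argument list Checkstyle would build.

```python
class Node:
    """Toy stand-in for an AST node: a type tag, optional text, and children."""

    def __init__(self, type_, text="", children=()):
        self.type = type_
        self.text = text
        self.children = list(children)


def find_descendants_of_type(parent, type_):
    """Collect descendants of the given type, without descending into matches."""
    found = []
    for child in parent.children:
        if child.type == type_:
            found.append(child)
        else:
            found.extend(find_descendants_of_type(child, type_))
    return found


# Rough shape of the argument list in logger.error("blah", e)
elist = Node("ELIST", children=[
    Node("EXPR", children=[Node("STRING_LITERAL", '"blah"')]),
    Node("EXPR", children=[Node("IDENT", "e")]),
])

idents = [n.text for n in find_descendants_of_type(elist, "IDENT")]
print("e" in idents)  # True: the caught exception is passed to the logger
```

In the check itself, the same comparison against the caught exception's identifier decides whether a violation is logged.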