Integrating Rmathlib and kdb+

The R engine is usable in a variety of ways – one of the lesser-known features is that it provides a standalone math library that can be linked to from an external application. This library provides some nice functionality such as:

* Probability distribution functions (density, distribution and quantile functions)
* Random number generation for a large number of probability distributions

To make use of this functionality from q, I built a simple Rmathlib wrapper library. The C wrapper can be found here and is simply a set of functions that wrap the appropriate calls in Rmathlib. For example, a function that generates n Gaussian random values using the underlying rnorm() function is:

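/* Return a float vector of n draws from a normal distribution
   with mean mu and standard deviation sigma, using Rmathlib's rnorm(). */
K rnn(K n, K mu, K sigma) {
    int i, count = n->i;          /* reads n as an int atom; a long atom would need n->j */
    K ret = ktn(KF, count);       /* allocate a float vector of length count */
    for (i = 0; i < count; ++i)
        kF(ret)[i] = rnorm(mu->f, sigma->f);  /* mu and sigma are read as float atoms */
    return ret;
}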

These functions have to be imported and linked from a kdb+ session, which is done using the dynamic load operator (2:). I decided to automate the process of generating these load statements: the shell script below parses a set of function declarations in a delimited section of a C header file and produces the appropriate load statement for each one:

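#!/usr/bin/env bash
# Generate q load statements from the declarations in a C header.
INFILE=rmath.h
DLL=\`:rmath

echo "dll:$DLL"

# Keep the lines between the BEGIN DECL and END DECL markers,
# dropping the leading return type "K " from each declaration.
DECLARATIONS=$(awk '/\/\/ BEGIN DECL/ {f=1;next} /\/\/ END DECL/ {f=0} f {sub(/K /,"",$0);print $0}' "$INFILE")
# Read one declaration per line so that spaces inside an argument list do not split it.
while read -r decl; do
    [ -z "$decl" ] && continue          # skip blank lines
    FNAME=${decl%%(*}                   # function name: text before the first "("
    ARGS=${decl##"$FNAME"}              # remainder: the parenthesised argument list
    IFS=, read -r -a CMDARGS <<< "$ARGS"   # one array element per argument
    echo "${FNAME}:dll 2:(\`$FNAME;${#CMDARGS[*]})"
done <<< "$DECLARATIONS"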

echo "\\l rmath_aux.q"

This generates a set of link commands such as the following:

dll:`:rmath
rn:dll 2:(`rn;2)
rnn:dll 2:(`rnn;3)
dn:dll 2:(`dn;3)
pn:dll 2:(`pn;3)
qn:dll 2:(`qn;3)
sseed:dll 2:(`sseed;2)
gseed:dll 2:(`gseed;1)
nchoosek:dll 2:(`nchoosek;2)
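
To make the mapping from a declaration to a load statement concrete, here is a minimal sketch of the two steps involved: extracting the declarations between the marker comments, and counting the comma-separated arguments. The helper names extract_decls and link_stmt and the sample header text are made up for illustration and are not part of the project; the sketch assumes one declaration per line.

```shell
#!/usr/bin/env bash
# Illustrative sketch only: helper names and the sample header are hypothetical.

# Print the lines between the two marker comments,
# with the leading return type "K " removed from each.
extract_decls() {
    awk '/\/\/ BEGIN DECL/ {f=1; next} /\/\/ END DECL/ {f=0} f {sub(/K /, ""); print}'
}

# Turn one stripped declaration, e.g. "rnn(K,K,K);", into a q load statement.
link_stmt() {
    local decl=$1
    local fname=${decl%%(*}              # text before the first "(" is the function name
    local args=${decl#"$fname"}          # the rest is the parenthesised argument list
    local -a parts
    IFS=, read -r -a parts <<< "$args"   # split on commas: one element per argument
    printf '%s:dll 2:(`%s;%d)\n' "$fname" "$fname" "${#parts[@]}"
}

# A made-up header fragment: only the lines between the markers are exported.
sample_header='#include "k.h"
// BEGIN DECL
K rnn(K,K,K);
K gseed(K);
// END DECL
K not_exported(K);'

printf '%s\n' "$sample_header" | extract_decls | while read -r decl; do
    link_stmt "$decl"
done
```

Run as written, this prints rnn:dll 2:(`rnn;3) followed by gseed:dll 2:(`gseed;1). The argument count is simply the number of comma-separated pieces, so an empty argument list would still be counted as one argument.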

It also generates a call to load a second q script, rmath_aux.q, which contains a bunch of q wrappers and helper functions (I will write a separate post about that later).

A makefile is included which generates the shared lib (once the appropriate paths to the R source files are set) and the q scripts. A sample q session looks like the following:


q) \l rmath.q
q) x:rnorm 1000 / generate 1000 normal variates
q) dnorm[0;0;1] / normal density at 0 for a mean 0 sd 1 distribution

The project is available on github: https://github.com/rwinston/kdb-rmathlib.

Note that loading rmath.q loads the rmath dll, which in turn loads the rmathlib dll, so the rmathlib dll should be available on the dynamic library load path.
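
One way to put the Rmathlib library on the load path before starting q is sketched below. The directory /opt/rmathlib/lib and the helper name prepend_path are assumptions for illustration; substitute wherever your Rmathlib build actually lives. Linux consults LD_LIBRARY_PATH, while macOS uses DYLD_LIBRARY_PATH instead.

```shell
#!/usr/bin/env bash
# Hypothetical location of the Rmathlib shared library; adjust to your own build.
RMATH_LIB_DIR=/opt/rmathlib/lib

# Illustrative helper: prepend a directory to a colon-separated path list,
# without leaving a stray colon when the list is empty.
prepend_path() {
    local dir=$1 current=$2
    if [ -z "$current" ]; then
        printf '%s\n' "$dir"
    else
        printf '%s:%s\n' "$dir" "$current"
    fi
}

# Export the updated path so that a q process started from this shell inherits it.
LD_LIBRARY_PATH=$(prepend_path "$RMATH_LIB_DIR" "${LD_LIBRARY_PATH:-}")
export LD_LIBRARY_PATH
echo "$LD_LIBRARY_PATH"
```

The variable only affects processes launched from the same shell, so q has to be started after these lines have run (or the lines placed in a shell startup file).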

[Check out Part 2 of this series]