Working With Unicode Symbols in R and Vim

I came across a Vim feature today that I had forgotten about for a while: digraphs. A digraph is a two-character shortcut that Vim maps to a single, usually hard-to-type, character. In insert mode, typing <CTRL-K> followed by the two digraph characters inserts the mapped character into the file. For instance, to insert the Euro symbol (€), type

<CTRL-K>Eu

A list of existing digraphs can be shown using :digraphs (which can be abbreviated to :dig), and a new digraph can be defined with the same command by giving it the two characters and a decimal code point.

For instance, to add new digraphs for the Unicode male (♂) and female (♀) gender symbols, which are Unicode code points 9794/0x2642 and 9792/0x2640 respectively, you can type:

:dig Ma 9794
:dig Fe 9792

Similarly, R can insert Unicode characters into text, labels, and even data point symbols. One little-known convention in R is that a negative integer pch value is taken to be a Unicode code point (where the operating system and graphics device support it). The following example generates some random binomial data and then plots each data point with a male symbol if the value is even and a female symbol if it is odd. The symbols are also inserted into the plot title using the \u escape sequence.

> gender <- rbinom(n=100, size=100, prob=0.5)
> plot(gender, cex=2.5,
    pch=ifelse(gender %% 2 == 0, -0x2642L, -0x2640L),
    col=ifelse(gender %% 2 == 0, 2, 3), main="\u2640 and \u2642 Trials")

Plot Example
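
If you want to double-check the code points used above, base R can convert between numbers and characters. The following is only a small sketch: the variable names (male_cp, female_cp, male_chr, female_chr, pch_values) are made up for illustration, and whether the glyphs display correctly still depends on your locale and font.

```r
# Code points quoted in the post, written as hexadecimal integer literals.
# (Variable names here are illustrative, not from the original example.)
male_cp   <- 0x2642L
female_cp <- 0x2640L

# Hex and decimal are two spellings of the same number.
stopifnot(male_cp == 9794L, female_cp == 9792L)

# intToUtf8() turns a code point into a one-character string.
male_chr   <- intToUtf8(male_cp)
female_chr <- intToUtf8(female_cp)

# utf8ToInt() goes the other way, from a character back to its code point.
utf8ToInt(male_chr)    # 9794
utf8ToInt(female_chr)  # 9792

# The pch values passed to plot() are simply the negated code points.
pch_values <- -c(male_cp, female_cp)
```

The same decimal values are the ones given to Vim's :dig command earlier, so this is a quick way to look up the number for any other symbol you want to map.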