## Market Making and The Win/Loss Ratio

The article https://online.wsj.com/public/resources/documents/VirtuOverview.pdf is a neat little illustration of a simple asymptotic toy distribution, given a fixed probability of a win or loss per trade. It is used as an example of the basic methodology behind a working market-making business: develop a small edge and scale it up as cheaply as possible to maximise the probability of overall profit.

If we take $p=0.51$ as the probability of a win per trade, then after $n$ transactions we will have a number of 'wins' $k$ that varies from 0 to $n$. We model each trade as the outcome of an independent binomial 0-1 trial.

In order to come out at breakeven or better, the number of wins k needs to be at least $\frac{n}{2}$. Using the binomial distribution this can be modelled as:

$P\left(k \ge \frac{n}{2}\right) = \sum_{k=\lceil n/2 \rceil}^{n} \frac{n!}{k!(n-k)!}p^k(1-p)^{n-k}$

As the binomial distribution converges to a normal $\mathcal{N}(np, np(1-p))$ as n gets large, we can use the distribution below to model the win/loss probability over n:

$\int_{\frac{n}{2}}^\infty \mathcal{N}\left(np, np(1-p) \right) dx$

Which is

$\int_{\frac{n}{2}}^\infty \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2} dx$

Where $\mu=np$ and $\sigma^2=np(1-p)$
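
Standardising the integral makes the dependence on $n$ explicit. A short worked step, writing $\Phi$ for the standard normal CDF (notation introduced here, not used in the original):

$P\left(k \ge \frac{n}{2}\right) \approx 1-\Phi\left(\frac{\frac{n}{2}-np}{\sqrt{np(1-p)}}\right) = \Phi\left(\frac{\sqrt{n}\left(p-\frac{1}{2}\right)}{\sqrt{p(1-p)}}\right)$

For $p>\frac{1}{2}$ the argument grows like $\sqrt{n}$, so the probability tends to 1 as $n$ increases. As a check, $p=0.51$ and $n=100$ give an argument of about 0.2, and $\Phi(0.2)\approx 0.579$.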

This can be modelled in R:

> p <- 0.51
> n <- 100
> 1-pnorm(q=n/2, mean=n*p, sd=sqrt(n*p*(1-p)))
[1] 0.5792754
> n <- 1000
> 1-pnorm(q=n/2, mean=n*p, sd=sqrt(n*p*(1-p)))
[1] 0.7364967

This shows that with a win probability of 51%, 100 trades give us roughly a 58% probability of breakeven or better, and 1000 trades give us roughly a 74% probability of breakeven or better.
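
As a sanity check on the normal approximation, the same probabilities can be computed from the exact binomial tail. The sketch below is an addition, not part of the original analysis; the variable names `normal_approx` and `exact` are illustrative. Because the discrete tail includes the full probability mass at $k = n/2$, the exact values should come out a little higher than the approximation.

```r
p <- 0.51
n <- c(100, 1000)

# Normal approximation, as used in the post
normal_approx <- 1 - pnorm(q = n / 2, mean = n * p, sd = sqrt(n * p * (1 - p)))

# Exact binomial tail: P(k >= ceiling(n/2)) = P(k > ceiling(n/2) - 1)
exact <- pbinom(ceiling(n / 2) - 1, size = n, prob = p, lower.tail = FALSE)

print(data.frame(n, normal_approx, exact))
```

The gap between the two columns narrows as $n$ grows, which is the convergence the approximation relies on.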

We can plot the probability of breakeven holding p constant and changing n from 1 to 1000:

> n <- seq(1,1000)
> y <- 1-pnorm(q=n/2, mean=n*p, sd=sqrt(n*p*(1-p)))
> library(ggplot2)
> library(scales)
> qplot(n,y)+scale_y_continuous(labels=percent)

Which produces the following graph

This shows the probability of breakeven or better converging towards 100% as $n$ gets large, which holds for any $p>0.5$.
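
The convergence can also be read the other way round: how many trades are needed to reach a given probability of breakeven or better? A minimal sketch, obtained by inverting the normal approximation for $n$ and assuming $p>0.5$. The helper name `trades_needed` and the 95% target are assumptions for illustration, not from the original post.

```r
# trades_needed is a hypothetical helper, not from the original post.
# It solves 1 - pnorm(n/2, n*p, sqrt(n*p*(1-p))) >= target for n,
# which rearranges to sqrt(n) >= qnorm(target) * sqrt(p*(1-p)) / (p - 0.5).
trades_needed <- function(p, target) {
  stopifnot(p > 0.5, p < 1, target > 0.5, target < 1)
  z <- qnorm(target)
  ceiling((z * sqrt(p * (1 - p)) / (p - 0.5))^2)
}

n_req <- trades_needed(p = 0.51, target = 0.95)
print(n_req)

# Check the result against the formula used in the post
check <- 1 - pnorm(q = n_req / 2, mean = n_req * 0.51, sd = sqrt(n_req * 0.51 * 0.49))
print(check)
```

With a 51% edge the required number of trades runs into the thousands, which is why scaling cheaply matters so much in this model.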

To make it more interesting we can generate different paths for n from 1 to 10000, but also vary the win probability from, say, 45% to 55%, and look at the paths as we vary n and p:

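n <- seq(1, 10000)
# Baseline path for p = 0.5, drawn in black
p <- 0.5
y <- 1 - pnorm(q=n/2, mean=n*p, sd=sqrt(n*p*(1-p)))
plot(n, y, type='l', ylim=c(0,1))

# Overlay paths for win probabilities between 45% and 55%
probs <- seq(0.45, 0.55, length.out = 100)
for (p in probs) {
  y <- 1 - pnorm(q=n/2, mean=n*p, sd=sqrt(n*p*(1-p)))
  # Red for a losing edge (p < 0.5), green for a winning edge (p > 0.5)
  lines(x=n, y=y, col=ifelse(p < 0.5, rgb(1,0,0,.5), rgb(0,1,0,.5)))
}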

Which shows the probabilities of breakeven or better given a number of different starting win/loss probabilities and a varying number of trades. The path with $p=0.5$ is shown in black.
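
For readers who prefer numbers to a plot, the same calculation can be tabulated on a small grid. This is only a sketch: the helper name `breakeven_prob` and the grid values are chosen for illustration and do not come from the original post.

```r
# breakeven_prob is a hypothetical helper wrapping the formula used in the post
breakeven_prob <- function(n, p) {
  1 - pnorm(q = n / 2, mean = n * p, sd = sqrt(n * p * (1 - p)))
}

n_grid <- c(100, 1000, 10000)
p_grid <- c(0.49, 0.50, 0.51)

# Rows are trade counts, columns are per-trade win probabilities
tab <- outer(n_grid, p_grid, breakeven_prob)
dimnames(tab) <- list(n = n_grid, p = p_grid)
print(round(tab, 3))
```

The $p=0.5$ column stays at 0.5 for every $n$, while the columns either side move towards 0 and 1 respectively as $n$ grows.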

## Approximating e

I was reading Simon Singh's The Simpsons And Their Mathematical Secrets today and he mentioned a simple method for approximating e: given a uniform RNG, e can be approximated by the average number of draws required for the sum of the draws to exceed 1. This is a neat little demonstration and easy to generate in R. We take the uniform RNG, compute the average number of draws required to exceed a sum of one, and then repeat this with an increasing number of replications to illustrate convergence:

library(ggplot2)

# Function to calculate number of draws required
gen <- function() {
  r <- runif(1, 0, 1)
  n <- 1
  while (r < 1) {
    r <- r + runif(1, 0, 1)
    n <- n + 1
  }
  return(n)
}
# Generate a series of sample sizes from 2 .. 2^16 (65536)
N <- 2^seq(1, 16)
# Generate simulation: average draw count over x replications, for each x in N
y <- sapply(N, function(x) mean(replicate(x, gen())))
# Plot convergence against the true value of e (red line)
qplot(1:16, y) + geom_line(linetype=2) + geom_hline(aes(yintercept=exp(1)), color='red')
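
A short sketch of why the average converges to $e$, writing $N$ for the number of draws required and $U_i$ for the individual uniform draws (notation introduced here, not in the original). The sum of the first $n$ draws is still below 1 with probability $\frac{1}{n!}$, the volume of the unit simplex, so

$E[N] = \sum_{n=0}^{\infty} P(N>n) = \sum_{n=0}^{\infty} P(U_1+\dots+U_n<1) = \sum_{n=0}^{\infty}\frac{1}{n!} = e$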

## Functional Selects/Updates in kdb+

Functional selects/updates are a relatively tricky topic in kdb+, mainly because the syntax takes a lot of getting used to. They are normally required when a query has dynamic elements, e.g. column selection or grouping criteria that are only known at run time.

They are pretty well covered in Q For Mortals, but I wanted to add a couple of examples…combining functional select and update for example:

To start, load the sample tables in sp.q. We will use the table called 'p':

q)\l sp.q
q)p
p | name  color weight city   w
--| ----------------------------
p1| nut   red   12     london 91
p2| bolt  green 17     paris  91
p3| screw blue  17     rome   91
p4| screw red   14     london 91
p5| cam   blue  12     paris  91
p6| cog   red   19     london 91

### Functional Selects

A simple select from p with some criteria:

q)select from p where city=`london
p | name  color weight city   w
--| ----------------------------
p1| nut   red   12     london 91
p4| screw red   14     london 91
p6| cog   red   19     london 91

Now let's look at the parse tree for this query:

q)parse "select from p where city=`london"
?
`p
,,(=;`city;,`london)
0b
()

Now in order to convert this to a functional select, we need to turn this parse tree into an executable statement using the ? operator.

The basic form of the ? operator is

?[tablename;(select criteria);(grouping criteria);(columns)]

The parse tree above gives us each of the four elements in the right order – we just need to convert them to a valid functional syntax. For the example above this translates to:

q)?[p;enlist (=;`city;enlist `london);0b;()]
p | name  color weight city   w
--| ----------------------------
p1| nut   red   12     london 91
p4| screw red   14     london 91
p6| cog   red   19     london 91

So if we want to add e.g. column selection:

q)select name,color,weight from p where city=`london
name  color weight
------------------
nut   red   12
screw red   14
cog   red   19

The parse tree looks like:

q)parse"select name,color,weight from p where city=`london"
?
`p
,,(=;`city;,`london)
0b
`name`color`weight!`name`color`weight

In this case we have added a dictionary mapping selected columns to their output names:

q)?[p;enlist (=;`city;enlist `london);0b;(`name`color`weight)!(`name`color`weight)]
name  color weight
------------------
nut   red   12
screw red   14
cog   red   19

Similarly, we can change the select criteria:

q)select name,color,weight from p where city in `london`paris
name  color weight
------------------
nut   red   12
bolt  green 17
screw red   14
cam   blue  12
cog   red   19

Which produces the following parse tree:

q)parse"select name,color,weight from p where city in `london`paris"
?
`p
,,(in;`city;,`london`paris)
0b
`name`color`weight!`name`color`weight

This is only a small modification to the original functional select:

q)?[p;enlist (in;`city;enlist `london`paris);0b;(`name`color`weight)!(`name`color`weight)]
name  color weight
------------------
nut   red   12
bolt  green 17
screw red   14
cam   blue  12
cog   red   19

Functional updates follow an almost identical form, but use the ! operator, e.g.:

q)update w:sum weight by city from p
p | name  color weight city   w
--| ----------------------------
p1| nut   red   12     london 45
p2| bolt  green 17     paris  29
p3| screw blue  17     rome   17
p4| screw red   14     london 45
p5| cam   blue  12     paris  29
p6| cog   red   19     london 45
q)parse "update w:sum weight by city from p"
!
`p
()
(,`city)!,`city
(,`w)!,(sum;`weight)

This parse tree maps to:

q)![p;();(enlist `city)!enlist `city;(enlist `w)!enlist (sum;`weight)]
p | name  color weight city   w
--| ----------------------------
p1| nut   red   12     london 45
p2| bolt  green 17     paris  29
p3| screw blue  17     rome   17
p4| screw red   14     london 45
p5| cam   blue  12     paris  29
p6| cog   red   19     london 45

Now if we want to update the output from a select, e.g. a simple grouped update:

q)update w:sum weight by color from select name,color,weight from p where city in `london`paris
name  color weight w
---------------------
nut   red   12     45
bolt  green 17     17
screw red   14     45
cam   blue  12     12
cog   red   19     45

The parse tree for this looks like:

q)parse"update w:sum weight by color from select name,color,weight from p where city in `london`paris"
!
(?;`p;,,(in;`city;,`london`paris);0b;`name`color`weight!`name`color`weight)
()
(,`color)!,`color
(,`w)!,(sum;`weight)

This parse tree looks complex, but the main complexity comes from the nested functional select within.

We could explicitly write the entire function (nested select and update):

q)![?[p;enlist (in;`city;enlist `london`paris);0b;`name`color`weight!`name`color`weight];();(enlist `color)!enlist `color;(enlist `w)!enlist (sum;`weight)]
name  color weight w
---------------------
nut   red   12     45
bolt  green 17     17
screw red   14     45
cam   blue  12     12
cog   red   19     45

However it may be easier to read if we store the select in its own variable:

q)sel::?[p;enlist (in;`city;enlist `london`paris);0b;`name`color`weight!`name`color`weight]
q)sel
name  color weight
------------------
nut   red   12
bolt  green 17
screw red   14
cam   blue  12
cog   red   19

And then refer to the select thus:

q)![sel;();(enlist `color)!enlist `color;(enlist `w)!enlist (sum;`weight)]
name  color weight w
---------------------
nut   red   12     45
bolt  green 17     17
screw red   14     45
cam   blue  12     12
cog   red   19     45