Pairs Trade – Theory

Pairs trading is one of the simplest market-neutral statistical arbitrage strategies.  The goal is to find a pair of securities that have historically moved up and down in a highly correlated fashion but whose price differential is temporarily at an extreme.  We then go long the relatively cheap security and simultaneously short the other.  If the price differential promptly reverts to its normal level, we realise a profit.  In this post we review some useful statistical concepts (such as cointegration and stationary processes) and discuss the types of securities that are likely to form good trading pairs.

Mathematical Foundation

For a mean-reverting strategy to work, we need to find a stationary time series, one with a time-invariant mean that the series keeps returning to.  Most of the financial time series we encounter are not stationary.  Think about the S&P500 level now vs 1 year ago vs 10 years ago vs 50 years ago.  It is hard to believe that the S&P500 is mean-reverting to some long-term average value.  Only a very limited number of financial time series (currency pairs between similar economies, e.g. NZDAUD) show even weak signs of being stationary.  The beauty of pairs trading is that we can construct a stationary time series from a linear combination of two suitable non-stationary time series (in other words, the two series are cointegrated).  This gives us a much bigger pool in which to search for possible trading pairs.

Stationary Process

Consider a simple AR(1) process y_t = r y_{t-1} + ε_t, with ε_t being the noise term.  If |r| is less than 1, the process is stationary.  If r is larger than 1, the process is explosive.  If r equals 1 (i.e. the process possesses a unit root), it becomes a random walk.  Neither of the last two is stationary.  Explosive processes are rare in financial time series (with the exception of hyperinflation), but random walks are a common occurrence.
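
To make the difference between the stationary and the random-walk case concrete, here is a minimal simulation sketch.  The function names, the Gaussian noise and the parameter values are assumptions chosen for illustration.  It compares the spread of y_n across many simulated paths: for r = 0.5 the variance settles near σ²/(1 - r²), while for r = 1 it keeps growing with the number of steps.

```python
import random
import statistics

def simulate_ar1(r, n, seed=0, sigma=1.0):
    """Simulate y_t = r * y_{t-1} + eps_t with y_0 = 0 and Gaussian noise."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n):
        y.append(r * y[-1] + rng.gauss(0.0, sigma))
    return y

def terminal_variance(r, n=200, paths=500):
    """Variance of the final value y_n across many independent paths."""
    finals = [simulate_ar1(r, n, seed=s)[-1] for s in range(paths)]
    return statistics.pvariance(finals)

# Theoretical values: sigma^2 / (1 - r^2) ≈ 1.33 for r = 0.5,
# and n * sigma^2 = 200 for the random walk (r = 1).
var_stationary = terminal_variance(0.5)
var_random_walk = terminal_variance(1.0)
```

The simulated variances should land close to the theoretical values, although the exact numbers depend on the seeds.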

The Augmented Dickey-Fuller (ADF) test, Δy_t = ρ y_{t-1} + δ_1 Δy_{t-1} + … + δ_n Δy_{t-n} + ε_t, is a popular test for a unit root.  Its lagged difference terms account for serial correlation.  Here we look at the simpler Dickey-Fuller test, which ignores the lag terms.  First, we introduce the first difference operator Δy_t = y_t - y_{t-1}, so that the AR(1) process becomes Δy_t = (r - 1) y_{t-1} + ε_t = ρ y_{t-1} + ε_t.  A unit root (r = 1) corresponds to ρ = 0.

The null hypothesis of the Dickey-Fuller test is the presence of a unit root (ρ = 0).  The test statistic is computed in the same way as when testing whether an OLS regression coefficient equals zero: \hat{\rho} / \text{SE}(\hat{\rho}).  However, the explanatory variable and the noise terms are not independent, and under the null hypothesis the explanatory variable is itself non-stationary, so the standard t-distribution does not apply.  The test statistic instead follows a special distribution, the Dickey-Fuller distribution, and the null is rejected when the statistic is sufficiently negative.
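
The statistic can be computed by hand in a few lines.  The sketch below is illustrative rather than a replacement for a statistics library: the function names and the simulated inputs are assumptions, and -1.95 is the approximate large-sample 5% critical value for the Dickey-Fuller regression without a constant or lags.

```python
import math
import random

def dickey_fuller_stat(y):
    """t-ratio of rho in dy_t = rho * y_{t-1} + eps_t (no constant, no lags)."""
    lagged = y[:-1]
    diffs = [b - a for a, b in zip(y[:-1], y[1:])]
    sxx = sum(v * v for v in lagged)
    rho_hat = sum(v * d for v, d in zip(lagged, diffs)) / sxx
    resid = [d - rho_hat * v for v, d in zip(lagged, diffs)]
    s2 = sum(e * e for e in resid) / (len(diffs) - 1)  # residual variance
    return rho_hat / math.sqrt(s2 / sxx)               # rho_hat / SE(rho_hat)

def ar1_path(r, n, seed):
    """Hypothetical helper: simulate y_t = r * y_{t-1} + eps_t from y_0 = 0."""
    rng = random.Random(seed)
    y = [0.0]
    for _ in range(n):
        y.append(r * y[-1] + rng.gauss(0.0, 1.0))
    return y

# Approximate 5% critical value for the no-constant Dickey-Fuller case.
DF_CRIT_5PCT = -1.95

stat_stationary = dickey_fuller_stat(ar1_path(0.5, 500, seed=1))
stat_random_walk = dickey_fuller_stat(ar1_path(1.0, 500, seed=1))
reject_unit_root = stat_stationary < DF_CRIT_5PCT
```

For the stationary series the statistic should be far below the critical value.  For the random walk it usually is not, although about 5% of simulated random walks will be rejected by chance.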


A stationary time series X_t has an Order of Integration of zero, denoted I(0).  The time series Z_t formed by the cumulative sum of X_t, Z_t = \sum_{k=1}^{t} X_k, has an order of one, I(1), and so on.
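
The relationship runs both ways: summing raises the order of integration by one, and first-differencing lowers it by one.  A tiny sketch with made-up numbers (the helper names are assumptions):

```python
from itertools import accumulate

def integrate(x):
    """Cumulative sum: turns an I(d) series into an I(d+1) series."""
    return list(accumulate(x))

def difference(z):
    """First difference; the leading value is kept so the round trip is exact."""
    return [z[0]] + [b - a for a, b in zip(z[:-1], z[1:])]

x = [1, -2, 3, 0, -1]   # stand-in for a stationary I(0) series
z = integrate(x)        # I(1): [1, -1, 2, 2, 1]
assert difference(z) == x
```

This is why differencing an I(1) price series gives an I(0) series of price changes.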

A collection of time series variables is said to be cointegrated if each has an order of integration of n and there exists a linear combination of them with an Order of Integration of less than n.  For many securities, the daily return time series can be shown to be stationary, I(0), while the price time series is I(1).  We can therefore identify cointegrated securities by finding a linear combination of their price series that is stationary.

The Engle-Granger 2-step approach is an intuitive testing method.  In the first step, the two input price series, and the return series obtained by taking their first differences, are run through the ADF unit-root test.  If the price series are non-stationary, I(1), and the return series are stationary, I(0), we can advance to step 2.  In the second step, a simple OLS regression is run between the two price series.  The regression coefficient gives the hedge ratio between the price of y and the price of x.  We then run the ADF unit-root test on the residuals of the regression.  If the null hypothesis of a unit root in the residuals can be rejected, the residuals are stationary and the two variables are likely to be cointegrated.  Because the hedge ratio is itself estimated, the critical values for this residual test are more negative than the standard ADF ones.
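
Here is a minimal sketch of the second step on synthetic data.  The assumptions are as follows: the function names are made up, the residual test is the plain Dickey-Fuller version without lags, the data are simulated so that the pair is cointegrated by construction, and -3.34 is the approximate asymptotic 5% critical value for the two-variable case with a constant.  Step 1 (checking that both price series are I(1)) is omitted for brevity.

```python
import math
import random

def ols(x, y):
    """Fit y = alpha + beta * x by least squares; return (alpha, beta)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    beta = sxy / sxx
    return my - beta * mx, beta

def df_stat(u):
    """Dickey-Fuller t-ratio for du_t = rho * u_{t-1} + e_t (no constant, no lags)."""
    lag = u[:-1]
    d = [b - a for a, b in zip(u[:-1], u[1:])]
    sxx = sum(v * v for v in lag)
    rho = sum(v * w for v, w in zip(lag, d)) / sxx
    s2 = sum((w - rho * v) ** 2 for v, w in zip(lag, d)) / (len(d) - 1)
    return rho / math.sqrt(s2 / sxx)

def engle_granger_step2(x, y):
    """Hedge ratio from OLS, then a unit-root statistic on the residuals."""
    alpha, beta = ols(x, y)
    resid = [b - alpha - beta * a for a, b in zip(x, y)]
    return beta, df_stat(resid)

# Synthetic prices: x is a random walk, y = 2 + 1.5 * x + stationary noise.
rng = random.Random(42)
x = [100.0]
for _ in range(999):
    x.append(x[-1] + rng.gauss(0.0, 1.0))
y = [2.0 + 1.5 * a + rng.gauss(0.0, 1.0) for a in x]

# Approximate 5% critical value for the residual-based test (two variables, constant).
EG_CRIT_5PCT = -3.34

hedge_ratio, stat = engle_granger_step2(x, y)
cointegrated = stat < EG_CRIT_5PCT
```

On this constructed pair the estimated hedge ratio should come out close to 1.5 and the residual statistic far below the critical value.  Real price data would call for the lag-augmented test from a statistics library.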

While we restrict our discussion to two variables below, the idea of cointegration (and hence “pairs” trading) can be extended to three or more securities.  The statistical method used would be different, with the Johansen methodology being the preferred solution.

Hunting Ground for Pairs Trade

Nothing stops us from running a brute-force search for possible pairs-trade candidates.  With an investable universe of a few thousand securities (say shares of publicly listed companies, ETFs and futures contracts), the search would not take long: the computational requirement is just O(N^2).  However, the results would likely be filled with pairs that pass the test only because of spurious correlation.
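
A quick back-of-the-envelope sketch shows both the size of the search and why chance hits are a problem.  The tickers and the 5% significance level are assumptions for illustration.  If every test has a 5% false-positive rate and no pair were truly cointegrated, we would still expect 5% of all pairs to pass.

```python
from itertools import combinations

def candidate_pairs(tickers):
    """All unordered pairs: N * (N - 1) / 2 of them."""
    return list(combinations(tickers, 2))

def expected_false_positives(n_securities, significance=0.05):
    """Pairs expected to pass by chance if no pair were truly cointegrated."""
    n_pairs = n_securities * (n_securities - 1) // 2
    return n_pairs, n_pairs * significance

pairs = candidate_pairs(["XOM", "CVX", "F", "GM"])  # 6 pairs
n_pairs, n_false = expected_false_positives(3000)   # 4,498,500 pairs, about 224,925 chance hits
```

With 3,000 securities there are roughly 4.5 million pairs to test, so a brute-force search can return hundreds of thousands of candidates with no economic link at all.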

It makes more sense to examine security pairs which have economic reasons to trade up and down together (“stochastically conditionally independent”).  Here are some examples:

  • Shares with multiple classes: e.g. Royal Dutch Shell A vs Royal Dutch Shell B 
  • ETFs or Futures vs their underlying assets/indices
  • Companies with a similar business model that are not direct competitors: e.g. Exxon vs Chevron (both significantly affected by the oil price) but not Apple vs Samsung (when one releases a superior new model, it dents the other’s market).
  • Securities with a strong linkage in the supply chain: e.g. car manufacturers (Ford, GM…) vs car parts manufacturers (Dana, Lear…) or gold ETF vs gold miner ETF

In general, forming pairs between baskets of shares (e.g. equity ETFs) should work better than choosing shares of individual companies.  Idiosyncratic corporate developments can lead to a decoupling in performance between historically closely related companies.  In a basket of companies, these effects largely cancel out.
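
The cancellation effect can be illustrated with a small simulation.  It rests on a strong simplifying assumption: company-specific shocks are independent across names with equal variance σ², which real sector baskets only approximate.  Under that assumption, the idiosyncratic variance of an equal-weight basket of N names is σ²/N.  The function name and parameters are made up for this sketch.

```python
import random
import statistics

def basket_idiosyncratic_var(n_names, n_days=2000, sigma=1.0, seed=7):
    """Variance of the equal-weight average of independent idiosyncratic shocks."""
    rng = random.Random(seed)
    basket = [
        sum(rng.gauss(0.0, sigma) for _ in range(n_names)) / n_names
        for _ in range(n_days)
    ]
    return statistics.pvariance(basket)

# Theoretical values: sigma^2 = 1 for a single name, sigma^2 / 25 = 0.04 for 25 names.
var_single = basket_idiosyncratic_var(1)
var_basket = basket_idiosyncratic_var(25)
```

The basket's idiosyncratic variance should come out at roughly one twenty-fifth of the single-name figure, so a spread between two baskets is less likely to be driven by one company's news.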

In the next post, we are going to see some real-world examples.