To understand how this strategy works, we first need some theoretical background. Let’s start with what pairs trading actually is.
Background
Pairs trading is a market-neutral strategy that lets the trader profit from divergences in the returns of correlated assets. The most familiar example is the Coca-Cola and Pepsi pairs trade. Both companies operate in the same industry and are exposed to similar risks and upsides, so the performance of the two stocks generally tends to be similar.
When the performance of the two stocks diverges by a significant amount (e.g., Coca-Cola goes up 5% while Pepsi goes up only 1%), the trader buys the underperforming shares (here Pepsi, since its return is lower) and shorts the outperforming shares (here Coca-Cola, since its return is higher).
A profit is realized when the spread between the two stocks narrows. The two legs do not need to converge from opposite directions; often the profit comes when one leg of the trade “catches up” and closes the spread. In that scenario, you lose money on one leg but make more than you lost on the leg that catches up.
That is the meat and potatoes of how pairs trading works. The strategy has been used and studied for decades, however, so while opportunities in stocks like Coke and Pepsi still exist, they aren’t significant enough to be worth the effort. There are a few other areas of the market where the strategy can be extraordinarily effective.
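To make the “catch-up” case concrete, here is a minimal sketch of the arithmetic. It assumes equal dollar amounts in each leg and ignores fees and borrowing costs; the function name and the return figures are hypothetical and chosen only for illustration.

```python
def pairs_trade_pnl(long_return_pct: float, short_return_pct: float) -> float:
    """Net return (in %, per dollar in each leg) of an equal-dollar pairs trade.

    The long leg earns the stock's return; the short leg earns its negative.
    """
    return long_return_pct - short_return_pct


# Hypothetical scenario after entry: the lagging stock (our long leg) rallies 3%
# while the leading stock (our short leg) rises another 1%.
long_leg_pnl = 3.0    # profit on the long leg
short_leg_pnl = -1.0  # loss on the short leg, since the shorted stock went up
net = pairs_trade_pnl(long_return_pct=3.0, short_return_pct=1.0)

print(f"long leg: {long_leg_pnl:+.1f}%, short leg: {short_leg_pnl:+.1f}%, net: {net:+.1f}%")
```

The short leg loses money here, but the long leg gains more, so the position as a whole is profitable because the spread narrowed by two percentage points.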
Sector-Based Pairs Trading
Pairs trading works because the stocks involved often have valid economic reasons for being correlated (e.g., similar industry, different securities for the same underlying company, similar macroeconomic exposure, etc.). As you expand the scope of stocks looked at, you will notice that while the correlations remain strong, they become imperfect — but this is where the largest opportunities lie.
Let’s look at this from a sector-based perspective. We first ask the question, how correlated are the top 10 stocks of the Technology sector (by market cap)? To answer that, we pulled about a year of data and ran a correlation matrix for those stocks and this was the output:
As demonstrated, the top stocks of the given sector are strongly, positively correlated. Does this relationship work for other sectors? Let’s try the same thing, but for the Financial sector:
While the correlations aren’t as high as the Technology sector, it is clear that there are still strong, positive correlations between the 10 largest components.
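For readers who want to run this kind of check themselves, the sketch below shows one way to build a correlation matrix with pandas. The data is synthetic (three made-up tickers driven by a shared sector factor plus noise), so the numbers it prints are not the sector results discussed above; in practice you would substitute real daily returns.

```python
import numpy as np
import pandas as pd

# Synthetic data for illustration only: three hypothetical tickers whose
# daily returns share a common "sector" factor plus stock-specific noise.
rng = np.random.default_rng(seed=0)
n_days = 252  # roughly one year of trading days
sector_factor = rng.normal(0.0, 0.01, n_days)

returns = pd.DataFrame({
    ticker: sector_factor + rng.normal(0.0, 0.005, n_days)
    for ticker in ["AAA", "BBB", "CCC"]
})

# Pairwise Pearson correlation of the daily returns
correlation_matrix = returns.corr()
print(correlation_matrix.round(2))
```

Computing the correlation on returns rather than on raw prices is a deliberate choice here, since trending price levels can look correlated even when the day-to-day moves are not.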
Implementation
Since we know that the top 10 stocks of a given sector are strongly correlated, we can use that as the basis for a pairs trading strategy. The strategy takes the top 10 stocks of a sector and splits the basket in two. Given the high correlation, the performance of the two baskets should be very similar, so when their performance diverges (the difference is the “spread”), we can buy the underperforming basket and short the outperforming basket.
Thankfully, the QuantGlobal API handles the bulk of the calculations, so we can get our universe with just a few lines of code.
For this example, we’ll use the Financials sector:
# First, import the necessary packages
import pandas as pd
import QuantGlobal as qg

# This variable will store the underlying basket data, which includes
# the ticker symbols, prices, and performance.
underlying_data = qg.download(key = 'authenticated_user@email.com',
                              strategy = 'pt_extended',
                              underlying = 'financials',
                              from_date = '2022-11-29',
                              end_date = '2022-11-30')

# This variable will store the spread index values to act as a trade signal.
index_data = qg.download(key = 'authenticated_user@email.com',
                         strategy = 'pt',
                         underlying = 'financials',
                         from_date = '2022-11-29',
                         end_date = '2022-11-30')
The underlying_data variable will hold a DataFrame structured as follows:
datetime | open | high | low | close | volume | Ticker | Returns | Cumulative Returns |
---|---|---|---|---|---|---|---|---|
2022-11-29 09:30:00 | 311.89 | 312.5 | 311.57 | 311.71 | 70840 | BRK.B | 1.0 | 100.0 |
2022-11-29 09:30:00 | 209.35 | 210.1 | 208.82 | 209.83 | 133051 | V | 1.0 | 100.0 |
2022-11-29 09:30:00 | 134.66 | 134.7 | 134.48 | 134.48 | 163079 | JPM | 1.0 | 100.0 |
2022-11-29 09:30:00 | 343.65 | 343.65 | 342.18 | 342.88 | 39725 | MA | 1.0 | 100.0 |
2022-11-29 09:30:00 | 36.98 | 37.04 | 36.89 | 36.89 | 364617 | BAC | 1.0 | 100.0 |
2022-11-29 09:30:00 | 47.18 | 47.24 | 47.08 | 47.09 | 145988 | WFC | 1.0 | 100.0 |
2022-11-29 09:30:00 | 89.64 | 89.64 | 89.6 | 89.6 | 200 | MS | 1.0 | 100.0 |
2022-11-29 09:30:00 | 79.81 | 80.29 | 79.81 | 80.225 | 45060 | SCHW | 1.0 | 100.0 |
2022-11-29 09:30:00 | 381.13 | 381.8 | 379.26 | 379.89 | 43124 | GS | 1.0 | 100.0 |
2022-11-29 09:30:00 | 30.55 | 30.57 | 30.52 | 30.54 | 15030 | HSBC | 1.0 | 100.0 |
The index_data variable will hold a DataFrame structured as follows:
datetime | Spread |
---|---|
2022-11-29 09:30:00 | 0.0 |
2022-11-29 09:31:00 | 0.1275532893 |
2022-11-29 09:32:00 | 0.053932754 |
2022-11-29 09:33:00 | 0.0714814048 |
2022-11-29 09:34:00 | 0.0398588432 |
Trading
Now that the data is stored, we can focus on building the actual strategy that will be traded.
We will put on a pairs trade when the index spread increases above our designated threshold. Just to clarify, the spread represents the absolute difference between the two equal-weighted baskets. To better understand this, consider the below scenario:
- Basket A: BRK.B, V, JPM, MA, BAC
- Basket B: WFC, MS, SCHW, GS, HSBC
- Both baskets are equal-weighted
- If Basket A returns 2% but Basket B returns 1%, the index spread will be 1 (abs(Basket A performance - Basket B performance)).
- If we buy Basket B and short Basket A, we make money when the index spread drops below 1 (or whatever the value was when we entered the trade).
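The scenario above can be sketched in a few lines. This follows the description in the text (equal-weighted baskets, absolute difference); whether the API computes its spread in exactly this way is an assumption, and the per-stock returns are hypothetical values picked so that the baskets average 2% and 1%.

```python
def basket_return(returns_pct: list[float]) -> float:
    """Equal-weighted basket return: the simple average of member returns."""
    return sum(returns_pct) / len(returns_pct)


def index_spread(basket_a_pct: list[float], basket_b_pct: list[float]) -> float:
    """Absolute difference between the two basket returns."""
    return abs(basket_return(basket_a_pct) - basket_return(basket_b_pct))


# Hypothetical per-stock returns (in %), five stocks per basket
basket_a = [2.5, 1.5, 2.0, 3.0, 1.0]  # averages 2%
basket_b = [1.0, 0.5, 1.5, 1.0, 1.0]  # averages 1%

spread = index_spread(basket_a, basket_b)
underperformer = "B" if basket_return(basket_b) < basket_return(basket_a) else "A"
print(f"spread = {spread:.2f}, buy basket {underperformer}")
```

Because the spread is an absolute value, it says how far apart the baskets are but not which one is lagging, so the direction of the trade has to come from comparing the basket returns themselves.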
It might be a bit tricky to navigate buying/selling 10 different stocks at once, so for this example, we’ll trade using just 2 stocks out of the 10. The two stocks will be the highest performing share and the lowest performing share. This pair will have the largest effect on the spread as they are the “outliers”, and as such, they offer the largest profit opportunity.
Every broker’s API looks different, but here is the general code flow of how we would put on this trade:
# When the index spread crosses to/above this level, we want to put on a trade
opening_threshold = 0.75
# When the spread crosses to/below this level, we close the trade
closing_threshold = 0.25

# We check the most recent value of the spread to see if it is at the threshold
if index_data['Spread'].iloc[-1] >= opening_threshold:
    # If the index is indeed above/at our threshold, we get the symbols we need.
    # First, keep only the rows for the most recent timestamp
    # (this assumes the datetime is the DataFrame index).
    most_recent_underlying_data = underlying_data[underlying_data.index == underlying_data.index[-1]]
    # Then get the tickers with the smallest/largest cumulative return
    cumulative_returns = most_recent_underlying_data['Cumulative Returns']
    lowest_performer = most_recent_underlying_data['Ticker'][cumulative_returns == cumulative_returns.min()].iloc[0]
    highest_performer = most_recent_underlying_data['Ticker'][cumulative_returns == cumulative_returns.max()].iloc[0]
    # Now, we submit the pairs trade order
    # "broker_api" is a placeholder for the api used to submit your orders
    # Any broker capable of buying/selling stocks is compatible with this strategy.
    long_order = broker_api.buy(lowest_performer)
    short_order = broker_api.short(highest_performer)
else:
    # If the most recent value isn't above our threshold, do nothing.
    pass
Using the logic above, the opening condition would be triggered at around 11:21 on that day, when the spread reaches roughly 0.75:
datetime | Spread |
---|---|
2022-11-29 11:17:00 | 0.6706872513 |
2022-11-29 11:18:00 | 0.7012132456 |
2022-11-29 11:19:00 | 0.7261917145 |
2022-11-29 11:20:00 | 0.7460098272 |
2022-11-29 11:21:00 | 0.7474594075 |
datetime | open | high | low | close | volume | Ticker | Returns | Cumulative Returns |
---|---|---|---|---|---|---|---|---|
2022-11-29 11:21:00 | 30.65 | 30.65 | 30.62 | 30.62 | 4551 | HSBC | -0.0008157938 | 100.2642869952 |
2022-11-29 11:21:00 | 382.02 | 382.02 | 381.84 | 381.92 | 1800 | GS | 7.85567e-05 | 100.5363113498 |
2022-11-29 11:21:00 | 80.25 | 80.3 | 80.21 | 80.2382 | 3460 | SCHW | -0.000333894 | 100.022271221 |
2022-11-29 11:21:00 | 47.42 | 47.435 | 47.4 | 47.405 | 11880 | WFC | -0.0003163222 | 100.6685160494 |
2022-11-29 11:21:00 | 90.52 | 90.55 | 90.52 | 90.53 | 6222 | MS | 0.0002209701 | 101.0362170394 |
2022-11-29 11:21:00 | 341.06 | 341.1899 | 340.96 | 340.96 | 500 | MA | -0.0001173021 | 99.4410901608 |
2022-11-29 11:21:00 | 312.03 | 312.1 | 312.03 | 312.1 | 1777 | BRK.B | 0.0002403654 | 100.1267140605 |
2022-11-29 11:21:00 | 36.83 | 36.8399 | 36.8124 | 36.8399 | 70360 | BAC | 0.0005404671 | 99.8665539121 |
2022-11-29 11:21:00 | 135.2 | 135.24 | 135.16 | 135.22 | 12639 | JPM | 0.0002219099 | 100.5503721585 |
2022-11-29 11:21:00 | 207.54 | 207.57 | 207.435 | 207.48 | 7448 | V | -0.0001445713 | 98.8763150782 |
With the spread at roughly the opening threshold (the 0.7475 shown at 11:21 sits just under 0.75, so a strict >= check fires on the first bar that actually reaches it), the script would buy the weakest-performing share (V) and sell short the strongest-performing share (MS).
Now that the position is on, the program needs to wait for the index to fall back to or below the closing threshold. Again, each broker’s API works differently, so here is just a boilerplate version of how that would flow:
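As a cross-check, the selection can be reproduced from the Ticker and Cumulative Returns columns of the 11:21 snapshot. This is a small standalone sketch using pandas; the variable names are illustrative and the values are copied from the table above.

```python
import pandas as pd

# Ticker and Cumulative Returns columns from the 11:21 snapshot
snapshot = pd.DataFrame({
    "Ticker": ["HSBC", "GS", "SCHW", "WFC", "MS", "MA", "BRK.B", "BAC", "JPM", "V"],
    "Cumulative Returns": [
        100.2642869952, 100.5363113498, 100.022271221, 100.6685160494,
        101.0362170394, 99.4410901608, 100.1267140605, 99.8665539121,
        100.5503721585, 98.8763150782,
    ],
})

# Index by ticker so idxmin/idxmax return the symbol directly
by_ticker = snapshot.set_index("Ticker")["Cumulative Returns"]
lowest_performer = by_ticker.idxmin()
highest_performer = by_ticker.idxmax()
print(lowest_performer, highest_performer)  # V MS
```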
# When the spread crosses to/below this level, we close the trade
closing_threshold = 0.25

# We check the most recent value of the spread to see if it is at the threshold
if index_data['Spread'].iloc[-1] <= closing_threshold:
    # If the index is indeed below/at our threshold, we close the open positions.
    close_long_position = broker_api.sell(lowest_performer)
    close_short_position = broker_api.buy(highest_performer)
else:
    # If the most recent value isn't below/at our threshold, do nothing.
    pass
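The opening and closing checks can also be combined into a single loop that remembers whether a position is currently on, so the script does not try to open a second trade or close one that does not exist. The sketch below is one possible way to do that; the function name and the example spread values are hypothetical and are not taken from the API output.

```python
def generate_signals(spreads, opening_threshold=0.75, closing_threshold=0.25):
    """Return (bar_index, action) pairs for a sequence of spread values.

    Opens when flat and the spread is at/above the opening threshold;
    closes when in a trade and the spread is at/below the closing threshold.
    """
    signals = []
    in_trade = False
    for i, spread in enumerate(spreads):
        if not in_trade and spread >= opening_threshold:
            signals.append((i, "open"))
            in_trade = True
        elif in_trade and spread <= closing_threshold:
            signals.append((i, "close"))
            in_trade = False
    return signals


# Hypothetical spread path for illustration
example_spreads = [0.10, 0.40, 0.76, 0.90, 0.50, 0.24, 0.10, 0.80]
print(generate_signals(example_spreads))
# [(2, 'open'), (5, 'close'), (7, 'open')]
```

The gap between the two thresholds means a spread hovering around 0.75 opens only one trade, rather than flipping in and out on every bar.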
The closing condition would be triggered at 15:07 (3:07 PM) on that day:
datetime | Spread |
---|---|
2022-11-29 15:03:00 | 0.2729414397 |
2022-11-29 15:04:00 | 0.2729105446 |
2022-11-29 15:05:00 | 0.2603440284 |
2022-11-29 15:06:00 | 0.2559272209 |
2022-11-29 15:07:00 | 0.2399088933 |
datetime | open | high | low | close | volume | Ticker | Returns | Cumulative Returns |
---|---|---|---|---|---|---|---|---|
2022-11-29 15:07:00 | 208.29 | 208.35 | 208.29 | 208.3 | 3970 | V | 9.60246e-05 | 99.2729078343 |
2022-11-29 15:07:00 | 30.5 | 30.5 | 30.495 | 30.495 | 4345 | HSBC | 0.0 | 99.8565930014 |
2022-11-29 15:07:00 | 382.65 | 382.74 | 382.62 | 382.69 | 1600 | GS | 0.0001306711 | 100.7394665405 |
2022-11-29 15:07:00 | 80.085 | 80.135 | 80.085 | 80.13 | 8567 | SCHW | 0.000561903 | 99.8890787827 |
2022-11-29 15:07:00 | 91.14 | 91.16 | 91.13 | 91.14 | 4840 | MS | -5.48576e-05 | 99.7096601473 |
2022-11-29 15:07:00 | 314.085 | 314.195 | 314.085 | 314.16 | 5535 | BRK.B | 0.0002228661 | 100.785884532 |
2022-11-29 15:07:00 | 36.975 | 36.985 | 36.96 | 36.985 | 30897 | BAC | 0.0001352082 | 100.2612721446 |
2022-11-29 15:07:00 | 343.165 | 343.26 | 343.14 | 343.22 | 1100 | MA | 8.74151e-05 | 100.1036418925 |
2022-11-29 15:07:00 | 136.29 | 136.34 | 136.29 | 136.34 | 11130 | JPM | 0.0003668648 | 101.3765070009 |
2022-11-29 15:07:00 | 47.45 | 47.46 | 47.445 | 47.455 | 6229 | WFC | 0.0001074818 | 100.7755559356 |
As demonstrated, this trade yielded a profit of about 1.72%. You can calculate PnL directly from the Cumulative Returns column. In this example, we bought V at 98.88 and sold it at 99.27. Since the index starts each day at 100, each point corresponds to 1% of the day’s opening price, so this is a profit of about 0.39% on the V leg. We were also short MS at 101.04 and bought it back at 99.71, for a profit of about 1.33%. Tallied up, the net profit on the position was about 1.72% (0.39% + 1.33%).
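The same tally can be reproduced from the unrounded Cumulative Returns values in the 11:21 and 15:07 tables. This is a minimal sketch that, like the text, measures each leg in index points (percent of the day's opening price) and ignores fees and position sizing.

```python
# Cumulative Returns values from the entry (11:21) and exit (15:07) tables
entry = {"V": 98.8763150782, "MS": 101.0362170394}
exit_ = {"V": 99.2729078343, "MS": 99.7096601473}

# The long leg gains when the index value rises; the short leg gains when it falls
long_pnl = exit_["V"] - entry["V"]
short_pnl = entry["MS"] - exit_["MS"]
net_pnl = long_pnl + short_pnl

print(f"long V: {long_pnl:.4f}, short MS: {short_pnl:.4f}, net: {net_pnl:.4f}")
```

With the unrounded inputs, the legs come to roughly 0.397 and 1.327 points, which is why the total still rounds to 1.72.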
Further Steps
This was just one implementation of the strategy, and there is a great deal of flexibility in how it can be optimized further. For example, while trading all 10 stocks in the baskets may be harder to manage mentally, it can significantly reduce the overall volatility of the strategy. Trading the full baskets also allows for maximal scalability. Another possible variation keeps the pattern of taking just the 2 outlier stocks but, instead of limiting it to the financials sector, takes 2 from every sector. Our API provides unlimited access to all the data you need to explore and implement these ideas, and we make it easy for you to get started.