Finding bear markets -- The separating hyperplane approach
In the Quant’s Corner blog that was posted in August, we outlined the basic methodology for a ‘Bear Detector’ based on a combination of flow and volatility data. The model we presented used the data to create a two-dimensional representation, with the points corresponding to the past eight weeks (dark green triangles), to previous bull markets (light green circles) and previous bear markets (gray circles) all colored differently.
It was left to the reader to determine by visual inspection whether the latest data point was bear or bull, depending on the density of green or gray dots surrounding the dark green triangles (see chart below).
During the intervening period we have harnessed the power of machine learning to refine the model so that it generates actual scores that allow users to quantify whether a given week’s data point is consistent with a bear or bull market.
Starting at the beginning
As with the original model, we start by using four data points to characterize each week. These data points are:
i. DM flow – Daily fund flow into developed-market cross-border equities, as a percentage of assets, compounded over the prior four weeks.
ii. EM ex DM – The difference between daily fund flow into emerging-market and developed cross-border equities, as a percentage of assets, compounded over the prior four weeks.
iii. Equity ex Bond – The difference between daily fund flow into equities and bonds, as a percentage of assets, compounded over the prior four weeks.
iv. Volatility – Daily volatility of the Russell 1000 over the prior four weeks.
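As a rough illustration of how one week's four data points might be assembled, here is a minimal Python sketch. It is not EPFR's production code. The function names are hypothetical, and it assumes that the inputs are daily percentages covering the prior four weeks, that "difference" means subtracting the daily flows before compounding, and that volatility is the sample standard deviation of daily index returns.

```python
import math
import statistics

def compound_pct(daily_pct):
    """Compound a series of daily percentage changes into one percentage."""
    growth = math.prod(1 + p / 100 for p in daily_pct)
    return (growth - 1) * 100

def week_features(dm, em, equity, bond, index_returns):
    """Return the four data points for one week (all inputs are daily, in %).

    Assumption: differences are taken day by day and then compounded.
    Assumption: volatility is the sample standard deviation of daily returns.
    """
    return (
        compound_pct(dm),                                        # i.   DM flow
        compound_pct([e - d for e, d in zip(em, dm)]),           # ii.  EM ex DM
        compound_pct([e - b for e, b in zip(equity, bond)]),     # iii. Equity ex Bond
        statistics.stdev(index_returns),                         # iv.  Volatility
    )

# Made-up two-day inputs, purely to show the call shape.
example = week_features(
    dm=[1.0, 1.0], em=[2.0, 2.0],
    equity=[0.5, 0.5], bond=[0.5, 0.5],
    index_returns=[1.0, -1.0],
)
```

Each week then becomes a single point with four coordinates, which is the form the rest of the model works with.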
EPFR began collecting daily fund flows on April 24, 2007. Given that four weeks of data are needed, the very first week in our sample is the one ending May 23 of that same year. The sample ends on Oct. 23, 2019, covering 647 weeks.
In the original model, these 647 points were plotted along four axes. The two axes that captured the least variation were discarded, and the remaining two were presented in two dimensions. With machine learning, however, it is now possible to work with these data points in all four dimensions.
To get a practical sense of how this works, imagine the known bear markets as “crosses” and the remainder, the bull markets, as “noughts.” All are plotted in this four-dimensional space. Further, imagine inserting a partition between the bull and bear markets so that as many bull markets as possible fall on one side of it and as many bear markets as possible fall on the other. On a flat piece of paper, such a separating hyperplane looks like a line. In three dimensions, the hyperplane is a two-dimensional sheet. In the four-dimensional space, however, the separating hyperplane will itself be three dimensional.
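In code, a hyperplane in four dimensions can be described by four weights and an offset, and a week is classified by which side of it the week's point falls on. The sketch below is illustrative only: the weights and offset are made-up numbers, the function names are hypothetical, and the sign convention (negative for bear, positive for bull) is the one used when the scores are reported.

```python
import math

def signed_distance(weights, offset, point):
    """Signed distance from a point to the hyperplane weights . x + offset = 0."""
    norm = math.sqrt(sum(w * w for w in weights))
    return (sum(w * x for w, x in zip(weights, point)) + offset) / norm

def classify(weights, offset, point):
    """Negative side is read as a bear market, the other side as a bull market."""
    return "bear" if signed_distance(weights, offset, point) < 0 else "bull"

# Hypothetical hyperplane and two hypothetical weeks.
weights, offset = (3.0, 4.0, 0.0, 0.0), -5.0
calm_week = (3.0, 4.0, 0.0, 0.0)
rough_week = (0.0, 0.0, 0.0, 0.0)

calm_score = signed_distance(weights, offset, calm_week)
rough_score = signed_distance(weights, offset, rough_week)
```

The size of the score says how far a week sits from the partition, so it carries more information than the bull or bear label alone.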
The hyperplane misclassifying the fewest points, the optimal hyperplane, can always be shifted and tilted until it touches four of the data points without any point crossing to the other side. (In a two-dimensional space, the equivalent is a line that is moved until it touches two points.) The repositioned hyperplane misclassifies no more points than the original, so it is equally optimal. Thus, it suffices to check only the hyperplanes that pass through four of the points rather than the infinitude of potential hyperplanes.
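The same idea is easier to see in two dimensions, where the candidates are simply the lines through every pair of points. The following is a toy sketch on invented data, not the code behind the model, and its function names are hypothetical. It also makes one simplifying assumption: a point lying exactly on a candidate line is not counted as an error, on the grounds that the line could be nudged to put it on the correct side.

```python
from itertools import combinations

def line_through(p, q):
    """Coefficients (a, b, c) of the line a*x + b*y + c = 0 through p and q."""
    (x1, y1), (x2, y2) = p, q
    a, b = y2 - y1, x1 - x2
    return a, b, -(a * x1 + b * y1)

def count_errors(line, points, is_bear):
    """Fewest misclassified points for this line, trying both orientations."""
    a, b, c = line
    wrong = right = 0
    for (x, y), bear in zip(points, is_bear):
        side = a * x + b * y + c
        if side == 0:
            continue  # assumption: a point on the line can be nudged to its correct side
        if (side < 0) == bear:
            right += 1
        else:
            wrong += 1
    # Swapping which side is called "bear" swaps the two tallies.
    return min(wrong, right)

def best_line(points, is_bear):
    """Check every line through two distinct points and keep the best one."""
    best = None
    for i, j in combinations(range(len(points)), 2):
        if points[i] == points[j]:
            continue
        line = line_through(points[i], points[j])
        errs = count_errors(line, points, is_bear)
        if best is None or errs < best[0]:
            best = (errs, line)
    return best

# Invented sample: three bears, one bull stranded among them, three other bulls.
points = [(0, 0), (4, 0), (0, 4), (1, 1), (6, 6), (7, 6), (6, 7)]
is_bear = [True, True, True, False, False, False, False]
best_errors, best_coeffs = best_line(points, is_bear)
```

On this sample the search settles on a single error. The bull at (1, 1) sits inside the triangle formed by the three bears, so no line can cut it off from them.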
Further, if the optimal hyperplane is moved “up”, it will run aground on a bear market. If it hit a bull market instead, you could move it past that point and so improve the classification. This is impossible since this hyperplane is assumed to separate the bull and bear markets optimally.
What this means is that you only need to look at the combinations in which at least one of the four points is a bear market. This reduces the number of combinations to 2,757,904,814, or 38% of the 7,233,877,315 possible combinations of any four points from the 647. A search of this size is within reach of modern computers, so we can dispense with more sophisticated machine-learning techniques such as support vector machines.
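These counts can be reproduced with a few lines of Python. The sketch takes the number of bear-market weeks to be 73, the figure given with the results for this sample, and subtracts the all-bull combinations from the total.

```python
from math import comb

TOTAL_WEEKS = 647
BEAR_WEEKS = 73  # number of known bear-market weeks in the sample

all_combos = comb(TOTAL_WEEKS, 4)               # any four weeks
bull_only = comb(TOTAL_WEEKS - BEAR_WEEKS, 4)   # four weeks, none of them a bear
with_bear = all_combos - bull_only              # at least one bear among the four
share = with_bear / all_combos

print(f"{all_combos:,} in total, {with_bear:,} with a bear ({share:.0%})")
```

This prints 7,233,877,315 in total and 2,757,904,814 with a bear, which is 38% of the total.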
The methodology used was as follows: utilizing basic machine learning, every one of these combinations was tried, and the hyperplane that misclassified the fewest weeks was kept. The best hyperplane misclassifies only 32 of the 647 weeks, 5% of the total. We report below the distance of each of the weeks from the hyperplane. Values are negative if the week was classified as a bear market, and positive otherwise.
As you can see from the picture, the predictor is quite accurate. Of the 73 known bear market weeks, 12 were missed. There were also 20 weeks where a false alarm was sounded. For the remaining 615 weeks (or 95% of the sample), the classifier was accurate.
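For readers who want to check the arithmetic, the short sketch below rebuilds the tallies from the figures quoted above. The two extra ratios at the end (the share of bear weeks caught and the share of alarms that were genuine) are simple derivations from those figures and are not separately reported results.

```python
total_weeks = 647
bear_weeks = 73
missed_bears = 12    # bear weeks classified as bull
false_alarms = 20    # bull weeks classified as bear

caught_bears = bear_weeks - missed_bears
errors = missed_bears + false_alarms
correct = total_weeks - errors

accuracy = correct / total_weeks
share_of_bears_caught = caught_bears / bear_weeks
share_of_alarms_genuine = caught_bears / (caught_bears + false_alarms)

print(f"errors: {errors}, correct: {correct}, accuracy: {accuracy:.0%}")
print(f"bear weeks caught: {share_of_bears_caught:.0%}")
print(f"alarms that were genuine: {share_of_alarms_genuine:.0%}")
```

The first line confirms the 32 errors and 615 correctly classified weeks. The other two ratios come out at roughly 84% and 75%.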
Furthermore, and perhaps most importantly, the four-dimensional model agrees with its two-dimensional predecessor when it comes to the danger of a bear market: there is no immediate need to worry.