When I first started exploring the world of technical analysis, the first indicator I came across was the Moving Average Convergence/Divergence (MACD) oscillator, a popular indicator for measuring stock momentum and following the trend of a stock’s price. Although many websites (e.g. Investopedia, BusinessInsider, MoneyShow) touted the MACD as an “essential” indicator for technical trading, I wasn’t quite sold on the fact that following “a few simple trading rules” could allow me to consistently make money from trading. And, like most people, I did not know how to back-test trading strategies at that time.
However, now that I know a little bit of programming, I have decided to write a series of posts on the effectiveness of technical trading entitled How (Not) to Invest. This first post in the series is on the MACD.
The Evolution of Technical Indicators
As a start to the series on How (Not) to Invest, it is appropriate to articulate my theory on the evolution of technical indicators, which borrows from cryptography. In cryptography, there are codemakers who devise encryption algorithms, and codebreakers who aim to break them. Once encryption algorithms are broken, new ones are created. That tension leads to stronger and more complex encryption. Technical indicators appear to work in the same way.
Today, there are many trading indicators at our disposal. And I believe these came about from the cycle of people devising new indicators and others exploiting them, leading to the creation of new ones. The main point here is that once indicators are released for mass consumption, they cease to be effective as a critical mass of traders act on these signals. As such, it is my opinion that when employed in too simple a fashion, too slowly, or both, existing technical indicators cannot deliver positive excess returns.
However, machine learning (ML) may provide hope for technical trading. First, new and more complex indicators are developed when the old ones don’t work - this is equivalent to feature generation in ML. Second, employing new and existing indicators in complex ways could make systems of indicators work - this can be achieved using ML algorithms.
How (Not) to Invest will test these hypotheses.
Having addressed my boring general theory on why indicators don’t work, we can now dive into MACD, the subject of this post.
The MACD indicator comprises three lines: a fast exponential moving average (EMA), a slow EMA, and an EMA of the difference between the fast and the slow EMAs. The difference itself is known as the MACD line, and its EMA is known as the signal line. The typical setting is 12/26/9, which means that we use a (fast) 12-day EMA, a (slow) 26-day EMA, and a 9-day EMA of the difference between the prior two EMAs. This difference between the fast and slow EMAs can be thought of as the pressure on price in the near term (fast EMA) vs. the pressure on price in the slightly longer term (slow EMA). The quicker prices are rising, the quicker the increase in the fast EMA.
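To make the 12/26/9 setting concrete, here is a minimal sketch of how the MACD line and signal line could be computed with pandas. The function name `macd_lines` and the price series are made up for illustration, and I am assuming pandas' default EMA weighting (`adjust=True`), so treat this as one reasonable way to do it rather than a canonical implementation.

```python
import pandas as pd


def macd_lines(close: pd.Series, fast: int = 12, slow: int = 26, signal: int = 9) -> pd.DataFrame:
    """Return the MACD line and signal line for a series of closing prices."""
    # Fast and slow EMAs of the closing price
    fast_ema = close.ewm(span=fast, min_periods=fast).mean()
    slow_ema = close.ewm(span=slow, min_periods=slow).mean()

    # MACD line: difference between the fast and slow EMAs
    macd = fast_ema - slow_ema

    # Signal line: EMA of the MACD line
    signal_line = macd.ewm(span=signal, min_periods=signal).mean()

    return pd.DataFrame({"macd": macd, "signal": signal_line})


# Made-up prices that rise steadily, purely for illustration
prices = pd.Series([100.0 + 0.5 * i for i in range(60)])
lines = macd_lines(prices)
print(lines.tail())
```

Because the made-up prices rise steadily, the fast EMA sits above the slow EMA and the MACD line ends up positive, which is the "upward pressure" reading described above.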
The MACD has two main signals for buying and selling: the centerline crossover trigger and the signal line crossover trigger. First, the centerline crossover trigger is when the MACD line (the difference between the fast EMA and the slow EMA) rises above/below zero. Second, the signal line crossover trigger is when the MACD line crosses above/below the signal line. We can interpret both signals as increasing upward/downward momentum in prices.
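The two buy-side triggers can be sketched in a few lines of pandas. The helper `crossed_above` and the numbers below are hypothetical, and this version compares yesterday's values with today's, which is a common way to define a crossover but not necessarily the exact logic used in any given back-test.

```python
import pandas as pd


def crossed_above(line: pd.Series, level) -> pd.Series:
    """True on days where `line` moves from below `level` to above it.

    `level` may be a constant (e.g. 0 for the centerline) or another Series
    (e.g. the signal line).
    """
    prev_line = line.shift(1)
    prev_level = level.shift(1) if isinstance(level, pd.Series) else level
    return (prev_line < prev_level) & (line > level)


# Made-up MACD and signal line values, purely for illustration
macd = pd.Series([-0.4, -0.1, 0.2, 0.5, 0.3, 0.1])
signal = pd.Series([-0.2, -0.2, -0.1, 0.1, 0.2, 0.2])

cc_buy = crossed_above(macd, 0)        # centerline crossover
slc_buy = crossed_above(macd, signal)  # signal line crossover

print(pd.DataFrame({"macd": macd, "signal": signal, "cc_buy": cc_buy, "slc_buy": slc_buy}))
```

In this toy series the signal line crossover fires one day before the centerline crossover, which illustrates why the two triggers can produce different trades.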
In our trading simulations, we will hard-code the MACD triggers, apply different EMA settings, and test them out on all stocks in the S&P 500. We will also calculate returns and accuracy metrics (precision and recall) to evaluate each MACD configuration.
To evaluate the effectiveness of MACD on the S&P 500 stocks, I compute (1) returns, (2) precision, and (3/4) the excess of returns and precision over the baselines.
In the simulations, I used 5-day, 10-day, and 20-day forward returns. Having multiple trading horizons increases the probability of finding winning strategies. Therefore, if we fail to find any profitable MACD strategies, we would have higher confidence that MACD strategies do not work.
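As a rough sketch of what an N-day forward return looks like in pandas: shift the closing price back by N days and divide by today's close. The helper name `forward_return` and the prices are made up for illustration.

```python
import pandas as pd


def forward_return(close: pd.Series, horizon: int) -> pd.Series:
    """Return over the next `horizon` trading days, as a fraction."""
    return close.shift(-horizon) / close - 1


# Made-up closing prices, purely for illustration
close = pd.Series([100.0, 101.0, 102.0, 103.0, 104.0, 110.0, 99.0, 103.0])

fwd5 = forward_return(close, 5)

# The last `horizon` rows have no future price, so they are dropped
# before labelling each day as profitable (1) or not (0)
profitable = (fwd5.dropna() > 0).astype(int)

print(fwd5)
print(profitable.tolist())
```

The same helper can be called with 10 or 20 to get the other two horizons.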
I used the buy-and-hold returns of each stock as the benchmark. Excess returns were therefore defined as the returns from the MACD trading strategy minus the buy-and-hold returns.
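The arithmetic is simple enough to show with made-up numbers. In this sketch I assume the strategy's return is the compounded product of each trade's gross return (exit price divided by entry price); the function names are hypothetical.

```python
import math


def strategy_return_pct(trade_rois):
    """Compound the gross return of each trade and express it in percent."""
    return math.prod(trade_rois) * 100 - 100


def buy_and_hold_return_pct(first_close, last_close):
    """Percentage return from buying at the first close and holding to the last."""
    return last_close / first_close * 100 - 100


# Made-up example: three trades returning +2%, -1% and +3%,
# on a stock that rose from 100 to 150 over the whole period
macd_ret = strategy_return_pct([1.02, 0.99, 1.03])
bah_ret = buy_and_hold_return_pct(100.0, 150.0)
excess_ret = macd_ret - bah_ret

print(round(macd_ret, 4), round(bah_ret, 4), round(excess_ret, 4))
```

Here the strategy makes about 4% while buy-and-hold makes 50%, so the excess return is roughly -46 percentage points even though the strategy itself was profitable.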
Precision is the proportion of predicted positives that were actually positive. In our case, it is the percentage of predicted profitable trades that were in fact profitable. This allows us to assess how often the MACD indicator gives us signals that are correct.
The precision advantage defines how precise the MACD indicator is relative to a naive guess. Suppose a stock’s 10-day returns are positive 60% of the time and negative 40% of the time. A naive guess would be to guess the more common outcome: positive. If only 30% of the MACD indicator’s buy signals turn out to be profitable, we would have been better off going with the naive guess.
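Here is a small worked example with made-up labels. I am assuming the precision advantage is simply precision minus the base rate of positive outcomes (the hit rate of always guessing "positive"); the helper name `precision` is my own.

```python
def precision(predicted, actual):
    """Share of predicted positives (buy signals) that were actually positive."""
    hits = sum(1 for p, a in zip(predicted, actual) if p == 1 and a == 1)
    signals = sum(predicted)
    return hits / signals if signals else float("nan")


# Made-up data: 1 = forward return was positive, 0 = it was not
actual = [1, 1, 0, 1, 0, 1, 0, 0, 1, 1]
# Made-up buy signals: 1 = the indicator said "buy" on that day
predicted = [1, 0, 0, 1, 1, 0, 0, 1, 0, 0]

base_rate = sum(actual) / len(actual)      # naive guess: always "positive"
prec = precision(predicted, actual)        # how often buy signals were right
precision_advantage = prec - base_rate

print(base_rate, prec, round(precision_advantage, 2))
```

Four buy signals were issued and two were right, so precision is 50%. Since returns were positive 60% of the time anyway, the precision advantage is negative.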
In our trading simulation, we used the following rules:
- Only long trades allowed
- Trades are executed when a buy signal is triggered
a. Centerline crossover (CC): The MACD line crosses above zero
b. Signal line crossover (SLC): The MACD line crosses above the signal line + the MACD line is above zero
We used the following parameters:
- Returns: 5-day, 10-day, and 20-day (separately)
- Fast EMA: 2, 3, 4, and 5 times the forward returns period
- Slow EMA: 2, 3, 4, and 5 times the fast EMA setting
- Signal Line: 4, 8, 12, 16, and 20 days
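The parameter list above can be expanded into concrete settings with `itertools.product`. The function name `macd_grid` is hypothetical; the multiples and signal line spans are the ones listed above.

```python
from itertools import product


def macd_grid(horizon, fast_mults=(2, 3, 4, 5), slow_mults=(2, 3, 4, 5),
              signal_spans=(4, 8, 12, 16, 20)):
    """List every (fast EMA, slow EMA, signal line) setting for a returns horizon."""
    return [
        (m * horizon, m * horizon * n, s)  # slow EMA is a multiple of the fast EMA
        for m, n, s in product(fast_mults, slow_mults, signal_spans)
    ]


grid = macd_grid(5)
print(len(grid))   # number of configurations per stock
print(grid[0])     # smallest setting for 5-day returns
print(grid[-1])    # largest setting for 5-day returns
```

For 5-day forward returns this gives 4 x 4 x 5 = 80 configurations, ranging from 10/20/4 to 25/125/20.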
# Import required modules
import fix_yahoo_finance as yf
import matplotlib as mpl
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from pandas_datareader import data as pdr
import scipy.stats as ss
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LinearRegression
from sklearn.metrics import classification_report, confusion_matrix, precision_score, recall_score
from sklearn.model_selection import TimeSeriesSplit
from sklearn.naive_bayes import BernoulliNB
import warnings
from yahoo_finance import Share

# Settings
warnings.filterwarnings('ignore')

# Override pdr
yf.pdr_override()

# Import stocklist
sp500 = pd.read_csv('sp500.csv')

# Configure dates
start_date = '1990-01-01'
end_date = '2018-06-01'

# Load pre-processed data
res_df = pd.read_csv('macd_ret5_results.csv')
res_df10 = pd.read_csv('macd_ret10_results.csv')
res_df20 = pd.read_csv('macd_ret20_results.csv')
Function for Trading Simulations
# Simulate MACD strategy
# m1 is a multiple of forward returns period
# m2 is a multiple of m1
def sim_macd_multi(stock, m1 = [2, 3, 4, 5], m2 = [2, 3, 4, 5], sl = [4, 8, 12, 16, 20], returns = 5):

    # Configure dates
    start_date = '1990-01-01'
    end_date = '2018-06-01'

    # Error handling
    try:

        # Pull data
        orig_df = pdr.get_data_yahoo(stock, start_date, end_date, progress=False)

        # Initialise results list
        output = []

        for m in m1:
            for n in m2:
                for s in sl:

                    # Copy data
                    df = orig_df.copy()

                    # Calculate look-ahead profits
                    df['ret'] = ((df.Close.shift(-returns) - df.Close) / df.Close > 0).astype(int)
                    df['roi'] = df.Close.shift(-returns) / df.Close
                    df['ret_normal'] = df.Close.shift(-returns) / df.Close - 1

                    # BAH returns (last close relative to first close, in percent)
                    ret_bah = df.Close.iloc[-1] / df.Close.iloc[0] * 100 - 100

                    # Compute base rate
                    baserate = np.mean(df.ret)

                    # Configure settings
                    temp_m1 = m * returns
                    temp_m2 = m * returns * n

                    # Name
                    temp_varname = 'macd_' + str(temp_m1) + '_' + str(temp_m2)

                    # Calculate MACD line
                    df[temp_varname] = df.Close.ewm(span = temp_m1, min_periods = temp_m1).mean() - \
                        df.Close.ewm(span = temp_m2, min_periods = temp_m2).mean()

                    # Calculate signal line
                    df[temp_varname + '_' + 'sig'] = df[temp_varname].ewm(span = s, min_periods = s).mean()

                    # Calculate shifts (previous day's MACD line)
                    df['temp'] = df[temp_varname].shift(1).copy()

                    # Remove empty rows
                    df.dropna(axis = 0, inplace = True)

                    # Compute centerline crossover signals (CC)
                    df['buy_cc'] = ((df['temp'] < 0) & (df[temp_varname] > 0)).astype(int)

                    # Compute signal line crossover signals (SLC)
                    df['buy_slc'] = ((df['temp'] < df[temp_varname + '_' + 'sig']) & \
                                     (df[temp_varname] > df[temp_varname + '_' + 'sig']) & \
                                     (df[temp_varname] > 0)).astype(int)

                    # Compute time frame (number of rows)
                    n_days = df.shape[0]

                    # Compute scores for CC
                    cc_precision = precision_score(y_pred = df.buy_cc, y_true = df.ret)
                    cc_recall = recall_score(y_pred = df.buy_cc, y_true = df.ret)
                    cc_signals = np.sum(df.buy_cc)
                    cc_returns = np.prod(df.roi[df.buy_cc == 1]) * 100 - 100
                    cc_exc_ret = cc_returns - ret_bah
                    cc_std = np.std(df.ret_normal)

                    # Compute scores for SLC
                    slc_precision = precision_score(y_pred = df.buy_slc, y_true = df.ret)
                    slc_recall = recall_score(y_pred = df.buy_slc, y_true = df.ret)
                    slc_signals = np.sum(df.buy_slc)
                    slc_returns = np.prod(df.roi[df.buy_slc == 1]) * 100 - 100
                    slc_exc_ret = slc_returns - ret_bah
                    slc_std = np.std(df.ret_normal)

                    # Configure output
                    append_data = (stock, temp_m1, temp_m2, s, returns,
                                   cc_precision, cc_recall, cc_signals, cc_returns, cc_exc_ret, cc_std,
                                   slc_precision, slc_recall, slc_signals, slc_returns, slc_exc_ret, slc_std,
                                   baserate, n_days)

                    # Append data
                    output.append(append_data)

    except ValueError:
        output = [(stock, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan,
                   np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, np.nan)]

    return output
5-day Forward Returns
First, we run trading simulations with returns calculated using the 5-day forward returns. A total of 39,604 MACD simulations were run, comprising 497 stocks with approximately 80 MACD configurations each.
# # Initialise results
# all_res = []

# # Collect data on all S&P 500 companies
# for i in np.arange(0, len(sp500.Symbol)):

#     # Get symbol
#     stk = sp500.Symbol.iloc[i]

#     # Update
#     print('Processing [' + str(i) + '] ' + stk + '...', end = '', flush = True)

#     # Simulate trades and append results
#     temp_res = sim_macd_multi(stk)
#     all_res += temp_res

#     # Convert to df
#     temp_res_df = pd.DataFrame(temp_res)

#     # Extract returns
#     cc_exc_ret = str(round(temp_res_df.loc[:, 9].mean(), 2)) + '%'
#     slc_exc_ret = str(round(temp_res_df.loc[:, 15].mean(), 2)) + '%'

#     # Update
#     print('CC Returns: ' + cc_exc_ret + ' | ' + 'SLC Returns: ' + slc_exc_ret)

# # Delete erroneous entries
# # all_res = [x for x in all_res if not np.isnan(x)]

# # Consolidate results
# res_df = pd.DataFrame(all_res, columns = ['stock', 'm1', 'm2', 's', 'returns', 'cc_precision', 'cc_recall', 'cc_signals', 'cc_returns',
#                                           'cc_exc_returns', 'cc_std', 'slc_precision', 'slc_recall', 'slc_signals',
#                                           'slc_returns', 'slc_exc_returns', 'slc_std', 'base_rate', 'n_days'])
# res_df.dropna(axis = 0, inplace = True)

# # Create ID variable
# res_df['id'] = res_df.m1.astype(str) + '_' + res_df.m2.astype(str) + '_' + res_df.s.astype(str)

# # Save
# res_df.to_csv('macd_ret5_results.csv', index = False)

# res_df = pd.read_csv('macd_ret5_results.csv')
Overall, the MACD strategy does not appear to be profitable. The first graph below tells us that most strategies (combination of stock and MACD configuration) had zero overall returns. The second graph shows us that most strategies could not beat the buy-and-hold benchmark. Only 4.57% of the 39,604 strategies delivered positive excess returns using the signal line crossover trigger, and 3.35% using the centerline crossover trigger.
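For reference, the share of strategies with positive excess returns can be read off a results table in one line. The sketch below uses a tiny made-up table (not the actual simulation output) with the same column names as the saved results, `cc_exc_returns` and `slc_exc_returns`.

```python
import pandas as pd

# Made-up excess returns (in percent) for five strategies, purely for illustration
results = pd.DataFrame({
    "cc_exc_returns": [-120.0, -35.5, 4.2, -80.1, -10.0],
    "slc_exc_returns": [-90.0, 12.3, 8.8, -60.4, -5.0],
})

# Percentage of strategies that beat buy-and-hold under each trigger
share_positive = (results[["cc_exc_returns", "slc_exc_returns"]] > 0).mean() * 100

print(share_positive.round(2))
```

In this toy table one of five centerline crossover strategies and two of five signal line crossover strategies beat the benchmark, giving 20% and 40% respectively.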
It is also worth pointing out some interesting results from the simulations. First, note how the distribution of returns for the signal line crossover strategy towers over that of the centerline crossover strategy in both graphs. This is because it generates substantially more buy signals, resulting in more trades. Second, note how there are a bunch of strategies that lie to the right of zero in the second graph. This implies that it is possible to make positive excess returns if you choose the right stock and the right configuration. However, that number is extremely small: just 4.57% of all possible configurations of stock choice and MACD settings.
Overall, the MACD does not appear to give us any advantage in identifying profitable trades. The first graph shows us that the precision of an MACD-based trading strategy is centered just above 50%: of all profitable trades predicted, slightly more than half of them were actually profitable. The second graph shows that on average, the MACD strategy could not beat naive guesses (the more popular outcome of profitable vs. unprofitable trade). Thus, the data suggests that the MACD strategy did poorly in identifying profitable trades. A coin flip would have performed just as well.
10-day Forward Returns
The results for the simulations using 10-day forward returns were not much better. Most strategies could not generate positive excess returns: 6.10% of the strategies using the signal line crossover trigger and 3.23% of the strategies using the centerline crossover trigger delivered positive excess returns.
Likewise, the results for MACD precision were no better.
20-day Forward Returns
For the simulations using 20-day forward returns, we observe an improvement in the number of strategies that generated positive excess returns. This time, 10.16% of the signal line crossover strategies beat the buy-and-hold benchmark. However, only 2.92% of centerline crossover strategies did so, possibly because of the significantly smaller number of buy signals generated from the longer returns period (recall that the fast and slow EMA for MACD were multiples of the forward returns periods).
Meanwhile, precision of the MACD strategies remained poor.
The graph below summarises the results from our simulations earlier. Essentially, trading the MACD with 5-day, 10-day, and 20-day forward-looking returns failed to beat the benchmark: a simple buy-and-hold strategy. We ensured high confidence in the simulations by testing 80 configurations for each of 497 stocks in the S&P 500. At this point, we cannot yet conclude that trading strategies built on existing indicators will fail to deliver positive excess returns. What we can say, though, is that trading strategies should not be built on the MACD alone.
Click here for the full Jupyter notebook.
Credits for images: FinanceAndMarkets.com