Rebalancing with the Conservative Formula
The Conservative Formula approach is presented in this paper: The Conservative Formula in Python: Quantitative Investing made Easy
It is one of many possible rebalancing approaches, but one that is easy to grasp. A summary of the approach:
- x stocks are selected from a universe of Y (100 of 1000)
- The selection criteria are
  - Low volatility
  - High Net Payout Yield
  - High Momentum
- Rebalancing every month
With this in mind, let's present a possible implementation in backtrader.
The data
Even if one has a winning strategy, nothing will actually be won if no data is available for the strategy. This means one has to consider what the data looks like and how to load it.
A set of CSV ("comma-separated-values") files is assumed to be available, with the following features:
- ohlcv monthly data
- With an extra field after the v containing the Net Payout Yield (npy), to have an ohlcvn data set.
The format of the CSV data will therefore look like this
date, open, high, low, close, volume, npy
2001-12-31, 1.0, 1.0, 1.0, 1.0, 0.5, 3.0
2002-01-31, 2.0, 2.5, 1.1, 1.2, 3.0, 5.0
...
I.e.: one row per month. The data loader engine can now be prepared, for which a simple extension of the generic built-in CSV loader delivered with backtrader will be created.
class NetPayOutData(bt.feeds.GenericCSVData):
    lines = ('npy',)  # add a line containing the net payout yield
    params = dict(
        npy=6,  # npy field is in column 6 (0-based index)
        dtformat='%Y-%m-%d',  # fix the date format to yyyy-mm-dd
        timeframe=bt.TimeFrame.Months,  # fix the timeframe
        openinterest=-1,  # -1 indicates there is no openinterest field
    )
And that is it. Notice how easy it has been to add a point of fundamental data to the ohlcv data stream:
- By using the expression lines=('npy',). The other usual fields (open, high, ...) are already part of GenericCSVData
- By indicating the loading position with params = dict(npy=6). The other fields have a predefined position.
The timeframe has also been updated in the parameters to reflect the monthly nature of the data.
Note
See Docs - Data Feeds Reference - GenericCSVData for the actual fields and loading positions (which can all be customized)
The data loader will have to be properly instantiated with a file name, but that's something for later, when a standard boilerplate is presented below to have a complete script.
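As a quick sanity check of the file layout, the following standard-library sketch parses rows in the ohlcvn format and reads the npy value from column index 6. It does not use backtrader; the function name parse_ohlcvn and the in-memory sample are assumptions made purely for illustration.

```python
import csv
import datetime
import io

# Sample rows in the ohlcvn layout described above (illustrative values)
SAMPLE = """date, open, high, low, close, volume, npy
2001-12-31, 1.0, 1.0, 1.0, 1.0, 0.5, 3.0
2002-01-31, 2.0, 2.5, 1.1, 1.2, 3.0, 5.0
"""

NPY_COLUMN = 6  # 0-based index, the same position given to the data feed


def parse_ohlcvn(text):
    """Return a list of (date, close, npy) tuples from ohlcvn CSV text."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    next(reader)  # skip the header row
    rows = []
    for fields in reader:
        if not fields:
            continue  # ignore blank lines
        date = datetime.datetime.strptime(fields[0], '%Y-%m-%d').date()
        rows.append((date, float(fields[4]), float(fields[NPY_COLUMN])))
    return rows


if __name__ == '__main__':
    for row in parse_ohlcvn(SAMPLE):
        print(row)
```

Running a check like this against a real file before feeding it to the loader can catch a misplaced column early.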
The Strategy
Let's put the logic into a standard backtrader strategy. To make it as generic and customizable as possible, the same params approach will be used as was used before with the data.
Before delving into the strategy, let's consider one of the points from the quick summary:
- x stocks are selected from a universe of Y
The strategy itself is not in charge of adding stocks to the universe, but it is in charge of the selection. If x and Y are fixed in the code, one could be in a situation in which only 50 stocks have been added and the code still tries to select 100. To cope with such situations, the following will be done:
- Have a selcperc parameter with a value of 0.10 (i.e.: 10%), to indicate the fraction of stocks to be selected from the universe.
  This means that if 1000 are present, only 100 will be selected, and if the universe consists of 50 stocks, only 5 will be selected.
As for the formula ranking the stocks, it looks like this:
- (momentum * net payout) / volatility
Which means that those with higher momentum, higher payout and lower volatility will have a higher score.
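To make the formula concrete, here is a small worked example in plain Python. The stock names and input values are hypothetical and chosen only to show how the three factors interact; the helper name score is not part of backtrader.

```python
def score(momentum, net_payout, volatility):
    """Rank score: (momentum * net payout) / volatility."""
    return (momentum * net_payout) / volatility


# Hypothetical stocks: (momentum, net payout yield, volatility)
universe = {
    'AAA': (0.20, 3.0, 0.05),
    'BBB': (0.20, 3.0, 0.10),  # same as AAA, but twice as volatile
    'CCC': (0.10, 5.0, 0.05),
}

scores = {name: score(*values) for name, values in universe.items()}

# Highest score first
ranking = sorted(scores, key=scores.get, reverse=True)

if __name__ == '__main__':
    for name in ranking:
        print(name, round(scores[name], 2))
```

Doubling the volatility halves the score, which is why BBB ends up behind the other two. Note that the formula behaves differently when momentum is negative: a lower volatility then makes the score more negative, not less.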
For momentum, the RateOfChange indicator (aka ROC) will be used, which measures the ratio of change in prices over a period.
The net payout is already part of the data feed.
To calculate the volatility, the StandardDeviation of the n-periods return of the stock (n-periods, because things will be kept as parameters) will be used.
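The two calculations can be sketched in plain Python to show what the indicators are expected to deliver. This is only an approximation under stated assumptions: the function names are hypothetical, the population form of the standard deviation is assumed, and the exact behavior of the backtrader indicators should be checked against their documentation.

```python
import statistics


def pct_change(prices, period=1):
    """Returns over `period` bars: p[t] / p[t - period] - 1."""
    return [prices[i] / prices[i - period] - 1.0
            for i in range(period, len(prices))]


def rate_of_change(prices, period):
    """Latest rate of change: (p[t] - p[t - period]) / p[t - period]."""
    return (prices[-1] - prices[-1 - period]) / prices[-1 - period]


def volatility(prices, rperiod=1, vperiod=36):
    """Standard deviation of the last `vperiod` returns (population form)."""
    returns = pct_change(prices, rperiod)
    return statistics.pstdev(returns[-vperiod:])


# Illustrative monthly closes, not real data
closes = [100.0, 110.0, 99.0, 108.9, 120.0]

if __name__ == '__main__':
    print(rate_of_change(closes, period=4))
    print(volatility(closes, rperiod=1, vperiod=4))
```

The parameter names rperiod and vperiod mirror the strategy parameters introduced below, so the lookbacks can be varied in the same way.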
With this information, the strategy can already be initialized with the right parameters, together with the setup of the indicators and calculations which will later be used in each monthly iteration.
First the declaration and the parameters
class St(bt.Strategy):
    params = dict(
        selcperc=0.10,  # percentage of stocks to select from the universe
        rperiod=1,  # period for the returns calculation, default 1 period
        vperiod=36,  # lookback period for volatility - default 36 periods
        mperiod=12,  # lookback period for momentum - default 12 periods
        reserve=0.05  # 5% reserve capital
    )
Notice that something not mentioned above has been added: the parameter reserve=0.05 (i.e. 5%), which is used to calculate the percentage allocation per stock, keeping a reserve capital in the bank. Although for a simulation one could conceivably want to use 100% of the capital, doing so can hit the usual problems, such as price gaps and floating point precision, and end up missing some of the market entries.
Before anything else, a small logging method is created, which makes it possible to log how the portfolio is rebalanced.
    def log(self, arg):
        print('{} {}'.format(self.datetime.date(), arg))
At the beginning of the __init__ method, the number of stocks to rank is calculated and the reserve capital parameter is applied to determine the per stock percentage of the bank.
    def __init__(self):
        # calculate 1st the amount of stocks that will be selected
        self.selnum = int(len(self.datas) * self.p.selcperc)
        # allocation perc per stock
        # reserve kept to make sure orders are not rejected due to
        # margin. Prices are calculated when known (close), but orders can only
        # be executed next day (opening price). Price can gap upwards
        self.perctarget = (1.0 - self.p.reserve) / self.selnum
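The selection count and the per stock allocation can be checked in isolation with a few lines of plain Python. The helper name allocation is hypothetical, and the explicit guard for a universe that is too small is an assumption of this sketch, not something the strategy itself does: with fewer than 10 stocks and the default 10%, the number of selected stocks would be 0 and the division would fail.

```python
def allocation(universe_size, selcperc=0.10, reserve=0.05):
    """Return (number of stocks to select, target fraction per stock)."""
    selnum = int(universe_size * selcperc)
    if selnum == 0:
        # Assumption for this sketch: fail early instead of dividing by zero
        raise ValueError('universe too small for the selection percentage')
    return selnum, (1.0 - reserve) / selnum


if __name__ == '__main__':
    for size in (1000, 50):
        selnum, target = allocation(size)
        print(size, selnum, round(target, 4))
```

With 1000 stocks each selected stock gets a little under 1% of the portfolio value, and with 50 stocks each of the 5 selected gets 19%.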
Finally, the initialization is completed with the calculation of the per stock indicators for volatility and momentum, which are then applied in the per stock ranking formula.
        # returns, volatilities and momentums
        rs = [bt.ind.PctChange(d, period=self.p.rperiod) for d in self.datas]
        vs = [bt.ind.StdDev(ret, period=self.p.vperiod) for ret in rs]
        ms = [bt.ind.ROC(d, period=self.p.mperiod) for d in self.datas]
        # simple rank formula: (momentum * net payout) / volatility
        # the highest ranked: low vol, large momentum, large payout
        self.ranks = {d: d.npy * m / v for d, v, m in zip(self.datas, vs, ms)}
It's now time to iterate each month. The ranking is available in the self.ranks dictionary. The key/value pairs have to be sorted on each iteration, to determine which items have to go and which ones have to be part of the portfolio (remain or be added).
    def next(self):
        # sort data and current rank
        ranks = sorted(
            self.ranks.items(),  # get the (d, rank), pair
            key=lambda x: x[1][0],  # use rank (elem 1) and current time "0"
            reverse=True,  # highest ranked 1st ... please
        )
The iterable is sorted in reverse order, because the ranking formula delivers higher scores for the highest ranked stocks.
Rebalancing is now due.
Rebalancing 1: Get Top Ranked and the stocks with open positions
        # put top ranked in dict with data as key to test for presence
        rtop = dict(ranks[:self.selnum])
        # For logging purposes of stocks leaving the portfolio
        rbot = dict(ranks[self.selnum:])
A bit of Python trickery is happening here, because a dict is being used. The reason is that if the top ranked stocks were put in a list, the operator == would be used internally by Python to check for presence with the in operator. And, although improbable, it would be possible for two stocks to have the same value on the same day. When using a dict, a hash value is used when checking for the presence of an item as part of the keys.
Note: For logging purposes rbot (ranked bottom) is also created with the stocks not present in rtop.
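The membership argument above can be illustrated without backtrader. The Feed class below is a hypothetical stand-in for an object whose == operator is overloaded and does not return a plain bool; whether the real data feeds behave exactly like this is an assumption of the sketch. It shows why a list-based presence check can give a false positive while a dict-based one does not.

```python
class Feed:
    """Hypothetical stand-in for an object whose == is overloaded."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Overloaded comparison that does not return a plain bool,
        # which makes every comparison look "true" to the in operator
        return ('comparison', self.name, other.name)

    # Defining __eq__ disables the default hash, so restore identity hashing
    __hash__ = object.__hash__


a, b = Feed('a'), Feed('b')

in_list = b in [a]       # falls back to ==, which is always truthy here
in_dict = b in {a: 1.0}  # hash lookup first, b hashes differently from a

if __name__ == '__main__':
    print(in_list, in_dict)
```

The list check reports b as present even though only a was stored, whereas the dict check compares hashes first and never reaches the overloaded comparison.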
To later discriminate between stocks that have to leave the portfolio, those which simply have to be rebalanced and the newly top ranked, a current list of stocks in the portfolio is prepared.
        # prepare quick lookup list of stocks currently holding a position
        posdata = [d for d, pos in self.getpositions().items() if pos]
Rebalancing 2: Sell those no longer top ranked
Just like in the real world, in the backtrader ecosystem selling before buying is a must to ensure enough cash is available.
        # remove those no longer top ranked
        # do this first to issue sell orders and free cash
        for d in (d for d in posdata if d not in rtop):
            self.log('Leave {} - Rank {:.2f}'.format(d._name, rbot[d][0]))
            self.order_target_percent(d, target=0.0)
Stocks currently with an open position and no longer top ranked are sold (i.e. target=0.0).
Note
A simple self.close(data) would have sufficed here, rather than explicitly stating the target percentage.
Rebalancing 3: Issue a target order for all top ranked stocks
The total portfolio value changes over time and those stocks already in the portfolio may have to slightly increase/reduce the current position to match the expected percentage. order_target_percent is an ideal method to enter the market, because it automatically calculates whether a buy or a sell order is needed.
        # rebalance those already top ranked and still there
        for d in (d for d in posdata if d in rtop):
            self.log('Rebal {} - Rank {:.2f}'.format(d._name, rtop[d][0]))
            self.order_target_percent(d, target=self.perctarget)
            del rtop[d]  # remove it, to simplify next iteration
Rebalancing the stocks already holding a position is done before adding the new ones to the portfolio, as the new ones will only issue buy orders and consume cash. Having removed the existing stocks from rtop with del rtop[d] after rebalancing them, the remaining stocks in rtop are those which will be newly added to the portfolio.
        # issue a target order for the newly top ranked stocks
        # do this last, as this will generate buy orders consuming cash
        for d in rtop:
            self.log('Enter {} - Rank {:.2f}'.format(d._name, rtop[d][0]))
            self.order_target_percent(d, target=self.perctarget)
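The order of the three rebalancing steps can be summarized in a plain Python sketch that works on stock names instead of data feeds. The function name plan_rebalance and the sample inputs are hypothetical; the sketch only classifies the stocks and does not issue any orders.

```python
def plan_rebalance(ranked, held, selnum):
    """Split stocks into leave / rebalance / enter lists, in order.

    ranked: names sorted by score, best first
    held:   names currently holding a position
    """
    top = ranked[:selnum]
    top_set = set(top)
    held_set = set(held)
    leave = [name for name in held if name not in top_set]   # sell first
    rebal = [name for name in held if name in top_set]       # then adjust
    enter = [name for name in top if name not in held_set]   # buy last
    return leave, rebal, enter


if __name__ == '__main__':
    ranked = ['AAA', 'CCC', 'DDD', 'BBB']  # hypothetical ranking
    held = ['AAA', 'BBB']                  # hypothetical open positions
    print(plan_rebalance(ranked, held, selnum=2))
```

Processing the three lists in this order frees cash from the exits before the new entries consume it.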
Running it all and Evaluating it!
Having a data loader class and the strategy is not enough. Just like with any other framework, some boilerplate is needed. The following code makes it possible.
def run(args=None):
    args = parse_args(args)
    cerebro = bt.Cerebro()
    # Data feed kwargs
    dkwargs = dict(**eval('dict(' + args.dargs + ')'))
    # Parse from/to-date
    dtfmt, tmfmt = '%Y-%m-%d', 'T%H:%M:%S'
    if args.fromdate:
        fmt = dtfmt + tmfmt * ('T' in args.fromdate)
        dkwargs['fromdate'] = datetime.datetime.strptime(args.fromdate, fmt)
    if args.todate:
        fmt = dtfmt + tmfmt * ('T' in args.todate)
        dkwargs['todate'] = datetime.datetime.strptime(args.todate, fmt)
    # add all the data files available in the directory datadir
    for fname in glob.glob(os.path.join(args.datadir, '*')):
        data = NetPayOutData(dataname=fname, **dkwargs)
        cerebro.adddata(data)
    # add strategy
    cerebro.addstrategy(St, **eval('dict(' + args.strat + ')'))
    # set the cash
    cerebro.broker.setcash(args.cash)
    cerebro.run()  # execute it all
    # Basic performance evaluation ... final value ... minus starting cash
    pnl = cerebro.broker.get_value() - args.cash
    print('Profit ... or Loss: {:.2f}'.format(pnl))
Where the following is done:
- Parsing the arguments and making them available (this is obviously optional, as everything can be hardcoded, but good practices are good practices)
- Creating a cerebro engine instance. Yes, this is Spanish for "brain" and is the part of the framework in charge of coordinating the orchestral maneuvers in the dark. Although it can accept several options, the defaults should suffice for most use cases.
- Loading the data files, which is done with a simple directory scan of args.datadir: all files are loaded with NetPayOutData and added to the cerebro instance
- Adding the strategy
- Setting the cash, which defaults to 1,000,000. Given that the use case is for 100 stocks in a universe of 1000, it seems fair to have some cash to spare. It is also an argument which can be changed.
- And calling cerebro.run()
- Finally the performance is evaluated
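The date handling in run() relies on a compact idiom: multiplying a string by a bool. The following standalone sketch isolates it; the function name parse_when is hypothetical, while the two format strings are the ones used in the script.

```python
import datetime

DTFMT, TMFMT = '%Y-%m-%d', 'T%H:%M:%S'


def parse_when(text):
    """Parse 'YYYY-MM-DD' or 'YYYY-MM-DDTHH:MM:SS'."""
    # A bool multiplies a string like 0 or 1, so the time part of the
    # format is only appended when a 'T' is present in the input
    fmt = DTFMT + TMFMT * ('T' in text)
    return datetime.datetime.strptime(text, fmt)


if __name__ == '__main__':
    print(parse_when('2005-01-31'))
    print(parse_when('2005-01-31T15:30:00'))
```

Both the date-only and the date-time forms are accepted, which matches the YYYY-MM-DD[THH:MM:SS] format advertised by the command line help.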
To make it possible to run things with different parameters straight from the command line, an argparse enabled boilerplate is presented below, with the entire code.
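The kwargs-style options (--dargs, --strat) are given as k1=v1,k2=v2 strings and turned into keyword arguments with eval. The sketch below isolates that idiom; the helper name kwargs_from_string is hypothetical. Since eval executes arbitrary Python, this is only reasonable for input typed by the person running the script.

```python
def kwargs_from_string(text):
    """Turn 'k1=v1,k2=v2' into a dict, as the script does with eval."""
    # eval executes arbitrary code: only use it with trusted input
    return eval('dict(' + text + ')')


if __name__ == '__main__':
    print(kwargs_from_string('selcperc=0.2,reserve=0.1'))
    print(kwargs_from_string(''))
```

An empty string yields an empty dict, which is why the options can default to '' and still be unpacked with ** into the strategy or the data feed.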
Performance Evaluation
A naive performance evaluation is added in the form of the final resulting value, i.e.: the final net asset value minus the starting cash.
The backtrader ecosystem offers a set of built-in performance analyzers which could also be used, like: SharpeRatio, Variability-Weighted Return, SQN and others. See Docs - Analyzers Reference
The complete script
And finally the bulk of the work presented as a whole. Enjoy!
import argparse
import datetime
import glob
import os.path
import backtrader as bt

class NetPayOutData(bt.feeds.GenericCSVData):
    lines = ('npy',)  # add a line containing the net payout yield
    params = dict(
        npy=6,  # npy field is in column 6 (0-based index)
        dtformat='%Y-%m-%d',  # fix the date format to yyyy-mm-dd
        timeframe=bt.TimeFrame.Months,  # fix the timeframe
        openinterest=-1,  # -1 indicates there is no openinterest field
    )

class St(bt.Strategy):
    params = dict(
        selcperc=0.10,  # percentage of stocks to select from the universe
        rperiod=1,  # period for the returns calculation, default 1 period
        vperiod=36,  # lookback period for volatility - default 36 periods
        mperiod=12,  # lookback period for momentum - default 12 periods
        reserve=0.05  # 5% reserve capital
    )

    def log(self, arg):
        print('{} {}'.format(self.datetime.date(), arg))

    def __init__(self):
        # calculate 1st the amount of stocks that will be selected
        self.selnum = int(len(self.datas) * self.p.selcperc)

        # allocation perc per stock
        # reserve kept to make sure orders are not rejected due to
        # margin. Prices are calculated when known (close), but orders can only
        # be executed next day (opening price). Price can gap upwards
        self.perctarget = (1.0 - self.p.reserve) / self.selnum

        # returns, volatilities and momentums
        rs = [bt.ind.PctChange(d, period=self.p.rperiod) for d in self.datas]
        vs = [bt.ind.StdDev(ret, period=self.p.vperiod) for ret in rs]
        ms = [bt.ind.ROC(d, period=self.p.mperiod) for d in self.datas]

        # simple rank formula: (momentum * net payout) / volatility
        # the highest ranked: low vol, large momentum, large payout
        self.ranks = {d: d.npy * m / v for d, v, m in zip(self.datas, vs, ms)}

    def next(self):
        # sort data and current rank
        ranks = sorted(
            self.ranks.items(),  # get the (d, rank), pair
            key=lambda x: x[1][0],  # use rank (elem 1) and current time "0"
            reverse=True,  # highest ranked 1st ... please
        )

        # put top ranked in dict with data as key to test for presence
        rtop = dict(ranks[:self.selnum])

        # For logging purposes of stocks leaving the portfolio
        rbot = dict(ranks[self.selnum:])

        # prepare quick lookup list of stocks currently holding a position
        posdata = [d for d, pos in self.getpositions().items() if pos]

        # remove those no longer top ranked
        # do this first to issue sell orders and free cash
        for d in (d for d in posdata if d not in rtop):
            self.log('Leave {} - Rank {:.2f}'.format(d._name, rbot[d][0]))
            self.order_target_percent(d, target=0.0)

        # rebalance those already top ranked and still there
        for d in (d for d in posdata if d in rtop):
            self.log('Rebal {} - Rank {:.2f}'.format(d._name, rtop[d][0]))
            self.order_target_percent(d, target=self.perctarget)
            del rtop[d]  # remove it, to simplify next iteration

        # issue a target order for the newly top ranked stocks
        # do this last, as this will generate buy orders consuming cash
        for d in rtop:
            self.log('Enter {} - Rank {:.2f}'.format(d._name, rtop[d][0]))
            self.order_target_percent(d, target=self.perctarget)

def run(args=None):
    args = parse_args(args)

    cerebro = bt.Cerebro()

    # Data feed kwargs
    dkwargs = dict(**eval('dict(' + args.dargs + ')'))

    # Parse from/to-date
    dtfmt, tmfmt = '%Y-%m-%d', 'T%H:%M:%S'
    if args.fromdate:
        fmt = dtfmt + tmfmt * ('T' in args.fromdate)
        dkwargs['fromdate'] = datetime.datetime.strptime(args.fromdate, fmt)

    if args.todate:
        fmt = dtfmt + tmfmt * ('T' in args.todate)
        dkwargs['todate'] = datetime.datetime.strptime(args.todate, fmt)

    # add all the data files available in the directory datadir
    for fname in glob.glob(os.path.join(args.datadir, '*')):
        data = NetPayOutData(dataname=fname, **dkwargs)
        cerebro.adddata(data)

    # add strategy
    cerebro.addstrategy(St, **eval('dict(' + args.strat + ')'))

    # set the cash
    cerebro.broker.setcash(args.cash)

    cerebro.run()  # execute it all

    # Basic performance evaluation ... final value ... minus starting cash
    pnl = cerebro.broker.get_value() - args.cash
    print('Profit ... or Loss: {:.2f}'.format(pnl))

def parse_args(pargs=None):
    parser = argparse.ArgumentParser(
        formatter_class=argparse.ArgumentDefaultsHelpFormatter,
        description=('Rebalancing with the Conservative Formula'),
    )

    parser.add_argument('--datadir', required=True,
                        help='Directory with data files')

    parser.add_argument('--dargs', default='',
                        metavar='kwargs', help='kwargs in k1=v1,k2=v2 format')

    # Defaults for dates
    parser.add_argument('--fromdate', required=False, default='',
                        help='Date[time] in YYYY-MM-DD[THH:MM:SS] format')

    parser.add_argument('--todate', required=False, default='',
                        help='Date[time] in YYYY-MM-DD[THH:MM:SS] format')

    parser.add_argument('--cerebro', required=False, default='',
                        metavar='kwargs', help='kwargs in k1=v1,k2=v2 format')

    # the cash is a plain float, not a kwargs string
    parser.add_argument('--cash', default=1000000.0, type=float,
                        help='Starting cash')

    parser.add_argument('--strat', required=False, default='',
                        metavar='kwargs', help='kwargs in k1=v1,k2=v2 format')

    return parser.parse_args(pargs)

if __name__ == '__main__':
    run()