Binary Datafeed Development
Note
The binary file used in the examples goog.fd
belongs to VisualChart and
cannot be distributed with backtrader
.
VisualChart can be downloaded free of charge for those interested in directly using the binary files.
CSV Data feed development has shown how to add new CSV based data feeds. The existing base class CSVDataBase provides the framework taking most of the work off the subclasses which in most cases can simply do:
def _loadline(self, linetokens):
# parse the linetokens here and put them in self.lines.close,
# self.lines.high, etc
return True # if data was parsed, else ... return False
The base class takes care of the parameters, initialization, opening of files,
reading lines, splitting the lines in tokens and additional things like skipping
lines which don’t fit into the date range (fromdate
, todate
) which the
end user may have defined.
Developing a non-CSV datafeed follows the same pattern without going down to the already splitted line tokens.
Things to do:
-
Derive from backtrader.feed.DataBase
-
Add any parameters you may need
-
Should initialization be needed, override
__init__(self)
and/orstart(self)
-
Should any clean-up code be needed, override
stop(self)
-
The work happens inside the method which MUST always be overriden:
_load(self)
Let’s the parameters already provided by backtrader.feed.DataBase
:
from backtrader.utils.py3 import with_metaclass
...
...
class DataBase(with_metaclass(MetaDataBase, dataseries.OHLCDateTime)):
params = (('dataname', None),
('fromdate', datetime.datetime.min),
('todate', datetime.datetime.max),
('name', ''),
('compression', 1),
('timeframe', TimeFrame.Days),
('sessionend', None))
Having the following meanings:
-
dataname
is what allows the data feed to identify how to fetch the data. In the case of theCSVDataBase
this parameter is meant to be a path to a file or already a file-like object. -
fromdate
andtodate
define the date range which will be passed to strategies. Any value provided by the feed outside of this range will be ignored -
name
is cosmetic for plotting purposes -
timeframe
indicates the temporal working referencePotential values:
Ticks
,Seconds
,Minutes
,Days
,Weeks
,Months
andYears
-
compression
(default: 1)Number of actual bars per bar. Informative. Only effective in Data Resampling/Replaying.
-
compression
-
sessionend
if passed (a datetime.time object) will be added to the datafeeddatetime
line which allows identifying the end of the session
Sample binary datafeed
backtrader
already defines a CSV datafeed (VChartCSVData
) for the
exports of VisualChart, but it is also possible to
directly read the binary data files.
Let’s do it (full data feed code can be found at the bottom)
Initialization
The binary VisualChart data files can contain either daily (.fd extension) or
intraday data (.min extension). Here the parameter timeframe
will be used to distinguish which type of file is being read.
During __init__
constants which differ for each type are set up.
def __init__(self):
super(VChartData, self).__init__()
# Use the informative "timeframe" parameter to understand if the
# code passed as "dataname" refers to an intraday or daily feed
if self.p.timeframe >= TimeFrame.Days:
self.barsize = 28
self.dtsize = 1
self.barfmt = 'IffffII'
else:
self.dtsize = 2
self.barsize = 32
self.barfmt = 'IIffffII'
Start
The Datafeed will be started when backtesting commences (it can actually be started several times during optimizations)
In the start
method the binary file is open unless a file-like object has
been passed.
def start(self):
# the feed must start ... get the file open (or see if it was open)
self.f = None
if hasattr(self.p.dataname, 'read'):
# A file has been passed in (ex: from a GUI)
self.f = self.p.dataname
else:
# Let an exception propagate
self.f = open(self.p.dataname, 'rb')
Stop
Called when backtesting is finished.
If a file was open, it will be closed
def stop(self):
# Close the file if any
if self.f is not None:
self.f.close()
self.f = None
Actual Loading
The actual work is done in _load
. Called to load the next set of data, in
this case the next : datetime, open, high, low, close, volume, openinterest. In
backtrader
the “actual” moment corresponds to index 0.
A number of bytes will be read from the open file (determined by the constants
set up during __init__
), parsed with the struct
module, further
processed if needed (like with divmod operations for date and time) and stored
in the lines
of the data feed: datetime, open, high, low, close, volume,
openinterest.
If no data can be read from the file it is assumed that the End Of File (EOF) has been reached
False
is returned to indicate the fact no more data is available
Else if data has been loaded and parsed:
True
is returned to indicate the loading of the data set was a success
def _load(self):
if self.f is None:
# if no file ... no parsing
return False
# Read the needed amount of binary data
bardata = self.f.read(self.barsize)
if not bardata:
# if no data was read ... game over say "False"
return False
# use struct to unpack the data
bdata = struct.unpack(self.barfmt, bardata)
# Years are stored as if they had 500 days
y, md = divmod(bdata[0], 500)
# Months are stored as if they had 32 days
m, d = divmod(md, 32)
# put y, m, d in a datetime
dt = datetime.datetime(y, m, d)
if self.dtsize > 1: # Minute Bars
# Daily Time is stored in seconds
hhmm, ss = divmod(bdata[1], 60)
hh, mm = divmod(hhmm, 60)
# add the time to the existing atetime
dt = dt.replace(hour=hh, minute=mm, second=ss)
self.lines.datetime[0] = date2num(dt)
# Get the rest of the unpacked data
o, h, l, c, v, oi = bdata[self.dtsize:]
self.lines.open[0] = o
self.lines.high[0] = h
self.lines.low[0] = l
self.lines.close[0] = c
self.lines.volume[0] = v
self.lines.openinterest[0] = oi
# Say success
return True
Other Binary Formats
The same model can be applied to any other binary source:
-
Database
-
Hierarchical data storage
-
Online source
The steps again:
-
__init__
-> Any init code for the instance, only once -
start
-> start of backtesting (one or more times if optimization will be run)This would for example open the connection to the database or a socket to an online service
-
stop
-> clean-up like closing the database connection or open sockets -
_load
-> query the database or online source for the next set of data and load it into thelines
of the object. The standard fields being: datetime, open, high, low, close, volume, openinterest
VChartData Test
The VCharData
loading data from a local “.fd” file for Google for the
year 2006.
It’s only about loading the data, so not even a subclass of Strategy
is
needed.
from __future__ import (absolute_import, division, print_function,
unicode_literals)
import datetime
import backtrader as bt
from vchart import VChartData
if __name__ == '__main__':
# Create a cerebro entity
cerebro = bt.Cerebro(stdstats=False)
# Add a strategy
cerebro.addstrategy(bt.Strategy)
###########################################################################
# Note:
# The goog.fd file belongs to VisualChart and cannot be distributed with
# backtrader
#
# VisualChart can be downloaded from www.visualchart.com
###########################################################################
# Create a Data Feed
datapath = '../../datas/goog.fd'
data = VChartData(
dataname=datapath,
fromdate=datetime.datetime(2006, 1, 1),
todate=datetime.datetime(2006, 12, 31),
timeframe=bt.TimeFrame.Days
)
# Add the Data Feed to Cerebro
cerebro.adddata(data)
# Run over everything
cerebro.run()
# Plot the result
cerebro.plot(style='bar')
VChartData Full Code
from __future__ import (absolute_import, division, print_function,
unicode_literals)
import datetime
import struct
from backtrader.feed import DataBase
from backtrader import date2num
from backtrader import TimeFrame
class VChartData(DataBase):
def __init__(self):
super(VChartData, self).__init__()
# Use the informative "timeframe" parameter to understand if the
# code passed as "dataname" refers to an intraday or daily feed
if self.p.timeframe >= TimeFrame.Days:
self.barsize = 28
self.dtsize = 1
self.barfmt = 'IffffII'
else:
self.dtsize = 2
self.barsize = 32
self.barfmt = 'IIffffII'
def start(self):
# the feed must start ... get the file open (or see if it was open)
self.f = None
if hasattr(self.p.dataname, 'read'):
# A file has been passed in (ex: from a GUI)
self.f = self.p.dataname
else:
# Let an exception propagate
self.f = open(self.p.dataname, 'rb')
def stop(self):
# Close the file if any
if self.f is not None:
self.f.close()
self.f = None
def _load(self):
if self.f is None:
# if no file ... no parsing
return False
# Read the needed amount of binary data
bardata = self.f.read(self.barsize)
if not bardata:
# if no data was read ... game over say "False"
return False
# use struct to unpack the data
bdata = struct.unpack(self.barfmt, bardata)
# Years are stored as if they had 500 days
y, md = divmod(bdata[0], 500)
# Months are stored as if they had 32 days
m, d = divmod(md, 32)
# put y, m, d in a datetime
dt = datetime.datetime(y, m, d)
if self.dtsize > 1: # Minute Bars
# Daily Time is stored in seconds
hhmm, ss = divmod(bdata[1], 60)
hh, mm = divmod(hhmm, 60)
# add the time to the existing atetime
dt = dt.replace(hour=hh, minute=mm, second=ss)
self.lines.datetime[0] = date2num(dt)
# Get the rest of the unpacked data
o, h, l, c, v, oi = bdata[self.dtsize:]
self.lines.open[0] = o
self.lines.high[0] = h
self.lines.low[0] = l
self.lines.close[0] = c
self.lines.volume[0] = v
self.lines.openinterest[0] = oi
# Say success
return True