| author: | Cimarron Taylor |
|---|---|
| version: | 0.1 |
| date: | August 28, 2011 |
To demonstrate the structure of NASDAQ ITCH41 data, I wrote a simple YAML-driven parser in Python.
It uses a YAML specification derived from the ITCH41 documentation to build tables of format strings for the Python struct module, which does most of the work.
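The idea of turning a specification into struct format strings can be sketched as follows. This is only an illustration, not the code in src/itch41.py: the `SPEC` dictionary stands in for a parsed YAML file, and its layout, the field names, and the helpers `build_formats` and `decode` are all assumptions made for the example. The two message layouts shown should be checked against the ITCH41 documentation. The sketch is written for Python 3.

```python
import struct

# Hypothetical stand-in for the parsed YAML specification:
# message type -> ordered (field name, struct code) pairs.
# The layouts are illustrative; verify them against the ITCH41 documentation.
SPEC = {
    b"T": [("msg_type", "c"), ("seconds", "I")],
    b"S": [("msg_type", "c"), ("nanoseconds", "I"), ("event_code", "c")],
}


def build_formats(spec):
    """Return {message type: (struct.Struct, field names)}."""
    table = {}
    for msg_type, fields in spec.items():
        # ">" selects big-endian byte order with no padding between fields.
        fmt = ">" + "".join(code for _, code in fields)
        names = [name for name, _ in fields]
        table[msg_type] = (struct.Struct(fmt), names)
    return table


def decode(table, payload):
    """Decode one payload into a dict; None if the type is unknown or the length is wrong."""
    entry = table.get(payload[:1])
    if entry is None or len(payload) != entry[0].size:
        return None
    layout, names = entry
    return dict(zip(names, layout.unpack(payload)))


FORMATS = build_formats(SPEC)
sample = struct.pack(">cI", b"T", 34200)  # a made-up payload for the example
decoded = decode(FORMATS, sample)
```

Keeping the layouts in data rather than in code means a new message type only needs a new entry in the specification; the decoding loop itself does not change.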
Using the code is straightforward, as the sample test program (src/main.py) demonstrates:
    #!/usr/bin/env python
    import sys
    import itch41

    converter = itch41.ITCH41()
    sep = '|'
    filename = sys.argv[1]
    with file(filename) as f:
        # read all the messages
        for m in converter.parse(f):
            # set aside unknown/invalid messages
            if m[0] is None:
                print >>sys.stderr, m
                continue
            # convert message to db compatible format
            rec = converter.record(m)
            print sep.join(rec)
The records generated by the parser can easily be processed by additional code or converted to other formats such as CSV. All output records share the same schema, described in the file src/create.sql.
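As one example of such a conversion, the sketch below rewrites separator-delimited record lines as CSV using the standard csv module. The function name `records_to_csv` and the sample rows are made up for illustration and do not reflect the real schema in src/create.sql. It assumes Python 3 and the `|` separator used by the sample program.

```python
import csv
import io


def records_to_csv(lines, out, sep="|"):
    """Write separator-delimited record lines to `out` as CSV; return the row count."""
    writer = csv.writer(out)
    count = 0
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        writer.writerow(line.split(sep))
        count += 1
    return count


# Made-up rows, only to show the quoting behaviour of the csv module.
buf = io.StringIO()
n = records_to_csv(["0|a|b\n", "5|c|d,e\n"], buf)
```

Using csv.writer instead of a plain string replace ensures that fields containing commas or quotes are escaped correctly.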
Note that an extra _offset column appears at the beginning of each record to indicate the byte offset of the corresponding ITCH record in the original data feed. While often helpful in diagnosing downstream data processing issues, it can easily be omitted if not desired.
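Both uses of the _offset column can be sketched in a few lines. The helpers `peek` and `drop_offset` are hypothetical and not part of itch41; they assume Python 3, a feed opened in binary mode, and the `|` separator used by the sample program.

```python
import io


def peek(feed, offset, length=16):
    """Return `length` raw bytes of the feed starting at `offset`, as a hex string."""
    feed.seek(offset)
    return feed.read(length).hex()


def drop_offset(record, sep="|"):
    """Remove the leading _offset column from one output record line."""
    _, _, rest = record.partition(sep)
    return rest


# Made-up bytes standing in for an ITCH data file opened with open(path, "rb").
feed = io.BytesIO(b"\x00\x01\x02\x03\x04\x05")
raw = peek(feed, 2, 3)
trimmed = drop_offset("42|x|y")
```

Given a suspicious output record, passing its _offset value to `peek` shows the raw bytes that produced it.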
This software is licensed under the 2-clause BSD license. See the file LICENSE for details.
The source is kept in a Mercurial repository on Bitbucket, which you can clone with:
    % hg clone http://bitbucket.org/cdtitch41/itch41
You can also browse the source on Bitbucket: http://bitbucket.org/cdtitch41/itch41/src
Checking out itch41 from Bitbucket requires Mercurial.
Running itch41 requires Python and the Python yaml module.
Rebuilding the itch41 documentation from its reStructuredText sources requires Python and Sphinx.
    itch41               - project directory
    |-- Makefile         - build commands
    |-- data             - directory with Makefile to download some sample data
    |-- doc              - sphinx documentation
    |-- src              - Python sources
    |   |-- create.sql   - create table statement for output record schema
    |   |-- itch41.py    - itch41 format converter
    |   `-- main.py      - sample test program
    |-- test             - test output files
    `-- todo             - known improvements
Additional directories are created by the Makefile directives:
    itch41
    `-- work             - temporary files generated by test