itch41 - simple NASDAQ TotalView-ITCH 4.1 parser / converter

author: Cimarron Taylor
version: 0.1
date: August 28, 2011

To demonstrate the structure of NASDAQ ITCH41 data, I wrote a simple YAML-driven parser in Python.

Overview

This code parses a file of level 3 NASDAQ stock exchange data, such as
ftp://emi.nasdaq.com/ITCH/S030711-v41.txt.gz

It relies on a YAML specification, derived from the ITCH41 documentation, to construct tables of format strings for the Python struct module, which does most of the work.
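The idea of turning a field specification into struct format strings can be sketched as follows. This is a minimal illustration, not the code in src/itch41.py: the SPEC dictionary stands in for whatever the YAML file loads into, the message types "T" and "Z" and their field layouts are made up for the example, and the names build_format and decode are hypothetical. It assumes big-endian unsigned integers and fixed-width byte strings.

```python
import struct

# Hypothetical field layout, shaped like what a YAML spec might load into.
# Each message type maps to a list of (field name, kind, size in bytes).
# These layouts are illustrative, not copied from the ITCH41 documentation.
SPEC = {
    "T": [("msg_type", "alpha", 1), ("seconds", "int", 4)],
    "Z": [("msg_type", "alpha", 1), ("ref", "int", 8), ("code", "alpha", 4)],
}

# struct codes for unsigned integers of each supported width
INT_CODES = {1: "B", 2: "H", 4: "I", 8: "Q"}


def build_format(fields):
    """Return a big-endian struct format string for a list of fields."""
    codes = []
    for _name, kind, size in fields:
        if kind == "alpha":
            codes.append("%ds" % size)  # fixed-width byte string
        elif kind == "int":
            codes.append(INT_CODES[size])
        else:
            raise ValueError("unknown field kind: %r" % kind)
    return ">" + "".join(codes)


# One precompiled struct.Struct per message type
FORMATS = {
    mtype: struct.Struct(build_format(fields))
    for mtype, fields in SPEC.items()
}


def decode(payload):
    """Decode one message body into a dict keyed by field name."""
    mtype = payload[:1].decode("ascii")
    names = [name for name, _kind, _size in SPEC[mtype]]
    return dict(zip(names, FORMATS[mtype].unpack(payload)))
```

Building the table once and reusing the compiled struct.Struct objects keeps the per-message work down to a dictionary lookup and a single unpack call.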

Example

Using the code is straightforward, as the sample test program (src/main.py) demonstrates:

#!/usr/bin/env python

import sys
import itch41

converter = itch41.ITCH41()
sep       = '|'
filename  = sys.argv[1]

# open in binary mode: ITCH messages are packed binary, not text
with open(filename, 'rb') as f:

    # read all the messages
    for m in converter.parse(f):

        # set aside unknown/invalid messages
        if m[0] is None:
            print >>sys.stderr, m
            continue

        # convert message to db compatible format
        rec = converter.record(m)
        print sep.join(rec)

Schema

The records produced by the parser can be processed by additional code or converted to other formats such as CSV. All output records share the same schema, described in the file src/create.sql.

Note that an extra _offset column appears at the beginning of each record to indicate the byte offset of the corresponding ITCH record in the original data feed. While often helpful in diagnosing downstream data processing issues, it may easily be omitted if not desired.
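As a rough sketch of the CSV conversion and of dropping the _offset column, the helper below writes records with the standard csv module. It assumes each record is a sequence of strings with _offset in the first position, as described above. The function name write_csv, the drop_offset flag, and the sample rows are all hypothetical and do not reflect the real schema in src/create.sql.

```python
import csv
import sys


def write_csv(records, out, drop_offset=False):
    """Write records as CSV rows and return the number of rows written.

    When drop_offset is true, the leading _offset column is skipped.
    """
    writer = csv.writer(out, lineterminator="\n")
    count = 0
    for rec in records:
        row = rec[1:] if drop_offset else rec
        writer.writerow(row)
        count += 1
    return count


# Hypothetical sample records: _offset first, then made-up fields.
sample = [
    ["0", "T", "34200"],
    ["7", "A", "1000"],
]

write_csv(sample, sys.stdout, drop_offset=True)
```

In the sample program, the same helper could take an iterable of converter.record(m) results in place of the sample list.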

License

This software is licensed under the 2-clause BSD license. See the file LICENSE for details.

Source

The source is kept in a Mercurial repository on Bitbucket, which you can clone with:

% hg clone http://bitbucket.org/cdtitch41/itch41

You can also browse the source on Bitbucket: http://bitbucket.org/cdtitch41/itch41/src

Requirements

Checking out itch41 from Bitbucket requires Mercurial.

Running itch41 requires Python and the Python yaml module.

Rebuilding the itch41 documentation from its reStructuredText sources requires Python and Sphinx.

Project Structure

itch41                        - project directory
  |-- Makefile                - build commands
  |-- data                    - directory with Makefile to download some sample data
  |-- doc                     - sphinx documentation
  |-- src                     - Python sources
  |   |-- create.sql          - create table statement for output record schema
  |   |-- itch41.py           - itch41 format converter
  |   `-- main.py             - sample test program
  |-- test                    - test output files
  `-- todo                    - known improvements

Running the Makefile targets creates additional directories:

itch41
  `-- work                    - temporary files generated by test
