Lots o' Ticks: real-time high performance queries on billions of quotes and trades
Arthur Whitney, Chief Technical Officer, KX Systems. a@kx.com
Dennis Shasha, Courant Institute, New York University. shasha@cs.nyu.edu
KSQL Model
Replace Tables by Arrables (Array Tables).
Any set-oriented query is answered as in SQL.
New queries become possible by viewing the columns of a table as arrays.
Demo Application: Stocks
10,000 securities from all the major markets.
Entire quote and trade history.
Production application holds 10 years of data, about 1/2 terabyte, on a Linux cluster of up to 100 processors.
For demo: a few months on a single PC.
Example Query 1
Find the 5-tick moving averages of each stock per month. The trade table is ordered by date.
    select 5 avgs price by sym, date.month from trade
“avgs” produces an array for each (stock, month) pair. In SQL, an aggregate can produce only a scalar for each group-by result.
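For readers without KSQL at hand, the grouping-then-array behavior of Query 1 can be sketched in plain Python. This is an illustration, not the KDB implementation: the names `moving_avgs` and `avgs_by_sym_month`, the tuple layout of a trade, and the sample data are all made up for the example. It also assumes the window is simply shorter for the first few ticks of each group; the slide does not say how KSQL treats them.

```python
from collections import defaultdict

def moving_avgs(window, values):
    """Average of the last `window` values at each position.

    Assumption: at the start of the series the window is truncated
    to the values seen so far.
    """
    out, total = [], 0.0
    for i, v in enumerate(values):
        total += v
        if i >= window:
            total -= values[i - window]  # drop the value leaving the window
        out.append(total / min(i + 1, window))
    return out

def avgs_by_sym_month(trades, window=5):
    """trades: iterable of (sym, 'YYYY-MM-DD', price), already ordered by date.

    Returns one array of moving averages per (sym, month) group, which is
    what a scalar-only SQL aggregate cannot express.
    """
    groups = defaultdict(list)
    for sym, date, price in trades:
        groups[(sym, date[:7])].append(price)  # 'YYYY-MM' is the month key
    return {key: moving_avgs(window, prices) for key, prices in groups.items()}

# Hypothetical sample data, ordered by date as the slide requires.
trades = [
    ("IBM", "2001-01-02", 10.0),
    ("IBM", "2001-01-03", 12.0),
    ("MSFT", "2001-01-03", 50.0),
    ("IBM", "2001-01-04", 14.0),
    ("IBM", "2001-02-01", 20.0),
]
result = avgs_by_sym_month(trades, window=2)
```

Each value in `result` is a list as long as its group, so the order of ticks inside a group is preserved in the answer.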
Example Query 2
Find the 10-day delayed auto-correlation (a function defined in C) for each stock:
    select auto[10,price] by sym from trade
User-defined functions can be written in C and imported into the name space. They can produce arrays or scalars.
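The slide does not give the formula behind `auto`, so the following Python stand-in is only a guess at its shape. It assumes the usual sample autocorrelation at a given lag, with the lag counted in rows of the price column, and returns NaN when the value is undefined. The helper `auto_by_sym` and the sample data are hypothetical.

```python
import math
from collections import defaultdict

def auto(lag, xs):
    """Sample autocorrelation of xs at the given lag (assumed definition).

    r = sum((x[t] - m) * (x[t+lag] - m)) / sum((x[t] - m) ** 2)
    Returns NaN for a series that is too short or constant.
    """
    n = len(xs)
    if not 0 < lag < n:
        return math.nan
    mean = sum(xs) / n
    denom = sum((x - mean) ** 2 for x in xs)
    if denom == 0:
        return math.nan
    num = sum((xs[t] - mean) * (xs[t + lag] - mean) for t in range(n - lag))
    return num / denom

def auto_by_sym(trades, lag):
    """trades: iterable of (sym, price) in time order; one scalar per sym."""
    groups = defaultdict(list)
    for sym, price in trades:
        groups[sym].append(price)
    return {sym: auto(lag, prices) for sym, prices in groups.items()}

# Hypothetical data: A trends upward, B alternates.
trades = [
    ("A", 1.0), ("B", 1.0), ("A", 2.0), ("B", -1.0), ("A", 3.0),
    ("B", 1.0), ("A", 4.0), ("B", -1.0), ("A", 5.0),
]
by_sym = auto_by_sym(trades, lag=1)
```

Here the user-defined function returns a scalar per group; the same grouping code would work unchanged if it returned an array.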
“Emotive” Queries
http://kx.com/a/tdb/q.t
7 best stocks to buy and later sell: “7 first desc …(last price)/first price …”
7 worst hours for some stock: “7 first asc …”
The best buy and later sale for some stock: “… max price - prefixmins price …”
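The fragment “max price - prefixmins price” is the only one of these queries spelled out enough to sketch. Read as arrays, it says: at every tick, subtract the lowest price seen so far from the current price, then take the largest difference. A minimal Python rendering follows; the function name `best_buy_then_sell` and the sample prices are made up, and the elided parts of the original query are not reconstructed.

```python
from itertools import accumulate

def best_buy_then_sell(prices):
    """Largest gain from buying at one tick and selling at the same or a later tick.

    prefix_mins[i] is the minimum of prices[0..i], the cheapest buy
    available to a sale at tick i.
    """
    if not prices:
        raise ValueError("prices must be non-empty")
    prefix_mins = accumulate(prices, min)
    return max(p - m for p, m in zip(prices, prefix_mins))

# Hypothetical price series in time order.
prices = [10.0, 7.0, 11.0, 5.0, 9.0]
profit = best_buy_then_sell(prices)
```

The result depends on the order of the rows, which is why the query needs the column treated as an array rather than as a set. For a series that only falls the answer is 0, since buying and selling at the same tick is allowed in this reading.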
Implementation
Fully vertically partitioned to take advantage of cache lines and stride-one behavior.
Special algorithms for sorting.
Very small footprint (< 0.3 megabytes).
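To make “vertically partitioned” concrete, here is a toy Python sketch of a table stored one column at a time. It says nothing about KDB's actual storage format; the column names, type codes, and sample rows are assumptions. The point is only that each numeric column sits in its own contiguous buffer, so an aggregate over one column walks memory with stride one and never touches the others.

```python
from array import array

# Hypothetical rows: (sym, date as YYYYMMDD, price, size).
rows = [
    ("IBM", 20010102, 10.0, 100),
    ("IBM", 20010103, 12.0, 200),
    ("MSFT", 20010103, 14.0, 300),
]

# Column layout: one contiguous typed array per numeric column.
table = {
    "sym": [r[0] for r in rows],
    "date": array("q", (r[1] for r in rows)),   # 8-byte signed integers
    "price": array("d", (r[2] for r in rows)),  # 8-byte floats
    "size": array("q", (r[3] for r in rows)),
}

# An aggregate over price reads only the price buffer.
price = table["price"]
avg_price = sum(price) / len(price)
```

In a row layout the same average would have to step over the sym, date, and size fields of every record.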
Who Uses KDB Now?
Mostly finance and telecommunications companies.
High value, large data. Response must be fast.
Data interchange products, because XML, like KDB, is ordered.
More Information
Timeseries application at: http://kx.com/a/tdb
Realtime: 10 million trades and quotes per day (Reuters Triarch) at a maximum rate of 10,000 records per second.
History: 10 billion trades and quotes.
Free download of KDB at www.kx.com