Spark vs Storm in Pairs Trading Predictions

Presentation transcript:

Spark vs Storm in Pairs Trading Predictions
Aditya Waghaye, Irene (Ying Yu)
CS848 - Dec. 5th, 2016

Pairs Trading - Background
Key notion: asset pricing can be viewed in relative terms
Asset prices converge to a long-term equilibrium
Pricing equilibrium for near-equivalent shares
Hedging:
  Short sell an overvalued asset
  Go long (buy) an undervalued asset
Focus is on the cointegration approach to pairs trading
Speaker notes: The key notion is that asset pricing can be viewed in absolute and relative terms. A pairs trading strategy speculates on the convergence of securities with similar characteristics to a long-term stable equilibrium; in other words, you are betting on the preservation of the long-run equilibrium. Pairs trading relies on the principle of equilibrium pricing for near-equivalent shares. Hedging: notions of relative pricing are implemented by shorting an overvalued security and going long an undervalued one, with the expectation that the mispricing will correct itself, at which point the trade is closed out. The resulting portfolio of pairs is typically constructed so that it has negligible exposure to the market.

Cointegration - Glossary
Stationary time series: its properties do not depend on the time of observation
Cointegration is applied to non-stationary I(1) time series (series that become stationary after differencing once)
  Non-stationary series show trends and seasonality
Differencing: compute the differences between consecutive observations of a time series
  Example: Dow Jones index data was non-stationary, but its daily changes were stationary
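Differencing can be illustrated with a few lines of Python. This is a minimal sketch, not code from the project: the helper names `difference` and `random_walk` are hypothetical, and the simulated random walk only stands in for real index data such as the Dow Jones series mentioned on the slide.

```python
import random

def difference(series):
    """Return first differences: series[t] - series[t-1]."""
    return [b - a for a, b in zip(series, series[1:])]

def random_walk(n, seed=0):
    """Simulate a random walk: each value is the previous one plus Gaussian noise."""
    rng = random.Random(seed)
    walk, level = [], 0.0
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        walk.append(level)
    return walk

walk = random_walk(1000)    # non-stationary in levels
changes = difference(walk)  # the differences are the Gaussian steps themselves
```

A series whose first differences are stationary, as here, is what the slide calls I(1).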

Cointegrated Time Series
Random walk model
Two non-stationary time series X and Y are cointegrated if:
  They are both I(1)
  Some linear combination of X and Y is stationary
Other examples:
  Income and consumption: as income increases or decreases, so too does consumption
  Size of police force and amount of criminal activity
Stock price time series historically behave like random walks, so they are treated as I(1); it is the spread between cointegrated stocks that is expected to mean-revert
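The definition can be made concrete with simulated data. The sketch below assumes a hypothetical pair in which X is a random walk and Y equals 2X plus stationary noise; the coefficient 2.0, the seed, and the function name are illustrative choices, not values from the project.

```python
import random
from statistics import pvariance

def simulate_cointegrated_pair(n, a=2.0, seed=1):
    """Return (x, y) where x is a random walk and y = a*x + stationary noise."""
    rng = random.Random(seed)
    x, y, level = [], [], 0.0
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)               # X wanders without a fixed mean
        x.append(level)
        y.append(a * level + rng.gauss(0.0, 1.0))  # Y tracks a*X plus noise
    return x, y

x, y = simulate_cointegrated_pair(2000)
spread = [yi - 2.0 * xi for xi, yi in zip(x, y)]   # the linear combination Y - 2X

# The spread stays near zero while X itself drifts far from its start.
print(pvariance(x), pvariance(spread))
```

Neither series is stationary on its own, yet the combination Y - 2X is just the noise term, which is the property the slide describes.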

Detect Cointegration
Steps:
  1. Check that the two time series X and Y are both I(1)
  2. Estimate the cointegrating relationship Y_t = a X_t + e_t
  3. Check that the residuals e_t of the cointegrating regression are stationary
Residuals are the deviations of the observed values from the values of the fitted function
The goal of cointegration is to find an equilibrium relationship between two time series that individually are not in equilibrium
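Steps 2 and 3 can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the presenters' implementation: the regression has no intercept (matching Y_t = a X_t + e_t), the Dickey-Fuller regression has no constant and no lagged differences, and the threshold -3.34 passed in the example is only an illustrative cutoff that should be replaced by a value from published Engle-Granger tables. Step 1 is left out; the same statistic could be applied to each series and its differences. All function names are hypothetical.

```python
import math
import random

def ols_slope(x, y):
    """Least-squares slope a for the no-intercept fit y = a*x + e."""
    return sum(xi * yi for xi, yi in zip(x, y)) / sum(xi * xi for xi in x)

def residuals(x, y, a):
    """Deviations of the observed y from the fitted values a*x."""
    return [yi - a * xi for xi, yi in zip(x, y)]

def dickey_fuller_stat(e):
    """t-statistic of gamma in  e[t] - e[t-1] = gamma * e[t-1] + u[t]."""
    lagged = e[:-1]
    delta = [cur - prev for prev, cur in zip(e, e[1:])]
    sxx = sum(v * v for v in lagged)
    gamma = sum(l * d for l, d in zip(lagged, delta)) / sxx
    rss = sum((d - gamma * l) ** 2 for l, d in zip(lagged, delta))
    sigma2 = rss / (len(delta) - 1)  # one estimated parameter
    return gamma / math.sqrt(sigma2 / sxx)

def looks_cointegrated(x, y, critical_value):
    """Return (slope, statistic, verdict); the critical value is supplied by the caller."""
    a = ols_slope(x, y)
    stat = dickey_fuller_stat(residuals(x, y, a))
    return a, stat, stat < critical_value

# Simulated example: X is a random walk, Y = 2X + noise.
rng = random.Random(7)
x, level = [], 0.0
for _ in range(500):
    level += rng.gauss(0.0, 1.0)
    x.append(level)
y = [2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
a_hat, stat, ok = looks_cointegrated(x, y, critical_value=-3.34)
```

A strongly negative statistic means the residuals are pulled back toward zero, which is evidence that they are stationary.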

Pairs Trading Example
Consider the cointegrated relationship X - 2Y = Z
  X tends to be priced twice as high as Y
  Z is a stationary series with zero mean
If X > 2Y: sell X and buy Y
If X < 2Y: buy X and sell Y
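The trading rule for this example could be written as a small function. This is a sketch only: the function name and the `band` parameter (a tolerance around zero, defaulting to 0 so that the behaviour matches the slide) are hypothetical additions.

```python
def pairs_signal(x_price, y_price, ratio=2.0, band=0.0):
    """Return the trade implied by the spread z = x - ratio*y."""
    z = x_price - ratio * y_price
    if z > band:
        return "sell X, buy Y"   # X is expensive relative to ratio*Y
    if z < -band:
        return "buy X, sell Y"   # X is cheap relative to ratio*Y
    return "no trade"

print(pairs_signal(105.0, 50.0))  # z = 5.0, so: sell X, buy Y
```

In practice a non-zero band would avoid trading on deviations too small to cover transaction costs.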

Project Goals
Central objective: evaluate Spark Streaming vs Storm
  Implement the cointegration algorithm in both frameworks
  Evaluate computation time on a similar dataset
  Evaluate computation latency on each framework
  Usage differences and benchmarking considerations

Benchmark Design
To compare the two systems, the main operations are:
  Read time series data from Kafka
  Run the algorithm on each system
  Run tests
  Store relevant pairs in a database

Benchmark Design
[Diagram: Apache Kafka cluster, topic with event partitions 1-3 -> parse JSON -> OLS -> Dickey-Fuller test -> store key-value]

Framework: Storm
[Diagram: Kafka feeding a Storm topology of spouts and bolts]

Framework: Spark Streaming
[Diagram: Kafka -> DStream (sequence of RDDs) -> transformation -> DStream (sequence of RDDs)]

How to Measure Performance
Time needed to build the model
Latency to test the model
  With backpressure
Example record: APPL: [ {C:GOOGLE,R:Y=AX+B}, {C:MICRO,R:Y=CX+D} ...] ...
[Diagram labels: Kafka, Storm, Spark, Redis]
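The record format and the timing measurement might look like this in Python. Everything here is an assumption made for illustration: reading `C` as the counterpart symbol and `R` as the fitted relationship is an interpretation of the slide, the helper names are hypothetical, and a plain dictionary stands in for the Redis store.

```python
import json
import time

def build_record(symbol, partners):
    """Build one key-value entry: symbol -> JSON list of {C: counterpart, R: relationship}."""
    value = [{"C": name, "R": relation} for name, relation in partners]
    return symbol, json.dumps(value)

def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start

store = {}  # stand-in for Redis
(key, value), elapsed = timed(
    build_record, "APPL", [("GOOGLE", "Y=AX+B"), ("MICRO", "Y=CX+D")]
)
store[key] = value
print(f"stored {key} in {elapsed:.6f} s")
```

The same `timed` wrapper could surround the model-building step to obtain the build time listed on the slide.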

Differences and Considerations
Spark Streaming vs. Storm
  Micro-batching vs. pure streaming
  Increase batch duration
  Maintain throughput
  Increase parallelism
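The latency cost of micro-batching can be shown with a toy model. The sketch assumes that an event waits until the batch containing it closes and that processing time is zero, so it isolates the effect of batch duration only; it does not use or measure Spark or Storm, and the function names are hypothetical.

```python
def microbatch_latencies(arrival_ms, batch_ms):
    """Waiting time of each event until the batch containing it closes (integer ms)."""
    return [(t // batch_ms + 1) * batch_ms - t for t in arrival_ms]

def mean(values):
    return sum(values) / len(values)

arrivals = list(range(0, 10_000, 100))  # one event every 100 ms for 10 s

short_batches = mean(microbatch_latencies(arrivals, 500))
long_batches = mean(microbatch_latencies(arrivals, 2000))
print(short_batches, long_batches)  # 300.0 1050.0
```

Under this model a pure streaming system, which handles each event on arrival, would add no waiting time at all, while a longer batch duration raises the average wait in exchange for fewer, larger batches.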

THANKS!