
Genetic Programming in Statistical Arbitrage

Philip Saks, PhD Seminar, 17.10.2007. Contents: Introduction, Genetic Programming, Clustering of Financial Data, Data, Framework, Results, Conclusion.


Presentation Transcript


  1. Genetic Programming in Statistical Arbitrage Philip Saks PhD Seminar 17.10.2007

  2. Contents • Introduction • Genetic Programming • Clustering of Financial Data • Data • Framework • Results • Conclusion

  3. Introduction • The aim is to develop an automated framework for trading strategy design by employing evolutionary computation in conjunction with other machine learning paradigms • The present framework utilizes genetic programming (GP) • Much of the existing financial forecasting using GP has focused on high-frequency FX [Jonsson, 1997][Dempster and Jones, 2001][Bhattacharyya et al., 2002], and the general consensus is that there is predictability and that excess return is achievable in the presence of transaction costs • For stocks, the results are mixed: [Allen and Karjalainen, 1999] do not significantly outperform buy-and-hold on S&P 500 daily data, but [Becker and Seshadri, 2003] do on monthly data.

  4. GP I • Evolutionary computation (EC) is a concept inspired by the Darwinian principle of survival of the fittest. The rationale is that natural evolution has proved successful in solving a wide range of problems over time; hence an algorithm that mimics this behavior might solve a wide range of artificial problems • The concept was pioneered by Holland (1975) in the form of Genetic Algorithms (GA) • A GA is essentially a population-based search method in which each candidate solution is encoded in a fixed-length binary string • The population evolves mainly via three operators: selection, reproduction and mutation • The selection process is based on the survival of the fittest principle.
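The GA loop described on this slide can be sketched as follows. This is a minimal illustration, not the presenter's implementation: the toy "one-max" fitness function, tournament selection, one-point crossover as the reproduction step, elitism, and all parameter values are assumptions chosen for brevity.

```python
import random

def one_max(bits):
    """Toy fitness (assumed for illustration): number of 1s in the bit string."""
    return sum(bits)

def tournament(pop, fitness, rng, k=3):
    """Selection: return the fittest of k randomly drawn individuals."""
    return max(rng.sample(pop, k), key=fitness)

def crossover(a, b, rng):
    """Reproduction via one-point crossover of two equal-length bit strings."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(bits, rng, rate=0.02):
    """Mutation: flip each bit independently with probability `rate`."""
    return [1 - b if rng.random() < rate else b for b in bits]

def run_ga(n_bits=20, pop_size=30, generations=40, seed=0):
    rng = random.Random(seed)
    # Each candidate solution is a fixed-length binary string.
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    for _ in range(generations):
        elite = max(pop, key=one_max)  # carry the best individual over unchanged
        children = [
            mutate(crossover(tournament(pop, one_max, rng),
                             tournament(pop, one_max, rng), rng), rng)
            for _ in range(pop_size - 1)
        ]
        pop = [elite] + children
    return max(pop, key=one_max)

best = run_ga()
print(one_max(best), best)
```

Because the elite individual is copied unchanged into each new generation, the best fitness in the population never decreases from one generation to the next.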

  5. GP II • GPs are basically GAs in which the genome constitutes hierarchical computer programs • Using this representation, we can solve problems in a wide range of fields, such as symbolic or ordinary regression, classification, optimal control theory, etc., since each of these areas “can be viewed as requiring discovery of a computer program that produces some desired output for particular inputs” (Koza, 1992) • Tree representation of programs, function and terminal sets • Evolutionary operators: selection, crossover and mutation
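A tree-structured program with a function set and a terminal set can be sketched as below. The nested-tuple encoding, the particular functions (`add`, `sub`, `mul`), the terminals (`x`, `1.0`, `2.0`) and the helper names are illustrative assumptions, not taken from the slides.

```python
import operator
import random

# Assumed function set (internal nodes) and terminal set (leaves).
FUNCTIONS = {"add": operator.add, "sub": operator.sub, "mul": operator.mul}
TERMINALS = ["x", 1.0, 2.0]

def random_tree(rng, depth=3):
    """Grow a random program tree with at most `depth` levels of functions."""
    if depth == 0 or rng.random() < 0.3:
        return rng.choice(TERMINALS)
    name = rng.choice(sorted(FUNCTIONS))
    return (name, random_tree(rng, depth - 1), random_tree(rng, depth - 1))

def evaluate(tree, x):
    """Evaluate a program tree for the input value x."""
    if isinstance(tree, tuple):
        name, left, right = tree
        return FUNCTIONS[name](evaluate(left, x), evaluate(right, x))
    return x if tree == "x" else tree  # variable or numeric constant

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node; a path is a tuple of child indices."""
    yield path, tree
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from subtrees(child, path + (i,))

def replace(tree, path, new):
    """Return a copy of `tree` with the node at `path` replaced by `new`."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(a, b, rng):
    """Subtree crossover: graft a random subtree of b onto a random node of a."""
    path, _ = rng.choice(list(subtrees(a)))
    _, donor = rng.choice(list(subtrees(b)))
    return replace(a, path, donor)

program = ("add", ("mul", "x", "x"), 1.0)  # encodes x * x + 1
print(evaluate(program, 3.0))
```

Unlike the fixed-length strings of a GA, trees produced by subtree crossover can grow or shrink, which is what lets the search vary program structure as well as parameters.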

  6. Clustering of Financial Data

  7. Data • Hourly VWAP prices and volume for banking stocks within the Euro Stoxx universe, covering the period from 01-Apr-2003 to 29-Jun-2007 (8648 observations).

  8. Framework • Evolve trading rules with binary decisions • We consider the classical single tree setup, but also a dual tree framework, where buy and sell rules are co-evolved. • The training set comprises 6000 samples, while the remaining 2647 are used for out-of-sample testing • 10 runs are performed for each experiment.
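The difference between the single-tree and dual-tree setups, together with the chronological train/test split, can be sketched as follows. The slides do not say how the two rules interact, so the state machine here (the buy rule is consulted only when out of the market, the sell rule only when in it) and the 0/1 position encoding are assumptions; the rules are plain Python callables standing in for evolved trees.

```python
def split_chronologically(series, n_train=6000):
    """First n_train samples for training, the remainder for out-of-sample testing."""
    return series[:n_train], series[n_train:]

def single_tree_positions(rule, features):
    """Single tree: the rule's boolean output is the position (1 = in, 0 = out)."""
    return [1 if rule(f) else 0 for f in features]

def dual_tree_positions(buy_rule, sell_rule, features):
    """Dual tree (assumed semantics): which rule is evaluated depends on the state."""
    position, out = 0, []
    for f in features:
        if position == 0 and buy_rule(f):
            position = 1      # enter when the buy rule fires
        elif position == 1 and sell_rule(f):
            position = 0      # exit when the sell rule fires
        out.append(position)
    return out

# Hypothetical stand-ins for evolved rules, applied to a toy feature series.
features = [1, 3, 2, 0, 3]
buy = lambda f: f > 2
sell = lambda f: f < 1
print(single_tree_positions(buy, features))      # [0, 1, 0, 0, 1]
print(dual_tree_positions(buy, sell, features))  # [0, 1, 1, 0, 1]
```

In this toy series the dual-tree setup stays in the market at the third step, where the single rule would exit, so entry and exit conditions can differ and the position changes less often.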

  9. Results • Trading on VWAP, assuming 1bp market impact
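A 1bp market impact assumption can be folded into strategy returns roughly as below. This is a sketch under stated assumptions: the cost is charged per unit change in position, a position earns the return of the period in which it is held, and the function name `net_returns` and the example numbers are hypothetical rather than the presenter's.

```python
def net_returns(positions, returns, cost_bp=1.0):
    """Per-period strategy returns net of a proportional trading cost.

    positions: position held in each period (e.g. 0 or 1)
    returns:   asset return in each period
    cost_bp:   cost in basis points per unit of position change (assumed model)
    """
    cost = cost_bp / 10_000.0
    prev, out = 0, []
    for pos, r in zip(positions, returns):
        turnover = abs(pos - prev)           # how much was traded this period
        out.append(pos * r - cost * turnover)
        prev = pos
    return out

# Hypothetical example: enter, hold, then exit.
print(net_returns([1, 1, 0], [0.01, -0.02, 0.03]))
```

Under this model, costs scale with turnover, so a rule that trades often pays more in costs than one that holds positions longer.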

  10. Sensitivity Analysis

  11. Stress Testing I

  12. Turnover Analysis

  13. Transaction Cost Implications

  14. Conclusion • It is possible to discover profitable arbitrage trading rules in the Euro Stoxx banking sector. • Cooperative co-evolution of buy and sell rules is beneficial compared with the classical single-tree structure. • Optimizing in the presence of transaction costs makes a difference: there should be correspondence between assumption and application for optimal performance.
