Presentation Transcript

Data-intensive Computing Case Study Area 2: Financial Engineering

B. Ramamurthy & Abhishek Agarwal


Modern Portfolio Theory

  • Modern Portfolio Theory (MPT) is a theory of investment that tries to maximize return and minimize risk by analytically choosing among different (financial) assets.

  • MPT was introduced by Harry Markowitz in 1952, and he received the Nobel Memorial Prize in Economic Sciences in 1990. He is currently a professor of finance at the Rady School of Management at UC San Diego. (83 years old)

  • One of his influences was John von Neumann, a pioneer of the stored-program computer.



The Big Picture

  • Stock market portfolio context:

  • Given an amount A, a set of stocks {w, x, y, z}, and the historical performance of those stocks, what should the percentage allocation of the amount to each stock be so that return is maximized and risk is minimized?

  • Example: given $10,000 and stocks {C, F, Q, T}, what is the recommended split among them to get the best return at the least risk?

  • Some quantitative assumptions are made about return and risks.

  • The above is my simple interpretation of the complex problem.



Reference

  • Reference: "Application of Hadoop MapReduce to Modern Portfolio Theory" by Abhishek Agarwal and Bina Ramamurthy, paper submitted to ICSA 2011.

  • Also work by Ross Goddard (UB grad at Drexel), and by Mohit Vora and Neeraj Mahajan (alumni, at Yahoo).



Markowitz Model

  • How is it data intensive?

    • Number of assets in the financial world is quite large

    • Historical data for each of these assets will easily overwhelm traditional databases

  • We would like real data on the volume of this problem. Any guess?

  • We are currently working with 500,000 assets.



MPT

  • A fundamental assumption of MPT is that the assets in an investment portfolio cannot be selected individually, in isolation from one another.

  • We need to consider how a change in the value of every other asset can affect the given asset.

  • Thus MPT is a mathematical formulation of diversification in investing, with the objective of selecting a collection of investment assets that collectively has lower risk than any individual asset.

  • In theory this is possible because different types of assets often change value in opposite directions.

  • For example, stocks vs. bonds, or tech vs. commodities.

  • Thus the assets in a collection mitigate each other's risks.



Technical Details

  • An asset's return is modeled as a normally distributed random variable.

  • Risk is defined as the standard deviation of return.

  • The return of a portfolio is a weighted combination of the individual assets' returns.

  • By combining different assets whose returns are not correlated, MPT reduces the total variance of the portfolio.



Expected Return and Variance

If E(Ri) is the expected return on asset i and wi is the weight of asset i, then the total expected return on the portfolio is

E(Rp) = ∑i wi · E(Ri)

The portfolio return variance can be written as

(σp)² = ∑i ∑j wi wj σi σj ρij

where ρij is the correlation between assets i and j; ρij = 1 for i = j (an asset is perfectly correlated with itself).
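As a sanity check, both formulas can be evaluated directly. The sketch below uses made-up numbers for a hypothetical three-asset portfolio; the weights, returns, volatilities, and correlations are illustrative only.

```python
# Illustrative three-asset example (all numbers invented for demonstration)
w     = [0.5, 0.3, 0.2]     # portfolio weights, sum to 1
mu    = [0.08, 0.05, 0.12]  # E(R_i) for each asset
sigma = [0.20, 0.10, 0.30]  # standard deviation of each asset's return
rho   = [[1.0, 0.2, 0.4],   # correlation matrix; rho_ii = 1
         [0.2, 1.0, 0.1],
         [0.4, 0.1, 1.0]]

n = len(w)

# E(R_p) = sum_i w_i * E(R_i)
expected_return = sum(w[i] * mu[i] for i in range(n))

# (sigma_p)^2 = sum_i sum_j w_i w_j sigma_i sigma_j rho_ij
portfolio_variance = sum(
    w[i] * w[j] * sigma[i] * sigma[j] * rho[i][j]
    for i in range(n) for j in range(n)
)
```

Because the assets are imperfectly correlated, the portfolio variance comes out below the square of the weighted-average volatility, which is the diversification effect MPT formalizes.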



MPT Explained

(Figure slide: the graphic is not reproduced in this transcript.)


Portfolio Combination

  • Compute and plot the expected return and variance of candidate portfolios on a graph.

  • The hyperbola traced out by these points represents the efficient frontier.

  • A portfolio on the efficient frontier represents the combination offering the best possible return for a given risk level.

  • Matrices are used for the calculation of the efficient frontier.



Efficient Frontier

  • In the matrix form, for a given risk tolerance the efficient frontier is found by minimizing this expression:

    wᵀ Σ w – q Rᵀ w

    • w is the vector of asset weights in the portfolio (∑ wi = 1)

    • Σ is the covariance matrix

    • R is the vector of expected returns

    • q is the risk tolerance, in (0, ∞)
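A minimal illustration of this objective, assuming a made-up two-asset covariance matrix and return vector, and using a coarse grid search over the budget constraint in place of a real quadratic-programming solver:

```python
# Hypothetical two-asset example: minimize w^T Sigma w - q R^T w
# over weights summing to 1 (all numbers are illustrative).
Sigma = [[0.04, 0.01],
         [0.01, 0.09]]   # covariance matrix
R = [0.06, 0.10]         # expected returns
q = 0.5                  # risk tolerance

def objective(w):
    # quadratic risk term minus the risk-tolerance-weighted return term
    quad = sum(w[i] * Sigma[i][j] * w[j] for i in range(2) for j in range(2))
    return quad - q * sum(R[i] * w[i] for i in range(2))

# Parametrizing w = [x, 1 - x] keeps the budget constraint satisfied
best_w = min(([x / 100.0, 1.0 - x / 100.0] for x in range(101)), key=objective)
```

Sweeping q from small to large values traces out portfolios along the efficient frontier, from risk-averse to return-seeking.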



What's New?

  • Parallel processing using MapReduce.

  • Covariance computation: from O(n²) to O(n)



Covariance Matrix

  • We calculate how an asset varies in response to variations in every other asset.

  • We use the means of the monthly returns of the two assets for this purpose.

  • This operation has to be done in turn for each asset.

  • In a traditional environment this is done via nested loops.

  • But the calculation is intrinsically parallel: each mapper i can calculate how asset i varies in response to variations in all of the other assets.

  • The input to each mapper is the current asset and the list of all assets.

  • Its output is a vector containing the covariances of that asset with respect to the other assets.

  • The reducer simply inserts the result of each map operation into the covariance matrix table.

  • As all the mappers execute in parallel, this gives a run time of O(n).
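The mapper/reducer division of labor described above can be simulated in plain Python. The toy return series are invented for illustration, and population covariance is used for simplicity:

```python
from statistics import mean

# Toy monthly-return series for three hypothetical assets
returns = {
    "A": [0.01, 0.03, -0.02, 0.04],
    "B": [0.02, 0.01, 0.00, 0.03],
    "C": [-0.01, 0.02, 0.01, 0.00],
}

def covariance(xs, ys):
    # Population covariance using the means of both series
    mx, my = mean(xs), mean(ys)
    return mean((x - mx) * (y - my) for x, y in zip(xs, ys))

def mapper(asset, all_returns):
    # Each mapper gets one asset plus the list of all assets and
    # emits that asset's full row of the covariance matrix.
    return asset, {other: covariance(all_returns[asset], series)
                   for other, series in all_returns.items()}

# The mappers are independent (parallel on a real cluster); the
# "reducer" just assembles the emitted rows into the matrix table.
cov_matrix = dict(mapper(a, returns) for a in returns)
```

Each row is computed with no reference to any other row, which is what lets the n mappers run side by side instead of in nested loops.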



Inverse of Covariance Matrix

  • Using first principles:

    A⁻¹ = adj(A) · (1 / det(A))

  • This method requires calculating the transpose, determinant, cofactors, adjoint, and upper triangle of a matrix.

  • Some of these operations, like the transpose, can be easily implemented on Hadoop.

  • Others, like the determinant and upper triangle, have data dependencies and are therefore not well suited to a Hadoop-like environment.
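For intuition, the first-principles formula is easy to write down in the 2×2 case; this is a hypothetical helper for illustration only, not something that would be used at scale:

```python
def inverse_2x2(a, b, c, d):
    # Inverse of [[a, b], [c, d]] from first principles:
    # A^-1 = adj(A) * (1 / det(A))
    det = a * d - b * c
    if det == 0:
        raise ValueError("singular matrix")
    # The adjoint of [[a, b], [c, d]] is [[d, -b], [-c, a]]
    return [[d / det, -b / det],
            [-c / det, a / det]]
```

Already at 2×2 the cofactor bookkeeping is visible; for large matrices the determinant's data dependencies are what make this approach a poor fit for MapReduce.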



Inverse of Covariance Matrix (contd.)

  • Gauss-Jordan elimination

  • A form of Gaussian elimination that puts zeroes both above and below each pivot element as it goes from the top row of the given matrix to the bottom.

    Row-reducing the augmented matrix [A | I] yields [I | A⁻¹].

  • Gauss-Jordan elimination has a runtime complexity of O(n³).

  • For MR it requires two sets of jobs in tandem, resulting in poor performance.
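A sequential sketch of Gauss-Jordan inversion in plain Python, with partial pivoting added for numerical stability (illustrative; the point of the slide is that the row-by-row dependencies make this hard to parallelize well on MapReduce):

```python
def gauss_jordan_inverse(A):
    # Row-reduce the augmented matrix [A | I] to [I | A^-1],
    # zeroing entries both above and below each pivot.
    n = len(A)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        # Partial pivoting: bring the largest remaining entry up
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]          # normalize pivot row
        for r in range(n):
            if r != col:                               # eliminate above AND below
                f = aug[r][col]
                aug[r] = [x - f * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]                    # right half is A^-1
```

Each pivot step depends on the result of the previous one, which is exactly the data dependency that forces tandem MapReduce passes.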



Inverse of a Square Covariance Matrix

  • The third approach is to use singular value decomposition (SVD); this is the approach we use for our implementation. By SVD, the matrix A can be written as

    A = U ∑ Vᵗ

  • where U and V are unitary matrices and ∑ is a rectangular diagonal matrix of the same size as A.

  • Using the Jacobi eigenvalue algorithm, if A is square and invertible then the inverse of A is given by

    A⁻¹ = V ∑⁻¹ Uᵗ
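On a toy matrix, the SVD-based inverse looks like this, with NumPy's SVD standing in for the Hama/Jacobi machinery used on the cluster:

```python
import numpy as np

# A = U S V^T  =>  A^-1 = V S^-1 U^T, valid when A is square and all
# singular values are nonzero. Toy symmetric matrix for illustration;
# on the cluster the factorization comes from the Jacobi eigenvalue
# algorithm in Hama rather than np.linalg.svd.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])

U, s, Vt = np.linalg.svd(A)              # s holds the singular values
A_inv = Vt.T @ np.diag(1.0 / s) @ U.T    # V S^-1 U^T
```

Inverting ∑ is trivial (reciprocal of each diagonal entry), so the heavy lifting is entirely in producing the factorization.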



Inverse of Covariance Matrix Using MR

  • We need two MapReduce jobs to implement this on Hadoop.

  • In the first job, each map task receives a row as its key and a vector of all the other rows as its value.

  • This map emits (block id, sub-vector) pairs.

  • The reduce task merges the block structures based on the block id.

  • In the second job, each mapper receives a block id as its key and two sub-matrices A and B as its value.

  • The mapper multiplies the two matrices.

  • Since A is a symmetric matrix, Aᵗ·A = A·Aᵗ. The reducer computes the sum of all the blocks.
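The sum-of-partial-products idea behind block multiplication can be mimicked in plain Python using 1×1 blocks (individual entries). This is not the paper's exact two-job pipeline, just the map/reduce shape of it; real jobs would ship larger sub-matrices per key:

```python
from collections import defaultdict

# Toy matrices to multiply, C = A * B
A = [[1, 2],
     [3, 4]]
B = [[5, 6],
     [7, 8]]
n = 2

# "Map" phase: for every (i, j, k), emit key (i, j) with the partial
# product A[i][k] * B[k][j]. Each emission is independent.
emitted = [((i, j), A[i][k] * B[k][j])
           for i in range(n) for j in range(n) for k in range(n)]

# "Reduce" phase: sum all partial products that share an output cell.
cells = defaultdict(int)
for key, value in emitted:
    cells[key] += value
C = [[cells[(i, j)] for j in range(n)] for i in range(n)]
```

Grouping entries into b×b blocks instead of single cells is the same pattern with less shuffle traffic, which is why the blocked variant is preferred on a cluster.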



Expected Returns Matrix Using MR

  • The expected returns matrix can be easily built on the Hadoop platform.

  • Each mapper computes the expected return of a particular asset.

  • All these mappers can run in parallel, giving a run time of O(1) (with one mapper per asset) as opposed to the O(n) we would get in a traditional sequential environment.
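As a sketch with toy data, the per-asset mapper is just a mean over that asset's monthly returns:

```python
from statistics import mean

# One mapper per asset computes that asset's mean monthly return;
# on the cluster all these mappers run in parallel (toy series).
monthly_returns = {
    "A": [0.01, 0.03, -0.02],
    "B": [0.02, 0.00, 0.04],
}
expected_returns = {asset: mean(series)
                    for asset, series in monthly_returns.items()}
```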



Multiply Variance Inverse with Returns Matrix

  • Use the block multiplication algorithm in the MR framework.

  • The next step is making each negative entry of a row positive; MR turns this O(n²) algorithm into an O(n) one.

  • The entries in the row are then sorted, once again using MR.
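The tail of the pipeline on a toy row (in the MR version, the sign-flip and the sort would each be their own pass over the data):

```python
# Toy row of the product matrix (illustrative values)
row = [0.12, -0.05, 0.30, -0.22]

abs_row = [abs(x) for x in row]           # make each negative entry positive
ranked = sorted(abs_row, reverse=True)    # then sort the entries
```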



Simulation Using Hadoop Framework

  • Besides the standard Hadoop package, we also used two other packages: HBase [3][10] and Hama [4].

  • Hama is a parallel matrix computation package based on Hadoop MapReduce.

  • Hama builds on HBase's 3-dimensional data model of row, column (qualifier), and time, and its multi-dimensional column families, and utilizes 2D blocked algorithms.

  • We used HBase for storing the matrices, and the Hama package for matrix multiplication, matrix transpose, and the Jacobi eigenvalue algorithm.

  • Computationally, the simulation showed that run time did not increase linearly with the size of the data.


