
### Data-intensive Computing Case Study Area 2: Financial Engineering

B. Ramamurthy & Abhishek Agarwal

Modern Portfolio Theory

- Modern Portfolio Theory (MPT) is a theory of investment that seeks to maximize return and minimize risk by analytically choosing among different (financial) assets.
- MPT was introduced by Harry Markowitz in 1952; he received the Nobel Memorial Prize in Economic Sciences in 1990. He is currently a professor of finance at the Rady School of Management, UC San Diego. (83 years old)
- One of his influences was John von Neumann, a pioneer of the stored-program computer.


The Big Picture

- Stock market portfolio context:
- Given an amount A, a set of stocks {w, x, y, z}, and the historical performance of those stocks, what percentage of the amount should be allocated to each stock in the set so that return is maximized and risk is minimized?
- Example: given $10,000 and stocks {C, F, Q, T}, what is the recommended split among them for the best return at the least risk?
- Some quantitative assumptions are made about return and risk.
- The above is my simple interpretation of a complex problem.


Reference

- Reference: "Application of Hadoop MapReduce to Modern Portfolio Theory" by Abhishek Agarwal and Bina Ramamurthy, paper submitted to ICSA 2011.
- Also work by Ross Goddard (UB grad, at Drexel), and Mohit Vora and Neeraj Mahajan (Yahoo.com, alumni).


Markowitz Model

- How is it data intensive?
- The number of assets in the financial world is quite large.
- Historical data for each of these assets will easily overwhelm traditional databases.
- We would like real data on the scale of this problem. Any guess?
- We are currently working with 500,000 assets.


MPT

- A fundamental assumption of MPT is that the assets in an investment portfolio should not be selected individually.
- We need to consider how a change in the value of every other asset can affect the given asset.
- Thus MPT is a mathematical formulation of diversification in investing, with the objective of selecting a collection of investment assets that collectively has lower risk than any individual asset.
- In theory this is possible because different types of assets often change value in opposite directions.
- For example, stocks vs. bonds, tech vs. commodities.
- Thus the assets in a collection mitigate each other's risks.


Technical Details

- An asset's return is modeled as a normally distributed random variable.
- Risk is defined as the standard deviation of return.
- The return of a portfolio is a weighted combination of the assets' returns.
- By combining assets whose returns are not correlated, MPT reduces the total variance of the portfolio.


Expected Return and Variance

If E(Ri) is the expected return on asset i and wi is the weight of asset i, then the total expected return on the portfolio is

E(Rp) = ∑i wi E(Ri)

The portfolio return variance can be written as

(σp)2 = ∑i ∑j wi wj σi σj ρij

where ρij is the correlation between assets i and j; ρij = 1 for i = j.
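As a concrete illustration of these two formulas, here is a small NumPy sketch. The returns and weights are made-up numbers, not data from the paper; note that w @ Sigma @ w equals the double sum above, because Cov(i, j) = σi σj ρij.

```python
import numpy as np

# Hypothetical monthly returns for three assets (rows = months, cols = assets).
returns = np.array([
    [ 0.02,  0.01, -0.01],
    [ 0.03, -0.02,  0.00],
    [-0.01,  0.04,  0.02],
    [ 0.02,  0.00,  0.01],
])
w = np.array([0.5, 0.3, 0.2])          # portfolio weights w_i, summing to 1

mu = returns.mean(axis=0)              # E(R_i) for each asset
Sigma = np.cov(returns, rowvar=False)  # covariance matrix

exp_return = w @ mu                    # E(R_p) = sum_i w_i E(R_i)
variance = w @ Sigma @ w               # (sigma_p)^2 as a quadratic form in w
risk = np.sqrt(variance)               # portfolio risk = std deviation
```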


MPT explained


Portfolio Combination

- Compute and plot the expected returns and variances on a graph.
- The hyperbola derived from the plot represents the efficient frontier.
- A portfolio on the efficient frontier offers the best possible return for a given risk level.
- Matrices are used to calculate the efficient frontier.


Efficient Frontier

- In matrix form, for a given level of risk tolerance the efficient frontier is found by minimizing the expression

wT Σ w – q RT w

- w is the vector of asset weights in the portfolio (∑ wi = 1)
- Σ is the covariance matrix
- R is the vector of expected returns
- q is the risk tolerance, in [0, ∞)
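Under the budget constraint alone (no short-sale limits), this quadratic program has a closed-form solution via a Lagrange multiplier. A minimal NumPy sketch; the function name and test values are illustrative, not from the paper:

```python
import numpy as np

def frontier_weights(Sigma, R, q):
    """Minimize w^T Sigma w - q R^T w  subject to  sum(w) = 1.

    Stationarity of the Lagrangian gives 2 Sigma w = q R + lam * 1,
    so w = Sigma^-1 (q R + lam * 1) / 2, with lam fixed by the constraint.
    """
    inv = np.linalg.inv(Sigma)
    ones = np.ones(len(R))
    a = inv @ (q * R) / 2.0
    b = inv @ ones / 2.0
    lam = (1.0 - ones @ a) / (ones @ b)
    return a + lam * b
```

With q = 0 this reduces to the global minimum-variance portfolio Σ⁻¹1 / (1ᵀΣ⁻¹1); adding no-short-sale constraints would require a quadratic-programming solver instead.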


What’s new?

- Parallel processing using MapReduce.
- Covariance computation: from O(n²) to O(n).


Co-variance Matrix

- We calculate how an asset varies in response to variations in every other asset.
- We use the means of the monthly returns of both assets for this purpose.
- This operation has to be done in turn for each asset.
- In a traditional environment this is done via nested loops.
- But this calculation is intrinsically parallel in nature. Each mapper i can calculate how asset i varies in response to variations in all the other assets.
- The input to each mapper is the current asset and the list of all assets.
- Its output is a vector containing the covariances of that asset with respect to the other assets.
- The reducer just inserts the result of the map operation into the covariance matrix table.
- As all the mappers execute in parallel, this gives us a parallel run time of O(n).
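The mapper/reducer pair described above can be sketched in plain Python (no Hadoop cluster; the function names are illustrative and each mapper call stands for one parallel map task):

```python
import numpy as np

def covariance_mapper(asset_id, asset_returns, all_returns):
    """One map task per asset: compute row i of the covariance matrix.

    Input: the current asset plus the full set of assets, as described above.
    """
    xi = asset_returns - asset_returns.mean()
    row = {}
    for other_id, other_returns in all_returns.items():
        xj = other_returns - other_returns.mean()
        row[other_id] = float(xi @ xj) / (len(xi) - 1)  # sample covariance
    return asset_id, row

def covariance_reducer(mapper_outputs, n_assets, index):
    """Insert each emitted row into the covariance matrix table."""
    C = np.zeros((n_assets, n_assets))
    for asset_id, row in mapper_outputs:
        for other_id, value in row.items():
            C[index[asset_id], index[other_id]] = value
    return C
```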


Inverse of co-variant matrix

- Using first principles:

A⁻¹ = adjoint(A) × (1 / determinant(A))

- This method requires the calculation of the transpose, determinant, cofactors, adjoint, and upper-triangular form of a matrix.
- Some of these operations, like the transpose, can be easily implemented on Hadoop.
- Others, like the determinant and the upper-triangular reduction, have data dependencies and are therefore not very suitable for a Hadoop-like environment.


Inverse of Co-variant Matrix

- Gauss-Jordan Elimination
- A variant of Gaussian elimination that puts zeroes both above and below each pivot element as it goes from the top row of the given matrix to the bottom, reducing the augmented matrix

[A | I] → [I | A⁻¹]

- Gauss-Jordan elimination has a runtime complexity of O(n³).
- For MapReduce it requires two sets of jobs in tandem, resulting in poor performance.
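For reference, a sequential Gauss-Jordan inverse looks like this in NumPy (a textbook sketch with partial pivoting, not the paper's Hadoop implementation). Each column elimination depends on the previous one, which is exactly the sequential chain that makes it awkward to parallelize:

```python
import numpy as np

def gauss_jordan_inverse(A):
    """Row-reduce [A | I] to [I | A^-1], zeroing above and below each pivot."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])
    for col in range(n):
        pivot = col + np.argmax(np.abs(M[col:, col]))  # partial pivoting
        M[[col, pivot]] = M[[pivot, col]]              # swap pivot row up
        M[col] /= M[col, col]                          # normalize pivot to 1
        for row in range(n):
            if row != col:                             # the "Jordan" step:
                M[row] -= M[row, col] * M[col]         # clear the whole column
    return M[:, n:]
```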


Inverse of a square co-variance matrix

- The third approach, which we use for our implementation, is singular value decomposition (SVD). By the SVD theorem, the matrix A can be written as

A = U Σ Vᵀ

- where U and V are unitary matrices and Σ is a rectangular diagonal matrix of the same size as A.
- Using the Jacobi eigenvalue algorithm, if A is square and invertible then the inverse of A is given by

A⁻¹ = V Σ⁻¹ Uᵀ
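With NumPy the SVD route is a few lines. This is only a sketch of the idea; the paper's version distributes the underlying Jacobi eigenvalue computation via Hama:

```python
import numpy as np

def svd_inverse(A):
    """Invert a square, invertible A via SVD: A = U S V^T  =>  A^-1 = V S^-1 U^T."""
    U, s, Vt = np.linalg.svd(A)           # s holds the singular values
    return Vt.T @ np.diag(1.0 / s) @ U.T  # invert by reciprocating s
```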


Inverse of co-variance matrix using MR

- We need two MapReduce jobs to implement this on Hadoop.
- In the first job, each map task receives a row as a key and a vector of all the other rows as its value.
- This map emits (block id, sub-vector) pairs.
- The reduce task merges block structures based on the block id.
- In the second job, each mapper receives a block id as a key and two sub-matrices A and B as its value.
- The mapper multiplies the two matrices.
- Since A is a symmetric matrix, Aᵀ·A = A·Aᵀ. The reducer computes the sum of all the blocks.
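The two-stage pipeline can be mimicked sequentially: the map stage emits partial block products keyed by the output block's id, and the reduce stage sums the products sharing a key. A toy sketch (block size and function names are illustrative, assuming the block size divides the matrix dimension):

```python
import numpy as np

def block_multiply_map(A, B, bs):
    """Map stage: emit ((i, j), partial product) for every block pair A[i,k] x B[k,j]."""
    n = A.shape[0]
    for i in range(0, n, bs):
        for j in range(0, n, bs):
            for k in range(0, n, bs):
                yield (i, j), A[i:i+bs, k:k+bs] @ B[k:k+bs, j:j+bs]

def block_multiply_reduce(pairs, n, bs):
    """Reduce stage: sum the partial products that share a block id."""
    C = np.zeros((n, n))
    for (i, j), partial in pairs:
        C[i:i+bs, j:j+bs] += partial
    return C
```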


Expected Returns Matrix Using MR

- The expected returns matrix can be easily built on the Hadoop platform.
- Each mapper computes the expected return of a particular asset.
- All these mappers can run in parallel, giving a parallel run time of O(1) as opposed to the O(n) run time we would get in a traditional environment.
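Each map task here reduces to a one-line mean. Sketched in Python (names are illustrative; each call models one parallel mapper):

```python
import numpy as np

def expected_return_mapper(asset_id, monthly_returns):
    """One mapper per asset: emit (asset_id, mean monthly return)."""
    return asset_id, float(np.mean(monthly_returns))

def expected_returns_reducer(pairs, asset_order):
    """Assemble the emitted pairs into the expected-returns vector R."""
    by_id = dict(pairs)
    return np.array([by_id[a] for a in asset_order])
```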


Multiply the Inverse Covariance Matrix with the Returns Matrix

- Use a block multiplication algorithm in the MR framework.
- The next step, making each negative entry of a row positive: MR turns an O(n²) algorithm into an O(n) algorithm.
- Sort the entries in the row, once again using MR.


Simulation using Hadoop Framework

- Besides the standard Hadoop package we also used two other packages: HBase [3][10] and Hama [4].
- Hama is a parallel matrix computation package based on Hadoop MapReduce.
- Hama proposes the use of the 3-dimensional Row and Column (Qualifier), Time space, and multi-dimensional Column families of HBase, and utilizes 2D blocked algorithms.
- We used HBase for storing the matrices, and the Hama package for matrix multiplication, matrix transpose, and the Jacobi eigenvalue algorithm.
- Computationally, the simulation showed that run time did not increase linearly with the size of the data.

