Pattern Discovery of Fuzzy Time Series for Financial Prediction - IEEE Transactions on Knowledge and Data Engineering



**Pattern Discovery of Fuzzy Time Series for Financial Prediction**

IEEE Transactions on Knowledge and Data Engineering

Presented by Hong Yancheng for COMP630P, Spring 2009

**Outline**

- Introduction and target problem
- Background knowledge and related work
- Modeling the candlestick pattern
- Candlestick pattern for financial prediction
- Experiments and applications
- Conclusion and discussion

**Problems with existing stock prediction tools**

- Many tools exist for predicting stock prices
  - Artificial neural networks, SVM, neuro-fuzzy systems, naïve Bayes, and so on
- Three major problems with these tools
  - The training process is nontrivial, and the training result cannot be reused for other targets
  - Prediction results are incomprehensible
  - It is hard for users to tune the parameters
- A gap exists between the prediction result and the investment decision
  - Improving prediction vs. making a buy/sell decision

**Target problem**

- Data preprocessing is needed before applying various techniques
  - Data mining, machine learning, and pattern recognition
- A good knowledge representation method can assist investors
  - A knowledge-based method to transform financial data into comprehensible rules and visual patterns

**Outline**

- Introduction and target problem
- Background knowledge and related work
- Modeling the candlestick pattern
- Candlestick pattern for financial prediction
- Experiments and applications
- Conclusion and discussion

**Japanese Candlestick Theory**

- Four general ways of representing stock price fluctuation
  - Original daily fluctuation
  - Single close price
  - Bar chart
  - Candlestick chart
    - More visual information

**Fuzzy Time Series**

- Fuzzy time series
  - Assume $U$ is the universe of discourse, where $U = \{x_1, x_2, \dots, x_n\}$. A fuzzy set $A_i$ of $U$ is defined by $A_i = \mu_{A_i}(x_1)/x_1 + \mu_{A_i}(x_2)/x_2 + \dots + \mu_{A_i}(x_n)/x_n$, where $\mu_{A_i}: U \to [0, 1]$ is the membership function of the fuzzy set $A_i$ and $\mu_{A_i}(x_k)$ is the degree of membership of $x_k$ in $A_i$.

**Outline**

- Introduction and target problem
- Background knowledge and related work
- Modeling the candlestick pattern
- Candlestick pattern for financial prediction
- Experiments and applications
- Conclusion and discussion

**Fuzzy candlestick pattern**

- A fuzzy candlestick pattern is composed of related fuzzy candlestick lines in a period
- A fuzzy candlestick line has seven parts
  - Sequence, open style, close style, upper shadow, body, body color, and lower shadow
- Sequence defines the location of the candlestick
- Open style and close style model the relationship between consecutive candlestick lines

**Candlestick line modeling**

- Modeling the length of the shadows and the body
  - Four linguistic variables, EQUAL, SHORT, MIDDLE, and LONG, indicate the fuzzy sets of length
  - $L_{upper} = \dfrac{high - \max(open, close)}{open} \times 100$
  - $L_{lower} = \dfrac{\min(open, close) - low}{open} \times 100$
  - $L_{body} = \dfrac{\max(open, close) - \min(open, close)}{open} \times 100$

**Candlestick line modeling**

- The membership functions of the four fuzzy sets are shown as follows
- The range is set to (0, 14) because of the price limit on Taiwan stocks

**Candlestick line modeling**

- The body color is defined by three terms: BLACK, WHITE, and CROSS
  - If $open - close > 0$, the body color is BLACK
  - If $open - close < 0$, the body color is WHITE
  - If $open - close = 0$, the body color is CROSS

**Candlestick line modeling**

- The open/close style is another important feature
  - Five linguistic variables, LOW, EQUAL_LOW, EQUAL, EQUAL_HIGH, and HIGH, indicate the fuzzy sets of open/close style

**Trend modeling**

- Two linguistic variables are used to model the trends before and after the candlestick pattern
- The previous trend is represented by a weekly candlestick line
  - Six fuzzy sets are used to define the trend
    - CROSS, EQUAL, WEAK, NORMAL, STRONG, and EXTREME
  - BEARISH and BULLISH define the body color

**Trend modeling**

- The following trend is derived from the variation of the close price: $(Close_{t+n} - Close_t) / Close_t \times 100$
  - $Close_{t+n}$ and $Close_t$ are the close prices on day $t+n$ and day $t$, respectively
  - $n$ is a user-defined parameter

**Outline**

- Introduction and target problem
- Background knowledge and related work
- Modeling the candlestick pattern
- Candlestick pattern for financial prediction
- Experiments and applications
- Conclusion and discussion

**Three major pattern recognition problems**

- Sensing problem
  - Measured values are open, close, high, and low
- Feature extraction problem
  - Fuzzy candlestick patterns
- Pattern classification problem
  - Can be determined by the user

**Forecast procedure**

- Step 1
  - Calculate the variation percentage between two close prices
  - Use the minimum increase $I_{min}$ and the maximum increase $I_{max}$ to define the universe of discourse
  - $UoD = [I_{min} - D_1, I_{max} + D_2]$
  - E.g., $I_{min} = -5.83$ and $I_{max} = 7.66$ give $UoD = [-6, 8]$
- Step 2
  - Partition the UoD into several intervals
  - E.g., partition $[-6, 8]$ into seven intervals $[-6, -4], [-4, -2], \dots, [6, 8]$

**Forecast procedure**

- Step 3
  - Define fuzzy sets on the UoD associated with the intervals from Step 2
- Step 4
  - Fuzzify the values calculated in Step 1
  - If $v \in u_x$ and $A_y$ is the fuzzy set whose maximum membership value occurs at $u_x$, then $v$ is translated to $A_y$

**Forecast procedure**

- Step 5
  - Calculate all the candlestick patterns
- Step 6
  - Refine the extracted patterns and identify important attributes
- Step 7
  - Select patterns for forecasting based on the probability $P(A_x \mid P_y)$
  - The statistic $T = Count(P_y \cap A_x) / Count(P_y)$ is used as the threshold to select the patterns

**Forecast procedure**

- Step 8
  - Forecast the following trend
  - Rule 1: if the test pattern is not found, set the variation $v$ to 0
  - Rule 2: if the test pattern is found, set the variation $v$ to the arithmetic average of the midpoints of the matched patterns
  - $Forecast = close + close \times v$
- Step 9
  - Evaluate the forecast
  - $MSE = \sum_{i} (Forecast_i - Actual_i)^2 / N$

**Outline**

- Introduction and target problem
- Background knowledge and related work
- Modeling the candlestick pattern
- Candlestick pattern for financial prediction
- Experiments and applications
- Conclusion and discussion

**Experiments and Applications**

- The experiments are conducted on the TAIEX index from 2004-01-02 to 2005-01-31 and on stock 2330 (TSMC) from 1997-10-23 to 2002-12-25

**Experiments and Applications**

- Experiment for the TAIEX index

**Experiments and Applications**

- Experiment results for TAIEX

**Problems with existing stock prediction tools**

- Three major problems with these tools
  - The training process is nontrivial, and the training result cannot be reused for other targets
  - Prediction results are incomprehensible
  - It is hard for users to tune the parameters
- A gap exists between the prediction result and the investment decision
  - Improving prediction vs. making a buy/sell decision

**Experiments and Applications**

- Experiment with 2330 (TSMC)
  - The focus is to find the buying time for the stock
  - The rule is: IF T > 0.5 and the following trend is STRONG_INCREASE or EXTREME_INCREASE THEN select the pattern
  - The 5-day return is 2.9% on average

**Experiments and Applications**

- Fuzzy modifiers can be implemented to help users tune the parameters
  - ABOVE, BELOW, PLUS, VERY, EXTREMELY, MORE_OR_LESS, SOMEWHAT, and NOT
  - E.g., STRONG_BEARISH and EXTREME_BEARISH can be merged into ABOVE STRONG_BEARISH

**Outline**

- Introduction and target problem
- Background knowledge and related work
- Modeling the candlestick pattern
- Candlestick pattern for financial prediction
- Experiments and applications
- Conclusion and discussion

**Conclusion and Discussion**

- Pros
  - A knowledge-based method to represent financial time series and to facilitate knowledge discovery
  - Comprehensible, computable, and visual
  - Can be used directly or as a data preprocessing step
- Cons
  - Time complexity
  - How many candlestick lines should make up a pattern
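
The shadow, body, and body-color definitions from the "Candlestick line modeling" slides can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Candle` class and both function names are hypothetical, and only the crisp values are computed, because the shapes of the EQUAL, SHORT, MIDDLE, and LONG membership functions are not given in this transcript.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Candle:
    """One daily candlestick line (hypothetical container, not from the slides)."""
    open: float
    high: float
    low: float
    close: float


def shadow_and_body_lengths(c: Candle) -> Tuple[float, float, float]:
    """Return (L_upper, L_lower, L_body), each as a percentage of the open price."""
    top = max(c.open, c.close)
    bottom = min(c.open, c.close)
    l_upper = (c.high - top) / c.open * 100
    l_lower = (bottom - c.low) / c.open * 100
    l_body = (top - bottom) / c.open * 100
    return l_upper, l_lower, l_body


def body_color(c: Candle) -> str:
    """BLACK if open > close, WHITE if open < close, CROSS if they are equal."""
    diff = c.open - c.close
    if diff > 0:
        return "BLACK"
    if diff < 0:
        return "WHITE"
    return "CROSS"


candle = Candle(open=100.0, high=104.0, low=97.0, close=102.0)
print(shadow_and_body_lengths(candle))  # upper 2%, lower 3%, body 2% of the open
print(body_color(candle))               # WHITE
```

A fuzzy version would pass each of the three lengths through the four membership functions over the (0, 14) range mentioned on the slide.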
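
Steps 1 to 4 of the "Forecast procedure" slides (close-price variation, universe of discourse, partitioning, and fuzzification) might look like the following Python sketch. Several details are assumptions, since the slides do not specify them: the margins D1 and D2 are chosen by rounding outward to the nearest integers, the intervals have equal width 2 as in the slide's example, fuzzy set A_y is taken to peak on interval u_y, and a value on a shared boundary is assigned to the higher interval. All function names are hypothetical.

```python
import math
from typing import List, Tuple


def close_variations(closes: List[float], n: int = 1) -> List[float]:
    """Percentage change between the close on day t and the close on day t+n."""
    return [(closes[t + n] - closes[t]) / closes[t] * 100
            for t in range(len(closes) - n)]


def universe_of_discourse(variations: List[float]) -> Tuple[int, int]:
    """Round [I_min, I_max] outward to integers (one assumed choice of D1 and D2)."""
    return math.floor(min(variations)), math.ceil(max(variations))


def partition(lo: int, hi: int, width: int = 2) -> List[Tuple[int, int]]:
    """Split [lo, hi] into equal-width intervals u_1, ..., u_k.

    If hi - lo is not a multiple of width, the last interval extends past hi.
    """
    return [(a, a + width) for a in range(lo, hi, width)]


def fuzzify(v: float, intervals: List[Tuple[int, int]]) -> str:
    """Map v to the label A_y of the interval u_y that contains it."""
    for y, (a, b) in enumerate(intervals, start=1):
        if a <= v < b:
            return f"A{y}"
    if v == intervals[-1][1]:  # the upper bound belongs to the last interval
        return f"A{len(intervals)}"
    raise ValueError(f"{v} lies outside the universe of discourse")


lo, hi = universe_of_discourse([-5.83, 1.2, 7.66])
intervals = partition(lo, hi)
print(lo, hi)                   # -6 8
print(len(intervals))           # 7
print(fuzzify(1.2, intervals))  # A4
```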
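
Steps 7 to 9 of the "Forecast procedure" slides (pattern selection by the statistic T, the two forecast rules, and the MSE) can be illustrated with a small Python sketch. It rests on assumptions that the slides leave open: a pattern is represented by any hashable key, a "matched pattern" is read as a selected rule whose pattern equals the test pattern, and the variation v is held in percent, so the code divides by 100 where the slide writes close + close * v. The names `select_rules`, `forecast`, and `mse` are hypothetical.

```python
from collections import Counter
from typing import Dict, Hashable, List, Tuple

Observation = Tuple[Hashable, str]  # (pattern P_y, following-trend label A_x)


def select_rules(history: List[Observation],
                 threshold: float) -> Dict[Hashable, List[str]]:
    """Keep rules P_y -> A_x whose T = Count(P_y and A_x) / Count(P_y) > threshold."""
    pattern_counts = Counter(p for p, _ in history)
    pair_counts = Counter(history)
    rules: Dict[Hashable, List[str]] = {}
    for (p, a), n in pair_counts.items():
        if n / pattern_counts[p] > threshold:
            rules.setdefault(p, []).append(a)
    return rules


def forecast(close: float, pattern: Hashable,
             rules: Dict[Hashable, List[str]],
             midpoints: Dict[str, float]) -> float:
    """Rule 1: unknown pattern gives v = 0. Rule 2: v = mean interval midpoint."""
    labels = rules.get(pattern)
    if not labels:
        v = 0.0
    else:
        v = sum(midpoints[a] for a in labels) / len(labels)
    return close + close * v / 100  # v is in percent here (assumption)


def mse(forecasts: List[float], actuals: List[float]) -> float:
    """Mean squared error over N forecast/actual pairs."""
    return sum((f - a) ** 2 for f, a in zip(forecasts, actuals)) / len(forecasts)


history = [("P1", "A5"), ("P1", "A5"), ("P1", "A3"), ("P2", "A2")]
rules = select_rules(history, threshold=0.5)
midpoints = {"A2": -3.0, "A3": -1.0, "A5": 3.0}  # midpoints of [-4,-2], [-2,0], [2,4]
print(rules)                                    # {'P1': ['A5'], 'P2': ['A2']}
print(forecast(100.0, "P1", rules, midpoints))  # 103.0
print(forecast(100.0, "P9", rules, midpoints))  # 100.0
print(mse([103.0, 100.0], [102.0, 101.0]))      # 1.0
```

The threshold of 0.5 mirrors the T > 0.5 rule from the TSMC experiment slide; other values are equally valid inputs to the sketch.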