This system uses internet data mining and machine learning to analyze stock market events, correlations, and influences in real time. It offers functions such as event recognition, data mining, impact analysis, and event clustering. Using natural language processing and internet data mining, it provides insights for stock predictions and for the impact of hot events on stock prices.
Lietu Search Engine Realtime Financial Monitoring and Analysis System May 2010
Agenda
• Status Quo Analysis
• Software Design
• Software System Structure
• System Advantage
• Successful Application Cases
Realtime Financial Monitoring and Analysis System
Stock market analysis depends on large volumes of real-time information, and internet content has exactly these characteristics. The internet carries far more valuable information than can be read by hand, which makes machine-based online content mining necessary. Mining online financial information draws on internet technology, vertical search engines, information theory, machine learning, natural language processing, finance, and quantitative analysis. It is a new cross-domain research field and has become a popular and important area of data mining.
Major Applications
• Stock recommendation: Take the flammable ice concept as an example. Many people do not know which stocks might benefit from it, and once the event has played out, the stock price has already risen.
• Analysis of stock price rises: an event-driven mechanism model.
• Analysis of event correlation: analysis of the impact of hot events on each stock and the degree of that impact. For example, "Dubai" has a negative impact on 振华重工 (Zhenhua Heavy Industries), but in the long term the "Dubai crisis" is temporary. Once the price of 振华重工 has fallen far enough to stabilize and the "Dubai crisis" fades, the stock can be bought.
• Analysis of bulk commodity prices that affect stock prices.
Main Functions
The realtime financial monitoring and analysis system uses internet search engine and machine learning technologies to analyze, in real time, the events that affect stock prices, to mine the correlation information that influences the listed companies concerned, and to provide criteria for real-time stock market predictions. Main functions include:
1. Recognition and clustering of information that affects stock prices, including policies and unexpected events.
2. Data mining of correlations between listed companies, covering mutual shareholdings and industry chain analysis.
3. Analysis and evaluation of the impact of influential events on stock clusters and individual stocks.
4. Analysis of stock clusters and individual stocks, identifying the underlying reasons a hot stock cluster forms and supporting continuous tracking.
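As an illustration of the event clustering function, the following is a minimal sketch and not the system's actual method. It assumes headlines are already split into space-separated tokens and groups them in a single pass by Jaccard token overlap. The function names, the sample headlines, and the 0.2 threshold are all hypothetical.

```python
def tokenize(text):
    """Lowercase a headline and return its set of space-separated tokens."""
    return set(text.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def cluster_headlines(headlines, threshold=0.2):
    """Single-pass clustering: attach each headline to the most similar
    existing cluster, or start a new cluster if none is similar enough."""
    clusters = []  # each cluster: {"tokens": set of tokens, "items": list of headlines}
    for headline in headlines:
        tokens = tokenize(headline)
        best, best_sim = None, 0.0
        for cluster in clusters:
            sim = jaccard(tokens, cluster["tokens"])
            if sim > best_sim:
                best, best_sim = cluster, sim
        if best is not None and best_sim >= threshold:
            best["items"].append(headline)
            best["tokens"] |= tokens  # the cluster vocabulary grows as items join
        else:
            clusters.append({"tokens": set(tokens), "items": [headline]})
    return [cluster["items"] for cluster in clusters]


# Hypothetical sample headlines, for illustration only.
sample = [
    "dubai debt crisis hits markets",
    "dubai crisis spreads to banks",
    "flammable ice reserves found",
]
for group in cluster_headlines(sample):
    print(group)
```

A single pass keeps the sketch suitable for a stream of incoming news, at the cost of making the result depend on arrival order.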
Technology rationale
• Natural language processing technology: A Chinese text appears as a string of Chinese characters. Characters form words, words form phrases, phrases form sentences, and so on up to paragraphs and chapters. At every level (character, word, phrase, sentence, paragraph), the same unit can carry different meanings in different syntactic contexts. In general, almost all of these can be resolved correctly from context without ambiguity, which is why people communicate fluently in natural language. For a computer, however, eliminating ambiguity requires a very large body of knowledge and reasoning. Collecting and selecting that knowledge, storing it in a suitable form in a computer system, and using it effectively to eliminate ambiguity is an immense and laborious task.
• Internet data mining: Internet data mining combines web crawling and full-text retrieval with artificial intelligence, pattern recognition, and neural networks. It differs significantly from an internet search engine in that it can automatically interpret a user's information need expressed in natural language and find the best match through pattern matching.
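The segmentation ambiguity described above can be shown with a small sketch. It uses dictionary-based maximum matching, a common textbook technique that is swapped in here for illustration and is not confirmed as the system's method. The four-word vocabulary and the sample sentence are assumptions chosen to expose the ambiguity.

```python
def forward_max_match(text, vocab, max_len=3):
    """Scan left to right, always taking the longest dictionary word;
    fall back to a single character when nothing matches."""
    words, i = [], 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in vocab:
                words.append(piece)
                i += size
                break
    return words


def backward_max_match(text, vocab, max_len=3):
    """Same idea, but scanning right to left."""
    words, j = [], len(text)
    while j > 0:
        for size in range(min(max_len, j), 0, -1):
            piece = text[j - size:j]
            if size == 1 or piece in vocab:
                words.append(piece)
                j -= size
                break
    words.reverse()
    return words


# Hypothetical toy dictionary: "study", "graduate student", "life", "origin".
VOCAB = {"研究", "研究生", "生命", "起源"}
SENTENCE = "研究生命起源"  # "study the origin of life"

print(forward_max_match(SENTENCE, VOCAB))   # ['研究生', '命', '起源']
print(backward_max_match(SENTENCE, VOCAB))  # ['研究', '生命', '起源']
```

The two scans disagree on the same string, and only the backward result matches the intended reading. Choosing between such candidates is where the additional knowledge and reasoning come in.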
Technology rationale (continued)
• Text tendency analysis: Text tendency analysis has become a prominent topic in natural language processing and machine learning. It captures the emotional tendency of a text's writer. In finance, public opinion in the news is a critical indicator of the views and emotions of investors, traders, and regulators, and it is as important as other financial statistics such as trading statistics and macro statistics.
According to the current literature, text tendency analysis can be approached in two ways. The first computes a value for a text: the sign represents the text's tendency, and the absolute value represents the text's influence. The second classifies a text into a category, roughly positive or negative. Seen this way, text tendency analysis is both a statistics problem and a classification problem, and many researchers now treat it as a classification problem. Beyond title classification, text tendency classification deserves more attention. It can be approached in two ways: machine learning and semantic analysis. Text tendency analysis is currently applied to English, Chinese, and Arabic.
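The two views of text tendency analysis described above can be sketched together. This minimal example assumes a tiny hand-made lexicon of weighted English words, which is hypothetical and far smaller than anything usable in practice. The signed score corresponds to the first approach, and the label derived from its sign corresponds to the second.

```python
# Hypothetical weighted lexicon: positive weights for favorable words,
# negative weights for unfavorable ones. Illustration only.
LEXICON = {
    "surge": 2.0,
    "gain": 1.0,
    "upgrade": 1.5,
    "crisis": -2.0,
    "loss": -1.0,
    "default": -2.0,
}


def tendency_score(text):
    """Sum the lexicon weights of the words in the text.
    The sign is the tendency, the absolute value is the influence."""
    total = 0.0
    for token in text.split():
        word = token.strip(".,!?;:\"'").lower()
        total += LEXICON.get(word, 0.0)
    return total


def tendency_label(score):
    """Reduce the signed score to a coarse category."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


for headline in ("Dubai crisis triggers loss", "Shares surge after upgrade"):
    score = tendency_score(headline)
    print(headline, score, tendency_label(score))
```

A lexicon lookup is the simplest form of the semantic route. The machine learning route would instead train a classifier on labeled texts.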
Innovation highlights
Main modules of the project:
• Recognition and clustering of stock-price-sensitive information: monitor policy releases and unexpected events on the internet in real time, then extract and cluster them.
• Data mining of relationships between publicly listed companies: construct the shareholding relationships and industry chain relationships between publicly listed companies, and evaluate their mutual influence within that context.
• Evaluation of the impact of stock-price-sensitive information on stock clusters and individual stocks: using the inter-company and industry chain relationships, find the stock clusters and stocks that may be influenced by a stock-price-sensitive policy or event, and calculate the degree of the event's impact on each individual stock.
• Stock cluster and individual stock analysis: for a stock-price-sensitive policy or event and the relevant industry chain relationships, provide analysis of stock clusters and individual stocks, including short-term and long-term evaluation.
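One way to picture the impact evaluation module is as score propagation over a company relationship graph. The sketch below is an assumption about how such a calculation could look, not the project's actual algorithm. The company names, link weights, hop limit, and cutoff are all hypothetical.

```python
from collections import deque


def propagate_impact(links, source, score, max_hops=2, min_abs=0.01):
    """Spread an event score from the directly affected company to related
    companies. `links` maps a company to {related company: weight}, where the
    weight stands for the strength of a shareholding or industry chain link.
    Each hop multiplies the score by the link weight, and the strongest value
    reaching a company is kept."""
    impact = {source: score}
    queue = deque([(source, score, 0)])
    while queue:
        company, value, hops = queue.popleft()
        if hops == max_hops:
            continue  # do not spread beyond the hop limit
        for related, weight in links.get(company, {}).items():
            passed = value * weight
            if abs(passed) < min_abs:
                continue  # too weak to matter
            if abs(passed) > abs(impact.get(related, 0.0)):
                impact[related] = passed
                queue.append((related, passed, hops + 1))
    return impact


# Hypothetical relationship graph, for illustration only.
links = {
    "PortOperator": {"CraneMaker": 0.5},
    "CraneMaker": {"SteelSupplier": 0.5},
}
# A negative event (score -1.0) hits the hypothetical port operator.
print(propagate_impact(links, "PortOperator", -1.0))
```

The hop limit and the cutoff keep the spread local, so a distant or weakly linked company receives little or no impact from the event.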
Module Component Structure (diagram). Components shown: Console; Program Interface; Output Interface; Classification, Clustering, Correlation Analysis, ...; Text Mining; Knowledge Database; Search Engine Management; DBMS; Search Engine Database.
Product Frame (diagram): Finance Information Mining. Labels shown, in extraction order: Backbone System; Basic Function; Applications; Text Abstraction; Text Classification; Database; Segment; Association Analysis; Page Anti-noise; Trend Analysis; Tagger; Emotion Recognition; Index; Text Clustering; Correlation Analysis; Display Functions; Report Generation; Control Console; Chart Generation; Others.
Case study
• Lietu Internet Monitor
• Lietu Enterprise Search
http://www.lietu.com Thanks!