Applications of Slow Intelligence Systems

Explore the various applications of Slow Intelligence Systems (SIS) including social influence analysis, product and service optimization, topic and trend detection, and high dimensional feature selection. Discover how these systems can be used to analyze social networks, predict influential nodes, personalize products and services, detect and track hot topics, and select the most relevant features.

karenm

Presentation Transcript


  1. Applications of Slow Intelligence Systems

  2. Outline • Application: Social Influence Analysis • Application: Product & Service Optimization • Application: Topic/Trend Detection • Application: High Dimensional Feature Selection • Discussion

  3. Outline • Application: Social Influence Analysis • Application: Product & Service Optimization • Application: Topic/Trend Detection • Application: High Dimensional Feature Selection • Discussion

  4. Application to Social Influence Analysis In large social networks, nodes (users, entities) are influenced by others for many different reasons. How to model diffusion processes over a social network, and how to predict which nodes will influence which other nodes, has been an active research topic recently, and researchers have proposed many algorithms. Our objective in applying SIS technology is to use these algorithms and evolutionarily select the best one, with the most appropriate parameters, for social influence analysis.

  5. The Social Influence Analysis SIS System The input data stream is first processed by the Pre-Processor. The Enumerator then invokes the super-component that creates the various social influence analysis algorithms, such as Linear Threshold LIM, Susceptible-Infective-Susceptible (SIS), Susceptible-Infective-Recovered (SIR), and Independent Cascading. The Tester collects and presents the test results.
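
As an illustration of one algorithm the Enumerator could create, here is a minimal sketch of a single Independent Cascade run. The graph representation, the function name `independent_cascade`, and the single uniform edge probability `prob` are assumptions made for this example and do not come from the slides.

```python
import random


def independent_cascade(graph, seeds, prob, rng=None):
    """Simulate one Independent Cascade run and return the set of active nodes.

    graph: dict mapping each node to a list of its out-neighbours (assumed format)
    seeds: nodes that are active at the start
    prob:  activation probability, assumed identical for every edge
    """
    rng = rng or random.Random()
    active = set(seeds)
    frontier = list(seeds)
    while frontier:
        next_frontier = []
        for node in frontier:
            for neighbour in graph.get(node, []):
                # A newly activated node gets one chance to activate
                # each neighbour that is still inactive.
                if neighbour not in active and rng.random() < prob:
                    active.add(neighbour)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return active


# Toy graph, made up for illustration only.
toy_graph = {"a": ["b", "c"], "b": ["d"], "c": ["d"], "d": []}
print(sorted(independent_cascade(toy_graph, ["a"], prob=1.0)))
```

Averaging the size of the returned set over many runs gives an estimate of the influence spread of a seed set, which is one quantity a Tester could report.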

  6. LIM results for concept 1 and concept 3 with two combinations of parameters on the Plurk dataset

  7. LIM results for concept 1 and concept 3 with two combinations of parameters on the Facebook dataset

  8. The SIA/SIS System The Timing Controller restarts the social influence analysis (SIA) cycle with a different SIA super-component, such as the Heat Diffusion algorithms, or with a different pre-processor. The Eliminator eliminates the inferior SIA algorithms, and the Concentrator selects the optimal SIA algorithm.
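
The enumerate, test, eliminate, and concentrate cycle can be sketched roughly as follows. This is only a sketch under assumptions: the function `run_sis_cycle`, the `keep` parameter, and the scores in the usage example are hypothetical and are not results from the presentation.

```python
def run_sis_cycle(algorithms, score, keep=2):
    """One pass of an enumerate / test / eliminate / concentrate cycle.

    algorithms: iterable of candidate algorithm names (Enumerator output)
    score:      callable returning a quality score for a name, higher is better
                (stands in for the Tester; assumed interface)
    keep:       number of candidates that survive elimination (assumed parameter)
    """
    # Tester: evaluate every enumerated candidate.
    results = {name: score(name) for name in algorithms}
    # Eliminator: discard the inferior candidates.
    ranked = sorted(results, key=results.get, reverse=True)
    survivors = ranked[:max(1, keep)]
    # Concentrator: settle on the single best survivor.
    return survivors[0], survivors, results


# Made-up scores, used only to exercise the function.
fake_scores = {
    "linear_threshold": 0.61,
    "independent_cascade": 0.74,
    "sir": 0.58,
    "heat_diffusion": 0.69,
}
best, survivors, _ = run_sis_cycle(fake_scores, fake_scores.get, keep=2)
print(best, survivors)
```

A timing controller would call such a function repeatedly, each time with a different candidate list or a different pre-processor.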

  9. Outline • Application: Social Influence Analysis • Application: Product & Service Optimization • Application: Topic/Trend Detection • Application: High Dimensional Feature Selection • Discussion

  10. SIS Application to Product Configuration Production of personalized or custom-tailored goods or services to meet consumers' diverse and changing needs

  11. Figure 6 - Ontological Filter and the Slow Intelligence System

  12. A Scenario • A customer would like to buy a personal computer to play video games and surf the internet. • He knows that he needs an operating system, a web browser, and an antivirus package. • In particular, he prefers a Microsoft Windows operating system. He lives in the United States and prefers a desktop. He also prefers low-cost components.
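
The scenario above can be read as a small constraint-filtering problem. The sketch below treats the stated preferences as hard filters and then picks the cheapest match. The `Config` record, the `configure` function, and every catalog entry are hypothetical, and a real configurator would more likely work from an ontology than from a flat list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Config:
    name: str
    os: str
    form_factor: str
    regions: tuple
    software: frozenset
    price: float


def configure(catalog, os, form_factor, region, required_software):
    """Return the cheapest catalog entry meeting all requirements, or None."""
    feasible = [
        c for c in catalog
        if c.os == os
        and c.form_factor == form_factor
        and region in c.regions
        and required_software <= c.software  # all required packages included
    ]
    return min(feasible, key=lambda c: c.price) if feasible else None


# Hypothetical catalog, invented for illustration.
CATALOG = [
    Config("budget-desktop", "windows", "desktop", ("US", "CA"),
           frozenset({"browser", "antivirus"}), 499.0),
    Config("gaming-desktop", "windows", "desktop", ("US",),
           frozenset({"browser", "antivirus"}), 1299.0),
    Config("ultrabook", "windows", "laptop", ("US",),
           frozenset({"browser", "antivirus"}), 899.0),
    Config("linux-box", "linux", "desktop", ("US",),
           frozenset({"browser"}), 399.0),
]

choice = configure(CATALOG, "windows", "desktop", "US", {"browser", "antivirus"})
print(choice.name if choice else "no feasible configuration")
```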

  13. Ontological Transform for Product Configurator

  14. Outline • Application: Social Influence Analysis • Application: Product & Service Optimization • Application: Topic/Trend Detection • Application: High Dimensional Feature Selection • Discussion

  15. Topic Detection and Tracking (TDT) System Overview • Detect current hot topics and predict future hot topics based on data collected from the internet • The TDT system consists of: • Crawler & Extractor: collects the latest data from the internet for the user’s needs; restricts the range of data collection from web data (focused crawler) • Topic Extractor: discovers current hot topics from a set of text documents • Topic Detector: predicts hot topics

  16. Topic/Trend Detection System • Crawler & Extractor (diagram) • Components shown: Social Media, User’s Keywords of Interest, Web Crawler, HTML documents, Information Extractor, Text documents, Web data DB, Topic Extractor • The Information Extractor extracts articles and metadata (title, author, content, etc.) from semi-structured web content

  17. Focused Crawler: Classification • Taxonomy Creation: Yahoo!, Open Directory Project • Example Collection: URLs, browsing • Taxonomy Selection and Refinement: the system proposes the most common classes; the user marks them as GOOD; the user changes the trees • Interactive Exploration: the system proposes URLs found in a small neighborhood of the examples; the user examines and includes some of these examples • Training: integrate refinements into the statistical class model (classifier-specific action)

  18. Focused Crawler: Distillation • Distillation: identify relevant hubs by running a topic distillation algorithm; raise the visit priorities of hubs and their immediate neighbors • Feedback: report the most popular sites and resources; mark results as useful/useless; send feedback to the classifier and distiller
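
Raising the visit priority of hubs and their immediate neighbors can be sketched with a priority queue as the crawl frontier. The `Frontier` class, the `boost_hubs` helper, and the numeric priorities are assumptions for illustration; the slides do not say how the crawler stores its frontier.

```python
import heapq


class Frontier:
    """Crawl frontier that always pops the URL with the highest priority."""

    def __init__(self):
        self._heap = []       # entries are (-priority, url)
        self._priority = {}   # current priority of each queued URL

    def add(self, url, priority):
        # Only raise priorities; outdated heap entries are skipped in pop().
        if priority > self._priority.get(url, float("-inf")):
            self._priority[url] = priority
            heapq.heappush(self._heap, (-priority, url))

    def pop(self):
        while self._heap:
            neg_priority, url = heapq.heappop(self._heap)
            if self._priority.get(url) == -neg_priority:
                del self._priority[url]
                return url
        return None


def boost_hubs(frontier, hubs, links, boost):
    """Raise the priority of each hub and of the pages it links to."""
    for hub in hubs:
        frontier.add(hub, boost)
        for neighbour in links.get(hub, []):
            frontier.add(neighbour, boost)


frontier = Frontier()
frontier.add("x", 1)
frontier.add("y", 2)
boost_hubs(frontier, hubs=["h"], links={"h": ["x"]}, boost=5)
order = [frontier.pop() for _ in range(4)]
print(order)
```

In this toy run the hub `h` and its neighbor `x` are visited before `y`, even though `x` originally had the lowest priority.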

  19. Extractor Given a Web page: • Build the HTML tag tree • Mine data regions (mining data records directly is hard) • Identify data records from each data region • Learn the structure of a general data record (a data record can contain optional fields) • Extract the data
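
The first two steps above can be sketched with Python's standard `html.parser` module. The tag tree part is straightforward. The "data region" part is only a crude stand-in that flags parents with repeated child tags, and it should not be taken as the mining method used in the presentation. The names `TagTreeBuilder`, `build_tag_tree`, and `find_data_regions` are hypothetical.

```python
from collections import Counter
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class TagTreeBuilder(HTMLParser):
    """Build a nested dict tree of tags from an HTML string."""

    def __init__(self):
        super().__init__()
        self.root = {"tag": "#root", "children": []}
        self._stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = {"tag": tag, "children": []}
        self._stack[-1]["children"].append(node)
        if tag not in VOID_TAGS:  # void elements never get children
            self._stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating unclosed elements.
        for i in range(len(self._stack) - 1, 0, -1):
            if self._stack[i]["tag"] == tag:
                del self._stack[i:]
                break


def build_tag_tree(html):
    builder = TagTreeBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def find_data_regions(node, min_repeats=2):
    """Return (parent tag, child tag, count) for parents with repeated child tags."""
    counts = Counter(child["tag"] for child in node["children"])
    regions = [(node["tag"], tag, n) for tag, n in counts.items() if n >= min_repeats]
    for child in node["children"]:
        regions.extend(find_data_regions(child, min_repeats))
    return regions


page = "<ul><li>A</li><li>B</li><li>C</li></ul><p>footer</p>"
tree = build_tag_tree(page)
print(find_data_regions(tree))
```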

  20. TDT Petri Net Simulation Topic Detection and Tracking

  22. Crawler

  23. Initial State

  24. Accept user input

  25. Validate user input

  26. Refine user input

  27. Train the system

  28. Detect most popular topic

  29. Extractor

  30. Extractor activated

  31. Generate HTML tag trees

  32. Detect important data

  33. Train the system with record

  34. Extract data

  35. Save data into knowledge base

  36. Topic Detection and Tracking

  37. Slow Intelligence Steps (shown in blue) • Accept user request • Send request data to TDT • Enumerator generates combinations • Eliminator selects the best method to fit our need • Evaluate combinations • Use concentrator to highlight the selected results • Send the result to TDT • Generate the instructions to the server • Dispatcher gets the instruction • Decide where we are going to send the instructions • Send the instructions to the server • End of simulation run

  38. Outline • Application: Social Influence Analysis • Application: Product & Service Optimization • Application: Topic/Trend Detection • Application: High Dimensional Feature Selection • Discussion

  39. Introduction High-dimensional feature selection is a hot topic in statistics and machine learning. The goal is to model the relationship between one response $y$ and its associated features $x_1, \dots, x_p$, based on a sample of size $n$.

  40. Math formulation Let $y = (y_1, \dots, y_n)$ be a vector of responses and $x_1, \dots, x_n$ be their associated covariate vectors, where $x_i \in \mathbb{R}^p$. For the classification problem, where each $y_i$ is a binary class label, we assume a logistic model. We estimate the regression coefficient $\beta$ and the bias $\beta_0$ by minimizing the loss function.
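
The formulas on this slide did not survive in the transcript. For orientation only, a standard way to write a logistic model and its loss is given below. The label coding $y_i \in \{-1, +1\}$, the symbols $\beta$ and $\beta_0$, and the $1/n$ scaling are assumptions and may differ from what the slide actually showed.

```latex
% Assumed notation: y_i \in \{-1, +1\}, x_i \in \mathbb{R}^p,
% regression coefficient \beta \in \mathbb{R}^p, bias \beta_0 \in \mathbb{R}.
P(y_i \mid x_i) = \frac{1}{1 + \exp\!\big(-y_i(\beta^{\top} x_i + \beta_0)\big)}

% Average negative log-likelihood, minimized over (\beta, \beta_0):
\ell(\beta, \beta_0) = \frac{1}{n} \sum_{i=1}^{n}
  \log\!\Big(1 + \exp\!\big(-y_i(\beta^{\top} x_i + \beta_0)\big)\Big)
```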

  41. Application • Supervised learning: the gene selection problem in bioinformatics • One wants to eliminate irrelevant genes (features) to obtain a robust classifier. • One wants to know which genes are the most critical factors in the disease. • Data: n samples (patients or healthy subjects), each with p gene expression levels • Output: the important genes selected

  42. Challenges • Dimensionality grows rapidly with interactions of the features. • Portfolio selection and network modeling: 2000 stocks involve over 2 million unknown parameters in the covariance matrix. • Protein-protein interaction: the sample size may be on the order of thousands, but the number of features can be on the order of millions. • Goal: construct effective methods to learn relationships between features and responses in high dimensions for scientific purposes.
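
The "over 2 million" figure for 2000 stocks follows from counting the distinct entries of a symmetric covariance matrix, p(p + 1)/2. A quick check (the function name is chosen here for illustration):

```python
def covariance_parameters(p):
    """Number of distinct entries in a symmetric p x p covariance matrix."""
    return p * (p + 1) // 2


print(covariance_parameters(2000))  # 2001000
```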

  43. Feature Selection Approach • Main SIS procedure: main_Enumerator, main_Eliminator, main_Adaptator, main_Propagator, main_Concentrator, time controller • Sub procedure: sub_enumerator, sub_concentrator, knowledge base

  44. Main Enumerator • Enumerate the p features. • Among these features, some are relevant to the responses while others are not.

  45. Main Eliminator • Compute the Pearson correlation between each feature and the response, rank the values from high to low, and eliminate the lowest-ranked features. • The cutoff is a pre-defined constant. • The features that remain form the selected top feature set.
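
Correlation-based screening of this kind can be sketched in a few lines of plain Python. The sketch ranks features by the absolute value of the Pearson correlation, which is an assumption: the slide does not say whether signed or absolute values are ranked. The names `pearson`, `screen_features`, and the parameter `keep` are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def screen_features(columns, response, keep):
    """Keep the `keep` features most correlated with the response.

    columns: dict mapping feature name to its list of values
    Ranking uses |r| (assumed), so strongly negative features are kept too.
    """
    ranked = sorted(
        columns,
        key=lambda name: abs(pearson(columns[name], response)),
        reverse=True,
    )
    return ranked[:keep]


# Toy data, made up for illustration.
columns = {
    "f1": [1, 2, 3, 4],
    "f2": [1, 0, 1, 0],
    "f3": [4, 3, 2, 1.5],
}
response = [1, 2, 3, 4]
print(screen_features(columns, response, keep=2))
```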

  46. Sub Enumerator • Enumerate all feature selection algorithms in the knowledge base by applying each of them to the current feature set. For each algorithm, select its top features from that set as a candidate feature set. • Knowledge Base: stores the existing candidate algorithms. • We add L1-regularized regression, elastic-net regularized regression, and forward stepwise regression. In principle, any feature selection algorithm can be put into the knowledge base.

  47. Sub Concentrator • For each candidate feature set, we compute the loss function and choose the best algorithm, the one with the minimum loss. • The sub-system then selects that algorithm's features from the current feature set. • The resulting feature set is passed on to the main procedure.
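
The selection step of the sub-procedure reduces to an arg-min over the candidate feature sets. In the sketch below, `sub_concentrator` and the `loss` callable are assumed interfaces, and the toy loss and candidate sets are invented. In practice the loss would come from fitting the model on each feature set.

```python
def sub_concentrator(candidate_sets, loss):
    """Pick the algorithm whose selected features give the smallest loss.

    candidate_sets: dict mapping algorithm name to its list of selected features
    loss:           callable taking a feature list and returning a float (assumed)
    """
    losses = {name: loss(features) for name, features in candidate_sets.items()}
    best = min(losses, key=losses.get)
    return best, candidate_sets[best], losses


# Invented candidates and a toy loss that simply rewards useful features.
usefulness = {"g1": 3.0, "g2": 1.0, "g3": 2.0, "g4": 0.5}
candidates = {
    "l1": ["g1", "g2"],
    "elastic_net": ["g1", "g3"],
    "forward_stepwise": ["g2", "g4"],
}
toy_loss = lambda features: -sum(usefulness[f] for f in features)

best_algorithm, best_features, _ = sub_concentrator(candidates, toy_loss)
print(best_algorithm, best_features)
```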

  48. Main Adaptor • For each of the other features among the total p features, we add it to the feature set chosen by the sub-procedure and compute the loss function.

  49. Main Concentrator • Rank all candidate features by their loss from low to high, and select the top features with the smallest loss.

  50. Main Propagator • Add these top features to the current feature set to form the new feature set.
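
The Main Adaptor, Main Concentrator, and Main Propagator steps can be sketched together as one function. The name `adapt_and_propagate`, the `top` parameter, and the toy loss are assumptions for illustration. The slides define the loss through the fitted model, which is not reproduced here.

```python
def adapt_and_propagate(selected, all_features, loss, top):
    """Grow a feature set by the `top` features that lower the loss the most.

    selected:     current feature list (for example, the sub-procedure's output)
    all_features: all p feature names
    loss:         callable taking a feature list and returning a float (assumed)
    top:          how many features to add (assumed parameter)
    """
    selected = list(selected)
    others = [f for f in all_features if f not in selected]
    # Main Adaptor: loss obtained by adding each remaining feature on its own.
    trial_loss = {f: loss(selected + [f]) for f in others}
    # Main Concentrator: rank from low to high loss and keep the best few.
    best = sorted(others, key=trial_loss.get)[:top]
    # Main Propagator: merge them into the new feature set.
    return selected + best


# Toy loss, invented for illustration: more useful features give lower loss.
weights = {"a": 3.0, "b": 1.0, "c": 2.0, "d": 0.5}
toy_loss = lambda features: -sum(weights[f] for f in features)

new_set = adapt_and_propagate(["a"], list(weights), toy_loss, top=1)
print(new_set)
```

A time controller could repeat this step, feeding each new feature set back into the sub-procedure until the loss stops improving.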
