
Identifying Critical Factors in Case-Based Prediction

R. Weber College of Information Science & Technology Drexel University. Identifying Critical Factors in Case-Based Prediction. Outline. Case-Based Prediction, Critical Factors Motivation Background: Use of Domain Knowledge Methods to Identify Critical Factors


Presentation Transcript


  1. R. Weber, College of Information Science & Technology, Drexel University • Identifying Critical Factors in Case-Based Prediction

  2. Outline • Case-Based Prediction, Critical Factors • Motivation • Background: Use of Domain Knowledge • Methods to Identify Critical Factors: gradient descent, logistic regression, feature-oriented, case-based, knowledge-based, union • Comparative Study: dataset, methodology, results • Conclusions • Future Work

  3. Case-Based Prediction • The predicted outcome can be: • Irreversible • Path of natural disasters, e.g. hurricanes, tornadoes • Reversible • Ongoing project outcome, project effort, cost; health conditions • Critical Factors: • features (feature-value pairs) that support the outcome • significant changes in their values can potentially reverse the prediction, either alone or in conjunction with changes in the values of other critical factors • Critical Success and Critical Failure Factors

  4. Motivation • Assumption: • Users are interested in prediction of reversible outcomes so they can reverse unwanted predictions • Health conditions, project/system failure • Aamodt and Nygaard (1995): • Consider the entire application context (including user’s perspective) to maximize usefulness of CBR systems • Motivation: • Case-based prediction systems that do not indicate effective and efficient ways to reverse unwanted outcomes do not take into account the user’s perspective. • Find a minimal set of critical factors that maximize the chances of reversing unwanted outcomes

  5. Background on Case-Based Prediction • ICCBR 2001: Kadoda et al. stated that design decisions depend on the dataset • FLAIRS 2002: Watson et al. evaluated different design decisions because of such bias • CBRW91: Cain, Pazzani, and Silverstein proposed EBL+CBR to improve the accuracy of case-based prediction when features outnumber cases • ICCBR03: Weber et al. confirmed the improvement in accuracy (scarce data, bias) against other CBR techniques and logistic regression

  6. Methods to Identify Critical Factors: Scope • Personalized • Methods that identify failure and success factors that are specific to the case under assessment and to its actual values • Collective • Methods that identify only the features, not their values • They provide trends based upon a community of cases. When this community consists of real-world experiences, the trends represent evidence of the importance of these factors

  7. Collective Methods • Gradient descent • Critical factors are those features whose resulting importance values are above the overall average. • Logistic regression • Critical factors are those features with the strongest correlations to the outcome; these features are then used for prediction purposes • Feature-oriented • Using LOOCV, submit a project description for prediction and observe the resulting accuracy; then submit each feature separately and take as success factors the features that produce accuracy closest to the overall accuracy of true positives, and as failure factors the ones with accuracy closest to the overall accuracy of true negatives
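The gradient-descent criterion on this slide (keep the features whose importance is above the overall average) can be sketched in a few lines. This is only an illustration: the feature names and importance values below are made up, and the weights are assumed to have been learned beforehand by some gradient-descent procedure that the slides do not detail.

```python
from statistics import mean

def above_average_features(importance):
    """Return the features whose importance exceeds the mean importance."""
    threshold = mean(importance.values())
    return {name for name, value in importance.items() if value > threshold}

# Hypothetical importance values standing in for learned feature weights.
weights = {"feature_a": 0.9, "feature_b": 0.7, "feature_c": 0.2, "feature_d": 0.2}

critical = above_average_features(weights)
print(sorted(critical))
```

Because the threshold is the average, this rule always leaves out at least one feature unless all weights are equal, in which case it selects none.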

  8. Personalized Methods • Case-based • Failure factors are feature-value pairs that co-occur in both the target case and the similar case(s) used to predict failure in the target • Knowledge-based • Submit the new case to the EBL method to identify relevance factors with the resulting prediction • In predictions of failure, the feature-values assigned relevance factors are critical failure factors • For the remaining features, we replaced the predicted outcome to assign relevance factors for the alternate outcome • Union • We combined the knowledge-based and the case-based methods by taking the union of the factors each individually identifies.
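A minimal sketch of the case-based method and the union step follows. Several things here are assumptions rather than part of the presentation: cases are plain dictionaries of symbolic features, similarity is a simple count of matching values, factors are collected from any supporting case when `k > 1`, and `knowledge_factors` is a hand-written stand-in for whatever the EBL method would return.

```python
def overlap(a, b):
    """Count the features on which two cases share the same value."""
    return sum(1 for f, v in a.items() if b.get(f) == v)

def case_based_failure_factors(target, case_base, k=1):
    """Feature-value pairs the target shares with its k nearest failed cases."""
    ranked = sorted(case_base,
                    key=lambda c: overlap(target, c["features"]),
                    reverse=True)
    supporting = [c for c in ranked[:k] if c["outcome"] == "failure"]
    factors = set()
    for case in supporting:
        factors |= {(f, v) for f, v in target.items()
                    if case["features"].get(f) == v}
    return factors

# Hypothetical target case and case base with symbolic features.
target = {"scope": "vague", "user_time": "low", "team": "small"}
case_base = [
    {"features": {"scope": "vague", "user_time": "low", "team": "large"},
     "outcome": "failure"},
    {"features": {"scope": "clear", "user_time": "high", "team": "small"},
     "outcome": "success"},
]

case_factors = case_based_failure_factors(target, case_base)

# Hypothetical output of a knowledge-based (EBL) step, written by hand here.
knowledge_factors = {("user_time", "low"), ("team", "small")}

# Union method: every factor identified by either personalized method.
union_factors = case_factors | knowledge_factors
print(sorted(union_factors))
```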

  9. Comparative Study: Dataset • Dataset • 20 out of 88 real cases of software development projects • 23 symbolic features • 12 out of 21 projects originally failed and, when submitted to EBL+CBR prediction, were predicted to fail.

  10. Comparative Study: Methodology • Methodology consists of 3 stages: • 1) Identification of critical factors • 2) Overturn • 3) Prediction
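The three stages can be read as a small loop: take the critical factors from some method, overturn them in the case, and predict again to see whether the outcome flips. The sketch below is one possible reading under stated assumptions: a 1-nearest-neighbour predictor stands in for EBL+CBR, and `alternatives` is a hypothetical mapping that says which value each overturned feature is changed to. None of these names come from the presentation.

```python
def predict(case, case_base):
    """1-NN prediction by matching-value count (assumed stand-in for EBL+CBR)."""
    best = max(case_base,
               key=lambda c: sum(case.get(f) == v
                                 for f, v in c["features"].items()))
    return best["outcome"]

def overturn(case, factors, alternatives):
    """Stage 2: copy the case, setting each critical factor to its alternative value."""
    changed = dict(case)
    for feature in factors:
        changed[feature] = alternatives[feature]
    return changed

def reversed_by(case, factors, alternatives, case_base):
    """Stage 3: predict again and report whether the outcome changed."""
    before = predict(case, case_base)
    after = predict(overturn(case, factors, alternatives), case_base)
    return before != after

# Hypothetical data; stage 1 (identification) is assumed to have produced `factors`.
case_base = [
    {"features": {"scope": "vague", "user_time": "low", "team": "large"},
     "outcome": "failure"},
    {"features": {"scope": "clear", "user_time": "high", "team": "small"},
     "outcome": "success"},
]
target = {"scope": "vague", "user_time": "low", "team": "small"}
alternatives = {"scope": "clear", "user_time": "high", "team": "large"}

factors = ["scope", "user_time"]
print(reversed_by(target, factors, alternatives, case_base))
```

Comparing how many factors each method needs before `reversed_by` returns true is one way to express the efficiency the results slides refer to.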

  11. Results for Collective Methods • GD maximizes reversal but does not minimize the set of factors • Feature-oriented is the most efficient • Methods currently used performed most poorly

  12. Results for Personalized Methods Results for Knowledge-Based Overturning

  13. Knowledge-Based Overturning • Personalized • Different methods are able to reverse a project’s prediction using different sets of factors, and one method reversed a prediction contrary to domain knowledge. • Collective • GD failed to reverse one project. When we performed knowledge-based overturning, we found that it still could not reverse that project. More interestingly, some projects were no longer reversed.

  14. Conclusions and Recommendations • Domain-specific conclusion • 2 factors were identified by all of the methods • a well-defined scope • end users having time for requirements gathering • Domain knowledge combined with contextual experiential knowledge may uncover knowledge • Define the level of reversibility of factors, e.g., using measures of efficiency of factors throughout the dataset and by project. Factors that are easy to reverse should receive priority.

  15. Future Work • Case-based framework to learn: • Weights for EBL rules • Dependencies between rules • Dependencies between factors • How to use contextual knowledge embedded in cases to reverse unwanted outcomes? • Use collective methods to identify critical factors and then use cases to assess their potential to reverse unwanted outcomes

  16. Acknowledgements • Co-authors • William Evanco, Michael Waller, June Verner • Colleagues • This and previous work • Anonymous reviewers • National Institute for Systems Test and Productivity

  17. Questions? Ideas? Comments?
