Dive into innovative projects focusing on data analytics, big data, decision support systems, and parallel computing. Explore applications, algorithms, and platforms used in various domains such as healthcare and finance.
CDT PROJECTS 2013-14
John Keane, Software Systems Group
jak@cs.man.ac.uk
1. Data Analytics / Big Data
2. Parallel & Distributed Systems
3. Decision Support Systems
HAPPY TO DISCUSS
Big Data Analytics (IBM funded)
With Nenadic
CHALLENGE
• Investigate:
  • Applications: characteristics and predictability
  • Data Analytic / Machine Learning Algorithms – relatively simple so far
  • Software: Map-Reduce, Hadoop
  • Hardware: various platforms
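As a rough illustration of the Map-Reduce pattern named on this slide, the sketch below counts words in a single Python process. It is not Hadoop code: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and only mirror the three conceptual stages that a framework such as Hadoop would distribute across machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values collected for one key."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence (hypothetical driver)."""
    pairs = [pair for doc in documents for pair in map_phase(doc)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data", "Big analytics"])
```

In a real deployment the map and reduce calls would run in parallel on separate nodes; here they run one after another so that the data flow is easy to follow.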
Bio-medical data analytics
With Nenadic, Zeng, Stivaros (Consultant, RMCH)
• Adverse drug event detection (EU funded)
  • Bayesian/Fuzzy association rules algorithms
  CHALLENGE
  • Compare/contrast accuracy of prediction
• Clinical Outcome Mining (Christie Hospital)
  • Data/text-based clinical records – better diagnose and predict
  CHALLENGE
  • Illness staging; multi-modal data; changes over time
• Decision Support for Radiology (NIHR-funded)
  • Decision aid to assist better description of scans
  CHALLENGE
  • Usability; Integration with existing tools; Link to literature
Itemset Mining Algorithms
{baby nappies} -> {beer}
• Colossal itemsets:
  - Very high dimensional datasets
  - Run-time increases exponentially as average row length increases
• Minimal unique itemsets (MUI)
  SUDA: Special Unique Detection
  - “risky” records, those likely to be linked, e.g. 16 years old + widow
  - Records of most concern have many, small MUIs
  - SUDA s/w used by ONS, UK; licensed by Singaporean govt
  - Algorithm used by UN/World Bank International Household Survey
CHALLENGES:
• Data structure to represent itemsets during search process
• Search space pruning
• Algorithm: bottom-up; top-down; hybrid
• Parallelism
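To make the notion of a minimal unique itemset concrete, here is a minimal brute-force sketch, not the SUDA algorithm itself. It assumes each record is a dict of attribute/value pairs, treats an itemset as unique when exactly one record contains it, and treats it as minimal when no proper subset is unique. The function name, the `max_size` parameter and the toy records are hypothetical. The search is bottom-up and prunes any candidate that contains an already found MUI, which touches on two of the challenges listed above.

```python
from itertools import combinations


def minimal_unique_itemsets(records, target_index, max_size=None):
    """Brute-force bottom-up search for the MUIs of one record."""
    target = records[target_index]
    items = sorted(target.items())  # (attribute, value) pairs
    max_size = max_size or len(items)
    muis = []
    for size in range(1, max_size + 1):
        for combo in combinations(items, size):
            candidate = frozenset(combo)
            # Pruning: a superset of a known MUI is unique but not minimal.
            if any(mui <= candidate for mui in muis):
                continue
            # Support = number of records containing every item in the set.
            support = sum(
                1 for rec in records
                if all(rec.get(attr) == val for attr, val in combo)
            )
            if support == 1:
                muis.append(candidate)
    return muis


# Toy data (invented for illustration only).
records = [
    {"age": 16, "marital": "widowed", "sex": "F"},
    {"age": 16, "marital": "single", "sex": "F"},
    {"age": 45, "marital": "widowed", "sex": "F"},
    {"age": 45, "marital": "single", "sex": "M"},
]
risky = minimal_unique_itemsets(records, 0)
```

In this toy data no single attribute identifies the first record, but the pair age 16 + widowed does, so that pair is its only MUI. The exhaustive enumeration grows exponentially with the number of attributes, which is why data structures, pruning and parallelism matter for real datasets.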
Eco-service composition (EU funded)
with Mehandjiev, MBS
• Aims to determine conditions for achieving eco-friendly, resilient and optimal service compositions on a distributed cloud infrastructure
• Two service optimisation approaches deployed:
  1. Global: analyses end-to-end interaction between services
  2. Local: computes local optimisation by creating dynamic service chains between service provider/consumer
CHALLENGE
• Energy-efficient load balance and scheduling
HPC + Finance (EU funded, UK Government)
• High Frequency Trading
  • Flash crashes: dramatic, sudden drops in share price; describe/predict
  • Working paper: High Frequency Trading and Mini Flash Crashes http://arxiv.org/abs/1211.6667
• HPCFinance
  • New models of risk analysis (diverse data integration)
  • Role of HPC in Finance and comparison of technologies
  • Trade-off: accuracy, speed, cost comparison: Cloud; GPGPUs; FPGA (Maxeler box)
CHALLENGES: Data engineering; Analytics; Algorithms; High performance
Preference Elicitation from Pairwise Comparison
with Mikhailov, MBS; Siraj, COMSATS IIT, Pakistan
• Decision making is complex in presence of uncertainty and insufficient knowledge.
• Aim to estimate preference using pairwise comparison: PC used when unable to assign scores to available options; judgements provided may be inconsistent
• Work has proposed consistency measures and prioritization measures where revision not allowed.
• PriEsT tool now has sensitivity analysis -> best solution.
CHALLENGES
• Evolutionary approach to multi-criteria DSS
• Work on preference elicitation model and tool
• Group decision making
• Bridge PriEsT and R (popular data mining tool) via XMCDA
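A small sketch may help show what estimating preferences from pairwise comparisons involves. It uses the row geometric mean, a common textbook prioritization method, and a Saaty-style consistency index; neither is claimed to be the method implemented in PriEsT or proposed in this work. The function names and the example judgements are hypothetical. Entry `matrix[i][j]` states how many times option i is preferred to option j, and the matrix is assumed reciprocal, so `matrix[j][i] == 1 / matrix[i][j]`.

```python
import math


def geometric_mean_priorities(matrix):
    """Estimate normalised priority weights from a reciprocal matrix."""
    n = len(matrix)
    row_means = [math.prod(row) ** (1.0 / n) for row in matrix]
    total = sum(row_means)
    return [g / total for g in row_means]


def consistency_index(matrix, weights):
    """Saaty-style index (lambda - n) / (n - 1), with lambda estimated
    as the average of (A w)_i / w_i; close to 0 for consistent judgements."""
    n = len(matrix)
    ratios = [
        sum(a * w for a, w in zip(row, weights)) / weights[i]
        for i, row in enumerate(matrix)
    ]
    lam = sum(ratios) / n
    return (lam - n) / (n - 1)


# Consistent judgements: A is 2x B, B is 2x C, hence A is 4x C.
consistent = [[1, 2, 4], [1 / 2, 1, 2], [1 / 4, 1 / 2, 1]]
w = geometric_mean_priorities(consistent)
ci = consistency_index(consistent, w)

# Inconsistent judgements: B is judged worse than C despite A > B > C ratios.
inconsistent = [[1, 2, 4], [1 / 2, 1, 1 / 2], [1 / 4, 2, 1]]
w_bad = geometric_mean_priorities(inconsistent)
ci_bad = consistency_index(inconsistent, w_bad)
```

For the consistent matrix the weights come out in the ratio 4:2:1 and the index is zero up to rounding. The inconsistent matrix yields a positive index, which is the kind of signal a consistency measure gives when judgements cannot be revised.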