Armed Bandits, Machine Learning and Fast Java: Practical Advice for Real-Time APIs

Breandan Considine

OneSpot, Inc.


Why Real-Time?

  • The world is full of hard problems

  • Types of real-time applications

      • Hard (nuclear reactor control)

      • Firm (auction bidding)

      • Soft (train scheduling)

  • Real-time is a good thing

  • Real world applications

  • Performance over scalability


Benefits of Real-Time Processing

  • Forces us to narrow our priorities

  • Focus on constant, stable solutions rather than time-varying, exact solutions

  • Abundance of data but scarce processing power

  • Lifespan of actionable data extremely short

  • Tradeoff between optimality and throughput

  • Speed and parallelism will come over time

  • Upfront investment with long-term benefits


Real-time Interactive Tasks (RITs)

  • Online auctions: DSPs, SSPs

  • Multivariate testing

  • Inventory control, SCM

  • Scheduling, navigation, routing

  • Recommendation systems

  • High frequency trading

  • Fraud prevention


Common Thread

  • Agent offered a context and set of choices

  • Each choice has an unknown payoff distribution

  • Choose an option, measure the outcome

  • Goal: Maximize cumulative payoff

  • Many instances

  • Time sensitive

  • Nontrivial features


Challenges

  • Impractical to test every action in context

      • Computationally intractable to consider

      • Cost of full survey outweighs benefit

  • Exploration-exploitation tradeoff

      • Opportunity cost for suboptimal choices

      • Local extrema conceal optimal solutions

  • Latency comes at the cost of throughput

      • Every clock cycle must count

      • Firm real-time characteristics


Traditional Supervised Learning Cycle


Reinforcement Learning (RL)


Disadvantages and Advantages

  • Disadvantages

      • Starts from scratch, training is expensive

      • Credit assignment problem & reward structure

      • Issues with non-stationary systems

  • Advantages

      • Continuously integrates feedback

      • Adapts to real-time decisions

      • No assumptions about data

      • Follows signal on-line

      • Similar to how we learn


Non-blocking Algorithms

  • Critical for high performance I/O

  • Relatively difficult to implement correctly

  • Can offer a large speedup over lock-based variants, especially under contention

  • Types of non-blocking guarantees

      • Wait-freedom

      • Lock-freedom

      • Obstruction-freedom


Lock-Freedom

  • Guarantees progress for at least one thread

  • Does not guarantee starvation-freedom

  • May be slower overall, see Amdahl's law


Java Memory Model

  • happens-before relation

  • Threaded operations follow a partial order

  • Ensures JVM does not reorder ops arbitrarily

  • Sequential consistency is guaranteed for race-free programs

  • Does not prevent threads from having different visibility on operations, unless explicitly declared


The volatile Keyword

  • Mechanics governed by two simple rules

  • Each action within a thread happens in program order

  • volatile writes happen before all subsequent reads on that same field

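  • Reads and writes behave as if they go directly to main memory, so other threads see the latest value

  • Memory effects resemble acquiring a lock on read and releasing it on write, but without mutual exclusion, so compound updates such as count++ are not atomic; each access still carries a memory-barrier cost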
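
The two rules above can be illustrated with a small publication sketch. The class and method names here are hypothetical, not from the talk. The write to the plain field payload comes before the volatile write in program order, and the volatile write happens before any later read of ready that sees true, so a reader that observes ready == true is also guaranteed to see the payload.

```java
// Hypothetical sketch: safe publication through a volatile flag.
class VolatilePublication {
    private int payload;            // plain field, published via the flag below
    private volatile boolean ready; // volatile write/read creates the happens-before edge

    // Writer thread: the payload write precedes the volatile write in program order.
    void publish(int value) {
        payload = value;
        ready = true;
    }

    // Reader thread: spins until the volatile read observes true, then reads payload.
    int awaitPayload() {
        while (!ready) {
            Thread.yield();
        }
        return payload;
    }
}
```

Without volatile on ready, the reader could legally spin forever or observe a stale payload.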


Java Concurrency

  • ConcurrentHashMap, ConcurrentLinkedQueue

  • Need to carefully benchmark

  • Can be significantly slower depending on implementation

  • Avoid using default hash map constructor

  • Faster implementations exist, lock-free

  • Java 8 improvements in the pipeline

  • Prone to atomicity violations


ConcurrentHashMap<String, Data> map;

Data updateAndGet(String key) {
    Data d = map.get(key);
    if (d == null) { // Atomicity violation: another thread can insert between get() and put()
        d = new Data();
        map.put(key, d); // may overwrite the other thread's Data
    }
    return d;
}
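
One way to close the check-then-act gap in updateAndGet is to let the map perform the lookup and insertion as a single atomic step. The sketch below is an assumption about how that could look, not the speaker's code: DataRegistry, getOrCreate and the empty Data placeholder are hypothetical names, and computeIfAbsent requires Java 8 or later.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: atomic get-or-create on a ConcurrentHashMap.
class DataRegistry {
    // Placeholder for the Data type used on the slide.
    static class Data { }

    private final ConcurrentHashMap<String, Data> map = new ConcurrentHashMap<>();

    // computeIfAbsent runs the factory at most once per key and
    // returns the value that actually ended up in the map.
    Data getOrCreate(String key) {
        return map.computeIfAbsent(key, k -> new Data());
    }
}
```

On older JVMs, putIfAbsent gives the same guarantee: create a candidate, call map.putIfAbsent(key, candidate), and return the previous value if it is non-null.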


Java Atomics

  • Guarantees lock-free thread safety

  • Uses CAS primitives to ensure atomic execution

  • Often performs better than lock-based synchronization under low to moderate contention, but this must be tested in a production setting


// Conceptual model of CAS, emulated with a lock (T is the enclosing class's type parameter)
private T current;

public synchronized T compareAndSet(T expected, T update) {
    T previous = current;
    if (current == expected) // reference comparison
        current = update;
    return previous; // equals 'expected' only if the swap took place
}
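
In practice the CAS primitive is used in a retry loop: read the current value, compute a replacement, and attempt the swap, repeating if another thread got there first. The sketch below uses the real java.util.concurrent.atomic.AtomicLong API. The MaxTracker class is a hypothetical example, not from the talk.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch: lock-free running maximum built on a CAS retry loop.
class MaxTracker {
    private final AtomicLong max = new AtomicLong(Long.MIN_VALUE);

    void observe(long sample) {
        long current;
        do {
            current = max.get();
            if (sample <= current) {
                return; // nothing to update
            }
            // compareAndSet fails if another thread changed max since the read; retry.
        } while (!max.compareAndSet(current, sample));
    }

    long get() {
        return max.get();
    }
}
```

The loop is lock-free rather than wait-free: some thread always makes progress, but an individual thread may retry repeatedly under heavy contention.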


ABA Problem

  • Direct equality testing is not sufficient

  • Full A-B-A transaction can execute immediately before execution of CAS primitive, causing unintended equality when structure has changed

  • Solution: generate a unique tag whenever value changes, then CAS against value-tag pair
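
The value-tag approach is available in the JDK as java.util.concurrent.atomic.AtomicStampedReference, which compares a reference and an integer stamp together. The wrapper below is a hypothetical sketch (StampedCell and its methods are illustrative names) showing how a stale CAS fails after an A-B-A sequence even though the value is back to A.

```java
import java.util.concurrent.atomic.AtomicStampedReference;

// Hypothetical sketch: every successful update bumps the stamp, defeating ABA.
class StampedCell {
    private final AtomicStampedReference<String> ref;

    StampedCell(String initial) {
        ref = new AtomicStampedReference<>(initial, 0);
    }

    // Succeeds only if both the reference and the stamp match what the caller saw.
    boolean replace(String expected, int expectedStamp, String update) {
        return ref.compareAndSet(expected, update, expectedStamp, expectedStamp + 1);
    }

    String value() {
        return ref.getReference();
    }

    int stamp() {
        return ref.getStamp();
    }
}
```

Note that AtomicStampedReference compares references with ==, so callers should pass the same object they read, not an equal copy.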


False Sharing

  • Can be prevented by padding out fields

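  • Java 8 addresses this problem with the @Contended annotation (sun.misc.Contended; application classes also need the -XX:-RestrictContended JVM flag)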
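
False sharing arises when independent fields written by different threads land on the same cache line, so each write invalidates the other core's copy. A minimal sketch of manual padding follows, assuming 64-byte cache lines. The class names are hypothetical, and field layout is ultimately up to the JVM, so the effect should be confirmed by benchmarking. Splitting the padding across a class hierarchy is a common trick because the JVM lays out superclass fields before subclass fields.

```java
// Hypothetical sketch: padding on both sides of a hot field (assumes 64-byte cache lines).
class LeftPadding {
    long p1, p2, p3, p4, p5, p6, p7; // 56 bytes of padding before the value
}

class PaddedValue extends LeftPadding {
    volatile long value; // the hot field, written by a single thread
}

class PaddedCounter extends PaddedValue {
    long q1, q2, q3, q4, q5, q6, q7; // 56 bytes of padding after the value

    // Intended for a single writer thread; value++ on a volatile is not atomic.
    void increment() {
        value++;
    }

    long get() {
        return value;
    }
}
```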


Multi-Armed Bandit Problems

  • N choices, each with hidden payoff distributions

  • What strategy maximizes cumulative payoff?

  • Observation: Draw a random sample from each choice's distribution of observed payoff probability, then return the ARGMAX


Bayesian Bandits

*http://camdp.com/blogs/multi-armed-bandits
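
The Bayesian bandit strategy is commonly implemented as Thompson sampling. The following is a minimal sketch for Bernoulli (success/failure) payoffs, assuming a uniform Beta(1, 1) prior on each arm. The class name and the simple Beta sampler are illustrative assumptions, not the speaker's implementation. Since the JDK has no Beta sampler, one is built here from sums of exponential draws, which is valid only for integer parameters.

```java
import java.util.Random;

// Hypothetical sketch: Thompson sampling for Bernoulli bandits with Beta(1, 1) priors.
class BernoulliThompsonSampler {
    private final int[] successes;
    private final int[] failures;
    private final Random rng;

    BernoulliThompsonSampler(int arms, long seed) {
        successes = new int[arms];
        failures = new int[arms];
        rng = new Random(seed);
    }

    // Gamma(shape, 1) for integer shape >= 1, as a sum of exponential draws.
    private double sampleGamma(int shape) {
        double sum = 0.0;
        for (int i = 0; i < shape; i++) {
            sum -= Math.log(1.0 - rng.nextDouble()); // argument lies in (0, 1]
        }
        return sum;
    }

    // Beta(a, b) = X / (X + Y) with X ~ Gamma(a, 1) and Y ~ Gamma(b, 1).
    private double sampleBeta(int a, int b) {
        double x = sampleGamma(a);
        double y = sampleGamma(b);
        return x / (x + y);
    }

    // Draw one sample per arm from its posterior and return the argmax.
    int chooseArm() {
        int best = 0;
        double bestDraw = -1.0;
        for (int arm = 0; arm < successes.length; arm++) {
            double draw = sampleBeta(successes[arm] + 1, failures[arm] + 1);
            if (draw > bestDraw) {
                bestDraw = draw;
                best = arm;
            }
        }
        return best;
    }

    // Record the observed outcome for the chosen arm.
    void update(int arm, boolean reward) {
        if (reward) {
            successes[arm]++;
        } else {
            failures[arm]++;
        }
    }

    int pulls(int arm) {
        return successes[arm] + failures[arm];
    }
}
```

Sampling cost grows with the number of observations per arm in this sketch; a production version would use a constant-time Gamma sampler from a statistics library.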


Adaptive Control Problems

  • Parameter estimation for real time processes

  • Uses continuous feedback to adjust output


Pacing Techniques


PID Controller
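
A minimal sketch of a textbook discrete PID controller, the feedback technique this slide names. The class name, gains and time step are hypothetical and not taken from the talk. The output is the sum of a proportional term on the current error, an integral term on the accumulated error, and a derivative term on the error's rate of change.

```java
// Hypothetical sketch: textbook discrete-time PID controller.
class PidController {
    private final double kp;
    private final double ki;
    private final double kd;
    private double integral;      // accumulated error over time
    private double previousError; // error from the previous update
    private boolean hasPrevious;  // false until the first update has run

    PidController(double kp, double ki, double kd) {
        this.kp = kp;
        this.ki = ki;
        this.kd = kd;
    }

    // Returns the control output for one time step of length dt (must be > 0).
    double update(double setpoint, double measured, double dt) {
        double error = setpoint - measured;
        integral += error * dt;
        // No derivative on the first sample, since there is no previous error yet.
        double derivative = hasPrevious ? (error - previousError) / dt : 0.0;
        previousError = error;
        hasPrevious = true;
        return kp * error + ki * integral + kd * derivative;
    }
}
```

In a pacing setting, the setpoint could be a target rate and the measurement the observed rate, with the output adjusting how aggressively the system acts; that mapping is an assumption here.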


Counting/Filtering Problems

  • Large domain of inputs (IPs, emails, strings)

  • Need to maintain online, streaming aggregates

  • See Hadoop libraries for good implementations

  • Observation: Fast hashing is key.


Bloom Filters

  • Fast probabilistic membership testing

  • Guarantees no false negatives, low space overhead
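
The properties above follow from the structure: each item sets a few bit positions chosen by hashing, and a lookup reports membership only if all of those bits are set. A minimal sketch follows, assuming string keys. The class name, the FNV-1a-style second hash and the double-hashing scheme are illustrative choices, not the implementation referenced in the talk.

```java
import java.util.BitSet;

// Hypothetical sketch: Bloom filter over strings using double hashing.
class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;      // number of bits
    private final int hashCount; // number of bit positions set per item

    SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // FNV-1a-style hash over the string's chars, used as the second hash.
    private static int fnv1a(String s) {
        int h = 0x811c9dc5;
        for (int i = 0; i < s.length(); i++) {
            h ^= s.charAt(i);
            h *= 16777619;
        }
        return h;
    }

    // i-th bit position for an item: (h1 + i * h2) mod size, always non-negative.
    private int index(String item, int i) {
        return Math.floorMod(item.hashCode() + i * fnv1a(item), size);
    }

    void add(String item) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(item, i));
        }
    }

    // false means definitely absent; true means possibly present.
    boolean mightContain(String item) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(item, i))) {
                return false;
            }
        }
        return true;
    }
}
```

The false-positive rate depends on the bit count, the number of hashes and the number of inserted items, so both parameters should be sized for the expected load.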


Special thanks to Ian Clarke and Matt Cohen


References

http://mechanical-sympathy.blogspot.ie/

http://camdp.com/blogs/multi-armed-bandits

http://blog.locut.us/2011/09/22/proportionate-ab-testing

http://blog.locut.us/2008/01/12/a-decent-stand-alone-java-bloom-filter-implementation/

http://www.cl.cam.ac.uk/research/srg/netos/lock-free/

https://github.com/edwardw/high-scale-java-lib

M. Michael, et al. Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms.

P. Tsigas, et al. Wait-Free Queue Algorithms for the Real-Time Java Specification.

