Armed Bandits, Machine Learning and Fast Java: Practical Advice for Real-Time APIs

Breandan Considine

OneSpot, Inc.


Why Real-Time?

  • The world is full of hard problems

  • Types of real time applications

  • Hard (nuclear reactor control)

  • Firm (auction bidding)

  • Soft (train scheduling)

  • Real-time is a good thing

  • Real world applications

  • Performance over scalability


Benefits of Real-Time Processing

  • Forces us to narrow our priorities

  • Focus on constant, stable solutions rather than time-varying, exact solutions

  • Abundance of data but scarce processing power

  • Lifespan of actionable data extremely short

  • Tradeoff between optimality and throughput

  • Speed and parallelism will come over time

  • Upfront investment with long-term benefits


Real-time Interactive Tasks (RITs)

  • Online auctions: DSPs, SSPs

  • Multivariate testing

  • Inventory control, SCM

  • Scheduling, navigation, routing

  • Recommendation systems

  • High frequency trading

  • Fraud prevention


Common Thread

  • Agent offered a context and set of choices

  • Each choice has an unknown payoff distribution

  • Choose an option, measure the outcome

  • Goal: Maximize cumulative payoff

  • Many instances

  • Time sensitive

  • Nontrivial features


Challenges

  • Impractical to test every action in context

  • Computationally intractable to consider

  • Cost of full survey outweighs benefit

  • Exploration-Exploitation Tradeoff

  • Opportunity cost for suboptimal choices

  • Local extrema conceal optimal solutions

  • Low latency comes at the cost of throughput

  • Every clock cycle must count

  • Firm real-time characteristics


Traditional Supervised Learning Cycle


Reinforcement Learning (RL)


Dis/advantages

  • Starts from scratch, training is expensive

  • Credit assignment problem & reward structure

  • Issues with non-stationary systems

  • Continuously integrates feedback

  • Adapts to real-time decisions

  • No assumptions about data

  • Follows signal on-line

  • Similar to how we learn


Non-blocking Algorithms

  • Critical for high performance I/O

  • Relatively difficult to implement correctly

  • Offers large speedup over lock-based variants

  • Types of non-blocking guarantees

  • Wait-freedom

  • Lock-freedom

  • Obstruction-freedom


Lock-Freedom

  • Guarantees progress for at least one thread

  • Does not guarantee starvation-freedom

  • May be slower overall, see Amdahl's law
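
A minimal sketch of a lock-free structure, here a Treiber-style stack built on AtomicReference. The class and method names are hypothetical and not from the talk. A failed compareAndSet means some other thread's update succeeded, which is the "at least one thread makes progress" guarantee; any single thread may still retry indefinitely, so starvation is possible.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical example class: a lock-free LIFO stack.
class LockFreeStack<T> {
    private static final class Node<E> {
        final E value;
        final Node<E> next;
        Node(E value, Node<E> next) { this.value = value; this.next = next; }
    }

    private final AtomicReference<Node<T>> head = new AtomicReference<>();

    // Retry until the head we read is still the head when we swap it.
    void push(T value) {
        Node<T> oldHead;
        Node<T> newHead;
        do {
            oldHead = head.get();
            newHead = new Node<>(value, oldHead);
        } while (!head.compareAndSet(oldHead, newHead));
    }

    // Returns null when the stack is empty.
    T pop() {
        Node<T> oldHead;
        do {
            oldHead = head.get();
            if (oldHead == null) return null;
        } while (!head.compareAndSet(oldHead, oldHead.next));
        return oldHead.value;
    }
}
```

No thread ever holds a lock here, so a stalled thread cannot block the others; the cost is wasted work on retries under heavy contention.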


Java Memory Model

  • happens-before relation

  • Threaded operations follow a partial order

  • Ensures JVM does not reorder ops arbitrarily

  • Sequential consistency is guaranteed for race-free programs

  • Does not guarantee that threads see each other's writes unless visibility is explicitly declared


The volatile Keyword

  • Mechanics governed by two simple rules

  • Each action within a thread happens in program order

  • volatile writes happen before all subsequent reads on that same field

  • Reads from and writes to main memory

  • Syntactic shorthand for lock on read, unlock on write – incurs similar performance toll
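
The two rules above can be sketched as a small publication example. The class below is hypothetical: a plain field is written first, then a volatile flag. Because the write to the flag happens before any later read of it, a reader that sees the flag set is also guaranteed to see the earlier plain write.

```java
// Hypothetical example class; names are illustrative only.
class VolatilePublisher {
    private int payload;             // plain field, written before the flag
    private volatile boolean ready;  // volatile flag that publishes the payload

    // Writer thread: program order puts the payload write before the volatile write.
    void publish(int value) {
        payload = value;
        ready = true;
    }

    // Reader thread: spins until the volatile read observes the write.
    // After that, the earlier write to payload is visible as well.
    int awaitPayload() {
        while (!ready) {
            Thread.yield();
        }
        return payload;
    }
}
```

Without volatile on the flag, the reader could legally spin forever or observe a stale payload.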


Java Concurrency

  • ConcurrentHashMap, ConcurrentLinkedQueue

  • Need to carefully benchmark

  • Can be significantly slower depending on implementation

  • Avoid using default hash map constructor

  • Faster implementations exist, lock-free

  • Java 8 improvements in the pipeline

  • Prone to atomicity violations


ConcurrentHashMap<String, Data> map;

Data updateAndGet(String key) {
    Data d = map.get(key);
    if (d == null) { // Atomicity violation: another thread can insert between get() and put()
        d = new Data();
        map.put(key, d);
    }
    return d;
}
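
One way to close the gap between get() and put() is to let the map do the check and the insert as a single atomic step. The sketch below assumes Java 8 or later and uses ConcurrentHashMap.computeIfAbsent; the DataCache wrapper and the empty Data class are hypothetical stand-ins.

```java
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical wrapper around the map from the slide.
class DataCache {
    static final class Data { }  // placeholder for the real payload type

    private final ConcurrentHashMap<String, Data> map = new ConcurrentHashMap<>();

    // The lookup and the insert happen atomically per key,
    // so two racing threads always receive the same instance.
    Data getOrCreate(String key) {
        return map.computeIfAbsent(key, k -> new Data());
    }
}
```

On older JVMs, putIfAbsent gives a similar guarantee at the cost of sometimes constructing a value that is then discarded.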


Java Atomics

  • Guarantees lock-free thread safety

  • Uses CAS primitives to ensure atomic execution

  • Better performance than volatile under low to moderate contention, must be tested in production setting


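// Conceptual model of compare-and-swap. Real atomics use a hardware CAS
// instruction; synchronized is used here only to show the semantics.
class CasSketch<T> {
    private T current;

    // Returns the value seen before the call; the swap took effect
    // only if that value is the same reference as 'expected'.
    public synchronized T compareAndSwap(T expected, T newValue) {
        T previous = current;
        if (current == expected)
            current = newValue;
        return previous;
    }
}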
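
In practice the CAS primitive is usually wrapped in a retry loop: read the current value, compute the new one, and attempt the swap, repeating if another thread got there first. The sketch below uses AtomicLong.compareAndSet to track a running maximum; the MaxTracker class is a hypothetical example.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical example: lock-free running maximum.
class MaxTracker {
    private final AtomicLong max = new AtomicLong(Long.MIN_VALUE);

    void record(long sample) {
        long observed;
        do {
            observed = max.get();
            if (sample <= observed) return;  // nothing to update
        } while (!max.compareAndSet(observed, sample));  // retry if we lost a race
    }

    long get() {
        return max.get();
    }
}
```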


ABA Problem

  • Direct equality testing is not sufficient

  • A full A-B-A sequence can run just before the CAS executes, so the CAS sees an equal value even though the structure has changed

  • Solution: generate a unique tag whenever value changes, then CAS against value-tag pair
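
The value-tag pair can be sketched with java.util.concurrent.atomic.AtomicStampedReference, which pairs a reference with an integer stamp and compares both in one CAS. The StampedCell wrapper below is hypothetical, and incrementing the stamp on every change is an assumption about how the tag is generated.

```java
import java.util.concurrent.atomic.AtomicStampedReference;

// Hypothetical wrapper: a cell whose stamp increases on every successful change.
class StampedCell {
    private final AtomicStampedReference<String> ref;

    StampedCell(String initial) {
        ref = new AtomicStampedReference<>(initial, 0);
    }

    // Succeeds only if both the reference and the stamp still match.
    boolean replace(String expectedValue, int expectedStamp, String newValue) {
        return ref.compareAndSet(expectedValue, newValue, expectedStamp, expectedStamp + 1);
    }

    String value() { return ref.getReference(); }

    int stamp() { return ref.getStamp(); }
}
```

A thread holding a stale stamp fails its CAS even when the value has cycled back to what it originally read.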


False Sharing

  • Can be prevented by padding out fields

  • Java 8 addresses this problem with @Contended
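
A rough sketch of manual padding, assuming 64-byte cache lines. The class names are hypothetical. Padding is split across a small class hierarchy because superclass fields are typically laid out before subclass fields; the exact layout is JVM-dependent, so the effect should be confirmed with a benchmark rather than assumed.

```java
// Hypothetical classes: 7 longs of padding on each side of the hot field.
class PaddedLeft {
    protected long p1, p2, p3, p4, p5, p6, p7;
}

class PaddedValue extends PaddedLeft {
    protected volatile long value;  // the contended field
}

class PaddedLong extends PaddedValue {
    protected long p9, p10, p11, p12, p13, p14, p15;

    long get() { return value; }

    void set(long v) { value = v; }
}
```

Each PaddedLong then tends to occupy its own cache line, so writes from one core do not invalidate a neighbouring counter used by another.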


Multi-Armed Bandit Problems

  • N choices, each with hidden payoff distributions

  • What strategy maximizes cumulative payoff?

  • Observation: Draw a random sample from each choice's distribution of observed payoff probability, then return the ARGMAX


Bayesian Bandits

*http://camdp.com/blogs/multi-armed-bandits
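
A minimal sketch of a Bayesian bandit for yes/no rewards, assuming a uniform Beta(1, 1) prior on each choice (this sampling strategy is commonly called Thompson sampling). The class name and API are hypothetical. Beta draws are built from sums of exponential draws, which only works for integer parameters and is slow for large counts; a production system would use a proper sampler.

```java
import java.util.Random;

// Hypothetical example: Bernoulli bandit with Beta posteriors.
class BernoulliBandit {
    private final int[] successes;
    private final int[] failures;
    private final Random rng;

    BernoulliBandit(int arms, long seed) {
        successes = new int[arms];
        failures = new int[arms];
        rng = new Random(seed);
    }

    // Gamma(k, 1) for integer k >= 1, as a sum of k Exponential(1) draws.
    private double sampleGamma(int k) {
        double sum = 0.0;
        for (int i = 0; i < k; i++) {
            sum -= Math.log(1.0 - rng.nextDouble());
        }
        return sum;
    }

    // Beta(a, b) = X / (X + Y) with X ~ Gamma(a, 1), Y ~ Gamma(b, 1).
    private double sampleBeta(int a, int b) {
        double x = sampleGamma(a);
        double y = sampleGamma(b);
        return x / (x + y);
    }

    // Draw one sample per arm from its posterior and return the argmax.
    int choose() {
        int best = 0;
        double bestDraw = -1.0;
        for (int arm = 0; arm < successes.length; arm++) {
            double draw = sampleBeta(successes[arm] + 1, failures[arm] + 1);
            if (draw > bestDraw) {
                bestDraw = draw;
                best = arm;
            }
        }
        return best;
    }

    // Feed the observed outcome back into the chosen arm's posterior.
    void update(int arm, boolean reward) {
        if (reward) successes[arm]++; else failures[arm]++;
    }

    int pulls(int arm) {
        return successes[arm] + failures[arm];
    }
}
```

Arms with little data have wide posteriors and still win some draws, which provides exploration; arms with strong evidence win most draws, which provides exploitation.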


Adaptive Control Problems

  • Parameter estimation for real time processes

  • Uses continuous feedback to adjust output


Pacing Techniques


PID Controller
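
A minimal sketch of a textbook discrete PID controller, where the output is a weighted sum of the current error, its accumulated integral, and its rate of change. The class name, gains, and update signature are assumptions for illustration, not the implementation used in the talk.

```java
// Hypothetical example: discrete-time PID controller.
class PidController {
    private final double kp;
    private final double ki;
    private final double kd;
    private double integral;       // running sum of error * dt
    private double previousError;  // starts at 0, so the first derivative term can spike

    PidController(double kp, double ki, double kd) {
        this.kp = kp;
        this.ki = ki;
        this.kd = kd;
    }

    // Returns the control output for one time step of length dt (dt must be > 0).
    double update(double setpoint, double measured, double dt) {
        double error = setpoint - measured;
        integral += error * dt;
        double derivative = (error - previousError) / dt;
        previousError = error;
        return kp * error + ki * integral + kd * derivative;
    }
}
```

For pacing, the setpoint could be a target spend or request rate and the measured value the observed rate, with the output adjusting a bid or throttle.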


Counting/Filtering Problems

  • Large domain of inputs (IPs, emails, strings)

  • Need to maintain online, streaming aggregates

  • See Hadoop libraries for good implementations

  • Observation: Fast hashing is key.


Bloom Filters

  • Fast probabilistic membership testing

  • Guarantees no false negatives, low space overhead
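
A minimal sketch of a Bloom filter over strings, backed by java.util.BitSet. The class name is hypothetical, and the k bit positions are derived from String.hashCode by double hashing, which is an assumption made for brevity; a real implementation would use a stronger, faster hash.

```java
import java.util.BitSet;

// Hypothetical example: fixed-size Bloom filter for strings.
class SimpleBloomFilter {
    private final BitSet bits;
    private final int size;
    private final int hashCount;

    SimpleBloomFilter(int size, int hashCount) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashCount = hashCount;
    }

    // i-th bit position via double hashing: h1 + i * h2 (mod size).
    private int index(String item, int i) {
        int h1 = item.hashCode();
        int h2 = Integer.rotateLeft(h1 * 0x9E3779B9, 15) | 1;  // odd second hash
        return Math.floorMod(h1 + i * h2, size);
    }

    void add(String item) {
        for (int i = 0; i < hashCount; i++) {
            bits.set(index(item, i));
        }
    }

    // false means definitely absent; true means possibly present.
    boolean mightContain(String item) {
        for (int i = 0; i < hashCount; i++) {
            if (!bits.get(index(item, i))) return false;
        }
        return true;
    }
}
```

Every added item sets all of its bits, so a lookup for it can never fail; false positives arise only when other items happen to have set the same bits.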


Special thanks to Ian Clarke and Matt Cohen


References

http://mechanical-sympathy.blogspot.ie/

http://camdp.com/blogs/multi-armed-bandits

http://blog.locut.us/2011/09/22/proportionate-ab-testing

http://blog.locut.us/2008/01/12/a-decent-stand-alone-java-bloom-filter-implementation/

http://www.cl.cam.ac.uk/research/srg/netos/lock-free/

https://github.com/edwardw/high-scale-java-lib

M. Michael, et al. Simple, Fast, and Practical Non-Blocking and Blocking Concurrent Queue Algorithms.

P. Tsigas, et al. Wait-free queue algorithms for the real-time java specification.

