Schema-based Program Synthesis and the AutoBayes System Part III


### Schema-based Program Synthesis and the AutoBayes System, Part III

Johann Schumann

SGT, NASA Ames

Extending AutoBayes
• some extensions are straightforward: add textbook formulas
• additional symbolic simplification rules might be required
• adding schemas requires substantial work
• “hard-coded” schema as first step
• applicability constraints and control
• functional mechanisms to handle scalar/vector/matrix cases are available
• support for documentation generation
• no schema language, Prolog syntax used
Non-Gaussian PDF
• Data characteristics are modeled using probability density functions (PDFs)
• Examples: Gaussian, exponential, ...
• AutoBayes (AB) contains a number of built-in PDFs, which can be extended (hands-on demo)
• Having multiple PDFs adds a lot of power over libraries
Exercise 1:
• For clustering, a Gaussian distribution of the data is often assumed.
• How about angles, where 0° == 360°?
• with a Gaussian model, you get 5 clusters
• A different distribution (von Mises-Fisher) automatically solves this problem
• In AutoBayes: just replace the “gauss” by “vonmises1” -- no programming required
• multiple PDFs in one spec
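
Why a circular density helps with angles can be seen in a small Python sketch. It is an illustration only, not AutoBayes output: the helper names (`circular_mean`, `bessel_i0`, `von_mises_pdf`) and the sample angles are assumptions made for this example.

```python
import math

def arithmetic_mean(angles_deg):
    """Ordinary mean, which ignores that 0 and 360 degrees coincide."""
    return sum(angles_deg) / len(angles_deg)

def circular_mean(angles_deg):
    """Mean direction: angle of the summed unit vectors, taken modulo 360."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(s, c)) % 360.0

def bessel_i0(x, terms=60):
    """Modified Bessel function I0 via its power series (fine for small x)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        term *= (x * x / 4.0) / ((k + 1) ** 2)
    return total

def von_mises_pdf(theta, mu, kappa):
    """von Mises density on the circle; theta and mu in radians."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))

angles = [350.0, 355.0, 5.0, 10.0]   # one tight cluster around 0 degrees
print(arithmetic_mean(angles))        # 180.0, far away from every data point
print(circular_mean(angles))          # at the 0/360 wrap-around
```

A Gaussian centered on the arithmetic mean would describe this cluster poorly, while the von Mises density is periodic and treats 0° and 360° as the same point.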
Exercise 2:
• Take the “estimate foot” example (norm.ab)
• try to generate multiple solutions
• pragma schema_control_arbitrary_init_values=true
• enables numerical algorithm
• pragma schema_control_use_generic_optimize=true
• allows AB to use the generic “optimize(...)” statement
Exercise 3:
• Take the “estimate foot” example (norm.ab) and modify it to work with different probability densities
• examples:
• vonmises1
• exponential
• poisson
• cauchy
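
Swapping the density changes what kind of solution is available. The following Python sketch (an illustration with hypothetical function names, not code generated by AutoBayes) shows that the exponential and Poisson rates have closed-form maximum-likelihood estimates, whereas the Cauchy location does not and has to be found numerically; a plain grid search stands in here for a real optimizer.

```python
import math

def exponential_rate_mle(xs):
    """Closed form: rate = n / sum(x)."""
    return len(xs) / sum(xs)

def poisson_rate_mle(ks):
    """Closed form: rate = sample mean of the counts."""
    return sum(ks) / len(ks)

def cauchy_loglik(xs, loc, scale=1.0):
    """Log-likelihood of a sample under a Cauchy(loc, scale) density."""
    return sum(-math.log(math.pi * scale * (1.0 + ((x - loc) / scale) ** 2))
               for x in xs)

def cauchy_location_mle(xs, scale=1.0, step=0.01, half_width=2.0):
    """No closed form exists, so search a grid around the sample median."""
    med = sorted(xs)[len(xs) // 2]
    steps = int(half_width / step)
    grid = [med + i * step for i in range(-steps, steps + 1)]
    return max(grid, key=lambda loc: cauchy_loglik(xs, loc, scale))

print(exponential_rate_mle([0.5, 1.5, 1.0, 1.0]))        # 4 / 4.0 = 1.0
print(poisson_rate_mle([1, 2, 3, 2]))                     # 8 / 4 = 2.0
print(cauchy_location_mle([2.9, 3.1, 3.0, 50.0, 2.8]))    # stays near 3 despite the outlier
```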
Exercise 4
• generate multiple programs for a simple clustering example: mog.ab
• autobayes -maxprog 20 mog.ab
Exercise 5
• Add the pareto distribution to AutoBayes
• must modify the file
• synth/distribution.pl and
• interface/symbols.pl
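
Before touching the Prolog files, it helps to have the textbook formulas for the Pareto distribution at hand. The Python sketch below only states those formulas (log-density, closed-form estimate of the shape for a known minimum, and a sampler for checking); it does not show the entries that `synth/distribution.pl` or `interface/symbols.pl` expect, and the parameter names `alpha` and `x_min` are assumptions.

```python
import math
import random

def pareto_logpdf(x, alpha, x_min):
    """log of alpha * x_min**alpha / x**(alpha + 1), defined for x >= x_min."""
    if x < x_min:
        return float("-inf")
    return math.log(alpha) + alpha * math.log(x_min) - (alpha + 1.0) * math.log(x)

def pareto_shape_mle(xs, x_min):
    """Closed-form ML estimate of the shape alpha when x_min is known."""
    return len(xs) / sum(math.log(x / x_min) for x in xs)

def sample_pareto(rng, alpha, x_min, n):
    """Inverse-transform sampling: x = x_min * (1 - u)**(-1/alpha)."""
    return [x_min * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(n)]

rng = random.Random(0)
data = sample_pareto(rng, alpha=3.0, x_min=2.0, n=20000)
print(pareto_shape_mle(data, x_min=2.0))   # should come out close to 3.0
```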
AutoBayes as a Prolog Program
• AutoBayes is a pretty large program
• ~180 Prolog files, 100,000 LoC (with AutoFilter)
• Heavy use of
• meta-programming (call, etc.)
• rewriting (using an engine implemented in Prolog)
• functional programming elements for all sorts of list/vector/array handling
• backtracking and backtrackable global data structures
• procedural (non-logical) elements, e.g., file I/O, flags, etc.
• no use of modules but naming conventions
• everything is SWI-Prolog plus a few C extensions to handle backtrackable global counters and flags
AutoBayes Weak Points
• The input parser is very inflexible (uses Prolog operators)
• Very bad error messages: often just “no”
• no “schema language”: extending AutoBayes requires someone who is both a Prolog and a domain specialist
• Only primitive control of schema selection: need for a schema-selection mechanism
• not all schemas are fully documented
• large code-base, which needs to be maintained
Summary
• AutoBayes suitable for a wide range of data analysis tasks
• AutoBayes generates customized algorithms
• AutoBayes = schema-based program synthesis + symbolic computation
• logic + functional + procedural elements used
• AutoBayes extension: easy to very hard
• AutoBayes debugging: a pain, but explanations and LaTeX output very helpful
• AutoBayes is NASA OpenSource: bugfixes/extensions always welcome
• AutoBayes has a 160+ page user's manual
• AutoBayes useful for classroom projects to PhD projects
AutoBayes in Air Traffic Control
• The US airspace is very crowded, and extreme growth rates are expected over the coming years
• Air Traffic Control (ATC) is still mostly done manually
• Next Generation Air Traffic Systems (NGATS) are
• highly computerized
• researched/developed at NASA

The statistical analysis of air traffic radar data (position, speed, altitude, etc. for each aircraft every 12 seconds) is important for the development, testing, and assessment of air traffic control algorithms. Of particular interest are separation assurance and trajectory prediction.

CAS-Mach Transition
• most climb profiles consist of a segment of constant CAS (calibrated airspeed)
• followed by a segment of constant mach (speed relative to the speed of sound, which depends on altitude)
• transition altitude is not published

[Figure: altitude profile with the transition altitude marked]

How to detect the transition?
• aircraft data contain mach, CAS, altitude
• data are very noisy
• task: determine the most likely point, where
• mach goes from increasing to constant
• CAS goes from constant to decreasing
• get the altitude at this point

[Figure: plots of mach, CAS, and altitude (alt)]

Finding the Transitions I

    const nat N. const double sigma_sq.
    double m_level, m_rate. nat t_0.
    data double mach(0 .. N-1).
    mach(t) ~ N( if(t < t_0)?
                   m_level + m_rate*(t-t_0):
                   m_level,
                 sigma_sq).
    max pr(mach|{t_0, m_level, m_rate})
      for {t_0, m_level, m_rate}.

• Declare all variables,
• unknown parameters, and
• data.
• The mach data is Gaussian distributed:
• Before transition: grows linearly
• After transition: mean mach number is constant
• Variance is given
• Ask AutoBayes to estimate the best values for the unknowns
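
To make the model concrete, here is a minimal Python sketch of one way to solve it by hand. It is not the algorithm AutoBayes synthesizes. It relies on the fact that, with Gaussian noise of known variance, maximizing the likelihood is equivalent to minimizing the squared error, so each candidate `t_0` is tried in turn and `m_level` and `m_rate` are fitted by least squares. The function names and the synthetic data values are assumptions.

```python
import random

def fit_for_t0(mach, t0):
    """Least-squares m_level and m_rate for a fixed transition index t0."""
    n = len(mach)
    x = [min(t - t0, 0) for t in range(n)]   # ramp before t0, zero afterwards
    sx, sy = sum(x), sum(mach)
    sxx = sum(v * v for v in x)
    sxy = sum(v * y for v, y in zip(x, mach))
    m_rate = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    m_level = (sy - m_rate * sx) / n
    sse = sum((y - (m_level + m_rate * v)) ** 2 for v, y in zip(x, mach))
    return m_level, m_rate, sse

def find_transition(mach):
    """Return (t0, m_level, m_rate) with the smallest squared error."""
    best = None
    for t0 in range(2, len(mach) - 1):
        m_level, m_rate, sse = fit_for_t0(mach, t0)
        if best is None or sse < best[3]:
            best = (t0, m_level, m_rate, sse)
    return best[:3]

# Synthetic climb: mach rises linearly until t = 35, then stays constant.
rng = random.Random(0)
clean = [0.70 + 0.007 * min(t - 35, 0) for t in range(80)]
noisy = [y + rng.gauss(0.0, 0.002) for y in clean]
print(find_transition(noisy))
```

The exhaustive loop over `t_0` is affordable because `t_0` is a natural number bounded by the number of data points.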

[Figure: noisy mach data with the fitted profile. Red: actual trajectory; blue: estimated profile. Estimated values: a = 0.69428, b = 0.0072091, c = 0.44532, t_0 = 35]

    const nat N. const double sigma_sq.
    double m_level, m_rate. nat t_0.
    double cas_level, cas_rate.
    data double mach(0..N-1).
    data double cas(0..N-1).
    mach(t) ~ N( if(t < t_0)?
                 m_level + m_rate*(t-t_0): m_level, sigma_sq).
    cas(t) ~ N( if(t < t_0)?
                cas_level : cas_level + cas_rate*(t-t_0), sigma_sq).
    max pr({cas, mach}|{t_0, m_level, m_rate, cas_level, cas_rate})
      for {t_0, m_level, m_rate, cas_level, cas_rate}.

A small modification to the AutoBayes model allows the generation of an entirely new algorithm for finding the best transition in Mach and CAS.

Now: 682 LoC
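
The joint Mach/CAS model can be sketched by hand in Python as follows. This is an illustration under stated assumptions, not the 682-line program AutoBayes generates: both series share the transition index `t_0`, and because the model uses one `sigma_sq` for both, the two squared errors are simply added. The function names and the synthetic values (280 kn, -0.5 kn per step) are hypothetical.

```python
def fit_segment(y, t0, ramp_before):
    """Least-squares level and rate for one series with the kink fixed at t0.

    ramp_before=True : level + rate*(t - t0) for t < t0, constant afterwards.
    ramp_before=False: constant for t < t0, level + rate*(t - t0) afterwards.
    """
    n = len(y)
    if ramp_before:
        x = [min(t - t0, 0) for t in range(n)]
    else:
        x = [max(t - t0, 0) for t in range(n)]
    sx, sy = sum(x), sum(y)
    sxx = sum(v * v for v in x)
    sxy = sum(v * w for v, w in zip(x, y))
    rate = (n * sxy - sx * sy) / (n * sxx - sx * sx)
    level = (sy - rate * sx) / n
    sse = sum((w - (level + rate * v)) ** 2 for v, w in zip(x, y))
    return level, rate, sse

def find_joint_transition(mach, cas):
    """Pick the shared t0 that minimizes the summed squared error."""
    best = None
    for t0 in range(2, len(mach) - 2):
        m_level, m_rate, m_sse = fit_segment(mach, t0, ramp_before=True)
        c_level, c_rate, c_sse = fit_segment(cas, t0, ramp_before=False)
        total = m_sse + c_sse
        if best is None or total < best[0]:
            best = (total, t0, m_level, m_rate, c_level, c_rate)
    return best[1:]

# Noise-free synthetic climb with the transition at t = 35.
mach = [0.70 + 0.007 * min(t - 35, 0) for t in range(80)]
cas = [280.0 - 0.5 * max(t - 35, 0) for t in range(80)]
print(find_joint_transition(mach, cas))
```

With real data the two series have very different scales, so a shared variance lets the CAS residuals dominate the sum; that is a property of the model as written, not of this sketch.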

Transition Points B737

[Figure: B737 transition points; axes: altitude, CAS, mach; annotated values: ~32,000 ft, ~26,000 ft, ~280 kn]

Likelihood of transition
• 1 week of ZOA_SFO data
• 423 of 1645 climb scenarios

Different AC Types

[Figure: panels for B733, A320, and B737]