Evaluation of modeling and solution techniques

- Theoretical
  - worst case, average case, partial orders
  - shortcomings:
    - worst case seldom occurs
    - unrealistic assumptions

- Empirical
  - computational experiments

- Results presented must be sufficient to justify claims
  - e.g., don’t confuse an algorithm with an implementation

- Sufficient detail to allow reproducibility of results
  - give actual code
  - experimental notebook
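
One way to keep an experimental notebook is to append a machine-readable record for every run. The sketch below is only an illustration: the function names `make_record` and `append_record`, the field names, and the JSON-lines file layout are assumptions, not something prescribed by the slides.

```python
import json
import platform
import sys
from datetime import datetime, timezone


def make_record(algorithm, instance, seed, params, result):
    """Collect what a reader would need to rerun one experiment."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "algorithm": algorithm,   # name of the implementation under test
        "instance": instance,     # identifier of the problem instance
        "seed": seed,             # random seed used for this run
        "params": params,         # parameter settings of the implementation
        "result": result,         # measured outcome, e.g. counters or cost
        "python": sys.version.split()[0],
        "machine": platform.platform(),
    }


def append_record(path, record):
    """Append one record as a JSON line; earlier entries are never rewritten."""
    with open(path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record, sort_keys=True) + "\n")
```

Appending rather than overwriting keeps failed and negative runs in the log as well, which is part of what makes the notebook useful later.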

- Benchmark sets
  - from practice
  - specially constructed

- Randomly generated
  - simple random
  - model a real problem

- Benchmark sets
  - sometimes representative of real world
  - expensive to collect, thus sets often small
  - biased

- Randomly generated
  - can explore entire space of problems
  - allows statistically valid conclusions
  - lack of realism
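
A "simple random" problem set could be produced along the following lines. This is a minimal sketch for binary constraint problems; the function name `random_binary_csp` and the `density` and `tightness` parameters are assumptions chosen for illustration, not a generator taken from the slides. Passing an explicit seed makes every instance reproducible.

```python
import itertools
import random


def random_binary_csp(n_vars, domain_size, density, tightness, seed):
    """Return {(i, j): set of forbidden value pairs} for a random binary problem.

    density   -- probability that a pair of variables is constrained
    tightness -- probability that a value pair is forbidden by a constraint
    """
    rng = random.Random(seed)  # private generator, so runs can be repeated exactly
    constraints = {}
    for i, j in itertools.combinations(range(n_vars), 2):
        if rng.random() < density:
            constraints[(i, j)] = {
                (a, b)
                for a in range(domain_size)
                for b in range(domain_size)
                if rng.random() < tightness
            }
    return constraints
```

Sweeping `density` and `tightness` over a grid, with many seeds per grid point, is one way to explore the space of problems and gather enough samples for statistical analysis.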

- Efficiency
  - CPU time
  - nodes visited
  - constraint checks

- Robustness, scope
  - class of problems which can be effectively solved

- Scalability
  - size of problems

- Accuracy, solution quality
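
The three efficiency measures listed above can be collected together by instrumenting the solver. The sketch below uses chronological backtracking on n-queens purely as a stand-in problem; the function name `solve_queens` and the choice to count a node for every attempted assignment are assumptions. Counting conventions differ between papers, so a report should state which one it uses.

```python
import time


def solve_queens(n):
    """Backtracking search for n-queens; returns (solution or None, counters)."""
    stats = {"nodes": 0, "checks": 0}
    cols = []  # cols[r] is the column of the queen placed in row r

    def consistent(r1, c1, r2, c2):
        stats["checks"] += 1  # one constraint check per pair of queens compared
        return c1 != c2 and abs(c1 - c2) != abs(r1 - r2)

    def extend(row):
        if row == n:
            return True
        for col in range(n):
            stats["nodes"] += 1  # one node per attempted assignment
            if all(consistent(r, cols[r], row, col) for r in range(row)):
                cols.append(col)
                if extend(row + 1):
                    return True
                cols.pop()
        return False

    start = time.process_time()  # CPU time of this process, not wall-clock time
    found = extend(0)
    stats["cpu_seconds"] = time.process_time() - start
    return (list(cols) if found else None), stats
```

Nodes visited and constraint checks are independent of the machine and language, which makes them easier to compare across papers than CPU time.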

A claim that…

- a new algorithm is feasible and promising
  - needs only preliminary testing on several hand-picked problems

- an algorithm/implementation is better
  - needs a detailed comparison with prominent existing methods on a broad range of problems

- Straw algorithms
  - compare only against the “best” available methods

- Easy problems
- Unfair comparisons
  - different languages, programmers, optimization efforts, machines, ...

- Test set tuning
  - e.g., tuning parameters on the same problems used for the final evaluation
  - solution: divide into “training” and test sets
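
The training/test remedy for test set tuning can be sketched as follows. The helper names `split_instances` and `tune`, and the `cost(param, instance)` callback, are hypothetical and only illustrate the idea: parameters are chosen on the training instances, and the reported results come from the held-out test instances.

```python
import random


def split_instances(instances, test_fraction, seed):
    """Shuffle reproducibly and return (training, test) lists."""
    rng = random.Random(seed)
    shuffled = list(instances)
    rng.shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def tune(candidates, training, cost):
    """Pick the parameter value with the lowest total cost on the training set."""
    return min(candidates, key=lambda p: sum(cost(p, inst) for inst in training))
```

A possible use: call `split_instances` once, pass only the training part to `tune`, then run the tuned implementation on the test part and report those numbers.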

- Drawbacks of competitive testing
  - enormous amount of work
  - dictates implementation language
  - tells us which algorithm is better but not why
  - negative results are considered uninteresting

- Scientific testing
  - experiments designed to contribute to understanding

- Crowder, H.P., Dembo, R.S., and Mulvey, J.M. “On reporting computational experiments with mathematical software,” ACM Transactions on Mathematical Software, 5:193-203, 1979.
- Jackson, R.H.F., Boggs, P.T., Nash, S.G., and Powell, S. “Guidelines for reporting results of computational experiments,” Mathematical Programming, 49:413-426, 1990.
- Hooker, J.N. “Needed: An empirical science of algorithms,” 1993.
- Hooker, J.N. “Testing heuristics: We have it all wrong,” 1995.