Outline Of Confusion

Intelligent design primer • Collecting CSI from people for search algorithms • Existing users of technique • Search algorithm primer • Why use search algorithms in the first place? • How search algorithms work • The search algorithm dilemma • Intelligent Agents to the rescue • How we help solve the dilemma through superior pattern recognition • Empirical evidence of human capability • The Experiment • Hypothesis • Experiment design • Results • Questions

Outline Of Confusion: Guide for the Perplexed

Intelligent Design Primer What is ID? Is it Just God of the Gaps?

Fundamentals The fundamental question of Intelligent Design Theory: How do we know intelligent design when we see it? The fundamental claim of Intelligent Design Theory: Only intelligent agents create information.

Irreducible complexity

Explanatory filter

Complex specified information Is this fractal complex specified information?

What does it mean to Create Information? Creating irreducible complexity creates complex specified information

How is ID different from all of modern science?

Why is ID fitter than Darwinism? ID focuses on the information creation instead of on the information. Would you rather own… THIS or THIS?

Collecting Active Information for Search & Optimization

Collecting CSI from People • According to Intelligent Design theory, particularly in “Search for a Search”, intelligent agents such as people are capable of improving search algorithm performance beyond mathematical bounds. • Goal: Create a generalized interface for people to contribute to an algorithmic search and optimization process, thus demonstrating human supra-computational capability.

Commercial CSI Collection • Mechanical Turk http://www.mturk.com/ • Marketplace of simple web-based jobs for low-skill work • reCaptcha http://www.google.com/recaptcha • Uses CAPTCHAs to correct OCR text transcription • Foldit http://fold.it/portal/ • Players fold proteins along with an algorithm, achieving results superior to the protein-folding algorithm alone • Google Image Labeling http://images.google.com/imagelabeler/ • Players compete to label images

Search Algorithm Primer What Robots Can Do

When are Search Algorithms Used? • Many problems can be solved by straightforward algorithms in an amount of time polynomial in the problem size. These problems are generally tractable to solve exactly with a computer, though a significant amount of computing power and space may be necessary. • However, there is a much larger group of problems which, as far as we know, cannot be solved in polynomial time (NPC+). For these problems the best we can do is a best-effort attempt to get as close to the optimum as possible within our computation time and space limits. • There are numerous heuristic and approximation algorithms that are used for NPC+ problems, and this is where search algorithms are used. Since we don’t know how to find the optimum solution directly, we have to search around in a problem space.

How Complexity Classes Scale Blue=Linear, Green=Polynomial, Red=Exponential
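The scaling shown in the plot can be made concrete with a few numbers. This is a minimal Python sketch that assumes n^2 as the polynomial example and 2^n as the exponential one; the slide does not say which functions were plotted, and `growth_table` is a hypothetical helper name.

```python
def growth_table(sizes):
    """Return (n, linear, polynomial, exponential) step counts for each n.

    Assumption: n**2 stands in for "polynomial" and 2**n for "exponential".
    """
    return [(n, n, n ** 2, 2 ** n) for n in sizes]


for n, linear, poly, expo in growth_table([10, 20, 40, 80]):
    print(f"n={n:>3}  linear={linear:>3}  n^2={poly:>5}  2^n={expo}")
```

Doubling n doubles the linear cost and quadruples the n^2 cost, but it squares the 2^n cost, which is why exact solutions become impractical so quickly.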

Some Examples of NPC+ Problems • Finding binding sites on proteins • Delivery route planning • Calculating cheap airline trips • Stock market portfolio selection • Packing your belongings for a move • Making the Internet fast

How Search Algorithms Work • Search is a process of hill climbing: it uses information in previously found solutions to find even better solutions. One well-known example is the Newton-Raphson method of finding square roots. • The problem is that an uneven search landscape will cause a search to become stuck on low-lying peaks and crags. • To get out of these traps, the search algorithm has to have an element of exploration. Exploration consists of sampling areas of the landscape and hill climbing in promising sections.
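The trap and the exploration step described above can be sketched on a small one-dimensional landscape. This is an illustrative Python sketch, not the algorithm used in the experiment; the landscape values, the evenly spaced sampling (chosen so the example is deterministic), and the names `hill_climb` and `explore_then_climb` are all assumptions.

```python
def hill_climb(landscape, start):
    """Step to the higher neighbour until no neighbour is higher."""
    pos = start
    while True:
        neighbours = [p for p in (pos - 1, pos + 1) if 0 <= p < len(landscape)]
        best = max(neighbours, key=lambda p: landscape[p])
        if landscape[best] <= landscape[pos]:
            return pos  # local peak: nothing nearby is better
        pos = best


def explore_then_climb(landscape, step):
    """Sample every `step`-th position, hill climb from each, keep the best peak."""
    starts = range(0, len(landscape), step)
    peaks = [hill_climb(landscape, s) for s in starts]
    return max(peaks, key=lambda p: landscape[p])


# Hypothetical landscape: a low peak at index 1, the highest peak at index 5.
landscape = [1, 3, 2, 1, 4, 9, 5, 2, 6, 3]

stuck = hill_climb(landscape, 0)             # climbs to the low peak and stops
found = explore_then_climb(landscape, 3)     # sampling reaches the higher peak
print(stuck, landscape[stuck])
print(found, landscape[found])
```

Starting at index 0, the plain climb halts on the low peak because both neighbours are lower. Sampling several starting points lets at least one climb begin inside the basin of the highest peak.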

How do we know where to go? Finding Good Places to Explore

The Dilemma • Unfortunately, selecting good areas to hill climb is itself a very difficult problem to solve, and depending on how good a guess is desired, the selection algorithm will be NPC+. • Consequently, using search effectively to solve an NPC+ problem ends up introducing a new problem of equal or greater complexity (as predicted by Dembski’s “Search for a Search” paper). • Consider the following solution set from which a search algorithm needs to select a new space to explore.

Example • Which solution signifies a new area to investigate?

Intelligent Agents to the Rescue Our Superior Pattern Recognition

Why Can We Improve algorithms? • In Dr. Dembski’s “Search for a Search” he shows that search algorithms are incapable of finding a search target any better than a random search, without the insertion of external information. • Furthermore, he shows that such information cannot come from another search algorithm. It can only come from a non-algorithmic source. • Intelligent design theory posits that intelligent agents are capable of creating this information, and consequently capable of improving the capabilities of search and optimization algorithms.

Human vs Algorithm • Shows human and algorithmic performance on an NP-complete problem (the Travelling Salesman Problem). • Points and the O(n) and O(n ln n) curves show human capability; the O(n^2) and steeper curves show algorithmic capability.

How Humans Help Solve the Dilemma • If humans are capable of adding information to the search process, then we can assist the search algorithm in exploring the problem space more effectively than is algorithmically possible. • Algorithms have trouble searching because they don’t have a good, generic pattern detection ability. They can’t effectively detect patterns in the solutions that would lead them to better solutions. However, we humans are known for our pattern detection, and can use our superior ability to help out the algorithm. • Let’s take another look at the search process.

Example • Which solution signifies a new area to investigate? • It must be both very unlike other good solutions and highly ranked. • This solution is most unlike the rest, while also being highly ranked.
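For comparison, the selection criterion above (highly ranked, yet unlike the other good solutions) can be written as a simple heuristic. This Python sketch is only one possible algorithmic stand-in and is not the method used in the presentation; the Hamming-distance measure, the `top_k` cutoff, the sample data, and the function names are assumptions.

```python
def hamming(a, b):
    """Number of positions where two equal-length bit strings differ."""
    return sum(x != y for x, y in zip(a, b))


def pick_new_area(solutions, top_k):
    """Among the top_k solutions by fitness, return the one least like the others.

    solutions: list of (bitstring, fitness) pairs. Requires top_k >= 2.
    """
    ranked = sorted(solutions, key=lambda s: s[1], reverse=True)[:top_k]

    def novelty(candidate):
        # Mean Hamming distance to the other highly ranked solutions.
        others = [s for s in ranked if s is not candidate]
        return sum(hamming(candidate[0], o[0]) for o in others) / len(others)

    return max(ranked, key=novelty)


# Hypothetical population: three similar good solutions and one outlier.
solutions = [
    ("11110000", 9),
    ("11110001", 8),
    ("11100000", 8),
    ("00001111", 7),
    ("10101010", 2),
]
choice = pick_new_area(solutions, top_k=4)
print(choice)
```

The low-fitness string is excluded by the `top_k` cutoff, and the remaining outlier is chosen because it differs most from the cluster of similar high-ranked strings.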

The Experiment In Which Things Kind Of Work

Hypotheses • Grand hypothesis: humans can improve any improvable search algorithm beyond mathematical limits • Actual hypothesis: humans can improve a particular search algorithm in a particular domain • Criterion for verification: a human-generated solution displaces the best solutions found by the computer, using fewer sampled solutions

Experiment • Problem: find the primes that generated an RSA key pair • The fitness function has access to an original plaintext and its ciphertext. • Metric: two objectives to be maximized • 1) similarity between the original plaintext and the ciphertext generated by a given set of primes • 2) similarity between the original ciphertext and its decryption generated by a given set of primes • Algorithm: multi-objective genetic algorithm • Human involvement: users of Amazon’s Mechanical Turk service select a set of solutions for one iteration of GA optimization • Method of comparison: best solution found in proportion to the number of solutions checked by humans/algorithm.
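A two-objective fitness function of this general kind might look like the following toy Python sketch. It is an interpretation, not the experiment's actual code: it assumes objective 1 compares the known ciphertext with a re-encryption of the known plaintext under the candidate primes, and objective 2 compares the known plaintext with the candidate decryption of the known ciphertext. The tiny primes, the public exponent `E = 17`, the 16-bit similarity measure, and all names are assumptions. The modular inverse via `pow(E, -1, phi)` needs Python 3.8 or later.

```python
E = 17  # public exponent, assumed known to the fitness function


def bit_similarity(a, b, width=16):
    """Count bit positions (out of `width`) where a and b agree."""
    mask = (1 << width) - 1
    return width - bin((a ^ b) & mask).count("1")


def fitness(p, q, plain, cipher):
    """Score candidate primes (p, q) on two objectives, both to be maximized."""
    n = p * q
    phi = (p - 1) * (q - 1)
    try:
        d = pow(E, -1, phi)  # private exponent implied by the candidate primes
    except ValueError:
        return (0, 0)  # E is not invertible for this candidate
    encrypted = pow(plain, E, n)
    decrypted = pow(cipher, d, n)
    return (bit_similarity(encrypted, cipher), bit_similarity(decrypted, plain))


# Toy key pair: the true primes are 61 and 53 (far too small for real use).
plain = 65
cipher = pow(plain, E, 61 * 53)

print(fitness(61, 53, plain, cipher))  # the true primes score the maximum
print(fitness(59, 53, plain, cipher))  # a wrong guess scores some lower-bounded pair
```

With the true primes both objectives reach the maximum of 16 matching bits, which gives the genetic algorithm a target to climb toward.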

Screenshot

Screenshot Explanation • Checkbox: selected by the user to mark a solution set for the algorithm to explore. • Solution: really just a bit string (a universal problem representation). However, to make patterns more discernible and more appealing to the eye, substrings are mapped to images. • Stars: represent the relative valuation of a solution. 5 stars means one of the best solutions found so far.
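The display mapping described above could be sketched as follows. The presentation does not give the chunk width, the image set, or the star formula, so the 4-bit chunks, the tile indices standing in for images, the linear star scale, and the names `to_tiles` and `stars` are all assumptions.

```python
def to_tiles(bits, width=4):
    """Split a bit string into fixed-width chunks and map each to a tile index.

    Each index would select one image from a hypothetical set of 2**width tiles.
    """
    return [int(bits[i:i + width], 2) for i in range(0, len(bits), width)]


def stars(fitness, best, worst):
    """Scale a fitness value linearly to 1..5 stars relative to the population."""
    if best == worst:
        return 5  # every solution is equally good
    return 1 + round(4 * (fitness - worst) / (best - worst))


print(to_tiles("110000011111"))
print(stars(9, best=9, worst=1), stars(5, best=9, worst=1), stars(1, best=9, worst=1))
```

Because the mapping depends only on the bits, the user sees the same representation the algorithm works with and gains no extra knowledge of the problem domain.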

Amazon Turk Results • There exists an optimum solution with objective values of 64 and 236. • The best solution found, by both the algorithm and an Amazon Turk user, has values of 45 and 121. [Plot axes: Objective #1 and Objective #2, with the optimum marked]

Conclusion • Actual hypothesis not verified. Humans (may have) contributed to, but did not improve, the search process. • Solution found did not displace solutions found by algorithm, since exact same solution was found by algorithm. Therefore, no human generated improvement observed. • However, human finding same solution shows definite contribution. • Experiment shows slight promise. However, Amazon Turk users are known to script their responses. So, results may be output of a script, not a human. • Many things can be improved in algorithm, GUI, data collection and mathematical analysis.

Improvements to Experiment • Add Captcha to submission form so Turkers cannot script form submission. • More descriptive user interface. Describe experiment? Turn into a game? Other suggestions? • Better comparison between human and algorithm?

Why is there no Problem information? • This representation is all the search algorithm sees. It knows nothing about the nature of the problem. • Consequently, to perform a fair comparison, the human user cannot be given any additional problem domain information.

Why compare on the number of solutions evaluated? • Both human users and algorithms are allowed to do whatever they want with the solutions that have been found so far. Consequently, the number of solutions evaluated is the upper bound on information used by both parties to discover new search areas.
