Connecting Discrete Structures to the “Real World”

Using Market Basket Analysis (and Gray Codes) to Integrate and Motivate Topics in Discrete Structures

Michael R. Wick and Paul J. Wagner

Department of Computer Science

University of Wisconsin - Eau Claire

Eau Claire, WI 54701

Road Map

- Introduction
- Our Discrete Structures Course
- Application: Market Basket Analysis
- The Apriori Algorithm
- Set Theory
- Dynamic Programming
- Algorithm Analysis

- Application: Binary Reflected Gray Codes
- Applications
- Recursion
- Algorithm Analysis
- Divide-and-Conquer
- Dynamic Programming

- Summary
- Contact Information

Introduction

- Perceived disconnect with Discrete Structures
- Rest of curriculum
- Application to “real world”

- Particularly problematic in applied programs
- We claim this course for our own
- Replaced similar course in Mathematics
- Retained rigor
- Infused applications and algorithmics

Our Discrete Structures Course

- Topics
- Logic
- Expert Systems, Algorithm Correctness Proof

- Proof Techniques
- Recursion
- Gray Codes
- Divide and Conquer
- Dynamic Programming

- Sets & Relations
- Market-basket Analysis
- compareTo and equals implementations

- Functions
- Algorithm Analysis

- Combinatorics/Probability
- Expert Systems

- Matrices
- Graphics/Transmission Errors

- Graphs and Trees
- Shortest Path, Iterative Deepening, Huffman Coding

Application: Market-Basket Analysis

- Sets are a powerful way to describe the application
- Market Basket Analysis: the use of association techniques to find groups of items that tend to occur together in transactions
- frequent item sets
- sets of items that occur above some minimum threshold (called the minimum support)
- example: {a,b,c,d} occurs 12 times (min. support == 10)

- association rules
- $a,b,c \to d$ iff $\text{support}(\{a,b,c,d\}) / \text{support}(\{a,b,c\}) \ge r$ (where $r$ is called the minimum confidence)
- $a,b \to c,d$ iff $\text{support}(\{a,b,c,d\}) / \text{support}(\{a,b\}) \ge r$
- how many such rules are there?

- Suggestive Sell
- When the client selects the antecedent items suggest that they select the consequent items
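
The support and confidence definitions above can be sketched in a few lines of Python. This is only an illustration: the function names `support` and `confidence` and the five sample baskets are assumptions made for the example, not part of the original slides.

```python
def support(itemset, transactions):
    """Count the transactions that contain every item of `itemset`."""
    items = frozenset(itemset)
    return sum(1 for t in transactions if items <= t)


def confidence(antecedent, consequent, transactions):
    """support(antecedent ∪ consequent) / support(antecedent)."""
    a = frozenset(antecedent)
    denominator = support(a, transactions)
    if denominator == 0:
        raise ValueError("antecedent never occurs in the transactions")
    return support(a | frozenset(consequent), transactions) / denominator


# Hypothetical sample data, invented for this sketch.
baskets = [frozenset(t) for t in ("abcd", "abc", "abcd", "bd", "ac")]

abc_support = support("abc", baskets)            # baskets containing a, b and c
rule_conf = confidence("abc", "d", baskets)      # confidence of a,b,c -> d
```

A rule such as a,b,c → d would then be reported only when `rule_conf` reaches the chosen minimum confidence r.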

Application: Market-Basket Analysis

- Apriori Algorithm (1997)
- Principles
- Every subset of a frequent item set must be frequent
- Every frequent item set of cardinality n+1 must have at least two frequent item sets of cardinality n as subsets
- The intersection of these two subsets must have a cardinality of n-1
- We can build every possible frequent item set of size n+1 from the union of frequent item sets of size n.

Application: Market-Basket Analysis

- Apriori Algorithm (1997)
- Example: minSupport = 2
I = {Table Saw, Router, Kreg Jig, Sander, Drill Press}

T = { {Table Saw, Router, Drill Press},
      {Router, Sander},
      {Router, Kreg Jig},
      {Table Saw, Router, Sander},
      {Table Saw, Kreg Jig},
      {Router, Kreg Jig},
      {Table Saw, Kreg Jig},
      {Table Saw, Router, Kreg Jig, Drill Press},
      {Table Saw, Router, Kreg Jig} }

L1 = { {T}, {R}, {K}, {S}, {D} }

L2 = { {R,T}, {K,T}, {D,T}, {K,R}, {R,S}, {D,R} }

L3 = { {K,R,T}, {D,R,T} }

L4 = ∅

Rules = ????

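Application: Market-Basket Analysis

- Apriori Algorithm (1997)

Let $I = \{a, b, c, \ldots\}$ be the set of all items in the domain.

Let $T = \{ S \mid S \subseteq I \}$ be a bag of all transaction records of item sets.

Let $\text{support}(S) = |\{ A \mid A \in T \wedge S \subseteq A \}|$.

Let $L_1 = \{ \{a\} \mid a \in I \wedge \text{support}(\{a\}) \ge \text{minSupport} \}$.

$\forall k\, (k > 1 \wedge L_{k-1} \ne \emptyset)$ let

$$
\begin{aligned}
L_k = \{\, S_i \cup S_j \mid\; & (S_i \in L_{k-1}) \wedge (S_j \in L_{k-1}) \\
& \wedge\, (|S_i - S_j| = 1) \wedge (|S_j - S_i| = 1) \\
& \wedge\, \forall S\, [\, ((S \subset S_i \cup S_j) \wedge (|S| = k-1)) \to S \in L_{k-1} \,] \\
& \wedge\, (\text{support}(S_i \cup S_j) \ge \text{minSupport}) \,\}
\end{aligned}
$$

The set of all frequent item sets is given by

$$L = \bigcup_k L_k$$

and the set of all association rules is given by

$$
\begin{aligned}
R = \{\, A \to C \mid\; \exists F \in L\; [\, & (A \subset F) \wedge (C = F - A) \wedge (A \ne \emptyset) \wedge (C \ne \emptyset) \\
& \wedge\, \text{support}(F) / \text{support}(A) \ge \text{minConfidence} \,] \,\}
\end{aligned}
$$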
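
A minimal Python sketch of the level-wise construction above, run on the woodworking example with minSupport = 2. The function names `apriori` and `association_rules`, the single-letter item abbreviations, and the choice of minConfidence = 1.0 are assumptions made for illustration. The sketch also assumes minSupport is at least 1, so that no antecedent has zero support.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def apriori(transactions, min_support):
    """Return [L1, L2, ...]; each level is a set of frozensets."""
    items = set().union(*transactions)
    level = {frozenset([a]) for a in items
             if support(frozenset([a]), transactions) >= min_support}
    levels, k = [], 1
    while level:
        levels.append(level)
        k += 1
        candidates = set()
        for s_i, s_j in combinations(level, 2):
            union = s_i | s_j
            # Two (k-1)-sets differ by one item each way iff their union has k items.
            if len(union) != k:
                continue
            # Every (k-1)-subset of the candidate must already be frequent.
            if all(frozenset(s) in level for s in combinations(union, k - 1)):
                candidates.add(union)
        level = {c for c in candidates
                 if support(c, transactions) >= min_support}
    return levels


def association_rules(levels, transactions, min_confidence):
    """All (A, C) with A, C non-empty, A ∪ C frequent, and enough confidence."""
    rules = set()
    for level in levels[1:]:              # item sets with at least two items
        for f in level:
            for size in range(1, len(f)):
                for a in combinations(f, size):
                    ante = frozenset(a)
                    conf = support(f, transactions) / support(ante, transactions)
                    if conf >= min_confidence:
                        rules.add((ante, f - ante))
    return rules


# T = Table Saw, R = Router, K = Kreg Jig, S = Sander, D = Drill Press
baskets = [frozenset(t) for t in
           ("TRD", "RS", "RK", "TRS", "TK", "RK", "TK", "TRKD", "TRK")]
levels = apriori(baskets, min_support=2)
rules = association_rules(levels, baskets, min_confidence=1.0)
```

Here `levels` holds the three non-empty levels of the example, and the loop stops as soon as a level comes back empty.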

Application: Market-Basket Analysis

- Dynamic Programming Approach
- Want proof of principle of optimality and overlapping subproblems
- Principle of Optimality
- The optimal solution to $L_k$ includes the optimal solution of $L_{k-1}$
- Proof by contradiction

- Overlapping Subproblems
- Lemma: every subset of a frequent item set is itself a frequent item set
- Proof by contradiction

Application: Market-Basket Analysis

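- Rule Generation Algorithm

Let $L = \bigcup_k L_k$.

Let $T = \{ S \mid S \subseteq I \}$ be the set of all transactions.

Let $\langle A, C \rangle$ be an association rule with antecedent $A$ and consequent $C$.

Let

$$\text{confid}(\langle A, C \rangle) = \frac{|\{ B \mid B \in T \wedge (A \cup C) \subseteq B \}|}{|\{ B \mid B \in T \wedge A \subseteq B \}|}$$

Let $R_1 = \{ \langle F - \{a\}, \{a\} \rangle \mid F \in L \wedge a \in F \wedge \text{confid}(\langle F - \{a\}, \{a\} \rangle) \ge \text{min\_confid} \}$ and

$$
\begin{aligned}
\forall k\, [\, (k > 1) \wedge (R_{k-1} \ne \emptyset) \to R_k = \{\, \langle A, C_i \cup C_j \rangle \mid\; & (\langle A, C_i \rangle \in R_{k-1}) \wedge (\langle A, C_j \rangle \in R_{k-1}) \\
& \wedge\, (|C_i - C_j| = 1 \wedge |C_j - C_i| = 1) \\
& \wedge\, \forall S\, [\, ((S \subset C_i \cup C_j) \wedge (|S| = k-1)) \to \langle A, S \rangle \in R_{k-1} \,] \\
& \wedge\, (\text{confid}(\langle A, C_i \cup C_j \rangle) \ge \text{min\_confid}) \,\} \,]
\end{aligned}
$$

then

$R = \bigcup_k R_k$ is the set of all confident association rules.

Given as a homework problem on sets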
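
The level-wise rule construction can be sketched as follows. This is one possible reading of the definition, not the authors' homework solution: the name `generate_rules` is hypothetical, the frequent item sets are hard-coded from the earlier woodworking example, and the sketch assumes that $R_1$ only uses item sets with at least two items so that every antecedent is non-empty.

```python
from itertools import combinations


def support(itemset, transactions):
    """Number of transactions that contain every item of `itemset`."""
    return sum(1 for t in transactions if itemset <= t)


def confid(antecedent, consequent, transactions):
    """Fraction of transactions containing A that also contain C."""
    denominator = support(antecedent, transactions)
    if denominator == 0:
        return 0.0
    return support(antecedent | consequent, transactions) / denominator


def generate_rules(frequent, transactions, min_confid):
    """Build R1, R2, ... by growing consequents one item at a time."""
    # R1: single-item consequents taken from each frequent item set.
    level = set()
    for f in frequent:
        if len(f) < 2:          # assumption: skip rules with an empty antecedent
            continue
        for a in f:
            ante, cons = f - {a}, frozenset([a])
            if confid(ante, cons, transactions) >= min_confid:
                level.add((ante, cons))

    rules, k = set(), 1
    while level:
        rules |= level
        k += 1
        next_level = set()
        for (a_i, c_i), (a_j, c_j) in combinations(level, 2):
            if a_i != a_j:
                continue                      # same antecedent required
            merged = c_i | c_j
            if len(merged) != k:
                continue                      # consequents differ by one item
            # Every (k-1)-item sub-consequent must already give a confident rule.
            if not all((a_i, frozenset(s)) in level
                       for s in combinations(merged, k - 1)):
                continue
            if confid(a_i, merged, transactions) >= min_confid:
                next_level.add((a_i, merged))
        level = next_level
    return rules


# T = Table Saw, R = Router, K = Kreg Jig, S = Sander, D = Drill Press
transactions = [frozenset(t) for t in
                ("TRD", "RS", "RK", "TRS", "TK", "RK", "TK", "TRKD", "TRK")]
frequent = [frozenset(s) for s in
            ("T", "R", "K", "S", "D",
             "RT", "KT", "DT", "KR", "RS", "DR", "KRT", "DRT")]
confident_rules = generate_rules(frequent, transactions, min_confid=1.0)
```

Note that, as written in the definition, larger consequents are checked only for confidence and not re-checked for support.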

Application: Binary Reflected Gray Codes

- Formal Definition:
- A binary reflected Gray code is a one-to-one function mapping the integers $0 \le i \le 2^n - 1$ to n-bit binary numbers so that every two consecutive binary numbers differ in exactly one bit.

- Origin
- Used by Émile Baudot in telegraphy in 1878.
- Used by Frank Gray in a 1953 patent for a pulse-code modulation tube
- Prevented large noise spikes when vacuum tube counters incremented

- Example:
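
The example shown on the original slide is not available in this transcript. As a stand-in, the sketch below checks the formal definition against the standard 3-bit binary reflected Gray code sequence. The helper name `is_gray_sequence` and the hard-coded list are assumptions for illustration.

```python
def is_gray_sequence(codes):
    """True if the codes are distinct, equally long, and neighbours differ in one bit."""
    if len(set(codes)) != len(codes):
        return False                      # the mapping must be one-to-one
    if len({len(c) for c in codes}) > 1:
        return False                      # all codes must have the same width
    for left, right in zip(codes, codes[1:]):
        differing = sum(1 for x, y in zip(left, right) if x != y)
        if differing != 1:
            return False
    return True


# Standard 3-bit binary reflected Gray code, positions 0 through 7.
three_bit = ["000", "001", "011", "010", "110", "111", "101", "100"]
plain_binary = [format(i, "03b") for i in range(8)]

gray_ok = is_gray_sequence(three_bit)         # satisfies the definition
binary_ok = is_gray_sequence(plain_binary)    # 001 -> 010 changes two bits
```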

Application: Binary Reflected Gray Codes

- Appears in a curiously large number of applications
- Towers of Hanoi
- Robotic Arm Angle measurement
- Hamiltonian Circuits
- …

Application: Binary Reflected Gray Codes

- Why is it called “Binary Reflected”?
- Binary is obvious
- Strings are drawn from alphabet of 0s and 1s

- Reflected is less obvious
- Each half of the code sequence is built from a reflected copy of the other half

Application: Binary Reflected Gray Codes

- A Simple Recursive Definition
- Let G(k,n) represent the kth code in the n-bit binary reflected Gray code sequence
- Computed in Θ(n) time (for n bits)
- For single Gray code value, this is optimal
- Typically, however, we want the entire code sequence
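
The recursive definition itself did not survive in this transcript. One common way to write it, assumed here rather than taken from the slides, uses the reflection property: the first half of the n-bit sequence is the (n-1)-bit sequence with a leading 0, and the second half is the (n-1)-bit sequence in reverse order with a leading 1.

```python
def G(k: int, n: int) -> str:
    """Return the k-th code (0-based) of the n-bit binary reflected Gray code."""
    if not 0 <= k < 2 ** n:
        raise ValueError("k must satisfy 0 <= k < 2**n")
    if n == 0:
        return ""                          # the only 0-bit code is the empty string
    half = 2 ** (n - 1)
    if k < half:
        # First half: same position in the (n-1)-bit sequence, leading 0.
        return "0" + G(k, n - 1)
    # Second half: mirrored position in the (n-1)-bit sequence, leading 1.
    return "1" + G(2 ** n - 1 - k, n - 1)


fifth_code = G(5, 3)
```

Each call removes one bit and recurses once, so a single code takes n recursive steps, in line with the Θ(n) bound stated above.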

Application: Binary Reflected Gray Codes

- A Naïve Implementation
- To generate the entire sequence, call G(i,n) with i going from 0 to k-1.
- A priori Analysis
- Each invocation of G requires Θ(n) time
- G is invoked k times
- k is equal to $2^n$
- Therefore, $\Theta(n \cdot 2^n)$ time and $\Theta(2^n)$ space
- Optimal is $\Theta(2^n)$ time and space
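
To make the a priori analysis concrete, the sketch below generates the whole sequence the naïve way and counts how many times G is entered. The recursive G used here is an assumed formulation based on reflection, and `naive_gray_sequence` is a hypothetical name.

```python
def naive_gray_sequence(n):
    """Generate all 2**n codes by calling G(i, n) independently for each i."""
    calls = 0

    def G(k, bits):
        nonlocal calls
        calls += 1                         # count every invocation, base case included
        if bits == 0:
            return ""
        half = 2 ** (bits - 1)
        if k < half:
            return "0" + G(k, bits - 1)
        return "1" + G(2 ** bits - 1 - k, bits - 1)

    codes = [G(i, n) for i in range(2 ** n)]
    return codes, calls


codes, calls = naive_gray_sequence(3)
```

With this formulation each top-level call enters G exactly n + 1 times, so the total is (n + 1) · 2^n invocations, which matches the Θ(n · 2^n) estimate.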

Application: Binary Reflected Gray Codes

- What is the source of the inefficiency?
- Repeated work.

Application: Binary Reflected Gray Codes

- A Dynamic Programming Approach

Application: Binary Reflected Gray Codes

- Naïve Dynamic Programming Implementation
- Requirement
- We must generate and store the entire (n-1)-bit Gray code sequence prior to starting the n-bit Gray code sequence

- Approach
- Use two-dimensional matrix to store previously calculated Gray code sequences
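
A minimal sketch of this table-based approach, assuming a list of lists stands in for the two-dimensional matrix (row b holds the complete b-bit sequence). The name `gray_table` is hypothetical.

```python
def gray_table(n):
    """Build rows 0..n, where row b is the full b-bit Gray code sequence."""
    table = [[""]]                                   # row 0: the empty code
    for bits in range(1, n + 1):
        previous = table[bits - 1]
        first_half = ["0" + code for code in previous]
        second_half = ["1" + code for code in reversed(previous)]
        table.append(first_half + second_half)       # reflected copy forms the 2nd half
    return table


table = gray_table(3)
stored_codes = sum(len(row) for row in table)        # 1 + 2 + 4 + 8
```

Because every earlier row is kept, the table stores 2^(n+1) − 1 codes in total, which is where a bound of the form Θ(2^(n+1)) comes from.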

- Requirement

Application: Binary Reflected Gray Codes

- Analysis
- Time
- Space

Application: Binary Reflected Gray Codes

- Notice the classic time/space trade-off
- Naïve Iterative
- Time: $\Theta(n \cdot 2^n)$
- Space: $\Theta(2^n)$

- Naïve Dynamic Programming
- Time: $\Theta(2^{n+1})$
- Space: $\Theta(2^{n+1})$

- What are the sources of the remaining inefficiencies?
- Time: Spends too much time copying values
- 1st half of the n-bit sequence is a copy (plus a leading “0”) of the (n-1)-bit sequence

- Space: Only require previous Gray code sequence, not all previous sequences

Time/Space trade-off is just a rule of thumb

Application: Binary Reflected Gray Codes

- Improved Approach
- Use integers rather than strings to represent codes
- Binary representation of integer is equivalent to the string version
- Requires only 1 bit per bit of code.

- Reuse the first half of the (n-1)-bit sequence directly as the first half of n-bit sequence
- Most-significant bit is already correct, since the reused codes contain leading zeros.
- To set the leading one of the second half, just add $2^{n-1}$
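
The improved approach can be sketched with a single list of integers that is extended in place. The name `gray_sequence` is an assumption; the technique follows the bullets above: keep the existing codes as the first half, then append the mirrored codes with $2^{n-1}$ added.

```python
def gray_sequence(n):
    """Return the n-bit binary reflected Gray code sequence as integers."""
    codes = [0]                                   # the 0-bit sequence
    for bit in range(n):
        high = 1 << bit                           # 2**(bits - 1) for the new width
        # First half is reused untouched; second half is the mirror plus the high bit.
        codes += [code + high for code in reversed(codes)]
    return codes


sequence = gray_sequence(3)
as_strings = [format(code, "03b") for code in sequence]
```

Only one list is kept and each code is produced by a single addition, so the work and the storage both grow in proportion to the 2^n codes produced.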

Application: Binary Reflected Gray Codes

- Analysis
- Produces and stores
- Time and Space

Summary

- Revised Discrete Structures Course
- Explicit connection to curriculum
- Infusion of “real-world” applications
- Applications allow infusion of
- Dynamic Programming
- Divide-and-Conquer
- Set Theory
- Algorithm Analysis
- Recursion
- Proof Techniques
- Logic

Contact Information

Michael R. Wick ([email protected])

Paul J. Wagner ([email protected])

Department of Computer Science

University of Wisconsin – Eau Claire

Eau Claire, WI 54701

www.cs.uwec.edu
