CS 122 – Jan. 9

CS 122 – Jan. 9 The nature of computer science General solutions are better What is a program? OO programming Broad outline of topics: Program design, GUIs, recursion, sorting, collections Handout: Review of Java (errors, I/O, ArrayList)

Computer science The study of how we solve problems How we manipulate and represent information Ultimately, think of rep’n of brain & knowledge  Inspiration for hierarchical data structures In problem solving, we crave general solutions Peanut butter & jelly sandwich  any sandwich Average of 3 numbers  No restriction

Program What kinds of programs have you seen, written? Software can be classified in many ways. First, consider this breakdown: simple & not-so-simple “Simple” problems can be solved easily with 1 source file Input, calculations, output Ex. How many vowels does a 9-letter word typically have? Object oriented technique for larger concerns Critical for us to design first – why?

Review You can find my notes for previous course here http://cs.furman.edu/~chealy/cs11/notes.ppt Java API documentation http://docs.oracle.com/javase/7/docs/api/ Handout highlights some important things to remember: common errors, I/O, ArrayList class

CS 122 – Jan. 11 OO programming Good for development of larger programs Encapsulation Examples Handout Review elementary Java

OO programming In the problem description, look for nouns first A noun can be a primitive type or a class Often we define our own classes E.g. Card, Deck, Board, Player An object has Attributes (i.e. little nouns) Operations In a larger program, we have several source files (i.e. classes)

Philosophy OO programming means we look for nouns first A “noun” could be Primitive type Class object, a complex noun with attributes + operations We define classes that accurately reflect the information in our program, but not every detail (abstraction). Defining your own class might not be necessary. Sometimes the type/class we want already exists in the language or API. Your class might have just 1 attribute or just 1 instance

Encapsulation Separate implementation from use Outside class implementation (e.g. in Driver), we should not be concerned with object rep’n or how operations implemented Logic & strategy of a game does not depend on how information is stored internally This is why attributes are declared private Only be concerned with “interface” information Names of methods (including constructors), necessary parameters, return type

Example Let’s design a small OO program pertaining to information about mountains. Mountain class Attributes for: continent, name, height Constructors Can have get/set methods for each attribute Driver class with main( ) function Contains the logic that actually does something with the data. Let’s read a text file containing mountain information, and list those over 10,000 feet.

continued Outline of driver class: Read file For each line: Parse information contained in the line Create mountain object Add to array or Arraylist Close file For each mountain If it’s over 10,000 feet, print it out Have we solved the problem? See review handouts

Consider these (From handout) Count the number of words in a text file Modify mountain program to display average height of all the mountains Compute the cost of tiling a floor What class should be defined? Its attributes & operations?

CS 122 – Jan. 13 Briefly discuss room tiling problem Aggregation Interfaces Inheritance Handouts: Encryption (interfaces) Shapes (Please show me your lab if you have not already done so.)

Generalizing If you can tile/carpet a single room, you can do a whole house, etc. A House contains Rooms Create a House class that has an attribute representing a collection (i.e. ArrayList) of Room objects To find total cost for house, write a loop that traverses this list of rooms, finding the cost for each one. Can extend outward to more layers: Subdivision, City This class relationship is aggregation, or “has a”

Illustration After extending our Room program to include a House class, we now have 3 classes: Driver class Create a House object Find the total cost to tile/carpet this house House class Contains collection of Rooms For each room, call appropriate room.findCost( ) and add them all up. Room class Already has a formula to find its cost

Organization How is an OO program organized? Look at Driver/main( ) to see what a program is doing. Look at the other classes to see what the program is made of (big nouns), and what it is capable of (operations)

Class building helpers Designing and implementing a class can be time consuming. There are 2 OO features to help. They start with the letter “I” Interface Inheritance An interface is a to-do list of methods we want for a class An interface is not a class! We say that a class implements an interface. It’s possible for 2+ classes to implement the same interface.

Examples Geometrical shapes like Rectangle, Circle ShapeInterface.java Rectangle.java See handout Encryption Different ways to do it Need to define functions to encrypt & decrypt Many possible implementations! Note: compiler will complain if you forget to implement all methods listed in interface

Inheritance Another class relationship “is a” rather than “has a” Useful for what? To improve existing class, add functionality Random has nextInt( ) and nextDouble( ), but not nextLetter( ) or nextWord( ) To make a more specialized class Publication  Book or Magazine Animal  Fish, Reptile, Bird When you inherit something, you can override it

Shape example Shape class Rectangle, Cube and Sphere inherit from shape 3-D inherits from 2-D Yes, you can have inheritance among interfaces! Rectangle implements 2-D Cube and Sphere implement 3-D Can represent all these relationships with UML diagram

CS 122 – Jan. 18 Inheritance Polymorphism Handouts: Zoo Speaker Pet

Zoo program What inheritance relationships do we have among the classes: Animal, Bird, Fish, Penguin Reptile, Snake? ArrayList<Animal> “super” is called implicitly, for default constructor For other kinds of constructors, call super( ) first to initialize inherited attributes. exercise( ) uses instanceof – saves us from having to implement exercise( ) in every subclass We override toString( )

Override When a method is called, how do we know where to find its implementation? Penguin p = new Penguin( ); p.fun( ); Which fun( ) is being called? What if fun( ) does not exist in Penguin.java? You can look up, but not down when looking for attribute/method. E.g. equals( ) and toString( ) are defined in Object, and we normally override these.

Polymorphism What does this word mean? At any time, an object can change its dynamic type by a call to a constructor. When declared, object is given a static type to ensure program will compile. Examples Speaker (interface) Pet (inheritance)

CS 122 – Jan. 20 Handling exceptions Begin study of GUIs Start with simple applets Responding to events Handouts: Exceptions Steps in writing a GUI Applet examples

Exceptions What can go wrong when a program runs? (run-time errors) File not found, number format, etc. Important errors have their own classes that we can include To allow program to continue gracefully, use try/catch block Example: Read a menu, and average the prices, ignoring all the words See handout

Steps in creating GUI Define & initialize the components you want Where? In main( ), init( ), or constructor Component = object to show on screen Decide how things should be drawn Details in paint function Create listener objects Figure out which components (e.g. button) will generate events for this listener. Define how listener should respond to event

Typical Applet Structure My applet class Attributes Constructor Inner Listener class – somewhere in here call repaint( ) Create objects that we will draw Call the listener Paint method

Applet examples Applet = small program, not stand-alone application. To be run in a browser. In each program, you’ll see 3 parts init( ) or constructor paint( ) if we want to draw geometric shapes Listener class, with object declared in init( ) FirstApplet – doesn’t have listener. We just take input string and go. MouseApplet3 – User can click, and blue square follows! Look at mousePressed( ) MouseApplet4 – alternate blue/red

Applets, continued In mouseApplet4: How is the grid color changing? How often does a certain code fragment get executed? init( ), constructor  once at the beginning Listener  when the event occurs paint  when we need to refresh the screen

CS 122 – Jan. 23 Review mouse applets Applet that draws polygon “Containment hierarchy” GUI Application Handouts: Draw polygon (PolygonApplet) Containment hierarchy EasyFrame

Drawing polygon GUI components: Frame contains panel, which contains: 2 Labels, 2 Text fields, 1 Button Need to attach listener to button! Data structure for our polygon ArrayList of Point objects When ready to draw, create Polygon to paint in applet: X-array Y-array Number of points We can create Polygon in the listener or in paint( )

GUIs GUIs have components, and some components generate events Components are placed into “panels” Ex. Polygon applet Components of the pop-up frame: Frame has a Panel Panel has 2 labels, 2 text fields, 1 button Attach listener to button, so we know how to add a new point Draw polygon using x array, y array, # points

GUIs Containment hierarchy for realistic GUI: Usually 1 frame (window), comes with its own default panel: “content pane” Create more panels to add to content pane Put components into panels; panel has some layout that determines where stuff goes What events could take place? ActionEvent for button ChangeEvent for slider CaretEvent for typing in text field (ignore this one) Don’t memorize – these events are listed in handout

First examples “EasyFrame” has no events, but is application. Note use of JFrame. This is the starting point for any GUI application.

Example Containment hierarchy can look like this Frame Panel North panel 2 labels West panel Panel for first name Label Text field Panel for last name Label Text field

CS 122 – Jan. 25 Any questions on containment hierarchy? Note that we define frame in driver, and have separate classes for panels In panel constructor, good place to add listener Handouts: Text area Tabs & layouts Rectangle frame (with control panel)

First examples “EasyFrame” has no events, but is application.  Note use of JFrame. This is the starting point for any GUI application. “Text area” example: Scroll pane Has an input frame, and an output frame Listener for button

Tabs & layouts There are different ways to lay out components in a GUI. In the “Tab” program, a frame contains a “tabbed pane”, which contains 5 panels 4 panels show possible layouts Flow (default) Border (when you add stuff, say where it goes) Grid Box (uses “glue”; when initialized, specify vert or horiz orientation) With JLabel, you can center / justify text.

RectangleFrame We can draw geometric objects in panel instead of applet This program is similar to one in the lab Frame contains 2 panels Specified in frame’s constructor Input panel listens to Action event from a button Output panel repaints the rectangle paintComponent( ) is analogous to paint( ) that you saw in applets.

CS 122 – Jan. 27 More interactive features in a GUI! Radio buttons Button group to enforce mutual exclusion Check boxes Making a GUI easier to use Handouts: Pizza (choices) Day/night (special features)

Pizza GUI Allow user to make choices Radio buttons – mutually exclusive choice Check boxes – can have any # checked We’ll use the isSelected( ) method Any change will affect price, which is shown in a JLabel object. Can use the same listener.  Button group for radio buttons to make sure only 1 can be selected Where / how do we update price? I decided not to have separate classes for each panel

Day / Night GUI User-friendly features in a GUI Enable / disable a button whenever appropriate Mnemonics Tool tips We have a picture panel and button panel We have listener for each of 2 buttons One panel needs to talk to the other  send parameter! Can you guess appearance / behavior from the code? When are the various functions invoked?

CS 122 – Jan. 30 Jlist Maintains a list of objects (e.g. file names) Sliders Communication between panels Review event train of thought Border, split pane, scroll pane Handouts: JList (pick image) and borders JSlider Events

Review events

CS 122 – Feb. 1 Recursion Definition How function calls work Purpose & examples Handouts: Odd number / factorial / Fibonacci Exponential and Pascal formulas

Recursion When a function calls itself A more technical definition – within a single thread of control, when a function is called and an earlier instance of the same function has not yet returned It’s a problem-solving technique You have a problem but only know how to solve a small case Break a problem down until it can be expressed in simple cases

Function calling Review what happens when main calls f, and f calls g. While in g, how do we know where we’ve been? How do we know where to return to? Function calls to g may appear many times throughout the program. In recursion, the common scenario is: main calls f, and f calls itself! When f calls itself, it’s not yet returning How does process not go on forever? Need a “base case”; it must always be achievable. Direct vs. indirect recursion

Recursion We’ll start with math formulas, as in defining a sequence of numbers You’ll notice definitions with 2 parts: Base case Recursive case How would you define these sequences? 8, 13, 18, 23, 28, … 3, 6, 12, 24, … Well known recursive problems: Factorial, Fibonacci numbers, Pascal’s triangle

Thinking recursively Recursion is a different way of approaching the problem of having to repeat something It’s accomplishing the same thing as a loop Theoretically, loops and recursion are equivalent, and we can convert one to the other. Express some problem in simpler terms – an important skill Ex. fact(6) = 6 * fact(5) How does this continue?

Formula examples Let’s examine recursive implementations of Computing the nth odd number Factorial Fibonacci Refer to handout

CS 122 – Feb. 3 For HW: how to check cube color validity? Simple recursion examples Exponentiation Pascal’s triangle Number of recursive calls  inefficiency Towers of Hanoi Handouts: Towers of Hanoi

Inefficiency Recursion can be inefficient when you run it For Fibonacci and Pascal’s triangle, the number of function calls increases rapidly. Why? How many total function calls are needed for Pascal(4,2)? Pascal(a, b) calls Pascal(a – 1, b – 1) and Pascal(a – 1, b) We can draw a tree of all the function calls made. Implementations vary, but we can encounter base case whenever b = 0 or a = b.

# calls for Pascal(a, b) a = b or b = 0 1 call 2,1 1 + 1 + myself = 3 3,1 calls(2,0) + calls(2,1) + myself = 1+3+1 = 5 4,1 calls(3,0) + calls(3,1) + myself = 1+5+1 = 7 2,2 1 call 3,2 calls(2,1) + calls(2,2) + myself = 3+1+1 = 5 4,2 calls(3,1) + calls(3,2) + myself = 5+5+1 = 11 5,2 calls(4,1) + calls(4,2) + myself = 7+11+1 = 19

Towers of Hanoi How can we move n disks from peg A to peg B if: Can only move one disk at a time Cannot place a disk on top of a smaller one See handout

CS 122 – Feb. 6 Recursion Any repetitive action can be expressed recursively In other words, we can convert from loop to recursion Some programming languages only allow recursion Examples Handout: Practice recursion

Remarks Earlier we saw how to define a sequence recursively. What if we interleave 2 sequences? 8, 12, 14, 20, 20, 28, 26, 36, … Beware of infinite recursion – stack overflow In my example “Limit” program, we could call ourselves thousands of times before running out of memory for the program.

Practice For lab, see handout Practice.java How would you approach the problem of Counting from 1 to 10? 10 to 1? Counting by 5’s? Print the alphabet? Printing 10 stars? N stars? You always need a base case, and a recursive case (at least 1 of each) Converting an integer to/from binary string

CS 122 – Feb. 8 Recursion: More on number base conversion Another array application: search Linear search strategy Binary search strategy (Some searches require backtracking) Handouts: Recursive searches 8 queens

CS 122 – Feb. 10 Backtracking Recursion that takes care of mistakes, wrong turns Good for various types of searches Need to find solution to some problem without trying every combination We’ll look at a couple of examples from chess 8 queens problem Knight’s tour Handout: Knight’s Tour

8 Queens Is there some way to place 8 queens on a chess board so that no one can get captured? (There are no other pieces on the board, just the queens) There are over 4 billion ways to arrange the queens Don’t write 8 nested loops – waste

Outline of solution Place one queen in the first column. Place the next queen in the next column, in a place safe from attack If not able to place this queen, back up! It means previous combination of choices does not produce a solution. We got stuck. In the code, look for Pre and post condition (comment for us) Base case = we’re done When we backtrack, see value of done To find all solutions, set done to false to pretend we failed. Print solution in base case.

Knight’s tour Is it possible for a knight to make legal moves from square to square on a (5x5) chess board visiting all the squares exactly once? (8x8 board would be too slow) Code is a little ugly, due to knight moves and checking if off the board Details We start in a corner There are 8 possible directions to choose from Cells are numbered 1-25 Actually, an elegant iterative solution exists: Start in a corner & continue to rotate in same direction using outermost possible squares.

CS 122 – Feb. 13 Backtracking We saw some chess examples  How would we find all solutions to a problem, not just one? Something amazing: finding our way through a maze  Handouts: Maze

Maze Famous example of backtracking solve( ) is heart of solution – returns boolean Start at solve(0, 0) Done = false If blocked, return false // base cases If at exit, return true Else // recursive case: Mark this square as being tried Done = solve (go up) if not done, try another direction If done, mark the successful path  cascade returns

continued blocked( ) checks Wall Out of bounds Already tried that cell Where do we see backtracking In the code? In the output? Similar problems: 3-D, finding size of contiguous area or tumor

CS 122 – Feb. 15 Example: Upside-down numbers Defining things with recursion Specifically, sets of strings Recursive function to test whether a given string has this property Generating these strings. (How do we decide on length?) Handout Defining strings

Recursive definitions Recursion is good for defining things. If we can define a set of strings, then we can Generate possible strings in that set Test for membership – like a compiler checking syntax We start with a description of a set of strings (“language”) Ex. Strings consisting of 1 or more a’s. Can be helpful to write down example words Identify base case, recursive rule Write a formal definition (can express it in Java) See handout for examples

Accepting a string Starts with an ‘a’, followed by zero or more ‘b’s Function isGood(String s) if s is “a”, return true else if s length <= 1, but not “a”, return false else // length 2 or more verify that s ends with ‘b’ call isGood(s minus its last character) String is of the form anbn Function isGood(String s) if s is “”, return true else if s length is odd, return false else // even length verify s begins with ‘a’ and ends with ‘b’ call isGood(s minus first and last char’s)

CS 122 – Feb. 17 Recursion with sets of strings: Accepting a string Generating possible strings: how do we decide on the length?  concept of random recursion Handouts Accept Generate Lobster

Random recursion Let the machine decide how far to go Often, this means we are counting up, rather than down Ex. To return a random factorial, we effectively multiply 1 * 2 * 3, … and stop at some random place Technique is useful when you want to generate random text How many words do you want? Inside recursive function, we “decide” whether to take base case or not.

CS 122 – Feb. 20 Using recursion to … Generate all possible permutations Evaluate a numerical expression Handouts Review for midterm, including old test w/answers Permute Evaluator

Permutations How do we permute 4 objects { W, X, Y, Z } ? Write down W, then permute the 3 objects { X, Y, Z } Write down X, then permute the 3 objects { W, Y, Z } Write down Y, then permute the 3 objects { W, X, Z } Write down Z, then permute the 3 objects { W, X, Y } See what is happening in general? Each object gets swapped to the front temporarily Rest of list is permuted recursively What is the base case?

Expression evaluation What is a mathematical expression? Expr = “terms” separated by + or – Term = “factors” separated by * or / Factor = single number or another expression! Indirect recursion For simplicity, implementation assumes 1 digit operands Program works also for non-nested expressions: start with these

CS 122 – Feb. 22 Recap grammar in Old Mother Hubbard More about expressions Define recursively in form of a grammar Introduce alternate notations: prefix and postfix Recursively defined image Handouts: GUI recursion (Fractal)

Making couplets rhyme She went to the ____ to buy him a { ____ [ But when she came back, he was ____ the ] ____ } The 2nd and 4th blanks must rhyme. Select these words at the same time! (at….at) She went to the ____ to buy him a (ee….ee) Between the rhyming words, we have a phrase that can be chosen independently.

Expression grammar “An expression is 1+ terms separated by + or –” “A term is 1+ factors separated by * or /” “A factor is a single number or an expression” expr term | term + term | term – term term  factor | factor * factor | factor / factor factor  number | expr number  0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 In later courses, we’ll look at this in more detail to work out subtleties.

Notations There are 3 ways to write down a mathematical expression. We’ll study them in detail soon. Infix (default): operator sits between operands Prefix: operator appears before operands Postfix: operator appears after operands Advantage of prefix & postfix No ambiguity about order of operations So, we never need grouping symbols Can we define prefix and postfix expressions formally?

GUI recursion Drawing tiled images To make this work, the applet has to know where images are. drawImage() lets us display image without JLabel. It has several versions based on parameters. Fractal image. Organization: Applet contains panel which contains: Button panel 2 buttons 2 labels Koch snowflake panel Keeps track of level. For each level, 5 points on the side of a triangle.

CS 122 – Feb. 27 Design Software life cycle Classes discovery Class relationships State Handouts: OO design concepts Store design example

Design Software life cycle… includes maintaining code Don’t try to do everything at once Waterfall model: risk of over-designing, creating classes you’ll never use Spiral approach: go thru iterations, phases Creating a design Discover classes & their relationships Consider sequence of events  “states”, although this is not needed for simple programs

Design In real life, classes can be complex, with many attributes and operations. For example: Airline, Army, Bank, Casino, Hotel, Restaurant, Store Programs including such a class may need other classes as well, such as Employee, Customer, Product The state you are in determines the effect / functionality of input devices (buttons, dials). Consider accelerator pedal.

Examples Class relationships (database) for a Store What states would we encounter in… a microwave oven? a vending machine? an ATM? a calculator? a casino? buying an airline ticket? A GUI has buttons that need to respond based on state; multi-purpose buttons.

CS 122 – Feb. 29 Lab recap (designs) ATM design example Handouts Answers to test Complete list of lab questions ATM design ATM button & state table

Other examples In any design, we need to consider What buttons do we need? What events do we pay attention to? How do events and states relate? For next day, please be ready to discuss: Design a child’s game that will display a clock and ask the user to enter the time. Determine if the time entered is correct. Keep track of score. Different players may play. Design a similar game that tests arithmetic Design a vending machine

CS 122 – March 2 Design examples ATM  Game Vending machine What does it mean for a program to be in some state? What states are in a solitaire game? Handout: Game design

Review design For ATM, buttons A, B, C have different purpose depending on state – that’s why they have generic names Otherwise we’d need a lot more buttons! Take note of Attributes and operations of each class Table of states and buttons For vending machine, What buttons do we need? What events do we pay attention to? How do events and states relate?

Vending machine design Machine contains products Product has price, quantity – can be replenished Need to accept coins and user selection as “input event” that may signal a state change. Machine class: Product class: Attributes are: Attributes are: product [ ] name totalMoney price password / key quantityAvailable moneyInserted Operations are: Operations are: buy( ) emptyMoney( ) replenish( ) create all products

CS 122 – March 12 Sorting Many possible ways to accomplish the task Brings up concept of algorithm efficiency Handout: Selection and insertion sort

Sorting Some methods do better depending on type of data or how distributed Easy algorithms are not the fastest  Most efficient algorithms are clever One possible method, “swap” sort: for i = 0 to n – 1, for j = i+1 to n – 1, if a[ i ] and a[ j ] are out of order, swap them

Famous methods

Examples Selection sort: Find the largest value and swap it into first, find 2nd largest value and put it 2nd, etc. Bubble sort: Scan the list and see which consecutive values are out of order & swap them Insertion sort: Place the next element in the correct place by shifting other ones over to make room Swap sort: For each pair of elements, see if they are in correct sequence. If not, swap them.

Check out Enjoy this Web site demo of sorting methods http://cg.scs.carleton.ca/~morin/misc/sortalg Doing operations in parallel can speed up sorting In bubble sort, we can swap #1 and #2 at the same time as #3 and #4, etc.

CS 122 – March 14 Recursive sorts Merge sort: Split list in half until just 1-2 elements. Merge adjacent lists by collating them. Quick sort: We want “smaller” elements on left, and “larger” elements on right. Recursively continue on these 2 subarrays. Can be tricky to code Handouts: Code for recursive sorts Quick sort

Quick sort Reminiscent of merge sort, but the tough part is how we partition the (sub)-array Given a slice of the array a[ p .. r ], Need to decide on suitable pivot/threshold value. Start with left and right hands at both ends Work inward. Stop when you see Big number (> pivot) on the left Small number (< pivot) on the right, and Swap numbers if necessary The loop stops when the hands converge. This marks end of the first partition for the recursive calls.

CS 122 – March 16 Sorting using the Java API Radix sort Bucket sort Handout: Sorting using the API

Sorting with API If a is an array of primitive values, use Arrays.sort(a) If a is an array of objects, use Arrays.sort(a, comparatorObject) If a is an ArrayList, use Collections.sort(a, comparatorObject) Purpose of comparator is to tell Java how to compare 2 values belonging to some class. We need to create our own comparator class first! Arrays and Collections also have other helpful methods for max, min, search, shuffle….

Special sorts If we make some assumptions about the input data, it is possible to sort in linear time. In other words, by only using singly nested loops. Radix sort Dates from the time of punch cards Assume an upper bound on # digits of each number For each decimal place from right to left: Create 10 buckets labelled 0-9 Place each number into bucket based on its digit @ this decimal place Concatenate the buckets

Bucket sort Works well if values evenly distributed between lowest and highest Partition the range of values into n buckets Typically n = size of array For example, if you want 10 buckets, and the range of all values is 0-200, then the first bucket holds values 0-20, the next bucket holds values 20-40, etc. For each value in array, place into appropriate bucket Within each bucket, sort elements using any method. Ideally there would be very few elements per bucket. Concatenate the buckets.

CS 122 – March 19 Finish sorting Counting sort Stooge sort Collections (data structures) Aggregate data types other than array Defining our own data types that can hold many objects Example: “Bag” data structure Handout: Halloween (bag with ArrayList)

Sorting Counting sort Good when you have small number of distinct values or a small range  repetition Use 2nd array to record distribution of values Use distribution to tell you how many of each value to put into final array Example A = [ 3, 1, 2, 1, 2, 0, 3, 0, 1, 3, 2, 3 ] C = [ 2, 3, 3, 4 ] B = [ 0, 0, 1, 1, 1, 2, 2, 2, 3, 3, 3, 3 ]

Sorting Stooge sort Needlessly complex, approx. on order of n2.7 Purpose purely to study complexity of algorithms, originally a textbook exercise Given a (sub) array a[p..r], First, swap endpoints a[p] and a[r] if they are out of order if p..r consists of 3+ elements, recursively call yourself 3 times: first 2/3 of list, last 2/3 of list, first 2/3 of list again

Collections Very often we want to maintain a lot of data of the same type Many ways to conceive of a collection of data May be linear or nonlinear A general procedure: Write an interface – what ops should it do? Choose an underlying representation. Then you are ready to implement the interface Test the implementation. If not satisfied, you may need to change the rep’n

Considerations Common operations to a collection include: Insert, delete, search/get, sort, shuffle, how many, print whole thing out Design questions for the rep’n Where do elements get added? Does it matter? Can I have duplicate values? Are the values always sorted? Among (add, search, remove), which operation will be performed most frequently? Answers to these questions will help decide on rep’n

Iterator Once you have a collection implemented, may be helpful to have systematic way to visit all elements. Iterator = another class whose sole purpose is to traverse your collection A collection may have > 1 type of iterator. Iterator is an interface in the Java API. Methods: next( ) hasNext( ) remove( ) -- we rarely need this one

Iterators Here’s one way to include an iterator with your collection: Somewhere you need to define the iterator class Implement the 3 required methods Since we are not interested in removing elements, just throw UnsupportedMethodException In your collection’s implementation, include a method that will return an iterator object. Whoever has an iterator object (e.g. Driver class) can call next( ) and hasNext( ) as needed.

Example: Bag Let’s create our own collection: the Bag Purpose – to simulate contents of a bag/sack/box – we are generally uninterested in object’s location inside the Bag. Drawing numbers for bingo or similar game. Bag interface should include what operations? In particular, what operation(s) might a Bag have that other collections would not? And vice versa?

CS 122 – March 21 Data often collected into some kind of collection or data structure  May be linear or nonlinear, as appropriate Another linear data structure: linked list Handouts: Linked list guide Linked list implementation

Linked List Linear data structure, with individual elements called nodes. Unlike an ArrayList, the individual elements are not numbered. There is no direct way to obtain, say, the 10th element. Very little information is needed to maintain list: List itself consists of a head and tail. Head refers to first element in the list. Tail = last. Each node in the list knows its previous and next element So, instead of an array(list) of data, we have a linked list of node objects, each of which holds some data

LL Implementation First, create Node class 3 attributes: data, prev, next The data can be of any type. For this reason, we typically define Node generically. The prev and next are also Node objects. LinkedList class Attributes: head, tail (optional: # of elements) Operations: insert & remove are most important! What else? (Almost every class should have toString( ), at least for debugging)

LL handout Examine the various operations such as remove( ) and sort( ) We can test the LL implementation with Driver and Tester classes. Prints node address so we can see the actual “links” We have a separate Node class

Notes on LL Be sure to work out insert/delete operations. There are special cases to consider (see handout). Singly linked versus doubly linked Compared to ArrayList, which LL operations are slower, faster? The Java API already has a LinkedList class. Creating our own is a good exercise to learn how to implement collections, but in practice not necessary.

CS 122 – March 23 2 more linear data structures: Stack Queue (we’ll study this a little later) Handouts: Intro to Stacks and Queues Stack implementations Application: Balanced parentheses Application: Postfix evaluation

Stack A linear data structure where we insert & remove elements from the same end, e.g. the “top”. “Last in – first out” philosophy. The insert operation is called push Delete operation is called pop For example, what is left in a stack after performing these operations: push(3); push(7); push(4); pop( ); push(1); pop( ); pop( ); Stacks are good when you want things to be processed backwards & forwards Ex. Maze; certain card games; Recursion & backtracking

Implementation Stack Interface Useful operations: push, pop, peek, size, toString( ) (Several implementations possible based on rep’n) Representation: ArrayList of any type of object. In other words, we can define a generic stack to make the implementation as general (useful) as possible Boundary cases Need to handle empty stack situation. When is this a problem? Could also impose a logical limit on size. How? Why? API has a Stack implementation

Applications Some applications we will examine: Checking balanced parentheses in a mathematical expression Evaluating a postfix expression Converting an infix to a postfix expression You will see stacks in later courses When a computer program runs, the system must maintain a run-time stack to keep track of nested function calls. When a function returns, how does the system remember where to return back to?

Balanced ( ) Push every ( Pop a ( on every ) Ignore all other characters Result: When do we say input is OK? Stack should be empty when done with input. Bad if we have extra ( at the end, or no ( to pop when we see a ). Need to guard against a run-time error (empty stack).

Evaluate postfix Every operator needs 2 operands (the most recent 2) So, push numbers on the stack When you see an operator: Pop a number, and call it y Pop another number, and call it x Evaluate x OP y (don’t get this backwards!) Push the result When done, pop the answer.

Example Our postfix expression is: 25 5 – 6 7 + * 9 + When you see a number… push. When you see an operator… pop 2, evaluate, push. When no more input, pop answer.

CS 122 – March 26 Math expressions Evaluating postfix Evaluating prefix Converting from infix to postfix Handout: Infix to postfix worksheet Infix to postfix (Convert.java incomplete for lab)

Evaluating expressions How can a computer evaluate a numerical expression? First, need to introduce some notation: Infix: operator goes between numbers Prefix: operator goes before numbers Postfix: operator goes after numbers It turns out that evaluating postfix expressions is relatively quick & easy. So, our approach is a 2-step process: Convert an infix expression into postfix Evaluate the postfix expression

Eval prefix ? Incidentally, how would you evaluate a prefix expression? Not as easy. Try some examples: + 2 3 Push +, push 2 … + – 2 3 4 Push +, push -, push 2, … + 2 3 – 4 Push +, push 2, …

Infix  Postfix Observations: The order of the numbers inside the expression does not change Operators appear later in the postfix version, and possibly out of their original order Let’s look at simple examples first How does 1 + 2 become 1 2 + ? How does 1 + 2 – 3 become 1 2 + 3 – ? Compare what happens to 1 + 2 * 3 and 1 * 2 + 3. What is different, and why?

continued 1 + 2 * 3 becomes 1 2 3 * +: push +, push *, and when done pop *, pop + 1 * 2 + 3 becomes 1 2 * 3 +: push *, pop * & push +, when done pop + 1 * 2 / 3 + 4 – 5 becomes 1 2 3 * / 4 + 5 –: push *, pop * push /, pop / push +, pop + push -, at end pop - 1 + 2 * 3 – 4 becomes 1 2 3 * + 4 –: push +, push *, pop * pop + push -, at end pop -

More examples ( 1 + 2 ) * 3 becomes 1 2 + 3 * What happens when we get to the ) ? 2 – (3 – (4 – 5 * 6) + 7) * 8 becomes 2 3 4 5 6 * – – 7 + 8 * – push -, push (, push -, push (, push -, push *, at ) pop *-(, at + pop – push +, at ) pop +(, … Contrast these similar cases: 1 – 2 – 3 becomes 1 2 – 3 – 1 – (2 – 3) becomes 1 2 3 – –

Moral of story Don’t worry about the operands. Just print them out as you see them. Always push ( When you see ), pop until you see matching ( When you see an operator, pop those of same or higher precedence. Stop popping when you get to ( or an operator of lower precedence. Then push me.

CS 122 – March 28 Collections array, ArrayList, LinkedList, Stack, Queue, … Issues How do you want to use your data? How should they be (logically) arranged? In a linear fashion? Let’s look at queues and hashing Handout: Queue Hashing (to see how fast searches are)

Queue Similar to stack, except: Philosophy is “first in, first out” not “last in, first out” As a result, we add elements to one end of the queue, and remove from the opposite end Names of operations: enquque and dequeue Purpose Collect data before it can be processed Data needs to be buffered for some time (Has this happened to you?) Producer / consumer, Reader / writer, input / output, client / server Can be tricky to implement with array (Java API has no Queue class)

Set interface Often we want to store values in a collection, and our situation calls for: No duplicates Order of elements doesn’t matter (freedom to place elements arbitrarily) Desired operations: Add, remove, contains, iterator Java API has 2 Set implementations that are more efficient than ArrayList or Linked List. HashSet and TreeSet  Important to understand concepts of hashing & trees.

Hashing Hash table = A fancy array-type of data structure Purpose We want quick search; respectable insert, delete times May want to access any element any time, so we don’t want the restrictions of stack or queue There is no notion of a previous or next value We have no need to sort the data Our “bag” data structure could be a hash table if we desire Major feature: each element has a hash value or hash “code” that determines where into the hash table the object belongs.

How to hash Where does an object go into a hash table? The object needs to have some intrinsic hash value. Can be determined by an instance method hashCode( ). Example of hash code could be some unique attribute value, such as ID number. Next, we use a hash function to convert a hash value into the index into the hash table. Why? Because hash values can be very large. Use a mod function based on the size of the hash table.

In other words Hash values are usually not very interesting. hashCode( ) functions just have to make sure that the objects have a good chance of having unique hash values. The interesting part of hashing is figuring out where data winds up in the hash table – this is determined by the hash function. Hash table size is usually a prime number. We would like objects to map to different places in the hash tables. Otherwise we have a collision.

CS 122 – March 29 Hash code  Hash function  Collisions, and what to do about them Java’s “Hashtable” An array-type data structure allowing you to index on any value, not just a number. Hashing implementation done behind the scenes. Useful methods: put, get, containsKey, keys Handout: Hashtable of foods

Collisions Let’s suppose our hash table has size 11. A simple hash function would say index = hashCode % 11 What happens if we insert the values 23 and 34? 23 % 11 = 1, and 34%11 = 1. Both values map to the [ 1 ] entry of the table. Ex. What objects can live at position [ 0 ] ? There are 2 ways to resolve this situation. Chaining: allow multiple values to reside inside each cell of the hash table. Probing: On a collision, find a different nearby place to sit.

Chaining Basically it means that each cell of the hash table maintains a linked list of objects. Each object has a hash code that maps to this address. The Java API HashSet class uses this technique.

Probing Also known as open addressing If cell c is already occupied, find a nearby alternative. Linear probing Try c+1, c+2, c+3, … until you find a space At end of the table, wrap around to [ 0 ] and continue Try example with size 11 table: 24,45,3,87,65,32,1,12,17, using h(k) = k mod 11. Quadratic probing Try c+1, c+4, c+9, … instead Example: Apply hash function h(k)=(2k+1) mod 7 for k = 10, 15, 5, 8, 3, 20, 6. What happens?

Load factor Tells you how much of the hash table is occupied In practice, when load factor exceeds some threshold, we enlarge hash table and rehash. In Java’s HashSet, the initial size is 16, and the threshold load factor is 0.75 before doubling size.

CS 122 – April 2 The last major subject of course is nonlinear data structures Graphs (e.g. network) Trees (e.g. hierarchy) Consider graphs first, since trees are a special case To model something that “looks like” a graph Relationships between objects we can quantify/measure Graph terminology Creating our own Graph class: Representation, desired ops, applications

Graph A nonlinear data structure Useful to model any kind of network, or set of relationships Questions we may want to ask: How many vertices / edges are there? Does an edge exist from x to y? How far apart are x and y? How many edges incident on x? (i.e. find the degree) How many nodes are within some distance from x? Is y reachable from x? Is there a systematic way to visit every node and return back to the beginning?

Graph ideas A graph consists of A set of nodes (i.e. vertices) A set of edges The purpose of an edge is to “connect” two vertices: to make them adjacent to each other. Sometimes… The graph may be weighted: We may want to label a cost or distance on each edge. The graph may be directed: This indicates one-way traffic.

Representations Adjacency list For each vertex in the graph, we maintain a list (e.g. linked list or array list) of other vertices that are directly connected to this one Adjacency matrix  A 2-d array The vertices are in some order, such as alphabetical order (A, B, C, …) The entry in row / column indicates whether there is an edge or not. (1 or 0) Elegantly handles weighted and directed graphs.

Example What can we determine about the graph, when given its adjacency matrix?

Adjacency matrix For an ordinary graph The matrix is symmetric. For example, if A is adjacent to C, then C is adjacent to A. No vertex is adjacent to itself. So, the main diagonal is all 0’s. For a directed graph Matrix not likely to be symmetric For a weighted graph We still put 0’s along the main diagonal, to indicate zero cost or distance. To represent non-adjacent, use ∞ value.

Possible design Graph Attributes # vertices, # edges List of vertices Adjacency matrix Graph Operations Create empty graph Add vertex; add edge getVertex(key) edgeExists(v1, v2) degree(v) Neighbor iterator(v) BFS & DFS iterators (v) Is connected, etc. Design of Vertex depends on graph’s application… Vertex attributes Name Marked Level Others depending on app. Vertex operations Mark / unmark / is marked Get and set level equals

Some details What do entries in adjacency matrix mean? > 0 adjacent 0 myself –1 not adjacent (be careful) addVertex( ) – need to re-allocate array space This is the price we pay for using array not ArrayList addEdge( ) – the no parameter version uses default 1 findVertex(String) – used by iterator, edgeExists( ), addEdge( ) neighborIterator( ): just look at 1 row of adj matrix!

CS 122 – April 4 Graph implementation Relies on Vertex class Array representation of Vertices and edges Desired operations? Neighbor iterator Handout: Graph

Implementation Attributes: Vertex [ ], int [ ] [ ] When creating a graph we may do operations in this order: Create empty graph (in default constructor) Add in all the vertices, then the edges Desired operations? Add vertex; add edge Return # vertices; # edges See if some edge exists (or return the distance label) Find degree of a vertex Neighbor iterator

Applications Besides basic graph operatons, we may have specific questions, depending on why we are using a graph Finding a path (sequence of vertices) between 2 vertices In particular, find the shortest path Finding a cycle Seeing if the graph is connected or not Let us visit all the vertices 

Iterator Here is one way to implement an iterator Create an inner class, containing Attributes needed during the traversal Constructor initializing attributes hasNext( ) next( ) remove( ) – honestly, we won’t need this so you can leave its implementation empty Create a method that returns iterator object In driver/test class, use hasNext/next as needed

Neighbor iterator A systematic way to traverse all neighbors of a vertex Read across one row of the adjacency matrix Row number will remain constant Column number will change Constructor: initialize column number to first column representing a neighbor hasNext: have we reached the end of adj mat? Next: return corrent column #, and advance it to next neighbor’s column (Alternative approach: create ArrayList in constructor)

CS 122 – April 11 Systematic ways to visit all vertices in a (connected) graph Breadth-first search (BFS) Depth-first search (DFS) Handouts: Worksheet for BFS, DFS BFS / DFS code

Breadth-first search We assign a level to each vertex. Start with some vertex, and assign it level 0. Fan out in all directions at once (well, in some order). All the neighbors of the first vertex become the “level 1” vertices. Neighbors of the level 1 vertices become “level 2” In general, at level n, all the neighbors that do not already have a level number become “level n+1” Continue until all vertices assigned.

Depth-first search Start at some vertex. Continue down a path as far as you can until you reach a dead end. Then you backtrack to some point where you had a choice / fork in the road, and you now try a different route. For this to work, there needs to be a way to select a “next” vertex to go to – we use the neighbor iterator (e.g. alphabetical order).

Using BFS / DFS Iterators for BFS or DFS can be used to answer questions about a graph. Let’s try these… Finding path between 2 vertices DFS: handle it like a maze BFS: may be easier, and can find all paths simultaneously How to find a cycle Which iterator should we use? The Vertex “mark” attribute may be useful here. Is everybody connected?

CS 122 – April 13 Important graph application: Dijkstra’s shortest path algorithm Handout: Dijkstra worksheet (2 pages: instructions plus examples)

Dijkstra’s algorithm 4 7 9 6 8 7 4 3 2 3 1 6 How do you find the shortest path in a network? General case solved by Edsger Dijkstra, 1959

A 4 7 2 B C 3 4 Z Let’s say we want to go from “A” to “Z”. The idea is to label each vertex with a number – its best known distance from A. As we work, we may find a cheaper distance, until we “mark” or finalize the vertex. Label A with 0, and mark A. Label A’s neighbors with their distances from A. Find the lowest unmarked vertex and mark it. Let’s call this vertex “B”. Recalculate distances for B’s neighbors via B. Some of these neighbors may now have a shorter known distance. Repeat steps 3 and 4 until you mark Z.

A 4 7 2 B C 3 4 Z First, we label A with 0. Mark A as final. The neighbors of A are B and C. Label B = 4 and C = 7. Now, the unmarked vertices are B=4 and C=7. The lowest of these is B. Mark B, and recalculate B’s neighbors via B. The neighbors of B are C and Z. If we go to C via B, the total distance is 4+2 = 6. This is better than the old distance of 7. So re-label C = 6. If we go to Z via B, the total distance is 4 + 3 = 7.

A 4 7 2 B C 3 4 Z Now, the unmarked vertices are C=6 and Z=7. The lowest of these is C. Mark C, and recalculate C’s neighbors via B. The only unmarked neighbor of C is Z. If we go to Z via C, the total distance is 6+4 = 10. This is worse than the current distance to Z, so Z’s label is unchanged. The only unmarked vertex now is Z, so we mark it and we are done. Its label is the shortest distance from A.

A 4 7 2 B C 3 4 Z Postscript. I want to clarify something… The idea is to label each vertex with a number – its best known distance from A. As we work, we may find a cheaper distance, until we “mark” or finalize the vertex. When you mark a vertex and look to recalculate distances to its neighbors: We don’t need to recalculate distance for a vertex if marked. So, only consider unmarked neighbors. We only update a vertex’s distance if it is an improvement: if it’s shorter than what we previously had.

Dijkstra path The answer is not just a number In implementation, need some way to store the actual path taken. Possibilities: When Dijkstra’s algorithm is done, work backwards from the destination vertex to deduce the path. Look at the vertex labels of Z and its neighbors to see which neighbor logically comes before Z. For example, if label (Y) + edge YZ = label(Z). During Dijkstra, when labeling a vertex, also store the list of vertices encountered along the way from A.

CS 122 – April 16 Tree: a special kind of graph Good for storing information hierarchically Often we want (rooted) binary trees Traversing a tree More interesting options besides BFS and DFS! Handouts: Tree with preorder (simple implementation) Tree traversals (worksheet)

Tree Defined to be a connected acyclic graph All vertices are connected to all the others (note that this does not mean adjacent to all the others).  not disconnected, no isolated vertices  No cycle exists anywhere in the graph In CS most trees we use are binary trees There is a root vertex at the top Each node has up to 2 children. Specialized terminology: parent, left & right children, sibling, ancestor, descendant How would we design a Node class for trees?

Trees vs. graph A tree is a special kind of graph Trees may be: Rooted Among the rooted trees, one important kind is the binary tree! In it, each node has left & right subtree. Non-rooted. Also called “free trees” (We could make use of inheritance if we wanted all these possibilities implemented.) Binary tree class design: Attributes for a root and number of nodes Constructor(s), size( ), toString( ), preorderIterator( )

Building a binary tree Suppose we want to create a mathematical expression Simplest tree is a root with 2 children. Ex. 1 + 2: the root is “+”, 1 is left child, 2 is right child We don’t add nodes / edges one at a time! Instead, combine 2 subtrees with a common parent. Ex. For (1 + 2) * (3 + 4), we create little trees for 1 + 2, and 3 + 4, and then combine them with * on top. In this application, not interested in removing

Binary tree traversals Important operation is to visit all vertices  iterator General procedures like BFS and DFS do not take the hierarchical nature of the data into account. They “skip around” too much. 3 ways to traverse a binary tree, each done recursively. Preorder = process this node, and then call myself with each child Inorder = call myself with my left child, process this node, then call myself with my right child Postorder = call myself with each child, then process myself If “process” means print, traversals give rise to 3 exprression notations: prefix, infix and postfix.

Practice Is it possible to determine the tree, given a traversal? If we’re given an expression in prefix, infix, or postfix notation, we can build the tree. Some notes: Prefix and postfix notations never need parentheses In prefix, an operator is applied to the 2 “numbers” (atoms or subexpressions) that follow Similarly, in postfix, an operator is applied to the 2 numbers that precede it. A little more difficult for general binary tree.

Tree & traversal Given a (binary) tree, we can find its traversals. √ How about the other way? Mathematical expression had enough context information that 1 traversal would be enough. But in general, we need 2 traversals, one of them being inorder. Example: Draw the binary tree having these traversals. Postorder: S C X H R J Q T Inorder: S R C H X T J Q Hint: End of the postorder is the root of the tree. Find where the root lies in the inorder. This will show you the 2 subtrees. Continue with each subtree, finding its root and subtrees, etc. Another example... Inorder: R H B J E G C A K D F Postorder: R B E J H C K F D A G

CS 122 – April 18 Review tree traversals Binary Search Trees Handout 5 ways to arrange 3 nodes

Tree & traversal Given a (binary) tree, we can find its traversals. √ How about the other way? Mathematical expression had enough context information that 1 traversal would be enough. But in general, we need 2 traversals, one of them being inorder. Example: Draw the binary tree having these traversals. Postorder: S C X H R J Q T Inorder: S R C H X T J Q Hint: End of the postorder is the root of the tree. Find where the root lies in the inorder. This will show you the 2 subtrees. Continue with each subtree, finding its root and subtrees, etc. Another example... Inorder: R H B J E G C A K D F Postorder: R B E J H C K F D A G

Binary search tree Special kind of tree, an excellent data structure Efficient searches, insertions, deletions Operations should take logarithmic time  In the Java API, the TreeSet class is a kind of binary search tree called a red-black tree. BST with extra functionality to make sure the tree stays “balanced” as we insert/delete elements In ordinary BST, it’s possible for it to degenerate into a linked list if we’re unlucky Binary search idea may already be familiar to you: guessing games, spell checker.

BST Motivation Store data so it will be easy to find later. Does this list contain 52? 5,8,12, 13, 15, 18, 20, 25, 30, 32, 36, 38, 44, 45, 58, 61, 62, 77, 80 Key property of BST: each node’s value is: For all nodes: left child  you  right child Or better yet: all in L subtree you  all elements in R subtree Then, it’s easy to tell where to search for something, or where a new element must go.

BST operations insert (x) delete (x) find (x) predecessor (x) successor (x) And we can generalize, in case x is not inside: closestBefore (x) closestAfter(x)

Search, min, max Search(k) Just follow binary search strategy. Starting at root, follow left or right child… until you find k or discover it can’t exist. May be recursive or just a while loop. Next, how would you find …? Min key value in the tree Max key value in the tree

CS 122 – April 20 BST operations find, insert  predecessor, successor, delete Efficiency of BST Tradeoff ? balancing

Successor Pred & succ are analogous/symmetric cases, so let’s just look at successor function. Try example. succ(x): if right(x) not empty return min(right(x)) else // succ is lowest ancestor of x whose // left child is also ancestor of x y = parent(x) while y && x == right(y) x = y y = parent(x) return y The while loop goes up the tree until x is the left child of its parent

Insert, delete Insert(k) is like search  The new key becomes a left/right child as appropriate. Special case if tree initially empty. Delete(victim) – Careful… there are 3 cases, depending on how many children the victim has. If victim has no children: it’s a leaf. Just tell parent to set its left or right child pointer to null. If victim has 1 child: Reset the victim’s parent and child to point to each other now. What if victim has 2 children?

Delete, continued Deleting a node that has 2 children: Replace the victim with succ(victim) Now, need to repair pointers where succ(victim) came from. Turns out that succ(victim) has at most 1 child. It can’t have a left child… if it did, then it’s not really the succ. Go back to how we calculated succ(x) to see why it’s true. So, treat the removal of succ(victim) like the 0 or 1 child delete case. Example: insert 10, 5, 16, 3, 7, 12; then delete 3, 16, 10.

Comparison Comparing a TreeSet with an ArrayList, how long does it take to insert 1,000,000 elements, and then perform a futile search? TreeSet: 10.504s to insert; 0.001s to search ArrayList: 1.891s to insert; 0.161s to search What is the tradeoff?

Balance BST operations should take O(log2 n) time. But in the worst case, a BST acts like a linked list. Example: insert elements in sorted order Goal: as we insert elements, the BST should maintain balance How to define/quantify balance? We can measure the height of each node For each node, the height of its 2 children do not differ by more than 1. What if unbalanced? Rotate.

CS 122 – April 23 Analyzing binary trees Some definitions Quantitative relationship between height of tree and # of nodes. (balance)

Binary trees Each node has  2 children. Very useful for CS applications Special cases Full binary tree = each node has 0 or 2 children. Suitable for arithmetic expressions. Also called “proper” binary tree. Complete binary tree = taking “full” one step further: All leaves have the same depth from the root. As a consequence, all other nodes have 2 children.

Binary tree properties Suppose we have a full binary tree n = total number of nodes, h = height of tree Let’s establish lower and upper bounds on… The number of leaves The number of internal nodes The total number of nodes The height (The answer to this one will show us why it’s important for a tree to maintain balance!) Let’s repeat the above assuming it’s a full binary tree.

CS 122 – Jan. 9

CS 122 – Jan. 9

Presentation Transcript