
Lecture 5. Java Collection : Built-in Data Structures for Java

Lecture 5. Java Collection : Built-in Data Structures for Java. Cheng-Chia Chen. The Java Collection API. Interfaces: Collection  Set  SortedSet, List Map  SortedMap Iterator  ListIterator Comparator. Summary of all interfaces in the java Collection API.


Presentation Transcript


  1. Lecture 5. Java Collection :Built-in Data Structures for Java Cheng-Chia Chen

  2. The Java Collection API Interfaces: • Collection  • Set  SortedSet, • List • Map  SortedMap • Iterator  ListIterator • Comparator

  3. Summary of all interfaces in the java Collection API Collection Interfaces : The primary means by which collections are manipulated. • Collection  Set, List • A group of objects. • May or may not be ordered; May or may not contain duplicates. • Set  SortedSet • The familiar set abstraction. • No duplicates; May or may not be ordered. • SortedSet • elements automatically sorted, either in their natural ordering (see the Comparable interface), or by a Comparator object provided when a SortedSet instance is created. • List • Ordered collection, also known as a sequence. • Duplicates permitted; Allows positional access.

  4. Map SortedMap • A mapping from keys to values. • Each key can map to at most one value (function). • SortedMap • A map whose mappings are automatically sorted by key, either in the keys' natural ordering or by a comparator provided when a SortedMap instance is created.

  5. Infrastructure Iterators : Similar to the Enumeration interface, but more powerful, and with improved method names. • Iterator  ListIterator • provides the functionality of the Enumeration interface, • supports element removal from the backing collection. • ListIterator • Iterator for use with lists. • supports bi-directional iteration, element replacement, element insertion and index retrieval. Ordering • Comparable ( compareTo(Object) ) • Imparts a natural ordering to classes that implement it. • The natural ordering may be used to sort a list or maintain order in a sorted set or map. Many classes have been retrofitted to implement this interface. • Comparator ( compare(Object, Object) ) • Represents an order relation, which may be used to sort a list or maintain order in a sorted set or map. Can override a type's natural ordering, or order objects of a type that does not implement the Comparable interface.
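
The two ordering mechanisms on this slide can be sketched as follows. This is a minimal illustration, not part of the lecture: the class and method names are hypothetical, and it assumes a Java 8 or later compiler with generics, which the slides predate. String supplies the natural ordering through Comparable; the anonymous Comparator overrides it with ordering by length.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Comparator;
import java.util.List;

// Hypothetical demo class, not from the lecture.
class OrderingSketch {
    // Sorts a copy by natural ordering (String implements Comparable).
    static List<String> sortNatural(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy);
        return copy;
    }

    // Sorts a copy by length, overriding the natural ordering with a Comparator.
    static List<String> sortByLength(List<String> words) {
        List<String> copy = new ArrayList<>(words);
        Collections.sort(copy, new Comparator<String>() {
            @Override
            public int compare(String a, String b) {
                return Integer.compare(a.length(), b.length());
            }
        });
        return copy;
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("pear", "fig", "banana");
        System.out.println(sortNatural(words));  // [banana, fig, pear]
        System.out.println(sortByLength(words)); // [fig, pear, banana]
    }
}
```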

  6. Infrastructure Runtime Exceptions • UnsupportedOperationException • Thrown by collections if an unsupported optional operation is called. • ConcurrentModificationException • Thrown by Iterator and ListIterator if the backing collection is modified unexpectedly while the iteration is in progress. • Also thrown by sublist views of lists if the backing list is modified unexpectedly.
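
Both exceptions can be provoked in a few lines. The sketch below is illustrative only (hypothetical class and method names, Java 8+ syntax assumed): it modifies an ArrayList behind an active iterator, and calls the optional add() on the fixed-size list returned by Arrays.asList.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.List;

// Hypothetical demo class, not from the lecture.
class CollectionExceptionsSketch {
    // True if adding to the list behind an active iterator trips the fail-fast check.
    static boolean failsFast() {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "c"));
        try {
            for (String s : list) {
                list.add(s + "!"); // structural change made outside the iterator
            }
        } catch (ConcurrentModificationException e) {
            return true;
        }
        return false;
    }

    // True if the fixed-size list from Arrays.asList rejects the optional add().
    static boolean rejectsAdd() {
        List<String> fixed = Arrays.asList("a", "b");
        try {
            fixed.add("c");
        } catch (UnsupportedOperationException e) {
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println("fail-fast iterator: " + failsFast());
        System.out.println("add rejected: " + rejectsAdd());
    }
}
```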

  7. classes of the java collection API • AbstractCollection (Collection)  • AbstractSet (Set)  HashSet, TreeSet(SortedSet) • AbstractList (List) ArrayList, AbstractSequentialList  LinkedList • AbstractMap (Map)  • HashMap • TreeMap(SortedMap) • WeakHashMap • Arrays • Collections

  8. General-Purpose Implementation classes The primary implementations of the collection interfaces. • HashSet : Hash table implementation of the Set interface. • TreeSet : Red-black tree implementation of the SortedSet interface. • ArrayList : Resizable-array implementation of the List interface. • (Essentially an unsynchronized Vector.) The best all-around implementation of the List interface. • LinkedList : Doubly-linked list implementation of the List interface. • May provide better performance than the ArrayList implementation if elements are frequently inserted or deleted within the list. • Useful for queues and double-ended queues (deques). • HashMap : Hash table implementation of the Map interface. • (Essentially an unsynchronized Hashtable that supports null keys and values.) The best all-around implementation of the Map interface. • TreeMap : Red-black tree implementation of the SortedMap interface.

  9. The java.util.Collection interface • represents a group of objects, known as its elements. • no direct implementation • The primary use : pass around collections of objects where maximum generality is desired. Ex: List l = new ArrayList( c ) ;// c is a Collection object

  10. The definition public interface Collection { // Basic properties int size(); boolean isEmpty(); boolean contains(Object element); // use equals() for comparison boolean equals(Object o); int hashCode(); // new equals() requires new hashCode() // basic operations boolean add(Object o); // Optional; returns true if this changed boolean remove(Object o); // Optional; uses equals() (not ==)

  11. The Collection interface definition // Bulk Operations boolean containsAll(Collection c); // true if c ⊆ this // the following 4 methods are optional; the boolean ones return true if the contents changed boolean addAll(Collection c); // Optional; this = this ∪ c boolean removeAll(Collection c); // Optional; this = this - c boolean retainAll(Collection c); // Optional; this = this ∩ c void clear(); // Optional; this = {} // transformations Iterator iterator(); // collection -> iterator Object[] toArray(); // collection -> array Object[] toArray(Object[] a); // if |this| <= |a| => copy this to a and return a; else => create a new array b whose component type is that of a, copy this to b and return b }

  12. Why use Iterators instead of Enumerations • Iterator allows the caller to remove elements from the underlying collection during the iteration with well-defined semantics. • Method names have been improved. public interface Iterator { boolean hasNext(); // cf: hasMoreElements() Object next(); // cf: nextElement() void remove(); // Optional } // a simple Collection filter using an iterator, which Enumeration could not support static void filter(Collection c) { for (Iterator i = c.iterator(); i.hasNext(); ) { if ( noGood( i.next() ) ) i.remove(); // the element just returned by i.next() is removed from c // calling i.remove() again here would raise IllegalStateException: at most one remove() per next() } }
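
A runnable version of the filter idiom might look like the sketch below. The predicate noGood is a hypothetical stand-in (here it rejects empty strings), the class name is invented, and generics are assumed (Java 5+ would do for this one).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.Iterator;
import java.util.List;

// Hypothetical demo class, not from the lecture.
class IteratorFilterSketch {
    // Hypothetical predicate standing in for the slide's test: rejects empty strings.
    static boolean noGood(String s) {
        return s.isEmpty();
    }

    // Removes every rejected element in place, with at most one remove() per next().
    static void filter(Collection<String> c) {
        for (Iterator<String> i = c.iterator(); i.hasNext(); ) {
            if (noGood(i.next())) {
                i.remove();
            }
        }
    }

    public static void main(String[] args) {
        List<String> words = new ArrayList<>(Arrays.asList("a", "", "b", ""));
        filter(words);
        System.out.println(words); // [a, b]
    }
}
```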

  13. Examples of Bulk and array operations for Collection • Remove all instances of an element e from a collection c: c.removeAll(Collections.singleton(e)); // remove all e's in c c.removeAll(Collections.singleton(null)); // remove all nulls in c • collections to arrays : Collection c = new LinkedList(); // LinkedList is an implementation of Collection c.add("a"); c.add("b"); // c == ["a", "b"]; component type = Object Object[] ar1 = c.toArray(); // ok, ar1.length == 2 String[] ar2 = (String[]) c.toArray(); // compiles, but throws ClassCastException at run time: an Object[] cannot be cast to String[] String[] ar3 = (String[]) c.toArray(new String[0]); // ok, since toArray(new String[0]) returns an array with String component type.
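
The same two idioms, written with generics so that no cast is needed, could be sketched like this. Class and method names are hypothetical; Java 8+ is assumed.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedList;

// Hypothetical demo class, not from the lecture.
class ToArraySketch {
    // Copies the collection into a String[]; the argument only supplies the component type.
    static String[] toStringArray(Collection<String> c) {
        return c.toArray(new String[0]);
    }

    // Removes every occurrence of e (which may be null); returns true if c changed.
    static boolean removeEvery(Collection<String> c, String e) {
        return c.removeAll(Collections.singleton(e));
    }

    public static void main(String[] args) {
        Collection<String> c = new LinkedList<>();
        c.add("a");
        c.add("b");
        c.add(null);
        c.add("a");
        removeEvery(c, null);
        removeEvery(c, "a");
        String[] ar = toStringArray(c);
        System.out.println(ar.length + " element(s), first = " + ar[0]); // 1 element(s), first = b
    }
}
```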

  14. The Set Interface • is a Collection that cannot contain duplicate elements. • models the mathematical set abstraction. • contains no methods other than those inherited from Collection. • same signatures but different semantics ( meaning ) • Collection c = new LinkedList(); Set s = new HashSet(); • String o = "a"; • c.add(o); c.add(o) ; // both return true; c.size() == 2 • s.add(o); s.add(o) ; // 2nd add() returns false; s.size() == 1 • It adds the restriction that duplicate elements are prohibited. • Collection noDups = new HashSet(c);// a simple way to eliminate duplicates from c • Two Set objects are equal if they contain the same elements. • Two direct implementations: • HashSet TreeSet

  15. Basic Operations • A simple program to detect duplicates using set: import java.util.*; public class FindDups { public static void main(String args[]) { Set s = new HashSet(); // or new TreeSet(), another implementation of Set // following code uses Set methods only for (int i=0; i<args.length; i++) if (!s.add(args[i])) System.out.println("Duplicate detected: " + args[i]); System.out.println(s.size()+" distinct words detected: "+s); } } % java FindDups i came i saw i left Duplicate detected: i Duplicate detected: i 4 distinct words detected: [came, left, saw, i]

  16. Bulk Operations for Set objects • s1.containsAll(s2) returns true if s2 is a subset of s1. • s1.addAll(s2), s1.retainAll(s2), s1.removeAll(s2): • s1 = s1 ∪ s2, s1 = s1 ∩ s2, s1 = s1 – s2, respectively • return true iff s1 is modified. • For nondestructive operations: Set union = new HashSet(s1); // make a copy of s1 union.addAll(s2); Set intersection = new HashSet(s1); // may also use TreeSet(s1) intersection.retainAll(s2); Set difference = new HashSet(s1); difference.removeAll(s2);
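
The nondestructive idiom can be wrapped in small helper methods. This is a sketch with hypothetical names, assuming Java 8+ generics; each helper copies s1 first so neither argument is modified.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical demo class, not from the lecture.
class SetAlgebraSketch {
    static <T> Set<T> union(Set<T> s1, Set<T> s2) {
        Set<T> result = new HashSet<>(s1); // copy, so s1 stays untouched
        result.addAll(s2);
        return result;
    }

    static <T> Set<T> intersection(Set<T> s1, Set<T> s2) {
        Set<T> result = new HashSet<>(s1);
        result.retainAll(s2);
        return result;
    }

    static <T> Set<T> difference(Set<T> s1, Set<T> s2) {
        Set<T> result = new HashSet<>(s1);
        result.removeAll(s2);
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> a = new HashSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> b = new HashSet<>(Arrays.asList(3, 4));
        System.out.println(union(a, b).size());        // 4
        System.out.println(intersection(a, b));        // [3]
        System.out.println(difference(a, b).size());   // 2
    }
}
```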

  17. Example • Show all arguments that occur exactly once and those that occur more than once. import java.util.*; public class FindDups2 { public static void main(String args[]) { Set uniques = new HashSet(), dups = new HashSet(); for (int i=0; i<args.length; i++) if (! uniques.add(args[i])) dups.add(args[i]); uniques.removeAll(dups); // Destructive set-difference System.out.println("Unique words: " + uniques); System.out.println("Duplicate words: " + dups); } }

  18. The List Interface • A List is an ordered Collection(sometimes called a sequence). • may contain duplicate elements. • The List interface includes operations for: • Positional Access: set/get elements based on their numerical position in the list. • Search: search for a specified object in the list and return its numerical position. • List Iteration: extend Iterator semantics to take advantage of the list's sequential nature. • Range-view: perform arbitrary range operations on the list. • Three direct implementations: • ArrayList : resizable array • LinkedList : doubly linked-list • Vector : synchronized ArrayList.

  19. The List interface definition public interface List extends Collection { // Positional Access Object get(int index); // 0-based Object set(int index, Object element); // Optional; return old value void add([int index,] Object element); // Optional Object remove(int index); // Optional abstract boolean addAll(int index, Collection c); // Optional // Search int indexOf(Object o); int lastIndexOf(Object o); // Range-view List subList(int from, int to); // Iteration ListIterator listIterator([int f]); // default f = 0; return a listIterator with cursor set to position f}

  20. List in comparison with Vector (1.1) • shorter getter/setter names • get(int) // elementAt(int) • add(int,Object) // insertElementAt(Object,int) • Object set(int, Object) // void setElementAt(Object,int) • Object remove(int) // void removeElementAt(int) • Note: From java1.2 Vector also implements the List interface. • List concatenation: • list1.addAll(list2); // destructive; addAll returns a boolean, not the list • List list3 = new ArrayList(list1); // or LinkedList(list1) • list3.addAll(list2); // list3 equals list1 followed by list2 • Two List objects are equal if they contain the same elements in the same order. • List l1 = new LinkedList(l2); // l2 is an ArrayList • l1.equals(l2) ? true: false // returns true, but l1==l2 returns false.
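
Nondestructive concatenation and implementation-independent equality can be checked with a short sketch. The names are hypothetical and Java 8+ generics are assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedList;
import java.util.List;

// Hypothetical demo class, not from the lecture.
class ListConcatSketch {
    // Returns a new list holding list1 followed by list2; neither argument is modified.
    static <T> List<T> concat(List<T> list1, List<T> list2) {
        List<T> list3 = new ArrayList<>(list1);
        list3.addAll(list2);
        return list3;
    }

    public static void main(String[] args) {
        List<String> l2 = new ArrayList<>(Arrays.asList("a", "b"));
        List<String> l1 = new LinkedList<>(l2);
        System.out.println(l1.equals(l2)); // true: same elements, same order
        System.out.println(l1 == l2);      // false: two distinct objects
        System.out.println(concat(l1, l2)); // [a, b, a, b]
    }
}
```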

  21. E(0) E(1) E(2) E(3) ^ ^ ^ ^ ^ Index: 0 1 2 3 4 (cursor) previous() next() ListIterator public interface ListIterator extends Iterator { // from Iterator boolean hasNext(); Object next(); // backward iteration: boolean hasPrevious(); Object previous(); int nextIndex(); // == cursor position == pos of next() object int previousIndex(); // == nextIndex() – 1 = pos of previous() object // ; == -1 if cursor = 0; void remove(); // Optional void set(Object o); // Optional void add(Object o); // Optional }

  22. Set and Add operations in ListIterator • set(Object), remove() : replace (with the specified element) or remove the last element returned by next() or previous(). Ex: with the cursor at index 2 => either E(1) (if next() was called more recently than previous()) or E(2) (otherwise) would be replaced • add(Object): Inserts the specified element into the list, immediately before the current cursor position. • Ex: add(o) => E(0) E(1) o (cursor) E(2) E(3) • Backward Iteration: for(ListIterator i = list.listIterator(list.size()); i.hasPrevious(); ) { processing( i.previous()) ; } E(0) E(1) E(2) E(3) ^ ^ ^ ^ ^ Index: 0 1 2 3 4 (cursor)
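
The cursor model can be exercised with a small sketch: one helper walks a list backward from the end, the other uses add() to insert at the cursor during a forward pass. Names are hypothetical; Java 8+ is assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.ListIterator;
import java.util.Objects;

// Hypothetical demo class, not from the lecture.
class ListIteratorSketch {
    // Walks the list backward, starting with the cursor at list.size().
    static <T> List<T> reversedCopy(List<T> list) {
        List<T> out = new ArrayList<>();
        for (ListIterator<T> i = list.listIterator(list.size()); i.hasPrevious(); ) {
            out.add(i.previous());
        }
        return out;
    }

    // Inserts 'extra' right after every element equal to 'target'.
    static <T> void insertAfter(List<T> list, T target, T extra) {
        for (ListIterator<T> i = list.listIterator(); i.hasNext(); ) {
            if (Objects.equals(target, i.next())) {
                i.add(extra); // lands before the cursor, so next() will not return it
            }
        }
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("a", "b", "a"));
        insertAfter(list, "a", "x");
        System.out.println(list);               // [a, x, b, a, x]
        System.out.println(reversedCopy(list)); // [x, a, b, x, a]
    }
}
```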

  23. Some Examples • possible implementation of List.indexOf(Object): public int indexOf(Object o) { for (ListIterator i = listIterator(); i.hasNext(); ) if (o==null ? i.next()==null : o.equals(i.next())) return i.previousIndex(); // or i.nextIndex() -1 return -1; // Object not found } • replace all occurrences of one specified value with another: public static void replace(List l, Object val, Object newVal) { for (ListIterator i = l.listIterator(); i.hasNext(); ) if (val==null ? i.next()==null : val.equals(i.next())) i.set(newVal); }

  24. Range-view Operation • subList(int f, int t) returns a List view of the portion of this list whose indices range from f, inclusive, to t, exclusive: [f,t). • Ex: subList(1,3) = E(1),E(2) • This half-open range mirrors the typical for-loop: for (int i=f; i<t; i++) { ... } // iterate over the sublist • Changes to the sublist are reflected in the backing list. Ex: the following idiom removes a range of elements from a list: list . subList(from, to) . clear(); E(0) E(1) E(2) E(3) ^ ^ ^ ^ ^ Index: 0 1 2 3 4 (cursor)
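
Because the sublist is a view, both writes and structural changes made through it show up in the backing list. A minimal sketch with hypothetical names, assuming Java 8+:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical demo class, not from the lecture.
class SubListSketch {
    // Removes the elements with indices in [from, to) from the backing list.
    static <T> void removeRange(List<T> list, int from, int to) {
        list.subList(from, to).clear();
    }

    public static void main(String[] args) {
        List<String> list = new ArrayList<>(Arrays.asList("E0", "E1", "E2", "E3"));
        list.subList(1, 3).set(0, "X"); // index 0 of the view is index 1 of the list
        System.out.println(list);       // [E0, X, E2, E3]
        removeRange(list, 1, 3);
        System.out.println(list);       // [E0, E3]
    }
}
```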

  25. The Map Interface • A Map is an object that maps keys to values. • A map cannot contain duplicate keys: • Each key can map to at most one value. • Three implementations: • HashMap, which stores its entries in a hash table, is the best-performing implementation. • TreeMap, which stores its entries in a red-black tree, guarantees the order of iteration. • Hashtable has also been retrofitted to implement Map. • By convention, all implementations should provide two constructors (as Collection implementations do). Assume M is your implementation • M()// empty map • M(Map m)// a copy of map m

  26. The Map interface public interface Map { // Map does not extend Collection // Basic Operations // put or replace, return replaced object Object put(Object key, Object value); // optional Object get(Object key); Object remove(Object key); boolean containsKey(Object key); boolean containsValue(Object value); int size(); boolean isEmpty();

  27. The Map interface // Bulk Operations void putAll(Map t); //optional void clear(); // optional // Collection Views; // backed by the Map, change on either will be reflected on the other. public Set keySet(); // cannot duplicate by definition!! public Collection values(); // can duplicate public Set entrySet(); // no equivalent in Dictionary // nested Interface for entrySet elements public interface Entry { Object getKey(); Object getValue(); Object setValue(Object value); } }

  28. Possible Exceptions thrown by Map methods • UnsupportedOperationException • if the method is not supported by this map. • ClassCastException • if the class of a key or value in the specified map prevents it from being stored in this map. • ex: m.put("Name", new Integer(2)) // m actually accepts only String values • IllegalArgumentException • if some aspect of a key or value in the specified map prevents it from being stored in this map. • ex: put("Two", 2) // put expects an Object value • NullPointerException • if this map does not permit null keys or values, and the specified key or value is null.

  29. Basic Operations • a simple program to generate a frequency table import java.util.*; public class Freq { private static final Integer ONE = new Integer(1); public static void main(String args[]) { Map m = new HashMap(); // Initialize frequency table from command line for (int i=0; i<args.length; i++) { Integer freq = (Integer) m.get(args[i]); // key is a string m.put(args[i], (freq==null ? ONE : new Integer( freq.intValue() + 1))); } // value is Integer System.out.println( m.size()+" distinct words detected:"); System.out.println(m); } } • > java Freq if it is to be it is up to me to delegate 8 distinct words detected: {to=3, me=1, delegate=1, it=2, is=2, if=1, be=1, up=1}
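
On a current JDK the same frequency table can be written without explicit Integer objects. The sketch below is a variation, not the lecture's code: it assumes Java 8 or later for generics, autoboxing, and Map.merge, and the class name is hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class, not from the lecture.
class FreqSketch {
    // Maps each distinct word to the number of times it occurs.
    static Map<String, Integer> count(String[] words) {
        Map<String, Integer> m = new HashMap<>();
        for (String w : words) {
            m.merge(w, 1, Integer::sum); // store 1, or add 1 to the existing count
        }
        return m;
    }

    public static void main(String[] args) {
        Map<String, Integer> m = count(args);
        System.out.println(m.size() + " distinct words detected:");
        System.out.println(m); // iteration order of a HashMap is unspecified
    }
}
```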

  30. Bulk Operations • clear() : removes all of the mappings from the Map. • putAll(Map) operation is the Map analogue of the Collection interface's addAll(…) operation. • can be used to create an attribute map with default values. Here's a static factory method demonstrating this technique: • static Map newAttributeMap(Map defaults, Map overrides) { Map result = new HashMap(defaults); result.putAll(overrides); return result; }

  31. Collection Views methods • allow a Map to be viewed as a Collection in three ways: • keySet: the Set of keys contained in the Map. • values: The Collection of values contained in the Map. This Collection is not a Set, as multiple keys can map to the same value. • entrySet: The Set of key-value pairs contained in the Map. • The Map interface provides a small nested interface called Map.Entry that is the type of the elements in this Set. • the standard idiom for iterating over the keys in a Map: for (Iterator i = m.keySet().iterator(); i.hasNext(); ) { System.out.println(i.next()); if(no-good(…)) i.remove() ; } // removal through the iterator is reflected in the backing Map • Iterating over key-value pairs for (Iterator i=m.entrySet().iterator(); i.hasNext(); ) { Map.Entry e = (Map.Entry) i.next(); System.out.println(e.getKey() + ": " + e.getValue()); }
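
Since the views are backed by the map, changes made through them reach the map itself. The following sketch (hypothetical names and selection criterion, Java 8+ assumed) removes entries through the keySet iterator and updates values through Map.Entry.setValue.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

// Hypothetical demo class, not from the lecture.
class MapViewsSketch {
    // Removes every mapping whose key starts with the prefix, via the keySet view.
    static void removeKeysWithPrefix(Map<String, Integer> m, String prefix) {
        for (Iterator<String> i = m.keySet().iterator(); i.hasNext(); ) {
            if (i.next().startsWith(prefix)) {
                i.remove(); // also removes the mapping from m
            }
        }
    }

    // Doubles every value in place through the entrySet view.
    static void doubleValues(Map<String, Integer> m) {
        for (Map.Entry<String, Integer> e : m.entrySet()) {
            e.setValue(e.getValue() * 2); // not a structural change, so safe while iterating
        }
    }

    public static void main(String[] args) {
        Map<String, Integer> m = new HashMap<>();
        m.put("tmp1", 1);
        m.put("a", 2);
        m.put("tmp2", 3);
        removeKeysWithPrefix(m, "tmp");
        doubleValues(m);
        System.out.println(m); // {a=4}
    }
}
```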

  32. Permutation groups of words import java.util.*; import java.io.*; public class Perm { public static void main(String[] args) { int minGroupSize = Integer.parseInt(args[1]); // Read words from file and put into simulated multimap Map m = new HashMap(); try { BufferedReader in = new BufferedReader(new FileReader(args[0])); String word; while((word = in.readLine()) != null) { String alpha = alphabetize(word); // normalize word, e.g. success -> ccesssu List l = (List) m.get(alpha); if (l==null) m.put(alpha, l=new ArrayList()); l.add(word); } } catch(IOException e) { System.err.println(e); System.exit(1); }

  33. // Print all permutation groups above size threshold for (Iterator i = m.values().iterator(); i.hasNext(); ) { List l = (List) i.next(); if (l.size() >= minGroupSize) System.out.println(l.size() + ": " + l); } }

  34. // bucket sort implementation; assumes every character code is below 256 and keeps only the letters 'a'..'z' private static String alphabetize(String s) { int count[] = new int[256]; int len = s.length(); for (int i=0; i<len; i++) count[s.charAt(i)]++; StringBuffer result = new StringBuffer(len); for (char c='a'; c<='z'; c++) for (int i=0; i<count[c]; i++) result.append(c); return result.toString(); } }

  35. Some results % java Perm dictionary.txt 8 9: [estrin, inerts, insert, inters, niters, nitres, sinter, triens, trines] 8: [carets, cartes, caster, caters, crates, reacts, recast, traces] 9: [capers, crapes, escarp, pacers, parsec, recaps, scrape, secpar, spacer] 8: [ates, east, eats, etas, sate, seat, seta, teas] 12: [apers, apres, asper, pares, parse, pears, prase, presa, rapes, reaps, spare, spear] 9: [anestri, antsier, nastier, ratines, retains, retinas, retsina, stainer, stearin] 10: [least, setal, slate, stale, steal, stela, taels, tales, teals, tesla] 8: [arles, earls, lares, laser, lears, rales, reals, seral] 8: [lapse, leaps, pales, peals, pleas, salep, sepal, spale] 8: [aspers, parses, passer, prases, repass, spares, sparse, spears] 8: [earings, erasing, gainers, reagins, regains, reginas, searing, seringa] 11: [alerts, alters, artels, estral, laster, ratels, salter, slater, staler, stelar, talers] 9: [palest, palets, pastel, petals, plates, pleats, septal, staple, tepals] …
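
The multimap technique behind Perm can be tried without a dictionary file. The sketch below is a reduced variant under stated assumptions: it groups an in-memory word list, swaps the bucket sort for Arrays.sort on the word's characters, and uses generics (Java 8+). All names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo class, not from the lecture.
class AnagramGroupsSketch {
    // Normalizes a word by sorting its characters, e.g. "success" becomes "ccesssu".
    static String alphabetize(String s) {
        char[] chars = s.toCharArray();
        Arrays.sort(chars);
        return new String(chars);
    }

    // Simulated multimap: normalized key -> all words that are permutations of it.
    static Map<String, List<String>> group(List<String> words) {
        Map<String, List<String>> m = new HashMap<>();
        for (String word : words) {
            String alpha = alphabetize(word);
            List<String> l = m.get(alpha);
            if (l == null) {
                m.put(alpha, l = new ArrayList<>());
            }
            l.add(word);
        }
        return m;
    }

    public static void main(String[] args) {
        Map<String, List<String>> m = group(Arrays.asList("east", "eats", "tea", "seat"));
        for (List<String> l : m.values()) {
            if (l.size() >= 2) {
                System.out.println(l.size() + ": " + l); // 3: [east, eats, seat]
            }
        }
    }
}
```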

  36. The SortedSet Interface • A SortedSet is a Set that maintains its elements in ascending order, sorted according to the elements' natural order (via the Comparable interface), or according to a Comparator provided at SortedSet creation time. • In addition to the normal Set operations, the SortedSet interface provides operations for: • Range-view: Performs arbitrary range operations on the sorted set. • Endpoints: Returns the first or last element in the sorted set. • Comparator access: Returns the Comparator used to sort the set (if any). • Standard constructors: Let S be the implementation • S( [ Comparator ] )// empty set • S(SortedSet)// copy set • Implementation: TreeSet

  37. The SortedSet public interface SortedSet extends Set { // Range-view SortedSet subSet(Object f, Object t); // returns [f, t); empty if f equals t SortedSet headSet(Object t); // [first(), t) SortedSet tailSet(Object f); // [f, last()] // Endpoints Object first(); Object last(); // Comparator access Comparator comparator(); // null if the natural ordering is used }
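
The half-open range views and the endpoint methods can be seen on a small TreeSet. This is a sketch with hypothetical names and sample data, assuming Java 8+.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical demo class, not from the lecture.
class SortedSetSketch {
    // Returns the words w with from <= w < to, in ascending order.
    static List<String> range(Collection<String> words, String from, String to) {
        SortedSet<String> sorted = new TreeSet<>(words);
        return new ArrayList<>(sorted.subSet(from, to));
    }

    public static void main(String[] args) {
        SortedSet<String> s = new TreeSet<>(Arrays.asList("doc", "bashful", "happy", "grumpy"));
        System.out.println(s.first() + " .. " + s.last());    // bashful .. happy
        System.out.println(s.headSet("grumpy"));              // [bashful, doc]
        System.out.println(s.tailSet("grumpy"));              // [grumpy, happy]
        System.out.println(s.subSet("doc", "doc").isEmpty()); // true
    }
}
```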

  38. The SortedMap Interface • A SortedMap is a Map that maintains its entries in ascending order, sorted according to the keys' natural order, or according to a Comparator provided at SortedMap creation time. • In addition to the normal Map operations, the SortedMap interface provides operations for: • Range-view: Performs arbitrary range operations on the sorted map. • Endpoints: Returns the first or last key in the sorted map. • Comparator access: Returns the Comparator used to sort the map (if any). • Constructors provided by implementations: • M([Comparator])// empty SortedMap • M(SortedMap)// copy Map • Implementation: TreeMap

  39. The SortedMap interface // analogous to SortedSet public interface SortedMap extends Map { Comparator comparator(); // range-view operations SortedMap subMap(Object fromKey, Object toKey); SortedMap headMap(Object toKey); SortedMap tailMap(Object fromKey); // member access; // Don’t forget bulk of other Map operations available Object firstKey(); Object lastKey(); } // throws NoSuchElementException if m.isEmpty()

  40. SortedMap Operations • Operations inherited from Map behave identically to normal maps with two exceptions: • [keySet() | entrySet() | values()].iterator() traverses the collections in key-order. • toArray(…) on these views contains the keys, values, or entries in key-order. • Although not guaranteed by the interface, • the toString() method of all the JDK's SortedMap implementations returns a string containing all the elements in key-order.

  41. Example SortedMap m = new TreeMap(); m.put("Sneezy", "common cold"); m.put("Sleepy", "narcolepsy"); m.put("Grumpy", "seasonal affective disorder"); System.out.println( m.keySet() ); System.out.println( m.values() ); System.out.println( m.entrySet() ); • Running this snippet produces this output: [ Grumpy, Sleepy, Sneezy] [ seasonal affective disorder, narcolepsy, common cold] [ Grumpy=seasonal affective disorder, Sleepy=narcolepsy, Sneezy=common cold]
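
Extending the same map, the range-view and endpoint operations can be sketched as follows. The class and method names are hypothetical; the data is the slide's, and Java 8+ generics are assumed.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical demo class, not from the lecture.
class SortedMapSketch {
    // Builds the slide's example map; TreeMap keeps the keys in ascending order.
    static SortedMap<String, String> dwarfs() {
        SortedMap<String, String> m = new TreeMap<>();
        m.put("Sneezy", "common cold");
        m.put("Sleepy", "narcolepsy");
        m.put("Grumpy", "seasonal affective disorder");
        return m;
    }

    public static void main(String[] args) {
        SortedMap<String, String> m = dwarfs();
        System.out.println(m.firstKey() + " .. " + m.lastKey()); // Grumpy .. Sneezy
        System.out.println(m.headMap("Sleepy")); // {Grumpy=seasonal affective disorder}
        System.out.println(m.tailMap("Sleepy")); // {Sleepy=narcolepsy, Sneezy=common cold}
    }
}
```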

  42. Actual Collection and Map Implementations • Implementations are the actual data objects used to store collections (and Maps). Three kinds of implementations: • General-purpose Implementations • the public classes that provide the primary implementations of the core interfaces. • Wrapper Implementations • used in combination with other implementations (often the general-purpose implementations) to provide added functionality. • Convenience Implementations • Convenience implementations are mini-implementations, typically made available via static factory methods that provide convenient, efficient alternatives to the general-purpose implementations for special collections (like singleton sets).

  43. General Purpose Implementations

  44. Properties of the implementations • consistent names as well as consistent behavior. • full implementations [of all the optional operations]. • All permit null elements, keys and values. • unsynchronized. • remedy the deficiency of Hashtable and Vector • can become synchronized through the synchronization wrappers • All have fail-fast iterators, which detect illegal concurrent modification during iteration and fail quickly and cleanly. • All are Serializable, • all support a public clone() method. • should be thinking about the interfaces rather than the implementations. The choice of implementation affects only performance.

  45. HashSet vs TreeSet (and HashMap vs TreeMap) • HashSet/HashMap is much faster (constant time vs. log time for most operations), but offers no ordering guarantees. • always use HashSet/HashMap unless you need to use the operations in the SortedSet, or in-order iteration. • choose an appropriate initial capacity of your HashSet if iteration performance is important. • The default initial capacity is 101 (in the JDK 1.2 release described here; later releases default to 16), and that's often more than you need. • can be specified using the int constructor. • Set s = new HashSet(17); // set the initial capacity (number of buckets) to 17

  46. ArrayList vs LinkedList • Most of the time, you'll probably use ArrayList. • offers constant time positional access • Think of ArrayList as Vector without the synchronization overhead. • Use LinkedList if you frequently add elements to the beginning of the List, or iterate over the List deleting elements from its interior. • These operations are constant time in a LinkedList but linear time in an ArrayList.

  47. Wrapper Implementations • are implementations that delegate all of their real work to a specified collection, but add some extra functionality on top of what this collection offers. • These implementations are anonymous: • the JDK provides a static factory method. • All are found in the Collections class which consists solely of static methods. • Synchronization Wrappers • public static Collection synchronizedCollection(Collection c); • public static Set synchronizedSet(Set s); • public static List synchronizedList(List list); • public static Map synchronizedMap(Map m); • public static SortedSet synchronizedSortedSet(SortedSet s); • public static SortedMap synchronizedSortedMap(SortedMap m);
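
A synchronization wrapper makes each individual call thread-safe, but iteration spans many calls, so the caller is expected to lock the wrapper while iterating. The sketch below illustrates both points; the class name, thread counts, and helper methods are hypothetical, and Java 8+ is assumed.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not from the lecture.
class SynchronizedWrapperSketch {
    // Starts 'threads' workers that each append 'perThread' ones to one shared list.
    static List<Integer> fill(int threads, final int perThread) {
        final List<Integer> shared = Collections.synchronizedList(new ArrayList<Integer>());
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(new Runnable() {
                @Override
                public void run() {
                    for (int i = 0; i < perThread; i++) {
                        shared.add(1); // each call is synchronized by the wrapper
                    }
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return shared;
    }

    // Iteration is a sequence of calls, so the caller holds the wrapper's lock itself.
    static int sum(List<Integer> shared) {
        int total = 0;
        synchronized (shared) {
            for (int n : shared) {
                total += n;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(sum(fill(4, 1000))); // 4000
    }
}
```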

  48. read-only access to Collection/Maps • Unmodifiable Wrappers • public static Collection unmodifiableCollection(Collection c); • public static Set unmodifiableSet(Set s); • public static List unmodifiableList(List list); • public static Map unmodifiableMap(Map m); • public static SortedSet unmodifiableSortedSet(SortedSet s); • public static SortedMap unmodifiableSortedMap(SortedMap m);
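
A typical use of an unmodifiable wrapper is to hand out a read-only view of internal state. The following is a sketch under assumptions (the registry class and its methods are invented for illustration; Java 8+): the view rejects modification but still tracks later changes to the backing list.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not from the lecture.
class UnmodifiableViewSketch {
    private final List<String> names = new ArrayList<>();

    // Only the owner can modify the backing list.
    void register(String name) {
        names.add(name);
    }

    // Callers get a read-only view; it reflects later register() calls.
    List<String> names() {
        return Collections.unmodifiableList(names);
    }

    public static void main(String[] args) {
        UnmodifiableViewSketch registry = new UnmodifiableViewSketch();
        List<String> view = registry.names();
        registry.register("a");
        System.out.println(view); // [a]
        try {
            view.add("b");
        } catch (UnsupportedOperationException e) {
            System.out.println("view is read-only");
        }
    }
}
```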

  49. Convenience Implementations • mini-implementations that can be more convenient and more efficient than the general purpose implementations • available via static factory methods or exported constants in class Arrays or Collections. • List-view of an Array • List l = Arrays.asList(new Object[100]); // list of 100 nulls. • Immutable Multiple-Copy List • List l = new ArrayList(Collections.nCopies(1000, new Integer(1))); • Immutable Singleton Set • c.removeAll(Collections.singleton(e)); • profession.values().removeAll(Collections.singleton(LAWYER)); • (Immutable) Empty Set and Empty List Constants • Collections.EMPTY_SET Collections.EMPTY_LIST • Collections.EMPTY_MAP // add("1") -> UnsupportedOp… Exception

  50. The Arrays class • static List asList(Object[] a) // a cannot be a primitive array such as int[],… • Returns a fixed-size list backed by the specified array; add() and remove() are not supported. • static int binarySearch(Type[] a, Type key) • Searches the specified (sorted) array for the specified value using the binary search algorithm. Type can be any primitive type (except boolean) or Object. • static boolean equals(Type[] a1, Type[] a2) • static void fill(Type[] a [,int f, int t], Type val) • Assigns the specified val to each element of the specified array of Type. • static void sort(Type[] a [, int f, int t]) • Sorts the specified range of the specified array into ascending numerical order. • static void sort(Object[] a [, int f, int t], Comparator c) • Sorts the specified array of objects according to the order induced by the specified comparator.
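
A few of these utilities in action, as a sketch with hypothetical names and sample data (Java 8+ assumed). Note that binarySearch requires a sorted array, so the helper sorts a copy first.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical demo class, not from the lecture.
class ArraysSketch {
    // Sorts a copy, then binary-searches it; the result indexes the sorted copy (negative if absent).
    static int indexInSorted(int[] a, int key) {
        int[] copy = a.clone();
        Arrays.sort(copy);
        return Arrays.binarySearch(copy, key);
    }

    // Sorts a copy of the array in descending order using a Comparator.
    static String[] sortedDescending(String[] a) {
        String[] copy = a.clone();
        Arrays.sort(copy, Collections.reverseOrder());
        return copy;
    }

    public static void main(String[] args) {
        String[] arr = {"b", "a", "c"};
        List<String> view = Arrays.asList(arr); // fixed-size list backed by arr
        view.set(0, "z");
        System.out.println(arr[0]); // z: the write went through to the array

        int[] filled = new int[4];
        Arrays.fill(filled, 1, 3, 7); // fills indices 1 and 2
        System.out.println(Arrays.toString(filled)); // [0, 7, 7, 0]

        System.out.println(indexInSorted(new int[] {5, 1, 3}, 3)); // 1
        System.out.println(Arrays.toString(sortedDescending(arr))); // [z, c, a]
    }
}
```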
