91.102 - Computing II

Modularity, Information Hiding and Abstract Data Types.

Difficulty: Programs that solve “Real World” problems can get very large - up to millions of lines of code. Nobody can understand or remember that many lines of code.

Problem: How do we make it possible for normal human beings to contribute to such large endeavors?

Note: I did not say “How do we remove the difficulty”, because we do not KNOW how to remove it. We have some techniques that help us get close enough to produce - much of the time - usable programs within feasible budgets and time requirements.


There are two major ideas (borrowed from other, much older, engineering disciplines):

1) Break the task up into well-defined pieces;

2) Hide the implementation details of the pieces whenever possible.

What do they imply?


  • The first idea requires that we find how to break the project into small pieces that can be individually designed, built and glued back together in such a way that the “big product” can be delivered.

  • The second idea allows us to recover from occasional bad implementation decisions without having to start all over again. A side benefit is that good implementations of “pieces” might be reusable by other projects, cutting their costs down.


What is the mechanism we provide to implement the policies just described?

The MODULE.

This consists, physically, of two parts:

Interface.

Implementation.

The Interface provides the PUBLIC information, i.e. the information a USER needs to make use of the functionality.

The Implementation contains the PRIVATE information, i.e. the implementation of the functionality advertised in the Interface File.


[Figure: a User Program (or other Module) sees only the Interface (Public Information, the Header File); the Implementation (Private Information, the Code File) sits behind it.]


What IS a Module?

A set of declarations placed into service inside a program. (A quote from the text…)

In somewhat more detail, a module is a unit of organization of a software system that

A) packages together a collection of entities (data and operations) that provide a set of capabilities useful in the solution of a class of problems;

B) carefully controls what external users of the module can see and use.

(NB: this has NOTHING to do with C - every useful language implements some variant of these ideas: C provides one of the LEAST sophisticated variants)


A common characteristic of Modules through all the languages that support this mechanism is Separate Compilation.

This simply means that collections of functions and data can be compiled independently of one another and of any program that may use them. Any changes to the user program, or to any of the collections require only recompilation of a minimal number of modules. This may not be important with programs of a few hundred or a few thousand lines, but is crucial to the development of software that takes hundreds of thousands or millions of lines to deliver.


How we create a Module (C notation…)

ModuleInterface.h

A file that contains all the entities that must be visible to the user of the module: any constants, type definitions, variable definitions, and functions (i.e., function prototypes) that the user’s program is allowed to have explicit access to, to use or modify - depending on the entity.

ModuleImplementation.c

A file that contains all the private entities: the implementation code for the functions, and all those constants, variables and functions which the module user may NOT have direct access to.
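As a rough illustration of this two-file split, the sketch below shows a tiny counter module. The module and all of its names (CounterInterface.h, CounterImplementation.c, CounterReset, CounterIncrement, CounterValue) are hypothetical and not part of the course material. Both parts appear in one listing for brevity; in practice each would live in its own file.

```c
/* ---- CounterInterface.h: the PUBLIC part (hypothetical module) ---- */
extern void CounterReset(void);      /* set the count back to zero */
extern void CounterIncrement(void);  /* add one to the count       */
extern int  CounterValue(void);      /* read the current count     */

/* ---- CounterImplementation.c: the PRIVATE part ---- */
/* 'static' at file scope gives Count internal linkage,
   so code in other files cannot refer to it by name. */
static int Count = 0;

void CounterReset(void)     { Count = 0; }
void CounterIncrement(void) { ++Count; }
int  CounterValue(void)     { return Count; }
```

A user program sees only the three prototypes. Because Count is declared static, the representation of the counter could later change without any change to the programs that use it.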


[Figure: both the User program and ModuleImplementation.c (the Implementation, Private Information, Code File) contain the directive #include "ModuleInterface.h".]


How we USE a Module:

The USER PROGRAM must request the interface file via an include directive.

Example:

#include <stdio.h>            /* include system file */
/* ... other system inclusions ... */
#include "ModuleInterface.h"  /* include non-system module */
/* ... other modules ... */
/* ... user program ... */


An Example: Priority Queues.

A priority queue is a finite collection of items of the same type, each associated with a number called “the priority of the item”, for which the following operations are defined:

1) Initialization - returns an empty PQ.

2) Check for Empty - is the PQ empty or not?

3) Check for Full - is there any more room in the PQ?

4) Insert a new item into an existing (not Full) PQ.

5) If the PQ is not Empty, remove from it an item X of highest priority.


Notice that we have said NOTHING YET about implementation: the Priority Queue is an ABSTRACT DATA TYPE.

The implementor will have to decide on implementation details. A good “user interface” (Module Interface, sometimes also known as API - Application Programmer Interface) - and a good design - will make the implementation details invisible to the user.


What are Priority Queues used for? Every time you run a task on a computer, the task ends up on some kind of priority queue. The Operating System manages a number of such queues to provide all kinds of services: printing your output, scheduling your task to be run, saving your files to disk, etc.

Any time you attempt to use a “shared resource”, you end up on some kind of PQ, waiting for your turn… So they are important and they are very common.


Remember the list functions: Head, Tail and Cons?

They allowed us to construct a list starting with an empty list and items from some universe. The construction added one item at a time. We could find the item at head of the list by just applying the function Head to the list; we could find the remainder of the list by applying the function Tail.

In particular:

Tail(Cons(&info, L)) = L

Head(Cons(&info, L)) = &info

And

Cons(Head(L), Tail(L))

returns a list with the same contents as L.
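For reference, here is one possible way to write these three functions in C. It is only a sketch: the item type (int), the node layout and the field names are assumptions made for the example, and the empty list is represented by NULL.

```c
#include <stdlib.h>

typedef int ItemType;                 /* assumption: the items are ints */

typedef struct ListNodeTag {
    ItemType *Info;                   /* address of the item, as in Cons(&info, L) */
    struct ListNodeTag *Rest;         /* the remainder of the list */
} ListNode;

typedef ListNode *List;               /* the empty list is NULL */

/* Cons: build a new list whose head is Info and whose tail is L */
List Cons(ItemType *Info, List L)
{
    List N = malloc(sizeof *N);
    if (N == NULL)
        abort();                      /* out of memory: give up in this sketch */
    N->Info = Info;
    N->Rest = L;
    return N;
}

/* Head and Tail must only be applied to a non-empty list */
ItemType *Head(List L) { return L->Info; }
List      Tail(List L) { return L->Rest; }
```

With these definitions Tail(Cons(&info, L)) is L itself and Head(Cons(&info, L)) is &info, matching the identities above.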


We could, by analogy, introduce functions PQCons, PQHead and PQTail. What should they do?

Let PQ be a variable pointing to the Priority Queue

PQHead(PQ) = &highestPriorityItemInPQ

PQTail(PQ) = &aPriorityQueue

which contains all the items in PQ EXCEPT FOR the highest priority one.

PQCons(&info, PQ) = PQ', (the address of) a new priority queue, containing all the old elements plus the new one.


The main thing to observe is that the order of insertion is unrelated to the order of extraction. The order of extraction depends on a property of the information field which we call the PRIORITY.


You may observe that a LIST - as defined by Head, Tail and Cons - is just a priority queue in which the latest item inserted has priority higher than that of any item already in the priority queue...

There are a number of reasons why this is NOT the most convenient way to look at Priority Queues - we will now turn to a slightly more conventional approach.

We are also interested in constructing MODULES rather than just showing how to write a few functions...


An immediate problem is: who decides the type of the objects that will make up the priority queue? It should be obvious from the previous discussions that NO module designer can cover ALL the possible types of objects that could be put into a priority queue - i.e., for which the notion of priority could make sense.

This decision must be made by the application programmer who is the module user: only she can know what she is trying to prioritize…

PQInterface.h

needs to contain an “include directive” to provide the necessary type definitions: this is true for C - it need not be true for other (even strongly typed) languages.


// PQInterface.h
#include <stdbool.h>                            // provides bool
#include "PQUserTypes.h"                        // defines PQItem
// See next two slides for choices for here...
extern void PQInitialize(PriorityQueue *);      // init empty
extern bool PQEmpty(PriorityQueue *);           // check if empty
extern bool PQFull(PriorityQueue *);            // check if full
extern int PQSize(PriorityQueue *);             // how many items
extern void PQInsert(PQItem, PriorityQueue *);  // insert
extern PQItem PQRemove(PriorityQueue *);        // remove highest

This tells the user what the “user interface” is: the functions, the types of objects expected as function parameters, and the types of objects returned by the functions.


PQInterface.h - Linked List Implementation.

C needs to know these details - some other languages can hide them.

typedef struct PQNodeTag {
    PQItem NodeItem;
    struct PQNodeTag *Link;
} PQListNode;

typedef struct {
    int Count;
    PQListNode *ItemList;
} PriorityQueue;
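To make the linked-list choice concrete, the sketch below fills in Initialize, Empty, Insert and Remove on top of these type definitions, keeping the list sorted with the highest priority first. It is one possible implementation, not necessarily the textbook's. It assumes PQItem is int and that a larger value means a higher priority; both are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

typedef int PQItem;               /* assumption: int items, larger = higher priority */

typedef struct PQNodeTag {
    PQItem NodeItem;
    struct PQNodeTag *Link;
} PQListNode;

typedef struct {
    int Count;
    PQListNode *ItemList;         /* kept sorted, highest priority first */
} PriorityQueue;

void PQInitialize(PriorityQueue *PQ) { PQ->Count = 0; PQ->ItemList = NULL; }
bool PQEmpty(PriorityQueue *PQ)      { return PQ->Count == 0; }

/* Insert: walk past every node of equal or higher priority, then splice in */
void PQInsert(PQItem Item, PriorityQueue *PQ)
{
    PQListNode **P = &PQ->ItemList;
    PQListNode *N = malloc(sizeof *N);

    if (N == NULL)
        abort();                  /* out of memory: give up in this sketch */
    N->NodeItem = Item;
    while (*P != NULL && (*P)->NodeItem >= Item)
        P = &(*P)->Link;
    N->Link = *P;
    *P = N;
    ++PQ->Count;
}

/* Remove: the highest-priority item is always the first node */
PQItem PQRemove(PriorityQueue *PQ)
{
    PQListNode *N = PQ->ItemList;
    PQItem Item;

    assert(N != NULL);            /* precondition: the queue is not empty */
    Item = N->NodeItem;
    PQ->ItemList = N->Link;
    free(N);
    --PQ->Count;
    return Item;
}
```

In this version Insert does the searching and Remove is a constant-time unlink of the first node.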


PQInterface.h - Array Implementation.

typedef PQItem PQArray[MAXCOUNT];

typedef struct {
    int Count;
    PQArray ItemArray;
} PriorityQueue;
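For comparison, here is a sketch of the same operations over the array representation, this time leaving the array unsorted. As before it is only one possible implementation. PQItem is assumed to be int, a larger value is assumed to mean a higher priority, and MAXCOUNT is set to 10 for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define MAXCOUNT 10               /* assumption: capacity chosen by the user */
typedef int PQItem;               /* assumption: int items, larger = higher priority */

typedef PQItem PQArray[MAXCOUNT];

typedef struct {
    int Count;
    PQArray ItemArray;            /* unsorted: items sit in arrival order */
} PriorityQueue;

void PQInitialize(PriorityQueue *PQ) { PQ->Count = 0; }
bool PQEmpty(PriorityQueue *PQ)      { return PQ->Count == 0; }
bool PQFull(PriorityQueue *PQ)       { return PQ->Count == MAXCOUNT; }

/* Insert: just append at the end */
void PQInsert(PQItem Item, PriorityQueue *PQ)
{
    assert(!PQFull(PQ));          /* precondition: there is room */
    PQ->ItemArray[PQ->Count] = Item;
    ++PQ->Count;
}

/* Remove: scan for the largest item, then fill its slot with the last item */
PQItem PQRemove(PriorityQueue *PQ)
{
    int i, Max = 0;
    PQItem Item;

    assert(!PQEmpty(PQ));         /* precondition: the queue is not empty */
    for (i = 1; i < PQ->Count; ++i)
        if (PQ->ItemArray[i] > PQ->ItemArray[Max])
            Max = i;
    Item = PQ->ItemArray[Max];
    --PQ->Count;
    PQ->ItemArray[Max] = PQ->ItemArray[PQ->Count];
    return Item;
}
```

Here the work is in Remove, which scans the whole array, while Insert takes constant time.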


PQUserTypes.h - User decision on type of objects in Priority Queue and maximum size of queue expected.

#define MAXCOUNT 10      // Just 10?
typedef int PQItem;      // User defined
/* this is where we stop the explicit user knowledge */

Although YOU don’t need to know any more, your program - actually the compiler compiling your program - does. This is why more detail about the representation is available in the other header file (interface part of the module).


Unfortunately, all this makes it impossible to “compile once - use many times”. The whole module will have to be recompiled every time you change the type of object managed by the priority queue. It can also manage only one single type of object: you could not have Priority Queues of different types of objects using functions with exactly the same names.

When the object managed is fixed (e.g., in the strings module, the stdio module, etc.), then it is possible to “compile once - use many times”.

In other language environments the “single type restriction” need not hold.


Example of use: sorting an array.

//Define the types
typedef int PQItem;
typedef PQItem SortingArray[10];

//Declare the array
SortingArray A;

//Define the sorting function
void PriorityQueueSort(SortingArray A)
{
    int i;
    PriorityQueue PQ;

    PQInitialize(&PQ);
    for (i = 0; i < 10; ++i) PQInsert(A[i], &PQ);
    for (i = 9; i >= 0; --i) A[i] = PQRemove(&PQ);
}


Problem: how can we perform comparisons if we don’t know WHAT kind of items we need to compare and HOW we can compare them?

The Priority Queue, AS GIVEN, does not require the definition of a user-provided comparison function, and does not accept such a function as a parameter to any of the Initialize, Insert or Remove functions. It appears to require such a comparison as a “built in”, which would seriously limit the generality of the Module.

We leave this with the statement that it can be done within C - we won’t pursue this at this point, because it would further complicate our discussion.
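For the curious, one common way to do it is to store a pointer to a user-supplied comparison function inside the queue. The sketch below is an assumption-laden illustration, not the interface used in this course: the PQCompare type, the extra parameter to PQInitialize, the choice of C strings as items and the example function EarlierFirst are all invented for the example.

```c
#include <assert.h>
#include <string.h>

#define MAXCOUNT 10

typedef const char *PQItem;                 /* assumption: items are C strings */
typedef int (*PQCompare)(PQItem, PQItem);   /* > 0 when the first item has higher priority */

typedef struct {
    int Count;
    PQItem ItemArray[MAXCOUNT];
    PQCompare Compare;                      /* supplied by the user at initialization */
} PriorityQueue;

void PQInitialize(PriorityQueue *PQ, PQCompare Compare)
{
    PQ->Count = 0;
    PQ->Compare = Compare;
}

void PQInsert(PQItem Item, PriorityQueue *PQ)
{
    assert(PQ->Count < MAXCOUNT);           /* precondition: there is room */
    PQ->ItemArray[PQ->Count] = Item;
    ++PQ->Count;
}

/* Remove: every comparison goes through the user's function */
PQItem PQRemove(PriorityQueue *PQ)
{
    int i, Best = 0;
    PQItem Item;

    assert(PQ->Count > 0);                  /* precondition: the queue is not empty */
    for (i = 1; i < PQ->Count; ++i)
        if (PQ->Compare(PQ->ItemArray[i], PQ->ItemArray[Best]) > 0)
            Best = i;
    Item = PQ->ItemArray[Best];
    --PQ->Count;
    PQ->ItemArray[Best] = PQ->ItemArray[PQ->Count];
    return Item;
}

/* Example comparison: the alphabetically earlier string has higher priority */
int EarlierFirst(PQItem A, PQItem B) { return strcmp(B, A); }
```

The module code never looks inside an item; it only asks the user's function which of two items comes first. The item type itself is still fixed at compile time, so this addresses the comparison problem but not the single type restriction.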


What are the trade-offs between the two implementations?

1) The linked-list implementation uses only the space it needs, while the array one must allocate all of its space at the beginning.

2) The Sorted linked list implementation is less efficient than the Unsorted array one at inserting an item.

3) The Unsorted array implementation is less efficient than the Sorted linked list one at removing an item.


A Generalization.

The idea of modularization can be extended to what is called Work Breakdown Structure: the division of a software project into subprojects, tasks, subtasks, deliverables, etc.

The example given by the text is that of the design and implementation of a simple calculator.


In this case, the decisions are “fairly simple”:

there is a reasonably clear “user interface module” and a reasonably clear “computation module”.

The functions of the two are easily separable.

The interface BETWEEN the two can consist of strings of characters: the user interface sends the string containing an expression to the compute engine, which determines the legality of the expression, translates from character form to one suitable for arithmetic, performs the arithmetic, translates the result into character form and returns the result string, to be displayed.
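To show what a string-in, string-out compute engine might look like, here is a deliberately tiny sketch. It is not the textbook's calculator: it handles only expressions of the form "number operator number", the function name Evaluate and the "Error" result string are assumptions made for the example, and the result is returned in a static buffer that is overwritten by the next call.

```c
#include <stdio.h>
#include <string.h>

/* Evaluate: hypothetical compute engine for "number operator number" only */
char *Evaluate(char *Expression)
{
    static char Result[64];       /* result string lives in static storage */
    double Left, Right, Value;
    char Op;

    /* check legality and translate from character form to numbers */
    if (sscanf(Expression, "%lf %c %lf", &Left, &Op, &Right) != 3) {
        strcpy(Result, "Error");
        return Result;
    }

    /* perform the arithmetic */
    switch (Op) {
    case '+': Value = Left + Right; break;
    case '-': Value = Left - Right; break;
    case '*': Value = Left * Right; break;
    case '/':
        if (Right == 0) {         /* refuse to divide by zero */
            strcpy(Result, "Error");
            return Result;
        }
        Value = Left / Right;
        break;
    default:
        strcpy(Result, "Error");
        return Result;
    }

    /* translate the result back into character form */
    snprintf(Result, sizeof Result, "%g", Value);
    return Result;
}
```

The user interface module never needs to know how the arithmetic is done; it passes a string in and displays the string that comes back.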


/* CalculatorModuleInterface.h */
char *Expression, *Value;
extern void InitializeAndDisplayCalculator(void);
extern void GetAndProcessOneEvent(void);
extern int UserSubmittedAnExpression(void);
extern void Display(char *);
extern int UserWantsToQuit(void);
extern void ShutDown(void);

/* YourCalculatorModuleInterface.h */
extern char *Evaluate(char *);


#include <stdio.h>
#include "CalculatorModuleInterface.h"
#include "YourCalculatorModuleInterface.h"

int main(void) {
    InitializeAndDisplayCalculator();
    do {
        GetAndProcessOneEvent();
        if (UserSubmittedAnExpression()) {
            Value = Evaluate(Expression);
            Display(Value);
        }
    } while (!UserWantsToQuit());
    ShutDown();
}


More Ideas about Information Hiding and Modularization.

We cannot "really" implement "Abstract Data Types" - since our implementing them requires we make representational decisions based on multiple considerations.

A reasonable question is: How close can we get to a representation-independent notation so that our "approximation" to an abstract data type is as good as we can manage?


Example:

Three Implementations (in C) of LINKED LISTS to be used for the abstract data type LIST.

First Implementation: based on C pointers as links:

[Figure: L points to a chain of three nodes, each with an Item field and a Link field.]

Second Implementation: array of Node (Info and Link) structs:

[Figure: an array of nodes indexed 0 to 9, each with .Item and .Link fields; Item values x1, x2, x3, x4 with Link values 1, 2, 5, -1; L = 0.]


Third Implementation: array of Info AND array of Link

[Figure: two parallel arrays indexed 0 to 9; the Item array holds x1, x2, x3, x4 and the Link array holds 1, 2, 5, -1; L = 0.]

In the second and third implementation, the Link is just an integer used to index into the array.


Some of these implementations involve pointer variables, some involve arrays of structs, some involve arrays of simple types.

How can we design an interface that will work THE SAME WAY regardless of the underlying implementation?

1) It can’t explicitly deal with the underlying pointer variables;

2) It can’t explicitly deal with the underlying structs;

3) It can’t explicitly deal with the underlying arrays.


4) We assume the Item field can be managed as a struct; (early FORTRAN had no structs, so the multiple fields of a struct would have required one full array each)

5) We have to introduce a NULL that can be made meaningful in all three representations: call it null (lower case - no conflict with the built-in) and be prepared to initialize it to the correct value for each implementation.


The textbook provides a set of functions, all ready for us.

Question: how did the author get those functions? Miraculous inspiration? The author is SO experienced that he was able to figure them out by just having the problem presented?

Neither alternative: pick a function that you think is fairly representative of the kind of operation you will need to support, code it in all three representations and see what THAT tells you.

If that’s not enough, pick a function that will require manipulation of most of the representation features you missed on the first pass and try again. If you need several successive tries, so be it...


A Reverse for Normal Linked Lists.

Use a function where the new list is returned through the single parameter passed by reference (i.e., pass its address by value), since this is the accepted way to manage two-way communication via the parameter list.

L is a NodePointer

which could be either a true pointer or an integer used as an index into the array.


[Figure: L points to a chain of three nodes, each with an Item field and a Link field.]

void Reverse(NodePointer *L)
/* L is the address of a pointer to a Node */
{
    NodePointer R, N;

    R = null;                  // NULL in this case
    while (*L != null) {       // there is something
        N = *L;                // save it
        *L = (*L)->Link;       // find the next one
        N->Link = R;           // re-link the saved one
        R = N;                 // new head of part-reversed list
    }
    *L = R;                    // new head of reversed list
}


A Reverse for Linked Lists as arrays of struct.

The NodePointer is just an integer: the array index of the next structure.

We need something like:

Node ListMemory[MAXPOINTER];   // for the array of nodes


[Figure: an array of nodes indexed 0 to 9, each with .Item and .Link fields; Item values x1, x2, x3, x4 with Link values 1, 2, 5, -1; L = 0.]

void Reverse(NodePointer *L)
/* L is the address of a NodePointer (here, an array index) */
{
    NodePointer R, N;

    R = null;                        // -1 in this case
    while (*L != null) {             // there is something
        N = *L;                      // save it
        *L = ListMemory[*L].Link;    // next one
        ListMemory[N].Link = R;      // re-link saved one
        R = N;                       // new head of part-reversed list
    }
    *L = R;                          // new head of reversed list
}


A Reverse for Linked Lists as double arrays.

The NodePointer is just an integer. We also need something like:

ListItem Item[MAXPOINTER];      // for the array of items
NodePointer Link[MAXPOINTER];   // for the array of links


[Figure: two parallel arrays indexed 0 to 9; the Item array holds x1, x2, x3, x4 and the Link array holds 1, 2, 5, -1; L = 0.]

void Reverse(NodePointer *L)
/* L is the address of a NodePointer (here, an array index) */
{
    NodePointer R, N;

    R = null;                  // -1 in this case
    while (*L != null) {       // there is something
        N = *L;                // save it
        *L = Link[*L];         // find the next one
        Link[N] = R;           // re-link the saved one
        R = N;                 // new head of part-reversed list
    }
    *L = R;                    // new head of reversed list
}


Reverse differs in only two lines from definition to definition:

Normal Linked Lists:

    *L = (*L)->Link;             // get the next one
    N->Link = R;                 // re-link the saved one

Array of Nodes:

    *L = ListMemory[*L].Link;    // get the next one
    ListMemory[N].Link = R;      // re-link the saved one

Double Array:

    *L = Link[*L];               // get the next one
    Link[N] = R;                 // re-link the saved one


They differ in the syntax for GETTING the value of the link and for SETTING the value of a link. Those two operations are candidates for “hiding”: introduce intermediate functions that hide the details.

Getting the Link:

NodePointer GetLink(NodePointer N)
{ return N->Link; }                 // normal linked lists

NodePointer GetLink(NodePointer N)
{ return ListMemory[N].Link; }      // arrays of struct

NodePointer GetLink(NodePointer N)
{ return Link[N]; }                 // double arrays


Setting the Link:

void SetLink(NodePointer N, NodePointer L)
{ N->Link = L; }                    /* normal linked lists */

void SetLink(NodePointer N, NodePointer L)
{ ListMemory[N].Link = L; }         /* arrays of struct */

void SetLink(NodePointer N, NodePointer L)
{ Link[N] = L; }                    /* double arrays */


This leads us to the Reverse function:

void Reverse(NodePointer *L)
/* L is the address of a NodePointer (a true pointer or an array index) */
{
    NodePointer R, N;

    R = null;                  // NULL or -1, depending on the representation
    while (*L != null) {       // there is something
        N = *L;                // save it
        *L = GetLink(*L);      // find the next one
        SetLink(N, R);         // re-link the saved one
        R = N;                 // new head of part-reversed list
    }
    *L = R;                    // new head of reversed list
}
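To see the pieces working together, the sketch below combines the representation-independent Reverse with the array-of-struct versions of GetLink and SetLink. The concrete choices (ItemType as int, MAXPOINTER as 10, null defined as a constant equal to -1) are assumptions made so the listing is complete.

```c
#define MAXPOINTER 10                 /* assumption: size chosen by the user */

typedef int ItemType;                 /* assumption: the items are ints */
typedef int NodePointer;              /* a "pointer" is an array index */

typedef struct {
    ItemType Item;
    NodePointer Link;
} Node;

static const NodePointer null = -1;   /* the null link for this representation */
Node ListMemory[MAXPOINTER];          /* the array of nodes */

/* The only two functions that know how links are stored */
NodePointer GetLink(NodePointer N)             { return ListMemory[N].Link; }
void SetLink(NodePointer N, NodePointer L)     { ListMemory[N].Link = L; }

/* Reverse never touches ListMemory directly */
void Reverse(NodePointer *L)
{
    NodePointer R = null, N;

    while (*L != null) {              /* there is something */
        N = *L;                       /* save it */
        *L = GetLink(*L);             /* find the next one */
        SetLink(N, R);                /* re-link the saved one */
        R = N;                        /* new head of part-reversed list */
    }
    *L = R;                           /* new head of reversed list */
}
```

Switching to true pointers or to parallel arrays would mean replacing the type definitions, null, GetLink and SetLink; the body of Reverse would stay exactly as written.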


Another area of potential problems is in the allocation and deallocation of nodes. In the Normal Linked List version we could define (in preparation for the fact that ALL implementations will need the same function calls):

void AllocateNewNode(NodePointer *N)
{ *N = (NodePointer)malloc(sizeof(Node)); }

void FreeNode(NodePointer N)
{ free(N); }   /* no safety - YOU set to null */

Where the runtime environment takes care of managing space...


When lists are implemented via arrays, we must keep track of which array elements are in use and which are free.

NodePointer Avail;   // points to the head of the FREE LIST

void AllocateNewNode(NodePointer *N)
{
    *N = Avail;                        /* for arrays of struct */
    Avail = ListMemory[Avail].Link;
}

void AllocateNewNode(NodePointer *N)
{
    *N = Avail;                        /* for double arrays */
    Avail = Link[Avail];
}


Unfortunately, this requires an initialization:

Avail = 0;
for (i = 0; i < MAXPOINTER - 1; i++)
    ListMemory[i].Link = i + 1;
ListMemory[MAXPOINTER - 1].Link = null;

Or

Avail = 0;
for (i = 0; i < MAXPOINTER - 1; i++)
    Link[i] = i + 1;
Link[MAXPOINTER - 1] = null;


Avail = 0; /* and all the nodes are empty */

[Figure: the initial free list in both array representations; indices 0 to 9 with Link values 1, 2, 3, 4, 5, 6, 7, 8, 9, null and all Item fields empty.]

Calls to AllocateNewNode will simply return the first free node.


Now to FREE nodes: attach the node to the current head of the FREE list and update Avail.

void FreeNode(NodePointer N)
{
    ListMemory[N].Link = Avail;
    Avail = N;
}

Or:

void FreeNode(NodePointer N)
{
    Link[N] = Avail;
    Avail = N;
}
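Putting initialization, allocation and freeing together for the parallel-array representation gives something like the sketch below. The function name InitializeFreeList, the value of MAXPOINTER, the int item type and the check for an exhausted free list are assumptions added for the example; they are not taken from the slides.

```c
#define MAXPOINTER 10                 /* assumption: size chosen by the user */

typedef int NodePointer;              /* a "pointer" is an array index */
typedef int ListItem;                 /* assumption: the items are ints */

static const NodePointer null = -1;   /* the null link for this representation */

NodePointer Avail;                    /* head of the FREE LIST */
ListItem Item[MAXPOINTER];            /* the array of items */
NodePointer Link[MAXPOINTER];         /* the array of links */

/* Chain every node onto the free list: 0 -> 1 -> ... -> MAXPOINTER-1 -> null */
void InitializeFreeList(void)
{
    int i;

    Avail = 0;
    for (i = 0; i < MAXPOINTER - 1; i++)
        Link[i] = i + 1;
    Link[MAXPOINTER - 1] = null;
}

/* Hand out the first free node; *N is null when none are left */
void AllocateNewNode(NodePointer *N)
{
    *N = Avail;
    if (Avail != null)                /* added guard: do not index with null */
        Avail = Link[Avail];
}

/* Push the node back onto the front of the free list */
void FreeNode(NodePointer N)
{
    Link[N] = Avail;
    Avail = N;
}
```

A freed node becomes the new head of the free list, so it is the first one handed out by the next call to AllocateNewNode.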


We are, essentially, done: a bit more cleanup, a few more functions, and we have a successful interface. The header files must contain some of the information about all this, but all the user needs to do is include the correct header files and the program will run correctly, regardless of the underlying list implementation.

The user must provide definitions for

ItemType

(only the user knows what will be put into the lists)

MAXPOINTER

(in the case of the array implementations - again, only the user will have any idea of how big the lists can become)


Header File for Normal Lists:

typedef ItemType ListItem;

typedef struct NodeTag {
    ListItem Item;
    struct NodeTag *Link;
} Node;

typedef Node *NodePointer;

[Figure: L points to a chain of three nodes, each with an Item field and a Link field.]


Header File for Parallel Arrays:

typedef int NodePointer;
typedef ItemType ListItem;

NodePointer Avail;
ListItem Item[MAXPOINTER];
NodePointer Link[MAXPOINTER];

[Figure: two parallel arrays; the Item array holds x1, x2, x3, x4 and the Link array holds 1, 2, 5, -1; L = 0.]


Header File for Array of node struct:

typedef int NodePointer;
typedef ItemType ListItem;

typedef struct {
    ListItem Item;
    NodePointer Link;
} Node;

NodePointer Avail;
Node ListMemory[MAXPOINTER];

[Figure: an array of nodes indexed 0 to 9, each with .Item and .Link fields; Item values x1, x2, x3, x4 with Link values 1, 2, 5, -1; L = 0.]


We now have (let's imagine we have completed all the work) three distinct implementations of lists.

How can we use them?

First of all, we need to set up the header file where ItemType is defined - without it, there is not much we can do.

We also need to define functions that are specific to the exact ItemType we need: reading functions, printing functions, assignments, etc… Those will be made available as an Implementation File.


Header File for items of type int: ItemInterface.h

typedef int ItemType;
extern void PrintItem(ItemType *);
extern void AssignItem(ItemType *, ItemType);

Implementation File for items of type int: ItemImplementation.c

#include <stdio.h>
#include "ItemInterface.h"

void PrintItem(ItemType *i)
{ printf("%d", *i); }

void AssignItem(ItemType *Left, ItemType Right)
{ *Left = Right; }


Header File for items of type AirportCode: ItemInterface.h

typedef char ItemType[4];
extern void PrintItem(ItemType *);
extern void AssignItem(ItemType *, ItemType);

Implementation File for AirportCodes: ItemImplementation.c

#include <stdio.h>
#include <string.h>
#include "ItemInterface.h"

void PrintItem(ItemType *i)
{ printf("%s", *i); }

void AssignItem(ItemType *Left, ItemType Right)
{ *Left = Right; }   // Careful - what does this do??!!
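One answer to the question in that last comment: when ItemType is an array type, C does not allow the array *Left to be the target of an assignment, and the parameter Right is really a pointer to the first character. A version that copies the characters instead might look like the sketch below. It assumes every AirportCode occupies exactly four chars (three letters plus the terminating '\0').

```c
#include <string.h>

typedef char ItemType[4];         /* three letters plus the terminating '\0' */

/* Arrays cannot be assigned with '=', so copy the four characters instead.
   Right must point to at least sizeof(ItemType) readable chars. */
void AssignItem(ItemType *Left, ItemType Right)
{
    memcpy(*Left, Right, sizeof(ItemType));
}
```

After the call the destination holds its own copy, so later changes to the source do not affect it. This is one reason the assignment function has to be supplied alongside the item type rather than written once inside the list module.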


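[Figure: module structure of the example. The Main Program uses one of two item modules (Types and Specialized Functions: int, or Types and Specialized Functions: AirportCodes) together with one of three list modules (Linked Lists, Arrays of Structs, Parallel Arrays).]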


Abstraction:

Procedural Abstraction - or how to replace a (long) sequence of operations (the HOW) by a NAME and an interface.

This is embodied in the idea of FUNCTION with a well defined parameter list and a well defined return type.

Data Abstraction - or how to separate the details of the WHAT from the details of the HOW while hiding both.

This is embodied in the idea of Abstract Data Type where details of representation AND manipulation are hidden from the user.


The combination of these two ideas - extended to the best of our understanding - has provided many of the tools that allow us to manage the design and construction of today's large programs.

They allow us to replace "spaghetti bowl" programs, where almost every item depends on some other item, with fairly clean, hierarchically structured programs where the dependencies are minimized and the interfaces among modules are (mostly) well specified.

The latter claim is only approximated in reality: it is the ATTEMPT at approximating it, and the considerable efforts expended towards the approximation, that make large programs (e.g.: 11.5 M lines of Windows 95) possible at all.

