
# Arrays and Pointers

Arrays and Pointers. Programming Language Principles Lecture 23. Prepared by Manuel E. Bermúdez, Ph.D. Associate Professor University of Florida. Arrays. Most common composite data type. Semantically, viewed as a mapping from the index type to the element type.


### Arrays and Pointers

Programming Language Principles

Lecture 23

Prepared by

Manuel E. Bermúdez, Ph.D.

Associate Professor

University of Florida

• Most common composite data type.

• Semantically, viewed as a mapping from the index type to the element type.

• Some languages permit only integer as the index type; others allow any scalar.

C:

char upper[26] ; /* array of 26 chars, 0..25 */

Fortran:

character(26) upper

Pascal:

var upper: array['a' .. 'z'] of char;

Ada:

upper: array (character range 'a' .. 'z') of character;

In either case, indexing with 'a' (upper['a'] in Pascal, upper('a') in Ada) returns 'A'.

Ada:

matrix: array (1..10, 1..10) of real;

Modula-3:

VAR matrix: ARRAY [1..10],[1..10] OF REAL;

(same as)

VAR matrix: ARRAY [1..10] OF

ARRAY [1..10] OF REAL;

and

matrix[3,4] is the same as matrix[3][4].

In Ada, however,

matrix: array(1..10,1..10) of real;

is NOT the same as

matrix: array(1..10) of array (1..10) of real;

matrix(3)(4) is not legal in the first form;

matrix(3,4) is not legal in the second form.

With an array of arrays, matrix(3) denotes an entire row, i.e. a slice.

In C, double matrix[10][10].

However, C integrates arrays and pointers,

so

matrix[3] rarely behaves as an array of 10 doubles.

Depending on context, it is either:

the whole row with index 3 (as the operand of sizeof or unary &), or

a pointer to that row's first element, &matrix[3][0] (everywhere else).
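The two readings of matrix[3] can be checked directly. The following is a minimal sketch; the helper function names are made up for illustration.

```c
#include <stddef.h>

static double matrix[10][10];

/* Under sizeof, matrix[3] is a whole row: an array of 10 doubles. */
size_t row_size_in_bytes(void) {
    return sizeof matrix[3];
}

/* In an ordinary expression, the row converts to a pointer
   to its first element. */
int row_decays_to_first_element(void) {
    double *p = matrix[3];              /* array-to-pointer conversion */
    return p == &matrix[3][0];
}

/* Rows are stored contiguously: row 3 begins three row-sizes
   after the start of the matrix. */
int rows_are_contiguous(void) {
    return (char *)matrix[3] == (char *)matrix + 3 * sizeof matrix[0];
}
```

On any conforming compiler, row_size_in_bytes() equals 10 * sizeof(double), and the other two functions return 1.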

Five cases:

• Global lifetime, static shape: static allocation (easy).

• Local lifetime, static shape: space allocated on a stack frame.

• Local lifetime, shape bound at elaboration: stack frame needs fixed-size part and variable-size part.

• Arbitrary lifetime, shape bound at elaboration: Java, programmer allocates space.

• Arbitrary lifetime, dynamic shape: array lives on the heap.

• Array shape determined at time of call.

• Pascal doesn’t allow local dynamic-shaped arrays.

• Ada DOES allow local dynamic-shaped arrays (see textbook)

• C: arrays are passed as pointers, so the callee never sees the bounds (checking them is the programmer's problem).

• Java strings:

String s = "short";

s = s + " and sweet"; // String objects are immutable, so s now refers to a new String

Create a new array of proper length and data type:

Integer[] a = new Integer[10];

Integer[] newArray = new Integer[newLength];

Copy all elements from old array into new one:

System.arraycopy(a, 0, newArray, 0, a.length);

Rename array:

a = newArray;

// old space reclaimed by garbage

// collector.

Arrays sized at run time, but can’t be changed once set.

With static array bounds, we've "moved" the array in three dimensions.

• Dope vector: a run-time descriptor for the array.

• Contains, for each dimension (except last one, always statically known):

• Lower bound

• Size

• Upper bound (if dynamic checks are required)

• Size of dope vector depends on the number of dimensions (i.e. it is static).

• Typically placed next to the array pointer, in the fixed-size portion of the stack frame.
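A dope vector for a two-dimensional array might look like the sketch below. The struct layout and all names are hypothetical. The `stride` field plays the role of the per-dimension "size" entry and is measured in elements rather than bytes.

```c
#include <stddef.h>

/* One entry per dimension (hypothetical layout). */
struct dim_info {
    long lower;   /* lower bound of the index range */
    long upper;   /* upper bound, kept for dynamic bounds checks */
    long stride;  /* elements skipped when this index grows by one */
};

/* Dope vector for a 2-D array of doubles in row-major order. */
struct dope2d {
    double *data;             /* the array pointer */
    struct dim_info dim[2];
};

/* Fills in the descriptor once the bounds are known (at elaboration). */
void dope2d_init(struct dope2d *d, double *data,
                 long lo0, long hi0, long lo1, long hi1) {
    d->data = data;
    d->dim[0].lower = lo0;
    d->dim[0].upper = hi0;
    d->dim[1].lower = lo1;
    d->dim[1].upper = hi1;
    d->dim[1].stride = 1;               /* last dimension: one element */
    d->dim[0].stride = hi1 - lo1 + 1;   /* one full row */
}

/* Returns the address of element (i, j), or NULL if out of bounds. */
double *dope2d_at(const struct dope2d *d, long i, long j) {
    if (i < d->dim[0].lower || i > d->dim[0].upper) return NULL;
    if (j < d->dim[1].lower || j > d->dim[1].upper) return NULL;
    return d->data + (i - d->dim[0].lower) * d->dim[0].stride
                   + (j - d->dim[1].lower) * d->dim[1].stride;
}
```

For a 3 x 4 array with bounds 1..3 and 1..4 stored in a 12-element buffer, dope2d_at(&d, 2, 3) yields the address of buffer element 6.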

• A string is usually an array of characters.

• Many languages allow more flexibility with strings than with other types of arrays.

• Single-character string vs. single character:

• Pascal: no distinction.

• C: *very* different ('a' is a single character; "a" is a two-element array holding 'a' and the terminating '\0').

• String constants: 'abc', "abc".

• Rules for embedding special characters:

• Pascal: double the character: 'ab''cd'

• C: escape sequence: "ab\"cd".

• C, Pascal, Ada: string length bound no later than elaboration time (allocate in stack frame).

• Lisp, Icon, ML, Java: allow dynamically-bound strings, stored in the heap.

• Pascal supports lexicographically-ordered comparison of strings ('abc' < 'abd'). Ada supports it on all 1D discrete-valued arrays.

• C: no string assignment, elements copied individually (library functions).
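Since C has no string assignment, copying goes through library routines. The helper below is a minimal sketch; `copy_string` is a made-up name, not a standard function, and it is built on the standard `strlen` and `memcpy`.

```c
#include <string.h>

/* Copies src into dst, which has room for dst_size bytes.
   The result is always NUL-terminated when dst_size > 0.
   Returns 1 if the whole string fit, 0 if it was truncated. */
int copy_string(char *dst, size_t dst_size, const char *src) {
    size_t len = strlen(src);
    if (dst_size == 0) return 0;
    if (len >= dst_size) {
        memcpy(dst, src, dst_size - 1);   /* keep what fits */
        dst[dst_size - 1] = '\0';
        return 0;
    }
    memcpy(dst, src, len + 1);            /* include the terminator */
    return 1;
}
```

Comparison also needs the library: strcmp("abc", "abd") returns a negative value, which corresponds to the Pascal test 'abc' < 'abd'.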

Pascal supports sets of any discrete type:

var a,b,c: set of char;

d,e: set of weekday;

a := b + c; (* union *)

a := b * c; (* intersection *)

a := b - c; (* difference *)

• Possible set implementations: arrays, hash tables, trees.

• Bit-vectors: each entry true (element in the set), or false (element not in the set)

• Efficient operations:

• Union is inclusive bit-wise OR.

• Intersection is bit-wise AND.

• Difference is NOT, followed by AND.

• Won’t work for large base types:

• A set of 32-bit integers needs 2^32 bits = 512 MB.

• A set of 64-bit integers needs 2^64 bits = 2^41 MB.

• Set size is usually limited to 128 or 512 elements.
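The bit-vector representation can be sketched in C for a base type of at most 128 values. The type `set128` and the function names are made up for illustration.

```c
#include <stdint.h>

#define SET_WORDS 2   /* 2 x 64 bits = 128 possible elements */

typedef struct { uint64_t bits[SET_WORDS]; } set128;

set128 set_empty(void) {
    set128 s = {{0, 0}};
    return s;
}

/* The caller must ensure elem < 128. */
void set_add(set128 *s, unsigned elem) {
    s->bits[elem / 64] |= (uint64_t)1 << (elem % 64);
}

int set_contains(const set128 *s, unsigned elem) {
    return (int)((s->bits[elem / 64] >> (elem % 64)) & 1u);
}

/* Union is a bit-wise OR, one machine word at a time. */
set128 set_union(set128 a, set128 b) {
    for (int i = 0; i < SET_WORDS; i++) a.bits[i] |= b.bits[i];
    return a;
}

/* Intersection is a bit-wise AND. */
set128 set_intersection(set128 a, set128 b) {
    for (int i = 0; i < SET_WORDS; i++) a.bits[i] &= b.bits[i];
    return a;
}

/* Difference is NOT on the right operand, followed by AND. */
set128 set_difference(set128 a, set128 b) {
    for (int i = 0; i < SET_WORDS; i++) a.bits[i] &= ~b.bits[i];
    return a;
}
```

Each operation touches only two machine words, which is why this representation is efficient for small base types such as char or weekday.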

• Most recursive types are records.

• Reference model languages (Lisp, ML, Clu, Java): every field is a reference.

• A record of type f contains a reference to another record of type f.

• Value model languages (C, Pascal, Ada): need a pointer (a variable whose value is a reference).

• Explicit (C, C++, Pascal, Modula-2): programmer must reclaim unused heap space.

• Can be done efficiently.

• Easy to get wrong; mistakes lead to memory leaks or dangling references.

• Implicit (Lisp, ML, Modula-3, Ada, Java): heap space reclaimed automatically.

• Not so efficient (but getting better)

• Simplifies programmer’s task a LOT.

node ('R', node('X',empty,empty), node('Y', node('Z',empty,empty), node('W',empty,empty)))

'(#\R (#\X ()()) (#\Y (#\Z ()()) (#\W ()())))

• Pascal:

type chr_tree_ptr = ^chr_tree;

chr_tree = record

left, right:chr_tree_ptr;

val: char

end;

• C:

struct chr_tree {

struct chr_tree *left, *right;

char val;

};

• In C, struct names are not quite type names. Shorthand:

typedef struct chr_tree chr_tree_type;

• Pascal: new(my_ptr);

• C: my_ptr = (struct chr_tree *) malloc(sizeof (struct chr_tree));

• C++, Java: my_ptr = new chr_tree(args);

Pascal:

my_ptr^.val := 'X';

C:

(*my_ptr).val = 'X';

my_ptr->val = 'X';
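Allocation and field access can be combined into a small C sketch that builds the tree R(X, Y(Z, W)). The helpers `make_node` and `free_tree` are hypothetical names, not part of the original slides.

```c
#include <stdlib.h>

struct chr_tree {
    struct chr_tree *left, *right;
    char val;
};

/* Allocates one node on the heap; returns NULL if malloc fails. */
struct chr_tree *make_node(char val, struct chr_tree *left,
                           struct chr_tree *right) {
    struct chr_tree *n = malloc(sizeof *n);
    if (n != NULL) {
        n->val = val;        /* same as (*n).val = val */
        n->left = left;
        n->right = right;
    }
    return n;
}

/* Explicit reclamation: free both subtrees, then the node itself. */
void free_tree(struct chr_tree *t) {
    if (t == NULL) return;
    free_tree(t->left);
    free_tree(t->right);
    free(t);
}
```

A call such as make_node('R', make_node('X', NULL, NULL), make_node('Y', make_node('Z', NULL, NULL), make_node('W', NULL, NULL))) builds the whole tree, and free_tree on the root releases it.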

Ada:

T: chr_tree;

P: chr_tree_ptr;

T.val := 'X';

P.val := 'X'; -- the same notation works for a record or a pointer to one

T := P.all; -- .all is needed to refer to the whole record

int n;

int *a;

int b[10];

All are valid:

a = b;

n = a[3];

n = *(a+3);

n = b[3];

n = *(b+3);

Interoperable, but not the same:

int *a[n] allocates n pointers

int a[n][m] allocates a full 2D array.

In fact, assuming int a[n];

*(a+i)

*(i+a)

a[i]

i[a]

are all equivalent !
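The equivalence follows from the definition of a[i] as *(a + i) and the commutativity of addition. A minimal check, with a made-up function name:

```c
/* Returns 1 if all four spellings designate the same element,
   compared by address rather than by value. */
int all_forms_same_address(int *a, int i) {
    return &*(a + i) == &a[i]
        && &*(i + a) == &a[i]
        && &i[a] == &a[i];
}
```

The function returns 1 for any array and any valid index, whether it is called with an array name such as b (declared as int b[10]) or with a pointer that was assigned from it.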

• In C, arrays are effectively passed by reference: the array name converts to a pointer to its first element.

• It’s customary to pass the array name, and its dimensions:

double det (double *M, int rows, int cols)

{ int i,j; double val; ...

val = *(M+i*cols+j); /* M[i][j] */

}
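The same index arithmetic can be packaged as a complete function. This is a sketch that assumes the matrix is stored contiguously in row-major order; `element_at` and `trace` are illustrative names.

```c
#include <stddef.h>

/* Reads M[i][j] from a matrix with cols columns, given only a pointer
   to its first element. The callee cannot check the bounds. */
double element_at(const double *M, int cols, int i, int j) {
    return *(M + (size_t)i * cols + j);
}

/* Sums the main diagonal; rows and cols must be passed explicitly
   because the pointer carries no shape information. */
double trace(const double *M, int rows, int cols) {
    int n = rows < cols ? rows : cols;
    double sum = 0.0;
    for (int i = 0; i < n; i++) {
        sum += element_at(M, cols, i, i);
    }
    return sum;
}
```

Passing a wrong value for cols silently reads the wrong elements, which is the sense in which the bounds are the programmer's problem.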

Tombstones: a technique for catching dangling references

• Catch dangling references.

• Prevent memory leaks.

• Cheap on the heap, expensive on the stack (procedure entry/return).

• Tombstones themselves can dangle.

• Locks and keys: no need to keep tombstones around.

• Only work for heap objects.

• Increase the cost of copying a pointer.

• Increase the cost of every access.
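The tombstone idea can be sketched in C as an extra level of indirection: every pointer refers to a tombstone, and only the tombstone refers to the object. The layout and names below are hypothetical, and the object is a single int to keep the sketch short.

```c
#include <stdlib.h>

/* A tombstone sits between all pointers and the heap object. */
struct tombstone {
    int *object;   /* set to NULL once the object is reclaimed */
};

/* Allocates an int object plus its tombstone; NULL on failure. */
struct tombstone *ts_new(int value) {
    struct tombstone *t = malloc(sizeof *t);
    if (t == NULL) return NULL;
    t->object = malloc(sizeof *t->object);
    if (t->object == NULL) {
        free(t);
        return NULL;
    }
    *t->object = value;
    return t;
}

/* Reclaims the object but keeps the tombstone, so every copy of
   the pointer can still detect that the object is gone. */
void ts_free_object(struct tombstone *t) {
    free(t->object);
    t->object = NULL;
}

/* Checked access: stores the value in *out and returns 1,
   or returns 0 if the reference is dangling. */
int ts_read(const struct tombstone *t, int *out) {
    if (t->object == NULL) return 0;
    *out = *t->object;
    return 1;
}
```

Every access pays for one extra indirection and one test. The tombstone itself must outlive the object, which is why tombstones accumulate unless they are reclaimed separately.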

• Reference counts: set the count to 1 upon object creation.

• Upon assignment:

• Decrement count of object on left.

• Increment count of object on right.

• Upon subroutine entry, increment counts for local pointers.

• Upon subroutine return, decrement counts for local pointers.

• Need type descriptors for this: objects can be deeply structured.

• WILL FAIL ON CIRCULAR STRUCTURES !
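The counting rules can be sketched in C as follows. All names are hypothetical, and the `live_objects` counter exists only to make the effect observable. The sketch increments the right-hand side before decrementing the left, a small reordering that keeps self-assignment safe.

```c
#include <stdlib.h>

static int live_objects = 0;   /* demonstration bookkeeping only */

struct rc_obj {
    int count;   /* number of pointers that refer to this object */
    int value;
};

/* Creation: the count starts at 1 for the pointer being returned. */
struct rc_obj *rc_new(int value) {
    struct rc_obj *o = malloc(sizeof *o);
    if (o == NULL) return NULL;
    o->count = 1;
    o->value = value;
    live_objects++;
    return o;
}

/* Drops one reference; reclaims the object when none remain. */
void rc_release(struct rc_obj *o) {
    if (o != NULL && --o->count == 0) {
        live_objects--;
        free(o);
    }
}

/* Pointer assignment *lhs = rhs with count maintenance. */
void rc_assign(struct rc_obj **lhs, struct rc_obj *rhs) {
    if (rhs != NULL) rhs->count++;   /* object on the right */
    rc_release(*lhs);                /* object on the left */
    *lhs = rhs;
}

int rc_live_objects(void) {
    return live_objects;
}
```

If two objects pointed at each other, their counts would never reach zero even after the program dropped every reference to them, which is the failure on circular structures noted above.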

### Garbage Collection

The system determines which memory is not in use and returns that memory to the pool of free storage.

Done in two or three steps:

Mark nodes that are in use.

Compact free space (optional).

Move free nodes to storage pool.

### Marking

[Figure: nodes a, b, c, d, e and the program variable firstNode]

Unmark all nodes (set all mark bits to false).

Start at each program variable that contains a reference, follow all pointers, mark nodes that are reached.
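The marking phase can be sketched in C as a recursive traversal. The node layout with a fixed number of child pointers and all names are assumptions made for the example, and the "heap" is simply an array of nodes.

```c
#include <stddef.h>

#define MAX_CHILDREN 2

struct gc_node {
    int marked;                             /* the mark bit */
    struct gc_node *child[MAX_CHILDREN];    /* outgoing pointers */
};

/* Step 1: set all mark bits to false. */
void unmark_all(struct gc_node *heap, size_t n) {
    for (size_t i = 0; i < n; i++) {
        heap[i].marked = 0;
    }
}

/* Step 2: follow all pointers from one root, marking what is reached.
   The check on the mark bit also stops the traversal on cycles. */
void mark_from(struct gc_node *node) {
    if (node == NULL || node->marked) return;
    node->marked = 1;
    for (int i = 0; i < MAX_CHILDREN; i++) {
        mark_from(node->child[i]);
    }
}

/* Counts the unmarked nodes, which a collector would return
   to the storage pool. */
size_t count_garbage(const struct gc_node *heap, size_t n) {
    size_t garbage = 0;
    for (size_t i = 0; i < n; i++) {
        if (!heap[i].marked) garbage++;
    }
    return garbage;
}
```

A collector would call mark_from once for each program variable that holds a reference. Unlike reference counts, this approach reclaims unreachable cycles, because nodes that only point at each other are never reached from a root.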

### Compaction

[Figure: memory layout showing nodes a, b, c, d, e, the free memory area, and firstNode]

Move all marked nodes (i.e., nodes in use) to one end of memory, updating all pointers as necessary.

• Equality comparison is easy for scalars

• For complex or abstract data types, say, strings s and t, s = t could mean

• s and t are aliases

• s and t occupy the same storage

• s and t contain the same sequence of characters

• s and t print the same

• Shallow Comparison:

• Both expressions refer to the same object.

• Deep Comparison:

• Expressions refer to objects that are “equal” in content somehow.

• Most PLs use shallow comparisons, and shallow assignments.
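For C strings the two notions correspond to pointer comparison and strcmp. A minimal sketch, with made-up function names:

```c
#include <string.h>

/* Shallow: do both expressions refer to the same object? */
int shallow_equal(const char *s, const char *t) {
    return s == t;
}

/* Deep: do the objects hold the same sequence of characters? */
int deep_equal(const char *s, const char *t) {
    return strcmp(s, t) == 0;
}
```

Two separate arrays that both contain "abc" are deep-equal but not shallow-equal, while a pointer and its alias are equal under both tests.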
