Data Types


PowerPoint Slideshow about ' Data Types' - solana


Primitives
• Integer
• Float
• Character
• Boolean
• Pointers

Aggregates
• Strings
• Records
• Enumerated
• Arrays
• Objects
Strings
• Fixed length: B e t s y b b b (padded with blanks, shown as b)
• Null terminated: B e t s y 0
• Length field: 5 B e t s y
• Heap allocated: B e t s y

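String Allocation
• Static length
  • blank fill, as in Fortran, Pascal, etc.
• Limited dynamic length
  • grows to a limit
• Dynamic length
  • no length restriction
  • reallocates from heap

Fixed-size buffer in C:

char x[4];
strcpy(x, "abc"); // OK: 3 characters plus the terminating null fit in 4 bytes
strcat(x, "def"); // NO: the result needs 7 bytes and overruns x

Dynamic length (a language where assignment reallocates; not valid C for the array above):

x = "abc";
x = "abcdefghij";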
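The difference between a string that grows only to a limit and one that reallocates from the heap can be sketched in C. This is a minimal sketch, not how any particular language implements strings; the names append_checked, DynString, and dyn_append are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper for a limited-length string: append src to dst only
   if the result, including the terminating '\0', fits in dst_size bytes.
   Returns 1 on success, 0 if it would overflow (dst is left unchanged). */
int append_checked(char *dst, size_t dst_size, const char *src)
{
    size_t used = strlen(dst);
    size_t need = strlen(src);
    if (used + need + 1 > dst_size)
        return 0;
    memcpy(dst + used, src, need + 1);   /* also copies the '\0' */
    return 1;
}

/* Hypothetical dynamic-length string: the buffer lives on the heap and is
   reallocated whenever an append does not fit. */
typedef struct {
    char  *data;
    size_t len;   /* characters in use, not counting '\0' */
    size_t cap;   /* bytes currently allocated */
} DynString;

int dyn_append(DynString *s, const char *src)
{
    size_t need = strlen(src);
    if (s->len + need + 1 > s->cap) {
        size_t new_cap = (s->len + need + 1) * 2;
        char *p = realloc(s->data, new_cap);   /* realloc(NULL, n) acts like malloc */
        if (p == NULL)
            return 0;                          /* out of memory; s is unchanged */
        s->data = p;
        s->cap = new_cap;
    }
    memcpy(s->data + s->len, src, need + 1);
    s->len += need;
    return 1;
}
```

Starting from a DynString initialized to {NULL, 0, 0}, repeated calls to dyn_append grow the buffer as needed; the caller frees data when done.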

Implementation
• Strings can be viewed as a primitive type
  • some machine languages support string operations directly, treating strings as primitives even though the operations are slower
• A string sometimes requires both
  • compile-time descriptors
  • run-time descriptors
  • know the difference
Enumerated types
• Usually implemented as integers
  • the implied limit on the number of values is not a problem in practice
  • (red, green, blue): red is 0, green is 1, etc.
• Strong typing sometimes creates ambiguity
  • we want the types to be distinguished, but consider
  • weekday = (Mon, Tue, Wed, Thur, Fri);
  • classday = (Mon, Wed, Fri);
  • assignment is OK in one direction, but not the other
• I/O of enumeration values is allowed in some languages, not in others
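The integer numbering and the one-directional assignment can be sketched in C. C itself does not let two enums in one scope share a constant name, so the prefixes WD_ and CD_ and the two conversion functions are assumptions made for this sketch.

```c
/* C numbers enumeration constants from 0 unless told otherwise. */
enum color { RED, GREEN, BLUE };          /* RED is 0, GREEN is 1, BLUE is 2 */

/* Prefixes keep the overlapping day names apart (an assumption of this
   sketch; languages with overloaded literals handle this differently). */
enum weekday  { WD_MON, WD_TUE, WD_WED, WD_THU, WD_FRI };
enum classday { CD_MON, CD_WED, CD_FRI };

/* Every classday is also a weekday, so this direction always succeeds. */
enum weekday classday_to_weekday(enum classday d)
{
    switch (d) {
    case CD_MON: return WD_MON;
    case CD_WED: return WD_WED;
    default:     return WD_FRI;
    }
}

/* The other direction can fail: returns 1 and stores the result in *out,
   or returns 0 when the weekday has no classday value. */
int weekday_to_classday(enum weekday d, enum classday *out)
{
    switch (d) {
    case WD_MON: *out = CD_MON; return 1;
    case WD_WED: *out = CD_WED; return 1;
    case WD_FRI: *out = CD_FRI; return 1;
    default:     return 0;   /* Tue and Thu are not class days */
    }
}
```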
Subrange
• A contiguous sequence of an ordinal type
  • Mon..Fri
• Used for tighter restriction of values than primitive types provide
  • subtype age is integer range 0..150;
• Sometimes compatible with the base type, sometimes not
• EXAMPLE: is age compatible with integer?
  • type age is new integer range 1..150; NO (a new, distinct type)
  • subtype age is integer range 1..150; YES (a constrained integer)
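C has no subrange types, but the idea of a range-restricted type that is not interchangeable with plain integers can be approximated. In this sketch the names Age, AGE_MIN, AGE_MAX, and age_from_int are hypothetical, and the check happens at run time rather than in the type system.

```c
#include <stdbool.h>

/* Wrapping the value in a struct gives a distinct type: an Age cannot be
   assigned to or from a plain int by accident. */
typedef struct { int value; } Age;

enum { AGE_MIN = 1, AGE_MAX = 150 };

/* Converts an int to an Age, enforcing the range 1..150.
   Returns false and leaves *out untouched when n is out of range. */
bool age_from_int(int n, Age *out)
{
    if (n < AGE_MIN || n > AGE_MAX)
        return false;
    out->value = n;
    return true;
}
```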
Array Operations
• Whole-array operations are infrequent in languages other than APL
• Examples
  • elements (common)
  • entire array (as parameters/pointers)
  • slice (a row, column, or series of rows/columns)
• APL
  • matrix multiplication
  • vector dot product
  • add a scalar to each element
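In a language without whole-array operators, the last two APL operations have to be written as element loops. A minimal C sketch, with hypothetical function names:

```c
#include <stddef.h>

/* Adds the scalar k to each of the n elements of a, in place. */
void add_scalar(int *a, size_t n, int k)
{
    for (size_t i = 0; i < n; i++)
        a[i] += k;
}

/* Returns the dot product of two vectors of length n. */
long dot_product(const int *a, const int *b, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += (long)a[i] * b[i];
    return sum;
}
```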
Allocation strategies
• Static array
• Fixed stack-dynamic
  • int x[20]; size is decided at compile time
• Stack-dynamic
  • int x[n]; size is determined by n at run time; once allocated, it can’t change
• Heap-dynamic
  • array can grow dynamically and change its subscript range
  • ever been frustrated by the MAX size of an array?
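The stack-dynamic and heap-dynamic cases can be sketched in C. The stack-dynamic function relies on variable-length arrays, which are standard in C99 but optional in C11; IntVec and vec_push are hypothetical names for a growable array.

```c
#include <stdlib.h>

/* Stack-dynamic: the size of x is fixed when the function is entered and
   cannot change afterwards. Requires n > 0. */
long sum_first(int n)
{
    int x[n];                      /* C99 variable-length array */
    long sum = 0;
    for (int i = 0; i < n; i++)
        x[i] = i + 1;
    for (int i = 0; i < n; i++)
        sum += x[i];
    return sum;
}

/* Heap-dynamic: the array grows on demand, so there is no MAX size. */
typedef struct {
    int   *data;
    size_t len;    /* elements in use */
    size_t cap;    /* elements allocated */
} IntVec;

/* Appends value, doubling the allocation when it is full.
   Returns 1 on success, 0 if memory runs out (v is left unchanged). */
int vec_push(IntVec *v, int value)
{
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        int *p = realloc(v->data, new_cap * sizeof *p);
        if (p == NULL)
            return 0;
        v->data = p;
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
    return 1;
}
```

An IntVec starts as {NULL, 0, 0}, and the caller frees data when finished.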
Subscript/subrange errors
• Subscript bounds problems for arrays are one of our biggest programming nuisances
• Checking for them at run time is expensive
• Even if a subscript is within range, there is no assurance it is correct
• Some languages, such as C, do NO checking
• The consequences in programs are difficult or impossible to trace
• Storage is in row-major or column-major order

int A[2,3];

(1,1) (1,2)
(2,1) (2,2)
(3,1) (3,2)

Row-major order: (1,1) (1,2) (2,1) (2,2) (3,1) (3,2)
Column-major order: (1,1) (2,1) (3,1) (1,2) (2,2) (3,2)

Determining location

Location(a[I]) = base address(a) + (I - lowerbound) * element size

integer a[6];

Assume size 4 bytes each, starting at 100

[1] 100
[2] 104
[3] 108
[4] 112
[5] 116
[6] 120

Loc(a[3]) = 100 + (3-1)*4
          = 108

Most of this is compile-time!
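The one-dimensional formula translates directly into code. In this sketch addresses are modeled as plain unsigned numbers rather than real pointers, and loc_1d is a hypothetical name.

```c
#include <stddef.h>

/* Location(a[i]) = base + (i - lower) * elem_size.
   Assumes lower <= i, so the offset is never negative. */
size_t loc_1d(size_t base, long i, long lower, size_t elem_size)
{
    return base + (size_t)(i - lower) * elem_size;
}
```

With the slide's values, loc_1d(100, 3, 1, 4) gives 108. When the base, lower bound, and element size are known at compile time, only the multiplication by i remains for run time.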

2-d arrays (column major)

Loc(a[I,J]) = base address(a) + (I-lb1)*size of element + (J-lb2)*size of column

size of column = number of rows allocated * size of element

(1,1) 100
(2,1) 104
(3,1) 108
(1,2) 112
(2,2) 116
(3,2) 120

Loc(a[1,2]) = 100 + (1-1)*4 + (2-1)*3*4
            = 100 + 0 + 12
            = 112
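The column-major formula, and its row-major counterpart for comparison, can be sketched as follows. Addresses are modeled as plain numbers, the function names are hypothetical, and both functions assume the subscripts are at or above their lower bounds.

```c
#include <stddef.h>

/* Column-major: consecutive rows of one column are adjacent in memory.
   A whole column occupies rows * elem_size bytes. */
size_t loc_col_major(size_t base, long i, long j, long lb1, long lb2,
                     size_t rows, size_t elem_size)
{
    return base + (size_t)(i - lb1) * elem_size
                + (size_t)(j - lb2) * rows * elem_size;
}

/* Row-major: consecutive columns of one row are adjacent in memory.
   A whole row occupies cols * elem_size bytes. */
size_t loc_row_major(size_t base, long i, long j, long lb1, long lb2,
                     size_t cols, size_t elem_size)
{
    return base + (size_t)(i - lb1) * cols * elem_size
                + (size_t)(j - lb2) * elem_size;
}
```

For the 3-row, 2-column array above, loc_col_major(100, 1, 2, 1, 1, 3, 4) gives 112, while the same element in row-major order is at loc_row_major(100, 1, 2, 1, 1, 2, 4), which is 104.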

Passing 2-d arrays as parameters
• The receiving procedure needs to have DIMENSION information
• Some languages are tightly bound and force that: Pascal, by requiring the parameter to be of a declared type
• Others have strange rules
  • Fortran (column major)

Called:

SUBROUTINE PROCESS(A,N)
INTEGER A(N,1)

Caller:

INTEGER A(10,20)
CALL PROCESS(A,10)

Associative arrays
• Not common; built into Perl
• Uses a hash function
• Stores Key and Value

In math class:

hash(key) = value
or
hash("gary") = 47850

%salaries

Key      Value
mary     55750
cedric   75000
gary     47850
perry    57000

$salaries{"gary"} -> 47850
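How a hash function turns a key into a storage location can be sketched with a small chained hash table in C. This is an illustration only, not how Perl implements its hashes; Table, table_put, table_get, table_free, the bucket count, and the hash function are all choices made for this sketch.

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 16

typedef struct Entry {
    char         *key;
    long          value;
    struct Entry *next;     /* next entry in the same bucket */
} Entry;

typedef struct { Entry *buckets[NBUCKETS]; } Table;

/* Maps a key to a bucket index (a simple multiply-and-add string hash). */
static unsigned hash_key(const char *key)
{
    unsigned h = 5381;
    while (*key)
        h = h * 33 + (unsigned char)*key++;
    return h % NBUCKETS;
}

/* Stores value under key, replacing any earlier value. Returns 0 on
   allocation failure, 1 otherwise. */
int table_put(Table *t, const char *key, long value)
{
    unsigned b = hash_key(key);
    for (Entry *e = t->buckets[b]; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0) { e->value = value; return 1; }

    Entry *e = malloc(sizeof *e);
    if (e == NULL)
        return 0;
    e->key = malloc(strlen(key) + 1);
    if (e->key == NULL) { free(e); return 0; }
    strcpy(e->key, key);
    e->value = value;
    e->next = t->buckets[b];      /* push onto the front of the bucket */
    t->buckets[b] = e;
    return 1;
}

/* Looks up key. Returns 1 and stores the value in *out, or 0 if absent. */
int table_get(const Table *t, const char *key, long *out)
{
    for (Entry *e = t->buckets[hash_key(key)]; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0) { *out = e->value; return 1; }
    return 0;
}

/* Releases every entry and empties the table. */
void table_free(Table *t)
{
    for (int b = 0; b < NBUCKETS; b++) {
        Entry *e = t->buckets[b];
        while (e != NULL) {
            Entry *next = e->next;
            free(e->key);
            free(e);
            e = next;
        }
        t->buckets[b] = NULL;
    }
}
```

A Table must start with all buckets set to NULL, for example with Table t = {{0}}.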

Arrays as pointers in C
• An array name used in a C expression acts as a pointer to the first element
• Incrementing the associated pointer advances it by the true memory size of an element
  • assume integers are 4 bytes
  • int *j;
  • j++; // increments j by 4, assuming byte-addressable memory
Example code in C

int c[10], *j;

for (j = c;        // assign j to be the address of c[0]
     j < &c[10];   // test that j is within the bounds of c
     j++)          // increment j by the size of an integer
{ *j = 0; }        // set the element to 0

The same loop with subscripts:

for (int i=0; i<10; i++)
{ c[i] = 0; }

Records
• Record operations
• assignment
• comparison
• block operations without respect to fields
• Strange syntax in C
• Unions
Record pointers in C

struct person {
    int weight;
    int age;
    char name[20];
};

struct person teacher;

In declaring routine:

teacher.age = 35;

When passing a pointer to the record to a function, and inside that function:

teacher->age = 35;

Unions
• Free unions
  • two names for the same place
  • it’s up to you to keep them straight
  • no support for checking
• Discriminated unions
  • a value in the record indicates how to interpret the associated data
  • not always easy to check; sometimes not done

Record layout of a discriminated union:

Discriminant (form)
color
filled
  rectangle: side1, side2
  circle: diameter
  triangle: leftside, rightside, angle
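C only offers free unions, so a discriminated union has to be built by hand: a tag field stored next to the union, checked by the programmer before the data is interpreted. A sketch of the figure record above, in which the type names, the member name u, and figure_describe are assumptions:

```c
#include <stdio.h>

typedef enum { RECTANGLE, CIRCLE, TRIANGLE } Form;

typedef struct {
    Form form;                  /* the discriminant */
    int  color;
    int  filled;
    union {                     /* all three variants share this storage */
        struct { double side1, side2; } rectangle;
        struct { double diameter; } circle;
        struct { double leftside, rightside, angle; } triangle;
    } u;
} Figure;

/* Writes a description into buf, reading only the variant that the
   discriminant says is valid. Returns what snprintf returns. */
int figure_describe(const Figure *f, char *buf, size_t n)
{
    switch (f->form) {
    case RECTANGLE:
        return snprintf(buf, n, "rectangle %g x %g",
                        f->u.rectangle.side1, f->u.rectangle.side2);
    case CIRCLE:
        return snprintf(buf, n, "circle diameter %g", f->u.circle.diameter);
    case TRIANGLE:
        return snprintf(buf, n, "triangle %g, %g, angle %g",
                        f->u.triangle.leftside, f->u.triangle.rightside,
                        f->u.triangle.angle);
    }
    return -1;                  /* unknown discriminant */
}
```

Nothing stops code from setting form to CIRCLE and then reading u.rectangle; that missing check is what separates this from a language-enforced discriminated union.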

Sets
• Bit fields implemented as binary values (below)
  • fast implementation
  • set operations are easy binary operations
  • try set union (a bitwise OR)
  • the size of a set is limited by the width of the binary operations

type colors = (red,blue,green,yellow,orange,white,black);
     colorset = set of colors;
var set1 : colorset;

set1 := [red,orange,blue];

implemented as ( 1 1 0 0 1 0 0 )
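The same set can be sketched in C with one bit per color. Here red is the least significant bit, so the bit order is the mirror image of the row shown above; the names ColorSet, MEMBER, and the set_ functions are hypothetical.

```c
/* Same order as the Pascal declaration: red is 0, blue is 1, and so on. */
typedef enum { RED, BLUE, GREEN, YELLOW, ORANGE, WHITE, BLACK } Color;

typedef unsigned ColorSet;                /* one bit per possible member */

#define MEMBER(c) (1u << (c))             /* the singleton set {c} */

ColorSet set_union(ColorSet a, ColorSet b)        { return a | b; }
ColorSet set_intersection(ColorSet a, ColorSet b) { return a & b; }
int      set_contains(ColorSet s, Color c)        { return (s & MEMBER(c)) != 0; }
```

The set [red, orange, blue] is MEMBER(RED) | MEMBER(ORANGE) | MEMBER(BLUE). Each set operation is a single machine instruction, and the number of possible members is capped by the number of bits in ColorSet.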

Pointers
• Lots of flexibility
• Data from heap
• Difficult to manage what you are pointing at
• Many languages strongly manage the types to which pointers point
  • C doesn’t care
  • C++ does
• The real problems are in programmer management
Pointer problems

Dangling reference:

int *p1, *p2;
p1 = new (int);
p2 = p1;
delete(p1);     // p2 still points at the deallocated cell

Lost heap-dynamic variable:

int *p1, *p2;
p1 = new (int);
p1 = p2;        // the cell allocated above can no longer be reached (lost)

Handling Pointer Problems
• Tombstones
  • the tombstone always stays, even after the memory is deallocated
  • a variable never points directly at deallocated data

Before: pointer -> tombstone -> cell
After: pointer -> tombstone (null); cell deallocated
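A tombstone can be sketched in C as an extra level of indirection: program variables hold the address of the tombstone, and only the tombstone holds the address of the cell. The names Tombstone, ts_alloc, ts_free, and ts_read are hypothetical.

```c
#include <stdlib.h>

/* Variables point at the tombstone; only the tombstone points at the cell. */
typedef struct { int *cell; } Tombstone;

/* Allocates a cell holding value, plus its tombstone. NULL on failure. */
Tombstone *ts_alloc(int value)
{
    Tombstone *t = malloc(sizeof *t);
    if (t == NULL)
        return NULL;
    t->cell = malloc(sizeof *t->cell);
    if (t->cell == NULL) { free(t); return NULL; }
    *t->cell = value;
    return t;
}

/* Deallocates the cell. The tombstone itself is deliberately never freed,
   so every alias still sees a valid, null-marked tombstone. */
void ts_free(Tombstone *t)
{
    free(t->cell);
    t->cell = NULL;
}

/* Reads through the tombstone. Returns 0 if the cell has been deallocated,
   which turns a dangling reference into a detectable error. */
int ts_read(const Tombstone *t, int *out)
{
    if (t->cell == NULL)
        return 0;
    *out = *t->cell;
    return 1;
}
```

The cost is visible here: every access takes an extra dereference, and the tombstones themselves are never reclaimed.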

REFERENCE COUNTERS
• 3 pointers at same cell: count is 3
• 2 pointers at same cell: count is 2

Delete cell when reference count is 0

Other than efficiency, the tricky case is circular lists: cells that point at each other never reach a count of 0
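Reference counting can be sketched in C with a counter stored in the cell. The names Cell, cell_new, cell_retain, and cell_release are hypothetical, and the caller is responsible for calling them whenever a pointer is copied or dropped.

```c
#include <stdlib.h>

typedef struct {
    int refs;     /* number of pointers currently referring to this cell */
    int value;
} Cell;

/* Creates a cell with one reference (the pointer being returned). */
Cell *cell_new(int value)
{
    Cell *c = malloc(sizeof *c);
    if (c != NULL) {
        c->refs = 1;
        c->value = value;
    }
    return c;
}

/* Call whenever the pointer is copied. */
Cell *cell_retain(Cell *c)
{
    c->refs++;
    return c;
}

/* Call whenever a pointer to the cell goes away. Frees the cell and
   returns 1 when the count reaches 0, otherwise returns 0. */
int cell_release(Cell *c)
{
    if (--c->refs == 0) {
        free(c);
        return 1;
    }
    return 0;
}
```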

GARBAGE COLLECTION
• Initial scenario
• Mark all cells w/0
• Mark all cells pointed at w/1

[Figure: heap cells with their mark bits, 0 or 1]
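The marking steps can be sketched as a small mark-and-sweep pass in C. The heap here is modeled as an array of cells that refer to each other by index, which is an assumption of the sketch; Cell, NONE, and collect are hypothetical names. After marking, any allocated cell still marked 0 is reclaimed.

```c
enum { NONE = -1 };               /* "no pointer" */

typedef struct {
    int mark;                     /* 0 = not reached, 1 = reached */
    int allocated;                /* 1 while the cell is in use */
    int left, right;              /* indexes of the cells pointed at, or NONE */
} Cell;

/* Marks cell i and everything reachable from it with 1. */
static void mark_from(Cell heap[], int i)
{
    if (i == NONE || heap[i].mark)
        return;                   /* nothing there, or already visited */
    heap[i].mark = 1;
    mark_from(heap, heap[i].left);
    mark_from(heap, heap[i].right);
}

/* Runs one collection and returns the number of cells reclaimed.
   roots holds the indexes that program variables point at. */
int collect(Cell heap[], int ncells, const int roots[], int nroots)
{
    int freed = 0;
    for (int i = 0; i < ncells; i++)      /* mark all w/0 */
        heap[i].mark = 0;
    for (int r = 0; r < nroots; r++)      /* mark all pointed at w/1 */
        mark_from(heap, roots[r]);
    for (int i = 0; i < ncells; i++) {    /* sweep: reclaim cells still 0 */
        if (heap[i].allocated && !heap[i].mark) {
            heap[i].allocated = 0;
            heap[i].left = heap[i].right = NONE;
            freed++;
        }
    }
    return freed;
}
```

Because reachability is traced from the roots, two cells that only point at each other are reclaimed, which is the circular case that reference counters cannot handle.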