Chapter 6: Data Types

What is a data type?

A set of values

versus

A set of values + a set of operations on those values

Why data types?

  • Data abstraction

    • Programming style – incorrect design decisions show up at translation time

    • Modifiability – enhance readability

  • Type checking (semantic analysis) is the process a translator goes through to determine whether the type information is consistent; it can be done at compile time

  • Compiler uses type information to allocate space for variables

  • Translation efficiency: Type conversion (coercion) can be done at compile time

Overview cont…

  • Declarations

    • Explicit type information

  • Type declarations

    • Give new names to types in a type declaration

  • Type checking

    • Type inference: rules for determining the types of constructs from available type information

    • Type equivalence determines if two types are the same

  • Type system

    • Type construction methods + type inference rules + type equivalence algorithm

Simple Types

  • Predefined types

  • Enumerated types


    type fruit = (apple, orange, banana);


    enum fruit { apple, orange, banana };

Simple Types

  • Subrange types


    type byte = 0..255;

    minors = 0..19;

    teens = 13..19;


    subtype teens is INTEGER range 13..19;

Data Aggregates and Type Constructors

  • Aggregate (compound) objects and types are constructed from simple types

  • Recursive – can also construct aggregate objects and types from aggregate types

  • Predefined – records, arrays, strings, …

Constructors - Cartesian Product

  • U X V = { (u, v) | u is in U and v is in V}

  • Projection functions

    • p1: U X V -> U and p2: U X V -> V

  • Pascal family of languages - record

type polygon = record

edgeCt: integer;

edgeLen: real

end;

var a : polygon;

Cartesian product type
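
As a rough illustration in Python (not one of the languages used on this slide), the polygon record can be modelled as a named tuple, with the projection functions p1 and p2 written as ordinary functions. The names Polygon, edge_ct, edge_len, p1 and p2 are chosen for this sketch only.

```python
from typing import NamedTuple


class Polygon(NamedTuple):
    """Cartesian product integer x real, mirroring the Pascal record."""
    edge_ct: int
    edge_len: float


def p1(p: Polygon) -> int:
    """Projection onto the first component (U)."""
    return p.edge_ct


def p2(p: Polygon) -> float:
    """Projection onto the second component (V)."""
    return p.edge_len


a = Polygon(edge_ct=4, edge_len=2.5)
print(p1(a), p2(a))  # prints: 4 2.5
```

A value of the product type is just a pair, so a compares equal to the plain tuple (4, 2.5).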





Constructors – Mapping (arrays)

  • The array constructor defines mappings as data aggregates

  • Mapping from an array index to the value stored in that position in the array

  • The domain is the values of the index

  • The range is the values stored in the array

Constructors – Mapping (arrays)

  • C/C++

typedef int little_people_num[3];

little_people_num gollum_city = {0, 0, 0};

gollum_city[2] = 5;  /* valid indices are 0..2 */

typedef int matrix[10][20];

Arrays

  • Design Issues:

    1. What types are legal for subscripts?

    2. Are subscripting expressions in element references range checked?

    3. When are subscript ranges bound?

    4. When does allocation take place?

    5. What is the maximum number of subscripts?

    6. Can array objects be initialized?

    7. Is any kind of slice allowed?


  • Indexing is a mapping from indices to elements

    map(array_name, index_value_list) -> an element


  • Subscript Types:

    • FORTRAN, C - integer only

    • Pascal - any ordinal type (integer, boolean, char, enum)

    • Ada - integer or enum (includes boolean and char)

    • Java - integer types only


  • Categories of arrays (based on subscript binding and binding to storage)

    1. Static - range of subscripts and storage bindings are static

    e.g. FORTRAN 77, some arrays in Ada

    • Advantage: execution efficiency (no allocation or deallocation)


2. Fixed stack dynamic - range of subscripts is statically bound, but storage is bound at elaboration time (at function call time)

  • e.g. Most Java locals, and C locals that are not static

  • Advantage: space efficiency


3. Stack-dynamic - range and storage are dynamic (decided at run time), but fixed from creation on for the variable’s lifetime

  • e.g. Ada declare blocks


    declare

    STUFF : array (1..N) of FLOAT;

    begin

    ...

    end;

  • Advantage: flexibility - size need not be known until the array is about to be used


4. Heap-dynamic – stored on heap and sizes decided at run time.

e.g. (FORTRAN 90)


(Declares MAT to be a dynamic 2-dim array)


(Allocates MAT to have 10 rows and



(Deallocates MAT’s storage)


4. Heap-dynamic (continued)

  • Truly dynamic: In APL, Perl, and JavaScript, arrays grow and shrink as needed

  • Fixed heap-dynamic: in Java, once an array is created, its size doesn’t change


  • Number of subscripts

    • FORTRAN I allowed up to three

    • FORTRAN 77 allows up to seven

    • Others - no limit

  • Array Initialization

    • Usually just a list of values that are put in the array in the order in which the array elements are stored in memory


  • Array Operations

    1. APL - many, see book (p. 240-241)

    2. Ada

    • Assignment; RHS can be an aggregate constant or an array name

    • Catenation; for all single-dimensioned arrays

    • Relational operators (= and /= only)

    3. FORTRAN 90

    • Intrinsics (subprograms) for a wide variety of array operations (e.g., matrix multiplication, vector dot product)


  • Slices

    • A slice is some substructure of an array; nothing more than a referencing mechanism

    • Slices are only useful in languages that have array operations


  • Slice Examples:

    1. FORTRAN 90

    INTEGER MAT (1:4, 1:4)

    MAT(1:4, 1) - the first column

    MAT(2, 1:4) - the second row


  • Implementation of Arrays

    • Access function maps subscript expressions to an address in the array

    • Row major (by rows) or column major order (by columns)

Compile-Time Descriptors

[Figure: compile-time descriptors for a single-dimensioned array and for a multi-dimensional array]

Accessing Formulas – 1D

  • Address(A[i]) = BegAdd + (i-lb)*size

    = VirtualOrigin + i*size, where VirtualOrigin = BegAdd - lb*size

    lb: lower bound

    size: number of bytes of one element

    The virtual origin lets us do the constant part of the arithmetic once, instead of repeating it on every access.

    The subscript must be range checked before this formula is used, since the formula itself accepts any subscript value.
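
The 1D formula can be sketched in Python as follows. This is only an illustration under assumed names (make_accessor, beg_add, lb, ub, size) and an assumed example array; it computes the virtual origin once and range checks the subscript before applying the formula.

```python
def make_accessor(beg_add: int, lb: int, ub: int, size: int):
    """Return a function mapping a subscript i to the address of A[i]."""
    # Computed once, when the array is created.
    virtual_origin = beg_add - lb * size

    def address(i: int) -> int:
        # The formula accepts any i, so the range check must come first.
        if not lb <= i <= ub:
            raise IndexError(f"subscript {i} outside {lb}..{ub}")
        return virtual_origin + i * size

    return address


# Hypothetical array A[3..7] of 4-byte elements starting at address 100.
addr = make_accessor(beg_add=100, lb=3, ub=7, size=4)
print(addr(3), addr(7))  # prints: 100 116
```

Calling addr(8) raises IndexError rather than returning an address past the end of the array.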

Accessing Formulas – Multiple Dimensions

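  • ub_i: upper bound in the ith dimension

  • lb_i: lower bound in the ith dimension

  • length_i = ub_i - lb_i + 1

  • In row-major order

    Address(A[i,j]) = begAdd + size*((i-lb_i)*length_j + (j-lb_j))

    = VO + i*mult_i + j*mult_j

    where mult_i = size*length_j, mult_j = size, and VO = begAdd - lb_i*mult_i - lb_j*mult_j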


Address(A[i,j]) = begAdd + size*((i-lb_i)*length_j + (j-lb_j))

= VO + i*mult_i + j*mult_j = 88 + 20i + 4j

For example: array of floats A[0..6, 3..7] beginning at location 100

begAdd = 100

size = 4 (if floats take 4 bytes)

lb_i = 0 ub_i = 6 length_i = 7

lb_j = 3 ub_j = 7 length_j = 5

VO = begAdd - size*(lb_i*length_j + lb_j) = 100 - 4*(0*5 + 3) = 88

mult_i = size*length_j = 20

mult_j = size = 4

repeated for each dimension
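
The sketch below evaluates the row-major formula directly for the array A[0..6, 3..7] of 4-byte floats at address 100. The class name Descriptor2D and its field names are assumptions made for illustration; the multipliers and virtual origin are derived from the bounds, not typed in.

```python
from dataclasses import dataclass


@dataclass
class Descriptor2D:
    beg_add: int
    size: int
    lb_i: int
    ub_i: int
    lb_j: int
    ub_j: int

    def __post_init__(self):
        length_j = self.ub_j - self.lb_j + 1
        # Row-major multipliers: one row step skips length_j elements.
        self.mult_i = self.size * length_j
        self.mult_j = self.size
        self.vo = self.beg_add - self.lb_i * self.mult_i - self.lb_j * self.mult_j

    def address(self, i: int, j: int) -> int:
        # Range check first; the formula itself accepts any subscripts.
        if not (self.lb_i <= i <= self.ub_i and self.lb_j <= j <= self.ub_j):
            raise IndexError((i, j))
        return self.vo + i * self.mult_i + j * self.mult_j


d = Descriptor2D(beg_add=100, size=4, lb_i=0, ub_i=6, lb_j=3, ub_j=7)
print(d.vo, d.mult_i, d.mult_j)  # prints: 88 20 4
print(d.address(0, 3), d.address(1, 3), d.address(6, 7))  # prints: 100 120 236
```

The first element A[0,3] lands on the beginning address 100, and the last element A[6,7] is element number 34, at 100 + 34*4 = 236.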

Accessing Formulas – Multiple Dimensions

  • In column-major order

    Address(A[i,j]) = begAdd + size*((i-lb_i) + (j-lb_j)*length_i)

  • In 3D in row-major order:

    Address(A[i,j,k]) =

    begAdd + size*((i-lb_i)*length_j*length_k + (j-lb_j)*length_k + (k-lb_k))
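
Both storage orders, for any number of dimensions, can be captured by one loop. The following is a minimal sketch with assumed names (element_offset, address, subs, lbs, lengths); it accumulates the offset Horner-style and does no range checking.

```python
def element_offset(subs, lbs, lengths, order="row"):
    """Number of elements stored before A[subs] in the given storage order."""
    dims = list(range(len(subs)))
    if order == "column":
        dims.reverse()  # column-major: the first subscript varies fastest
    offset = 0
    for d in dims:
        # Horner-style accumulation of (subscript - lower bound) terms.
        offset = offset * lengths[d] + (subs[d] - lbs[d])
    return offset


def address(beg_add, size, subs, lbs, lengths, order="row"):
    return beg_add + size * element_offset(subs, lbs, lengths, order)


# A[0..6, 3..7] of 4-byte floats starting at address 100.
lbs, lengths = (0, 3), (7, 5)
print(address(100, 4, (1, 3), lbs, lengths, "row"))     # prints: 120
print(address(100, 4, (1, 3), lbs, lengths, "column"))  # prints: 104
# Hypothetical 3D array B[0..1, 0..2, 0..3]: the last element has offset 23.
print(element_offset((1, 2, 3), (0, 0, 0), (2, 3, 4)))  # prints: 23
```

In row-major order A[1,3] sits a full row (5 elements) past the start; in column-major order it is the second element of the first column.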

Accessing Formulas – Slices

  • Suppose we want only the second row of our previous example. The array descriptor for the slice needs to look like that of a normal 1D array, because a function that receives the slice can’t be expected to treat it any differently.


Address(A[i,j]) = begAdd + size*((i-lb_i)*length_j + (j-lb_j))

= VO + i*mult_i + j*mult_j = 88 + 20i + 4j

For example: array of floats A[0..6, 3..7] beginning at location 100

If we want only the second row, it is as if we had hardcoded i = 1 in the accessing formula (row subscripts start at 0, so the second row is i = 1).

The accessing formula is simple: just replace i with 1, adjust the VO, and remove the ub, lb, and length associated with i,

so 88 + 20i + 4j = 108 + 4j

and the table is changed accordingly (and looks just like the descriptor for a regular 1D array)
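
A row slice can be sketched as a function that folds the fixed row subscript into the virtual origin and returns an ordinary 1D descriptor. The dictionary layout and the names row_slice and address_1d are assumptions for illustration. For A[0..6, 3..7] of 4-byte floats at address 100, the row-major formula gives VO = 88, mult_i = 20 and mult_j = 4, which are the inputs used here.

```python
def row_slice(vo, mult_i, mult_j, lb_j, ub_j, i):
    """Fix the row subscript i and return a 1D descriptor for that row."""
    return {
        "vo": vo + i * mult_i,  # fold the fixed subscript into the virtual origin
        "lb": lb_j,
        "ub": ub_j,
        "mult": mult_j,
    }


def address_1d(desc, j):
    """Ordinary 1D access: range check, then virtual origin + j * multiplier."""
    if not desc["lb"] <= j <= desc["ub"]:
        raise IndexError(j)
    return desc["vo"] + j * desc["mult"]


second_row = row_slice(vo=88, mult_i=20, mult_j=4, lb_j=3, ub_j=7, i=1)
print(second_row["vo"])           # prints: 108
print(address_1d(second_row, 3))  # prints: 120
```

The receiving code only ever sees a virtual origin, bounds and one multiplier, so it cannot tell the slice from a true 1D array.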

Constructors – Union

  • Cartesian products – conjunction of fields

  • Union – disjunction of fields

  • Discriminated or undiscriminated


  • Pascal Variant Record – discriminated union

type address_range = 0..maxint;

address_type = (absolute, offset);

safe_address =

record

case kind: address_type of

absolute: (abs_addr: address_range);

offset: (off_addr: integer);

end;
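
For comparison, a discriminated union can be sketched in Python with one small class per variant, where the class itself plays the role of the kind tag. The names Absolute, Offset, SafeAddress and the helper resolve (with its base parameter) are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class Absolute:
    abs_addr: int  # stands in for address_range (0..maxint)


@dataclass(frozen=True)
class Offset:
    off_addr: int


SafeAddress = Union[Absolute, Offset]


def resolve(addr: SafeAddress, base: int) -> int:
    """Inspect the tag first, then touch only the field that variant has."""
    if isinstance(addr, Absolute):
        return addr.abs_addr
    if isinstance(addr, Offset):
        return base + addr.off_addr
    raise TypeError(f"not a SafeAddress: {addr!r}")


print(resolve(Absolute(4096), base=1000))  # prints: 4096
print(resolve(Offset(-8), base=1000))      # prints: 992
```

Because each variant carries only its own field, reading off_addr from an Absolute value fails instead of silently reinterpreting the data, which is the safety an undiscriminated union lacks.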


References

  • A reference is the address of an object under the control of the system – which cannot be used as a value or operated on

  • C++ is perhaps the only language where pointers and references exist together.

  • References in C++ are constant pointers that are dereferenced every time they are used.

Constructors – Pointer and Recursive Types

  • Some languages (Pascal, Ada) require pointers to be typed

  • PL/1 treated pointers as untyped data objects. What is the significance of this for a type checker?

  • C pointers are typed but C allows arithmetic operations on them unlike Pascal and Ada

Type Equivalence

  • When are two types the same

  • Structural equivalence

  • Declaration equivalence

  • Name equivalence

Structural Equivalence

  • Two types are the same if they have the same structure

  • i.e. they are constructed in exactly the same way using the same type constructors from the same simple types

  • May look alike even when we wanted them to be treated as different.

Structural Type Equivalence

(Note: we are just using the syntax of C as an example.

C does NOT use structural equivalence for structs.)

typedef int anarray[10];

typedef struct {

anarray x;

int y;}


typedef struct {

int x[10];

int y;


typedef int anarray[10];

typedef struct {

int b;

anarray a;


typedef int anarray[10];

typedef struct {

anarray a;

int b;


Structural Equivalence

  • Check by representing types as trees

    • Check equivalence recursively on subtrees

  • Consider…

  • Dynamic arrays

Type array1 = array[-1..9] of integer;

array2 = array[0..10] of integer;

Array (INTEGER range <>) of INTEGER
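
The tree-based check can be sketched in Python as below. The encoding is an assumption for illustration: a simple type is a string, and a constructed type is a tuple whose first item names the constructor. Field names are ignored and an array is described only by its length, so under this sketch array[-1..9] and array[0..10] would compare equal; a type system that treats index bounds as part of the type would store them in the tree instead.

```python
def structurally_equal(t1, t2):
    """Compare two type trees node by node."""
    if not isinstance(t1, tuple) or not isinstance(t2, tuple):
        return t1 == t2  # leaves: simple type names or array lengths
    if t1[0] != t2[0] or len(t1) != len(t2):
        return False     # different constructor or number of components
    # Same constructor: check the subtrees recursively, in order.
    return all(structurally_equal(a, b) for a, b in zip(t1[1:], t2[1:]))


ANARRAY = ("array", 10, "int")                 # typedef int anarray[10];
s1 = ("struct", ANARRAY, "int")                # struct { anarray x; int y; }
s2 = ("struct", ("array", 10, "int"), "int")   # struct { int x[10]; int y; }
s3 = ("struct", "int", ANARRAY)                # struct { int b; anarray a; }

print(structurally_equal(s1, s2))  # prints: True
print(structurally_equal(s1, s3))  # prints: False
```

The first pair matches because the typedef name is expanded away before comparison; the second pair differs only in field order, which this sketch treats as significant.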

Name Equivalence

  • Two named types are equivalent only if they have exactly the same type name

  • Name equivalence in Ada and C

  • ar1 and ar2 are not considered name equivalent

typedef int ar1[10];

typedef ar1 ar2;

typedef int age;

type ar1 is array (INTEGER range 1..10) of INTEGER;

type ar2 is new ar1;

type age is new INTEGER;

Name equivalence…

v1: ar1;

v2: ar1;

v3: ar2;

v4: array (INTEGER range 1..100) of INTEGER;

v5: array (INTEGER range 1..100) of INTEGER;

v4 and v5 cannot be name equivalent, as there is no type name.

v6,v7: array (INTEGER range 1..100) of INTEGER;

Declaration Equivalence

  • Two type names are declaration equivalent if they lead back to the same original structure declaration via a series of redeclarations

type t1 = array [1..10] of integer;

t2 = t1;

t3 = t2;

These are the same type

type t4 = array [1..10] of integer;

t5 = array [1..10] of integer;

These are different types.
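
One way to picture the rule is to follow each name back through its redeclarations and compare where the chains end. The Python sketch below uses assumed names (ArrayType, decls, original_declaration, declaration_equivalent) and a small table standing in for the Pascal declarations on this slide.

```python
class ArrayType:
    """One object per use of the array constructor in a declaration."""

    def __init__(self, low, high, element):
        self.low, self.high, self.element = low, high, element


# type name -> either a constructed type or the name it redeclares
decls = {
    "t1": ArrayType(1, 10, "integer"),
    "t2": "t1",
    "t3": "t2",
    "t4": ArrayType(1, 10, "integer"),
    "t5": ArrayType(1, 10, "integer"),
}


def original_declaration(name):
    """Follow redeclarations back to the name that introduced the structure."""
    while isinstance(decls[name], str):
        name = decls[name]
    return name


def declaration_equivalent(a, b):
    return original_declaration(a) == original_declaration(b)


print(declaration_equivalent("t1", "t3"))  # prints: True
print(declaration_equivalent("t4", "t5"))  # prints: False
```

t4 and t5 have identical structure, yet each applies the array constructor afresh, so their chains end at different declarations.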

Type Checking

  • Involves the application of a type equivalence algorithm to expressions and statements to determine if they make sense

  • Any attempt to manipulate a data object with an illegal operation is a type error

  • A program is said to be type safe (or type secure) if it is guaranteed to have no type errors

  • Static versus dynamic type checking

  • Run time errors

Type Checking…

  • Strong typing and type checking

    • Strong typing guarantees type safety

  • A language is strongly typed if its type system guarantees statically (as far as possible) that no data-corrupting errors can occur during execution.

  • Statically typed versus dynamically typed

    • Static (the type of every program expression must be known at compile time)

      • All variables are declared with an associated type

      • All operations are specified by stating the types of the required operands and the type of the result

  • Java is strongly typed

  • Python is dynamically typed and strongly typed

Strongly typed

  • /* Python code */

    >>> foo = "x"
    >>> foo = foo + 2
    Traceback (most recent call last):
      File "<pyshell#3>", line 1, in ?
        foo = foo + 2
    TypeError: cannot concatenate 'str' and 'int' objects
    >>>

  • foo is of str type.

  • In the second line, we're attempting to add 2 to a variable of str type.

  • A TypeError is raised, indicating that a str object cannot be concatenated with an int object. This is what characterizes strongly typed languages: every value has a particular data type, and values are not silently converted to make an operation fit.
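
The same experiment can be run as a script in current Python 3, where the wording of the error message differs from the session shown on this slide. This is a small sketch: the str + int operation is rejected at run time, and the programmer has to convert explicitly.

```python
foo = "x"
try:
    foo = foo + 2          # str + int: rejected at run time
except TypeError as exc:
    error = str(exc)       # the exact message varies between Python versions
else:
    error = None

print(error is not None)   # prints: True
print(foo)                 # prints: x   (the failed statement changed nothing)

foo = foo + str(2)         # explicit conversion makes the intent clear
print(foo)                 # prints: x2
```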

Weakly Typed

  • /* PHP code */

    <?php
    $foo = "x";
    $foo = $foo + 2; // not an error
    echo $foo;
    ?>

  • In this example, foo is initially a string. In the second line, we add this string variable to 2, an integer. This is permitted in PHP, and is characteristic of weakly typed languages.

  • C is statically typed but weakly typed, since a cast lets us treat a variable as a different type even when its value is not of that type

Type Conversion

r is float and j is integer

r = j + 45.6

i is integer and j is integer

i = j + 45.6

Type Conversion…

  • Modula2

    i := TRUNC (FLOAT(j) + 45.6)

  • Explicit type conversion

    • type conversion functions

  • Implicit conversion

    • coercion

    • can weaken type checking, since many coercions are allowed. For example, in C, if (a = b) treats the int result of the assignment as a boolean condition.

Type conversion…

  • Casts

    • A value or object of one type is preceded by a type name

      (int) 3.14

  • Often does not cause a conversion to take place. Internal representation is reinterpreted as a new type



    This is very dangerous!
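
The difference between converting a value and reinterpreting its representation can be sketched in Python with the standard struct module. This is only an illustration: Python has no C-style casts, so packing the value as a 4-byte IEEE 754 float and unpacking the same bytes as a 4-byte integer stands in for a reinterpreting cast.

```python
import struct

value = 3.14

# Conversion: a new value of the target type is computed.
converted = int(value)

# Reinterpretation: the same 4 bytes are read back as a different type.
raw = struct.pack("<f", value)             # 32-bit float, little-endian
reinterpreted = struct.unpack("<i", raw)[0]

print(converted)           # prints: 3
print(hex(reinterpreted))  # prints: 0x4048f5c3
```

The reinterpreted number has no arithmetic relationship to 3.14, which is why casts that skip the conversion are dangerous.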

Type Conversions

  • Def: A mixed-mode expression is one that has operands of different types

  • Def: A coercion is an implicit type conversion

  • The disadvantage of coercions:

    • They decrease the type error detection ability of the compiler

  • In most languages, all numeric types are coerced in expressions, using widening conversions

  • In Ada, there are virtually no coercions in expressions