Introduction to Computer Science I Topic 2: Structured Data Types Data abstraction

Prof. Dr. Max Mühlhäuser, Dr. Guido Rößling

Structures
• The input or output of a function is seldom an atomic value (number, boolean, symbol); more often it is a data object with several attributes.
  • E.g. a CD: title and price
• We need mechanisms to combine several pieces of data into compound data.
• One of these mechanisms is the structure.
• A structure definition has the following form:
    (define-struct s (field1 … fieldn))
• For example:
    (define-struct point (x y))

Structure definitions
    (define-struct s (field1 … fieldn))
This definition creates a series of procedures:
• make-s
  • a constructor procedure, which takes n arguments and returns a structure value
  • e.g. (define p (make-point 3 4)) creates a new point
• s?
  • a predicate procedure, which returns true for a value created by make-s and false for every other value
  • e.g. (point? p) yields true
• s-field
  • for every field, a selector, which takes a structure as its argument and extracts the value of that field
  • e.g. (point-y p) yields 4
The examples assume the definition
    (define-struct point (x y))
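The generated procedures can be tried out directly. The following is a minimal sketch, assuming a current Racket installation and `#lang racket` rather than the DrScheme teaching languages used on the slides; there the results print as true and false instead of #t and #f.

```racket
#lang racket
;; define-struct generates make-point, point?, point-x and point-y.
(define-struct point (x y))

(define p (make-point 3 4))   ; constructor

(point? p)    ; predicate: #t, because p was created by make-point
(point? 42)   ; #f, because 42 was not created by make-point
(point-x p)   ; selector: 3
(point-y p)   ; selector: 4
```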

Design of procedures for compound data
• When do we need structures?
  • When the description of an object consists of several pieces of information
• How does our design recipe change?
  • Data analysis: search the problem statement for descriptions of relevant objects, then define corresponding data types; describe the contract of each data type
  • Contracts can use the newly defined type names, e.g.
      ;; grant-qualified: Student -> bool
  • Template: header + body, where the body lists all possible selector expressions
  • Implementation of the body: design an expression that uses primitive operations, other functions, selector expressions and the variables

Example
    ;; Data Analysis & Definitions:
    (define-struct student (last first teacher))
    ;; A student is a structure: (make-student l f t)
    ;; where l, f, and t are symbols.
    ;; Contract: subst-teacher : student symbol -> student
    ;; Purpose: to create a student structure with a new
    ;; teacher name if the teacher's name matches 'Fritz
    ;; Examples:
    ;; (subst-teacher (make-student 'Find 'Matthew 'Fritz) 'Elise)
    ;; = (make-student 'Find 'Matthew 'Elise)
    ;; (subst-teacher (make-student 'Smith 'John 'Bill) 'Elise)
    ;; = (make-student 'Smith 'John 'Bill)

Example (continued)
    ;; Template:
    ;; (define (subst-teacher a-student a-teacher)
    ;;   ... (student-last a-student) ...
    ;;   ... (student-first a-student) ...
    ;;   ... (student-teacher a-student) ...)
    ;; Definition:
    (define (subst-teacher a-student a-teacher)
      (cond
        [(symbol=? (student-teacher a-student) 'Fritz)
         (make-student (student-last a-student)
                       (student-first a-student)
                       a-teacher)]
        [else a-student]))
    ;; Test 1:
    (subst-teacher (make-student 'Find 'Matthew 'Fritz) 'Elise)
    ;; expected value:
    (make-student 'Find 'Matthew 'Elise)
    ;; Test 2:
    (subst-teacher (make-student 'Smith 'John 'Bill) 'Elise)
    ;; expected value:
    (make-student 'Smith 'John 'Bill)

The meaning of structures in the substitution model (1/2)
• How does define-struct work in the substitution model?
    (define-struct c (f1 … fn))
• This structure definition produces the following operations:
  • make-c: a constructor
  • c-f1 … c-fn: a series of selectors
  • c?: a predicate

The meaning of structures in the substitution model (2/2)
• We proceed as for every combination
  • Evaluation of the operator and the operands
• The value of (make-c v1 … vn) is (make-c v1 … vn)
  • In this way, constructor applications to values are self-evaluating!
• The evaluation of (c-fi v) is
  • vi, if v = (make-c v1 … vi … vn)
  • an error in all other cases
• The evaluation of (c? v) is
  • true, if v = (make-c v1 … vn)
  • false, otherwise
• Try it with the DrScheme Stepper!
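The evaluation rules can be followed by hand on a small expression. The sketch below reuses the point structure from the earlier slides and assumes `#lang racket`; the names step-result, on-struct and on-number are introduced only for this illustration.

```racket
#lang racket
(define-struct point (x y))

;; Stepping (point-x (make-point (+ 1 2) 4)) by hand:
;;   (point-x (make-point (+ 1 2) 4))
;; = (point-x (make-point 3 4))   ; operand evaluated; the constructor call is now a value
;; = 3                            ; selector rule: take the value of field x
(define step-result (point-x (make-point (+ 1 2) 4)))

;; (point? v) is true only for values built by make-point.
(define on-struct (point? (make-point 3 4)))   ; #t
(define on-number (point? 3))                  ; #f

;; Applying a selector to any other value is an error:
;; evaluating (point-x 3) raises an exception in Racket.
```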

Data abstraction
• For procedures we have
  • primitive expressions (+, -, and, or, …)
  • means of combination (procedure implementation)
  • means of abstraction (procedural abstraction)
• We have the same for data:
  • primitive data (numbers, boolean values, symbols)
  • compound data (e.g. structures)
  • means of abstraction: data abstraction

Why do we need data abstraction?
• Example: implementing an operation for adding rational numbers
  • A rational number is composed of a numerator and a denominator, e.g. 1/2 or 7/9.
  • Adding two rational numbers produces two results: the resulting numerator and the resulting denominator.
  • But a procedure can only return one value.
  • So we would need two procedures: one returns the resulting numerator, the other the resulting denominator.
  • We would have to remember which numerator belongs to which denominator.
• Data abstraction is a method that combines several data objects so that they can be used as a single object. How the combination works is hidden behind the abstraction.

Why do we need data abstraction?
• The new data objects are abstract data:
  • They are used without making any assumptions about how they are implemented.
• Data abstraction helps to...
  • elevate the conceptual level at which programs are designed,
  • increase the modularity of designs, and
  • enhance the expressive power of a programming language.
• A concrete data representation is defined independently of the programs using the data.
• The interface between the representation and a program using the abstract data is a set of procedures, called selectors and constructors.

Language extensions for handling abstract data
• Constructor: a procedure that creates instances of abstract data from the data passed to it
• Selector: a procedure that returns a data item contained in an abstract data object
  • The component data item returned might be
    • the value of an internal variable,
    • or it might be computed.
• Constructors and selectors generated by define-struct are a special case
  • The component data returned by these selectors is always one of the values passed in the constructor call (never computed)
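To illustrate a selector whose result is computed rather than stored, here is a minimal sketch. The duration type and all of its names (secs, make-duration, duration-minutes, duration-seconds) are hypothetical and do not appear in the lecture; `#lang racket` is assumed.

```racket
#lang racket
;; Hypothetical abstract type: a duration given in minutes and seconds.
;; Internally only the total number of seconds is stored.
(define-struct secs (total))

;; make-duration : number number -> duration
(define (make-duration m s)
  (make-secs (+ (* 60 m) s)))

;; These selectors compute their results from the stored total.
(define (duration-minutes d) (quotient (secs-total d) 60))
(define (duration-seconds d) (remainder (secs-total d) 60))

(define d (make-duration 2 75))   ; 2 min 75 s = 195 s
(duration-minutes d)   ; 3
(duration-seconds d)   ; 15
```

By contrast, secs-total, which define-struct generates, returns exactly the value that was passed to make-secs.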

Example: Rational Numbers
• Mathematically represented by a pair of integers: 1/2, 7/9, 56/874, 78/23, etc.
• Constructor: (make-rat numerator denominator)
• Selectors: (numer rn) (denom rn)
• That's all a user needs to know!
• But it’s not quite enough for the programmer and DrScheme, as we have not yet defined the “rat” structure; this will follow in a couple of slides.

User-defined Operation for Rational Numbers
Multiplication of x = nx/dx and y = ny/dy:
  (nx/dx) * (ny/dy) = (nx*ny) / (dx*dy)
    ;; mul-rat: rat rat -> rat
    ;; Multiplies two rational numbers
    ;; Example: (mul-rat (make-rat 1 2) (make-rat 2 3))
    ;; = (make-rat 2 6)
    (define (mul-rat x y)
      (make-rat (* (numer x) (numer y))
                (* (denom x) (denom y))))

Another Operation on Rational Numbers
Addition of x = nx/dx and y = ny/dy:
  nx/dx + ny/dy = (nx*dy + ny*dx) / (dx*dy)
    ;; add-rat: rat rat -> rat
    (define (add-rat x y)
      (make-rat (+ (* (numer x) (denom y))
                   (* (numer y) (denom x)))
                (* (denom x) (denom y))))
Subtraction and division are defined similarly to addition and multiplication.
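Subtraction and division could look as follows. This is a sketch rather than the lecture's own code: the names sub-rat and div-rat are assumptions, and make-rat, numer and denom are defined here through the xy structure only so that the example runs on its own under `#lang racket`.

```racket
#lang racket
;; Representation of rational numbers as a pair structure.
(define-struct xy (x y))
(define (make-rat n d) (make-xy n d))
(define (numer r) (xy-x r))
(define (denom r) (xy-y r))

;; sub-rat: rat rat -> rat
;; nx/dx - ny/dy = (nx*dy - ny*dx) / (dx*dy)
(define (sub-rat x y)
  (make-rat (- (* (numer x) (denom y))
               (* (numer y) (denom x)))
            (* (denom x) (denom y))))

;; div-rat: rat rat -> rat
;; (nx/dx) / (ny/dy) = (nx*dy) / (dx*ny); assumes y is not zero
(define (div-rat x y)
  (make-rat (* (numer x) (denom y))
            (* (denom x) (numer y))))

(define diff (sub-rat (make-rat 1 2) (make-rat 1 3)))   ; numerator 1, denominator 6
(define quot (div-rat (make-rat 1 2) (make-rat 3 4)))   ; numerator 4, denominator 6 (not reduced)
```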

A test
Equality: nx/dx = ny/dy iff nx*dy = ny*dx
(“iff” means “if and only if”)
    ;; equal-rat?: rat rat -> bool
    (define (equal-rat? x y)
      (= (* (numer x) (denom y))
         (* (numer y) (denom x))))

An output operation
To output rational numbers in a convenient form, we define an output procedure using data abstraction.
    ;; print-rat: rat -> string
    (define (print-rat x)
      (string-append (number->string (numer x))
                     "/"
                     (number->string (denom x))))
This is your first example with string manipulation!
string-append joins several strings together.
number->string turns a number into a string.
This is not possible using symbols.

Below the abstract data
• We implemented the operators add-rat, mul-rat and equal-rat? using make-rat, denom, numer.
  • Without implementing make-rat, denom, numer!
  • Even without knowing how they will be implemented…
• We still need to define make-rat, denom, numer.
  • Therefore, we have to glue together numerator and denominator.
  • To achieve this, we create a Scheme structure for storing pairs:
    (define-struct xy (x y))

Representing rational numbers
• We can define the constructor and the selectors with the help of the xy structure.
    (define (make-rat n d) (make-xy n d))
    (define (numer r) (xy-x r))
    (define (denom r) (xy-y r))

Using operations on rational numbers
    (define one-third (make-rat 1 3))
    (define four-fifths (make-rat 4 5))
    (print-rat one-third)
    "1/3"
    (print-rat (mul-rat one-third four-fifths))
    "4/15"
    (print-rat (add-rat four-fifths four-fifths))
    "40/25"

Levels of abstraction
• Programs are built up as layers of language extensions.
• Each layer is a level of abstraction.
• Each abstraction hides some implementation details.
• There are four levels of abstraction in our rational numbers example.
[Diagram: layers from top to bottom; each group of procedures forms the barrier between the views named in parentheses]
    Programs that use rational numbers
      (rational numbers in the problem domain)
    add-rat mul-rat equal-rat? …
      (rational numbers as numerators and denominators)
    make-rat numer denom
      (rational numbers as structures)
    make-xy xy-x xy-y
      (whatever way structures are implemented)

Bottom level
• Level of pairs
• The procedures make-xy, xy-x and xy-y are already provided by the interpreter as a result of the structure definition.
• The actual implementation of structures is hidden.
[Diagram: make-xy, xy-x, xy-y between “Rational numbers as structures” (above) and “Whatever way structures are implemented” (below)]

Second level
• Level of rational numbers as data objects
• The procedures make-rat, numer and denom are defined at this level.
• The actual implementation of rational numbers is hidden at this level.
[Diagram: make-rat, numer, denom between “Rational numbers as numerators and denominators” (above) and “Rational numbers as structures” (below)]

Third level
• Level of service procedures on rational numbers
• The procedures add-rat, mul-rat, equal-rat?, etc. are defined at this level.
• The implementation of these procedures is hidden at this level.
[Diagram: add-rat, mul-rat, equal-rat?, … between “Rational numbers in the problem domain” (above) and “Rational numbers as numerators and denominators” (below)]

Top level
• Program level
• Rational numbers are used in calculations as if they were ordinary numbers.
[Diagram: “Programs that use rational numbers” above “Rational numbers in the problem domain”]

Abstraction barriers
• Each level is designed to hide implementation details from higher-level procedures.
• These levels act as abstraction barriers.

Advantages of data abstraction
• Programs can be designed one level of abstraction at a time.
  • In this way, data abstraction supports top-down design.
• We can gradually figure out the data representations and how to implement the constructors, selectors and service procedures that we need, one level at a time.
• We do not have to be aware of implementation details below the level at which we are programming.
• An implementation can be changed later without changing procedures written at higher levels.

Change data representation
• A few slides ago we saw:
    (print-rat (add-rat four-fifths four-fifths))
    "40/25"
• Our rational numbers are not always in reduced form.
• We decide that rational numbers should be represented in reduced form.
  • 40/25 and 8/5 are the same number.
• Thanks to data abstraction, our service procedures do not care in which form a number is represented.
  • Procedures such as add-rat and equal-rat? work correctly in either case.

Change data representation
• We can change the constructor...
    (define (make-rat n d)
      (make-xy (/ n (gcd n d))
               (/ d (gcd n d))))
• ...or the selectors.
    (define (numer x)
      (/ (xy-x x) (gcd (xy-x x) (xy-y x))))
    (define (denom x)
      (/ (xy-y x) (gcd (xy-x x) (xy-y x))))
gcd is a built-in procedure that produces the greatest common divisor!
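The effect of the reducing constructor can be checked with the earlier example. In this sketch, which assumes `#lang racket`, add-rat and print-rat are copied unchanged from the previous slides; only make-rat differs from the first representation.

```racket
#lang racket
(define-struct xy (x y))

;; Constructor that reduces the fraction to lowest terms.
(define (make-rat n d)
  (make-xy (/ n (gcd n d))
           (/ d (gcd n d))))
(define (numer r) (xy-x r))
(define (denom r) (xy-y r))

;; Service procedures, unchanged from the earlier slides.
(define (add-rat x y)
  (make-rat (+ (* (numer x) (denom y))
               (* (numer y) (denom x)))
            (* (denom x) (denom y))))
(define (print-rat x)
  (string-append (number->string (numer x))
                 "/"
                 (number->string (denom x))))

(define four-fifths (make-rat 4 5))
(print-rat (add-rat four-fifths four-fifths))   ; "8/5" instead of "40/25"
```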

Designing Procedures for Mixed Data
• Up to this point, our procedures have handled only one type of data
  • Numbers
  • Booleans
  • Symbols
  • Specific structure types
• But we often want procedures to operate on different types of data
• We will also learn how to protect procedures from incorrect use

Example
• We have (define-struct point (x y)) for points
• Many points are on the x-axis
• In this case we want to represent these points by just a number
  A = (make-point 6 6), B = (make-point 1 2), C = 1, D = 2, E = 3

Designing Procedures for Mixed Data
• To document our representation of points, we make the following informal data definition:
    ;; a pixel-2 is either
    ;; 1. a number
    ;; 2. a point-structure
• Now the contract, the description and the header of a procedure distance-to-0 are easy:
    ;; distance-to-0 : pixel-2 -> number
    ;; to compute the distance of a-pixel to the origin
    (define (distance-to-0 a-pixel) ...)
• How can we differentiate between the data types?
  • With the help of the predicates: number?, point? etc.

Designing Procedures for Mixed Data
• Base structure: procedure body with a cond-expression that analyzes the type of the input
    (define (distance-to-0 a-pixel)
      (cond
        [(number? a-pixel) ...]
        [(point? a-pixel) ...]))
• We know that in the second case the input is composed of two coordinates …
    (define (distance-to-0 a-pixel)
      (cond
        [(number? a-pixel) ...]
        [(point? a-pixel)
         ... (point-x a-pixel) ... (point-y a-pixel) ...]))

Designing Procedures for Mixed Data
Now it is easy to complete the function…
    (define (distance-to-0 a-pixel)
      (cond
        [(number? a-pixel) a-pixel]
        [(point? a-pixel)
         (sqrt (+ (sqr (point-x a-pixel))
                  (sqr (point-y a-pixel))))]))
• Built-in procedures:
  • (sqr x): the square of x
  • (sqrt x): the square root of x
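The finished procedure can be exercised with both kinds of input. The sketch below assumes `#lang racket`; the else clause is an addition that is not on the slide, showing one possible way to protect the procedure from incorrect use.

```racket
#lang racket
(define-struct point (x y))

;; distance-to-0 : pixel-2 -> number
;; to compute the distance of a-pixel to the origin
(define (distance-to-0 a-pixel)
  (cond
    [(number? a-pixel) a-pixel]
    [(point? a-pixel)
     (sqrt (+ (sqr (point-x a-pixel))
              (sqr (point-y a-pixel))))]
    ;; Assumption (not on the slide): reject anything that is not a pixel-2.
    [else (error 'distance-to-0 "expected a number or a point")]))

(distance-to-0 3)                  ; 3
(distance-to-0 (make-point 3 4))   ; 5
```

Calling (distance-to-0 'origin) now stops with an error message instead of falling through all cond clauses.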

Designing Procedures for Mixed Data
• Another example: graphical objects
  • Variants: squares, circles, …
  • Procedures: calculating the perimeter, drawing, …
    ;; A shape is either
    ;; a circle structure:
    ;;   (make-circle p s)
    ;; where p is a point describing the center
    ;; and s is a number describing the radius; or
    ;; a square structure:
    ;;   (make-square nw s)
    ;; where nw is the north-west corner point
    ;; and s is a number describing the side length.
    (define-struct circle (center radius))
    (define-struct square (nw length))
    ;; Examples: (make-circle (make-point 5 9) 87)
    ;;           (make-square (make-point 20 5) 5)

Designing Procedures for Mixed Data
• Compute the perimeter:
• Using our design recipe…
    ;; perimeter : shape -> number
    ;; to compute the perimeter of a-shape
    (define (perimeter a-shape)
      (cond
        [(square? a-shape) ... ]
        [(circle? a-shape) ... ]))
• Using our design recipe… (continued)
    ;; perimeter : shape -> number
    ;; to compute the perimeter of a-shape
    (define (perimeter a-shape)
      (cond
        [(square? a-shape)
         ... (square-nw a-shape) ... (square-length a-shape) ...]
        [(circle? a-shape)
         ... (circle-center a-shape) ... (circle-radius a-shape) ...]))

Designing Procedures for Mixed Data
• Compute the perimeter:
• Final result
    ;; perimeter : shape -> number
    ;; to compute the perimeter of a-shape
    (define (perimeter a-shape)
      (cond
        [(square? a-shape)
         (* (square-length a-shape) 4)]
        [(circle? a-shape)
         (* (* 2 (circle-radius a-shape)) pi)]))

Program design and heterogeneous data
• Data analysis becomes more important
  • Which classes of objects exist and what are their attributes?
  • Which meaningful groupings of classes are there?
    • So-called “subclass creation”
  • In general this yields a hierarchy of data definitions
• Templates
  • Step 1: a cond-expression that analyzes the types of data inside a group
  • Step 2: add the appropriate selectors to every branch
    • Alternative: call a procedure specific to the respective data type (e.g. perimeter-circle, perimeter-square)
• Program body
  • Combine the available information in every branch of the cond, depending on the purpose
  • Alternative: implement the procedures specific to each data type one by one using the normal design recipe
• For an overview of the new design process, see HtDP Fig. 18
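The alternative of calling type-specific procedures might look like this. It is a sketch under `#lang racket` that reuses the structure definitions from the shape slides; the bodies of perimeter-circle and perimeter-square are taken from the cond branches of the final perimeter procedure shown earlier.

```racket
#lang racket
(define-struct point (x y))
(define-struct circle (center radius))
(define-struct square (nw length))

;; perimeter-circle : circle -> number
(define (perimeter-circle a-circle)
  (* (* 2 (circle-radius a-circle)) pi))

;; perimeter-square : square -> number
(define (perimeter-square a-square)
  (* (square-length a-square) 4))

;; perimeter : shape -> number
;; dispatches on the type of a-shape and delegates to the matching procedure
(define (perimeter a-shape)
  (cond
    [(square? a-shape) (perimeter-square a-shape)]
    [(circle? a-shape) (perimeter-circle a-shape)]))

(perimeter (make-square (make-point 20 5) 5))   ; 20
(perimeter (make-circle (make-point 5 9) 87))   ; 174 * pi, an inexact number
```

Each type-specific procedure can then be designed and tested on its own with the normal design recipe.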

Get your data structures correct first, and the rest of the program will write itself. David Jones
