CS1101S
Lecture 11: Representation of Data Structures
19 September 2012
Martin Henz
Overview
- Past examples of data structures
- Process of data structure design
- More examples
- Generic data structures
Reading: SICP 2.4, 2.5
Past examples: State in Nim
function make_game_state(n, m) {
return pair(n, m);
}
function size_of_pile(game_state, p) {
if (p === 1) {
return head(game_state);
} else if (p === 2) {
return tail(game_state);
} else {
return "error";
}
}
Past examples: State in Nim
- Construction of state: make_game_state(n, m)
- Access of state: size_of_pile(game_state, p)
Past examples: State in Nim
Identified two problems
- After human error, computer moves
- We want to have an "undo" function
Problems and solution
- At the moment, game engine play_with_turn fixes the next
player
- At the moment, we do not remember the previous state
- Solution: make next player and previous move part of the game state
Assessment of Nim example
- Changes were data-intensive
- Overall logic remained (appropriately) intact
- Changes remained localized to data processing
remove_coins_from_pile
Second example: Classes: Constructors
function make_class(number,units) {
return list(number,units);
};
function make_units(lecture,tutorial,lab,
homework,prep) {
return list(lecture,tutorial,
lab,homework,prep);
};
Second example: Classes: Accessors
var get_class_number = head;
function get_class_units(cl) {
return head(tail(cl));
}
var get_units_lecture = head;
function get_units_tutorial(units) {
return head(tail(units));
}
function get_units_lab(units) {
return head(tail(tail(units)));
}
Second example: Classes: Predicate
function is_same_class(c1,c2) {
return get_class_number(c1) ===
get_class_number(c2);
}
Third example: rational numbers
What is a rational number?
A pair, consisting of a denominator and a numerator
// constructor
function make_rat(n, d) {
return pair(n, d);
}
// accessors
function numer(x) {
return first(x);
}
function denom(x) {
return second(x);
}
Constructor:
Addition of Rational Numbers
function add_rat(x, y) {
return make_rat(numer(x) * denom(y) +
numer(y) * denom(x),
denom(x) * denom(y));
}
Predicate: Equality (second attempt)
function equal_rat(x, y) {
return numer(x) * denom(y) ===
numer(y) * denom(x);
}
Printing
function rat_to_string(x) {
return numer(x) + " / " + denom(x);
}
Recall implementation choices
- Simplify in make_rat constructor
- Simplify in operation constructor
- Simplify in printer
Overview
- Past examples of data structures
- Process of data structure design
- More examples
- Generic data structures
Reading: SICP 2.4, 2.5
Process of designing data structures
- Specification: describes what Implementation: describes how
Degrees of freedom in implementation
- Choice of programming language
- Choice of data structures
- Choice of abstractions
Common patterns in data structure design
- Constructors: make_game_state(n, m)
- Accessors: get_class_units(cl)
- Predicates: is_same_class(c1,c2)
- Printers: rat_to_string(x)
Overview
- Past examples of data structures
- Process of data structure design
- More examples
- Generic data structures
Reading: SICP 2.4, 2.5
Pairs: Specification
Ordered collection of two items
- Constructor: pair
- Accessors: head, tail
- Predicates: is_pair(p)
- Printers: Try to avoid them
Pairs: Implementation
- Many alternatives
- Discussions for CS1101S between tutors and lecturer
- Decisive was printing in Web Console, for didactic
reasons
List: Specification
A list is either empty, or a pair whose
head is a data item and whose tail is a list.
- Constructors: pair, list, []
- Accessors: head, tail
- Predicates: is_pair(xs), is_empty_list(xs)
- Printers: Try to avoid them
List: Implementation
Obvious: [] plus pairs
plus "list discipline"
Tree: Specification
Specification:
A tree is a list, whose elements are data items
or trees.
Definition of list:
A list is either empty, or a pair whose
head is a data item and whose
tail is a list.
Observation:
Every list is a tree, and every tree is a list
Data types
List of numbers:
A list of numbers is either empty, or a pair whose
head is a number and whose
tail is a list of numbers.
Trees of numbers:
A tree of numbers is a list, whose
elements are numbers or trees of numbers.
Observation:
Every list of numbers is a tree of numbers, but
some trees of numbers are not lists of numbers
Data types
List of x:
A list of x is either empty, or a pair whose
head is of type x and whose
tail is a list of x.
Trees of x:
A tree of x is a list, whose
elements are of type x or trees of x.
Many programming languages provide extensive support
for types: Algol, Pascal, SML, Java, C++
Data types in JavaScript
JavaScript is dynamically typed.
Before we execute a program, we do not check
that data structures are used correctly
This is a deficit: Type checking is very useful
especially when writing large programs
"Poor-man's typing": Functions such as
is_empty_list, head, and tail
cause "error" when type is incorrect
Sets: Specification
A set is an unordered collection of objects.
- Constructors: make_empty(), add(s,e), remove(s,e)
- Predicates: is_element(s,e)
- Contract:
- is_element(add(s,e), e) returns true
- is_element(remove(s,e), e) returns false
Set: Implementation (version 1)
Idea: represent sets by unordered lists
- Consequence for add(s,e)?
- Consequence for remove(s,e)?
- Consequence for is_element(s,e)?
Set: Implementation (version 2)
Idea: represent sets by ordered lists
- Consequence for add(s,e)?
- Consequence for remove(s,e)?
- Consequence for is_element(s,e)?
Set: Implementation (version 3)
Idea: represent sets by binary trees
(each list has three elements: left tree, number,
and right tree)
Invariant: Every number in the left tree is
smaller than the number, and every number in the right
tree is larger than the number
- Consequence for add(s,e)?
- Consequence for remove(s,e)?
- Consequence for is_element(s,e)?
Set: Summary
Several reasonable choices
- Some operations are fast in one implemenation
and slow in another one
- Often there is a trade-off
- Best representation often depends on
which operations are most frequent.
Overview
- Past examples of data structures
- Process of data structure design
- More examples
- Generic data structures
Reading: SICP 2.4, 2.5
Motivation
- Best representation often depends on
which operations are most frequent.
- In some part of a program, one operation dominates,
in another part, another operation dominates
- Key idea: Alternative implementations
should be able to co-exist
Example: Adding complex numbers
Real-part(z1 + z2)
= Real-part(z1) + Real-part(z2)
Imag-part(z1 + z2)
= Imag-part(z1) + Imag-part(z2)
Example: Multiplying complex numbers
Magnitude(z1 * z2)
= Magnitude(z1) * Magnitude(z2)
Angle(z1 * z2)
= Angle(z1) + Angle(z2)
Specification: Accessors for complex numbers
- real_part(z)
- imag_part(z)
- magnitude(z)
- angle(z)
Generic implementation of operations
function add_complex(z1,z2) {
return make_from_real_imag(real_part(z1) +
real-part(z2),
imag_part(z1) +
imag_part(z2));
}
function mul_complex(z1,z2) {
return make_from_mag_ang(magnitude(z1) *
magnitude(z2),
angle(z1) +
angle(z2));
}
Ben's Implementation: Cartesian
function real_part(z) {
return head(z);
}
function imag_part(z) {
return tail(z);
}
function magnitude(z) {
return sqrt(square(real_part(z)) +
square(imag_part(z)));
}
function angle(z) {
return atan(imag_part(z),real_part(z));
}
Ben's Implementation: Cartesian
function make_from_real_imag(x,y) {
return pair(x,y);
}
function make_from_mag_ang(r,a) {
return pair(r * cos(a), r * sin(a));
}
Alyssa's Implementation: Polar
function real_part(z) {
return magnitude(z) * cos(angle(z));
}
function imag_part(z) {
return magnitude(z) * sin(angle(z));
}
function magnitude(z) {
return head(z);
}
function angle(z) {
return tail(z);
}
Alyssa's Implementation: Polar
function make_from_real_imag(x,y) {
return pair(sqrt(square(x) + square(y)),
atan(y,x));
}
function make_from_mag_ang(r,a) {
return pair(r,a);
}
Principle of least commitment
Whenever possible, delay decisions until you have enough
information, or until the choice becomes unavoidable
Generic implementation of operations follows principle
of least commitment
function add_complex(z1,z2) {
return make_from_real_imag(real_part(z1) +
real-part(z2),
imag_part(z1) +
imag_part(z2));
}
Can we take the principle further?
- Can we delay commitment for actual representation?
- Both implementations use pairs. How do we
distinguish between Cartesian and polar representation?
First technique: tagged data
Idea: Use a "tag" that uniquely identifies the
type of representation
First technique: tagged data
function attach_tag(type_tag,contents) {
return pair(type_tag,contents);
}
function type_tag(datum) {
if (is_pair(datum)) {
return head(datum);
} else {
return error("bad tag",datum);
}
}
function contents(datum) {
if (is_pair(datum)) {
return tail(datum);
} else {
return error("bad tag",datum);
}
}
Ben's implementation: using tagged data
function make_from_real_imag_rectangular(x, y) {
return attach_tag("rectangular",
pair(x, y));
}
function make_from_mag_ang_rectangular(r, a) {
return attach_tag("rectangular",
pair(r * cos(a),
r * sin(a)));
}
Alyssa's implementation: using tagged data
function make_from_real_imag_polar(x,y) {
return attach_tag("polar",
pair(sqrt(square(x) +
square(y)),
atan(y,z)));
}
function make_from_mag_ang_polar(r,a) {
return attach_tag("polar",
pair(r,a));
}
Tag allows to distinguish representation
function is_rectangular(z) {
return type_tag(z) === "rectangular";
}
function is_polar(z) {
return type_tag(z) === "polar";
}
Accessors need to dispatch on tag
function real_part(z) {
if (is_rectangular(z))
return real_part_rectangular(contents(z));
else if (is_polar(z))
return real_part_polar(contents(z));
else
return error("Unknown type -- real_part",z);
}
How to avoid dispatching on tags
- Tables of generic operations
- Object-oriented programming (Lecture 16)
Summary
- Specification vs implementation
- Types
- Implementation trade-off
- Principle of least commitment
- Tagged representation (not ideal)
←
→
/
#