Distributed systems and security an introduction
1 / 16

Distributed Systems and Security: An Introduction - PowerPoint PPT Presentation

  • Uploaded on

Distributed Systems and Security: An Introduction. Brad Karp and Steve Hailes UCL Computer Science. CS Z03 / 4030 2 nd October, 2006. Today’s Lecture. Logistics Course Communication Overview of Distributed Systems What are they? Why build them? Why are they hard to build well?

I am the owner, or an agent authorized to act on behalf of the owner, of the copyrighted work described.
Download Presentation

PowerPoint Slideshow about ' Distributed Systems and Security: An Introduction' - norris

An Image/Link below is provided (as is) to download presentation

Download Policy: Content on the Website is provided to you AS IS for your information and personal use and may not be sold / licensed / shared on other websites without getting consent from its author.While downloading, if for some reason you are not able to download a presentation, the publisher may have deleted the file from their server.

- - - - - - - - - - - - - - - - - - - - - - - - - - E N D - - - - - - - - - - - - - - - - - - - - - - - - - -
Presentation Transcript
Distributed systems and security an introduction

Distributed Systems and Security:An Introduction

Brad Karp and Steve Hailes

UCL Computer Science

CS Z03 / 4030

2nd October, 2006

Today s lecture
Today’s Lecture

  • Logistics

  • Course Communication

  • Overview of Distributed Systems

    • What are they?

    • Why build them?

    • Why are they hard to build well?

  • Detailed Syllabus: Distributed Systems

  • Assessment regime

  • Questionnaire


  • Meeting Times/Locations

    • Monday: 9 AM – 10 AM , MPEB 1.20

    • Monday: 1 PM – 2 PM, MPEB 1.03

    • Tuesday: 1 PM – 2 PM, MPEB 1.20

  • Schedule

    • 2nd October – 31st October: Distributed Systems

    • 6th November – 10th November: Reading Week (no lecture)

    • 13th November – 12th December: Security

Course communication
Course Communication

  • Course web page

    • http://www.cs.ucl.ac.uk/staff/B.Karp/z03/

    • Detailed calendar: readings, lecture topics, coursework, announcements/corrections

    • Your responsibility: check page daily!

  • Course mailing lists

    • {gz03,4030}@<department’s domain>

    • Verify you’re a member using mlist

    • Used for course announcements

    • Your responsibility: check email daily!

What is a distributed system
What Is a Distributed System?

  • Multiple computers (“machines,” “hosts,” “boxes,” &c.)

    • Each with CPU, memory, disk, network interface

    • Interconnected by LAN or WAN (e.g., Internet)

  • Application runsacross this dispersed collection of networked hardware

  • Butuser sees single, unified system

What is a distributed system alternate take
What Is a Distributed System?(Alternate Take)

“A distributed system is a system in which I can’t do my work because some computer that I’ve never even heard of has failed.”

  • Leslie Lamport, Microsoft Research (ex DEC)

Start simple centralized system

What are shortcomings of this design?

Start Simple: Centralized System

  • Suppose you run Gmail

  • Workload:

    • Inbound email arrives; store on disk

    • Users retrieve, delete their email

  • You run Gmail on one server with disk








Server (PC)







Why distribute for availability
Why Distribute? For Availability

  • Suppose Gmail server goes down, or network between client and it goes down

  • No incoming mail delivered, no users can read their inboxes

  • Fix: replicate the data on several servers

    • Increased chance some server will be reachable

    • Consistency? One server down when delete message, then comes back up; message returns in inbox

    • Latency? Replicas should be far apart, so they fail independently

    • Partition resilience? e.g., airline seat database splits, one seat remains, bought twice, once in each half!

Why distribute for scalable capacity
Why Distribute?For Scalable Capacity

  • What if Gmail a huge success?

  • Workload exceeds capacity of one server

  • Fix: spread users across several servers

    • Best case: linear scaling—if U users per box, N boxes support NU users

    • Bottlenecks? If each user’s inbox on one server, how to route inbound mail to right server?

    • Scaling? How close to linear?

    • Load balance? Some users get more mail than others!

Performance can be subtle
Performance Can Be Subtle

  • Goal: predictable performance under high load

  • 2 employees run a Starbucks

    • Employee 1: takes orders from customers, calls them out to Employee 2

    • Employee 2:

      • writes down drink orders (5 seconds per order)

      • makes drinks (10 seconds per order)

  • What is throughput under increasing load?

Starbucks throughput

What would preferable curve be?

What design achieves that goal?

Starbucks Throughput

  • Peak system performance: 4 drinks / min

  • What happens when load > 4 orders / min?

  • What happens to efficiency as load increases?

Why are distributed systems hard to design
Why Are Distributed SystemsHard to Design?

  • Failure: of hosts, of network

    • Remember Lamport’s lament

  • Heterogeneity

    • Hosts may have different data representations

  • Need consistency (many specific definitions)

    • Users expect familiar “centralized” behavior

  • Need concurrency for performance

    • Avoid waiting synchronously, leaving resources idle

    • Overlap requests concurrently whenever possible


  • Before Internet:

    • Encryption and authentication using cryptography

    • Between parties known to each other (e.g., diplomatic wire)

  • Today:

    • Entire Internet of potential attackers

    • Legitimate correspondents often have no prior relationship

    • Online shopping: how do you know you gave credit card number to amazon.com? How does amazon.com know you are authorized credit card user?

    • Software download: backdoor in your new browser?

    • Crypto not enough alone to solve these problems!

Detailed syllabus distributed systems
Detailed Syllabus:Distributed Systems

  • No textbook

  • Readings: research papers on real, built distributed systems that illustrate concepts

    • You must read papers by day assigned!

    • Lectures will assume you have.

  • Lectures: (some) background for papers; review system from paper; discuss system

  • See schedule on course web page

How will you be evaluated
How Will You Be Evaluated?

  • 15% coursework

    • 2 courseworks, 7.5% each

    • At least one will involve significant programming

    • Z03 (DCNDS, SSE): Must score 40% mean on coursework

  • 85% final exam

    • 2.5 hours; rubric: 3 of 5 questions

    • 4030 (4th-years): must have 40% to pass

  • Overall:

    • 4030 (4th-years): must have 40% mean to pass

    • Z03 (DCNDS, SSE): must have 50% mean to pass


  • Please fill out completely and honestly!

  • Not assessed, kept confidential

  • Important to help me tailor lectures to class’ experience level