The Internet

Summer Term 2023
Dr. Jens Lechtenbörger (License Information)

1. Introduction

1.1. Today’s Core Questions

  • What is the Internet?
  • How to provide global connectivity in view of heterogeneous network technologies, diverse devices, and novel (and forthcoming) applications?
  • How to cope with complexity?

1.2. Learning Objectives

  • Explain and contrast Internet and OSI architectures
  • Explain layers in Internet architecture
    • Roles and interplay for communication
    • Basic properties of IP, UDP, TCP
  • Explain forwarding of Internet messages based on (IP and MAC) addresses and demux keys
    • Use Wireshark for basic network diagnosis
      • What DNS server is in use? Does it reply? Do response messages for TCP/IP arrive? What next-hop router is in use?
  • Explain end-to-end argument

1.3. Previously on CACS …

1.3.1. Communication and Collaboration

  • Communication frequently takes place via the Internet
    • Telephony
    • Instant messaging
    • E-Mail
    • Social networks
  • Collaboration frequently supported by tools using Internet technologies
    • All of the above means for communication
    • ERP, CRM, e-learning systems
    • File sharing: Sciebo, Etherpad, etc.
    • Programming (which subsumes file sharing): Git, Subversion, etc.
  • All of the above are instances of DSs

1.3.2. Ubiquity of DSs Internet

2. Basics

2.1. (Computer) Networks

[PD11]: A network can be defined recursively as

  • two or more nodes/devices/hosts connected by a link
    • (e.g., copper, fiber, or nothing in the wireless case)

  • or two or more networks connected by one or more nodes (with necessary links)
    • (e.g., gateway, router)
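
The recursive definition maps naturally onto a recursive data structure. The following is a minimal sketch, not part of [PD11]; the class names Host and Network, the helper count_hosts, and all device names are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class Host:
    name: str


@dataclass
class Network:
    # Base case: members are hosts joined by a link.
    # Recursive case: members are networks joined by a connecting node.
    members: List[Union[Host, "Network"]]
    connector: str = "link"  # hypothetical label, e.g. "link" or "router"


def count_hosts(net: Network) -> int:
    """Count hosts by following the recursive definition."""
    total = 0
    for member in net.members:
        if isinstance(member, Host):
            total += 1
        else:
            total += count_hosts(member)
    return total


# Two small networks, then a network of networks connected by a router.
home = Network([Host("laptop"), Host("phone")])
campus = Network([Host("pc1"), Host("pc2"), Host("server")])
network_of_networks = Network([home, campus], connector="router")

print(count_hosts(network_of_networks))  # 5
```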

2.1.1. On Routers

  • Previous slide mentions routers as nodes that connect networks
    • One example: router at home that connects home network to ISP’s network
    • Other example: router that connects “large” networks at backbone of Internet
      • Independently managed networks are called autonomous systems
      • Routers exchange information about reachable networks with protocols such as BGP
        • Usually, multiple paths between networks (and nodes) exist (alternatives may allow routers to “route around” link and router failures)
        • Routers choose paths based on local policies (e.g., distance, cost)
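
To illustrate how alternative paths allow routing around a failure, here is a small sketch. It is not BGP: it uses a plain breadth-first search that minimizes hop count, whereas real routers apply local policies. The function find_path and the autonomous system names AS1 to AS5 are hypothetical.

```python
from collections import deque


def find_path(links, start, goal):
    """Return a path with the fewest hops from start to goal, or None."""
    neighbours = {}
    for a, b in links:  # links are bidirectional
        neighbours.setdefault(a, []).append(b)
        neighbours.setdefault(b, []).append(a)
    queue = deque([[start]])
    visited = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in neighbours.get(path[-1], []):
            if nxt not in visited:
                visited.add(nxt)
                queue.append(path + [nxt])
    return None


# Hypothetical topology with two alternative paths from AS1 to AS4.
links = [("AS1", "AS2"), ("AS2", "AS4"),
         ("AS1", "AS3"), ("AS3", "AS5"), ("AS5", "AS4")]
print(find_path(links, "AS1", "AS4"))  # ['AS1', 'AS2', 'AS4']

# The link between AS2 and AS4 fails: the longer alternative is used.
remaining = [link for link in links if link != ("AS2", "AS4")]
print(find_path(remaining, "AS1", "AS4"))  # ['AS1', 'AS3', 'AS5', 'AS4']
```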

2.2. Internet vs Web

  • The Internet is a network of networks
    • Connectivity for heterogeneous devices
    • Various protocols, some details on later slide
      • IPv4 and IPv6 to send messages between devices on the Internet
      • TCP and UDP to send messages between processes on Internet devices
        • (E.g., process of Web browser talks with remote process of Web server)
        • TCP: Reliable full-duplex byte streams
        • UDP: Unreliable message transfer
  • The Web is an application using the Internet
    • Clients and servers talking HTTP over TCP/IP
      • E.g., GET requests asking for HTML pages (separate presentation)
      • Web servers provide resources to Web clients (browsers, apps)
  • Internet and Web both are DSs and contain DSs

2.3. Heterogeneity

  • Internet is network of networks
  • Potentially each network with
    • independent administrative control
    • different applications and protocols
    • different performance and security requirements
    • different technologies (fiber, copper, wired, wireless)
    • different hardware and operating systems
  • How to overcome heterogeneity?

3. Layering and Protocols

3.1. Layering

General technique in Software Engineering and Information Systems

  • Use abstractions to hide complexity
    • Abstractions naturally lead to layering
    • Alternative abstractions at each layer
      • Abstractions specified by standards/protocols/APIs
  • Thus, problem at hand is decomposed into manageable components
    • Design becomes (more) modular

3.2. Network Models/Architectures

  • Models frequently have different layers of abstraction
    • Goal of layering: Reduce complexity
      • Each layer offers services to higher layers
        • Semantics: What does the layer do?
      • Layer interface defines how to access its services from higher layers
        • Parameters and results
        • Implementation details are hidden
        • (Think of class with interface describing method signatures while code is hidden)
  • Peer entities, located at same layer on different machines, communicate with each other
    • Protocols describe rules and conventions of communication
      • E.g., message formats, sequencing of events
  • Network architecture = set of layers and protocols

(Based on: [Tan02])
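
The comparison with a class whose interface is visible while its code is hidden can be made concrete. In this sketch, which is an assumption for illustration and not taken from [Tan02], the higher layer only knows the send method of the layer below. The names LowerLayer, LoopbackLayer, and GreetingApp are hypothetical.

```python
from abc import ABC, abstractmethod


class LowerLayer(ABC):
    """Layer interface: parameters and results are known, the code is not."""

    @abstractmethod
    def send(self, data: bytes) -> None:
        ...


class LoopbackLayer(LowerLayer):
    """One possible implementation; it merely records what it was given."""

    def __init__(self) -> None:
        self.delivered = []

    def send(self, data: bytes) -> None:
        self.delivered.append(data)


class GreetingApp:
    """Higher layer: uses the service without knowing how it is implemented."""

    def __init__(self, lower: LowerLayer) -> None:
        self.lower = lower

    def greet(self, name: str) -> None:
        self.lower.send(f"HELLO {name}".encode("ascii"))


layer = LoopbackLayer()
GreetingApp(layer).greet("peer")
print(layer.delivered)  # [b'HELLO peer']
```

Any other class that implements send could replace LoopbackLayer without changes to GreetingApp, which is the modularity that layering aims for.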

3.3. Protocol Layers

  • Each protocol instance talks virtually to its peer

    “Layered Communication in OSI Model” by Runtux under Public domain; from Wikimedia Commons

    • E.g., HTTP GET request from Web browser to Web server
  • Each layer communicates only by using the one below
    • E.g., Web browser asks lower layer to transmit GET request to Web server
    • Lower layer service accessed by an interface
  • At bottom, messages are carried by the medium

(Based on: [Tan02])

3.4. Famous Models/Architectures

  • ISO OSI Reference Model
    • Mostly a model, describes what each layer should do
      • But no specification of services and protocols (thus, no real architecture)
    • Predates real systems/networks
  • TCP/IP Reference Model
    • Originally, no clear distinction between services, interfaces, and protocols
      • Instead, focus on protocols
    • Model à la OSI as afterthought

(Based on: [Tan02])

4. Internet and OSI Models

4.1. OSI Reference Model

  • International standard
    • Seven layer model to connect different systems
      • Media layers
        1. Physical layer: Sends bits as signals
        2. Data link layer: Sends frames of information
        3. Network layer: Sends packets from source host over multiple links to destination host
      • Host layers
        4. Transport layer: Provides end-to-end delivery
        5. Session layer: Manages task dialogues
        6. Presentation layer: Converts different representations
        7. Application layer: Provides functions needed by users/applications

“OSI Model” by Offnfopt under CC0 1.0; from Wikimedia Commons

4.1.1. Drawing for OSI Model

Networking layers

Figure © 2016 Julia Evans, all rights reserved; from julia's drawings. Displayed here with personal permission.

4.1.2. Where are Top and Bottom?

  • In layered architectures, lower layers represent more technical details while higher layers abstract away details
    • E.g., in the OSI model the top layer (7) is the application layer, which does not care about technical communication details
  • The previous drawing does not follow that convention when showing layers, but implicitly assumes it anyway (layer 3 “ignores layers 4 and above”)

4.2. OSI Model on Internet

  • Internet architecture involves the following subset of OSI layers

    • Application layer
      • E.g., Web (HTTP), e-mail (SMTP), naming (DNS)
      • (Presentation and session omitted; part of application protocols)
    • Transport layer
    • Network layer
      • Unifying standard: Internet Protocol (IP; v4, v6)
      • Everything over IP, IP over everything
    • Data link layer
      • E.g., Ethernet, WiFi, cellular phone network, satellite link

4.3. Internet Standards

4.3.1. Internet Architecture

  • “Hourglass design”

    Internet Architecture with narrow waist

  • IP is focal point
    • “Narrow waist”
    • Application independent!
      • Everything over IP
    • Network independent!
      • IP over everything
    • No security
      • “IP datagrams are like postcards, written with erasable pencils”
        • Usual security protocol is TLS for encryption and integrity protection, located between application and TCP

4.3.2. IP, UDP, and TCP

  • IP (Internet protocol)
    • Offers best-effort host-to-host connectivity
      • Best effort: Try once, no effort to recover from transmission errors
      • Connection-less delivery of datagrams
  • Transport layer alternatives
    • UDP (User Datagram Protocol)
      • Extends IP towards best-effort application-to-application connectivity
        • Ports identify applications/processes (e.g., 53 for DNS)
        • Connection-less
    • TCP (Transmission Control Protocol)
      • Offers reliable application-to-application connectivity
        • Ports identify applications/processes (e.g., 80/443 for Web servers)
        • Full-duplex byte stream
        • Three-way handshake to establish connection
        • Acknowledgements and timeouts for retransmissions
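
The interplay of acknowledgements, timeouts, and retransmissions can be sketched with a toy simulation. This is a stop-and-wait scheme, which is much simpler than real TCP (no byte sequence numbers, windows, or congestion control). The function send_reliably and its parameter drop_attempts are hypothetical names.

```python
def send_reliably(segments, drop_attempts, max_tries=5):
    """Deliver segments in order over a channel that loses some transmissions.

    drop_attempts is a set of transmission numbers (counted from 0 over the
    whole transfer) that the simulated channel loses.
    Returns the received (sequence number, payload) pairs and the number of
    transmissions used.
    """
    received = []
    attempt = 0
    for seq, payload in enumerate(segments):
        for _ in range(max_tries):
            lost = attempt in drop_attempts
            attempt += 1
            if not lost:
                # Receiver gets the segment and acknowledges it.
                received.append((seq, payload))
                break
            # No acknowledgement arrives, the timeout fires: retransmit.
        else:
            raise TimeoutError(f"segment {seq} was never acknowledged")
    return received, attempt


received, attempts = send_reliably([b"GET ", b"/ ", b"HTTP/1.1"],
                                   drop_attempts={1, 2})
print(b"".join(payload for _, payload in received))  # b'GET / HTTP/1.1'
print(attempts)  # 5 transmissions for 3 segments
```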

4.3.3. Drawing on TCP

TCP basics!

Figure © 2016 Julia Evans, all rights reserved; from julia's drawings. Displayed here with personal permission.

5. Internet Communication

5.1. IP Stack Connections

“IP stack connections” by Jens Lechtenbörger under CC BY-SA 4.0; based on work under CC BY-SA 3.0 by en:User:Kbrose and en:User:Cburnett by changing arrow labels; from GitLab

5.1.1. Drawing on MAC Addresses

What's a MAC address?

Figure © 2016 Julia Evans, all rights reserved; from julia's drawings. Displayed here with personal permission.

5.1.2. Drawing of Packet

Anatomy of a packet

Figure © 2016 Julia Evans, all rights reserved; from julia's drawings. Displayed here with personal permission.

5.1.3. Typical Communication Steps (0/2)

  • Prerequisites

    • Internet communication requires numeric IP addresses
      • Lookup of IP addresses for human readable names via DNS
        • DNS is request-reply protocol
        • DNS client (e.g., the browser) asks DNS server for IP address of name, e.g., query for www.wwu.de may result in 128.176.6.250
        • (And more)
    • LAN communication requires MAC addresses
      • MAC (media access control) address: Hardware address of network card, e.g., for Ethernet, WiFi
        • Typical format with hexadecimal digits: 02:42:fa:5c:4a:4a
      • Lookup of MAC addresses for IP addresses via ARP (Address Resolution Protocol)
        • Send ARP request into local network: “If you have IP address x, what is your MAC address?”
        • ARP request is a broadcast: Sent to every device in LAN
        • Device that has IP address x replies with its MAC address
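
The ARP lookup can be pictured as a question that every device in the LAN sees but only one answers. The sketch below simulates this with a dictionary and sends no real network traffic. The function names arp_request and mac_to_bytes as well as the IP addresses and the second and third MAC address are hypothetical.

```python
# Hypothetical LAN: MAC address of each device mapped to its IP address.
lan = {
    "02:42:fa:5c:4a:4a": "192.168.178.20",
    "02:42:fa:5c:4a:4b": "192.168.178.21",
    "02:42:fa:5c:4a:4c": "192.168.178.1",
}


def arp_request(lan, wanted_ip):
    """Simulate the broadcast: every device sees it, only the owner replies."""
    replies = [mac for mac, ip in lan.items() if ip == wanted_ip]
    return replies[0] if replies else None


def mac_to_bytes(mac):
    """Convert the usual hexadecimal notation into the 6 bytes sent on the wire."""
    return bytes(int(part, 16) for part in mac.split(":"))


print(arp_request(lan, "192.168.178.1"))  # 02:42:fa:5c:4a:4c
print(arp_request(lan, "192.168.178.99"))  # None, nobody has that address
print(mac_to_bytes("02:42:fa:5c:4a:4a").hex())  # 0242fa5c4a4a
```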

5.1.4. Typical Communication Steps (1/2)

  • Ex.: Send HTTP message M to host www.wwu.de
    1. Perform DNS lookup for www.wwu.de
      • Returns IP address 128.176.6.250
    2. Encapsulate M by adding TCP header
      • Source and destination TCP ports: Numbers that identify processes
        • Typically, destination port 80 for Web servers with HTTP (443 for HTTPS)
        • Random source ports for Web browsers
    3. Encapsulate TCP segment by adding IP header
      • Source and destination IP addresses
      • Demux key to indicate that TCP segment is contained

5.1.5. Typical Communication Steps (2/2)

  • Ex.: Send HTTP message M to host www.wwu.de

    1. Perform DNS lookup for www.wwu.de
    2. Encapsulate with TCP header
    3. Encapsulate with IP header
    4. Routing decision to determine IP address of next-hop router
      • Returns IP address IP_R within sender’s network
      • E.g., 128.176.158.1 at my work, 192.168.178.1 at home
    5. ARP lookup to determine MAC address for IP_R
      • E.g., 0:0:c:7:ac:0
    6. Encapsulate IP datagram with LAN-specific header with MAC address, send via LAN to router
    • Routers repeat steps (4) - (6) to forward M to final destination
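
The routing decision at the sender can be sketched with Python's ipaddress module: a destination inside the sender's own network is reached directly, everything else goes to the next-hop router. The sender address 128.176.158.42 and the prefix length /24 are assumptions made for this example, as is the function name next_hop.

```python
import ipaddress


def next_hop(local_interface, default_gateway, destination):
    """Return the IP address to which the LAN frame must be delivered."""
    iface = ipaddress.ip_interface(local_interface)
    dest = ipaddress.ip_address(destination)
    if dest in iface.network:
        return dest  # same network: deliver directly to the destination
    return ipaddress.ip_address(default_gateway)  # otherwise: via the router


# Assumed sender configuration; router and destination as in the example.
hop = next_hop("128.176.158.42/24", "128.176.158.1", "128.176.6.250")
print(hop)  # 128.176.158.1

# A destination within the sender's own network needs no router.
print(next_hop("128.176.158.42/24", "128.176.158.1", "128.176.158.7"))
# 128.176.158.7
```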

5.2. Encapsulation

Sample encapsulation of GET request

5.3. Encapsulation and Demux Keys

  • Encapsulation
    • Protocol specific header added for each layer
      • Starting from “pure” application message
      • Headers prepended when moving down the protocol stack
    • Headers “unwrapped” when moving up again
  • Demux key
    • Identifies recipient protocol at next higher layer
    • Different protocols use different forms of demux keys (see previous slide)
      • Ethernet header contains type field (IPv4 = 0x0800, ARP = 0x0806)
      • IP header contains protocol field (TCP = 6, UDP = 17)
      • TCP header contains port (application id) as demux key
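
Encapsulation and demultiplexing can be sketched with Python's struct module. The headers below are drastically shortened toy versions: real IP and TCP headers have many more fields and are at least 20 bytes each. Only the demux values (0x0800, 6, port 80) come from the list above; the function names, the sender's addresses, and the source port are hypothetical.

```python
import ipaddress
import struct

ETHERTYPE_IPV4 = 0x0800  # Ethernet type field value for IPv4
IPPROTO_TCP = 6          # IP protocol field value for TCP


def encapsulate(message, src_port, dst_port, src_ip, dst_ip, src_mac, dst_mac):
    """Prepend one (toy) header per layer while moving down the stack."""
    segment = struct.pack("!HH", src_port, dst_port) + message
    datagram = struct.pack("!B4s4s", IPPROTO_TCP, src_ip, dst_ip) + segment
    return struct.pack("!6s6sH", dst_mac, src_mac, ETHERTYPE_IPV4) + datagram


def decapsulate(frame):
    """Unwrap headers while moving up, checking the demux key at each layer."""
    _dst_mac, _src_mac, ethertype = struct.unpack("!6s6sH", frame[:14])
    if ethertype != ETHERTYPE_IPV4:
        raise ValueError("payload is not IPv4")
    protocol, _src_ip, _dst_ip = struct.unpack("!B4s4s", frame[14:23])
    if protocol != IPPROTO_TCP:
        raise ValueError("payload is not TCP")
    _src_port, dst_port = struct.unpack("!HH", frame[23:27])
    return dst_port, frame[27:]  # port selects the receiving application


request = b"GET / HTTP/1.1\r\n\r\n"
frame = encapsulate(
    request, 49152, 80,
    ipaddress.ip_address("128.176.158.42").packed,  # assumed sender address
    ipaddress.ip_address("128.176.6.250").packed,
    bytes.fromhex("0242fa5c4a4a"),
    bytes.fromhex("00000c07ac00"),
)
port, payload = decapsulate(frame)
print(port, payload == request)  # 80 True
```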

5.4. Review Questions

5.5. Wireshark Demo

  • Wireshark is a network protocol analyzer
    • For live or recorded traffic
    • Wireshark Demo (including 8-minute video) to get you started
  • Use of Wireshark improves understanding of networking and device communication

6. End-to-End Argument

6.1. Network: Core, Edge, Endpoint

  • Network core: Devices implementing the network
    • Routers, switches
  • Network edge: Devices using the network
    • Computers, “smart” devices, IoT devices
  • Endpoints of communication: Distributed applications
    • Processes that send and receive messages
      • E.g., your e-mail client, your Web browser, your messenger
      • Beware: Who is the other end for your browser? Who for your mail client and messenger?

6.2. Overarching Question

  • What functionality to implement in the network core, what within communication endpoints?
    • Observations
      • If functionality is available as Internet standard, every application can immediately use it. No need to reinvent wheels.
      • Simplicity and generality of protocols increase potential for re-use, e.g., IP allows connecting “everything.”
    • Answer to question given in [SRC84]: End-to-end argument
    • Intuition
      • Some functionality needs application knowledge
      • Such functionality cannot be implemented inside the net
      • In general, application functionality should not be implemented in the net

6.3. End-to-End Definition

  • Quotes from [SRC84]
    • “The principle, called the end-to-end argument, suggests that functions placed at low levels of a system may be redundant or of little value when compared with the cost of providing them at that low level.”
    • “The function in question can completely and correctly be implemented only with the knowledge and help of the application standing at the end points of the communication system. Therefore, providing that questioned function as a feature of the communication system itself is not possible. (Sometimes an incomplete version of the function provided by the communication system may be useful as a performance enhancement.)”

6.4. End-to-End Example

  • Careful file transfer
    • Read file from disk, transfer over Internet, write to disk at remote end
    • Possible errors, leading to corrupted data
      • Disk error
      • Software errors in file system, file transfer, network protocol, buffering or copying
      • Hardware errors (e.g., processor or memory failures)
      • Network failures/attacks (messages lost or bits changed)
      • Crash in the middle of the transfer
    • Possible solutions
      • Lots of “small” tests
      • One end-to-end checksum check, with retry in case of errors
  • How many “small” tests will be necessary?
    • Notice: A test regarding network transfer does not help much since all other types of errors can still corrupt data
      • Hence, an end-to-end check will be necessary anyway
    • However, from a performance perspective, a single end-to-end check may be costly
      • Consider transfer of some GB, which may take a long time
        • The end-to-end check detects individual errors only after full transfer
        • In contrast, intermediate checks may identify individual bit errors early, allowing partial retries
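
The end-to-end check with retry can be sketched as follows. This is a simplification: both ends run in one process, whereas a real transfer would have to convey the checksum to the remote end and compare it with the data read back from the remote disk. SHA-256 is one possible choice of checksum. The names careful_transfer and FlakyChannel are hypothetical.

```python
import hashlib


def careful_transfer(data, transfer, max_retries=3):
    """Repeat the transfer until the end-to-end checksum matches."""
    expected = hashlib.sha256(data).hexdigest()
    for attempt in range(1, max_retries + 1):
        received = transfer(data)
        if hashlib.sha256(received).hexdigest() == expected:
            return received, attempt
    raise IOError("transfer failed the end-to-end check on every attempt")


class FlakyChannel:
    """Simulated transfer that flips one bit during the first few attempts."""

    def __init__(self, failures):
        self.failures = failures

    def __call__(self, data):
        if self.failures > 0:
            self.failures -= 1
            return bytes([data[0] ^ 0x01]) + data[1:]  # corrupt first byte
        return data


received, attempts = careful_transfer(b"file contents", FlakyChannel(failures=2))
print(received, attempts)  # b'file contents' 3
```

The check does not care where the corruption happened (disk, memory, network), which is why it covers all error types listed above. The price is that each retry repeats the full transfer.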

6.5. End-to-End Security

6.5.1. Hybrid End-to-End Encryption

“End-to-End Encryption (Hybrid)” by Noah Lücke, Moritz van den Berg, Anton Levkau, Nick Vrban and Jannes Werk under CC BY-SA 4.0; converted from GitLab

6.6. Then vs Now

  • [BC01]: Rethinking the Design of the Internet
    • Challenges since 1980s
      • Untrustworthy world, e.g., attacks, spam, DDoS
        • Need more mechanism in the core to enforce “good” behavior?
      • More demanding applications, e.g., video streaming
        • Best effort model may not be good enough, need intermediate storage sites for streaming?
      • ISP service differentiation
        • Different pieces of content provided with different QoS guarantees?
      • Rise of third-party involvement
        • Officials of organizations or governments interpose themselves
      • Less sophisticated users
        • From initial experts to Joe Sixpack, who may be overwhelmed by complexity in endpoints
    • RFC 3724, 2004: End-to-end is still relevant, though
      • End-to-end manages state at the edges, not the core
        • Failures in core do not affect application state
      • Protection of innovation, reliability, trust

6.7. Review Questions

  • Think of your favorite messenger application. Do you know how messages are transferred? Is communication hop-by-hop or end-to-end? Does it implement end-to-end security (details are not important for your response here—maybe provide a pointer to verifiable source)? Does security of communication benefit from WPA?

6.8. Concluding Questions

  • What did you find difficult or confusing about the contents of the presentation? Please be as specific as possible. For example, you could describe your current understanding (which might allow us to identify misunderstandings), ask questions in a Learnweb forum that allow us to help you, or suggest improvements (maybe on GitLab). Most questions turn out to be of general interest; please do not hesitate to ask and answer in the forum. If you created additional original content that might help others (e.g., a new exercise, an experiment, explanations concerning relationships with different courses, …), please share.

7. Conclusions

7.1. Summary

  • Computer networks are general purpose networks
    • The Internet forms the backbone for modern communication and collaboration
  • Complexity reduced via layered architecture
    • Modular design
    • Internet vs OSI architecture
    • Encapsulation and demux keys

Bibliography

License Information

This document is part of an OER collection to teach basics of distributed systems. Source code and source files are available on GitLab under free licenses.

Except where otherwise noted, the work “The Internet”, © 2018-2023 Jens Lechtenbörger, is published under the Creative Commons license CC BY-SA 4.0.

No warranties are given. The license may not give you all of the permissions necessary for your intended use.

In particular, trademark rights are not licensed under this license. Thus, rights concerning third party logos (e.g., on the title slide) and other (trade-) marks (e.g., “Creative Commons” itself) remain with their respective holders.