Cloud Computing

IT Systems, Summer Term 2024
Dr. Jens Lechtenbörger (License Information)

1. Introduction

1.1. Core Questions

  • What is a distributed system?
  • What do “cloud computing” and “serverless computing” mean?
    • How do they help to build distributed systems?

1.2. Learning Objectives

  • Explain distributed systems with their basic scalability techniques.
  • Explain and contrast cloud computing and serverless computing based on definitions and examples.

1.3. Retrieval practice

2. Distributed Systems

“Internet of Things” by Wilgengebroed on Flickr under CC BY 2.0; from Wikimedia Commons

2.1. Definitions

2.2. Internet vs Web

  • Major concepts

    • Internet: Network of networks, an internetwork

      • Each network with hosts and links
      • E.g., our home networks, university networks, ISPs, etc.

        • Connectivity for heterogeneous devices in DSs, regardless of their home network
      • Connectivity enabled by various protocols

        • IPv4 and IPv6 for host-to-host connectivity (IP = Internet Protocol)

        • DNS translates human-readable names to IP addresses, e.g., www.uni-muenster.de to 128.176.6.250 (IPv4) or 2001:4cf0:2:20::80b0:6fa (IPv6)

        • TCP, UDP, QUIC for process-to-process connectivity (e.g., process of web browser talks with remote process of web server)

    • The web is an application using the Internet

      • Clients and servers talk HTTP (another protocol)
        • E.g., GET requests of HTTP ask for HTML pages (and more)
        • Web servers provide resources to web clients (browsers, apps)
    • Internet and web contain DSs
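
To make the protocol layering above more concrete, here is a minimal sketch in Python (not part of the original slides). It parses the two example addresses with the standard `ipaddress` module and assembles the text of an HTTP GET request. The helper name `build_get_request` is made up for illustration, and nothing is sent over the network.

```python
import ipaddress


def build_get_request(host: str, path: str = "/") -> bytes:
    """Return the bytes of a minimal HTTP/1.1 GET request (hypothetical helper)."""
    lines = [
        f"GET {path} HTTP/1.1",  # request line: method, resource, protocol version
        f"Host: {host}",         # mandatory header in HTTP/1.1
        "Connection: close",     # ask the server to close the connection afterwards
        "",                      # empty line terminates the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


# The two addresses quoted above for www.uni-muenster.de
v4 = ipaddress.ip_address("128.176.6.250")
v6 = ipaddress.ip_address("2001:4cf0:2:20::80b0:6fa")
print(v4.version, v6.version)

request = build_get_request("www.uni-muenster.de")
print(request.decode("ascii"))
```

Actually sending the request would additionally need a DNS lookup and a TCP connection, e.g., via Python's `socket` module; both are omitted here so that the sketch runs offline.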

2.3. Technical DS Challenges

  • No shared memory, but message passing
  • Concurrency
  • Autonomy and heterogeneity
  • Neither global clock nor global state
  • Independent failures
  • Hostile environment, safety vs security

2.4. DS Goals

  • Make resources accessible
    • E.g., CPUs or GPUs, printers, files, communication and collaboration
  • Openness
    • Accepted standards, interoperability
  • Various distribution transparencies
  • Scalability

(Source: (Tanenbaum and Steen 2007))

2.4.1. Distribution Transparencies

  • Transparency = Invisibility (hide complexity)
  • Sample selection of transparencies from ISO/ODP (Farooqui, Logrippo, and de Meer 1995)
    • Location t.: clients need not know physical server locations
    • Migration t.: clients need not know locations of objects, which can migrate between servers
    • Replication t.: clients need not know if/where objects are replicated
    • Failure t.: (partial) failures are hidden from clients

2.4.2. Scalability

  • Dimensions of scale

    • Numerical: Numbers of users, objects, services
    • Geographical: Distance over which system is scattered
    • Administrative: Number of organizations with control over system components
  • Typical scalability techniques for IT systems

    • Scale up (vertical scaling): Improve hardware of a single machine; limited potential

    • Scale out (horizontal scaling): Use partitioning and replication; (almost) unlimited potential

(Based upon: (Neuman 1994))

2.4.3. Replication

  • To replicate = to copy to multiple machines/nodes
  • Positive effects

    • Increased availability (usability in presence of faults)
      • System usable as long as “enough” replicas available
    • Reduced latency
      • Use local or nearby replica
    • Increased throughput
      • Distribute/balance load among replicas
  • Challenge: Keep replicas in sync (consistent)
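
The availability effect of replication can be estimated with a short calculation. The following sketch rests on two simplifying assumptions that are not taken from the slides: replicas fail independently, and each replica is up with the same probability `p`. “Enough” replicas is modeled as “at least `k` of `n`”.

```python
from math import comb


def availability(n: int, k: int, p: float) -> float:
    """Probability that at least k of n independent replicas are up.

    Assumes each replica is up with probability p, independently of the others.
    """
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


single = availability(n=1, k=1, p=0.9)    # one server, no replication
any_of_3 = availability(n=3, k=1, p=0.9)  # usable if any replica is up
majority = availability(n=3, k=2, p=0.9)  # usable if a majority is up

print(round(single, 3), round(any_of_3, 3), round(majority, 3))
```

Under these assumptions, three replicas raise availability from 0.9 to roughly 0.999 when one replica suffices, and to roughly 0.972 when a majority is required. Real failures are often correlated, so such numbers are upper bounds at best.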

2.4.4. Caching

  • To cache = to save (intermediate) results close to client

    • Temporary form of replication, e.g.:
      • CPU caches keep data from RAM closer to CPU; in turn, RAM acts as cache for data from disk; in turn, disks act as caches for “cloud” data
      • Browser caches for web resources
      • SIEVE algorithm for caching
  • Positive effects
    • Reduced load on server/origin
    • Increased availability and throughput as well as reduced latency as with replication
  • Challenge: Keep cache contents up to date
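
Both the positive effects and the challenge of caching show up in a few lines of code. The sketch below is a small read-through cache with least-recently-used (LRU) eviction; note that this is not the SIEVE algorithm mentioned above, and the class `LRUCache` as well as the dictionary standing in for the origin server are made up for illustration.

```python
from collections import OrderedDict


class LRUCache:
    """Read-through cache with least-recently-used eviction (illustrative sketch)."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity      # maximum number of cached entries
        self.fetch = fetch            # function that asks the origin for a key
        self.entries = OrderedDict()  # ordered from least to most recently used
        self.hits = 0
        self.misses = 0

    def get(self, key):
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            self.hits += 1
            return self.entries[key]
        self.misses += 1
        value = self.fetch(key)            # cache miss: contact the origin
        self.entries[key] = value
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used entry
        return value

    def invalidate(self, key):
        self.entries.pop(key, None)


origin = {"/a": "v1", "/b": "v1"}  # stand-in for a remote server
origin_calls = []


def fetch_from_origin(key):
    origin_calls.append(key)
    return origin[key]


cache = LRUCache(capacity=2, fetch=fetch_from_origin)
cache.get("/a")          # miss, fetched from origin
cache.get("/a")          # hit, origin not contacted
origin["/a"] = "v2"      # origin changes behind the cache's back
stale = cache.get("/a")  # hit, but the cached value is outdated
cache.invalidate("/a")
fresh = cache.get("/a")  # miss after invalidation, fetched again
print(stale, fresh, origin_calls)
```

Only two of the four reads reach the origin (reduced load), but one read returns the outdated value `v1` until the entry is invalidated, which is exactly the consistency challenge named above.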

2.4.5. Partitioning

  • To partition = to spread data or services among multiple machines/nodes
  • Effects
    • Reduced availability: each node is additional point of failure
      • If node fails, its data/services are not available
      • (To improve availability, partitioning usually paired with replication)
    • Reduced latency and increased throughput
      • Each node operates on (small) subset
        • (Partial) results on subsets produced fast; combined into overall result
      • Nodes operate in parallel
        • (Think of search in large set of data)
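
The search example can be sketched as follows. As a simplification, threads within one Python process stand in for separate nodes, and data is assigned to nodes by a modulo rule; the function names are hypothetical. The point is the structure (partition, search subsets in parallel, combine partial results), not measured speed.

```python
from concurrent.futures import ThreadPoolExecutor


def partition(items, n_nodes):
    """Spread integer items over n_nodes lists by a simple modulo rule."""
    nodes = [[] for _ in range(n_nodes)]
    for item in items:
        nodes[item % n_nodes].append(item)
    return nodes


def search_node(node_items, predicate):
    """Each node searches only its own (small) subset."""
    return [x for x in node_items if predicate(x)]


def parallel_search(nodes, predicate):
    """Search all nodes concurrently and combine the partial results."""
    with ThreadPoolExecutor(max_workers=len(nodes)) as pool:
        partials = pool.map(lambda node: search_node(node, predicate), nodes)
        return sorted(x for part in partials for x in part)


nodes = partition(range(1, 101), n_nodes=4)
result = parallel_search(nodes, lambda x: x % 25 == 0)
print([len(node) for node in nodes], result)
```

If one of the four lists were lost, a quarter of the data would be unsearchable, which is why partitioning is usually paired with replication.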

3. Cloud Computing

3.1. Computing as Utility

  • Computing as 5th utility (Buyya et al. 2009)

    • “Computing is being transformed to a model consisting of services that are commoditized and delivered in a manner similar to traditional utilities such as water, electricity, gas, and telephony. In such a model, users access services based on their requirements without regard to where the services are hosted or how they are delivered.”

    • Subscription-based pay-per-use of complex IT infrastructure, without heavy up-front investment/developments
      • Flexibility
    • Economies of scale for providers

3.2. NIST Definition

  • (Mell and Grance 2011)

    “NIST Visual model of cloud computing definition” by P Naveen, Wong Kiing Ing, Michael Kobina Danquah, Amandeep S Sidhu, and Ahmed Abu-Siada under CC BY 3.0; from Fig. 2 in P Naveen et al 2016 IOP Conf. Ser.: Mater. Sci. Eng. 121 012010

    • “Cloud computing is a model for enabling ubiquitous, convenient, on-demand network access to a shared pool of configurable computing resources (e.g., networks, servers, storage, applications, and services) that can be rapidly provisioned and released with minimal management effort or service provider interaction. This cloud model is composed of five essential characteristics, three service models, and four deployment models.”
      • (Emphasis added)

(Figure source: (Naveen et al. 2016))

3.2.1. NIST Definition: Characteristics

  • On-demand self-service
    • Users provision servers with CPUs, memory, storage, networking
      • Without human interaction with service provider
  • Broad network access
    • Capabilities accessible over Internet
  • Resource pooling
    • Provider’s computing resources assigned dynamically to multiple consumers
      • Multi-tenant
      • Customers without knowledge/control of exact locations
  • Rapid elasticity
    • Capabilities quickly scalable up or down
    • Illusion of unlimited resources
  • Measured service
    • Cloud systems meter, control, and optimize resource usage
      • Transparency for both providers and consumers

(Source: (Mell and Grance 2011))
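
Rapid elasticity and measured service can be illustrated with a small calculation. The sketch below assumes a hypothetical load trace and a hypothetical capacity of 100 requests per second per instance; neither number comes from NIST or from a real provider, and the scaling rule is deliberately naive.

```python
from math import ceil


def desired_instances(load, capacity_per_instance, min_instances=1, max_instances=10):
    """Number of instances needed for the given load, clamped to provider limits."""
    needed = ceil(load / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))


hourly_load = [40, 120, 450, 90]  # hypothetical requests per second, one value per hour
instances = [desired_instances(load, capacity_per_instance=100) for load in hourly_load]

elastic_instance_hours = sum(instances)               # metered, scaled up and down
fixed_instance_hours = max(instances) * len(instances)  # provisioned for the peak
print(instances, elastic_instance_hours, fixed_instance_hours)
```

With these made-up numbers, scaling with the load is metered at 9 instance-hours, whereas provisioning for the peak all the time would take 20.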

3.2.2. NIST Definition: Service Models

  • Infrastructure as a Service (IaaS)
    • Consumers deploy arbitrary software in VMs on provider’s cloud infrastructure
      • “VMs in the cloud”, e.g., major cloud providers and project seminar servers
  • Platform as a Service (PaaS)

    • Consumers deploy applications using programming languages, libraries, and tools supported by provider
      • “Programming environment in the cloud”, e.g., major cloud providers
  • Software as a Service (SaaS)

    • Consumers use provider’s applications on cloud infrastructure
      • “Applications in the cloud”, e.g., office suite, CRM system, ERP system
  • (Anything as a Service (XaaS))

    • (X = Container, Function, Backend, Database, …)

3.2.3. NIST Definition: Deployment Models

  • Public cloud
    • Provider operates cloud for open use by the general public
  • Private cloud
    • Organization operates its own cloud
    • Exclusive use
  • Community cloud
    • Community of consumers operates their own cloud
  • Hybrid cloud
    • Composition of two or more distinct private, community, or public cloud infrastructures
    • Bound together by technology that enables data and application portability

3.3. Cloud Caveats

  • Digital sovereignty?

    • Provider reliability
    • Vendor lock-in
  • Security and privacy concerns

4. Serverless Computing

  • (Kounev et al. 2023)
    • “Serverless computing is a cloud computing paradigm encompassing a class of cloud computing platforms that allow one to develop, deploy, and run applications (or components thereof) in the cloud without allocating and managing virtualized servers and resources or being concerned about other operational aspects.”
      • Provider is responsible for operational aspects, e.g., fault tolerance, elastic scaling
      • Pay-per-use with fine granularity
      • Examples include AWS Lambda, Google Cloud Functions

4.1. Self-Study

4.2. Sample Serverless Applications

  • Anomaly detection for industrial sensors

    • Cloud functions consume micro batches, e.g., thresholding or machine learning for anomaly detection
  • Object storage

    • (Unlimited) storage of data with meta-data under unique identifiers
    • No details of servers necessary
  • Serverless databases, SQL-as-a-Service

    • Capacity planning and autoscaling included
  • Serverless edge or fog computing

    • Computation and storage close to data sources
      • (In contrast to shipping of data to central data centers for computations)
      • E.g., real-time computer vision or analytics for mobile or (resource-constrained) IoT devices

(Source: (Kounev et al. 2023))
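
The first sample application, anomaly detection by thresholding, might look roughly like the following cloud function. This is a sketch under assumptions: the `(event, context)` signature follows the convention of Python handlers on platforms such as AWS Lambda, while the event layout (`readings`, `threshold`) and the sensor names are invented for illustration and not taken from Kounev et al.

```python
def handler(event, context=None):
    """Check one micro batch of sensor readings against a threshold.

    The event layout is hypothetical:
    {"threshold": float, "readings": [{"sensor": str, "value": float}, ...]}
    """
    readings = event["readings"]
    threshold = event.get("threshold", 80.0)  # assumed default, not from the source
    anomalies = [r for r in readings if r["value"] > threshold]
    return {"checked": len(readings), "anomalies": anomalies}


# Local invocation with a made-up micro batch; on a serverless platform,
# the provider would call handler() once per incoming batch.
batch = {
    "threshold": 75.0,
    "readings": [
        {"sensor": "s1", "value": 70.2},
        {"sensor": "s2", "value": 91.5},
        {"sensor": "s3", "value": 74.9},
    ],
}
result = handler(batch)
print(result)
```

The function keeps no state between calls and contains no server or scaling logic; starting instances, running them in parallel, and billing per invocation are left to the provider.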

5. Conclusions

5.1. Summary

  • Distributed systems are everywhere
    • Internet as core infrastructure
    • Networked machines coordinated with messages
  • Cloud computing provides infrastructure for distributed systems
    • Different service models for different applications
    • Serverless computing as pay-per-use paradigm that frees users from operational concerns

Bibliography

Buyya, Rajkumar, Satish Narayana Srirama, Giuliano Casale, Rodrigo Calheiros, Yogesh Simmhan, Blesson Varghese, Erol Gelenbe, et al. 2018. “A Manifesto for Future Generation Cloud Computing: Research Directions for the Next Decade.” ACM Comput. Surv. 51 (5): 1–38. https://doi.org/10.1145/3241737.
Buyya, Rajkumar, Chee Shin Yeo, Srikumar Venugopal, James Broberg, and Ivona Brandic. 2009. “Cloud Computing and Emerging IT Platforms: Vision, Hype, and Reality for Delivering Computing as the 5th Utility.” Future Generation Computer Systems 25 (6): 599–616. https://doi.org/10.1016/j.future.2008.12.001.
Coulouris, George, Jean Dollimore, Tim Kindberg, and Gordon Blair. 2011. Distributed Systems: Concepts and Design. 5th ed. USA: Addison-Wesley Publishing Company. http://www.cdk5.net/.
Farooqui, Kazi, Luigi Logrippo, and Jan de Meer. 1995. “The ISO Reference Model for Open Distributed Processing: An Introduction.” Computer Networks and ISDN Systems 27 (8): 1215–29. http://www.sciencedirect.com/science/article/pii/016975529500087N.
Kounev, Samuel, Nikolas Herbst, Cristina L. Abad, Alexandru Iosup, Ian Foster, Prashant Shenoy, Omer Rana, and Andrew A. Chien. 2023. “Serverless Computing: What It Is, and What It Is Not?” Commun. ACM 66 (9): 80–92. https://doi.org/10.1145/3587249.
Mell, Peter, and Timothy Grance. 2011. “The NIST Definition of Cloud Computing.” NIST Special Publication 800-145. https://doi.org/10.6028/NIST.SP.800-145.
Naveen, P, Wong Kiing Ing, Michael Kobina Danquah, Amandeep S Sidhu, and Ahmed Abu-Siada. 2016. “Cloud Computing for Energy Management in Smart Grid - An Application Survey.” IOP Conference Series: Materials Science and Engineering 121 (1): 012010. https://doi.org/10.1088/1757-899X/121/1/012010.
Neuman, B. Clifford. 1994. “Scale in Distributed Systems.” In Readings in Distributed Computing Systems. IEEE Computer Society Press. http://clifford.neuman.name/publications/.
Tanenbaum, Andrew S., and Maarten van Steen. 2007. Distributed Systems: Principles and Paradigms. 2nd ed. Upper Saddle River, NJ, USA: Prentice-Hall, Inc.

License Information

Source files are available on GitLab (check out embedded submodules) under free licenses. Icons of custom controls are by @fontawesome, released under CC BY 4.0.

Except where otherwise noted, the work “Cloud Computing”, © 2018-2024 Jens Lechtenbörger, is published under the Creative Commons license CC BY-SA 4.0.