Interrupts and I/O II

IT Systems, Summer Term 2024
Dr. Jens Lechtenbörger (License Information)

1. Introduction

2. I/O Processing Variants

2.1. Recall: I/O with Interrupts

  • Recall

    • Asynchronous processing of I/O
    • External notifications via interrupts

    I/O with Interrupts

2.2. Blocking vs Non-Blocking I/O

  • The previous slide left open which application continues after an I/O system call

    • OS provides blocking and non-blocking system calls

    • Blocking system call

      • Thread invoking system call has to wait (is blocked) until I/O completed
      • However, a different thread may continue
        • Scheduling, context switch, overhead
    • Non-blocking system call
      • OS initiates I/O and returns incomplete result to thread
      • Thread continues (and is informed of or needs to check for I/O completion at later point in time)
      • (Notice: This is impossible with polling)
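
The contrast between blocking and non-blocking system calls can be observed directly. The following sketch (in Python, for illustration; real network servers would use the same flags on sockets) makes the read end of a pipe non-blocking, so a read on an empty pipe returns an "incomplete result" immediately instead of suspending the thread:

```python
import os

# Create a pipe; its read end initially has no data available.
r, w = os.pipe()

# Make the read end non-blocking. A blocking read on an empty pipe
# would suspend the calling thread until data arrives.
os.set_blocking(r, False)

try:
    os.read(r, 100)            # non-blocking: returns immediately ...
except BlockingIOError:        # ... signaling that no data is ready yet
    print("no data yet, thread continues")

os.write(w, b"hello")          # now data is available
print(os.read(r, 100))         # succeeds without blocking
os.close(r)
os.close(w)
```

With `os.set_blocking(r, True)` instead, the first `os.read` would block the thread until another thread (or process) writes to the pipe.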

2.3. Latency Example (1/2)

  • Goal: Show that interrupt overhead becomes a serious challenge when interrupts are frequent
  • See (Larsen et al. 2009)
    • Two PCs with Intel Xeon processors (2.13 GHz)
    • 1 Gbps Ethernet networking cards connected via PCIe
    • 1 – 2 frames may arrive per 1 µs (1 µs = one millionth of a second)
      • For the curious
        • Ethernet’s unit of transfer: frame, with a minimum size of 512 bits (64 bytes)
        • At 1 Gbps, 1000 bits take 1 µs to transmit, plus propagation and queueing delays
        • Thus, 1 – 2 frames may arrive per 1 µs
    • Interrupt per frame arrival!?
      • What about 10 Gbps networking?
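
The arrival-rate claim above is easy to verify with a back-of-the-envelope calculation (values as on the slide):

```python
# Sanity check of the arrival-rate claim.
link_rate_bps = 1_000_000_000      # 1 Gbps Ethernet
min_frame_bits = 512               # minimum Ethernet frame: 512 bits (64 bytes)

# Transmission time of one minimum-size frame, in microseconds.
time_per_min_frame_us = min_frame_bits / link_rate_bps * 1e6
print(time_per_min_frame_us)       # ≈ 0.512 µs per minimum-size frame

# Hence the maximum arrival rate in frames per microsecond.
frames_per_us = 1 / time_per_min_frame_us
print(round(frames_per_us, 2))     # ≈ 1.95, i.e., 1–2 frames per µs
```

At 10 Gbps, each of these times shrinks by a factor of ten, so up to about 20 minimum-size frames could arrive per microsecond.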

2.4. Latency Example (2/2)

  • Numbers from (Larsen et al. 2009)
    • Processing of single frame takes total of 7.7 µs
    • Latency breakdown according to different sources
      • Hardware: ≈ 0.6 µs
      • Interrupt processing: > 3 µs
      • Processing of data: > 3 µs
  • If one or two frames arrive per 1 µs and each frame needs 7.7 µs of processing time, something is seriously wrong
    • Network data will be dropped because it arrives too fast
      • The system could even crash
    • Interrupt per arrival does not work
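
The mismatch can be quantified: comparing the per-frame service time with the inter-arrival time gives the offered load relative to capacity, and any value above 1 means frames pile up without bound (numbers from Larsen et al. 2009, as on the slide):

```python
# Back-of-the-envelope check why interrupt-per-frame cannot keep up.
service_time_us = 7.7        # total processing time per frame
arrival_interval_us = 0.5    # up to 2 frames per µs -> one frame every 0.5 µs

# Offered load relative to capacity; > 1 means the backlog grows forever.
utilization = service_time_us / arrival_interval_us
print(utilization)           # 15.4: the system would need to be ~15x faster
```

So at peak rate the CPU is asked to do roughly 15 µs of work per microsecond of wall-clock time; dropped frames are the inevitable result.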

2.5. Interrupt Livelocks

  • Livelock: Situation in which computations take place but (almost) no progress is made
    • Computation time is mostly wasted on overhead
  • Interrupt livelock
    • Interrupts arrive faster than they can be processed
      • Also, not enough CPU time is left for other tasks
        • Interrupts are served with high priority
      • Context switching, cache pollution
      • Nothing useful happens any more
    • Prevent by hybrid of polling and interrupts
      • E.g., NAPI (New API)

2.5.1. Starvation

  • Interrupt livelock is special case of starvation
  • Starvation = continued denial/lack of resource

    • Under interrupt livelock, threads do not receive the resource CPU (in quantities sufficient for progress) as long as “too many” interrupts are triggered

2.6. NAPI

  • Linux “New API” for networking, see (Salim, Olsson, and Kuznetsov 2001)
  • Hybrid scheme

    • Use interrupts under low load

      • Utilize CPUs better
        • Avoid polling for devices without data
    • Switch to polling under high load
      • Avoid interrupt overhead
        • Data will be available anyway
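
The core NAPI idea can be sketched in a few lines. The following toy model (hypothetical names, not the Linux kernel API) shows the switch: the first interrupt masks further interrupts and hands the device to a poll routine, which processes frames up to a budget and re-enables interrupts only once the queue is drained:

```python
from collections import deque

class Nic:
    """Toy model of a NIC receive ring (for illustration only)."""
    def __init__(self):
        self.ring = deque()
        self.interrupts_enabled = True

nic = Nic()
processed = []

def interrupt_handler(nic):
    # NAPI idea: on the first interrupt, mask further interrupts and
    # switch to polling; later arrivals cause no interrupt overhead.
    nic.interrupts_enabled = False
    poll(nic, budget=4)

def poll(nic, budget):
    # Process at most `budget` frames per poll invocation, so a single
    # busy device cannot monopolize the CPU.
    while nic.ring and budget > 0:
        processed.append(nic.ring.popleft())
        budget -= 1
    if not nic.ring:
        # Queue drained: fall back to interrupts (low load).
        nic.interrupts_enabled = True

# High load: six frames arrive, but only the first raises an interrupt.
for frame in range(6):
    nic.ring.append(frame)
if nic.interrupts_enabled:
    interrupt_handler(nic)

# The remaining frames are consumed by a later poll, not by interrupts.
poll(nic, budget=4)
print(processed)             # [0, 1, 2, 3, 4, 5]
```

Under low load the queue drains within one poll, interrupts stay enabled, and the CPU never polls an idle device; under high load the poll budget bounds the per-invocation work.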

3. Outlook

3.1. When to Poll?

Measurements for a DRAM-based storage prototype (source for numbers: Yang, Minturn, and Hady (2012))

3.2. I/O Processing – Then and Now

  • Then: Disks are slow
    • Mechanical devices
    • Delivered data is processed immediately by CPU
    • Latency before data arrives → Interrupts beneficial
  • Now: Nonvolatile memory is fast, see (Nanavati et al. 2016)
    • Mechanics eliminated
    • Operation at network/bus speed (PCIe)
    • Data can be delivered faster than processed → Polling beneficial
    • Need to rethink previous techniques
      • Balancing, scheduling, scaling, tiering

3.3. Call for Research

  • (Barroso et al. 2017): Attack of the Killer Microseconds

    • Nanosecond latency (DRAM access when data not in CPU cache) is hidden by CPU hardware

      • Out-of-order execution, branch prediction, multithreading (two threads per core)
      • (However, also ongoing research to address Killer Nanoseconds (Jonathan et al. 2018))
    • Millisecond latency (disk I/O) is hidden by OS

      • Multitasking
    • What about microseconds of new generation of fast I/O devices?

      • E.g., Gbps networking, flash memory
      • Paper describes datacenter challenges experienced at Google

4. Conclusions

4.1. Summary

  • Interrupt handling is major OS task
    • I/O processing
    • Timers, to be revisited for scheduling
    • System call implementation
  • Polling vs interrupt-driven I/O
    • Efficiency trade-off
    • Interrupt livelocks and NAPI

Bibliography

Barroso, Luiz, Mike Marty, David Patterson, and Parthasarathy Ranganathan. 2017. “Attack of the Killer Microseconds.” Commun. ACM 60 (4): 48–54. https://dl.acm.org/citation.cfm?id=3015146.
Jonathan, Christopher, Umar Farooq Minhas, James Hunter, Justin Levandoski, and Gor Nishanov. 2018. “Exploiting Coroutines to Attack the ‘Killer Nanoseconds’.” Proc. VLDB Endow. 11 (11): 1702–14. https://doi.org/10.14778/3236187.3236216.
Larsen, Steen, Parthasarathy Sarangam, Ram Huggahalli, and Siddharth Kulkarni. 2009. “Architectural Breakdown of End-to-End Latency in a TCP/IP Network.” Int. J. Parallel Prog. 37 (6): 556–71. http://link.springer.com/article/10.1007/s10766-009-0109-6.
Nanavati, Mihir, Malte Schwarzkopf, Jake Wires, and Andrew Warfield. 2016. “Non-Volatile Storage – Implications of the Datacenter’s Shifting Center.” ACM Queue 13 (9). https://queue.acm.org/detail.cfm?id=2874238.
Salim, Jamal Hadi, Robert Olsson, and Alexey Kuznetsov. 2001. “Beyond Softnet.” In Proceedings of the 5th Annual Linux Showcase & Conference. Oakland, California: USENIX Association. https://www.usenix.org/publications/library/proceedings/als01/full_papers/jamal/jamal.pdf.
Yang, Jisoo, Dave B. Minturn, and Frank Hady. 2012. “When Poll Is Better than Interrupt.” In FAST 2012. https://www.usenix.org/conference/fast12/when-poll-better-interrupt.

License Information

Source files are available on GitLab (check out embedded submodules) under free licenses. Icons of custom controls are by @fontawesome, released under CC BY 4.0.

Except where otherwise noted, the work “Interrupts and I/O II”, © 2017-2024 Jens Lechtenbörger, is published under the Creative Commons license CC BY-SA 4.0.