Howto for TTS with Emacs-Reveal

Jens Lechtenbörger

Press “play” once in audio-controls below (or type “a”) to start the presentation, which advances automatically afterwards.

1. General thoughts

  • All of this builds on emacs-reveal (Lechtenbörger 2019a, 2019b)
    • Check out its howto first
  • Text-To-Speech (TTS) should read notes (#+begin_notes ... #+end_notes)
    • Controlled by option reveal_with_tts
      • Use customization for available speakers
    • Audio is played with the audio slideshow plugin for Reveal.js
  • If slides with audio advance automatically, this is a video mode
    • Then, notes are required for every slide
    • Reveal.js “fragments” (animations) are still possible

1.1. Technical Idea

  • Implement TTS as two-stage process

    • First, extract notes from presentation

      • Generate a text file for each note

        • Its name is a hash value of the contents
      • Generate one index file that stores names (and other information) for all text files

      • This happens during export/publication of Org files into reveal.js presentations

    • Second, run TTS software on index file to generate audio
      • Implemented in Docker image emacs-reveal/tts
        • Image includes TTS implementations SpeechT5 and SpeechBrain
      • StyleTTS2 available in Docker image emacs-reveal/tts-styletts2
        • Activate with default voice: #+OPTIONS: reveal_with_tts:StyleTTS2
        • Or with target audio for voice cloning: #+OPTIONS: reveal_with_tts:StyleTTS2:/oer/target.wav
      • Generated audio shares hash value of its text as part of its name, enabling caching of unchanged audio
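
The two stages above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code used by emacs-reveal or by the Docker images: the function names, the choice of SHA-256, the JSON index layout, and the `.txt`/`.wav` file names are all assumptions, and `synthesize` stands in for whatever TTS engine is used.

```python
import hashlib
import json
from pathlib import Path


def note_hash(text: str) -> str:
    """Return a hex digest of the note text (SHA-256 is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def extract_notes(notes: list[str], out_dir: Path) -> Path:
    """Stage 1: write one text file per note plus one index file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    index = []
    for text in notes:
        digest = note_hash(text)
        # The file name is derived from the contents of the note.
        (out_dir / f"{digest}.txt").write_text(text, encoding="utf-8")
        index.append({"hash": digest, "text_file": f"{digest}.txt"})
    index_path = out_dir / "index.json"  # hypothetical index format
    index_path.write_text(json.dumps(index, indent=2), encoding="utf-8")
    return index_path


def generate_audio(index_path: Path, synthesize) -> list[str]:
    """Stage 2: create audio only for notes that have no cached audio yet.

    `synthesize` is a placeholder callable mapping text to audio bytes.
    Returns the hashes for which audio was newly generated.
    """
    out_dir = index_path.parent
    generated = []
    for entry in json.loads(index_path.read_text(encoding="utf-8")):
        audio = out_dir / f"{entry['hash']}.wav"
        if audio.exists():
            continue  # Unchanged text has the same hash: reuse cached audio.
        text = (out_dir / entry["text_file"]).read_text(encoding="utf-8")
        audio.write_bytes(synthesize(text))
        generated.append(entry["hash"])
    return generated
```

Because the audio file name contains the hash of its text, re-running the second stage after editing a single note regenerates only the audio for that note.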

1.2. Docker image emacs-reveal/tts

2. Slide with notes and fragments

Notes on this slide clarify some aspects of the text generated by org-re-reveal as basis for TTS. To pronounce numbers, abbreviations, and “complicated” words, see variable org-re-reveal-tts-normalize-table.

Besides, for demonstration purposes, this slide contains fragments with separate notes:

  • First appearing point, with notes

  • Second appearing point
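
The normalization mentioned in the notes of this slide can be pictured as a table of replacements applied to the note text before it reaches the TTS engine. The sketch below shows that idea in Python; it does not reproduce the actual format or contents of org-re-reveal-tts-normalize-table (an Emacs Lisp variable), and the table entries and the function name are hypothetical.

```python
import re

# Hypothetical table of (regular expression, spoken replacement) pairs.
NORMALIZE_TABLE = [
    (r"\bTTS\b", "text to speech"),
    (r"\be\.g\.", "for example"),
    (r"\b10\b", "ten"),
]


def normalize(text, table=NORMALIZE_TABLE):
    """Apply every replacement of the table, in order, to the note text."""
    for pattern, replacement in table:
        text = re.sub(pattern, replacement, text)
    return text


print(normalize("TTS reads e.g. 10 notes"))
```

Entries are applied in order, so more specific patterns would need to come before more general ones.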

3. A real example

3.0.1. Offset as Pointer into Range

[Figure: address translation. A virtual address (10-bit virtual page number, 10-bit page offset) is translated into a physical address (5-bit frame number, 10-bit page offset). Virtual pages 0 to 3 and frames 0 to 3 are 1 KiB each; the same offset points into the page (counted from the start of page 0) and into the frame (counted from the start of frame 1).]

“Address translation with offset in covered address range” by Max Lütkemeyer and Jens Lechtenbörger under CC BY-SA 4.0; from GitLab
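
The translation shown in the figure can be written down as a small Python sketch. The 10-bit offset (1 KiB pages) and the mapping of page 0 to frame 1 are read off the figure labels; the remaining page table entries and all names are assumptions made for illustration.

```python
OFFSET_BITS = 10                 # 1 KiB pages: 2**10 bytes per page
PAGE_SIZE = 1 << OFFSET_BITS
OFFSET_MASK = PAGE_SIZE - 1

# Page 0 -> frame 1 as in the figure; the other entries are hypothetical.
PAGE_TABLE = {0: 1, 1: 3, 2: 0, 3: 2}


def translate(virtual_address, page_table=PAGE_TABLE):
    """Translate a virtual address into a physical address."""
    page_number = virtual_address >> OFFSET_BITS   # upper bits select the page
    offset = virtual_address & OFFSET_MASK         # lower 10 bits stay unchanged
    frame_number = page_table[page_number]         # page table lookup
    return (frame_number << OFFSET_BITS) | offset  # frame start plus offset


print(translate(5))  # byte 5 of page 0 is byte 5 of frame 1
```

The offset is copied unchanged, so it points to the same relative position in the frame as in the page; only the page number is replaced by the frame number.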

4. The End

Person taking steps to top

The road ahead …

“Figure” under CC0 1.0; converted from Pixabay

https://gitlab.com/oer/

4.1. Bibliography

Lechtenbörger, Jens. 2019a. “Emacs-reveal: A software bundle to create OER presentations.” Journal of Open Source Education (JOSE) 2 (18). https://doi.org/10.21105/jose.00050.
———. 2019b. “Simplifying license attribution for OER with emacs-reveal.” In 17. Fachtagung Bildungstechnologien (DELFI 2019), edited by Niels Pinkwart and Johannes Konert, 205–16. Bonn: Gesellschaft für Informatik e.V. https://doi.org/10.18420/delfi2019_280.

License Information

Except where otherwise noted, the work “Howto for TTS with Emacs-Reveal”, © 2023-2025 Jens Lechtenbörger, is published under the Creative Commons license CC BY-SA 4.0.

No warranties are given. The license may not give you all of the permissions necessary for your intended use.

In particular, trademark rights are not licensed under this license. Thus, rights concerning third party logos (e.g., on the title slide) and other (trade-) marks (e.g., “Creative Commons” itself) remain with their respective holders.