Text to Speech – Voice Output in Scratch

The Text to Speech extension brings realistic spoken voice to your Scratch projects.
Make your characters talk, narrate stories, create voice assistants, build language learning tools, and add accessibility features – all with simple blocks that convert text into natural-sounding speech.
Works in 23 languages with 5 unique voices.


🌟 Overview #

  • Natural Speech Synthesis: Convert any text into spoken words with realistic voices.
  • 23 Languages Supported: Speak in English, Spanish, French, German, Chinese, Arabic, and many more.
  • 5 Unique Voices: Alto, Tenor, Squeak, Giant, and Kitten – each with distinct characteristics.
  • Simple Integration: Just one block to make your sprites talk!
  • Language Detection: Automatically defaults to your editor’s language.
  • Cloud-Based: Uses Scratch’s synthesis service for high-quality speech generation.

Key Features #

  • 5 distinct voice personalities: neutral (Alto/Tenor), funny (Squeak/Giant), and playful (Kitten).
  • 23 supported languages with natural pronunciation.
  • Automatic language matching based on editor locale.
  • Automatic pitch adjustment in languages where only a female voice is available.
  • Character-based voices for creative storytelling.
  • Sync with Scratch animations and sound effects.
  • Perfect for accessibility, language learning, and interactive narratives.

🚀 How to Use #

  1. Go to: pishi.ai/play
  2. Open the Extensions section.
  3. Select the Text to Speech extension.
  4. Use the “speak [WORDS]” block to make your sprite talk!
  5. (Optional) Set voice and language before speaking.
  6. Your sprite speaks the text out loud – the script waits at the “speak” block until speech finishes.

Tips

  • The extension automatically detects your editor’s language and uses it by default.
  • Use different voices for different characters to create conversations.
  • Combine with “think” or “say” blocks to display text while speaking.
  • Requires an internet connection, because speech is generated by a cloud synthesis service.
  • Keep text to 128 characters or fewer – longer text is truncated.

🧱 Blocks and Functions #

 

🗣️ Main Speech Block #

speak [WORDS]

Converts text to speech and plays it as audio.
The script waits until the speech finishes before continuing – perfect for synchronized storytelling.

How it works:

  • Type or insert text to be spoken.
  • The block sends the text to Scratch’s synthesis service.
  • Audio is generated with the selected voice and language.
  • Audio plays through your speakers/headphones.
  • The block waits until speech completes before moving to the next block.
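The request step above can be pictured with a short Python sketch. It is only an illustration under stated assumptions: the host name comes from this page, but the `https://` scheme, the `/synth` path, the parameter names `locale`, `gender`, and `text`, and the default values are assumptions modelled on the open-source Scratch extension, not a confirmed description of this service. The sketch builds a URL and does not send any request.

```python
from urllib.parse import quote, urlencode

# Host name taken from this page. The scheme, the "/synth" path and the
# query parameter names are ASSUMPTIONS, not a documented API.
SERVER_HOST = "https://synthesis-service.scratch.mit.edu"
MAX_TEXT_LENGTH = 128  # limit stated in this documentation


def build_synthesis_url(words: str, locale: str = "en-US", gender: str = "female") -> str:
    """Return the URL a client might request to synthesize `words`."""
    # Text beyond the limit is cut off before it is sent.
    text = str(words)[:MAX_TEXT_LENGTH]
    # quote_via=quote encodes spaces as %20 rather than "+".
    query = urlencode({"locale": locale, "gender": gender, "text": text}, quote_via=quote)
    return f"{SERVER_HOST}/synth?{query}"


if __name__ == "__main__":
    print(build_synthesis_url("Hello, my name is Scratch Cat!"))
```

The truncation happens before encoding, so the 128-character limit applies to the text you typed, not to the longer percent-encoded form.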

Arguments:

  • [WORDS] – Text to speak (string, up to 128 characters)

Examples:

  • speak [Hello, my name is Scratch Cat!]
  • speak (join [The answer is ] (score)) – Speak dynamic text with variables
  • speak [What is your favorite color?]

Note: This block waits for speech to complete – use it for dialogue, narration, and synchronized animations.
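Because text longer than 128 characters is truncated, longer passages need to be spoken in pieces. The Python sketch below shows one possible way to split a passage at word boundaries so that each piece fits the limit. The function name `split_for_speech` is hypothetical and is not part of the extension; in a Scratch project the same idea would be a loop that feeds list items to the “speak” block one at a time.

```python
MAX_TEXT_LENGTH = 128  # limit stated in this documentation


def split_for_speech(text: str, limit: int = MAX_TEXT_LENGTH) -> list[str]:
    """Split text into chunks of at most `limit` characters, breaking at spaces."""
    chunks: list[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit is cut into hard slices.
        while len(word) > limit:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:limit])
            word = word[limit:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= limit:
            current = candidate
        else:
            # The next word does not fit, so close the current chunk.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    story = "Once upon a time " * 20
    for piece in split_for_speech(story):
        print(len(piece), piece)
```

Speaking each chunk with its own “speak” block keeps the narration in order, since every block waits for the previous one to finish.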

 


🎭 Voice Selection #

set voice to [VOICE]

Changes the voice character for speech synthesis.
Each voice has a unique personality – perfect for different characters or moods.

Available Voices:

  • alto – Neutral, ambiguous gender voice (default) – professional, clear
  • tenor – Neutral, ambiguous gender voice – slightly deeper than alto
  • squeak – High-pitched, playful voice – great for small characters, excited speech
  • giant – Low-pitched, deep voice – perfect for large characters, serious tones
  • kitten – Ultra-high baby voice – speaks “meow” for all words (fun character effect!)

Examples:

  • set voice to [alto] → Standard neutral voice
  • set voice to [squeak] → High-pitched, energetic voice
  • set voice to [giant] → Deep, serious voice
  • set voice to [kitten] → Cat character (says “meow meow meow”)

Creative Uses:

  • Use alto for narrators, teachers, or professional characters.
  • Use tenor for heroes, leaders, or confident characters.
  • Use squeak for fairies, children, robots, or excited emotions.
  • Use giant for monsters, villains, authority figures, or serious moments.
  • Use kitten for pet characters or comedic effect.

 


🌍 Language Selection #

set language to [LANGUAGE]

Changes the language for speech synthesis.
The extension automatically defaults to your editor language, but you can override it for multilingual projects.

Supported Languages (23 total):

  • English (en) – American English pronunciation
  • Spanish (European) (es) – Castilian Spanish
  • Spanish (Latin American) (es-419) – Latin American Spanish
  • French (fr) – French pronunciation
  • German (de) – German pronunciation
  • Italian (it) – Italian pronunciation
  • Portuguese (Brazilian) (pt-br) – Brazilian Portuguese
  • Portuguese (European) (pt) – European Portuguese
  • Chinese (Mandarin) (zh-cn) – Mandarin Chinese (simplified & traditional)
  • Japanese (ja) – Japanese pronunciation
  • Korean (ko) – Korean pronunciation
  • Arabic (ar) – Modern Standard Arabic
  • Hindi (hi) – Hindi pronunciation
  • Russian (ru) – Russian pronunciation
  • Dutch (nl) – Dutch pronunciation
  • Polish (pl) – Polish pronunciation
  • Turkish (tr) – Turkish pronunciation
  • Danish (da) – Danish pronunciation
  • Swedish (sv) – Swedish pronunciation
  • Norwegian (nb) – Norwegian Bokmål
  • Icelandic (is) – Icelandic pronunciation
  • Romanian (ro) – Romanian pronunciation
  • Welsh (cy) – Welsh pronunciation

How it works:

  • The extension automatically sets language to match your Scratch editor’s language (if supported).
  • Use this block to override the default or create multilingual projects.
  • Language setting is saved with the project.
  • Some languages have only female voices available – the extension automatically adjusts voice pitch.
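The default-language behaviour can be sketched in a few lines of Python. The language codes are the 23 listed on this page. The fallback to English and the step that retries with the base language (for example `en-GB` to `en`) are assumptions made for illustration, and `default_speech_language` is a hypothetical helper name.

```python
# The 23 language codes listed in this documentation.
SUPPORTED = {
    "en", "es", "es-419", "fr", "de", "it", "pt-br", "pt", "zh-cn", "ja",
    "ko", "ar", "hi", "ru", "nl", "pl", "tr", "da", "sv", "nb", "is", "ro", "cy",
}
FALLBACK = "en"  # ASSUMPTION: used when the editor language is not supported


def default_speech_language(editor_locale: str) -> str:
    """Pick a speech language for the given editor locale."""
    code = editor_locale.strip().lower().replace("_", "-")
    if code in SUPPORTED:
        return code
    # ASSUMPTION: retry with the base language, e.g. "en-gb" -> "en".
    base = code.split("-")[0]
    if base in SUPPORTED:
        return base
    return FALLBACK


if __name__ == "__main__":
    for locale in ("fr", "pt-BR", "en-GB", "fa"):
        print(locale, "->", default_speech_language(locale))
```

A “set language to [LANGUAGE]” block placed at the start of a script overrides whatever default was chosen this way.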

Examples:

  • set language to [Spanish (European)] → Speak in Spanish
  • set language to [French] → Speak in French
  • set language to [Japanese] → Speak in Japanese

Language Learning Projects:

  • Create projects that teach pronunciation in different languages.
  • Build multilingual vocabulary quizzes.
  • Make story translations with spoken audio.
  • Practice language listening comprehension.

Note on Single-Gender Languages:
Some languages (Arabic, Chinese, Hindi, Korean, Norwegian, Romanian, Swedish, Turkish, Welsh) only have a female voice available from the synthesis service. The extension automatically adjusts pitch for Tenor and Giant voices in these languages.
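To check in advance whether a voice and language combination falls under this rule, the note can be written as a small lookup. This Python sketch uses only the languages and voices named above; the helper name `needs_pitch_adjustment` is hypothetical, and the amount of pitch adjustment is not modelled because this page does not state it.

```python
# Languages this page lists as having only a female voice.
SINGLE_GENDER_LANGUAGES = {
    "Arabic", "Chinese (Mandarin)", "Hindi", "Korean", "Norwegian",
    "Romanian", "Swedish", "Turkish", "Welsh",
}
# Voices this page says are pitch-adjusted in those languages.
ADJUSTED_VOICES = {"tenor", "giant"}


def needs_pitch_adjustment(language: str, voice: str) -> bool:
    """Return True when the voice is pitch-adjusted for the given language."""
    return language in SINGLE_GENDER_LANGUAGES and voice.lower() in ADJUSTED_VOICES


if __name__ == "__main__":
    print(needs_pitch_adjustment("Korean", "giant"))
    print(needs_pitch_adjustment("French", "giant"))
```

If a character sounds unusual in one of these languages, switching that sprite to Alto or Squeak avoids the adjustment.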

 


🎓 Educational Uses #

  • Accessibility: Add voice narration for visually impaired users or non-readers.
  • Create read-aloud story projects – combine text display with speech.
  • Build language learning tools – teach pronunciation, vocabulary, phrases.
  • Make interactive tutorials with spoken instructions.
  • Develop games and quizzes that respond with spoken feedback.
  • Create multilingual projects for diverse classrooms.
  • Teach character voice and dialogue writing.
  • Build assistive technology projects for students with reading challenges.

🎮 Example Projects #

  • Talking Storybook: Characters narrate story text with different voices.
  • Language Tutor: Speak vocabulary words and phrases in different languages – combine with the Speech Recognition extension.
  • Voice Assistant: Scratch Cat acts as a helpful AI assistant answering questions.
  • Multilingual Greeter: Greet users in multiple languages based on selection.
  • Quiz Game with Spoken Questions: Voice reads questions aloud for accessibility.
  • Interactive Dialogue: Create conversations between multiple sprites with different voices.
  • Audio Notifications: Use voice to announce scores, achievements, timer updates.
  • Pronunciation Practice: Display word, speak it, ask user to repeat.
  • Character Conversations: Use Alto for one character, Tenor for another, Squeak for a third.
  • Comedy Sketch: Use Giant for the villain, Squeak for the hero, Kitten for the sidekick.

🧩 Try it yourself: pishi.ai/play

 


🔧 Tips & Troubleshooting #

 

🔊 Text to Speech Specific Tips #

  • No sound? Check your device volume and browser sound permissions. Make sure speakers/headphones are connected.
  • Speech not working? This extension requires an internet connection – the synthesis service is cloud-based.
  • Wrong language pronunciation? Make sure you’ve set the correct language before speaking. Language defaults to editor locale.
  • Text too long? Keep text to 128 characters or fewer – longer text is automatically truncated.
  • Voice not changing? Make sure to use “set voice to [VOICE]” before the “speak” block.
  • Kitten voice speaking gibberish? That’s normal! Kitten voice replaces all words with “meow” for character effect.
  • Speech sounds weird in some languages? Some languages only have female voices – Tenor/Giant may sound unusual due to pitch adjustment.
  • Want to stop speech early? Use the red stop button or “stop all” block – all speech stops immediately.
  • Multiple sprites talking? Set different voices for each sprite to create conversations.
  • Sync speech with animation? Use “speak” in sequence with costume changes, movement, or sound effects.
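  • Speech overlapping? Each “speak” block waits for completion within its own script, so overlap can happen only when several scripts or sprites speak in parallel. Place the “speak” blocks in sequence in one script, or use broadcasts so sprites take turns.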

💡 Creative Tips #

🎭 Voice Characterization #

  • Narrator: Use Alto or Tenor for clear, professional narration.
  • Children/Small Characters: Use Squeak for high-pitched, energetic speech.
  • Villains/Large Characters: Use Giant for deep, commanding presence.
  • Animals/Pets: Use Kitten for playful cat characters (all words become “meow”).
  • Robots/AI: Use Tenor or Alto with technical vocabulary.
  • Fantasy Creatures: Experiment with Squeak or Giant for unique character voices.

📖 Storytelling Techniques #

  • Combine “say” blocks with “speak” blocks to show and speak text simultaneously.
  • Use different voices for different story characters.
  • Add pauses between speech with “wait” blocks for dramatic effect.
  • Alternate between narrator voice (Alto/Tenor) and character voices (Squeak/Giant).
  • Use variables to store dialogue and speak dynamic responses.

🌍 Multilingual Projects #

  • Create a language switcher with buttons or variables.
  • Store translations in lists – speak the appropriate translation.
  • Build “Learn a Language” projects with word pronunciation.
  • Make international greeting projects – say “hello” in many languages.
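The “store translations in lists” idea can be sketched as a lookup keyed by language code. In Scratch this would be a pair of lists or a set of variables; the Python version below is only an illustration, and the names `GREETINGS` and `greeting_for` are hypothetical.

```python
# Greetings keyed by the language codes used on this page.
GREETINGS = {
    "en": "Hello",
    "es": "Hola",
    "fr": "Bonjour",
    "de": "Hallo",
    "it": "Ciao",
    "ja": "こんにちは",
}


def greeting_for(language_code: str, default_code: str = "en") -> str:
    """Return the greeting for a language, falling back to the default."""
    return GREETINGS.get(language_code, GREETINGS[default_code])


if __name__ == "__main__":
    for code in ("fr", "ja", "sv"):
        print(code, greeting_for(code))
```

In a project, the selected code would drive both “set language to [LANGUAGE]” and the text passed to “speak”, so the greeting is pronounced in the matching language.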

🔒 Privacy and Safety #

  • Text sent to speech synthesis is processed by Scratch’s cloud service (synthesis-service.scratch.mit.edu).
  • No text is permanently stored – it’s only used to generate audio.
  • Generated audio is streamed directly to your browser – not saved on servers.
  • Language and voice settings are saved in your project file locally.
  • Requires an internet connection to function – speech synthesis is not available offline.

🧪 Technical Info #

  • Synthesis Service: Scratch’s cloud-based text-to-speech service
  • Server URL: synthesis-service.scratch.mit.edu
  • Timeout: 10 seconds (requests exceeding timeout will fail)
  • Max Text Length: 128 characters (automatically truncated)
  • Audio Volume: 250% (boosted for clarity)
  • Speech Format: Audio buffer (decoded and played through Scratch audio engine)
  • Languages Supported: 23 languages with natural pronunciation
  • Voices: 5 distinct voices with playback rate adjustments
  • Single-Gender Languages: Arabic, Chinese, Hindi, Korean, Norwegian, Romanian, Swedish, Turkish, Welsh
  • Internet Required: Yes – cloud synthesis service

🆚 Voice Comparison Chart #

Voice | Pitch | Personality | Best For
Alto | Neutral | Professional, clear | Narrators, teachers, standard speech
Tenor | Neutral | Slightly deeper | Heroes, leaders, confident characters
Squeak | High (+3) | Playful, energetic | Small characters, fairies, children, robots
Giant | Low (-3) | Deep, serious | Large characters, villains, authority
Kitten | Very High (+6) | Ultra-playful | Cat characters, comedy (says “meow”)
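The pitch values in the chart can be related to playback speed with a short calculation. This sketch assumes that +3, -3, and +6 are semitone offsets and that each voice is produced by changing the playback rate, where a shift of n semitones corresponds to a rate of 2^(n/12). The page states only that voices use “playback rate adjustments”, so treat the units and the formula as assumptions.

```python
# Pitch offsets from the chart, ASSUMED to be in semitones.
VOICE_SEMITONES = {"alto": 0, "tenor": 0, "squeak": 3, "giant": -3, "kitten": 6}


def playback_rate(semitones: float) -> float:
    """Convert a pitch shift in semitones to a playback-rate multiplier."""
    # Twelve semitones make one octave, which doubles the rate.
    return 2 ** (semitones / 12)


if __name__ == "__main__":
    for voice, shift in VOICE_SEMITONES.items():
        print(f"{voice}: {playback_rate(shift):.2f}x")
```

Under these assumptions Squeak plays at roughly 1.19 times normal speed, Giant at about 0.84, and Kitten at about 1.41, which is why higher voices also sound slightly faster.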



🔗 Related Extensions #

  • 🎤 Speech Recognition – convert voice to text (perfect complement for voice interactions)
  • 🌐 Translate – translate text between languages before speaking
  • 💬 ChatGPT – generate dialogue with AI, then speak it with Text to Speech
  • 🎵 Music – combine speech with music and sound effects
