vocal-chat

Handles voice-to-voice conversations on WhatsApp.

Installation

npx clawhub@latest install vocal-chat

Documentation

Walkie-Talkie Mode

This skill automates the voice-to-voice loop on WhatsApp using local transcription and local TTS.

Workflow

  • Incoming Audio: When a user sends an audio file (audio/ogg with Opus):
    - Use tools/transcribe_voice.sh to get the text.
    - Process the text as a normal user prompt.

  • Outgoing Response:
    - Generate speech for the reply using bin/sherpa-onnx-tts.
    - Send the resulting .ogg file back to the user as a voice note, alongside the text reply (see Constraints).
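
The two workflow steps can be sketched as a single shell function. This is only an illustration under stated assumptions: it assumes tools/transcribe_voice.sh takes an audio path and prints the transcript to stdout (the skill does not document its interface), and generate_reply is a hypothetical stand-in for however the agent processes a prompt. The bin/sherpa-onnx-tts argument order (output file, then text) follows the Manual Execution example.

```shell
# Tool paths; overridable so the sketch can be tried with stand-in scripts.
TRANSCRIBE="${TRANSCRIBE:-tools/transcribe_voice.sh}"
TTS="${TTS:-bin/sherpa-onnx-tts}"

# Hypothetical placeholder for normal prompt handling: echoes the prompt back.
generate_reply() {
  printf 'You said: %s' "$1"
}

# voice_turn <incoming_audio> <outgoing_ogg>
# Transcribes the incoming audio, builds a reply, synthesizes it to the
# outgoing file, and prints the reply text so it can be sent as well.
voice_turn() {
  local in_audio="$1" out_audio="$2" text reply
  # Assumption: the transcription script prints the transcript on stdout.
  text="$("$TRANSCRIBE" "$in_audio")" || return 1
  reply="$(generate_reply "$text")"
  "$TTS" "$out_audio" "$reply" || return 1
  printf '%s\n' "$reply"
}
```

Printing the reply text while also writing the audio file keeps the function in line with the constraint that every reply carries both text and audio.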

Triggers

  • User sends an audio message.
  • User says "activa modo walkie-talkie" ("activate walkie-talkie mode") or "hablemos por voz" ("let's talk by voice").
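
A trigger check along these lines could decide whether to enter walkie-talkie mode. It is a minimal sketch, assuming the incoming message exposes a MIME type and a text body; the function name is_voice_trigger and its two arguments are hypothetical, and only the two phrases listed above are matched.

```shell
# is_voice_trigger <mime_type> <message_text>
# Returns 0 when the message should start or continue a voice exchange.
is_voice_trigger() {
  local mime="$1" text="$2"

  # Any audio attachment counts as a trigger.
  case "$mime" in
    audio/*) return 0 ;;
  esac

  # Lowercase the text so the phrase match ignores capitalization.
  text="$(printf '%s' "$text" | tr '[:upper:]' '[:lower:]')"
  case "$text" in
    *"activa modo walkie-talkie"*|*"hablemos por voz"*) return 0 ;;
  esac

  return 1
}
```

For example, `is_voice_trigger "text/plain" "Hablemos por voz"` succeeds, while a plain text message without either phrase does not.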

Constraints

  • Use local tools only (ffmpeg, whisper-cpp, sherpa-onnx-tts).
  • Maintain a fast response time: real-time factor (RTF, processing time divided by audio duration) below 0.5.
  • Always reply with BOTH text (for clarity) and audio.
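
The RTF constraint can be checked with a small helper. This sketch assumes RTF means processing time divided by audio duration and that both values are already measured in seconds; the function name rtf_ok is hypothetical, and the skill does not say whether the limit applies to transcription, synthesis, or both.

```shell
# rtf_ok <processing_seconds> <audio_seconds> [limit]
# Exit 0 if processing_seconds / audio_seconds is below the limit
# (default 0.5), 1 if it is not, 2 if the audio duration is not positive.
rtf_ok() {
  awk -v p="$1" -v a="$2" -v lim="${3:-0.5}" '
    BEGIN {
      if (a <= 0) exit 2
      exit !((p / a) < lim)
    }'
}
```

The audio duration could be read with `ffprobe -v error -show_entries format=duration -of default=noprint_wrappers=1:nokey=1 <file>`, and the processing time taken from timestamps recorded around the tool call.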

Manual Execution (Internal)

To respond with voice manually, generate the audio file:

bin/sherpa-onnx-tts /tmp/reply.ogg "Tu mensaje aquí"

Then send /tmp/reply.ogg with the message tool, passing the path as filePath.
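
Before sending, it can help to confirm that synthesis actually produced a file. The wrapper below is a sketch only: speak_to_file is a hypothetical name, and it assumes bin/sherpa-onnx-tts exits non-zero on failure, which the skill does not state.

```shell
# Path to the TTS tool; overridable for experimentation.
TTS="${TTS:-bin/sherpa-onnx-tts}"

# speak_to_file <output_ogg> <text>
# Synthesizes the text and prints the output path only when a non-empty
# audio file exists afterwards.
speak_to_file() {
  local out="${1:-/tmp/reply.ogg}" text="$2"
  if [ -z "$text" ]; then
    echo "speak_to_file: empty text" >&2
    return 1
  fi
  "$TTS" "$out" "$text" || return 1
  if [ ! -s "$out" ]; then
    echo "speak_to_file: no audio written to $out" >&2
    return 1
  fi
  printf '%s\n' "$out"
}
```

The printed path is the value to pass to the message tool as filePath.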