
phone-agent

Run a real-time AI phone agent using Twilio, Deepgram, OpenAI, and ElevenLabs.


Installation

npx clawhub@latest install phone-agent


Documentation

Phone Agent Skill

Runs a local FastAPI server that acts as a real-time voice bridge between a Twilio phone call and the Deepgram, OpenAI, and ElevenLabs APIs.

Architecture

Twilio (Phone) <--> WebSocket (Audio) <--> [Local Server] <--> Deepgram (STT)
                                                  |
                                                  +--> OpenAI (LLM)
                                                  +--> ElevenLabs (TTS)
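
One pass through the diagram above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in scripts/server.py: `transcribe`, `generate_reply`, `synthesize`, and `handle_turn` are hypothetical names, and the first three are stubs standing in for the Deepgram, OpenAI, and ElevenLabs calls so the example runs without API keys. The one detail taken from Twilio itself is the Media Streams message shape, where audio travels as base64 text in `media.payload`. A real bridge would likely stream frames to the STT service continuously rather than treat each frame as a full turn.

```python
import base64
import json
from typing import Optional


def decode_media_message(raw: str) -> Optional[bytes]:
    """Return audio bytes from a Twilio Media Streams "media" message, else None."""
    msg = json.loads(raw)
    if msg.get("event") != "media":
        return None  # other events, e.g. "connected", "start", "stop"
    return base64.b64decode(msg["media"]["payload"])


def encode_media_message(stream_sid: str, audio: bytes) -> str:
    """Wrap audio bytes in the JSON envelope used to send audio back to the call."""
    return json.dumps({
        "event": "media",
        "streamSid": stream_sid,
        "media": {"payload": base64.b64encode(audio).decode("ascii")},
    })


# Hypothetical stand-ins for the three external services (assumptions, not real APIs).
def transcribe(audio: bytes) -> str:          # Deepgram (STT) would go here
    return f"<{len(audio)} bytes of caller audio>"


def generate_reply(transcript: str) -> str:   # OpenAI (LLM) would go here
    return f"You said: {transcript}"


def synthesize(text: str) -> bytes:           # ElevenLabs (TTS) would go here
    return text.encode("utf-8")


def handle_turn(stream_sid: str, raw_message: str) -> Optional[str]:
    """Run one STT -> LLM -> TTS pass for a single inbound WebSocket message."""
    audio = decode_media_message(raw_message)
    if audio is None:
        return None  # nothing to answer for non-audio events
    reply_text = generate_reply(transcribe(audio))
    return encode_media_message(stream_sid, synthesize(reply_text))
```

Keeping the three service calls behind plain functions makes each one replaceable on its own, which matches the Customization options for the voice and the model.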

Prerequisites

  • Twilio Account: Phone number + TwiML App.

  • Deepgram API Key: For fast speech-to-text.

  • OpenAI API Key: For the conversation logic.

  • ElevenLabs API Key: For realistic text-to-speech.

  • Ngrok (or similar): To expose your local port 8080 to Twilio.

Setup

  • Install Dependencies:

        pip install -r scripts/requirements.txt

  • Set Environment Variables (in ~/.moltbot/.env, ~/.clawdbot/.env, or export):

        export DEEPGRAM_API_KEY="your_key"
        export OPENAI_API_KEY="your_key"
        export ELEVENLABS_API_KEY="your_key"
        export TWILIO_ACCOUNT_SID="your_sid"
        export TWILIO_AUTH_TOKEN="your_token"
        export PORT=8080

  • Start the Server:

        python3 scripts/server.py

  • Expose to Internet:

        ngrok http 8080

  • Configure Twilio:

    - Go to your Phone Number settings.
    - Set "Voice & Fax" -> "A Call Comes In" to Webhook.
    - URL:
    - Method: POST

Usage

Call your Twilio number. The agent should answer, transcribe your speech, think, and reply in a natural voice.

Customization

  • System Prompt: Edit SYSTEM_PROMPT in scripts/server.py to change the persona.

  • Voice: Change ELEVENLABS_VOICE_ID to use different voices.

  • Model: Switch gpt-4o-mini to gpt-4 for smarter (but slower) responses.
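
Before starting the server, it can help to confirm that the variables listed under Setup are present. The following is a minimal sketch and not part of the skill: `missing_vars` and `resolve_port` are hypothetical helper names, the fallback to port 8080 when PORT is unset is an assumption, and the sketch inspects only the mapping it is given, so it does not read ~/.moltbot/.env or ~/.clawdbot/.env.

```python
from typing import List, Mapping

# Variable names taken from the Setup section.
REQUIRED_VARS = (
    "DEEPGRAM_API_KEY",
    "OPENAI_API_KEY",
    "ELEVENLABS_API_KEY",
    "TWILIO_ACCOUNT_SID",
    "TWILIO_AUTH_TOKEN",
)


def missing_vars(env: Mapping[str, str]) -> List[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


def resolve_port(env: Mapping[str, str], default: int = 8080) -> int:
    """Return PORT as an integer, using the assumed default when it is unset."""
    return int(env.get("PORT", default))
```

Calling `missing_vars(os.environ)` (after `import os`) returns an empty list when every key is set, and otherwise names the keys to export before running python3 scripts/server.py.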