Voice Integration

Name: GoCodeMe
Author: GoSiteMe

Integrate Alfred's voice-first AI into your applications. Alfred Voice supports phone calls, web widgets, webhooks, and multiple voice engines.

Architecture Overview

Alfred Voice uses a real-time voice transport layer built into the platform. Calls come in via phone or web widget, get processed through the Alfred Voice API, and route to Alfred's tool engine for execution.

Voice Call Flow

User (Phone/Web) → Voice Transport → Alfred Webhook → Tool Engine (1,220+) → Voice Response

STT → Alfred Processing → TTS, with real-time tool execution in the middle

Key Components

Alfred Voice API — Real-time voice transport and speech-to-text/text-to-speech pipeline
Webhook Server — Your endpoint that receives voice events and returns tool results
Tool Engine — Alfred's 1,220+ tools, executable via voice or API
Voice Engines — Kokoro, Orpheus, Cartesia Sonic, ElevenLabs for TTS output
Conference Rooms — Multi-agent voice collaboration spaces
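
One way to picture the call flow described above is as a three-stage pipeline that runs once per caller turn. The sketch below is a toy model only: `handle_turn` and the three stub stages are hypothetical stand-ins, not Alfred APIs, and a real voice transport would stream audio rather than pass whole buffers.

```python
from typing import Callable


def handle_turn(
    audio: bytes,
    transcribe: Callable[[bytes], str],
    respond: Callable[[str], str],
    synthesize: Callable[[str], bytes],
) -> bytes:
    """Run one caller turn through STT -> processing -> TTS."""
    text = transcribe(audio)   # speech-to-text
    reply = respond(text)      # Alfred processing; tool calls would happen here
    return synthesize(reply)   # text-to-speech


# Stub stages so the sketch runs without any audio stack (all hypothetical).
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_alfred(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


if __name__ == "__main__":
    print(handle_turn(b"hello", fake_stt, fake_alfred, fake_tts))
```

Passing the stages in as callables keeps the sketch independent of any particular STT or TTS engine.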

Voice API Setup

The Alfred Voice API handles the real-time voice connection between users and Alfred. You'll need to configure a voice assistant and point it to your Alfred webhook.

Step 1: Get Your Voice API Key

Navigate to your Developer Portal and generate a Voice API key from the dashboard.

Step 2: Create an Assistant

cURL

curl -X POST https://gositeme.com/api/voice-assistant \
  -H "Authorization: Bearer your_voice_api_key" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Alfred AI",
    "model": {
      "provider": "custom-llm",
      "url": "https://gositeme.com/api/vapi-webhook.php",
      "model": "alfred-v2"
    },
    "voice": {
      "provider": "cartesia",
      "voiceId": "sonic-english-male-1"
    },
    "firstMessage": "Hello! I am Alfred, your AI assistant. How can I help you today?",
    "serverUrl": "https://gositeme.com/api/vapi-webhook.php"
  }'
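
If you create assistants from code rather than the shell, the same request body can be assembled programmatically. The sketch below only builds and prints the JSON; it makes no network call. The helper name `build_assistant_payload` is hypothetical, and the field names are copied from the cURL example above rather than from a published schema.

```python
import json


def build_assistant_payload(
    name: str,
    webhook_url: str,
    voice_id: str,
    first_message: str,
) -> dict:
    """Assemble the assistant body using the field names from the cURL example."""
    return {
        "name": name,
        "model": {
            "provider": "custom-llm",
            "url": webhook_url,      # the custom LLM endpoint
            "model": "alfred-v2",
        },
        "voice": {"provider": "cartesia", "voiceId": voice_id},
        "firstMessage": first_message,
        "serverUrl": webhook_url,    # voice events go to the same webhook
    }


payload = build_assistant_payload(
    name="Alfred AI",
    webhook_url="https://gositeme.com/api/vapi-webhook.php",
    voice_id="sonic-english-male-1",
    first_message="Hello! I am Alfred, your AI assistant. How can I help you today?",
)
print(json.dumps(payload, indent=2))
```

Using a single `webhook_url` argument for both `model.url` and `serverUrl` mirrors the example, where both point at the same endpoint.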

Step 3: Configure Server URL

Point your voice assistant's server URL to Alfred's webhook endpoint:

POST /api/vapi-webhook.php

This endpoint receives all voice events: call start, speech input, tool calls, and call end.

Managed Setup: If you're using Alfred's hosted service, the Voice API is pre-configured. You only need manual setup when self-hosting or customizing the voice pipeline.

Webhook Configuration

Alfred exposes multiple webhook endpoints for voice event handling:

Endpoint	Purpose
`/api/vapi-webhook.php`	Main voice event handler (call events, messages, tool calls)
`/api/vapi-callback.php`	Callback handler for async tool results
`/api/vapi-tools.php`	Tool registration endpoint for voice function calling
`/api/vapi-auth.php`	Voice session authentication
`/api/vapi-outbound.php`	Outbound call initiation

Webhook Event Types

The main webhook receives events in this format:

JSON

// Call started
{
  "message": {
    "type": "assistant-request",
    "call": {
      "id": "call_abc123",
      "phoneNumber": "+18334674836",
      "customer": { "number": "+15145551234" }
    }
  }
}

// Tool call request
{
  "message": {
    "type": "tool-calls",
    "toolCalls": [
      {
        "id": "tc_xyz",
        "type": "function",
        "function": {
          "name": "weather_lookup",
          "arguments": "{\"location\":\"Montreal\"}"
        }
      }
    ]
  }
}

// Call ended
{
  "message": {
    "type": "end-of-call-report",
    "call": { "id": "call_abc123" },
    "summary": "User asked about weather in Montreal",
    "duration": 45
  }
}
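
Note that in the `tool-calls` event, `function.arguments` is itself a JSON-encoded string, so it needs a second decode before use. The sketch below decodes each call, runs it against a registry, and returns the `results` shape used by the PHP handler in this document. `run_tool_calls` and the registry are hypothetical names for illustration, and wrapping failures in an `error` object is an assumption, not documented Alfred behavior.

```python
import json
from typing import Any, Callable

# Hypothetical registry type: tool name -> Python callable.
ToolRegistry = dict[str, Callable[..., Any]]


def run_tool_calls(message: dict, tools: ToolRegistry) -> dict:
    """Execute every tool call in a 'tool-calls' message and collect results."""
    results = []
    for call in message.get("toolCalls", []):
        fn = call.get("function", {})
        name = fn.get("name", "")
        try:
            # 'arguments' arrives as a JSON string, not a nested object.
            args = json.loads(fn.get("arguments") or "{}")
            if name not in tools:
                raise ValueError(f"unknown tool: {name}")
            output = tools[name](**args)
        except Exception as exc:
            # Assumption: report the failure to the caller instead of crashing.
            output = {"error": str(exc)}
        results.append({"toolCallId": call.get("id"), "result": json.dumps(output)})
    return {"results": results}


if __name__ == "__main__":
    event = {
        "message": {
            "type": "tool-calls",
            "toolCalls": [{
                "id": "tc_xyz",
                "type": "function",
                "function": {
                    "name": "weather_lookup",
                    "arguments": "{\"location\":\"Montreal\"}",
                },
            }],
        }
    }
    stub_tools = {"weather_lookup": lambda location: {"location": location}}
    print(run_tool_calls(event["message"], stub_tools))
```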

Handling Webhook Events (PHP)

PHP

<?php
// vapi-webhook.php - Handle incoming voice events
// executeAlfredTool() and logCall() are assumed to be defined elsewhere in your application.
header('Content-Type: application/json');

$payload = json_decode(file_get_contents('php://input'), true);
if (!is_array($payload)) {
    // Reject bodies that are not valid JSON
    http_response_code(400);
    echo json_encode(['error' => 'Invalid JSON payload']);
    exit;
}
$type = $payload['message']['type'] ?? '';

switch ($type) {
    case 'assistant-request':
        // New call - return assistant configuration
        echo json_encode([
            'assistant' => [
                'firstMessage' => 'Hello! I am Alfred. How can I help?',
                'model' => [
                    'provider' => 'openai',
                    'model' => 'gpt-4',
                    'systemMessage' => 'You are Alfred, an AI assistant with 1,220+ tools...'
                ],
                'voice' => ['provider' => 'cartesia', 'voiceId' => 'sonic-english-male-1']
            ]
        ]);
        break;

    case 'tool-calls':
        // Execute tool calls; 'arguments' is a JSON-encoded string
        $results = [];
        foreach ($payload['message']['toolCalls'] ?? [] as $tc) {
            $args = json_decode($tc['function']['arguments'] ?? '{}', true) ?? [];
            $result = executeAlfredTool($tc['function']['name'], $args);
            $results[] = [
                'toolCallId' => $tc['id'],
                'result' => json_encode($result)
            ];
        }
        echo json_encode(['results' => $results]);
        break;

    case 'end-of-call-report':
        // Log call summary
        logCall($payload['message']);
        echo json_encode(['success' => true]);
        break;

    default:
        // Acknowledge event types this handler does not process
        echo json_encode(['success' => true]);
        break;
}
?>

Phone Number Provisioning

Alfred's primary phone number is 1-833-GOSITEME (1-833-467-4836). For custom phone numbers, provision through the Voice API or the voice management endpoint:

Provision a Number

POST /api/voice-manage.php

JavaScript

const number = await fetch('https://gositeme.com/api/voice-manage.php', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer sess_abc123...'
  },
  body: JSON.stringify({
    action: 'provision_number',
    area_code: '514',
    country: 'CA',
    agent_id: 'agent_xyz'
  })
}).then(r => r.json());

console.log(number);
// { phone_number: "+15141234567", status: "active", agent: "agent_xyz" }

List Your Numbers

cURL

curl -H "Authorization: Bearer sess_abc123..." \
  "https://gositeme.com/api/voice-manage.php?action=list_numbers"

# {
#   "numbers": [
#     { "number": "+15141234567", "agent": "agent_xyz", "calls_today": 23, "status": "active" }
#   ]
# }
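
The API examples in this section use E.164 numbers (for example `+15141234567`), while Alfred's primary line is advertised as the vanity number 1-833-GOSITEME. If you need to store or compare such numbers, a small helper can normalize them. This is a standalone sketch, not part of the Alfred API: `vanity_to_e164` is a hypothetical name, and it assumes North American numbers, where letters past the tenth national digit are ignored when dialing.

```python
# Standard telephone keypad letter groups.
KEYPAD = {
    "ABC": "2", "DEF": "3", "GHI": "4", "JKL": "5",
    "MNO": "6", "PQRS": "7", "TUV": "8", "WXYZ": "9",
}
LETTER_TO_DIGIT = {
    letter: digit for letters, digit in KEYPAD.items() for letter in letters
}


def vanity_to_e164(number: str) -> str:
    """Convert a North American vanity number to E.164 (+1XXXXXXXXXX)."""
    digits = "".join(
        LETTER_TO_DIGIT.get(ch, ch) for ch in number.upper() if ch.isalnum()
    )
    if not digits.isdigit():
        raise ValueError(f"unsupported characters in {number!r}")
    if not digits.startswith("1"):
        digits = "1" + digits          # add the country code if it is missing
    digits = digits[:11]               # drop surplus vanity letters
    if len(digits) != 11:
        raise ValueError(f"too few digits in {number!r}")
    return "+" + digits


if __name__ == "__main__":
    print(vanity_to_e164("1-833-GOSITEME"))
```

Applied to 1-833-GOSITEME, the helper yields the same number that appears as `phoneNumber` in the webhook event example.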

Initiate Outbound Call

POST /api/vapi-outbound.php

Python

import requests

response = requests.post('https://gositeme.com/api/vapi-outbound.php',
    headers={'Authorization': 'Bearer sess_abc123...'},
    json={
        'action': 'call',
        'to': '+15145551234',
        'from_number': '+15141234567',
        'agent_id': 'agent_xyz',
        'context': 'Follow up on support ticket #1234'
    },
    timeout=30,  # Avoid hanging indefinitely on network problems
)
response.raise_for_status()  # Raise on 4xx/5xx before parsing the body
call = response.json()

print(f"Call initiated: {call['call_id']}")

Compliance: Outbound calls must comply with TCPA/CRTC regulations. Ensure you have consent before initiating automated calls. Enterprise plan required for outbound calling.
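
One way to make the consent requirement hard to bypass is to gate payload construction on it. The sketch below builds the outbound request body only when consent is recorded and both numbers look like E.164. `build_outbound_call` and the `has_consent` flag are hypothetical; how consent is stored and verified is up to your application, and this check is not a substitute for legal review.

```python
import re

# E.164: a plus sign, then up to 15 digits, the first of which is not zero.
E164 = re.compile(r"\+[1-9]\d{1,14}")


def build_outbound_call(
    to: str,
    from_number: str,
    agent_id: str,
    context: str,
    has_consent: bool,
) -> dict:
    """Return the request body for /api/vapi-outbound.php, or raise."""
    if not has_consent:
        raise PermissionError("no recorded consent for this recipient")
    for label, number in (("to", to), ("from_number", from_number)):
        if not E164.fullmatch(number):
            raise ValueError(f"{label} is not an E.164 number: {number!r}")
    return {
        "action": "call",
        "to": to,
        "from_number": from_number,
        "agent_id": agent_id,
        "context": context,
    }


if __name__ == "__main__":
    print(build_outbound_call(
        "+15145551234", "+15141234567", "agent_xyz",
        "Follow up on support ticket #1234", has_consent=True,
    ))
```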

Voice Agent Configuration

Voice agents are AI personalities that interact with callers. Configure their personality, tools, knowledge base, and voice settings.

Create a Voice Agent

JavaScript

const agent = await fetch('https://gositeme.com/api/fleet.php', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer sess_abc123...'
  },
  body: JSON.stringify({
    action: 'create_agent',
    name: 'Receptionist',
    personality: 'Professional, warm receptionist for a law firm. Always ask for caller name and purpose.',
    tools: ['schedule_appointment', 'lookup_client', 'transfer_call', 'take_message'],
    knowledge_base: 'kb_firm_info',
    voice_enabled: true,
    voice_engine: 'cartesia',
    voice_config: {
      voice_id: 'sonic-english-female-1',
      speed: 1.0,
      emotion: 'friendly'
    },
    call_settings: {
      max_duration: 600,
      silence_timeout: 10,
      greeting: 'Thank you for calling Smith & Associates. How may I direct your call?'
    }
  })
}).then(r => r.json());

Agent Configuration Reference

Field	Type	Description
`personality`	string	System prompt defining agent behavior and tone
`tools`	array	Tool names the agent can invoke during calls
`knowledge_base`	string	Knowledge base ID for RAG-powered responses
`voice_engine`	string	TTS engine: `kokoro`, `orpheus`, `cartesia`, `elevenlabs`
`voice_config.voice_id`	string	Specific voice ID from chosen engine
`voice_config.speed`	float	Speech speed multiplier (0.5–2.0, default: 1.0)
`call_settings.max_duration`	integer	Max call duration in seconds (default: 600)
`call_settings.silence_timeout`	integer	Hang up after N seconds of silence (default: 10)
`call_settings.greeting`	string	First message spoken when call connects
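
The ranges and defaults in the reference table can be checked client-side before a request is sent. The following is a minimal sketch under that assumption: `normalize_agent_config` is a hypothetical helper, and the server may apply different or additional validation.

```python
VOICE_ENGINES = {"kokoro", "orpheus", "cartesia", "elevenlabs"}
CALL_DEFAULTS = {"max_duration": 600, "silence_timeout": 10}


def normalize_agent_config(config: dict) -> dict:
    """Validate engine and speed, and fill in the documented defaults."""
    engine = config.get("voice_engine")
    if engine not in VOICE_ENGINES:
        raise ValueError(f"unknown voice_engine: {engine!r}")

    voice_config = dict(config.get("voice_config", {}))
    speed = voice_config.setdefault("speed", 1.0)   # documented default
    if not 0.5 <= speed <= 2.0:
        raise ValueError(f"voice_config.speed out of range: {speed}")

    # Caller-supplied call settings override the documented defaults.
    call_settings = {**CALL_DEFAULTS, **config.get("call_settings", {})}
    return {**config, "voice_config": voice_config, "call_settings": call_settings}


if __name__ == "__main__":
    print(normalize_agent_config({
        "voice_engine": "cartesia",
        "call_settings": {"max_duration": 120},
    }))
```

The function returns a new dictionary and leaves its input untouched, so it is safe to call on shared configuration objects.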

Supported Voice Engines

Alfred supports multiple text-to-speech engines. Choose based on your needs for latency, quality, and expressiveness.

Kokoro (Fastest): Ultra-fast, open-source TTS. Great for real-time interactions with minimal latency.

Orpheus (Expressive): High-quality open-source voice with natural prosody and emotional range.

Cartesia Sonic (Best Balance): Low-latency commercial TTS with excellent voice cloning and consistent quality.

ElevenLabs (Premium): Premium voice synthesis with the most natural-sounding output. Voice cloning available.

Engine Comparison

Engine	Latency	Quality	Languages	Plan Required
`kokoro`	~80ms	Good	EN, FR, ES, DE, JA	All plans
`orpheus`	~150ms	Very Good	EN, FR, ES	All plans
`cartesia`	~120ms	Excellent	EN, FR, ES, DE, PT, JA, KO	Professional+
`elevenlabs`	~200ms	Premium	29 languages	Enterprise
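
The comparison table can be turned into a simple selection rule: among the engines your plan allows and that list your language, take the one with the lowest latency. The sketch below encodes the table as data. The plan name `starter` is a hypothetical placeholder for any plan below Professional, and because the table does not enumerate ElevenLabs' 29 languages, this sketch skips that engine rather than guess.

```python
from typing import Optional

# Hypothetical plan ordering; only "professional" and "enterprise" appear in the table.
PLAN_RANK = {"starter": 0, "professional": 1, "enterprise": 2}

# (engine, approximate latency in ms, languages, minimum plan)
ENGINES = [
    ("kokoro", 80, {"EN", "FR", "ES", "DE", "JA"}, "starter"),
    ("orpheus", 150, {"EN", "FR", "ES"}, "starter"),
    ("cartesia", 120, {"EN", "FR", "ES", "DE", "PT", "JA", "KO"}, "professional"),
    ("elevenlabs", 200, None, "enterprise"),  # language list not enumerated
]


def pick_engine(plan: str, language: str) -> Optional[str]:
    """Return the lowest-latency engine for this plan and language, if any."""
    candidates = [
        (latency, name)
        for name, latency, languages, min_plan in ENGINES
        if PLAN_RANK[plan] >= PLAN_RANK[min_plan]
        and languages is not None
        and language.upper() in languages
    ]
    return min(candidates)[1] if candidates else None


if __name__ == "__main__":
    print(pick_engine("professional", "KO"))
```

A return value of `None` means no engine with an enumerated language list matched; on the Enterprise plan, ElevenLabs may still cover the language.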

Setting a Voice Engine

JavaScript

// Update agent voice engine
await fetch('https://gositeme.com/api/fleet.php', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer sess_abc123...'
  },
  body: JSON.stringify({
    action: 'update_agent',
    agent_id: 'agent_xyz',
    voice_engine: 'cartesia',
    voice_config: {
      voice_id: 'sonic-english-female-1',
      speed: 1.1,
      emotion: 'professional'
    }
  })
});

Python

import requests

requests.post('https://gositeme.com/api/fleet.php',
    headers={'Authorization': 'Bearer sess_abc123...'},
    json={
        'action': 'update_agent',
        'agent_id': 'agent_xyz',
        'voice_engine': 'elevenlabs',
        'voice_config': {
            'voice_id': 'rachel',
            'stability': 0.7,
            'similarity_boost': 0.8
        }
    }
)

Web Voice Widget

Embed Alfred's voice widget in your website for browser-based voice interaction. The widget handles microphone access, speech-to-text, and real-time responses.

Quick Embed

HTML

<!-- Add to your page's <body> -->
<script src="https://gositeme.com/assets/js/alfred-voice-widget.js"></script>
<script>
  AlfredVoice.init({
    token: 'sess_abc123...',       // Your session token
    position: 'bottom-right',       // bottom-right, bottom-left, top-right, top-left
    theme: 'dark',                  // dark or light
    accent: '#6c5ce7',              // Widget accent color
    greeting: 'Hi! How can I help you today?',
    agent_id: 'agent_xyz',          // Optional: use specific agent
    voice_engine: 'kokoro',         // TTS engine
    tools: ['weather_lookup', 'dns_lookup', 'summarize_text'],  // Allowed tools
    onReady: () => console.log('Widget loaded'),
    onMessage: (msg) => console.log('Alfred said:', msg.text),
    onError: (err) => console.error('Voice error:', err)
  });
</script>

Widget API

JavaScript

// Programmatic control
AlfredVoice.open();              // Open the widget
AlfredVoice.close();             // Close the widget
AlfredVoice.startListening();    // Start microphone
AlfredVoice.stopListening();     // Stop microphone
AlfredVoice.speak('Hello!');     // Text-to-speech
AlfredVoice.sendText('What is the weather in Montreal?');  // Send text input
AlfredVoice.destroy();           // Remove widget from DOM

// Event listeners
AlfredVoice.on('callStart', (data) => { /* call connected */ });
AlfredVoice.on('callEnd', (data) => { /* call ended */ });
AlfredVoice.on('transcript', (text) => { /* user speech recognized */ });
AlfredVoice.on('toolCall', (tool) => { /* tool being executed */ });

Custom Styling

CSS

/* Override widget styles */
.alfred-voice-widget {
    --av-bg: #12121e;
    --av-accent: #6c5ce7;
    --av-text: #e8e8f0;
    --av-radius: 16px;
    --av-shadow: 0 8px 32px rgba(0,0,0,0.4);
}

.alfred-voice-btn {
    width: 64px;
    height: 64px;
    border-radius: 50%;
}

.alfred-voice-panel {
    width: 380px;
    max-height: 500px;
}

Conference Room API

Conference rooms allow multiple AI agents and human participants to collaborate in real-time voice sessions. Useful for multi-agent workflows, team meetings with AI support, and complex problem-solving.

Create a Room

POST /api/fleet.php

JavaScript

const room = await fetch('https://gositeme.com/api/fleet.php', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer sess_abc123...'
  },
  body: JSON.stringify({
    action: 'create_room',
    name: 'Strategy Session',
    agents: ['agent_analyst', 'agent_researcher', 'agent_writer'],
    max_participants: 10,
    recording: true,
    auto_transcribe: true
  })
}).then(r => r.json());

console.log(room);
// {
//   id: "room_abc",
//   name: "Strategy Session",
//   join_url: "https://gositeme.com/conference-room.php?id=room_abc",
//   agents: 3,
//   recording: true
// }

Join a Room

JavaScript

// Via web browser
window.location.href = room.join_url;

// Via API (add participant)
await fetch('https://gositeme.com/api/fleet.php', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer sess_abc123...'
  },
  body: JSON.stringify({
    action: 'join_room',
    room_id: 'room_abc',
    participant: {
      name: 'John Doe',
      role: 'moderator'
    }
  })
});

Room Management

cURL

# Get room status
curl -H "Authorization: Bearer sess_abc123..." \
  "https://gositeme.com/api/fleet.php?action=room_status&room_id=room_abc"

# End room and get transcript
curl -X POST https://gositeme.com/api/fleet.php \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sess_abc123..." \
  -d '{"action":"end_room","room_id":"room_abc"}'

# Response includes transcript and recording URL
# {
#   "transcript": "...",
#   "recording_url": "https://gositeme.com/recordings/room_abc.mp3",
#   "duration": 1847,
#   "participants": 5,
#   "tools_used": 12
# }
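
For logging or notifications, the end-of-room response can be condensed into a one-line summary. This sketch assumes `duration` is expressed in seconds, which matches `call_settings.max_duration` but is not stated for this response; `format_duration` and `summarize_room` are hypothetical helper names.

```python
def format_duration(seconds: int) -> str:
    """Render a duration in seconds as 'Mm SSs' or 'Hh MMm SSs'."""
    minutes, secs = divmod(seconds, 60)
    hours, minutes = divmod(minutes, 60)
    if hours:
        return f"{hours}h {minutes:02d}m {secs:02d}s"
    return f"{minutes}m {secs:02d}s"


def summarize_room(report: dict) -> str:
    """Build a short summary from the end_room response fields."""
    return (
        f"{report['participants']} participants, "
        f"{report['tools_used']} tools used, "
        f"{format_duration(report['duration'])}"
    )


if __name__ == "__main__":
    print(summarize_room({"duration": 1847, "participants": 5, "tools_used": 12}))
```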

Try It Live: Visit gositeme.com/conference-room.php to create a conference room with multiple AI agents right now.

Need Help? For voice integration support, contact support@gositeme.com or call 1-833-GOSITEME.
