# Audio Generation

ShortGenius can turn any piece of text into a narrated audio file using its text-to-speech (TTS) engine. Choose from various **voices** and locales to match your brand or project needs. This section covers creating speech, listing and retrieving audio, and browsing the available voices.

***

## Creating Speech

**Endpoint**: `POST /audio/speech`

Use this endpoint to generate a new audio file from text. By default the request returns immediately with a pending record; set `wait_for_generation` to `true` to have it wait until the audio is fully generated.

{% tabs %}
{% tab title="TypeScript" %}

```typescript
import { ShortGenius } from 'shortgenius'

const client = new ShortGenius({
  bearerAuth: 'YOUR_API_TOKEN'
})

// Get a voice first
const voices = await client.getVoices({ locale: 'en-US' })
const voice = voices[0]

// Create speech
const audio = await client.createSpeech({
  text: 'Hello from ShortGenius!',
  locale: 'en-US',
  voiceId: voice.id,
  waitForGeneration: true
})

console.log(`Audio created: ${audio.id}`)
console.log(`URL: ${audio.url}`)
console.log(`Duration: ${audio.duration}s`)
```

{% endtab %}

{% tab title="Python" %}

```python
from shortgenius import Shortgenius

client = Shortgenius(api_key="YOUR_API_TOKEN")

# Get a voice first
voices = client.audio.voices.list_voices(locale="en-US")
voice = voices[0]

# Create speech
audio = client.audio.create_speech(
    text="Hello from ShortGenius!",
    locale="en-US",
    voice_id=voice.id,
    wait_for_generation=True
)

print(f"Audio created: {audio.id}")
print(f"URL: {audio.url}")
print(f"Duration: {audio.duration}s")
```

{% endtab %}

{% tab title="cURL" %}

```bash
curl --request POST \
  --url "https://api.shortgenius.com/v1/audio/speech" \
  --header "Authorization: Bearer YOUR_API_TOKEN" \
  --header "Content-Type: application/json" \
  --data '{
    "text": "Hello from ShortGenius!",
    "locale": "en-US",
    "voice_id": "9BWtsMINqrJLrRacOk9x",
    "wait_for_generation": true
  }'
```

{% endtab %}
{% endtabs %}

### Request Fields

| Field                 | Type    | Required | Description                                                                                                                       |
| --------------------- | ------- | -------- | --------------------------------------------------------------------------------------------------------------------------------- |
| `text`                | string  | Yes      | The text to be converted to speech.                                                                                               |
| `voice_id`            | string  | Yes      | ID of the chosen voice. See [List Voices](#list-voices) to find valid `voice_id` values.                                          |
| `locale`              | string  | No       | Defaults to `"auto"`. Use a two-letter language code + region code if you want to specify a locale (e.g., `en-US`, `de-DE`).      |
| `wait_for_generation` | boolean | No       | If **false**, the response immediately returns a pending record. If **true**, it waits until the audio is ready (default: false). |
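
As an illustration of the fields above, here is a minimal sketch of a helper that assembles the request body before sending it. The helper name `build_speech_payload` and the client-side checks (rejecting empty text, the `xx-XX` locale pattern) are assumptions made for this example; the API itself is the authority on what it accepts.

```python
import re

# Assumed locale shape: two-letter language + two-letter region, e.g. "en-US".
LOCALE_PATTERN = re.compile(r"^[a-z]{2}-[A-Z]{2}$")


def build_speech_payload(text, voice_id, locale="auto", wait_for_generation=False):
    """Build a JSON-serializable body for POST /audio/speech (hypothetical helper)."""
    if not text or not text.strip():
        raise ValueError("text is required and must not be empty")
    if not voice_id:
        raise ValueError("voice_id is required")
    if locale != "auto" and not LOCALE_PATTERN.match(locale):
        raise ValueError(f"locale should be 'auto' or look like 'en-US', got {locale!r}")
    return {
        "text": text,
        "voice_id": voice_id,
        "locale": locale,  # documented default is "auto"
        "wait_for_generation": bool(wait_for_generation),  # documented default is false
    }


payload = build_speech_payload(
    "Hello from ShortGenius!", "9BWtsMINqrJLrRacOk9x", locale="en-US"
)
print(payload)
```

Failing fast on obviously malformed input keeps a bad request from ever reaching the network, but the checks can be loosened if the API accepts locale forms this pattern rejects.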

### Sample Response (Synchronous)

If `wait_for_generation` is **true** and generation completes quickly, you'll receive:

```json
{
  "id": "3804fef4-5329-42b8-8a5b-a12eb5c3dc2c",
  "created_at": "2025-05-05T14:00:00Z",
  "updated_at": null,
  "url": "https://cdn.shortgenius.com/audio/3804fef4.mp3",
  "user_id": "8f157306-139a-4f38-b783-e13e326ecaaa",
  "transcript": {
    "words": [
      {
        "text": "Hello",
        "start": 0.4,
        "end": 0.7,
        "confidence": 0.99
      },
      {
        "text": "from",
        "start": 0.75,
        "end": 1.0,
        "confidence": 0.98
      },
      ...
    ]
  },
  "state": "completed",
  "text": "Hello from ShortGenius!",
  "locale": "en-US",
  "voice": {
    "id": "769d93d4-3c7f-47c0-9a9c-5db259e67b95",
    "name": "Samantha",
    "description": null,
    "avatar_url": null,
    "flag_url": null,
    "tags": null,
    "preview_url": null,
    "locale": "en-US",
    "source": "ElevenLabs"
  },
  "duration": 1.2,
  "lufs": -14.3
}
```

If `wait_for_generation` is **false**, the response may report `"state": "pending"` or `"state": "generating"`. In that case, poll the [Retrieve a Single Audio](#retrieve-a-single-audio) endpoint until the state is `"completed"`.
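
The polling described above can be sketched as a small loop. This is an illustrative helper, not part of the SDK: the name `wait_for_audio`, the interval and timeout defaults, and the decision to raise on any state other than `pending`, `generating`, or `completed` are all assumptions. The fetch function is passed in so the loop can be exercised without network access; the stand-in below simulates three successive responses.

```python
import time
from types import SimpleNamespace


def wait_for_audio(fetch_audio, audio_id, interval=2.0, timeout=120.0, sleep=time.sleep):
    """Poll until the audio record reports state "completed" (hypothetical helper).

    fetch_audio: callable that takes an audio ID and returns a record
    with a `state` attribute.
    """
    deadline = time.monotonic() + timeout
    while True:
        audio = fetch_audio(audio_id)
        if audio.state == "completed":
            return audio
        if audio.state not in ("pending", "generating"):
            # Assumption: any other state is unexpected, so stop instead of looping.
            raise RuntimeError(f"Audio {audio_id} entered unexpected state {audio.state!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Audio {audio_id} still {audio.state!r} after {timeout}s")
        sleep(interval)


# Stand-in for the real API call: returns pending, generating, then completed.
_states = iter(["pending", "generating", "completed"])


def fake_fetch(audio_id):
    return SimpleNamespace(id=audio_id, state=next(_states))


audio = wait_for_audio(fake_fetch, "3804fef4-5329-42b8-8a5b-a12eb5c3dc2c", sleep=lambda s: None)
print(audio.state)
```

With the Python SDK used elsewhere on this page, `fetch_audio` would be `client.audio.retrieve_audio`.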

***

## Listing & Retrieving Audio

### List Audio

**Endpoint**: `GET /audio`

| Query Param | Default | Description                      |
| ----------- | ------- | -------------------------------- |
| `page`      | 0       | Results page number (zero-based) |
| `limit`     | 50      | Items per page, up to 200        |

{% tabs %}
{% tab title="cURL" %}

```bash
curl --request GET \
  --url "https://api.shortgenius.com/v1/audio?page=0&limit=10" \
  --header "Authorization: Bearer YOUR_API_TOKEN"
```

{% endtab %}

{% tab title="TypeScript" %}

```typescript
const response = await client.getAllAudio({
  page: 0,
  limit: 10
})

console.log(`Found ${response.audio.length} audio files`)
console.log(`Has more: ${response.hasMore}`)

for (const audio of response.audio) {
  console.log(`- ${audio.text} (${audio.id})`)
  console.log(`  Voice: ${audio.voice.name}`)
  console.log(`  State: ${audio.state}`)
}
```

{% endtab %}

{% tab title="Python" %}

```python
response = client.audio.list_audio(page=0, limit=10)

print(f"Found {len(response.audio)} audio files")
print(f"Has more: {response.has_more}")

for audio in response.audio:
    print(f"- {audio.text} ({audio.id})")
    print(f"  Voice: {audio.voice.name}")
    print(f"  State: {audio.state}")
```

{% endtab %}
{% endtabs %}
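
To walk through every page rather than a single one, the `has_more` flag from the response can drive a loop. The sketch below assumes a hypothetical helper named `iter_all_audio` and takes the page-listing function as a parameter, so it runs here against an in-memory stand-in instead of the live API.

```python
from types import SimpleNamespace


def iter_all_audio(list_page, limit=50):
    """Yield every audio record, requesting pages until `has_more` is false."""
    page = 0  # pages are zero-based
    while True:
        response = list_page(page=page, limit=limit)
        yield from response.audio
        # Also stop on an empty page so an inconsistent `has_more` cannot loop forever.
        if not response.has_more or not response.audio:
            break
        page += 1


# In-memory stand-in for the list endpoint, holding five fake records.
_records = [SimpleNamespace(id=f"audio-{i}") for i in range(5)]


def fake_list_page(page, limit):
    start = page * limit
    chunk = _records[start:start + limit]
    return SimpleNamespace(audio=chunk, has_more=start + limit < len(_records))


ids = [audio.id for audio in iter_all_audio(fake_list_page, limit=2)]
print(ids)
```

With the Python SDK shown above, the call would look like `iter_all_audio(client.audio.list_audio, limit=200)`, where 200 is the documented maximum page size.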

### Retrieve a Single Audio

**Endpoint**: `GET /audio/{id}`

{% tabs %}
{% tab title="cURL" %}

```bash
curl --request GET \
  --url "https://api.shortgenius.com/v1/audio/3804fef4-5329-42b8-8a5b-a12eb5c3dc2c" \
  --header "Authorization: Bearer YOUR_API_TOKEN"
```

{% endtab %}

{% tab title="TypeScript" %}

```typescript
const audio = await client.getAudio('3804fef4-5329-42b8-8a5b-a12eb5c3dc2c')

console.log(`Text: ${audio.text}`)
console.log(`Voice: ${audio.voice.name}`)
console.log(`State: ${audio.state}`)
console.log(`Duration: ${audio.duration}s`)
console.log(`URL: ${audio.url}`)

// Access transcript if available
if (audio.transcript) {
  console.log('Transcript words:', audio.transcript.words.length)
}
```

{% endtab %}

{% tab title="Python" %}

```python
audio = client.audio.retrieve_audio("3804fef4-5329-42b8-8a5b-a12eb5c3dc2c")

print(f"Text: {audio.text}")
print(f"Voice: {audio.voice.name}")
print(f"State: {audio.state}")
print(f"Duration: {audio.duration}s")
print(f"URL: {audio.url}")

# Access transcript if available
if audio.transcript:
    print(f"Transcript words: {len(audio.transcript.words)}")
```

{% endtab %}
{% endtabs %}
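
The word-level timestamps in `transcript.words` lend themselves to caption generation. The following is a rough sketch that groups words into SubRip (SRT) cues; the helper names and the four-words-per-cue grouping are assumptions for illustration. It operates on the raw JSON shape from the sample response (dictionaries with `text`, `start`, and `end` in seconds), so SDK objects would need attribute access instead.

```python
def format_srt_time(seconds):
    """Convert seconds (float) to an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def words_to_srt(words, max_words=4):
    """Group consecutive words into numbered SRT cues of up to `max_words` words."""
    cues = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        start = format_srt_time(chunk[0]["start"])
        end = format_srt_time(chunk[-1]["end"])
        text = " ".join(word["text"] for word in chunk)
        cues.append(f"{len(cues) + 1}\n{start} --> {end}\n{text}\n")
    return "\n".join(cues)


# The two words visible in the sample response earlier on this page.
sample_words = [
    {"text": "Hello", "start": 0.4, "end": 0.7, "confidence": 0.99},
    {"text": "from", "start": 0.75, "end": 1.0, "confidence": 0.98},
]

srt = words_to_srt(sample_words)
print(srt)
```

Grouping by a fixed word count is the simplest option; splitting on punctuation or on pauses between words would likely read better for longer narration.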

***

## Voices

ShortGenius offers a wide selection of voices with distinct accents, languages, and tonalities. You can filter them by locale or retrieve details about a specific voice.

### List Voices

**Endpoint**: `GET /audio/voices`

| Query Param | Default | Description                                            |
| ----------- | ------- | ------------------------------------------------------ |
| `locale`    | `auto`  | Language/region code to filter voices (e.g., `en-US`). |
| `page`      | 0       | Page number.                                           |
| `limit`     | 20      | Items per page, up to 10,000,000.                      |

{% tabs %}
{% tab title="cURL" %}

```bash
curl --request GET \
  --url "https://api.shortgenius.com/v1/audio/voices?locale=en-US&page=0&limit=5" \
  --header "Authorization: Bearer YOUR_API_TOKEN"
```

{% endtab %}

{% tab title="TypeScript" %}

```typescript
const voices = await client.getVoices({
  locale: 'en-US',
  page: 0,
  limit: 5
})

console.log(`Found ${voices.length} voices:`)

for (const voice of voices) {
  console.log(`- ${voice.name} (${voice.id})`)
  console.log(`  Locale: ${voice.locale}`)
  console.log(`  Source: ${voice.source}`)
  if (voice.tags) {
    console.log(`  Gender: ${voice.tags.gender}`)
    console.log(`  Accent: ${voice.tags.accent}`)
  }
  if (voice.previewUrl) {
    console.log(`  Preview: ${voice.previewUrl}`)
  }
}
```

{% endtab %}

{% tab title="Python" %}

```python
voices = client.audio.voices.list_voices(
    locale="en-US",
    page=0,
    limit=5
)

print(f"Found {len(voices)} voices:")

for voice in voices:
    print(f"- {voice.name} ({voice.id})")
    print(f"  Locale: {voice.locale or 'Not specified'}")
    print(f"  Source: {voice.source}")
    if voice.tags:
        print(f"  Gender: {voice.tags.gender}")
        print(f"  Accent: {voice.tags.accent}")
    if voice.preview_url:
        print(f"  Preview: {voice.preview_url}")
```

{% endtab %}
{% endtabs %}

### Retrieve a Single Voice

**Endpoint**: `GET /audio/voices/{id}`

{% tabs %}
{% tab title="cURL" %}

```bash
curl --request GET \
  --url "https://api.shortgenius.com/v1/audio/voices/769d93d4-3c7f-47c0-9a9c-5db259e67b95" \
  --header "Authorization: Bearer YOUR_API_TOKEN"
```

{% endtab %}

{% tab title="TypeScript" %}

```typescript
const voice = await client.getVoice('769d93d4-3c7f-47c0-9a9c-5db259e67b95')

console.log(`Name: ${voice.name}`)
console.log(`Locale: ${voice.locale}`)
console.log(`Source: ${voice.source}`)

if (voice.tags) {
  console.log('Tags:', voice.tags)
}

if (voice.previewUrl) {
  console.log(`Preview URL: ${voice.previewUrl}`)
}
```

{% endtab %}

{% tab title="Python" %}

```python
voice = client.audio.voices.retrieve_voice("769d93d4-3c7f-47c0-9a9c-5db259e67b95")

print(f"Name: {voice.name}")
print(f"Locale: {voice.locale or 'Not specified'}")
print(f"Source: {voice.source}")

if voice.tags:
    print(f"Tags: {voice.tags}")

if voice.preview_url:
    print(f"Preview URL: {voice.preview_url}")
```

{% endtab %}
{% endtabs %}

***

## Complete Example

Here's a complete example that demonstrates voice selection and audio generation:

{% tabs %}
{% tab title="TypeScript" %}

```typescript
import { ShortGenius } from 'shortgenius'

const client = new ShortGenius({
  bearerAuth: 'YOUR_API_TOKEN'
})

async function generateNarration() {
  // 1. Find a suitable voice
  const voices = await client.getVoices({
    locale: 'en-US',
    limit: 100
  })

  // Filter for female conversational voices
  const femaleVoices = voices.filter(v => v.tags?.gender === 'Female' && v.tags?.tone === 'Conversational')

  if (femaleVoices.length === 0) {
    console.error('No suitable voices found')
    return
  }

  const selectedVoice = femaleVoices[0]
  console.log(`Selected voice: ${selectedVoice.name}`)

  // 2. Generate speech
  const audio = await client.createSpeech({
    text: "Welcome to ShortGenius! Let's create amazing content together.",
    voiceId: selectedVoice.id,
    locale: 'en-US',
    waitForGeneration: true
  })

  console.log(`Audio generated successfully!`)
  console.log(`URL: ${audio.url}`)
  console.log(`Duration: ${audio.duration}s`)

  return audio
}

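// Log failures (for example network or API errors) instead of leaving the promise rejection unhandled
generateNarration().catch(console.error)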
```

{% endtab %}

{% tab title="Python" %}

```python
from shortgenius import Shortgenius

client = Shortgenius(api_key="YOUR_API_TOKEN")

def generate_narration():
    # 1. Find a suitable voice
    voices = client.audio.voices.list_voices(locale="en-US", limit=100)

    # Filter for female conversational voices
    female_voices = [
        v for v in voices
        if v.tags and
        v.tags.gender == "Female" and
        hasattr(v.tags, 'tone') and v.tags.tone == "Conversational"
    ]

    if not female_voices:
        print("No suitable voices found")
        return

    selected_voice = female_voices[0]
    print(f"Selected voice: {selected_voice.name}")

    # 2. Generate speech
    audio = client.audio.create_speech(
        text="Welcome to ShortGenius! Let's create amazing content together.",
        voice_id=selected_voice.id,
        locale="en-US",
        wait_for_generation=True
    )

    print("Audio generated successfully!")
    print(f"URL: {audio.url}")
    print(f"Duration: {audio.duration}s")

    return audio

generate_narration()
```

{% endtab %}
{% endtabs %}

***

## Best Practices & Tips

* **Preview Voices**: Use the `preview_url` from the voices list to quickly audition how a voice sounds.
* **Check Credits**: Generating long or high-quality TTS may consume more credits, so keep an eye on [credits](/api/guides/usage-credits.md).
* **Combine with Video**: Add TTS audio to your [video drafts](/api/guides/video-generation.md) for a more engaging, fully AI-generated experience.

***

## Next Steps

Now you know how to:

1. Generate audio from text using TTS.
2. Retrieve or list audio files.
3. Explore a variety of voices.

Continue to the [Music](/api/guides/music.md) section to see how you can add music soundtracks, or head to the [Connections & Publishing](/api/guides/publishing.md) chapter to learn how to publish your creations automatically.


