Audio Generation
ShortGenius can turn any piece of text into a narrated audio file using its text-to-speech (TTS) engine. Choose from various voices and locales to match your brand or project needs. This section covers creating speech, listing existing audio, and retrieving detailed audio info.
Creating Speech
Endpoint: POST /audio/speech
Use this endpoint to generate a new audio file from text. By default the request returns immediately with a pending record; set wait_for_generation to true to wait until the audio is fully generated.
import { ShortGenius } from 'shortgenius'
const client = new ShortGenius({
bearerAuth: 'YOUR_API_TOKEN'
})
// Get a voice first
const voices = await client.getVoices({ locale: 'en-US' })
const voice = voices[0]
// Create speech
const audio = await client.createSpeech({
text: 'Hello from ShortGenius!',
locale: 'en-US',
voiceId: voice.id,
waitForGeneration: true
})
console.log(`Audio created: ${audio.id}`)
console.log(`URL: ${audio.url}`)
console.log(`Duration: ${audio.duration}s`)
Request Fields
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| text | string | Yes | The text to be converted to speech. |
| locale | string | No | Defaults to "auto". Use a two-letter language code plus a region code if you want to specify a locale (e.g., en-US, de-DE). |
| wait_for_generation | boolean | No | If false, the response immediately returns a pending record. If true, it waits until the audio is ready (default: false). |
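The SDK example above uses camelCase option names, while the request fields listed here are snake_case. As a minimal sketch of calling the endpoint without the SDK, the helper below builds the raw request. The base URL is taken from the curl examples in this section; `buildSpeechRequest` is a hypothetical helper, and the `voice_id` field name is an assumption inferred from the SDK's `voiceId` option, since it is not listed among the request fields.

```typescript
// Hypothetical helper: builds a fetch-style request for POST /audio/speech.
// The base URL is taken from the curl examples in this section.
const SPEECH_API_BASE = 'https://shortgenius.com/api/v1'

interface SpeechOptions {
  text: string
  locale?: string // omitted => the API default ("auto") applies
  voiceId?: string // ASSUMPTION: sent as `voice_id`, mirroring the SDK's `voiceId`
  waitForGeneration?: boolean // omitted => the API default (false) applies
}

interface PreparedRequest {
  url: string
  method: 'POST'
  headers: Record<string, string>
  body: string
}

function buildSpeechRequest(token: string, opts: SpeechOptions): PreparedRequest {
  // Client-side guard only; the API's own validation rules are not documented here.
  if (opts.text.trim() === '') {
    throw new Error('text must not be empty')
  }

  // Include optional fields only when the caller actually provided them.
  const payload: Record<string, string | boolean> = { text: opts.text }
  if (opts.locale !== undefined) payload.locale = opts.locale
  if (opts.voiceId !== undefined) payload.voice_id = opts.voiceId
  if (opts.waitForGeneration !== undefined) {
    payload.wait_for_generation = opts.waitForGeneration
  }

  return {
    url: `${SPEECH_API_BASE}/audio/speech`,
    method: 'POST',
    headers: {
      Authorization: `Bearer ${token}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify(payload)
  }
}
```

The returned object can be handed to `fetch(req.url, req)`; keeping request construction separate from the network call makes the payload easy to inspect before any credits are spent.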
Sample Response (Synchronous)
If wait_for_generation is true and generation completes quickly, you'll receive:
{
"id": "3804fef4-5329-42b8-8a5b-a12eb5c3dc2c",
"created_at": "2025-05-05T14:00:00Z",
"updated_at": null,
"url": "https://cdn.shortgenius.com/audio/3804fef4.mp3",
"user_id": "8f157306-139a-4f38-b783-e13e326ecaaa",
"transcript": {
"words": [
{
"text": "Hello",
"start": 0.4,
"end": 0.7,
"confidence": 0.99
},
{
"text": "from",
"start": 0.75,
"end": 1.0,
"confidence": 0.98
},
...
]
},
"state": "completed",
"text": "Hello from ShortGenius!",
"locale": "en-US",
"voice": {
"id": "769d93d4-3c7f-47c0-9a9c-5db259e67b95",
"name": "Samantha",
"description": null,
"avatar_url": null,
"flag_url": null,
"tags": null,
"preview_url": null,
"locale": "en-US",
"source": "ElevenLabs"
},
"duration": 1.2,
"lufs": -14.3
}
If wait_for_generation is false, you may see "state": "pending" or "generating", and you need to poll the Retrieve a Single Audio endpoint (GET /audio/{id}) until the state is "completed".
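The polling described above could be sketched as follows. This is a minimal illustration rather than an official client: `waitForAudio` and `makeGetAudio` are hypothetical helpers, the interval and attempt limits are arbitrary assumptions, and only the three states shown in this section ("pending", "generating", "completed") are known, so any other state is treated as an error.

```typescript
// Only the fields needed for polling; names follow the sample response.
interface AudioRecord {
  id: string
  state: string
  url?: string | null
}

type GetAudio = (id: string) => Promise<AudioRecord>

const sleep = (ms: number) => new Promise<void>(resolve => setTimeout(resolve, ms))

// Polls until the record reaches "completed", or gives up after maxAttempts.
// ASSUMPTION: intervalMs and maxAttempts defaults are illustrative, not documented values.
async function waitForAudio(
  getAudio: GetAudio,
  id: string,
  { intervalMs = 2000, maxAttempts = 30 }: { intervalMs?: number; maxAttempts?: number } = {}
): Promise<AudioRecord> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const audio = await getAudio(id)
    if (audio.state === 'completed') return audio
    // Any state other than the two in-progress ones is unknown to this sketch.
    if (audio.state !== 'pending' && audio.state !== 'generating') {
      throw new Error(`Unexpected audio state: ${audio.state}`)
    }
    if (attempt < maxAttempts) await sleep(intervalMs)
  }
  throw new Error(`Audio ${id} not ready after ${maxAttempts} attempts`)
}

// One way to implement GetAudio against GET /audio/{id} using the global fetch (Node 18+).
function makeGetAudio(token: string): GetAudio {
  return async id => {
    const res = await fetch(
      `https://shortgenius.com/api/v1/audio/${encodeURIComponent(id)}`,
      { headers: { Authorization: `Bearer ${token}` } }
    )
    if (!res.ok) throw new Error(`GET /audio/${id} failed with HTTP ${res.status}`)
    return (await res.json()) as AudioRecord
  }
}
```

A call such as `await waitForAudio(makeGetAudio('YOUR_API_TOKEN'), audio.id)` would then resolve once the state is "completed". Passing the fetcher in as a parameter keeps the loop easy to exercise with a stub.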
Listing & Retrieving Audio
List Audio
Endpoint: GET /audio
| Parameter | Default | Description |
| --- | --- | --- |
| page | 0 | Results page number (zero-based) |
| limit | 50 | Items per page, up to 200 |
curl --request GET \
--url "https://shortgenius.com/api/v1/audio?page=0&limit=10" \
--header "Authorization: Bearer YOUR_API_TOKEN"
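Because pages are zero-based and the limit is capped at 200, it can help to validate both values before sending a request. The sketch below shows one way to build the list URL; `audioListUrl` and `toApiPage` are hypothetical helpers, and the lower bound of 1 for `limit` is an assumption, since only the upper bound is documented here.

```typescript
// Defaults and the upper limit mirror the parameter descriptions above.
const AUDIO_LIST_URL = 'https://shortgenius.com/api/v1/audio'
const DEFAULT_AUDIO_LIMIT = 50
const MAX_AUDIO_LIMIT = 200

// Hypothetical helper: returns the URL for one page of GET /audio.
function audioListUrl(page = 0, limit = DEFAULT_AUDIO_LIMIT): string {
  if (!Number.isInteger(page) || page < 0) {
    throw new RangeError('page must be a non-negative integer (pages are zero-based)')
  }
  // ASSUMPTION: a minimum of 1 is enforced client-side; the docs only state the maximum.
  if (!Number.isInteger(limit) || limit < 1 || limit > MAX_AUDIO_LIMIT) {
    throw new RangeError(`limit must be an integer between 1 and ${MAX_AUDIO_LIMIT}`)
  }
  const params = new URLSearchParams({ page: String(page), limit: String(limit) })
  return `${AUDIO_LIST_URL}?${params.toString()}`
}

// Converts a 1-based page number, as a UI might display it, to the API's zero-based index.
function toApiPage(displayPage: number): number {
  return displayPage - 1
}
```

For example, `audioListUrl(toApiPage(1), 10)` produces the same URL as the curl command above.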
Retrieve a Single Audio
Endpoint: GET /audio/{id}
curl --request GET \
--url "https://shortgenius.com/api/v1/audio/3804fef4-5329-42b8-8a5b-a12eb5c3dc2c" \
--header "Authorization: Bearer YOUR_API_TOKEN"
Voices
ShortGenius offers a wide selection of voices with distinct accents, languages, and tonalities. You can filter them by locale or retrieve details about a specific voice.
List Voices
Endpoint: GET /audio/voices
| Parameter | Default | Description |
| --- | --- | --- |
| locale | auto | Language/region code to filter voices (e.g., en-US). |
| page | 0 | Page number. |
| limit | 20 | Items per page, can go up to 10,000,000. |
curl --request GET \
--url "https://shortgenius.com/api/v1/audio/voices?locale=en-US&page=0&limit=5" \
--header "Authorization: Bearer YOUR_API_TOKEN"
Retrieve a Single Voice
Endpoint: GET /audio/voices/{id}
curl --request GET \
--url "https://shortgenius.com/api/v1/audio/voices/769d93d4-3c7f-47c0-9a9c-5db259e67b95" \
--header "Authorization: Bearer YOUR_API_TOKEN"
Complete Example
Here's a complete example that demonstrates voice selection and audio generation:
import { ShortGenius } from 'shortgenius'
const client = new ShortGenius({
bearerAuth: 'YOUR_API_TOKEN'
})
async function generateNarration() {
// 1. Find a suitable voice
const voices = await client.getVoices({
locale: 'en-US',
limit: 100
})
// Filter for female conversational voices
const femaleVoices = voices.filter(v => v.tags?.gender === 'Female' && v.tags?.tone === 'Conversational')
if (femaleVoices.length === 0) {
console.error('No suitable voices found')
return
}
const selectedVoice = femaleVoices[0]
console.log(`Selected voice: ${selectedVoice.name}`)
// 2. Generate speech
const audio = await client.createSpeech({
text: "Welcome to ShortGenius! Let's create amazing content together.",
voiceId: selectedVoice.id,
locale: 'en-US',
waitForGeneration: true
})
console.log(`Audio generated successfully!`)
console.log(`URL: ${audio.url}`)
console.log(`Duration: ${audio.duration}s`)
return audio
}
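// Log API or network failures instead of leaving an unhandled promise rejection
generateNarration().catch(console.error)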
Best Practices & Tips
Preview Voices: Use the preview_url from the voices list to quickly audition how a voice sounds.
Check Credits: Generating long or high-quality TTS may consume more credits, so keep an eye on your credit balance.
Combine with Video: Add TTS audio to your video drafts for a more engaging, fully AI-generated experience.
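As an illustration of the voice preview tip, the sketch below narrows a list of voices down to those that can actually be auditioned. The field names follow the `voice` object in the sample response (where `preview_url` is null); whether the SDK exposes the field as `preview_url` or in camelCase is not confirmed here, and `previewableVoices` is a hypothetical helper.

```typescript
// Field names follow the `voice` object in the sample response.
interface Voice {
  id: string
  name: string
  locale: string
  preview_url: string | null
}

// Keeps only voices with a usable preview URL, optionally restricted to one locale.
function previewableVoices(voices: Voice[], locale?: string): Voice[] {
  return voices.filter(
    v =>
      typeof v.preview_url === 'string' &&
      v.preview_url !== '' &&
      (locale === undefined || v.locale === locale)
  )
}
```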
Next Steps
Now you know how to:
Generate audio from text using TTS.
Retrieve or list audio files.
Explore a variety of voices.
Continue to the Music section to see how you can add music soundtracks, or head to the Connections & Publishing chapter to learn how to publish your creations automatically.