Audio Generation
ShortGenius can turn any piece of text into a narrated audio file using its text-to-speech (TTS) engine. Choose from various voices and locales to match your brand or project needs. This section covers creating speech, listing existing audio, and retrieving detailed audio info.
Endpoint: POST /audio/speech
Use this endpoint to generate a new audio file from text. Depending on wait_for_generation, the request either returns immediately or waits until the audio is fully generated.
import { ShortGenius } from 'shortgenius'
const client = new ShortGenius({
bearerAuth: 'YOUR_API_TOKEN'
})
// Get a voice first
const voices = await
text (string, required): The text to be converted to speech.
voice_id (string, required): ID of the chosen voice. See GET /audio/voices to retrieve possible voice_id values.
locale (string, optional): Language/region code for the speech.
If wait_for_generation is true and generation completes quickly, you'll receive:
{
  "id": "3804fef4-5329-42b8-8a5b-a12eb5c3dc2c",
  "created_at": "2025-05-05T14:00:00Z",
  "updated_at": null,
  "url": "https://cdn.shortgenius.com/audio/3804fef4.mp3",
  "user_id": "8f157306-139a-4f38-b783-e13e326ecaaa",
  "transcript": {
    "words": [
      {
        "text": "Hello",
        "start": 0.4,
        "end": 0.7,
        "confidence": 0.99
      },
      {
        "text": "from",
        "start": 0.75,
        "end": 1.0,
        "confidence": 0.98
      },
      ...
    ]
  },
  "state": "completed",
  "text": "Hello from ShortGenius!",
  "locale": "en-US",
  "voice": {
    "id": "769d93d4-3c7f-47c0-9a9c-5db259e67b95",
    "name": "Samantha",
    "description": null,
    "avatar_url": null,
    "flag_url": null,
    "tags": null,
    "preview_url": null,
    "locale": "en-US",
    "source": "ElevenLabs"
  },
  "duration": 1.2,
  "lufs": -14.3
}
If wait_for_generation is false, you may see "state": "pending" or "generating", and you need to poll the Get Audio endpoint (GET /audio/{id}) until the state is "completed".
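The polling described above can be sketched as follows. This is a minimal sketch rather than part of the SDK: the helper name wait_for_audio, the interval and timeout defaults, and the choice to treat any state other than "pending" or "generating" as an error are all assumptions. fetch_audio stands in for whatever call retrieves an audio record by ID, for example the client.audio.retrieve_audio method used elsewhere in this section.

```python
import time
from typing import Any, Callable


def wait_for_audio(
    fetch_audio: Callable[[str], Any],
    audio_id: str,
    interval: float = 2.0,
    timeout: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Poll fetch_audio(audio_id) until the record reports state "completed".

    fetch_audio is any callable returning an object with a `state` attribute.
    The interval and timeout defaults are illustrative, not documented limits.
    """
    deadline = time.monotonic() + timeout
    while True:
        audio = fetch_audio(audio_id)
        if audio.state == "completed":
            return audio
        # Assumption: only the in-progress states named in this section are
        # worth waiting on; anything else is surfaced to the caller.
        if audio.state not in ("pending", "generating"):
            raise RuntimeError(f"Unexpected audio state: {audio.state!r}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Audio {audio_id} not ready after {timeout}s")
        sleep(interval)
```

With the Python client shown in this section, a call might look like wait_for_audio(client.audio.retrieve_audio, audio.id). The sleep parameter exists only so the loop can be exercised without real delays.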
Endpoint: GET /audio
page (default: 0): Results page number (zero-based).
limit (default: 50): Items per page, up to 200.
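To walk through every page rather than a single one, the page and limit parameters can be combined with the has_more flag that the Python examples in this section read from the response. The sketch below is hypothetical: iter_all_audio is not an SDK function, the lower bound of 1 on limit is an assumption, and list_page is assumed to accept page and limit keyword arguments and return an object with audio and has_more attributes.

```python
from typing import Any, Callable, Iterator


def iter_all_audio(list_page: Callable[..., Any], limit: int = 50) -> Iterator[Any]:
    """Yield audio records page by page until has_more is false."""
    # Upper bound taken from the parameter description; lower bound assumed.
    if not 1 <= limit <= 200:
        raise ValueError("limit must be between 1 and 200")
    page = 0  # pages are zero-based
    while True:
        response = list_page(page=page, limit=limit)
        yield from response.audio
        if not response.has_more:
            return
        page += 1
```

With the Python client shown in this section, this might be used as: for audio in iter_all_audio(client.audio.list_audio, limit=100).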
Endpoint: GET /audio/{id}
ShortGenius offers a wide selection of voices with distinct accents, languages, and tonalities. You can filter them by locale or retrieve details about a specific voice.
Endpoint: GET /audio/voices
locale (default: auto): Language/region code to filter voices (e.g., en-US).
page (default: 0): Page number.
limit (default: 20): Items per page, can go up to 10,000,000.
Endpoint: GET /audio/voices/{id}
A complete example that demonstrates voice selection and audio generation appears at the end of this section.
Preview Voices: Use the preview_url from the voices list to quickly audition how a voice sounds.
Check Credits: Generating long or high-quality TTS may consume more credits, so keep an eye on your credit balance.
Combine with Video: Add TTS audio to your video drafts for a more engaging, fully AI-generated experience.
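As a small illustration of the first tip, the sketch below collects the preview links from a list of voices so they can be opened and auditioned. The helper name preview_urls is hypothetical, and it assumes voice objects expose name and preview_url attributes as in the Python examples in this section. Voices whose preview_url is null are skipped.

```python
from typing import Any, Iterable, List, Tuple


def preview_urls(voices: Iterable[Any]) -> List[Tuple[str, str]]:
    """Return (voice name, preview URL) pairs for voices that have a preview."""
    return [(v.name, v.preview_url) for v in voices if v.preview_url]
```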
Now you know how to:
Generate audio from text using TTS.
Retrieve or list audio files.
Explore a variety of voices.
Continue to the Music section to see how you can add music soundtracks, or head to the Connections & Publishing chapter to learn how to publish your creations automatically.
locale: Defaults to "auto". To specify a locale, use a two-letter language code plus a region code (e.g., en-US, de-DE).
wait_for_generation (boolean, optional): If false, the response immediately returns a pending record. If true, it waits until the audio is ready (default: false).
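To show how the speech parameters fit together, here is a sketch that builds (but does not send) a POST /audio/speech request with Python's standard library. Several things are assumptions: the base URL is taken from the cURL examples in this section, the body is assumed to be JSON using the parameter names listed here, and build_speech_request is a hypothetical helper, not an SDK function.

```python
import json
import urllib.request

BASE_URL = "https://shortgenius.com/api/v1"  # assumed from the cURL examples


def build_speech_request(
    token: str,
    text: str,
    voice_id: str,
    locale: str = "auto",
    wait_for_generation: bool = False,
) -> urllib.request.Request:
    """Build a POST /audio/speech request without sending it."""
    # text and voice_id are the two required parameters.
    if not text or not voice_id:
        raise ValueError("text and voice_id are required")
    payload = {
        "text": text,
        "voice_id": voice_id,
        "locale": locale,
        "wait_for_generation": wait_for_generation,
    }
    return urllib.request.Request(
        f"{BASE_URL}/audio/speech",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to urllib.request.urlopen would send it. The defaults mirror the documented ones ("auto" and false), so only the token, text, and voice ID need to be supplied.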
curl --request GET \
  --url "https://shortgenius.com/api/v1/audio?page=0&limit=10" \
  --header "Authorization: Bearer YOUR_API_TOKEN"

const response = await client.getAllAudio({
  page: 0,
  limit: 10
})
console.log(`Found ${response.audio.length} audio files`)
console.log(`Has more: ${response.hasMore}`)
for (const audio of response.audio) {
  console.log(`- ${audio.text} (${audio.id})`)
  console.log(` Voice: ${audio.voice.name}`)
  console.log(` State: ${audio.state}`)
}

response = client.audio.list_audio(page=0, limit=10)
print(f"Found {len(response.audio)} audio files")
print(f"Has more: {response.has_more}")
for audio in response.audio:
    print(f"- {audio.text} ({audio.id})")
    print(f" Voice: {audio.voice.name}")
    print(f" State: {audio.state}")
curl --request GET \
  --url "https://shortgenius.com/api/v1/audio/3804fef4-5329-42b8-8a5b-a12eb5c3dc2c" \
  --header "Authorization: Bearer YOUR_API_TOKEN"

const audio = await client.getAudio('3804fef4-5329-42b8-8a5b-a12eb5c3dc2c')
console.log(`Text: ${audio.text}`)
console.log(`Voice: ${audio.voice.name}`)
console.log(`State: ${audio.state}`)
console.log(`Duration: ${audio.duration}s`)
console.log(`URL: ${audio.url}`)
// Access transcript if available
if (audio.transcript) {
  console.log('Transcript words:', audio.transcript.words.length)
}

audio = client.audio.retrieve_audio("3804fef4-5329-42b8-8a5b-a12eb5c3dc2c")
print(f"Text: {audio.text}")
print(f"Voice: {audio.voice.name}")
print(f"State: {audio.state}")
print(f"Duration: {audio.duration}s")
print(f"URL: {audio.url}")
# Access transcript if available
if audio.transcript:
    print(f"Transcript words: {len(audio.transcript.words)}")
curl --request GET \
  --url "https://shortgenius.com/api/v1/audio/voices?locale=en-US&page=0&limit=5" \
  --header "Authorization: Bearer YOUR_API_TOKEN"

const voices = await client.getVoices({
  locale: 'en-US',
  page: 0,
  limit: 5
})
console.log(`Found ${voices.length} voices:`)
for (const voice of voices) {
  console.log(`- ${voice.name} (${voice.id})`)
  console.log(` Locale: ${voice.locale}`)
  console.log(` Source: ${voice.source}`)
  if (voice.tags) {
    console.log(` Gender: ${voice.tags.gender}`)
    console.log(` Accent: ${voice.tags.accent}`)
  }
  if (voice.previewUrl) {
    console.log(` Preview: ${voice.previewUrl}`)
  }
}

voices = client.audio.voices.list_voices(
    locale="en-US",
    page=0,
    limit=5
)
print(f"Found {len(voices)} voices:")
for voice in voices:
    print(f"- {voice.name} ({voice.id})")
    print(f" Locale: {voice.locale or 'Not specified'}")
    print(f" Source: {voice.source}")
    if voice.tags:
        print(f" Gender: {voice.tags.gender}")
        print(f" Accent: {voice.tags.accent}")
    if voice.preview_url:
        print(f" Preview: {voice.preview_url}")
curl --request GET \
  --url "https://shortgenius.com/api/v1/audio/voices/769d93d4-3c7f-47c0-9a9c-5db259e67b95" \
  --header "Authorization: Bearer YOUR_API_TOKEN"

const voice = await client.getVoice('769d93d4-3c7f-47c0-9a9c-5db259e67b95')
console.log(`Name: ${voice.name}`)
console.log(`Locale: ${voice.locale}`)
console.log(`Source: ${voice.source}`)
if (voice.tags) {
  console.log('Tags:', voice.tags)
}
if (voice.previewUrl) {
  console.log(`Preview URL: ${voice.previewUrl}`)
}

voice = client.audio.voices.retrieve_voice("769d93d4-3c7f-47c0-9a9c-5db259e67b95")
print(f"Name: {voice.name}")
print(f"Locale: {voice.locale or 'Not specified'}")
print(f"Source: {voice.source}")
if voice.tags:
    print(f"Tags: {voice.tags}")
if voice.preview_url:
    print(f"Preview URL: {voice.preview_url}")
import { ShortGenius } from 'shortgenius'

const client = new ShortGenius({
  bearerAuth: 'YOUR_API_TOKEN'
})

async function generateNarration() {
  // 1. Find a suitable voice
  const voices = await client.getVoices({
    locale: 'en-US',
    limit: 100
  })

  // Filter for female conversational voices
  const femaleVoices = voices.filter(v => v.tags?.gender === 'Female' && v.tags?.tone === 'Conversational')
  if (femaleVoices.length === 0) {
    console.error('No suitable voices found')
    return
  }
  const selectedVoice = femaleVoices[0]
  console.log(`Selected voice: ${selectedVoice.name}`)

  // 2. Generate speech
  const audio = await client.createSpeech({
    text: "Welcome to ShortGenius! Let's create amazing content together.",
    voiceId: selectedVoice.id,
    locale: 'en-US',
    waitForGeneration: true
  })
  console.log(`Audio generated successfully!`)
  console.log(`URL: ${audio.url}`)
  console.log(`Duration: ${audio.duration}s`)
  return audio
}

generateNarration()

from shortgenius import Shortgenius

client = Shortgenius(api_key="YOUR_API_TOKEN")

def generate_narration():
    # 1. Find a suitable voice
    voices = client.audio.voices.list_voices(locale="en-US", limit=100)

    # Filter for female conversational voices
    female_voices = [
        v for v in voices
        if v.tags and
        v.tags.gender == "Female" and
        hasattr(v.tags, 'tone') and v.tags.tone == "Conversational"
    ]
    if not female_voices:
        print("No suitable voices found")
        return
    selected_voice = female_voices[0]
    print(f"Selected voice: {selected_voice.name}")

    # 2. Generate speech
    audio = client.audio.create_speech(
        text="Welcome to ShortGenius! Let's create amazing content together.",
        voice_id=selected_voice.id,
        locale="en-US",
        wait_for_generation=True
    )
    print("Audio generated successfully!")
    print(f"URL: {audio.url}")
    print(f"Duration: {audio.duration}s")
    return audio

generate_narration()