Actions

The ballerinax/openai.audio package exposes the following clients:

| Client | Purpose |
| --- | --- |
| `Client` | Provides text-to-speech generation, audio transcription, and audio translation via the OpenAI Audio API. |

Client

Provides text-to-speech generation, audio transcription, and audio translation via the OpenAI Audio API.

Configuration

| Field | Type | Default | Description |
| --- | --- | --- | --- |
| `auth` | `http:BearerTokenConfig` | Required | Bearer token configuration containing the OpenAI API key. |
| `httpVersion` | `http:HttpVersion` | `HTTP_2_0` | HTTP protocol version. |
| `http1Settings` | `http:ClientHttp1Settings` | `{}` | HTTP/1.x specific configurations. |
| `http2Settings` | `http:ClientHttp2Settings` | `{}` | HTTP/2 specific configurations. |
| `timeout` | `decimal` | `30` | Request timeout in seconds. |
| `forwarded` | `string` | `"disable"` | Forwarded header handling mode. |
| `followRedirects` | `http:FollowRedirects` | `()` | Redirect handling configuration. |
| `poolConfig` | `http:PoolConfiguration` | `()` | Connection pool configuration. |
| `cache` | `http:CacheConfig` | `{}` | HTTP caching configuration. |
| `compression` | `http:Compression` | `COMPRESSION_AUTO` | Content compression setting. |
| `circuitBreaker` | `http:CircuitBreakerConfig` | `()` | Circuit breaker configuration. |
| `retryConfig` | `http:RetryConfig` | `()` | Retry configuration for failed requests. |
| `cookieConfig` | `http:CookieConfig` | `()` | Cookie handling configuration. |
| `responseLimits` | `http:ResponseLimitConfigs` | `{}` | Response size limit configuration. |
| `secureSocket` | `http:ClientSecureSocket` | `()` | SSL/TLS configuration. |
| `proxy` | `http:ProxyConfig` | `()` | Proxy server configuration. |
| `socketConfig` | `http:ClientSocketConfig` | `{}` | TCP socket configuration. |
| `validation` | `boolean` | `true` | Enable payload validation. |
| `laxDataBinding` | `boolean` | `true` | Enable lax data binding. |

Initializing the client

import ballerinax/openai.audio;

configurable string apiKey = ?;

audio:Client openaiAudio = check new ({
    auth: {
        token: apiKey
    }
});
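The `configurable string apiKey = ?;` declaration has no default, so a value must be supplied at run time. One common way to do that is a `Config.toml` file in the directory from which the program is run. The fragment below is a minimal sketch that assumes `apiKey` is declared in the project's default module; the value shown is a placeholder, not a real key.

```toml
# Config.toml
# Placeholder value: replace with your own OpenAI API key and keep this file out of version control.
apiKey = "<your-openai-api-key>"
```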

Operations

Text-to-Speech

Create speech

Generates audio from the input text using the specified TTS model and voice.

Parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| `payload` | `CreateSpeechRequest` | Yes | The speech generation request containing the input text, model, and voice. |
| `headers` | `map<string\|string[]>` | No | Optional HTTP headers. |

Returns: byte[]|error

Sample code:

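// Requires `import ballerina/io;` at the top of the file.
byte[] audioBytes = check openaiAudio->/audio/speech.post({
    model: "tts-1",
    input: "The quick brown fox jumped over the lazy dog.",
    voice: "alloy"
});
// Write the returned audio bytes to disk.
check io:fileWriteBytes("speech.mp3", audioBytes);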

Sample response:

<binary audio data written to speech.mp3>

Transcription

Create transcription

Transcribes audio into text in the language of the input audio using the Whisper model.

Parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| `payload` | `CreateTranscriptionRequest` | Yes | The transcription request containing the audio file and model. |
| `headers` | `map<string\|string[]>` | No | Optional HTTP headers. |

Returns: CreateTranscriptionResponse|error

Sample code:

// Requires `import ballerina/io;` at the top of the file.
byte[] audioFileBytes = check io:fileReadBytes("meeting.mp3");
audio:CreateTranscriptionResponse response = check openaiAudio->/audio/transcriptions.post({
    model: "whisper-1",
    file: {
        fileContent: audioFileBytes,
        fileName: "meeting.mp3"
    }
});

Sample response:

{"text": "Good morning everyone. Let's start by reviewing the quarterly results and then move on to the product roadmap discussion."}
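The sample code propagates any failure with `check`. Where a caller prefers to handle the error itself, the same call can be written against the `CreateTranscriptionResponse|error` return type. The program below is a minimal sketch under the assumptions already made in this section (an `apiKey` configurable, a local `meeting.mp3` file, and a `text` field on the response as in the sample response); the log message wording is illustrative only.

```ballerina
import ballerina/io;
import ballerinax/openai.audio;

configurable string apiKey = ?;

public function main() returns error? {
    audio:Client openaiAudio = check new ({auth: {token: apiKey}});

    // Read the audio file to transcribe (assumed to exist in the working directory).
    byte[] audioFileBytes = check io:fileReadBytes("meeting.mp3");

    // Capture the result as a union instead of using `check`.
    audio:CreateTranscriptionResponse|error response = openaiAudio->/audio/transcriptions.post({
        model: "whisper-1",
        file: {
            fileContent: audioFileBytes,
            fileName: "meeting.mp3"
        }
    });

    if response is error {
        // Handle the failure locally rather than returning it to the caller.
        io:println("Transcription failed: ", response.message());
        return;
    }

    // The type is narrowed to the success record here.
    io:println(response.text);
}
```

The same pattern applies to the speech and translation operations, since each returns a union with `error`.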

Translation

Create translation

Translates audio into English text using the Whisper model. The input audio can be in any supported language.

Parameters:

| Name | Type | Required | Description |
| --- | --- | --- | --- |
| `payload` | `CreateTranslationRequest` | Yes | The translation request containing the audio file and model. |
| `headers` | `map<string\|string[]>` | No | Optional HTTP headers. |

Returns: CreateTranslationResponse|error

Sample code:

// Requires `import ballerina/io;` at the top of the file.
byte[] audioFileBytes = check io:fileReadBytes("french_audio.mp3");
audio:CreateTranslationResponse response = check openaiAudio->/audio/translations.post({
    model: "whisper-1",
    file: {
        fileContent: audioFileBytes,
        fileName: "french_audio.mp3"
    }
});

Sample response:

{"text": "Hello, welcome to today's news broadcast. We will be covering the latest developments in international politics and the economy."}