
feat(groq): Add TTS adapter #346

Open
dhamivibez wants to merge 4 commits into TanStack:main from dhamivibez:feat/groq-tts

Conversation

@dhamivibez
Contributor

@dhamivibez dhamivibez commented Mar 6, 2026

Adds a tree-shakeable Text-to-Speech (TTS) adapter for the Groq API. This includes support for English and Arabic voices, various output formats (with WAV as default), and configuration options like speed and sample rate.

Includes new types, model metadata for TTS models, and comprehensive unit tests.

🎯 Changes

  • Added a tree-shakeable Text-to-Speech (TTS) adapter for the Groq API.
  • Added support for English and Arabic voices.
  • Added configurable output formats (default: WAV), speed, and sample rate.
  • Introduced new types and model metadata for TTS models.
  • Added comprehensive unit tests for the TTS adapter.

✅ Checklist

  • I have followed the steps in the Contributing guide.
  • I have tested this code locally with pnpm run test:pr.

🚀 Release Impact

  • This change affects published code, and I have generated a changeset.
  • This change is docs/CI/dev-only (no release).

Summary by CodeRabbit

  • New Features

    • Added Groq Text-to-Speech (TTS) support with English and Arabic voices, multiple output formats (wav/mp3/flac/ogg/mulaw), configurable speed and sample rate, and a 200-character input limit.
    • Supports explicit API key or automatic environment-based configuration; sensible defaults (voice: autumn, format: wav).
  • Tests

    • Added comprehensive unit tests covering adapter creation, defaults, parameter handling, formats, and error conditions.

@coderabbitai
Contributor

coderabbitai bot commented Mar 6, 2026

📝 Walkthrough


Adds a tree-shakeable Groq Text-to-Speech adapter, TTS types and model metadata, input validation, unit tests, and exports to the public package API. Includes schema/tool guard adjustments affecting object-schema handling.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| TTS Adapter Implementation: packages/typescript/ai-groq/src/adapters/tts.ts | New GroqTTSAdapter with generateSpeech that validates input (max 200 chars), applies defaults (voice: 'autumn', format: 'wav'), calls Groq SDK audio.speech.create, converts response to base64, maps content-type, and returns TTSResult. Includes createGroqSpeech and groqSpeech factories and env-key handling. |
| Audio Types & Validation: packages/typescript/ai-groq/src/audio/audio-provider-options.ts, packages/typescript/ai-groq/src/audio/tts-provider-options.ts | Adds AudioProviderOptions and validateAudioInput (max 200 chars). Introduces TTS types: GroqTTSEnglishVoice, GroqTTSArabicVoice, GroqTTSVoice, GroqTTSFormat, GroqTTSSampleRate, and GroqTTSProviderOptions. |
| Model Metadata: packages/typescript/ai-groq/src/model-meta.ts | Adds GROQ_TTS_MODELS, GroqTTSModel, GroqTTSModelProviderOptionsByName, and TTS model entries (ORPHEUS_V1_ENGLISH, ORPHEUS_ARABIC_SAUDI). Updates resolver types to include TTS provider options. |
| Public API Surface: packages/typescript/ai-groq/src/index.ts | Exports TTS adapter, factory functions, and TTS types; extends model exports to include GROQ_TTS_MODELS and GroqTTSModel alongside chat models. |
| Schema & Tooling Adjustments: packages/typescript/ai-groq/src/tools/function-tool.ts, packages/typescript/ai-groq/src/utils/schema-converter.ts | Removes defensive insertion of empty properties for object schemas; narrows the object-processing guard so objects without properties are skipped, and changes required handling accordingly. |
| Tests & Changeset: .changeset/green-colts-kiss.md, packages/typescript/ai-groq/tests/groq-tts.test.ts | Adds changeset entry and comprehensive TTS tests (adapter creation, env/api-key behavior, defaults, parameter propagation, content-type mapping, error on long input, multi-language voices, and SDK mocking). |
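
The 200-character limit mentioned in the summary above could be enforced with a small guard like the one below. This is a minimal sketch, not the PR's code: only the function name `validateAudioInput`, its `{ input, model }` call shape, and the 200-character limit come from this page. The constant name `MAX_TTS_INPUT_LENGTH`, the error message, and the exact fields of `AudioProviderOptions` are assumptions.

```typescript
// Hypothetical shape; the PR's real AudioProviderOptions may carry more fields.
interface AudioProviderOptions {
  input: string
  model: string
}

// Assumed constant name; the 200-character limit itself is stated in the PR.
const MAX_TTS_INPUT_LENGTH = 200

// Throws when the input is longer than the limit; otherwise returns nothing.
function validateAudioInput(options: AudioProviderOptions): void {
  if (options.input.length > MAX_TTS_INPUT_LENGTH) {
    throw new Error(
      `Input for model "${options.model}" is ${options.input.length} characters; ` +
        `the maximum is ${MAX_TTS_INPUT_LENGTH}.`,
    )
  }
}
```

Note that `String.prototype.length` counts UTF-16 code units, so a character outside the Basic Multilingual Plane counts as two toward the limit in this sketch.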

Sequence Diagram

sequenceDiagram
    participant Client as Client
    participant Adapter as GroqTTSAdapter
    participant SDK as Groq SDK
    participant API as Groq API

    Client->>Adapter: generateSpeech(input, options)
    Adapter->>Adapter: validateAudioInput(input)
    Adapter->>Adapter: build params (model, voice, format, speed, sample_rate)
    Adapter->>SDK: audio.speech.create(params)
    SDK->>API: POST /audio/speech
    API-->>SDK: audio blob
    SDK-->>Adapter: audio data
    Adapter->>Adapter: convert to base64 + derive contentType
    Adapter-->>Client: TTSResult { id, model, audio, format, contentType }
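
The flow in the sequence diagram above can be sketched end to end against a stand-in client. Everything here is illustrative: `SpeechClient`, `SpeechParams`, `SpeechResult`, and the MIME table are assumptions rather than the PR's real types, the `id` field of `TTSResult` is omitted, and the base64 encoder is hand-written so the sketch depends on neither `Buffer` nor `btoa`.

```typescript
// Illustrative stand-ins only; these are not the PR's real types.
type Format = 'flac' | 'mp3' | 'mulaw' | 'ogg' | 'wav'

interface SpeechParams {
  model: string
  input: string
  voice: string
  response_format: Format
  speed?: number
}

interface SpeechResponse {
  arrayBuffer(): Promise<ArrayBuffer>
}

// Minimal stand-in for the one SDK method the diagram uses.
interface SpeechClient {
  audio: { speech: { create(params: SpeechParams): Promise<SpeechResponse> } }
}

interface SpeechResult {
  model: string
  audio: string // base64-encoded audio bytes
  format: Format
  contentType: string
}

// Assumed MIME mapping; the PR's own table may differ.
const CONTENT_TYPES: Record<Format, string> = {
  flac: 'audio/flac',
  mp3: 'audio/mpeg',
  mulaw: 'audio/basic',
  ogg: 'audio/ogg',
  wav: 'audio/wav',
}

const BASE64_ALPHABET =
  'ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/'

// Dependency-free base64 encoder: three input bytes become four output characters.
function toBase64(buffer: ArrayBuffer): string {
  const bytes = new Uint8Array(buffer)
  let out = ''
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i] ?? 0
    const b1 = bytes[i + 1] ?? 0
    const b2 = bytes[i + 2] ?? 0
    out += BASE64_ALPHABET.charAt(b0 >> 2)
    out += BASE64_ALPHABET.charAt(((b0 & 3) << 4) | (b1 >> 4))
    out += i + 1 < bytes.length ? BASE64_ALPHABET.charAt(((b1 & 15) << 2) | (b2 >> 6)) : '='
    out += i + 2 < bytes.length ? BASE64_ALPHABET.charAt(b2 & 63) : '='
  }
  return out
}

async function generateSpeech(
  client: SpeechClient,
  model: string,
  options: { text: string; voice?: string; format?: Format; speed?: number },
): Promise<SpeechResult> {
  // Defaults taken from the walkthrough: voice 'autumn', format 'wav'.
  const { text, voice = 'autumn', format = 'wav', speed } = options

  // Step 1: validate input (200-character limit from the PR).
  if (text.length > 200) {
    throw new Error('Input text exceeds 200 characters')
  }

  // Step 2: build the request parameters and call the SDK.
  const response = await client.audio.speech.create({
    model,
    input: text,
    voice,
    response_format: format,
    speed,
  })

  // Step 3: convert the binary response to base64 and derive the content type.
  const audio = toBase64(await response.arrayBuffer())
  return { model, audio, format, contentType: CONTENT_TYPES[format] }
}
```

Passing the client in as a parameter keeps the sketch testable with a fake; the real adapter holds its client as `this.client`, per the suggested diff later in the review.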

Estimated Code Review Effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Poem

🐰 I nibbled code and hummed a tune,
I gave the groq a voice this moon,
From autumn breath to sultan's call,
Base64 echoes down the hall,
Hop, clap — now speech for one and all.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title 'feat(groq): Add TTS adapter' clearly and concisely describes the main change, adding a Text-to-Speech adapter for Groq, and is directly related to the changeset content. |
| Description check | ✅ Passed | The description follows the template structure with completed sections for Changes, Checklist, and Release Impact. It clearly describes the TTS adapter addition, voice support, configuration options, and mentions tests and changeset generation. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |


@nx-cloud

nx-cloud bot commented Mar 6, 2026

View your CI Pipeline Execution ↗ for commit 28ad865

| Command | Status | Duration | Result |
| --- | --- | --- | --- |
| nx affected --targets=test:sherif,test:knip,tes... | ✅ Succeeded | 38s | View ↗ |
| nx run-many --targets=build --exclude=examples/** | ✅ Succeeded | 5s | View ↗ |

☁️ Nx Cloud last updated this comment at 2026-03-07 08:42:31 UTC

@pkg-pr-new

pkg-pr-new bot commented Mar 6, 2026


@tanstack/ai

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai@346

@tanstack/ai-anthropic

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-anthropic@346

@tanstack/ai-client

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-client@346

@tanstack/ai-devtools-core

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-devtools-core@346

@tanstack/ai-fal

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-fal@346

@tanstack/ai-gemini

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-gemini@346

@tanstack/ai-grok

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-grok@346

@tanstack/ai-groq

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-groq@346

@tanstack/ai-ollama

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-ollama@346

@tanstack/ai-openai

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-openai@346

@tanstack/ai-openrouter

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-openrouter@346

@tanstack/ai-preact

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-preact@346

@tanstack/ai-react

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-react@346

@tanstack/ai-react-ui

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-react-ui@346

@tanstack/ai-solid

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-solid@346

@tanstack/ai-solid-ui

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-solid-ui@346

@tanstack/ai-svelte

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-svelte@346

@tanstack/ai-vue

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-vue@346

@tanstack/ai-vue-ui

npm i https://pkg.pr.new/TanStack/ai/@tanstack/ai-vue-ui@346

@tanstack/preact-ai-devtools

npm i https://pkg.pr.new/TanStack/ai/@tanstack/preact-ai-devtools@346

@tanstack/react-ai-devtools

npm i https://pkg.pr.new/TanStack/ai/@tanstack/react-ai-devtools@346

@tanstack/solid-ai-devtools

npm i https://pkg.pr.new/TanStack/ai/@tanstack/solid-ai-devtools@346

commit: 281c81a

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
packages/typescript/ai-groq/src/utils/schema-converter.ts (1)

53-91: ⚠️ Potential issue | 🟠 Major

Preserve strict object normalization for schemas without properties.

Line 53 now skips { type: 'object' } schemas entirely, so structuredOutput() can send a strict schema without additionalProperties: false when the AI layer omits properties. packages/typescript/ai-groq/src/adapters/text.ts:134-182 passes that result straight to Groq, so this regresses zero-property object outputs and nested object schemas that are declared without a properties key.

Suggested fix
-  if (result.type === 'object' && result.properties) {
-    const properties = { ...result.properties }
+  if (result.type === 'object') {
+    const properties = { ...(result.properties ?? {}) }
     const allPropertyNames = Object.keys(properties)

     for (const propName of allPropertyNames) {
       const prop = properties[propName]
       const wasOptional = !originalRequired.includes(propName)
@@
     }

     result.properties = properties
     result.required = allPropertyNames
     result.additionalProperties = false
   }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/typescript/ai-groq/src/utils/schema-converter.ts` around lines 53 -
91, The code currently only normalizes object schemas when result.properties
exists, skipping strict normalization for plain { type: 'object' } and nested
object/array items that lack a properties key; update the logic in
makeGroqStructuredOutputCompatible so that when result.type === 'object' but
result.properties is absent or empty you still enforce strict output by setting
result.properties = result.properties || {} (or leave empty object),
result.required = [], and result.additionalProperties = false; also ensure the
same normalization is applied to nested schemas passed into the recursive calls
(e.g., where prop.type === 'object' and prop.properties may be undefined, and
for array items via prop.items) so zero-property objects and nested object
schemas without a properties key remain strict.
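
The normalization this comment asks for can be sketched as follows. It is an illustration under assumptions, not the PR's `makeGroqStructuredOutputCompatible`: the `JsonSchema` interface and the helper name `normalizeObjectSchemas` are hypothetical, and the per-property `wasOptional` handling visible in the suggested diff is left out to keep the focus on objects declared without a `properties` key.

```typescript
// Illustrative schema shape; the real converter works on richer JSON Schema objects.
interface JsonSchema {
  type?: string
  properties?: Record<string, JsonSchema>
  required?: string[]
  additionalProperties?: boolean
  items?: JsonSchema
}

// Hypothetical helper. Recursively makes every object schema strict,
// including objects that have no `properties` key at all.
function normalizeObjectSchemas(schema: JsonSchema): JsonSchema {
  const result: JsonSchema = { ...schema }

  // Array items may themselves be object schemas, so recurse into them.
  if (result.type === 'array' && result.items) {
    result.items = normalizeObjectSchemas(result.items)
  }

  if (result.type === 'object') {
    // Treat a missing `properties` key as an empty property map.
    const properties: Record<string, JsonSchema> = {}
    for (const [name, prop] of Object.entries(result.properties ?? {})) {
      properties[name] = normalizeObjectSchemas(prop)
    }
    result.properties = properties
    result.required = Object.keys(properties)
    result.additionalProperties = false
  }

  return result
}
```

With this shape, `{ type: 'object' }` comes back with empty `properties`, an empty `required` list, and `additionalProperties: false`, which is the strict form the comment says was lost.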
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@packages/typescript/ai-groq/src/adapters/tts.ts`:
- Around line 64-65: The code uses Node-only
Buffer.from(arrayBuffer).toString('base64') to produce base64 for the TTS
response (see variables arrayBuffer and base64), which breaks in browsers;
replace that call with an isomorphic encoder (e.g. add a helper
base64Encode(arrayBuffer) that checks for global Buffer and uses
Buffer.from(...).toString('base64') in Node, and uses a browser path (Uint8Array
-> binary string -> btoa) otherwise) and use that helper where base64 is
computed; this keeps compatibility with the existing browser-oriented window.env
lookup in the same adapter.
- Around line 46-60: In generateSpeech, stop using the caller-supplied model and
replace the request.model with this.model (the adapter-configured model) and
remove the unchecked casts of voice/format; validate the incoming options.voice
and options.format against explicit allow-lists (e.g., allowed voices and
allowed formats matching GroqTTSVoice and GroqTTSFormat) and fall back to safe
defaults ('autumn' for voice, 'wav' for format) only after validation; update
the construction of Groq_SDK.Audio.Speech.SpeechCreateParams in generateSpeech
to use validatedVoice and validatedFormat variables and preserve other fields
(input, speed, ...modelOptions) so unsupported values are rejected before
calling the Groq API.

In `@packages/typescript/ai-groq/src/audio/tts-provider-options.ts`:
- Around line 22-26: The JSDoc for GroqTTSFormat is misleading; update the
comment above the exported type GroqTTSFormat to accurately reflect supported
formats (either list all supported values 'flac', 'mp3', 'mulaw', 'ogg', 'wav'
or remove the "Only wav is currently supported" claim) so generated docs match
the type declaration; edit the comment near the GroqTTSFormat type to be
accurate and concise.

In `@packages/typescript/ai-groq/src/model-meta.ts`:
- Around line 377-418: ResolveProviderOptions<TModel> is only discriminating
chat models so TModel like 'canopylabs/orpheus-v1-english' resolves to
GroqTextProviderOptions; modify the provider-options resolver type to account
for the new TTS model union by adding a branch that maps GroqTTSModel (exported
GROQ_TTS_MODELS / GroqTTSModel) to GroqTTSProviderOptions (instead of falling
through to GroqTextProviderOptions), i.e. update the conditional type that
currently checks GROQ_CHAT_MODELS to also check (or add) GROQ_TTS_MODELS /
GroqTTSModel so sample_rate and other TTS-specific options are correctly typed.
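
The resolver change described in the last inline comment amounts to adding a TTS branch to a conditional type. The sketch below shows the idea with stand-in declarations: the chat model id `example-chat-model` is a placeholder, the fields on both option interfaces are assumptions, and `isGroqTTSModel` is a hypothetical runtime guard added only so the mapping can be exercised. Only `canopylabs/orpheus-v1-english`, `sample_rate`, and the type names come from this page.

```typescript
// Placeholder chat model id; the real list lives in model-meta.ts.
const GROQ_CHAT_MODELS = ['example-chat-model'] as const
// Model id quoted in the review comment.
const GROQ_TTS_MODELS = ['canopylabs/orpheus-v1-english'] as const

type GroqChatModel = (typeof GROQ_CHAT_MODELS)[number]
type GroqTTSModel = (typeof GROQ_TTS_MODELS)[number]

// Assumed fields, for illustration only.
interface GroqTextProviderOptions {
  temperature?: number
}
interface GroqTTSProviderOptions {
  sample_rate?: number
}

// TTS models resolve to TTS options; everything else keeps the text options.
type ResolveProviderOptions<TModel extends string> = TModel extends GroqTTSModel
  ? GroqTTSProviderOptions
  : GroqTextProviderOptions

// Compile-time check: `sample_rate` is accepted for the TTS model.
const ttsOptions: ResolveProviderOptions<'canopylabs/orpheus-v1-english'> = {
  sample_rate: 24000, // arbitrary example value
}

// Compile-time check: the chat model still resolves to text options.
const chatOptions: ResolveProviderOptions<GroqChatModel> = { temperature: 0.2 }

// Hypothetical runtime counterpart of the type-level branch.
function isGroqTTSModel(model: string): model is GroqTTSModel {
  return (GROQ_TTS_MODELS as readonly string[]).includes(model)
}
```

Checking the TTS union first means a model id can never fall through to the text options by accident, which is the failure mode the comment describes.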


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5e16ea17-fc04-4f5e-abfb-ea476c75ea0e

📥 Commits

Reviewing files that changed from the base of the PR and between 0ea82f6 and 5c99251.

📒 Files selected for processing (9)
  • .changeset/green-colts-kiss.md
  • packages/typescript/ai-groq/src/adapters/tts.ts
  • packages/typescript/ai-groq/src/audio/audio-provider-options.ts
  • packages/typescript/ai-groq/src/audio/tts-provider-options.ts
  • packages/typescript/ai-groq/src/index.ts
  • packages/typescript/ai-groq/src/model-meta.ts
  • packages/typescript/ai-groq/src/tools/function-tool.ts
  • packages/typescript/ai-groq/src/utils/schema-converter.ts
  • packages/typescript/ai-groq/tests/groq-tts.test.ts
💤 Files with no reviewable changes (1)
  • packages/typescript/ai-groq/src/tools/function-tool.ts

@dhamivibez dhamivibez marked this pull request as draft March 6, 2026 18:26
- Assign default values before type assertion to avoid unnecessary
  conditionals
- Use a single variable `voiceFormat` for GroqTTSFormat and reuse in
  request
- Remove redundant casts in `SpeechCreateParams` to satisfy ESLint
- Maintain type safety between TTSVoice/GroqTTSVoice and
  TTSFormat/GroqTTSFormat
@dhamivibez dhamivibez marked this pull request as ready for review March 7, 2026 08:47
Contributor

@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (1)
packages/typescript/ai-groq/src/adapters/tts.ts (1)

46-85: ⚠️ Potential issue | 🟠 Major

Prior review issues remain unaddressed: model consistency, voice/format validation, and isomorphic base64.

Three issues flagged in previous reviews persist:

  1. Model inconsistency (lines 50, 63, 80): Uses caller-supplied options.model instead of this.model. The adapter is constructed with a specific model, but generateSpeech ignores it.

  2. Unchecked type casts (lines 60, 65): TTSOptions allows format: 'opus' | 'aac' | 'pcm' which aren't valid for Groq (GroqTTSFormat supports 'flac' | 'mp3' | 'mulaw' | 'ogg' | 'wav'). The casts bypass compile-time safety, causing API failures at runtime.

  3. Node-only Buffer.from (line 74): The adapter supports browser environments via window.env detection in getGroqApiKeyFromEnv, but Buffer.from() fails in browsers.

Suggested fix
   async generateSpeech(
     options: TTSOptions<GroqTTSProviderOptions>,
   ): Promise<TTSResult> {
     const {
-      model,
       text,
       voice = 'autumn',
       format = 'wav',
       speed,
       modelOptions,
     } = options

     validateAudioInput({ input: text, model })

-    const voiceFormat = format as GroqTTSFormat
+    const validatedFormat = this.validateFormat(format)
+    const validatedVoice = this.validateVoice(voice)

     const request: Groq_SDK.Audio.Speech.SpeechCreateParams = {
-      model,
+      model: this.model,
       input: text,
-      voice: voice as GroqTTSVoice,
-      response_format: voiceFormat,
+      voice: validatedVoice,
+      response_format: validatedFormat,
       speed,
       ...modelOptions,
     }

     const response = await this.client.audio.speech.create(request)

     const arrayBuffer = await response.arrayBuffer()
-    const base64 = Buffer.from(arrayBuffer).toString('base64')
+    const base64 = this.arrayBufferToBase64(arrayBuffer)

-    const contentType = this.getContentType(voiceFormat)
+    const contentType = this.getContentType(validatedFormat)

     return {
       id: generateId(this.name),
-      model,
+      model: this.model,
       audio: base64,
-      format: voiceFormat,
+      format: validatedFormat,
       contentType,
     }
   }
+
+  private validateFormat(format: string): GroqTTSFormat {
+    const validFormats: GroqTTSFormat[] = ['flac', 'mp3', 'mulaw', 'ogg', 'wav']
+    if (validFormats.includes(format as GroqTTSFormat)) {
+      return format as GroqTTSFormat
+    }
+    return 'wav' // fallback to default
+  }
+
+  private validateVoice(voice: string): GroqTTSVoice {
+    const validVoices: GroqTTSVoice[] = [
+      'autumn', 'diana', 'hannah', 'austin', 'daniel', 'troy',
+      'fahad', 'sultan', 'lulwa', 'noura'
+    ]
+    if (validVoices.includes(voice as GroqTTSVoice)) {
+      return voice as GroqTTSVoice
+    }
+    return 'autumn' // fallback to default
+  }
+
+  private arrayBufferToBase64(buffer: ArrayBuffer): string {
+    if (typeof Buffer !== 'undefined') {
+      return Buffer.from(buffer).toString('base64')
+    }
+    const bytes = new Uint8Array(buffer)
+    let binary = ''
+    for (let i = 0; i < bytes.byteLength; i++) {
+      binary += String.fromCharCode(bytes[i])
+    }
+    return btoa(binary)
+  }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@packages/typescript/ai-groq/src/adapters/tts.ts` around lines 46 - 85,
generateSpeech currently ignores the adapter's configured model and uses
caller-supplied options.model, unsafely casts voice/format to Groq types, and
uses Node-only Buffer.from for base64 conversion; update generateSpeech to use
this.model (not options.model) when building the request, remove the unchecked
casts for voice/format and instead validate/map options.voice and options.format
against GroqTTSVoice and GroqTTSFormat (throw or normalize when unsupported),
and replace Buffer.from(arrayBuffer).toString('base64') with an isomorphic
conversion helper (e.g., use Buffer when available or a browser-safe
ArrayBuffer→base64 path like Uint8Array→binary string→btoa) so the adapter works
in both Node and browsers; refer to the generateSpeech method, request variable
(Groq_SDK.Audio.Speech.SpeechCreateParams), and getContentType for where to
apply these fixes.
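
The prompt above allows either throwing or normalizing on unsupported values, while the suggested diff silently falls back to defaults. For comparison, here is a sketch of the throwing variant. The helper `assertOneOf` and the two constant arrays are hypothetical names; the voice and format values are the ones listed in the suggested fix.

```typescript
// Allow-lists copied from the suggested fix in the review comment.
const GROQ_TTS_FORMATS = ['flac', 'mp3', 'mulaw', 'ogg', 'wav'] as const
const GROQ_TTS_VOICES = [
  'autumn', 'diana', 'hannah', 'austin', 'daniel', 'troy',
  'fahad', 'sultan', 'lulwa', 'noura',
] as const

type GroqTTSFormat = (typeof GROQ_TTS_FORMATS)[number]
type GroqTTSVoice = (typeof GROQ_TTS_VOICES)[number]

// Hypothetical helper: returns the value narrowed to the allow-list type,
// or throws with a message that names the accepted values.
function assertOneOf<T extends string>(
  kind: string,
  value: string,
  allowed: readonly T[],
): T {
  const match = allowed.find((candidate) => candidate === value)
  if (match === undefined) {
    throw new Error(
      `Unsupported ${kind} "${value}". Expected one of: ${allowed.join(', ')}`,
    )
  }
  return match
}

// Usage: narrow caller-supplied strings before building the request.
const format: GroqTTSFormat = assertOneOf('format', 'wav', GROQ_TTS_FORMATS)
const voice: GroqTTSVoice = assertOneOf('voice', 'autumn', GROQ_TTS_VOICES)
```

Throwing surfaces a mismatch such as `format: 'opus'` at the call site instead of quietly returning WAV audio the caller did not ask for; which behavior is preferable depends on how the shared `TTSOptions` type is meant to be used across providers.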

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: a0381b5d-b6d4-4520-8faf-a4c1e967da64

📥 Commits

Reviewing files that changed from the base of the PR and between 5c99251 and 28ad865.

📒 Files selected for processing (3)
  • packages/typescript/ai-groq/src/adapters/tts.ts
  • packages/typescript/ai-groq/src/index.ts
  • packages/typescript/ai-groq/src/model-meta.ts
🚧 Files skipped from review as they are similar to previous changes (1)
  • packages/typescript/ai-groq/src/index.ts
