Labels: bug, live-api
Description
When using the ADK's run_live() with native audio models (Gemini 2.0 Flash Live), setting language_code="en-US" in SpeechConfig does not constrain the input audio transcription language. The transcribed user input still intermittently appears in other languages (e.g., Hindi, Spanish, Japanese) even though the speaker is speaking English.
Configuration
```python
from google.adk.agents.run_config import RunConfig, StreamingMode
from google.genai import types

run_config = RunConfig(
    streaming_mode=StreamingMode.BIDI,
    response_modalities=["AUDIO"],
    speech_config=types.SpeechConfig(
        language_code="en-US",
        voice_config=types.VoiceConfig(
            prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name="Kore")
        ),
    ),
    input_audio_transcription=types.AudioTranscriptionConfig(),
    output_audio_transcription=types.AudioTranscriptionConfig(),
)
```
Observed behaviour
- Output speech is correctly in English.
- Input transcription (input_audio_transcription events) frequently returns non-English text even though the user is speaking English exclusively. For example, an English sentence might be transcribed in Devanagari script or mixed with non-Latin characters.
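Until the language constraint is honored server-side, a client-side heuristic can at least flag the mis-scripted transcripts so they can be dropped or retried. A minimal sketch (the function name and threshold are illustrative, not part of the ADK):

```python
import unicodedata


def looks_non_latin(text: str, threshold: float = 0.3) -> bool:
    """Return True if most alphabetic characters fall outside the Latin script.

    A cheap guard for the bug above: an "English" transcript that comes back
    in Devanagari (or another non-Latin script) will trip this check.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return False
    non_latin = sum(
        1 for ch in letters if not unicodedata.name(ch, "").startswith("LATIN")
    )
    return non_latin / len(letters) >= threshold
```

This is only a mitigation, of course; it cannot recover what the user actually said.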
Expected behaviour
Setting language_code="en-US" in SpeechConfig should constrain both the output speech language and the input speech-to-text transcription language. At minimum, the input transcription should respect the configured language and not produce text in unrelated languages.
Root cause analysis
AudioTranscriptionConfig is currently an empty class (pass) with no language_code or similar field:

```python
class AudioTranscriptionConfig(_common.BaseModel):
    """The audio transcription configuration in Setup."""

    pass
```

There appears to be no mechanism to pass the desired transcription language to the underlying STT model for input audio. SpeechConfig.language_code seems to affect only TTS output, not STT input.
Suggested fix
Either:
- Propagate SpeechConfig.language_code to the input transcription pipeline so that it constrains the STT language, or
- Add a language_code field to AudioTranscriptionConfig so developers can explicitly set the transcription language:

```python
input_audio_transcription=types.AudioTranscriptionConfig(language_code="en-US")
```
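For the second option, a rough sketch of what the extended config could look like. This uses a plain dataclass as a stand-in for the google-genai Pydantic model, and the language_code field is the proposal, not an existing API:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AudioTranscriptionConfig:
    """The audio transcription configuration in Setup (proposed shape)."""

    # Proposed field: BCP-47 code constraining the STT language;
    # None would preserve today's auto-detect behaviour.
    language_code: Optional[str] = None


# Developers could then pin input transcription to English:
cfg = AudioTranscriptionConfig(language_code="en-US")
```

The Live API setup message sent to the server would also need to carry and honor this field for it to have any effect.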
Environment
- ADK version: google-adk 1.x (latest)
- google-genai SDK version: latest
- Model: gemini-2.5-flash-native-audio-preview-12-2025
- Streaming mode: BIDI (WebSocket)