GS_Audio
Audio management, event-based sound playback, multi-layer music scoring, mixing buses with effects, and Klatt formant voice synthesis with 3D spatial audio.
GS_Audio provides a complete audio solution for GS_Play projects. It includes a visual node-based audio event authoring system, multi-layer music scoring, named mixing buses with configurable effects chains, and a built-in Klatt formant voice synthesizer with 3D spatial audio. All features integrate with the GS_Play manager lifecycle and respond to standby mode automatically.
For usage guides and setup examples, see The Basics: GS_Audio.
Audio Management
The Audio Manager singleton initializes the MiniAudio engine, manages mixing buses, and coordinates score track playback. It extends GS_ManagerComponent and participates in the standard two-stage initialization.
| Component | Purpose |
|---|---|
| Audio Manager | Master audio controller – engine lifecycle, bus routing, event library loading, score management. |
Audio Manager API
Audio Event Graph
A visual node-based editor for authoring complex sound events. Build audio processing chains with sources, filters, effects, and conditional routing — all evaluated as a data-flow graph at runtime using the gs_graphcanvas framework.
| Feature | Purpose |
|---|---|
| Audio Event Graph | Visual sound event editor — sources, filters, effects, routing, and phase lifecycle. |
Audio Event Graph
Mixing & Effects
Named audio buses with configurable effects chains and environmental influence. Includes 9 built-in audio filter types.
Mixing & Effects API
Score Arrangement
Multi-layer music scoring with configurable time signatures, tempo, and layer control.
Score Arrangement API
Klatt Voice Synthesis
Built-in text-to-speech using Klatt formant synthesis with 3D spatial audio.
Klatt Voice API
Dependencies
- GS_Core (required)
- MiniAudio (third-party audio library)
- SoLoud (embedded, for voice synthesis)
Installation
- Enable the GS_Audio gem in your project configuration.
- Ensure GS_Core and MiniAudio are also enabled.
- Create an Audio Manager prefab and add it to the Game Manager’s Startup Managers list.
Get GS_Audio
GS_Audio — Explore this gem on the product page and add it to your project.
1 - Audio Manager
Master audio controller – engine initialization, mixing bus routing, event library loading, score playback, and Audio Event Graph instance management.
The Audio Manager is the master audio controller for every GS_Play project. It extends GS_ManagerComponent and participates in the standard two-stage initialization managed by the Game Manager. On startup it initializes the MiniAudio engine, creates the named mixing bus graph, and coordinates score track playback.
Like all GS_Play managers, the Audio Manager responds to standby mode automatically — muting or pausing audio output when the game enters a blocking operation such as a stage change.
For usage guides and setup examples, see The Basics: GS_Audio.

How It Works
Engine Lifecycle
When the Audio Manager activates, it initializes a MiniAudio engine instance and builds the mixing bus graph from its configured bus list.
Mixing Bus Routing
All audio output flows through named mixing buses. Each bus is a GS_MixingBus node in the MiniAudio graph with its own volume level and optional effects chain. The Audio Manager owns the top-level routing and exposes volume control per bus through the request bus.
Audio Event Graph Pooling
The Audio Manager maintains a template cache and instance pool for Audio Event Graphs. When a graph is played:
- The first request for a path loads and caches the GraphDocumentAsset
- A GraphInstance is created from the cached asset
- On release, idle instances are pooled for reuse (default pool size: 8)
- Each instance is fully independent: variables, phase, and state do not bleed between instances
Score Playback
The Audio Manager coordinates playback of ScoreArrangementTrack assets — multi-layer musical scores with configurable tempo, time signature, and layer selection.
Inspector Properties
| Property | Type | Description |
|---|---|---|
| Mixing Buses | AZStd::vector<BusEffectsPair> | Named mixing buses with optional effects chains. |
| Default Master Volume | float | Initial master volume (0.0 to 1.0). |
API Reference
GS_AudioManagerComponent
| Field | Value |
|---|---|
| TypeId | {F28721FD-B9FD-4C04-8CD1-6344BD8A3B78} |
| Extends | GS_Core::GS_ManagerComponent |
| Header | GS_Audio/GS_AudioManagerBus.h |
Audio Event Graph
Instanced, visual-graph-based sound events with full lifecycle control, variable injection, and 3D spatialization.
Playback Control
| Method | Parameters | Returns | Description |
|---|---|---|---|
| PlayAudioGraph | const AZStd::string& assetPath | AZ::u32 instanceId | Fire-and-forget graph playback. Auto-cleaned up when finished. Returns an instance ID. |
| AcquireAudioGraph | const AZStd::string& assetPath | AZ::u32 instanceId | Acquires a graph instance for manual control. Caller is responsible for release. |
| FireAudioGraph | AZ::u32 instanceId | void | Starts or restarts playback on an acquired instance. |
| StopAudioGraph | AZ::u32 instanceId | void | Stops playback cleanly and returns the instance to the pool. |
| StopAllAudioGraphs | – | void | Emergency stops all playing graph instances. |
| ReleaseAudioGraph | AZ::u32 instanceId | void | Releases a manually acquired instance back to the pool. |
| StopAudioGraphFade | AZ::u32 instanceId, float fadeTime | void | Fades out and stops over the specified duration (seconds). |
Lifecycle
| Method | Parameters | Returns | Description |
|---|---|---|---|
| SetAudioGraphLooping | AZ::u32 instanceId, bool looping | void | Enables or disables the Loop phase. |
| FinishAudioGraph | AZ::u32 instanceId | void | Transitions to the Finish phase after current sounds complete. |
Variable Control
Set graph variables at runtime to drive filter parameters, routing, and source selection.
| Method | Parameters | Returns | Description |
|---|---|---|---|
| SetAudioGraphVariable | AZ::u32 instanceId, const AZStd::string& name, float value | void | Sets a float variable on the instance. |
| SetAudioGraphVariableInt | AZ::u32 instanceId, const AZStd::string& name, int value | void | Sets an int variable. |
| SetAudioGraphVariableBool | AZ::u32 instanceId, const AZStd::string& name, bool value | void | Sets a bool variable. |
| SetAudioGraphVariableString | AZ::u32 instanceId, const AZStd::string& name, const AZStd::string& value | void | Sets a string variable. |
Setting a variable marks dependent nodes dirty and re-evaluates only the affected portion of the graph on the next tick.
3D Spatialization
| Method | Parameters | Returns | Description |
|---|---|---|---|
| SetAudioGraphEntity | AZ::u32 instanceId, const AZ::EntityId& entityId | void | Tracks entity position for 3D audio. Updated each tick. |
| SetAudioGraphPosition | AZ::u32 instanceId, const AZ::Vector3& position | void | Fixed world position for 3D audio. |
| ClearAudioGraphSpatialization | AZ::u32 instanceId | void | Disables 3D spatialization (plays as 2D). |
Usage Pattern
// Fire-and-forget
AudioManagerRequestBus::Broadcast(&AudioManagerRequests::PlayAudioGraph,
    "Assets/Audio/Footstep_Stone.audiograph");
// Manual control with variable injection
AZ::u32 instanceId = 0;
AudioManagerRequestBus::BroadcastResult(instanceId,
    &AudioManagerRequests::AcquireAudioGraph,
    "Assets/Audio/Ambience_Forest.audiograph");
AudioManagerRequestBus::Broadcast(&AudioManagerRequests::SetAudioGraphLooping, instanceId, true);
AudioManagerRequestBus::Broadcast(&AudioManagerRequests::SetAudioGraphEntity, instanceId, entityId);
AudioManagerRequestBus::Broadcast(&AudioManagerRequests::FireAudioGraph, instanceId);
// Later: set a variable to muffle the sound underwater
AudioManagerRequestBus::Broadcast(&AudioManagerRequests::SetAudioGraphVariable,
    instanceId, "underwater", 0.8f);
// Stop when done
AudioManagerRequestBus::Broadcast(&AudioManagerRequests::StopAudioGraph, instanceId);
AudioManagerRequestBus::Broadcast(&AudioManagerRequests::ReleaseAudioGraph, instanceId);
Mixing
| Method | Parameters | Returns | Description |
|---|---|---|---|
| SetMixerVolume | const AZStd::string& busName, float volume | void | Sets the volume of a named mixing bus (0.0 – 1.0). |
| GetMixerVolume | const AZStd::string& busName | float | Returns the current volume of a named mixing bus. |
| SetMasterVolume | float volume | void | Sets master output volume (0.0 – 1.0). |
| GetMasterVolume | – | float | Returns current master output volume. |
Score
| Method | Parameters | Returns | Description |
|---|---|---|---|
| PlayScoreTrack | const AZ::Data::Asset<ScoreArrangementTrack>& track | void | Begins playback of a score arrangement track. |
| StopScoreTrack | – | void | Stops the currently playing score track. |
Notification Bus: AudioManagerNotificationBus
Events broadcast by the Audio Manager. Multiple handler bus — any number of components can subscribe.
| Event | Parameters | Description |
|---|---|---|
| OnScoreTrackStarted | – | Fired when a score track begins. |
| OnScoreTrackStopped | – | Fired when a score track stops. |
| OnMixerVolumeChanged | const AZStd::string& busName, float volume | Fired when a mixing bus volume changes. |
2 - Audio Event Graph
Visual graph editor for authoring complex sound events with filters, effects, routing, and phase lifecycle.
The Audio Event Graph is an FMOD-style visual editor for authoring complex sound events. Each .audiograph file defines one sound event as a node graph — sources, filters, effects, and routing all connected visually. At runtime, the graph maps directly to a miniaudio processing pipeline.
The Audio Event Graph uses the gs_graphcanvas framework with a DataFlowGraph topology. Nodes evaluate in topological order with dirty-propagation — only nodes affected by a change are re-evaluated.
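Dirty propagation over a topologically ordered graph can be sketched as follows. This is a simplified standalone model, not the gs_graphcanvas implementation: `DataFlowGraph` here is a hypothetical struct, and it assumes node indices are already in topological order so that a single forward pass evaluates every node after its inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct DataFlowGraph {
    // edges[i] lists the nodes that consume node i's output.
    std::vector<std::vector<std::size_t>> edges;
    std::vector<bool> dirty;
    std::vector<int> evalCount; // how many times each node has been evaluated

    // Every node starts dirty so the first pass evaluates the whole graph.
    explicit DataFlowGraph(std::size_t n) : edges(n), dirty(n, true), evalCount(n, 0) {}

    void Connect(std::size_t from, std::size_t to) {
        assert(from < to && to < edges.size()); // keeps indices topologically ordered
        edges[from].push_back(to);
    }

    // Mark a node and everything downstream of it as needing re-evaluation.
    void MarkDirty(std::size_t node) {
        dirty[node] = true;
        for (std::size_t next : edges[node]) MarkDirty(next);
    }

    // Evaluate only dirty nodes, in index (topological) order.
    std::size_t Evaluate() {
        std::size_t evaluated = 0;
        for (std::size_t i = 0; i < dirty.size(); ++i) {
            if (!dirty[i]) continue;
            ++evalCount[i]; // stand-in for the node's real processing
            dirty[i] = false;
            ++evaluated;
        }
        return evaluated;
    }
};
```

After the initial full pass, changing one source node re-evaluates that node and its downstream chain while unrelated branches stay untouched.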

Opening the Editor
Open the Audio Event Graph editor from the O3DE Editor menu: GS Tools > Audio Event Graph Editor.
Each graph is a standalone .audiograph file — use File > New to create one, or File > Open to load an existing graph. Multiple graphs can be open simultaneously in separate tabs.
How It Works
Audio Event Graphs use a data-flow model. Connections carry AudioRoute values — handles that represent a point in the miniaudio processing chain. The signal flows left to right:
- Entry nodes gate which phase is active (Start, Loop, or Finish)
- Source nodes create sound instances and output an AudioRoute
- Filter and effect nodes process the audio and pass it along
- Routing nodes (If/Switch) direct the signal based on conditions or variables
- The Output node wires the final AudioRoute to a mixing bus
Variables can be set from gameplay code at any time, causing affected nodes to re-evaluate without restarting the entire graph.
Node Catalog
Entry Nodes (Phase Gates)
| Node | Purpose |
|---|---|
| Aud_StartNode | Active during the Start phase |
| Aud_LoopNode | Active during the Loop phase |
| Aud_FinishNode | Active during the Finish phase |
Entry nodes output a gate signal. Downstream nodes only produce audio when their entry gate is active.
Source Nodes

| Node | Purpose | Key Properties |
|---|---|---|
| Aud_SoundNode | Single audio asset playback | Volume, pitch, delay, looping, 3D spatialization |
| Aud_AudioPoolNode | Random selection from a pool of audio assets | Same as SoundNode, plus asset pool list |
Filter Nodes

| Node | Filter Type |
|---|---|
| LowPass | Low-pass filter |
| HighPass | High-pass filter |
| BandPass | Band-pass filter |
| Notch | Notch (band-reject) filter |
| PeakingEQ | Peaking EQ |
| LowShelf | Low shelf filter |
| HighShelf | High shelf filter |
All filter nodes take an audio_in input and produce an audio_out output. Filter parameters (cutoff frequency, Q factor, gain) are configurable per-node or bindable to variables.
Effect Nodes

| Node | Purpose |
|---|---|
| Aud_EchoNode | Echo / reverb effect |
Routing Nodes
The built-in gs_graphcanvas IfNode and SwitchNode work with AudioRoute values. Use them for conditional audio paths based on game state variables (e.g., play different sounds based on surface type or weather).
Terminal Node
| Node | Purpose |
|---|---|
| Aud_OutputNode | Wires the final audio chain to the mixing bus |
Every graph needs exactly one Output node.
Phase Lifecycle

Audio Event Graphs support a three-phase lifecycle for structured sound playback:
- Start — Initial playback. The Aud_StartNode gate is active. One-shot intro sounds play here.
- Loop — Repeating section. The Aud_LoopNode gate is active. Ambient loops and sustained sounds live here.
- Finish — Outro/tail. The Aud_FinishNode gate is active. Fade-outs and release tails play here.
Phases transition automatically: Start plays once, then Loop repeats (if looping is enabled), then Finish plays when FinishAudioGraph() is called. Not all phases are required — a simple one-shot sound only needs a Start node.
Note that entry nodes can share downstream routing. Several entry nodes may feed the same chain of nodes, and the system resolves the final sound from whichever entry's phase is currently active, with no limit on how the paths cross over.
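The phase transitions described above can be modelled as a small state machine. This is a sketch under assumptions rather than the runtime's actual logic: `PhaseState` and `OnPhaseSoundsComplete` are hypothetical names, and moving straight from Start to Finish when looping is disabled is an assumed behavior.

```cpp
#include <cassert>

enum class Phase { Start, Loop, Finish, Done };

struct PhaseState {
    Phase phase = Phase::Start;
    bool looping = false;         // mirrors SetAudioGraphLooping
    bool finishRequested = false; // mirrors a FinishAudioGraph call
};

// Called when the sounds of the current phase have finished playing.
Phase OnPhaseSoundsComplete(PhaseState& s) {
    switch (s.phase) {
    case Phase::Start:
    case Phase::Loop:
        // Stay in (or enter) Loop only while looping is on and no finish was requested.
        s.phase = (s.looping && !s.finishRequested) ? Phase::Loop : Phase::Finish;
        break;
    case Phase::Finish:
        s.phase = Phase::Done;
        break;
    case Phase::Done:
        break;
    }
    return s.phase;
}
```

A looping ambience stays in Loop until a finish is requested, then plays its Finish phase once. A one-shot with looping disabled skips Loop entirely.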
Using Variables
Variables let gameplay code control sound behavior at runtime. Common uses:
- Filter parameters — Bind a filter’s cutoff frequency to a variable, then adjust it from code (e.g., underwater muffling)
- Routing conditions — Use a variable as the condition for an If/Switch node to select different sound paths
- Source selection — Gate different source nodes based on game state
Declare variables in the Variable Panel, then either bind them to input slots (right-click > Convert to Reference) or use Get/Set variable nodes.
At runtime, set variables via the Audio Manager API:
AudioManagerRequestBus::Broadcast(&AudioManagerRequests::SetAudioGraphVariable, instanceId, "underwater", 0.8f);
Setting a variable marks dependent nodes dirty and re-evaluates only the affected portion of the graph.
Extending with Custom Nodes
To add a custom audio node:
- Create a class inheriting from BaseNode and IDataFlowNode
- Register it with GS_AUTO_REGISTER_NODE_FOR(MyAudioNode, "audiograph")
- Implement Process(GraphExecutionContext& context)
- Follow the empty any pattern for inactive outputs:
void MyAudioNode::Process(GraphExecutionContext& context)
{
    auto inputAny = context.GetInputValueAny(this, "audio_in");
    if (inputAny.empty())
    {
        // CRITICAL: the output must be a truly empty any, not a default-constructed route
        context.SetOutputValueAny(this, "audio_out", AZStd::any{});
        return;
    }
    // Process audio and build outputRoute (the resulting AudioRoute) here...
    context.SetOutputValue(this, "audio_out", outputRoute);
}
The empty any pattern ensures that inactive paths don’t mask valid audio from other connections in multi-input slots.
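To see why a truly empty value matters, consider a standalone model using `std::any`. This is an illustration only: `AudioRoute` is a hypothetical stand-in, and the rule that a multi-input slot takes the first connection holding a value is an assumption about how such slots are resolved.

```cpp
#include <any>
#include <cassert>
#include <vector>

struct AudioRoute { int nodeHandle; }; // hypothetical stand-in for the real route handle

// Assumed rule: a multi-input slot resolves to the first connection that holds a value.
std::any ResolveMultiInput(const std::vector<std::any>& connections) {
    for (const std::any& c : connections) {
        if (c.has_value()) return c;
    }
    return std::any{}; // every connection is inactive
}
```

If an inactive path wrote a placeholder such as `AudioRoute{-1}` instead of an empty any, that placeholder would hold a value and would win over a valid route connected later in the list.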
For full details on node creation, see gs_graphcanvas Nodes.
Runtime API
Audio Event Graphs are played through the Audio Manager component. Key methods:
| Method | Description |
|---|---|
| PlayAudioGraph(path) | Fire-and-forget playback (auto-cleanup) |
| AcquireAudioGraph(path) | Get a handle for manual control |
| FireAudioGraph(id) | Start/restart playback |
| StopAudioGraph(id) | Stop playback |
| FinishAudioGraph(id) | Transition to the Finish phase |
| SetAudioGraphVariable(id, name, value) | Set a graph variable at runtime |
| SetAudioGraphLooping(id, looping) | Enable/disable loop phase |
| SetAudioGraphEntity(id, entityId) | Track entity position for 3D spatialization |
| SetAudioGraphPosition(id, pos) | Fixed world position for 3D |
| StopAudioGraphFade(id, fadeTime) | Fade out and stop |
See the Audio Manager documentation for the full API reference.
3 - Mixing & Effects
Named audio mixing buses with configurable effects chains, environmental influence, and 9 built-in audio filter types.
GS_Audio provides a named mixing bus system built on custom MiniAudio nodes. Each GS_MixingBus is a node in the audio graph with its own volume level and an optional chain of audio filters. Buses are configured in the Audio Manager Inspector and can be controlled at runtime through the mixing request bus.
The effects system includes 9 built-in filter types covering frequency shaping, equalization, delay, and reverb. Environmental influence effects allow game world volumes (rooms, weather zones) to push effects onto buses with priority-based stacking.
For usage guides and setup examples, see The Basics: GS_Audio.
GS_Audio is in Early Development. Full support planned soon: 2026.
GS_MixingBus
Custom MiniAudio node for mixing and effects processing.
| Field | Value |
|---|---|
| TypeId | {26E5BA8D-33E0-42E4-BBC0-6A3B2C46F52E} |
API Reference
Request Bus: GS_MixingRequestBus
Mixer control commands. Singleton bus – Single address, single handler.
| Method | Parameters | Returns | Description |
|---|---|---|---|
| SetBusVolume | const AZStd::string& busName, float volume | void | Sets the volume of a named mixing bus (0.0 to 1.0). |
| GetBusVolume | const AZStd::string& busName | float | Returns the current volume of a named mixing bus. |
| MuteBus | const AZStd::string& busName, bool mute | void | Mutes or unmutes a named mixing bus. |
| IsBusMuted | const AZStd::string& busName | bool | Returns whether a named mixing bus is currently muted. |
| ApplyBusEffects | const AZStd::string& busName, const AudioBusEffects& effects | void | Applies an effects chain to a named mixing bus, replacing any existing effects. |
| ClearBusEffects | const AZStd::string& busName | void | Removes all effects from a named mixing bus. |
| PushInfluenceEffects | const AZStd::string& busName, const AudioBusInfluenceEffects& effects | void | Pushes environmental influence effects onto a bus with priority stacking. |
| PopInfluenceEffects | const AZStd::string& busName, int priority | void | Removes influence effects at the specified priority level from a bus. |
Audio Filters
All 9 built-in filter types. Each filter is configured as part of an effects chain applied to a mixing bus.
| Filter | Type | Description |
|---|---|---|
| GS_LowPassFilter | Frequency cutoff | Attenuates frequencies above the cutoff point. Used for muffling, distance simulation, and underwater effects. |
| GS_HighPassFilter | Frequency cutoff | Attenuates frequencies below the cutoff point. Used for thinning audio, radio/telephone effects. |
| GS_BandPassFilter | Band isolation | Passes only frequencies within a specified band, attenuating everything outside. Combines low-pass and high-pass behavior. |
| GS_NotchFilter | Band removal | Attenuates frequencies within a narrow band while passing everything outside. The inverse of band-pass. |
| GS_PeakingEQFilter | Band boost/cut | Boosts or cuts frequencies around a center frequency with configurable bandwidth. Used for tonal shaping. |
| GS_LowShelfFilter | Low frequency shelf | Boosts or cuts all frequencies below a threshold by a fixed amount. Used for bass adjustment. |
| GS_HighShelfFilter | High frequency shelf | Boosts or cuts all frequencies above a threshold by a fixed amount. Used for treble adjustment. |
| GS_DelayFilter | Echo/delay | Produces delayed repetitions of the input signal. Configurable delay time and feedback amount. |
| GS_ReverbFilter | Room reverb | Simulates room acoustics by adding dense reflections. Configurable room size and damping. |
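As a rough illustration of what a frequency-cutoff filter does to a signal, here is a one-pole low-pass in standalone C++. It is the simplest textbook form and is not claimed to match the design behind GS_LowPassFilter. `OnePoleLowPass` is a hypothetical class name.

```cpp
#include <cassert>
#include <cmath>

// One-pole low-pass: y[n] = y[n-1] + alpha * (x[n] - y[n-1])
class OnePoleLowPass {
public:
    OnePoleLowPass(double cutoffHz, double sampleRateHz)
        : m_alpha(1.0 - std::exp(-2.0 * kPi * cutoffHz / sampleRateHz)) {
        assert(cutoffHz > 0.0 && sampleRateHz > 0.0);
    }

    // Process one sample and return the filtered value.
    double Process(double x) {
        m_state += m_alpha * (x - m_state);
        return m_state;
    }

private:
    static constexpr double kPi = 3.14159265358979323846;
    double m_alpha;       // smoothing coefficient derived from the cutoff
    double m_state = 0.0; // previous output sample
};
```

A constant (0 Hz) input passes through almost unchanged once the filter settles, while a signal alternating every sample (the highest representable frequency) is strongly attenuated. Lowering the cutoff produces the muffling effect mentioned in the table.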
Data Structures
BusEffectsPair
Maps a bus name to an effects chain configuration. Used in the Audio Manager’s Inspector to define per-bus effects at design time.
| Field | Value |
|---|---|
| TypeId | {AD9E26C9-C172-42BF-B38C-BB06FC704E36} |

| Field | Type | Description |
|---|---|---|
| Bus Name | AZStd::string | The name of the mixing bus this effects chain applies to. |
| Effects | AudioBusEffects | The effects chain configuration for this bus. |
AudioBusEffects
A collection of audio filter configurations that form an effects chain on a mixing bus.
| Field | Value |
|---|---|
| TypeId | {15EC6932-1F88-4EC0-9683-6D80AE982820} |

| Field | Type | Description |
|---|---|---|
| Filters | AZStd::vector<AudioFilter> | Ordered list of audio filters applied in sequence. |
AudioBusInfluenceEffects
Environmental effects with priority-based stacking. Game world volumes (rooms, weather zones, underwater areas) push influence effects onto mixing buses. Higher priority influences override lower ones.
| Field | Value |
|---|---|
| TypeId | {75D039EC-7EE2-4988-A2ED-86689449B575} |

| Field | Type | Description |
|---|---|---|
| Priority | int | Stacking priority. Higher values override lower values when multiple influences target the same bus. |
| Effects | AudioBusEffects | The effects chain to apply as an environmental influence. |
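Priority-based stacking can be sketched with an ordered map keyed by priority. This is a minimal model, not the GS_MixingBus implementation: `Influence` and `InfluenceStack` are hypothetical, and holding at most one influence per priority level is an assumption suggested by PopInfluenceEffects taking only a priority.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for the effects payload of one influence.
struct Influence { std::string effectName; };

class InfluenceStack {
public:
    // Pushing at an existing priority replaces the previous entry (assumption).
    void Push(int priority, Influence fx) { m_byPriority[priority] = std::move(fx); }

    // Removing a priority that is not present is a no-op.
    void Pop(int priority) { m_byPriority.erase(priority); }

    // The highest priority present is the one that applies to the bus.
    std::optional<Influence> Active() const {
        if (m_byPriority.empty()) return std::nullopt;
        return m_byPriority.rbegin()->second;
    }

private:
    std::map<int, Influence> m_byPriority; // ordered by ascending priority
};
```

When a higher-priority volume such as an underwater zone is popped, the next-highest influence (for example a room reverb) becomes active again without being pushed a second time.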
See Also
For related resources:
- Audio Events – Events route their output through mixing buses
4 - Score Arrangement
Multi-layer musical score system for dynamic music – tempo, time signatures, fade control, and layer selection.
The Score Arrangement system provides multi-layer dynamic music for GS_Play projects. A ScoreArrangementTrack asset defines a musical score with configurable tempo, time signature, fade behavior, and multiple layers that can be enabled or disabled at runtime. This allows game music to adapt to gameplay state – adding or removing instrumental layers, changing intensity, or crossfading between sections.
Score tracks are loaded and controlled through the Audio Manager request bus.
For usage guides and setup examples, see The Basics: GS_Audio.

Data Model
ScoreArrangementTrack
A multi-layer musical score asset. Each track defines the musical structure and contains one or more layers that play simultaneously.
| Field | Value |
|---|---|
| TypeId | {DBB48082-1834-4DFF-BAD2-6EA8D83F1AD0} |
| Extends | AZ::Data::AssetData |
| Reflection | Requires GS_AssetReflectionIncludes.h (see Serialization Helpers) |

| Field | Type | Description |
|---|---|---|
| Track Name | AZStd::string | Identifier for this score track. |
| Time Signature | TimeSignatures | The time signature for this score (e.g. 4/4, 3/4, 6/8). |
| BPM | float | Tempo in beats per minute. |
| Fade In Time | float | Duration in seconds for the score to fade in when playback begins. |
| Fade Out Time | float | Duration in seconds for the score to fade out when playback stops. |
| Loop | bool | Whether the score loops back to the beginning when it reaches the end. |
| Layers | AZStd::vector<ScoreLayer> | The musical layers that compose this score. |
| Active Layers | AZStd::vector<int> | Indices of layers that are active (audible) at the start of playback. |
ScoreLayer
A single musical layer within a score arrangement. Each layer represents one track of audio (e.g. drums, bass, melody) that can be independently enabled or disabled.
| Field | Value |
|---|---|
| TypeId | {C8B2669A-FAEA-4910-9218-6FE50D2E588E} |

| Field | Type | Description |
|---|---|---|
| Layer Name | AZStd::string | Identifier for this layer within the score. |
| Audio Asset | AZ::Data::Asset<AudioClipAsset> | The audio clip for this layer. |
| Volume | float | Base volume level for this layer (0.0 to 1.0). |
| Fade Time | float | Duration in seconds for this layer to fade in or out when toggled. |
TimeSignatures (Enum)
Supported musical time signatures for score arrangement tracks.
| Field | Value |
|---|---|
| TypeId | {6D6B5657-746C-4FCA-A0AC-671C0F064570} |

| Value | Beats per Measure | Beat Unit | Description |
|---|---|---|---|
| FourFour | 4 | Quarter note | 4/4 – Common time. The most widely used time signature. |
| FourTwo | 4 | Half note | 4/2 – Four half-note beats per measure. |
| TwelveEight | 12 | Eighth note | 12/8 – Compound quadruple meter. Four groups of three eighth notes. |
| TwoTwo | 2 | Half note | 2/2 – Cut time (alla breve). Two half-note beats per measure. |
| TwoFour | 2 | Quarter note | 2/4 – Two quarter-note beats per measure. March time. |
| SixEight | 6 | Eighth note | 6/8 – Compound duple meter. Two groups of three eighth notes. |
| ThreeFour | 3 | Quarter note | 3/4 – Waltz time. Three quarter-note beats per measure. |
| ThreeTwo | 3 | Half note | 3/2 – Three half-note beats per measure. |
| NineEight | 9 | Eighth note | 9/8 – Compound triple meter. Three groups of three eighth notes. |
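Tempo and time signature together determine how long one measure lasts, which is useful when aligning layer changes to bar boundaries. The sketch below assumes the BPM value counts the signature's own beat unit (so 6/8 at 180 BPM means 180 eighth notes per minute). The source does not state this, and `Meter` is a hypothetical helper type.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper: beats per measure and the note value that gets one beat.
struct Meter {
    int beatsPerMeasure;
    int beatUnit; // 2 = half note, 4 = quarter note, 8 = eighth note
};

// Duration of one beat in seconds at the given tempo.
double SecondsPerBeat(double bpm) {
    assert(bpm > 0.0);
    return 60.0 / bpm;
}

// Duration of one full measure in seconds.
// Assumption: bpm counts the meter's own beat unit.
double SecondsPerMeasure(const Meter& meter, double bpm) {
    return meter.beatsPerMeasure * SecondsPerBeat(bpm);
}
```

For example, a 4/4 score at 120 BPM has 0.5 s beats and 2.0 s measures under this assumption.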
5 - Klatt Voice Synthesis
Custom text-to-speech via Klatt formant synthesis with 3D spatial audio, phoneme mapping, and voice profiling.
The Klatt Voice Synthesis system provides custom text-to-speech for GS_Play projects using Klatt formant synthesis with full 3D spatial audio. It uses SoLoud internally for speech generation and MiniAudio for spatial positioning.
The system has two layers:
- KlattVoiceSystemComponent – A singleton that manages the shared SoLoud engine instance and tracks the 3D audio listener position.
- KlattVoiceComponent – A per-entity component that generates speech, queues segments, applies voice profiles, and emits spatialized audio from the entity’s position.
Voice characteristics are defined through KlattVoiceProfile assets containing frequency, speed, waveform, formant, and phoneme mapping configuration. Phoneme maps convert input text to ARPABET phonemes for the Klatt synthesizer, with support for custom pronunciation overrides.
For usage guides and setup examples, see The Basics: GS_Audio.

Components
KlattVoiceSystemComponent
Singleton component that manages the shared SoLoud engine and 3D listener tracking.
| Field | Value |
|---|---|
| TypeId | {F4A5D6E7-8B9C-4D5E-A1F2-3B4C5D6E7F8A} |
| Extends | AZ::Component, AZ::TickBus::Handler |
| Bus | KlattVoiceSystemRequestBus (Single/Single) |
KlattVoiceComponent
Per-entity voice component with spatial audio, phoneme mapping, and segment queue.
| Field | Value |
|---|---|
| TypeId | {4A8B9C7D-6E5F-4D3C-2B1A-0F9E8D7C6B5A} |
| Extends | AZ::Component, AZ::TickBus::Handler |
| Request Bus | KlattVoiceRequestBus (Single/ById, entity-addressed) |
| Notification Bus | KlattVoiceNotificationBus (Multiple/Multiple) |
API Reference
Request Bus: KlattVoiceSystemRequestBus
System-level voice management. Singleton bus – Single address, single handler.
| Method | Parameters | Returns | Description |
|---|---|---|---|
| GetSoLoudEngine | – | SoLoud::Soloud* | Returns a pointer to the shared SoLoud engine instance. |
| SetListenerPosition | const AZ::Vector3& position | void | Updates the 3D audio listener position for spatial voice playback. |
| SetListenerOrientation | const AZ::Vector3& forward, const AZ::Vector3& up | void | Updates the 3D audio listener orientation. |
| GetListenerPosition | – | AZ::Vector3 | Returns the current listener position. |
| IsEngineReady | – | bool | Returns whether the SoLoud engine has been initialized and is ready. |
Request Bus: KlattVoiceRequestBus
Per-entity voice synthesis controls. Entity-addressed bus – Single handler per entity ID.
| Method | Parameters | Returns | Description |
|---|---|---|---|
| Speak | const AZStd::string& text | void | Converts text to speech and plays it. Uses the component’s configured voice profile. |
| SpeakWithParams | const AZStd::string& text, const KlattVoiceParams& params | void | Converts text to speech using the specified voice parameters instead of the profile defaults. |
| StopSpeaking | – | void | Immediately stops any speech in progress and clears the segment queue. |
| IsSpeaking | – | bool | Returns whether this entity’s voice is currently producing speech. |
| QueueSegment | const AZStd::string& text | void | Adds a speech segment to the queue. Queued segments play in order after the current segment finishes. |
| ClearQueue | – | void | Clears all queued speech segments without stopping current playback. |
| SetVoiceProfile | const AZ::Data::Asset<KlattVoiceProfile>& profile | void | Changes the voice profile used by this component. |
| GetVoiceProfile | – | AZ::Data::Asset<KlattVoiceProfile> | Returns the currently assigned voice profile asset. |
| SetSpatialConfig | const KlattSpatialConfig& config | void | Updates the 3D spatial audio configuration for this voice. |
| GetSpatialConfig | – | KlattSpatialConfig | Returns the current spatial audio configuration. |
| SetVolume | float volume | void | Sets the output volume for this voice (0.0 to 1.0). |
| GetVolume | – | float | Returns the current output volume. |
Notification Bus: KlattVoiceNotificationBus
Events broadcast by voice components. Multiple handler bus – any number of components can subscribe.
| Event | Parameters | Description |
|---|---|---|
| OnSpeechStarted | const AZ::EntityId& entityId | Fired when an entity begins speaking. |
| OnSpeechFinished | const AZ::EntityId& entityId | Fired when an entity finishes speaking (including all queued segments). |
| OnSegmentStarted | const AZ::EntityId& entityId, int segmentIndex | Fired when a new speech segment begins playing. |
| OnSegmentFinished | const AZ::EntityId& entityId, int segmentIndex | Fired when a speech segment finishes playing. |
Data Types
KlattVoiceParams
Core voice synthesis parameters controlling the Klatt formant synthesizer output.
| Field | Value |
|---|---|
| TypeId | {8A9C7F3B-4E2D-4C1A-9B5E-6D8F9A2C1B4E} |

| Field | Type | Description |
|---|---|---|
| Base Frequency | float | Fundamental frequency (F0) in Hz. Controls the base pitch of the voice. |
| Speed | float | Speech rate multiplier. 1.0 is normal speed. |
| Declination | float | Pitch declination rate. Controls how pitch drops over the course of an utterance. |
| Waveform | KlattWaveform | Glottal waveform type used by the synthesizer. |
| Formant Shift | float | Shifts all formant frequencies up or down. Positive values raise pitch character, negative values lower it. |
| Pitch Variance | float | Amount of random pitch variation applied during speech for natural-sounding intonation. |
KlattVoiceProfile
A voice profile asset combining synthesis parameters with a phoneme mapping.
| Field | Value |
|---|---|
| TypeId | {2CEB777E-DAA7-40B1-BFF4-0F772ADE86CF} |
| Reflection | Requires GS_AssetReflectionIncludes.h (see Serialization Helpers) |

| Field | Type | Description |
|---|---|---|
| Voice Params | KlattVoiceParams | The synthesis parameters for this voice profile. |
| Phoneme Map | AZ::Data::Asset<KlattPhonemeMap> | The phoneme mapping asset used for text-to-phoneme conversion. |
KlattVoicePreset
A preset configuration for quick voice setup.
| Field | Value |
|---|---|
| TypeId | {2B8D9E4F-7C6A-4D3B-8E9F-1A2B3C4D5E6F} |

| Field | Type | Description |
|---|---|---|
| Preset Name | AZStd::string | Display name for this preset. |
| Profile | KlattVoiceProfile | The voice profile configuration stored in this preset. |
KlattSpatialConfig
3D spatial audio configuration for voice positioning.
| Field | Value |
|---|---|
| TypeId | {7C9F8E2D-3A4B-5F6C-1E0D-9A8B7C6D5E4F} |

| Field | Type | Description |
|---|---|---|
| Enable 3D | bool | Whether this voice uses 3D spatialization. When false, audio plays as 2D. |
| Min Distance | float | Distance at which attenuation begins. Below this distance the voice plays at full volume. |
| Max Distance | float | Distance at which the voice reaches minimum volume. |
| Attenuation Model | int | The distance attenuation curve type (linear, inverse, exponential). |
| Doppler Factor | float | Intensity of the Doppler effect applied to this voice. 0.0 disables Doppler. |
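The three attenuation curves can be illustrated with the widely used clamped distance models (the same family defined by OpenAL). Whether GS_Audio uses exactly these formulas is an assumption, as are the `AttenuationModel` enumerator names and the `rolloff` parameter, which the table above does not list.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

enum class AttenuationModel { Linear, Inverse, Exponential }; // hypothetical names

// Returns a gain multiplier for a voice at the given distance from the listener.
double DistanceGain(AttenuationModel model, double distance,
                    double minDist, double maxDist, double rolloff = 1.0) {
    assert(minDist > 0.0 && maxDist > minDist);
    // Inside minDist the voice is at full volume; beyond maxDist it stops getting quieter.
    const double d = std::clamp(distance, minDist, maxDist);
    switch (model) {
    case AttenuationModel::Linear:
        return 1.0 - rolloff * (d - minDist) / (maxDist - minDist);
    case AttenuationModel::Inverse:
        return minDist / (minDist + rolloff * (d - minDist));
    case AttenuationModel::Exponential:
        return std::pow(d / minDist, -rolloff);
    }
    return 1.0; // unreachable for valid enumerators
}
```

With a rolloff of 1, only the linear model reaches silence at Max Distance. The inverse and exponential models level off at a small non-zero gain, which is one reading of "minimum volume" in the table above.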
KlattPhonemeMap
Phoneme mapping asset for text-to-ARPABET conversion with custom overrides.
| Field | Value |
|---|---|
| TypeId | {F3E9D7C1-2A4B-5E8F-9C3D-6A1B4E7F2D5C} |
| Reflection | Requires GS_AssetReflectionIncludes.h (see Serialization Helpers) |

| Field | Type | Description |
|---|---|---|
| Base Map | BasePhonemeMap | The base phoneme dictionary to use as the foundation for conversion. |
| Overrides | AZStd::vector<PhonemeOverride> | Custom pronunciation overrides for specific words or patterns. |
PhonemeOverride
A custom pronunciation rule that overrides the base phoneme map for a specific word or pattern.
| Field | Value |
|---|---|
| TypeId | {A2B5C8D1-4E7F-3A9C-6B2D-1F5E8A3C7D9B} |

| Field | Type | Description |
|---|---|---|
| Word | AZStd::string | The word or pattern to match. |
| Phonemes | AZStd::string | The ARPABET phoneme sequence to use for this word. |
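The relationship between Overrides and the Base Map can be sketched as a two-step lookup in which overrides are consulted first. This is an illustration using standard library types only. `LookupPhonemes` and `Dictionary` are hypothetical names, and exact-match comparison is an assumption, since the source does not describe how patterns or letter case are handled.

```cpp
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Stand-in for the PhonemeOverride fields listed above (Word, Phonemes).
struct PhonemeOverride
{
    std::string word;
    std::string phonemes;
};

// Hypothetical stand-in for a base phoneme dictionary: word -> ARPABET string.
using Dictionary = std::unordered_map<std::string, std::string>;

// Returns the ARPABET sequence for `word`, preferring a custom override
// over the base dictionary. Returns std::nullopt when neither has an entry.
std::optional<std::string> LookupPhonemes(
    const std::string& word,
    const std::vector<PhonemeOverride>& overrides,
    const Dictionary& baseMap)
{
    // Assumption: overrides match the whole word exactly.
    for (const PhonemeOverride& entry : overrides)
    {
        if (entry.word == word)
        {
            return entry.phonemes;
        }
    }

    const auto it = baseMap.find(word);
    if (it != baseMap.end())
    {
        return it->second;
    }
    return std::nullopt;
}
```

A caller would decide what to do with an empty result, for example falling back to letter-to-sound rules. That fallback is outside what the source describes.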
Enumerations
Glottal waveform types available for the Klatt synthesizer.
| Field | Value |
|---|
| TypeId | {8ED1DABE-3347-44A5-B43A-C171D36AE780} |
| Value | Description |
|---|
| Saw | Sawtooth waveform. Bright, buzzy character. |
| Triangle | Triangle waveform. Softer than sawtooth, slightly hollow. |
| Sin | Sine waveform. Pure tone, smooth and clean. |
| Square | Square waveform. Hollow, reed-like character. |
| Pulse | Pulse waveform. Variable duty cycle for varied timbres. |
| Noise | Noise waveform. Breathy, whisper-like quality. |
| Warble | Warble waveform. Modulated tone with vibrato-like character. |
BasePhonemeMap
Available base phoneme dictionaries for text-to-ARPABET conversion.
| Field | Value |
|---|
| TypeId | {D8F2A3C5-1B4E-7A9F-6D2C-5E8A1B3F4C7D} |
| Value | Description |
|---|
| SoLoud_Default | The default phoneme mapping built into SoLoud. Covers standard English pronunciation. |
| CMU_Full | The full CMU Pronouncing Dictionary. Comprehensive English phoneme coverage with over 130,000 entries. |
KTT (Klatt Text Tags) are inline commands embedded in strings passed to KlattVoiceComponent::SpeakText. They are parsed by KlattCommandParser::Parse and stripped from the spoken text before synthesis begins — they are never heard.
Format: <ktt attr1=value1 attr2=value2>
Multiple attributes can be combined in a single tag. Attribute names are case-insensitive. String values may optionally be wrapped in quotes. An empty value (e.g. speed=) resets that parameter to the voice profile default.
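The tag rules above (multiple attributes per tag, case-insensitive names, optional quotes, empty value meaning reset) can be illustrated with a small standalone parser. This is a sketch under stated assumptions, not the actual `KlattCommandParser::Parse` implementation. `ParseKtt`, `KttCommand`, and `KttParseResult` are hypothetical names, and the sketch does not handle a `>` character inside a quoted value.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

struct KttCommand
{
    std::size_t textOffset; // offset into the stripped text where the command applies
    std::string name;       // attribute name, lower-cased
    std::string value;      // empty string means "reset to profile default"
};

struct KttParseResult
{
    std::string spokenText;           // input with all <ktt ...> tags removed
    std::vector<KttCommand> commands; // commands in order of appearance
};

// Parses the attribute list found between "<ktt" and ">".
static void ParseAttributes(const std::string& body, std::size_t textOffset,
                            std::vector<KttCommand>& out)
{
    auto isSpace = [](char c) { return std::isspace(static_cast<unsigned char>(c)) != 0; };
    std::size_t i = 0;
    while (i < body.size())
    {
        while (i < body.size() && isSpace(body[i])) ++i;

        std::string name;
        while (i < body.size() && body[i] != '=' && !isSpace(body[i]))
        {
            name += static_cast<char>(std::tolower(static_cast<unsigned char>(body[i++])));
        }
        if (name.empty()) break; // end of body, or a stray '=' with no name

        std::string value;
        if (i < body.size() && body[i] == '=')
        {
            ++i;
            if (i < body.size() && body[i] == '"')
            {
                ++i; // skip opening quote
                while (i < body.size() && body[i] != '"') value += body[i++];
                if (i < body.size()) ++i; // skip closing quote
            }
            else
            {
                while (i < body.size() && !isSpace(body[i])) value += body[i++];
            }
        }
        out.push_back({textOffset, name, value});
    }
}

// Strips KTT tags from `input` and records each attribute as a command.
KttParseResult ParseKtt(const std::string& input)
{
    KttParseResult result;
    std::size_t pos = 0;
    while (pos < input.size())
    {
        const std::size_t open = input.find("<ktt", pos);
        const std::size_t close =
            (open == std::string::npos) ? std::string::npos : input.find('>', open);
        if (close == std::string::npos)
        {
            // No further complete tag: keep the remainder as spoken text.
            result.spokenText.append(input, pos, std::string::npos);
            break;
        }
        result.spokenText.append(input, pos, open - pos);
        ParseAttributes(input.substr(open + 4, close - open - 4),
                        result.spokenText.size(), result.commands);
        pos = close + 1;
    }
    return result;
}
```

Each command records an offset into the stripped text, so a synthesizer could apply it when playback reaches that character. How GS_Audio actually schedules commands is not covered by the source.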
speed=X
Override the speech speed multiplier from this point forward.
| |
|---|
| Range | 0.1 – 5.0 |
| Default reset | speed= (restores profile default) |
| 1.0 | Normal speed |
Normal speech <ktt speed=2.0> fast bit <ktt speed=> back to default.
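One way to interpret the raw `speed` value is sketched below: an empty value restores the profile default, and a numeric value is limited to the documented range. `ResolveSpeed` is a hypothetical helper. Clamping out-of-range input and falling back on non-numeric input are assumptions, since the source gives the range but not the behaviour outside it.

```cpp
#include <algorithm>
#include <cstdlib>
#include <string>

// Resolves the raw text of a `speed=` attribute to a speed multiplier.
float ResolveSpeed(const std::string& rawValue, float profileDefault)
{
    // `speed=` with no value restores the voice profile default.
    if (rawValue.empty())
    {
        return profileDefault;
    }

    char* end = nullptr;
    const float parsed = std::strtof(rawValue.c_str(), &end);
    if (end == rawValue.c_str())
    {
        // Assumption: unparseable input falls back to the profile default.
        return profileDefault;
    }

    // Assumption: values outside the documented 0.1 to 5.0 range are clamped.
    return std::clamp(parsed, 0.1f, 5.0f);
}
```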
decl=X / declination=X
Pitch declination — how much pitch falls over the course of the utterance. Both decl and declination are accepted.
| |
|---|
| Range | 0.0 – 1.0 |
| 0.0 | Steady pitch (no fall) |
| 0.8 | Strong downward drift |
Rising <ktt decl=0.0> steady <ktt decl=0.8> falling voice.
waveform=X
Change the glottal waveform used by the synthesizer, setting the overall character of the voice.
| Value | Character |
|---|
| saw | Default, neutral voice |
| triangle | Softer, smoother |
| sin / sine | Pure tone, robotic |
| square | Harsh, mechanical |
| pulse | Raspy, textured |
| noise | Whispered, breathy |
| warble | Wobbly, character voice |
<ktt waveform="noise"> whispered section <ktt waveform="saw"> normal voice.
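The value names in the table above map naturally onto the glottal waveform enumeration. The sketch below shows one such mapping, including the `sin` / `sine` alias. The `Waveform` enum and `WaveformFromTag` are hypothetical names, and the sketch assumes the value has already been lower-cased. The source states that attribute names are case-insensitive but says nothing about values.

```cpp
#include <optional>
#include <string>
#include <unordered_map>

// Hypothetical mirror of the glottal waveform enumeration values.
enum class Waveform { Saw, Triangle, Sin, Square, Pulse, Noise, Warble };

// Maps a lower-case KTT waveform value to an enum value.
// Returns std::nullopt for names that are not in the table.
std::optional<Waveform> WaveformFromTag(const std::string& value)
{
    static const std::unordered_map<std::string, Waveform> kNames = {
        {"saw", Waveform::Saw},
        {"triangle", Waveform::Triangle},
        {"sin", Waveform::Sin},
        {"sine", Waveform::Sin}, // alias accepted alongside "sin"
        {"square", Waveform::Square},
        {"pulse", Waveform::Pulse},
        {"noise", Waveform::Noise},
        {"warble", Waveform::Warble},
    };

    const auto it = kNames.find(value);
    if (it == kNames.end())
    {
        return std::nullopt;
    }
    return it->second;
}
```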
vowel=X
First formant (F1) frequency multiplier. Shifts the quality of synthesised vowel sounds.
| |
|---|
| 1.0 | Normal |
| > 1.0 | More open vowel quality |
| < 1.0 | More closed vowel quality |
<ktt vowel=1.4> different vowel colour here.
accent=X
Second formant (F2) frequency multiplier. Shifts accent or dialect colouration.
| |
|---|
| 1.0 | Normal |
| < 1.0 | Shifted accent colouring |
<ktt accent=0.8> shifted accent here.
pitch=X
F0 pitch variance amount. Controls how much pitch varies during synthesis.
| |
|---|
| 1.0 | Normal variance |
| > 1.0 | More expressive intonation |
| < 1.0 | Flatter, more monotone |
<ktt pitch=2.0> very expressive speech <ktt pitch=0.1> flat monotone.
pause=X
Insert a pause of X seconds at this position in the voice playback. Value is required — there is no default.
Hello.<ktt pause=0.8> How are you?
Combined Example
Dialogue string using typewriter text commands and KTT voice tags together:
[b]Warning:[/b] [color=#FF0000]do not[/color] proceed.[pause=1]
<ktt waveform="square" pitch=1.8>This is a mechanical override.<ktt pause=0.5><ktt waveform="saw" pitch=1.0>
[speed=3]Resuming normal protocol.[/speed]
See Also
For conceptual overviews and usage guides, see The Basics: GS_Audio.
For component references:
- Audio Manager – Manager lifecycle that the voice system participates in
Get GS_Audio
GS_Audio — Explore this gem on the product page and add it to your project.
6 - Third Party Implementations
Integration guides for third-party audio systems with GS_Audio.
This section will contain integration guides for connecting third-party audio middleware and tools with the GS_Audio system.
For usage guides and setup examples, see The Basics: GS_Audio.