GS_Audio

Audio management, event-based sound playback, multi-layer music scoring, mixing buses with effects, and Klatt formant voice synthesis with 3D spatial audio.

GS_Audio provides a complete audio solution for GS_Play projects. It includes an event-based sound playback system, multi-layer music scoring, named mixing buses with configurable effects chains, and a built-in Klatt formant voice synthesizer with 3D spatial audio. All features integrate with the GS_Play manager lifecycle and respond to standby mode automatically.

For usage guides and setup examples, see The Basics: GS_Audio.

Audio Management

The Audio Manager singleton initializes the MiniAudio engine, manages mixing buses, loads audio event libraries, and coordinates score track playback. It extends GS_ManagerComponent and participates in the standard two-stage initialization.

| Component | Purpose |
|---|---|
| Audio Manager | Master audio controller – engine lifecycle, bus routing, event library loading, score management. |

Audio Manager API


Audio Events

Event-based sound playback. Events define clip pools with selection rules, spatialization mode, concurrent limits, and repeat-hold behavior. Events are grouped into library assets.

| Component / Asset | Purpose |
|---|---|
| GS_AudioEvent | Single event definition – clip pool, selection type, 2D/3D mode. |
| AudioEventLibrary | Asset containing a collection of audio events. |
| GS_AudioEventComponent | Per-entity audio event playback with 3D positioning. |

Audio Events API


Mixing & Effects

Named audio buses with configurable effects chains and environmental influence. Includes 9 built-in audio filter types.

| Component / Type | Purpose |
|---|---|
| GS_MixingBus | Custom MiniAudio node for mixing and effects processing. |
| BusEffectsPair | Maps a bus name to an effects chain. |
| AudioBusInfluenceEffects | Environmental effects with priority stacking. |

Mixing & Effects API


Score Arrangement

Multi-layer music scoring with configurable time signatures, tempo, and layer control.

| Asset | Purpose |
|---|---|
| ScoreArrangementTrack | Multi-layer music asset – time signature, BPM, fade, layers. |
| ScoreLayer | Individual track within a score arrangement. |

Score Arrangement API


Klatt Voice Synthesis

Built-in text-to-speech using Klatt formant synthesis with 3D spatial audio.

| Component | Purpose |
|---|---|
| KlattVoiceSystemComponent | Shared SoLoud engine management, 3D listener tracking. |
| KlattVoiceComponent | Per-entity voice with spatial audio, phoneme mapping, segment queue. |

Klatt Voice API


Dependencies

  • GS_Core (required)
  • MiniAudio (third-party audio library)
  • SoLoud (embedded, for voice synthesis)

Installation

  1. Enable the GS_Audio gem in your project configuration.
  2. Ensure GS_Core and MiniAudio are also enabled.
  3. Create an Audio Manager prefab and add it to the Game Manager’s Startup Managers list.
  4. Create Audio Event Library assets for your sound effects.
  5. Add GS_AudioEventComponent to entities that need to play sounds.


Get GS_Audio

GS_Audio — Explore this gem on the product page and add it to your project.

1 - Audio Manager

Master audio controller – engine initialization, mixing bus routing, event library loading, and score playback coordination.

The Audio Manager is the master audio controller for every GS_Play project. It extends GS_ManagerComponent and participates in the standard two-stage initialization managed by the Game Manager. On startup it initializes the MiniAudio engine, creates the named mixing bus graph, loads audio event libraries, and coordinates score track playback.

Like all GS_Play managers, the Audio Manager responds to standby mode automatically – muting or pausing audio output when the game enters a blocking operation such as a stage change.

For usage guides and setup examples, see The Basics: GS_Audio.

Audio Manager component in the O3DE Inspector

How It Works

Engine Lifecycle

When the Audio Manager activates, it initializes a MiniAudio engine instance and builds the mixing bus graph from its configured bus list. During Stage 2 startup it loads all referenced Audio Event Library assets so that events can be triggered immediately once startup completes.

Mixing Bus Routing

All audio output flows through named mixing buses. Each bus is a GS_MixingBus node in the MiniAudio graph with its own volume level and optional effects chain. The Audio Manager owns the top-level routing and exposes volume control per bus through the request bus.

Score Playback

The Audio Manager coordinates playback of ScoreArrangementTrack assets – multi-layer musical scores with configurable tempo, time signature, and layer selection. Score tracks are loaded and managed through the request bus.


Inspector Properties

| Property | Type | Description |
|---|---|---|
| Mixing Buses | AZStd::vector<BusEffectsPair> | Named mixing buses with optional effects chains. Each entry maps a bus name to an effects configuration. |
| Startup Libraries | AZStd::vector<AZ::Data::Asset<AudioEventLibrary>> | Audio event library assets to load during startup. Events in these libraries are available immediately after initialization. |
| Default Master Volume | float | Initial master volume level (0.0 to 1.0). |

API Reference

GS_AudioManagerComponent

| Field | Value |
|---|---|
| TypeId | {F28721FD-B9FD-4C04-8CD1-6344BD8A3B78} |
| Extends | GS_Core::GS_ManagerComponent |
| Header | GS_Audio/GS_AudioManagerBus.h |

Request Bus: AudioManagerRequestBus

Commands sent to the Audio Manager. Singleton bus – Single address, single handler.

| Method | Parameters | Returns | Description |
|---|---|---|---|
| PlayAudioEvent | const AZStd::string& eventName | void | Plays the named audio event from the loaded event libraries. |
| PlayAudioEvent | const AZStd::string& eventName, const AZ::EntityId& entityId | void | Plays the named audio event positioned at the specified entity for 3D spatialization. |
| StopAudioEvent | const AZStd::string& eventName | void | Stops playback of the named audio event. |
| StopAllAudioEvents | – | void | Stops all currently playing audio events. |
| SetMixerVolume | const AZStd::string& busName, float volume | void | Sets the volume of a named mixing bus (0.0 to 1.0). |
| GetMixerVolume | const AZStd::string& busName | float | Returns the current volume of a named mixing bus. |
| SetMasterVolume | float volume | void | Sets the master output volume (0.0 to 1.0). |
| GetMasterVolume | – | float | Returns the current master output volume. |
| LoadEventLibrary | const AZ::Data::Asset<AudioEventLibrary>& library | void | Loads an audio event library at runtime, making its events available for playback. |
| UnloadEventLibrary | const AZ::Data::Asset<AudioEventLibrary>& library | void | Unloads a previously loaded audio event library. |
| PlayScoreTrack | const AZ::Data::Asset<ScoreArrangementTrack>& track | void | Begins playback of a score arrangement track. |
| StopScoreTrack | – | void | Stops the currently playing score arrangement track. |

Notification Bus: AudioManagerNotificationBus

Events broadcast by the Audio Manager. Multiple handler bus – any number of components can subscribe.

| Event | Parameters | Description |
|---|---|---|
| OnAudioEventStarted | const AZStd::string& eventName | Fired when an audio event begins playback. |
| OnAudioEventStopped | const AZStd::string& eventName | Fired when an audio event stops playback. |
| OnScoreTrackStarted | – | Fired when a score arrangement track begins playback. |
| OnScoreTrackStopped | – | Fired when a score arrangement track stops playback. |
| OnMixerVolumeChanged | const AZStd::string& busName, float volume | Fired when a mixing bus volume changes. |


2 - Audio Events

Event-based sound playback – audio event definitions, clip pool selection, spatialization, and event library assets.

Audio Events are the primary mechanism for playing sounds in GS_Play. A GS_AudioEvent defines a single sound event with a pool of audio clips, selection rules (random or sequential), 2D/3D spatialization mode, concurrent playback limits, and repeat-hold behavior. Events are grouped into AudioEventLibrary assets that the Audio Manager loads at startup or on demand.

When an event is triggered, the system selects a clip from the pool according to the configured selection type, checks concurrent limits, and routes the output through the appropriate mixing bus.
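The clip selection step described above can be sketched in standalone C++. This is a minimal illustration, not GS_Audio's implementation: `ClipPool` and `SelectClip` are hypothetical names, and the enum is only a stand-in for the documented PoolSelectionType.

```cpp
#include <cstddef>
#include <random>

// Hypothetical stand-in for the documented PoolSelectionType enum.
enum class PoolSelectionType { Random, Increment };

// Minimal model of a clip pool; names are illustrative, not GS_Audio's.
struct ClipPool
{
    std::size_t clipCount = 0;                               // number of clips in the pool
    PoolSelectionType selection = PoolSelectionType::Random; // how clips are chosen
    std::size_t nextIndex = 0;                               // cursor used by Increment
};

// Returns the index of the clip to play for this trigger.
// Precondition: pool.clipCount > 0.
inline std::size_t SelectClip(ClipPool& pool, std::mt19937& rng)
{
    if (pool.selection == PoolSelectionType::Increment)
    {
        const std::size_t chosen = pool.nextIndex;
        pool.nextIndex = (pool.nextIndex + 1) % pool.clipCount; // wrap around at the end
        return chosen;
    }
    // Random: every clip in the pool is equally likely.
    std::uniform_int_distribution<std::size_t> pick(0, pool.clipCount - 1);
    return pick(rng);
}
```

With a three-clip pool set to Increment, successive triggers yield indices 0, 1, 2 and then wrap back to 0.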

For usage guides and setup examples, see The Basics: GS_Audio.

Audio Event Library asset in the O3DE Asset Editor

Data Model

GS_AudioEvent

A single audio event definition containing all playback configuration.

| Field | Value |
|---|---|
| TypeId | {2A6E337B-2B9A-4CB2-8760-BF3A12C50CA0} |

| Field | Type | Description |
|---|---|---|
| Event Name | AZStd::string | Unique identifier for this event within its library. Used by PlayAudioEvent calls. |
| Audio Clips | AZStd::vector<AudioClipAsset> | Pool of audio clip assets available for this event. |
| Pool Selection Type | PoolSelectionType | How clips are chosen from the pool: Random or Increment (sequential). |
| Is 3D | bool | When true, audio is spatialized in 3D space relative to the emitting entity. When false, audio plays as 2D (non-positional). |
| Max Concurrent | int | Maximum number of simultaneous instances of this event. Additional triggers are ignored until a slot opens. 0 means unlimited. |
| Repeat Hold Time | float | Minimum time in seconds before this event can be retriggered. Prevents rapid-fire repetition of the same sound. |
| Mixing Bus | AZStd::string | Name of the mixing bus to route this event’s output through. |
| Volume | float | Base volume level for this event (0.0 to 1.0). |
| Pitch Variance | float | Random pitch variation range applied each time the event plays. 0.0 means no variation. |
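
How Max Concurrent and Repeat Hold Time might gate a trigger can be sketched as follows. This is an assumption-laden model: `EventGate`, `TryTrigger`, and `OnInstanceFinished` are hypothetical names, and the order of the two checks is a guess rather than documented behavior.

```cpp
// Hypothetical runtime state for one event; the first two fields mirror the table.
struct EventGate
{
    int maxConcurrent = 0;           // 0 means unlimited
    float repeatHoldTime = 0.0f;     // minimum seconds between triggers
    int activeInstances = 0;         // instances currently playing
    float lastTriggerTime = -1.0e9f; // far in the past so the first trigger passes
};

// Returns true and records the trigger if the event may play at time `now` (seconds).
inline bool TryTrigger(EventGate& gate, float now)
{
    if (now - gate.lastTriggerTime < gate.repeatHoldTime)
    {
        return false; // still inside the repeat-hold window
    }
    if (gate.maxConcurrent > 0 && gate.activeInstances >= gate.maxConcurrent)
    {
        return false; // no free slot; the trigger is ignored
    }
    ++gate.activeInstances;
    gate.lastTriggerTime = now;
    return true;
}

// Call when a playing instance ends so a slot opens again.
inline void OnInstanceFinished(EventGate& gate)
{
    if (gate.activeInstances > 0)
    {
        --gate.activeInstances;
    }
}
```

With maxConcurrent = 2 and repeatHoldTime = 0.5, a trigger at 0.25 s after the first is rejected by the hold window, and a third overlapping trigger is rejected until one instance finishes.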

AudioEventLibrary

An asset containing a collection of GS_AudioEvent definitions. Libraries are loaded by the Audio Manager at startup or at runtime via LoadEventLibrary.

| Field | Value |
|---|---|
| TypeId | {04218A1E-4399-4A7F-9649-ED468B5EF76B} |
| Extends | AZ::Data::AssetData |
| Reflection | Requires GS_AssetReflectionIncludes.h (see Serialization Helpers) |

| Field | Type | Description |
|---|---|---|
| Events | AZStd::vector<GS_AudioEvent> | The collection of audio events defined in this library. |

PoolSelectionType (Enum)

Determines how audio clips are selected from an event’s clip pool.

| Field | Value |
|---|---|
| TypeId | {AF10C5C8-E54E-41DA-917A-6DF12CA89CE3} |

| Value | Description |
|---|---|
| Random | A random clip is chosen from the pool each time the event plays. |
| Increment | Clips are played sequentially, advancing to the next clip in the pool on each trigger. Wraps around at the end. |

GS_AudioEventComponent

Per-entity component that provides audio event playback with optional 3D positioning. Attach this component to any entity that needs to emit sounds.

| Field | Type | Description |
|---|---|---|
| Audio Events | AZStd::vector<AZStd::string> | List of event names this component can play. Events must exist in a loaded library. |
| Auto Play | bool | When true, the first event in the list plays automatically on activation. |

See Also

For component references:

  • Audio Manager – Engine initialization and event library loading

3 - Mixing & Effects

Named audio mixing buses with configurable effects chains, environmental influence, and 9 built-in audio filter types.

GS_Audio provides a named mixing bus system built on custom MiniAudio nodes. Each GS_MixingBus is a node in the audio graph with its own volume level and an optional chain of audio filters. Buses are configured in the Audio Manager Inspector and can be controlled at runtime through the mixing request bus.

The effects system includes 9 built-in filter types covering frequency shaping, equalization, delay, and reverb. Environmental influence effects allow game world volumes (rooms, weather zones) to push effects onto buses with priority-based stacking.

For usage guides and setup examples, see The Basics: GS_Audio.


GS_MixingBus

Custom MiniAudio node for mixing and effects processing.

| Field | Value |
|---|---|
| TypeId | {26E5BA8D-33E0-42E4-BBC0-6A3B2C46F52E} |

API Reference

Request Bus: GS_MixingRequestBus

Mixer control commands. Singleton bus – Single address, single handler.

| Method | Parameters | Returns | Description |
|---|---|---|---|
| SetBusVolume | const AZStd::string& busName, float volume | void | Sets the volume of a named mixing bus (0.0 to 1.0). |
| GetBusVolume | const AZStd::string& busName | float | Returns the current volume of a named mixing bus. |
| MuteBus | const AZStd::string& busName, bool mute | void | Mutes or unmutes a named mixing bus. |
| IsBusMuted | const AZStd::string& busName | bool | Returns whether a named mixing bus is currently muted. |
| ApplyBusEffects | const AZStd::string& busName, const AudioBusEffects& effects | void | Applies an effects chain to a named mixing bus, replacing any existing effects. |
| ClearBusEffects | const AZStd::string& busName | void | Removes all effects from a named mixing bus. |
| PushInfluenceEffects | const AZStd::string& busName, const AudioBusInfluenceEffects& effects | void | Pushes environmental influence effects onto a bus with priority stacking. |
| PopInfluenceEffects | const AZStd::string& busName, int priority | void | Removes influence effects at the specified priority level from a bus. |

Audio Filters

All 9 built-in filter types. Each filter is configured as part of an effects chain applied to a mixing bus.

| Filter | Type | Description |
|---|---|---|
| GS_LowPassFilter | Frequency cutoff | Attenuates frequencies above the cutoff point. Used for muffling, distance simulation, and underwater effects. |
| GS_HighPassFilter | Frequency cutoff | Attenuates frequencies below the cutoff point. Used for thinning audio, radio/telephone effects. |
| GS_BandPassFilter | Band isolation | Passes only frequencies within a specified band, attenuating everything outside. Combines low-pass and high-pass behavior. |
| GS_NotchFilter | Band removal | Attenuates frequencies within a narrow band while passing everything outside. The inverse of band-pass. |
| GS_PeakingEQFilter | Band boost/cut | Boosts or cuts frequencies around a center frequency with configurable bandwidth. Used for tonal shaping. |
| GS_LowShelfFilter | Low frequency shelf | Boosts or cuts all frequencies below a threshold by a fixed amount. Used for bass adjustment. |
| GS_HighShelfFilter | High frequency shelf | Boosts or cuts all frequencies above a threshold by a fixed amount. Used for treble adjustment. |
| GS_DelayFilter | Echo/delay | Produces delayed repetitions of the input signal. Configurable delay time and feedback amount. |
| GS_ReverbFilter | Room reverb | Simulates room acoustics by adding dense reflections. Configurable room size and damping. |

Data Structures

BusEffectsPair

Maps a bus name to an effects chain configuration. Used in the Audio Manager’s Inspector to define per-bus effects at design time.

| Field | Value |
|---|---|
| TypeId | {AD9E26C9-C172-42BF-B38C-BB06FC704E36} |

| Field | Type | Description |
|---|---|---|
| Bus Name | AZStd::string | The name of the mixing bus this effects chain applies to. |
| Effects | AudioBusEffects | The effects chain configuration for this bus. |

AudioBusEffects

A collection of audio filter configurations that form an effects chain on a mixing bus.

| Field | Value |
|---|---|
| TypeId | {15EC6932-1F88-4EC0-9683-6D80AE982820} |

| Field | Type | Description |
|---|---|---|
| Filters | AZStd::vector<AudioFilter> | Ordered list of audio filters applied in sequence. |

AudioBusInfluenceEffects

Environmental effects with priority-based stacking. Game world volumes (rooms, weather zones, underwater areas) push influence effects onto mixing buses. Higher priority influences override lower ones.

| Field | Value |
|---|---|
| TypeId | {75D039EC-7EE2-4988-A2ED-86689449B575} |

| Field | Type | Description |
|---|---|---|
| Priority | int | Stacking priority. Higher values override lower values when multiple influences target the same bus. |
| Effects | AudioBusEffects | The effects chain to apply as an environmental influence. |
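
Priority stacking of influence effects on a single bus can be modeled with an ordered map. This is a sketch under stated assumptions: `EffectsChain` and `InfluenceStack` are hypothetical types, and the rule that pushing an existing priority replaces its entry is an assumption, not documented behavior.

```cpp
#include <map>
#include <string>
#include <utility>

// Placeholder for the documented AudioBusEffects; only a label is kept here.
struct EffectsChain
{
    std::string label;
};

// Per-bus stack of influences keyed by priority (illustrative type).
class InfluenceStack
{
public:
    // Assumption: pushing at an existing priority replaces that entry.
    void Push(int priority, EffectsChain effects)
    {
        m_entries[priority] = std::move(effects);
    }

    // Removes the influence at the given priority, if any.
    void Pop(int priority)
    {
        m_entries.erase(priority);
    }

    // Highest priority wins; returns nullptr when no influence is active.
    const EffectsChain* Active() const
    {
        return m_entries.empty() ? nullptr : &m_entries.rbegin()->second;
    }

private:
    std::map<int, EffectsChain> m_entries; // sorted by ascending priority
};
```

Pushing a room influence at priority 1 and an underwater influence at priority 5 makes the underwater chain active; popping priority 5 falls back to the room chain.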

See Also

For component references:

  • Audio Manager – Master controller that owns the mixing bus graph

For related resources:

  • Audio Events – Events route their output through mixing buses

4 - Score Arrangement

Multi-layer musical score system for dynamic music – tempo, time signatures, fade control, and layer selection.

The Score Arrangement system provides multi-layer dynamic music for GS_Play projects. A ScoreArrangementTrack asset defines a musical score with configurable tempo, time signature, fade behavior, and multiple layers that can be enabled or disabled at runtime. This allows game music to adapt to gameplay state – adding or removing instrumental layers, changing intensity, or crossfading between sections.

Score tracks are loaded and controlled through the Audio Manager request bus.

For usage guides and setup examples, see The Basics: GS_Audio.

Score Arrangement asset in the O3DE Asset Editor

Data Model

ScoreArrangementTrack

A multi-layer musical score asset. Each track defines the musical structure and contains one or more layers that play simultaneously.

| Field | Value |
|---|---|
| TypeId | {DBB48082-1834-4DFF-BAD2-6EA8D83F1AD0} |
| Extends | AZ::Data::AssetData |
| Reflection | Requires GS_AssetReflectionIncludes.h (see Serialization Helpers) |

| Field | Type | Description |
|---|---|---|
| Track Name | AZStd::string | Identifier for this score track. |
| Time Signature | TimeSignatures | The time signature for this score (e.g. 4/4, 3/4, 6/8). |
| BPM | float | Tempo in beats per minute. |
| Fade In Time | float | Duration in seconds for the score to fade in when playback begins. |
| Fade Out Time | float | Duration in seconds for the score to fade out when playback stops. |
| Loop | bool | Whether the score loops back to the beginning when it reaches the end. |
| Layers | AZStd::vector<ScoreLayer> | The musical layers that compose this score. |
| Active Layers | AZStd::vector<int> | Indices of layers that are active (audible) at the start of playback. |

ScoreLayer

A single musical layer within a score arrangement. Each layer represents one track of audio (e.g. drums, bass, melody) that can be independently enabled or disabled.

| Field | Value |
|---|---|
| TypeId | {C8B2669A-FAEA-4910-9218-6FE50D2E588E} |

| Field | Type | Description |
|---|---|---|
| Layer Name | AZStd::string | Identifier for this layer within the score. |
| Audio Asset | AZ::Data::Asset<AudioClipAsset> | The audio clip for this layer. |
| Volume | float | Base volume level for this layer (0.0 to 1.0). |
| Fade Time | float | Duration in seconds for this layer to fade in or out when toggled. |

TimeSignatures (Enum)

Supported musical time signatures for score arrangement tracks.

| Field | Value |
|---|---|
| TypeId | {6D6B5657-746C-4FCA-A0AC-671C0F064570} |

| Value | Beats per Measure | Beat Unit | Description |
|---|---|---|---|
| FourFour | 4 | Quarter note | 4/4 – Common time. The most widely used time signature. |
| FourTwo | 4 | Half note | 4/2 – Four half-note beats per measure. |
| TwelveEight | 12 | Eighth note | 12/8 – Compound quadruple meter. Four groups of three eighth notes. |
| TwoTwo | 2 | Half note | 2/2 – Cut time (alla breve). Two half-note beats per measure. |
| TwoFour | 2 | Quarter note | 2/4 – Two quarter-note beats per measure. March time. |
| SixEight | 6 | Eighth note | 6/8 – Compound duple meter. Two groups of three eighth notes. |
| ThreeFour | 3 | Quarter note | 3/4 – Waltz time. Three quarter-note beats per measure. |
| ThreeTwo | 3 | Half note | 3/2 – Three half-note beats per measure. |
| NineEight | 9 | Eighth note | 9/8 – Compound triple meter. Three groups of three eighth notes. |
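
The relationship between BPM and the time signature can be made concrete with a measure-length calculation. This sketch assumes BPM counts the beat unit listed for each signature (for example, eighth notes in 6/8), which is not confirmed here; the enum is a hypothetical mirror of the documented values and `SecondsPerMeasure` is an illustrative helper.

```cpp
// Hypothetical mirror of the documented TimeSignatures values.
enum class TimeSignatures
{
    FourFour, FourTwo, TwelveEight, TwoTwo, TwoFour,
    SixEight, ThreeFour, ThreeTwo, NineEight
};

// Beats per measure for each value, as listed in the TimeSignatures table.
inline int BeatsPerMeasure(TimeSignatures ts)
{
    switch (ts)
    {
    case TimeSignatures::FourFour:
    case TimeSignatures::FourTwo:     return 4;
    case TimeSignatures::TwelveEight: return 12;
    case TimeSignatures::TwoTwo:
    case TimeSignatures::TwoFour:     return 2;
    case TimeSignatures::SixEight:    return 6;
    case TimeSignatures::ThreeFour:
    case TimeSignatures::ThreeTwo:    return 3;
    case TimeSignatures::NineEight:   return 9;
    }
    return 4; // not reached for valid values; keeps compilers quiet
}

// Length of one measure in seconds.
// Assumes bpm > 0 and that BPM counts the signature's beat unit.
inline double SecondsPerMeasure(TimeSignatures ts, double bpm)
{
    return BeatsPerMeasure(ts) * 60.0 / bpm;
}
```

Under that assumption, a 4/4 score at 120 BPM has 2-second measures, which is the kind of boundary a layer change could be aligned to.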


5 - Klatt Voice Synthesis

Custom text-to-speech via Klatt formant synthesis with 3D spatial audio, phoneme mapping, and voice profiling.

The Klatt Voice Synthesis system provides custom text-to-speech for GS_Play projects using Klatt formant synthesis with full 3D spatial audio. It uses SoLoud internally for speech generation and MiniAudio for spatial positioning.

The system has two layers:

  • KlattVoiceSystemComponent – A singleton that manages the shared SoLoud engine instance and tracks the 3D audio listener position.
  • KlattVoiceComponent – A per-entity component that generates speech, queues segments, applies voice profiles, and emits spatialized audio from the entity’s position.

Voice characteristics are defined through KlattVoiceProfile assets containing frequency, speed, waveform, formant, and phoneme mapping configuration. Phoneme maps convert input text to ARPABET phonemes for the Klatt synthesizer, with support for custom pronunciation overrides.

For usage guides and setup examples, see The Basics: GS_Audio.

Klatt Voice Profile asset in the O3DE Asset Editor

Components

KlattVoiceSystemComponent

Singleton component that manages the shared SoLoud engine and 3D listener tracking.

| Field | Value |
|---|---|
| TypeId | {F4A5D6E7-8B9C-4D5E-A1F2-3B4C5D6E7F8A} |
| Extends | AZ::Component, AZ::TickBus::Handler |
| Bus | KlattVoiceSystemRequestBus (Single/Single) |

KlattVoiceComponent

Per-entity voice component with spatial audio, phoneme mapping, and segment queue.

| Field | Value |
|---|---|
| TypeId | {4A8B9C7D-6E5F-4D3C-2B1A-0F9E8D7C6B5A} |
| Extends | AZ::Component, AZ::TickBus::Handler |
| Request Bus | KlattVoiceRequestBus (Single/ById, entity-addressed) |
| Notification Bus | KlattVoiceNotificationBus (Multiple/Multiple) |

API Reference

Request Bus: KlattVoiceSystemRequestBus

System-level voice management. Singleton bus – Single address, single handler.

| Method | Parameters | Returns | Description |
|---|---|---|---|
| GetSoLoudEngine | – | SoLoud::Soloud* | Returns a pointer to the shared SoLoud engine instance. |
| SetListenerPosition | const AZ::Vector3& position | void | Updates the 3D audio listener position for spatial voice playback. |
| SetListenerOrientation | const AZ::Vector3& forward, const AZ::Vector3& up | void | Updates the 3D audio listener orientation. |
| GetListenerPosition | – | AZ::Vector3 | Returns the current listener position. |
| IsEngineReady | – | bool | Returns whether the SoLoud engine has been initialized and is ready. |

Request Bus: KlattVoiceRequestBus

Per-entity voice synthesis controls. Entity-addressed bus – Single handler per entity ID.

| Method | Parameters | Returns | Description |
|---|---|---|---|
| Speak | const AZStd::string& text | void | Converts text to speech and plays it. Uses the component’s configured voice profile. |
| SpeakWithParams | const AZStd::string& text, const KlattVoiceParams& params | void | Converts text to speech using the specified voice parameters instead of the profile defaults. |
| StopSpeaking | – | void | Immediately stops any speech in progress and clears the segment queue. |
| IsSpeaking | – | bool | Returns whether this entity’s voice is currently producing speech. |
| QueueSegment | const AZStd::string& text | void | Adds a speech segment to the queue. Queued segments play in order after the current segment finishes. |
| ClearQueue | – | void | Clears all queued speech segments without stopping current playback. |
| SetVoiceProfile | const AZ::Data::Asset<KlattVoiceProfile>& profile | void | Changes the voice profile used by this component. |
| GetVoiceProfile | – | AZ::Data::Asset<KlattVoiceProfile> | Returns the currently assigned voice profile asset. |
| SetSpatialConfig | const KlattSpatialConfig& config | void | Updates the 3D spatial audio configuration for this voice. |
| GetSpatialConfig | – | KlattSpatialConfig | Returns the current spatial audio configuration. |
| SetVolume | float volume | void | Sets the output volume for this voice (0.0 to 1.0). |
| GetVolume | – | float | Returns the current output volume. |
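
The difference between QueueSegment, ClearQueue, and StopSpeaking can be illustrated with a small queue model. `VoiceQueue` is a hypothetical class, not the component itself, and the rule that a segment queued while idle starts immediately is an assumption.

```cpp
#include <deque>
#include <string>

// Illustrative model of the per-entity segment queue.
class VoiceQueue
{
public:
    // Assumption: when idle, the queued segment starts right away.
    void QueueSegment(const std::string& text)
    {
        if (!m_speaking)
        {
            m_current = text;
            m_speaking = true;
        }
        else
        {
            m_pending.push_back(text);
        }
    }

    // Drops pending segments; the current segment keeps playing.
    void ClearQueue() { m_pending.clear(); }

    // Stops the current segment and drops everything pending.
    void StopSpeaking()
    {
        m_pending.clear();
        m_current.clear();
        m_speaking = false;
    }

    // Call when the current segment ends; returns true if another one started.
    bool AdvanceAfterSegment()
    {
        if (m_pending.empty())
        {
            m_current.clear();
            m_speaking = false;
            return false;
        }
        m_current = m_pending.front();
        m_pending.pop_front();
        return true;
    }

    bool IsSpeaking() const { return m_speaking; }
    const std::string& Current() const { return m_current; }

private:
    std::deque<std::string> m_pending; // segments waiting, in play order
    std::string m_current;             // segment being spoken, if any
    bool m_speaking = false;
};
```

In this model, ClearQueue leaves IsSpeaking true until the current segment ends, whereas StopSpeaking makes it false immediately.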

Notification Bus: KlattVoiceNotificationBus

Events broadcast by voice components. Multiple handler bus – any number of components can subscribe.

| Event | Parameters | Description |
|---|---|---|
| OnSpeechStarted | const AZ::EntityId& entityId | Fired when an entity begins speaking. |
| OnSpeechFinished | const AZ::EntityId& entityId | Fired when an entity finishes speaking (including all queued segments). |
| OnSegmentStarted | const AZ::EntityId& entityId, int segmentIndex | Fired when a new speech segment begins playing. |
| OnSegmentFinished | const AZ::EntityId& entityId, int segmentIndex | Fired when a speech segment finishes playing. |

Data Types

KlattVoiceParams

Core voice synthesis parameters controlling the Klatt formant synthesizer output.

| Field | Value |
|---|---|
| TypeId | {8A9C7F3B-4E2D-4C1A-9B5E-6D8F9A2C1B4E} |

| Field | Type | Description |
|---|---|---|
| Base Frequency | float | Fundamental frequency (F0) in Hz. Controls the base pitch of the voice. |
| Speed | float | Speech rate multiplier. 1.0 is normal speed. |
| Declination | float | Pitch declination rate. Controls how pitch drops over the course of an utterance. |
| Waveform | KlattWaveform | Glottal waveform type used by the synthesizer. |
| Formant Shift | float | Shifts all formant frequencies up or down. Positive values raise pitch character, negative values lower it. |
| Pitch Variance | float | Amount of random pitch variation applied during speech for natural-sounding intonation. |

KlattVoiceProfile

A voice profile asset combining synthesis parameters with a phoneme mapping.

| Field | Value |
|---|---|
| TypeId | {2CEB777E-DAA7-40B1-BFF4-0F772ADE86CF} |
| Reflection | Requires GS_AssetReflectionIncludes.h (see Serialization Helpers) |

| Field | Type | Description |
|---|---|---|
| Voice Params | KlattVoiceParams | The synthesis parameters for this voice profile. |
| Phoneme Map | AZ::Data::Asset<KlattPhonemeMap> | The phoneme mapping asset used for text-to-phoneme conversion. |

KlattVoicePreset

A preset configuration for quick voice setup.

| Field | Value |
|---|---|
| TypeId | {2B8D9E4F-7C6A-4D3B-8E9F-1A2B3C4D5E6F} |

| Field | Type | Description |
|---|---|---|
| Preset Name | AZStd::string | Display name for this preset. |
| Profile | KlattVoiceProfile | The voice profile configuration stored in this preset. |

KlattSpatialConfig

3D spatial audio configuration for voice positioning.

| Field | Value |
|---|---|
| TypeId | {7C9F8E2D-3A4B-5F6C-1E0D-9A8B7C6D5E4F} |

| Field | Type | Description |
|---|---|---|
| Enable 3D | bool | Whether this voice uses 3D spatialization. When false, audio plays as 2D. |
| Min Distance | float | Distance at which attenuation begins. Below this distance the voice plays at full volume. |
| Max Distance | float | Distance at which the voice reaches minimum volume. |
| Attenuation Model | int | The distance attenuation curve type (linear, inverse, exponential). |
| Doppler Factor | float | Intensity of the Doppler effect applied to this voice. 0.0 disables Doppler. |

KlattPhonemeMap

Phoneme mapping asset for text-to-ARPABET conversion with custom overrides.

| Field | Value |
|---|---|
| TypeId | {F3E9D7C1-2A4B-5E8F-9C3D-6A1B4E7F2D5C} |
| Reflection | Requires GS_AssetReflectionIncludes.h (see Serialization Helpers) |

| Field | Type | Description |
|---|---|---|
| Base Map | BasePhonemeMap | The base phoneme dictionary to use as the foundation for conversion. |
| Overrides | AZStd::vector<PhonemeOverride> | Custom pronunciation overrides for specific words or patterns. |

PhonemeOverride

A custom pronunciation rule that overrides the base phoneme map for a specific word or pattern.

| Field | Value |
|---|---|
| TypeId | {A2B5C8D1-4E7F-3A9C-6B2D-1F5E8A3C7D9B} |

| Field | Type | Description |
|---|---|---|
| Word | AZStd::string | The word or pattern to match. |
| Phonemes | AZStd::string | The ARPABET phoneme sequence to use for this word. |

Enumerations

KlattWaveform

Glottal waveform types available for the Klatt synthesizer.

| Field | Value |
|---|---|
| TypeId | {8ED1DABE-3347-44A5-B43A-C171D36AE780} |

| Value | Description |
|---|---|
| Saw | Sawtooth waveform. Bright, buzzy character. |
| Triangle | Triangle waveform. Softer than sawtooth, slightly hollow. |
| Sin | Sine waveform. Pure tone, smooth and clean. |
| Square | Square waveform. Hollow, reed-like character. |
| Pulse | Pulse waveform. Variable duty cycle for varied timbres. |
| Noise | Noise waveform. Breathy, whisper-like quality. |
| Warble | Warble waveform. Modulated tone with vibrato-like character. |

BasePhonemeMap

Available base phoneme dictionaries for text-to-ARPABET conversion.

| Field | Value |
|---|---|
| TypeId | {D8F2A3C5-1B4E-7A9F-6D2C-5E8A1B3F4C7D} |

| Value | Description |
|---|---|
| SoLoud_Default | The default phoneme mapping built into SoLoud. Covers standard English pronunciation. |
| CMU_Full | The full CMU Pronouncing Dictionary. Comprehensive English phoneme coverage with over 130,000 entries. |

KTT Voice Tags

KTT (Klatt Text Tags) are inline commands embedded in strings passed to KlattVoiceComponent::SpeakText. They are parsed by KlattCommandParser::Parse and stripped from the spoken text before synthesis begins — they are never heard.

Format: <ktt attr1=value1 attr2=value2>

Multiple attributes can be combined in a single tag. Attribute names are case-insensitive. String values may optionally be wrapped in quotes. An empty value (e.g. speed=) resets that parameter to the voice profile default.
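
The tag rules above (several attributes per tag, case-insensitive names, optional quotes, empty value as reset) can be sketched as a small standalone parser. This is not KlattCommandParser's implementation: `ParseKttTag` is a hypothetical function that handles a single tag and assumes quoted values do not contain `>`.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Parses one tag such as `<ktt speed=2.0 waveform="saw">` into name/value pairs.
// Names are lowercased; an empty value means "reset to the profile default".
// Returns false if the text is not a well-formed ktt tag (out may be partly filled).
inline bool ParseKttTag(const std::string& tag, std::map<std::string, std::string>& out)
{
    if (tag.size() < 5 || tag.compare(0, 4, "<ktt") != 0 || tag.back() != '>')
    {
        return false;
    }
    const std::size_t end = tag.size() - 1; // index of the closing '>'
    std::size_t i = 4;
    if (i < end && !std::isspace(static_cast<unsigned char>(tag[i])))
    {
        return false; // something like "<kttx ...>" is not a ktt tag
    }
    while (i < end)
    {
        while (i < end && std::isspace(static_cast<unsigned char>(tag[i]))) ++i;
        if (i >= end) break;

        std::string name; // attribute name, lowercased for case-insensitive lookup
        while (i < end && tag[i] != '=' && !std::isspace(static_cast<unsigned char>(tag[i])))
        {
            name += static_cast<char>(std::tolower(static_cast<unsigned char>(tag[i++])));
        }
        if (i >= end || tag[i] != '=' || name.empty())
        {
            return false; // every attribute needs a name and '='
        }
        ++i; // skip '='

        std::string value;
        if (i < end && tag[i] == '"')
        {
            ++i; // opening quote
            while (i < end && tag[i] != '"') value += tag[i++];
            if (i >= end) return false; // unterminated quote
            ++i; // closing quote
        }
        else
        {
            while (i < end && !std::isspace(static_cast<unsigned char>(tag[i]))) value += tag[i++];
        }
        out[name] = value;
    }
    return true;
}
```

Parsing `<ktt Speed=2.0 waveform="noise" pitch=>` with this sketch yields speed = 2.0, waveform = noise, and an empty pitch value, which a caller would treat as a reset.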


speed=X

Override the speech speed multiplier from this point forward.

| Item | Description |
|---|---|
| Range | 0.1 to 5.0 |
| Default reset | speed= (restores profile default) |
| 1.0 | Normal speed |

Normal speech <ktt speed=2.0> fast bit <ktt speed=> back to default.

decl=X / declination=X

Pitch declination — how much pitch falls over the course of the utterance. Both decl and declination are accepted.

| Item | Description |
|---|---|
| Range | 0.0 to 1.0 |
| 0.0 | Steady pitch (no fall) |
| 0.8 | Strong downward drift |

Rising <ktt decl=0.0> steady <ktt decl=0.8> falling voice.

waveform="TYPE"

Change the glottal waveform used by the synthesizer, setting the overall character of the voice.

| Value | Character |
|---|---|
| saw | Default, neutral voice |
| triangle | Softer, smoother |
| sin / sine | Pure tone, robotic |
| square | Harsh, mechanical |
| pulse | Raspy, textured |
| noise | Whispered, breathy |
| warble | Wobbly, character voice |

<ktt waveform="noise"> whispered section <ktt waveform="saw"> normal voice.

vowel=X

First formant (F1) frequency multiplier. Shifts the quality of synthesised vowel sounds.

| Value | Effect |
|---|---|
| 1.0 | Normal |
| > 1.0 | More open vowel quality |
| < 1.0 | More closed vowel quality |

<ktt vowel=1.4> different vowel colour here.

accent=X

Second formant (F2) frequency multiplier. Shifts accent or dialect colouration.

| Value | Effect |
|---|---|
| 1.0 | Normal |
| < 1.0 | Shifted accent colouring |

<ktt accent=0.8> shifted accent here.

pitch=X

F0 pitch variance amount. Controls how much pitch varies during synthesis.

| Value | Effect |
|---|---|
| 1.0 | Normal variance |
| > 1.0 | More expressive intonation |
| < 1.0 | Flatter, more monotone |

<ktt pitch=2.0> very expressive speech <ktt pitch=0.1> flat monotone.

pause=X

Insert a pause of X seconds at this position in the voice playback. Value is required — there is no default.

Hello.<ktt pause=0.8> How are you?

Combined Example

Dialogue string using typewriter text commands and KTT voice tags together:

[b]Warning:[/b] [color=#FF0000]do not[/color] proceed.[pause=1]
<ktt waveform="square" pitch=1.8>This is a mechanical override.<ktt pause=0.5><ktt waveform="saw" pitch=1.0>
[speed=3]Resuming normal protocol.[/speed]
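
Since KTT tags are stripped before synthesis, the text that is actually spoken can be previewed with a simple filter. This is a hypothetical helper (`StripKttTags`), not the parser's code: it removes only `<ktt ...>` tags, leaves typewriter commands such as [b] untouched, and assumes `>` never appears inside a tag's quoted values.

```cpp
#include <cstddef>
#include <string>

// Returns `text` with every `<ktt ...>` tag removed.
// An unterminated tag drops the rest of the string (a choice made for this sketch).
inline std::string StripKttTags(const std::string& text)
{
    std::string spoken;
    std::size_t pos = 0;
    while (pos < text.size())
    {
        const std::size_t open = text.find("<ktt", pos);
        if (open == std::string::npos)
        {
            spoken.append(text, pos, std::string::npos); // no more tags: keep the tail
            break;
        }
        spoken.append(text, pos, open - pos); // keep the text before the tag
        const std::size_t close = text.find('>', open);
        if (close == std::string::npos)
        {
            break; // unterminated tag
        }
        pos = close + 1; // continue after the tag
    }
    return spoken;
}
```

Applied to `Hello.<ktt pause=0.8> How are you?`, the sketch returns `Hello. How are you?`.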

See Also

For component references:

  • Audio Manager – Manager lifecycle that the voice system participates in

6 - Third Party Implementations

Integration guides for third-party audio systems with GS_Audio.

This section will contain integration guides for connecting third-party audio middleware and tools with the GS_Audio system.

For usage guides and setup examples, see The Basics: GS_Audio.

