A Technique for Mixing RTS Audio in Unity


A bunch of Audio clips in a Unity asset library.

There are remarkably few resources around the internet for creating an audio mix in Unity. The ones that exist always seem to fall into one of two categories:

  1. Get started playing sounds in Unity with an Audio Source!
  2. Here’s how to integrate Wwise into Unity!

I needed something in the middle. Playing sounds using sources and listeners wasn’t going to cut it, and I didn’t have time to learn Wwise, as we were submitting our game to an expo in a month and we had zero sounds.

Since our game was an RTS, there were a few specific challenges I needed to solve:

  1. Many sounds are less important than others (a gun firing, vs “You are under attack!”). Unity does support priorities on audio sources, but a higher priority only makes sure that the sound gets played; we needed the sound to play louder relative to other sounds.
  2. There are a ton of sounds going off, all at once. This is especially a problem when a bunch of units begin firing at the same moment: the amplitudes add up into one very unpleasantly loud sound.
  3. The sounds are all over the map, so we need 3D audio. Since the camera is overhead, we can’t use the 3D position to attenuate; we need to use each unit’s 2D position relative to the camera view.

In desperation, I turned to Unity’s AudioMixer. Though most sources I’ve read around the internet say it’s not good, I managed to make it work.

A note of caution, first, that I am not an audio engineer. I’m not even particularly good at audio stuff. This is just what I found works for our project - there are no doubt better ways to do it. With that in mind, below is a high-level outline of my solution.

Audio mixer setup

To create an AudioMixer asset, right-click in the asset library, go to “Create -> Audio Mixer”, and double-click it to edit.

Below is the AudioMixer I’ve configured for the RTS, along with an explanation for each of the groups:

Audio mixer configuration. If you need accessibility help, contact me at chaosed0 at gmail.

  • Generic, low-priority sound effects, such as guns firing or workers constructing, go through UnitSfxMidPri.
  • Unit SFX that are more important, such as deaths and firing of high-damage rockets, go through UnitSfxHighPri.
  • High-priority alerts, such as “You are under attack!” or “You don’t have enough ore to build that” go through Alerts.
  • UI noises, like button clicks, go through UI.
  • GameSound is a master group containing all groups dealing with sound effects. You’ll see why we need this later.
  • Music goes through… well, the Music group.

To play sounds through one of these groups, we simply create Audio Sources with the Output field configured to the right AudioMixer group.

Audio Source with Output configured to the UnitSfxMidPri group.
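
The same routing can also be done from a script, which is handy for AudioSources created at runtime. This is a minimal sketch, not code from our project: the MixerGroupRouter class and its fields are hypothetical, and it assumes the group name is unique in the mixer, since FindMatchingGroups matches on a sub-path and can return several groups.

```csharp
using UnityEngine;
using UnityEngine.Audio;

// Hypothetical helper: routes the AudioSource on this GameObject
// to a named AudioMixer group when the object wakes up.
public class MixerGroupRouter : MonoBehaviour
{
    [SerializeField] private AudioMixer audioMixer;              // the mixer asset, assigned in the inspector
    [SerializeField] private string groupName = "UnitSfxMidPri"; // group name from the mixer above

    private void Awake()
    {
        AudioSource source = GetComponent<AudioSource>();

        // FindMatchingGroups returns every group whose path contains the given sub-path.
        AudioMixerGroup[] matches = audioMixer.FindMatchingGroups(groupName);
        if (source == null || matches.Length == 0)
        {
            Debug.LogWarning($"Could not route audio to group '{groupName}'.", this);
            return;
        }

        // Equivalent to setting the Output field in the AudioSource inspector.
        source.outputAudioMixerGroup = matches[0];
    }
}
```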

Problem 1: Hearing alerts over Unit SFX

I’ve defined an “alert” to be a voice line that notifies the player of something happening. Examples of this are “You’re under attack!”, or “You need more ore”.

When one of these alerts plays, we want to give it special importance in the audio mix. One way to do this is to lower the volume of all unit sound effects while an alert is playing, which can be achieved with the Duck Volume effect.

To apply effects to a given group, first click on it in the AudioMixer editor to open the group’s inspector. Then, click “Add Effect”. Here are the inspectors for my UnitSfx and Alerts groups:

UnitSfxGroup has Duck Volume. Alerts has Send attached to the Duck Volume.

On the left is the inspector for the UnitSfx group. Note that the UnitSfx group contains both UnitSfxMidPri and UnitSfxHighPri. I’ve added a Duck Volume effect to it.

On the right is the inspector for the Alerts group. The “Send” effect applied to it doesn’t change the sound of the Alerts group itself; it just routes a copy of the group’s signal, at the configured send level (in decibels), to another effect. In our case, we’re sending to the Duck Volume on the UnitSfx group.

Back on the UnitSfx Duck Volume, we’ve configured it such that:

  • Any input over -25 dB will cause the volume to go down (Threshold)
  • The volume will go down by a moderate amount (Ratio)
  • The volume will go down very quickly (Attack time) and, after the alert has gone below -25 dB, go back to normal somewhat quickly (Release time).

With this configured, any time we play a sound through the Alerts group, all unit SFX will become quieter, giving importance to those alerts.
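
To get a feel for how Threshold and Ratio interact, here is a rough sketch of the textbook compressor curve that ducking effects are generally based on. It is an assumption that Unity's Duck Volume behaves like this in the steady state; the sketch ignores attack, release, knee and make-up gain, and the DuckingSketch class and the ratio of 3 are made up for illustration.

```csharp
using System;

// Hypothetical illustration of a static ducking curve, not Unity's actual implementation.
public static class DuckingSketch
{
    // Returns the gain change in dB (zero or negative) applied to the ducked group
    // for a given side-chain level. ratio is a plain number, e.g. 3 means 3:1.
    public static float GainReductionDb(float sideChainDb, float thresholdDb, float ratio)
    {
        if (ratio < 1f) throw new ArgumentOutOfRangeException(nameof(ratio));

        float overshootDb = sideChainDb - thresholdDb;
        if (overshootDb <= 0f) return 0f;           // side chain below threshold: no ducking

        // Every dB over the threshold is reduced to 1/ratio dB.
        return -overshootDb * (1f - 1f / ratio);
    }
}
```

With the -25 dB threshold from above and an assumed ratio of 3, an alert arriving at -10 dB overshoots by 15 dB, so unit SFX would be turned down by about 10 dB; an alert at -40 dB would not duck anything.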

Problem 2: Playing sound effects for hundreds of units at once

UnitSfxMidPri has Duck Volume and Send attached to its own Duck Volume.

This is the inspector for the UnitSfxMidPri group. It too has a Duck Volume applied, but here’s the trick: the input to the Duck Volume is the group’s own output. This makes UnitSfxMidPri duck its own volume when it reaches a certain threshold.

That means we can play as many sounds as we want through UnitSfxMidPri without the mix getting out of hand; the combined volume of all those sounds always stays relatively quiet. Again, this is important when we have 100+ units on the battlefield, all wanting to play their own combat sounds, plus units dying and workers constructing and… you get the point.
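
A back-of-the-envelope sketch shows why this helps. It assumes the worst case where every source plays the same sound perfectly in phase, and models the self-ducking group as a simple static compressor. The SelfDuckSketch class, the -20 dB threshold and the 4:1 ratio are illustrative assumptions, not the values from our mixer.

```csharp
using System;

// Hypothetical illustration of stacked sounds going through a self-ducking group.
public static class SelfDuckSketch
{
    // Level of n identical, perfectly in-phase sources, each at perSourceDb.
    // Amplitudes add linearly, so the level rises by 20 * log10(n) dB (worst case).
    public static float SummedDb(int n, float perSourceDb)
    {
        if (n < 1) throw new ArgumentOutOfRangeException(nameof(n));
        return perSourceDb + 20f * (float)Math.Log10(n);
    }

    // Static compressor curve: below the threshold the level passes through unchanged,
    // above it every extra dB of input yields only 1/ratio dB of output.
    public static float CompressedDb(float inputDb, float thresholdDb, float ratio)
    {
        if (inputDb <= thresholdDb) return inputDb;
        return thresholdDb + (inputDb - thresholdDb) / ratio;
    }
}
```

Under these assumptions, 100 gunshots at -30 dB each sum to +10 dB, which would clip badly. Passed through the curve with a -20 dB threshold and a 4:1 ratio, the group comes out at -12.5 dB instead.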

The configuration for UnitSfxHighPri is nearly the same as the one for UnitSfxMidPri, with a few small tweaks.

You might be wondering: why not just scale the volume of each unit’s AudioSource depending on how many are playing? I tried this, and ran into a couple of issues:

  • It doesn’t take into account the relative volume of each sound.
  • We’d need to account for spatial audio.

It’s just darn complicated to do through script, and this way is much more elegant.

Problem 3: 3D audio with an overhead camera

This problem is unrelated to the audio mixer, and is purely a scripting challenge. The strategy I took was to:

  1. Remove the AudioListener component from the Camera, and instead put it on an empty GameObject located at (0,0,0)
  2. Create an AudioBus which handles playing audio and manually positions AudioSources relative to the AudioListener

When some unit wants to play a sound, it calls the Play method on the AudioBus:

public long Play(GameObject emitter, AudioClip clip, bool loop = false)

When this method is called, we store a new (pooled) AudioSource and the GameObject it is following. Then, in its Update method, the AudioBus updates the AudioSource’s position relative to the camera:

void UpdateSoundPosition(GameObject emitter, AudioSource source)
{
    // This variable configures the "size" of the camera in world units, and is used to
    // configure the 3D spatial falloff curve within the AudioSources
    float halfWorldDistanceWithinCamera = 10.0f;

    // Use one camera for both the projection and the pixel dimensions so they always agree.
    // Dividing by 2f (not 2) avoids integer truncation on odd pixel sizes.
    Camera cam = rtsAudioListener.camera;
    float halfWidth = cam.pixelWidth / 2f;
    float halfHeight = cam.pixelHeight / 2f;

    // First, get the position of the emitter in screen-space
    Vector3 worldPosition = emitter.transform.position;
    Vector3 screenPoint = cam.WorldToScreenPoint(worldPosition);
    Vector3 sourcePosition = Vector3.zero;

    // Get the normalized distance from the center of the camera; 0 is the center, -1 or 1 is the edge, <-1 or >1 is off-screen
    float xLerp = (screenPoint.x - halfWidth) / halfWidth;
    float yLerp = (screenPoint.y - halfHeight) / halfHeight;

    // Get the "fake" world-space coordinate of the source and position it
    sourcePosition.x = halfWorldDistanceWithinCamera * xLerp;
    sourcePosition.z = halfWorldDistanceWithinCamera * yLerp;
    source.transform.position = sourcePosition;
}

The only tricky thing here is the halfWorldDistanceWithinCamera variable. Since we’ve set it to 10, a sound played by a unit at the right edge of the camera (and vertically centered) has its AudioSource positioned at (10,0,0). If the unit is half a screen width beyond the right edge, the AudioSource is positioned at (20,0,0), and so on. Because we know which distance each screen position maps to, we can configure the falloff curve of any given AudioSource in terms of screen distance:

AudioSource with Output configured to UnitSfxMidPri and a custom falloff curve.

The way we’ve configured the UnitSfxMidPri AudioSource falloff curve, a unit at the edge of the camera will still be near full volume, and any unit positioned more than a full screen away from the camera edge will not be heard at all.
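
I set that curve up in the inspector, but the same idea can be sketched in code. The RtsFalloff helper below is hypothetical, and the 0.9 volume at the screen edge is an illustrative guess at "near full volume". It also relies on my understanding that the key times of a custom rolloff curve are fractions of maxDistance (0 to 1) rather than world units, so treat that as an assumption to verify.

```csharp
using UnityEngine;

// Hypothetical helper that applies a screen-based falloff to an AudioSource.
public static class RtsFalloff
{
    // halfWorldDistance must match halfWorldDistanceWithinCamera in the AudioBus.
    public static void Apply(AudioSource source, float halfWorldDistance = 10f)
    {
        float edgeDistance = halfWorldDistance;         // unit at the screen edge
        float silentDistance = 3f * halfWorldDistance;  // one full screen beyond the edge

        source.spatialBlend = 1f;                       // fully 3D, so distance matters
        source.rolloffMode = AudioRolloffMode.Custom;
        source.maxDistance = silentDistance;

        // Assumed: key times are normalized to maxDistance (0 = listener, 1 = maxDistance).
        AnimationCurve curve = new AnimationCurve(
            new Keyframe(0f, 1f),                              // center of the screen: full volume
            new Keyframe(edgeDistance / silentDistance, 0.9f), // screen edge: near full volume
            new Keyframe(1f, 0f));                             // a full screen past the edge: silent

        source.SetCustomCurve(AudioSourceCurveType.CustomRolloff, curve);
    }
}
```

The factor of 3 comes from the projection: the screen edge sits at one half-screen from the center, and a full screen beyond it adds two more.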

The advantage of projecting units into screen-space (vs. projecting the camera into world-space) is that units within camera bounds are always heard, no matter the aspect ratio. Units on screen are much more important (game-design-wise) than units off screen, so the player should hear those first and foremost.
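
To show how Play, the pool and the per-frame repositioning could fit together, here is a stripped-down sketch of an AudioBus. Only the Play signature comes from our project; the rtsCamera field, the Stop method, the pooling details and the decision to stop a sound when its emitter is destroyed are all assumptions made for this example. It also leaves out assigning each source to a mixer group.

```csharp
using System.Collections.Generic;
using UnityEngine;

// Hypothetical sketch of an AudioBus; the AudioListener is assumed to sit at (0,0,0).
public class AudioBus : MonoBehaviour
{
    [SerializeField] private Camera rtsCamera;   // assumed: the overhead game camera
    [SerializeField] private float halfWorldDistanceWithinCamera = 10f;

    private struct ActiveSound
    {
        public GameObject Emitter;
        public AudioSource Source;
    }

    private readonly Dictionary<long, ActiveSound> active = new Dictionary<long, ActiveSound>();
    private readonly Stack<AudioSource> pool = new Stack<AudioSource>();
    private readonly List<long> finished = new List<long>();
    private long nextId = 1;

    // Starts a clip that follows the emitter and returns a handle for stopping it later.
    public long Play(GameObject emitter, AudioClip clip, bool loop = false)
    {
        AudioSource source = pool.Count > 0 ? pool.Pop() : CreateSource();
        source.gameObject.SetActive(true);
        source.clip = clip;
        source.loop = loop;
        UpdateSoundPosition(emitter, source);    // position it before the first sample plays
        source.Play();

        long id = nextId++;
        active[id] = new ActiveSound { Emitter = emitter, Source = source };
        return id;
    }

    // Stops a sound (if it is still active) and returns its AudioSource to the pool.
    public void Stop(long id)
    {
        if (!active.TryGetValue(id, out ActiveSound sound)) return;
        active.Remove(id);
        sound.Source.Stop();
        sound.Source.clip = null;
        sound.Source.gameObject.SetActive(false);
        pool.Push(sound.Source);
    }

    private void Update()
    {
        // Collect finished sounds first; the dictionary can't be modified while iterating.
        finished.Clear();
        foreach (KeyValuePair<long, ActiveSound> pair in active)
        {
            ActiveSound sound = pair.Value;
            // Note: isPlaying is also false while audio is paused, which this sketch ignores.
            if (sound.Emitter == null || !sound.Source.isPlaying)
                finished.Add(pair.Key);
            else
                UpdateSoundPosition(sound.Emitter, sound.Source);
        }
        foreach (long id in finished) Stop(id);
    }

    private AudioSource CreateSource()
    {
        GameObject child = new GameObject("PooledAudioSource");
        child.transform.SetParent(transform);
        AudioSource source = child.AddComponent<AudioSource>();
        source.playOnAwake = false;
        source.spatialBlend = 1f;                // fully 3D so the fake position is audible
        return source;
    }

    // Same projection as above: screen offset from center, scaled into fake world units.
    private void UpdateSoundPosition(GameObject emitter, AudioSource source)
    {
        Vector3 screenPoint = rtsCamera.WorldToScreenPoint(emitter.transform.position);
        float halfWidth = rtsCamera.pixelWidth * 0.5f;
        float halfHeight = rtsCamera.pixelHeight * 0.5f;
        float xLerp = (screenPoint.x - halfWidth) / halfWidth;
        float yLerp = (screenPoint.y - halfHeight) / halfHeight;
        source.transform.position = new Vector3(
            halfWorldDistanceWithinCamera * xLerp, 0f, halfWorldDistanceWithinCamera * yLerp);
    }
}
```

A unit would then call something like audioBus.Play(gameObject, fireClip) and keep the returned handle only if it needs to stop a looping sound early.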

Bonus problem: Audio options menu

In any game, RTS or not, you’ll want an options menu where the player can configure the music and SFX volume to their liking. With the way we’ve configured our AudioMixer, this is easily achieved. For music, all we need to do is expose the Volume parameter of the built-in Attenuation effect (get to the shown menu by right-clicking on Volume):

Menu open over the Volume property on the Attenuation effect.

You can rename the exposed parameter by using the “Exposed Parameters” button at the top right of the AudioMixer:

Exposed parameters, found at the top-right of the AudioMixer inspector.

I’ve set the name to MusicVolume. After figuring out the name, we can set the volume through a script:

    /// The input volume should be between 0 and 1
    public void SetMusicVolume(float volume)
    {
        audioMixer.SetFloat("MusicVolume", toDb(volume));
        musicVolume = volume;
    }

    /// To be honest I pulled this off of StackOverflow, any audio engineers want
    /// to weigh in on if this is correct or not?
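    float toDb(float volume)
    {
        // 20 * log10(0.0001) is exactly -80 dB, the bottom of the mixer's volume range,
        // so clamp anything at or below that instead of returning a lower value.
        if (volume <= 0.0001f)
        {
            return -80f;
        }
        else
        {
            return 20f * Mathf.Log10(volume);
        }
    }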

We can do the same with SFX by exposing the attenuation of the GameSound group. Since it holds all of UnitSfx, Alerts, and UI, it will attenuate all of them, and our effects in each of the sub-groups will keep working.
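
Wired up to UI sliders, the whole options menu could look roughly like the sketch below. The AudioOptionsMenu class, the slider fields and the "SfxVolume" parameter name are assumptions for illustration ("MusicVolume" is the name exposed above); the sliders are assumed to range from 0 to 1.

```csharp
using UnityEngine;
using UnityEngine.Audio;
using UnityEngine.UI;

// Hypothetical options menu that drives two exposed AudioMixer parameters.
public class AudioOptionsMenu : MonoBehaviour
{
    [SerializeField] private AudioMixer audioMixer;
    [SerializeField] private Slider musicSlider;   // assumed range: 0 to 1
    [SerializeField] private Slider sfxSlider;     // assumed range: 0 to 1

    private void OnEnable()
    {
        musicSlider.onValueChanged.AddListener(SetMusicVolume);
        sfxSlider.onValueChanged.AddListener(SetSfxVolume);
    }

    private void OnDisable()
    {
        musicSlider.onValueChanged.RemoveListener(SetMusicVolume);
        sfxSlider.onValueChanged.RemoveListener(SetSfxVolume);
    }

    // "MusicVolume" is the exposed Volume parameter of the Music group.
    public void SetMusicVolume(float volume)
    {
        audioMixer.SetFloat("MusicVolume", ToDb(volume));
    }

    // "SfxVolume" is an assumed name for the exposed Volume parameter of the GameSound group.
    public void SetSfxVolume(float volume)
    {
        audioMixer.SetFloat("SfxVolume", ToDb(volume));
    }

    // Linear 0..1 volume to decibels: 1.0 gives 0 dB, 0.5 about -6 dB, 0.1 gives -20 dB.
    private static float ToDb(float volume)
    {
        return volume <= 0.0001f ? -80f : 20f * Mathf.Log10(volume);
    }
}
```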

Conclusion

This AudioMixer setup is enough to create a quite functional RTS audio mix. Hundreds of sounds become manageable, we can prioritize unit sound effects, and voice alerts cut through the noise.

There’s some more work to be done here. The next step for me is to learn Wwise; I’ve heard that it gives you a lot of these things out-of-the-box, in an audio-designer-friendly package. In particular, there’s a lot of scripting in the AudioBus (which I haven’t shown here in full) that I think could be standardized by an audio middleware.

If you want to check out the final result, here’s the website for the RTS Gridpulse Legions, where you can find a pre-alpha demo build of the game.