Sorry for being a pedant, y'all, but I continually find myself reading
posts that show a lot of confusion about the difference between an "API"
and a "Rendering Engine", especially when it comes to 3D audio.
This isn't too surprising, since most options for game development
that we hear about have an API component _and_ a rendering engine
component - sometimes even both a software rendering engine and a
hardware rendering engine - all sharing the same name. A lot of people
don't realize that these aren't one inseparable thing.
Also, to a large degree, APIs are compatible with many different
competing rendering engines, and vice versa.
API: Application Programming Interface.
This is simply a set of commands, for example:
IDirectSound3DBuffer::SetPosition
...is the DS3D command that passes (x,y,z) coordinates for the
desired sound position to the rendering engine. (I haven't shown
the whole syntax here, just the name of the function as it would
appear in the code.)
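For the curious, here's roughly what a complete call looks like in C++.
(A sketch based on the DirectX documentation; pBuffer3D is assumed to
be an IDirectSound3DBuffer pointer you've already obtained.)

  // Assumes pBuffer3D is a valid IDirectSound3DBuffer*, obtained via
  // QueryInterface on a secondary buffer created with DSBCAPS_CTRL3D.
  HRESULT hr = pBuffer3D->SetPosition(
      10.0f,            // x: 10 units to the listener's right
      0.0f,             // y: level with the listener
      5.0f,             // z: 5 units in front of the listener
      DS3D_IMMEDIATE);  // apply now rather than deferring
  if (FAILED(hr))
  {
      // the rendering engine couldn't honor the request
  }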
These commands by themselves don't do anything to the sound. All this
command does, for example, is say "put the sound at these coordinates".
The commands have to be passed to something that actually does the work:
the rendering engine. The rendering engine must of course be capable of
supporting the specific function for anything to happen at all.
For those of you who know something about MIDI, it's also an API of
sorts. MIDI is a bunch of commands that say stuff like "turn on A below
middle C on channel 12 with a velocity of 104" or "select patch number 42
on channel 3". In this case the MIDI synthesizer is the rendering engine
that interprets the commands and turns them into action.
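In fact, those two commands boil down to just five bytes on the wire.
(Assuming the usual convention that middle C is MIDI note 60, so the A
below it is note 57; note that channels are numbered 1-16 by humans but
0-15 in the data, and "patch 42" is likewise 41 on the wire.)

  // The two MIDI messages above, as raw bytes:
  unsigned char noteOn[3]   = { 0x9B, 57, 104 };  // Note On, ch 12
                                                  // (0x90 + 0x0B),
                                                  // note 57, vel 104
  unsigned char patchSel[2] = { 0xC2, 41 };       // Program Change,
                                                  // ch 3, patch 42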
Rendering Engine
The 3D rendering engine actually takes the audio streams and applies some
sort of process to them to (hopefully) create an impression of 3D
placement in accordance with the commands passed by the API.
Not every rendering engine can support all the commands of every API, nor
does every API support all the commands of every rendering engine, but at
the risk of repeating myself: When it comes to 3D audio on the PC, there
is a huge amount of overlap, and a lot less exclusion than most people
realize.
Microsoft DirectSound3D
A game that uses "DS3D" is simply using a standard, open, free set of
commands supplied by Microsoft for passing information about desired
sound positions, etc. to a rendering engine, just like the same game
might use MIDI for passing commands to a synthesizer.
The rendering engine will use its own algorithms to actually place sound
to the best of its ability. It might use binaural synthesis for
headphones, 3D over two speakers, four-speaker panning or whatever. It
could be a QSound Q3D engine, or one from Creative Labs, CRL, Aureal
(A3D), Yamaha, or whoever else you put your money down for.
DS3D also has its own software rendering engine, which is used only to
emulate hardware if you don't have a 3D card or, at the developer's
option, to render additional sounds when hardware resources are all
committed. Unfortunately, at present this engine isn't terribly
effective and eats a lot of CPU for what it does. Therefore, wise game
programmers typically avoid letting the DS3D engine come into play -- no
pun intended.
They can mute sounds that exceed 3D hardware capabilities. They
can program a separate set of sound commands (using regular stereo
DirectSound) for non-accelerated systems, or they can use a development
kit that automatically replaces the DS3D rendering engine with a more
efficient and/or more effective software engine when there isn't hardware
acceleration available.
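The first two strategies hinge on the same detail: you can ask
DirectSound for a buffer that lives in 3D hardware or not at all, so
the DS3D software engine never gets involved. A rough sketch (error
handling trimmed; pDS is assumed to be an initialized IDirectSound
pointer and wfx a filled-in WAVEFORMATEX):

  // Request a 3D buffer in hardware, or fail outright.
  DSBUFFERDESC desc = {0};
  desc.dwSize        = sizeof(desc);
  desc.dwFlags       = DSBCAPS_CTRL3D | DSBCAPS_LOCHARDWARE;
  desc.dwBufferBytes = bufferBytes;  // size of your sound data
  desc.lpwfxFormat   = &wfx;

  LPDIRECTSOUNDBUFFER pBuffer = NULL;
  HRESULT hr = pDS->CreateSoundBuffer(&desc, &pBuffer, NULL);
  if (FAILED(hr))
  {
      // No hardware voice available: mute the sound, or fall back to
      // plain stereo DirectSound -- anything but the DS3D software
      // engine.
  }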
So when someone says something like "DS3D sucks" it's more than a bit
misleading. As an API, DS3D isn't the greatest, but it does support
all 3D hardware, which is a good thing.
Also: every single API for 3D audio on the PC uses DS3D commands to a
large extent for talking to hardware, so the DS3D API is being used a lot
more than you may realize!
When supporting hardware, the use of DS3D as an API (either explicitly or
through some other higher-level interface) doesn't have anything to do
with how "good" the 3D positioning effects are. By the same token, using
using MIDI as a command protocol does not define the effectiveness of a
particular synthesizer engine.
If a game is "DS3D-compatible" you know it's going to work with your
3D hardware, whatever it is, and when you play the game you will be
listening to the algorithms in your hardware -- i.e. what you paid for.
Chances are you've never heard the DS3D software engine, because
unfortunately at the moment it _does_ suck, and as stated above,
developers avoid ever letting you hear it if possible.
We're all counting on Microsoft to improve the DS3D software engine in
DirectX version 7, but by then everyone will probably have some kind of
3D card anyway. ;-)
Hardware support of DS3D
A 3D technology vendor would be foolish not to support the DS3D command
set on their hardware. Not doing so would create incompatibility with a
lot of software and accomplish nothing in the bargain.
Software support of DS3D
Nowadays, more and more developers are realizing that they'd also better
support the DS3D command set, because everybody's hardware does.
To do this, a game may actually be written directly to the DS3D API, or
it may use an API from one of several third-party vendors that is more
friendly or more powerful than DS3D, yet translates to DS3D commands
"under the hood" to support hardware in a universal way. Such an API may
also support extensions to DS3D, which might be open (like EAX) or
proprietary, like certain aspects of A3D.
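To make the "under the hood" idea concrete, here's a purely
hypothetical sketch of what such a third-party API might do internally.
(GameSound and PlaceInWorld are names I've invented for illustration;
I'm not quoting any real kit here.)

  // Hypothetical wrapper: a friendlier call that translates into a
  // standard DS3D command underneath.
  class GameSound
  {
  public:
      void PlaceInWorld(float x, float y, float z)
      {
          // ...whatever coordinate conversion, culling and resource
          // management the kit provides would happen here...
          m_pBuffer3D->SetPosition(x, y, z, DS3D_DEFERRED);
      }
  private:
      IDirectSound3DBuffer* m_pBuffer3D;  // standard DS3D interface
  };

The point: whatever the top layer looks like, the bottom layer speaks
DS3D, so any DS3D-capable hardware understands it.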
Are APIs mutually exclusive?
It's really important to understand that developers can combine the use
of multiple APIs. For example, someone might use DS3D, the EAX extensions,
and A3D 2.x extensions all in one title. Creative Labs will want to
trumpet the game as an "EAX" title, Aureal will call it an "A3D" title,
and QSound will list it as "Q3D compatible". ;-) Meanwhile, the basic 3D
functionality will work on ANY soundcard. The app will query the hardware
to see what features it supports, and use extensions if it can.
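Here, roughly, is what that query step looks like for a property-set
extension like EAX. (A sketch only, with error handling skipped;
DSPROPSETID_EAX_ReverbProperties and DSPROPERTY_EAX_ALL are names from
Creative's EAX 1.0 SDK as I recall them, and pBuffer3D is the same
DS3D buffer as in the earlier sketch.)

  // Ask the driver whether it understands the EAX property set.
  IKsPropertySet* pProps = NULL;
  pBuffer3D->QueryInterface(IID_IKsPropertySet, (void**)&pProps);

  ULONG support = 0;
  pProps->QuerySupport(DSPROPSETID_EAX_ReverbProperties,
                       DSPROPERTY_EAX_ALL, &support);

  if (support & (KSPROPERTY_SUPPORT_GET | KSPROPERTY_SUPPORT_SET))
  {
      // The card (or its driver) does EAX -- use the extensions.
  }
  // Either way, basic DS3D positioning still works.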
Bottom line: you have to dig past press releases, and even then, you
don't have a guarantee that the title is going to make really effective
use of ANY of these features!! Caveat freakin' emptor, folks!
Let's look more closely at some APIs and rendering engines (in
alphabetical order):
A3D 1.x
As an API, A3D 1.x is in fact a small collection of commands for
identifying A3D hardware and configuring its resource management
capabilities. In the early days (DirectX version 3) DS3D didn't have
hardware support yet, so the very first titles that came out using A3D
1.x also incorporated a clever "work-around" for this limitation. This is
all history now that DS3D supports hardware properly.
In A3D 1.x titles, sound positioning and related functions are actually
communicated to hardware using standard DS3D command syntax. In other
words, a standard, open API provided by the OS vendor has been "bundled"
with a few proprietary commands and called "A3D". Take away DS3D commands
from the A3D 1.x API and you are left with nothing particularly useful.
As a rendering engine, A3D is also the name of the technology Aureal
provides to render sounds in 3D space, for speakers or headphones,
either on actual hardware processors or in the card's driver software.
A3D 1.x "Emulation"
"Emulators" simply identify themselves to the application as being
compatible with the small set of A3D 1.x API commands primarily related
to resource management.
Then:
-- They replace management capabilities of the A3D hardware driver with
their own. (Needless to say, this has to be done right or it could
cause some mysterious results.)
-- They use the straight DS3D commands passed by the "A3D" app to control
their own rendering engine using their own algorithms.
So-called emulators, in essence, liberate the DS3D API calls for use by
the hardware. They do not "fake" or "emulate" A3D positional audio
technology any more than A3D "fakes" or "emulates" DS3D. This is a
common misconception. DS3D commands don't give a sh*t what rendering
engine you use!
If you prefer the sound of A3D rendering, then you must get yourself an
A3D card. If you prefer (or consider equally effective) the rendering
algorithms in your "emulator" then that's your choice and that's what you
hear.
A3D 2.x
As an API, A3D 2.x adds extended functionality to DS3D through
proprietary (non-public) functions.
According to recent statements from Aureal, if the end-user system
doesn't have Aureal hardware but does have a competing DS3D-compatible
card, the API will pass standard DS3D commands to drive the non-A3D
rendering engine, allowing it to use its native algorithms to place
sound.
As a hardware rendering engine, A3D 2.x supports standard DS3D API
commands as well as the proprietary extensions of the A3D 2.x API
(wavetracing). There is a strong indication at present that A3D 2.x
rendering engines will also support the EAX property set extensions.
A high-efficiency software rendering engine (A2D) is provided to
developers writing to the A3D 2.x API. This supports non-accelerated
hardware with less coding effort and CPU overhead than the DS3D software
engine can provide.
EAX
As an API, EAX is a "property set" extension to DS3D that provides
additional commands for controlling environmental audio effects. Unlike
the A3D 2.x extensions, it's an open standard that anyone can use.
As an engine, EAX is also the name Creative uses for their own
implementation of environmental audio effects in their hardware rendering
engines.
Any 3D sound technology vendor can create their own rendering engine (in
the sound card driver or in the hardware) that will answer "Yes!" when
the app asks "Do you do EAX?" and then accept and interpret the commands,
applying appropriate processing.
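Once support is confirmed (see the QuerySupport sketch earlier), using
the extension is just a Set call on that same interface. Again a sketch
only; EAX_ENVIRONMENT_HANGAR and DSPROPERTY_EAX_ENVIRONMENT are EAX 1.0
SDK names as I recall them:

  // Select a reverb environment preset. The rendering engine,
  // whoever made it, decides what "hangar" actually sounds like.
  DWORD environment = EAX_ENVIRONMENT_HANGAR;
  pProps->Set(DSPROPSETID_EAX_ReverbProperties,
              DSPROPERTY_EAX_ENVIRONMENT,
              NULL, 0,                // no instance data
              &environment, sizeof(environment));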
(BTW, for basic sound positioning, Creative at this time is primarily
emphasizing systems using 4-speaker panning. This engine is accessed via
the standard DS3D API. Again, this doesn't mean the DS3D rendering
algorithm is used, just the API!!)
QMDX
As an API, QSound's QMDX is a high-level set of commands that save a lot
of coding effort for developers. QMDX uses standard DS3D commands "under
the hood" to support any 3D hardware. The next version will also support
EAX commands "under the hood" to support EAX-capable hardware.
As a rendering engine, QMDX provides a high-efficiency stereo mixer in
software for non-accelerated systems. Like A2D, this results in less
coding effort and CPU overhead than the DS3D software engine can provide.
QMixer
As an API, QSound's QMixer is a high-level set of commands that save a
lot of coding effort for developers. QMixer uses standard DS3D commands
"under the hood" to support 3D hardware, and the next version will also
support EAX commands "under the hood" to support EAX-capable hardware.
As a rendering engine, QMixer provides full 3D rendering for speakers or
headphones via a high-efficiency 3D mixer in software for non-accelerated
systems. (This can also be used to render additional channels on
3D hardware with limited capabilities.)
QMixer is different from the other APIs and rendering engines discussed
so far in that it actually costs the developer a license fee to use.
Basically this is a charge for the 3D software engine, which ships with
every single copy of the game.
Q3D
Q3D is the name QSound has given to their positional 3D audio rendering
technology. There is no "Q3D" API; like the Creative, Yamaha, and CRL
(etc.) rendering engines, Q3D is compatible with any application that
uses DS3D positional commands. That is, the API may be DS3D, QMDX,
QMixer, A3D 2.0, or
any other third-party DS3D-compatible API.
Q3D implementations typically provide emulation capabilities for the A3D
1.x API. That is, they duplicate the resource management functions and
extract the DS3D commands to control positional audio rendering.
We're currently adding rendering support for the EAX property set as
well.
Q3D, like any 3D technology or signal processing algorithm, can be
implemented as a software rendering engine (e.g. running in the soundcard
driver) or a hardware rendering engine (e.g. in the upcoming
ThunderBird128 chip.)
Other Licensed APIs
It's worth noting that there are several third-party development kits
aside from QMixer that developers must pay to use, such as the Miles
system from RAD Game Tools, the Sound Tool Kit from DiamondWare, and the
Sound Operating System from Human Machine Interface, among others. To my
knowledge none of these provide built-in 3D rendering, but all support 3D
hardware via the DS3D command set, some offer on-the-fly Pro Logic
encoding, and all provide higher-level functionality and/or other
sound-related features such as MIDI libraries to justify their modest
cost.
Open Standards / Proprietary Standards
The beauty of an open standard like DirectX, EAX, or for that matter,
MIDI, Windows, etc. is that everyone can produce compatible hardware and
software, which increases the number of hardware and software products that
can work together. That doesn't mean that every feature of every software
or hardware item is necessarily supported, but it ensures basic
compatibility.
Competition keeps everyone on their toes, because the command set is
equally available for everyone to use.
One of the drawbacks of an open standard is that different
implementations will give variable results. On the other hand, a
proprietary standard means that you can be more confident of
predictable results if you have a Brand X software title and a Brand X
hardware card.
An open standard necessarily places some constraints on what features can
be supported, because it is either supplied by a single source (e.g.
Microsoft) or agreed upon by a committee (e.g. MIDI Manufacturers
Association, Interactive Audio SIG) and can't address the desires of
every single software or hardware provider.
One of the marks of a good standard is that it provides a mechanism for
different vendors to add additional functionality. This is what the
"Property Set" mechanism in DS is all about. (For you MIDI freaks out
there, property sets are like "System Exclusive" commands.)
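To ground that: a property set is ultimately just a GUID plus some
agreed-upon data structures. A hypothetical "Brand X" extension might
look like this (the GUID and struct below are invented purely for
illustration):

  // Invented "Brand X" extension: a private property set GUID...
  DEFINE_GUID(DSPROPSETID_BrandX_Effects,
      0x12345678, 0x1234, 0x1234,
      0x12, 0x34, 0x12, 0x34, 0x56, 0x78, 0x9a, 0xbc);

  // ...plus the property data both sides agree on.
  struct BRANDX_SHIMMER
  {
      float intensity;
      float rate_hz;
  };

An app (or a third-party API) drives it exactly the way it drives EAX:
QuerySupport() first, then Get()/Set() with the agreed-upon data.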
If a bunch of vendors agree on a particular property set, it becomes an
open extension to the API. (Examples: Voice Manager, EAX.) However, even
the property set mechanism may not be capable of supporting the needs of
a particular vendor. For example, Aureal has indicated that this
mechanism is simply not adequate to support the requirements of their
wavetracing API in A3D 2.x. I expect they're right.
Whether open or proprietary standards are better in "the big picture" is
very much a matter of personal opinion. Our position on this is a matter
of public record, and it is in any event not my intent in this post to
argue a particular point of view in order to incite debate.

I hope that those of you who have stayed awake this far may have a better
idea of how this stuff works. It isn't simple, and believe it or not,
there's a lot more detail I've left out.
Mainly I'm concerned with the number of folks saying things like "this
sucks so get that instead" without realizing that they're wasting their
time, either because they're comparing apples to oranges, or because
"this" and "that" are not in fact mutually exclusive.
I've also tried to keep this as unbiased as possible, although I do work
for QSound. It's Saturday, and I'm on my own time, so I'm not being paid
to tell you this stuff. I'm just a verbose techie avoiding my looming
homework assignment. ;-)
Scott Willing
Tech Info Guy
QSound Labs, Inc.
August 1998