Integrate high-quality audio into mobile design (Part 2)
Performance Trade-offs
Perhaps the most important performance trade-off an embedded audio system designer or product manager will need to consider is quality versus quantity. Audio synthesizers are susceptible to the same marketing pressures that affect other technologies in the mobile market, and quite often it is a numbers game, with synthesizer polyphony being the most prevalent number bandied about.
It is possible to get more voices for the same processor bandwidth by reducing the complexity of a voice. As usual, this comes at a cost to quality, but that point may be moot if the user is listening to the audio through an 8 mm transducer. Here are some trade-offs to examine when playing the numbers game:
Sample rate is probably the single biggest contributor to audio quality. However, if the transducer is specified at 300 Hz to 3 kHz ±3 dB, there is little point in running the synthesizer at 48 kHz. The processor bandwidth used by the synthesizer engine scales directly with the sample rate. Of course, as the sample rate drops, other parts of the synthesizer become larger contributors to the overall performance.
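To put rough numbers on the sample-rate trade-off, here is a minimal sketch. It assumes that per-sample work dominates the engine's cost, so cost scales linearly with sample rate, and it uses 8 kHz as a hypothetical candidate rate; neither the function names nor the 8 kHz figure come from a particular product.

```python
from fractions import Fraction

def nyquist_rate_hz(highest_freq_hz: int) -> int:
    """Lowest sample rate that can represent content up to highest_freq_hz."""
    return 2 * highest_freq_hz

def relative_cost(sample_rate_hz: int, reference_rate_hz: int) -> Fraction:
    """Engine cost relative to a reference rate, assuming linear scaling."""
    return Fraction(sample_rate_hz, reference_rate_hz)

# Transducer that rolls off above 3 kHz (figure from the text above).
print(nyquist_rate_hz(3000))       # 6000
# Hypothetical 8 kHz engine compared with a 48 kHz engine.
print(relative_cost(8000, 48000))  # 1/6
```

Under these assumptions, an 8 kHz engine still covers the transducer's passband while using about one sixth of the per-sample processing of a 48 kHz engine.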
Some synthesizer architectures feature a low-pass filter that can be controlled by the sound designer. This can be used to increase the overall quality of the instrument sounds. The filter uses considerable processor bandwidth in the synthesizer engine, and eliminating it may reduce execution time by as much as 35%. However, dropping the filter may require additional sample memory to properly synthesize certain types of sounds.
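The effect of the filter on polyphony can be estimated with simple arithmetic. The sketch below assumes the best case quoted above (a 35% reduction) applies to the whole per-voice cost; the cycle counts and the budget are hypothetical units chosen only to make the ratio visible.

```python
def voices_in_budget(budget_cycles: int, cycles_per_voice: int) -> int:
    """Whole number of voices that fit in a fixed processing budget."""
    if cycles_per_voice <= 0:
        raise ValueError("cycles_per_voice must be positive")
    return budget_cycles // cycles_per_voice

# Hypothetical figures: 100 units per voice with the filter,
# 65 units without it (the 35% best case), 3200 units available.
CYCLES_WITH_FILTER = 100
CYCLES_WITHOUT_FILTER = 65
BUDGET = 3200

print(voices_in_budget(BUDGET, CYCLES_WITH_FILTER))     # 32
print(voices_in_budget(BUDGET, CYCLES_WITHOUT_FILTER))  # 49
```

In this illustration the same budget goes from 32 to 49 voices, before accounting for any extra sample memory the filterless instruments may need.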
Stereo output can also be costly. While most of the signal path in a mobile synthesizer is monophonic, the final output stage uses a stereo pan control to steer audio output to left and right channels. Eliminating the stereo pan control reduces execution time by eliminating the control logic and MACs in the inner loop, reduces the memory footprint by cutting the buffer size in half, and reduces cache pollution as well.
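The difference between the two output stages can be seen in a simplified mixing loop. This is an illustrative sketch in Python rather than the fixed-point inner loop a real engine would use; the function names and the interleaved left/right buffer layout are assumptions.

```python
def mix_mono(voice, gain, out):
    """Accumulate one voice into a mono buffer: one multiply-accumulate per sample."""
    for i, sample in enumerate(voice):
        out[i] += gain * sample

def mix_stereo(voice, gain_left, gain_right, out):
    """Accumulate one voice into an interleaved L/R buffer: two MACs per sample."""
    for i, sample in enumerate(voice):
        out[2 * i] += gain_left * sample
        out[2 * i + 1] += gain_right * sample

voice = [1.0, -0.5]

mono_out = [0.0] * len(voice)          # N samples
mix_mono(voice, 0.5, mono_out)

stereo_out = [0.0] * (2 * len(voice))  # 2N samples for the same audio
mix_stereo(voice, 0.25, 0.75, stereo_out)

print(mono_out)
print(stereo_out)
```

The stereo version does twice the multiply-accumulate work per voice and needs an output buffer twice as long, which is where the execution-time and memory savings come from when the pan stage is dropped.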
Size is Important
FM synthesizers use a purely algorithmic method of synthesizing instrument sounds, while sampling synthesizers use a mixture of algorithms and recorded audio. As a result, FM synthesizers usually require much less memory for storing instrument programs than sampling synthesizers do.
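To illustrate why an FM instrument needs so little storage, here is a minimal sketch of textbook two-operator FM, in which one sine wave modulates the phase of another. It is not the algorithm of any specific FM chip; the function name and parameter values are hypothetical.

```python
import math

def fm_tone(carrier_hz, modulator_hz, index, sample_rate_hz, num_samples):
    """Two-operator FM: sin(2*pi*fc*t + index * sin(2*pi*fm*t))."""
    samples = []
    for n in range(num_samples):
        t = n / sample_rate_hz
        phase = 2.0 * math.pi * carrier_hz * t
        modulation = index * math.sin(2.0 * math.pi * modulator_hz * t)
        samples.append(math.sin(phase + modulation))
    return samples

# The whole "instrument program" here is three numbers:
# carrier frequency, modulator frequency, and modulation index.
tone = fm_tone(440.0, 880.0, 2.0, 8000, 8000)
print(len(tone))
```

A sampled instrument producing the same one-second tone would have to store recorded audio, whereas the FM version stores only a handful of parameters and computes every sample.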
Since sample-based audio synthesis has at its core a wavetable of recorded sounds to drive the oscillators, the size and quality of the wavetable are crucial to the resulting quality of the synthesized sound. Therefore, the process of wavetable creation or selection is considered by many to be the most important aspect of a successful MIDI solution. After all, you can have the most elegant synthesizer design possible, but if the samples you are playing back are of poor quality, the entire solution will sound bad.
The samples must be free of any background or player noise, of adequate dynamic range, and consistent in loudness and timbre across the range of notes sampled. It is not enough to have balance across one instrument's scale; all instruments must balance with each other when played together in the context of a musical piece. This requires more than engineering finesse and is often a process undertaken by professionally trained musicians with highly discerning ears.
Once the instruments have been sampled, the resulting recordings need to be key-mapped. In this process, each sample is assigned the range of notes for which it will be played back. After the key mapping is completed, the task of "voicing," or adding the synthesizer control structures, is done. This involves musical decisions and programming to take the final set of recordings to a playable state.
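A key map can be pictured as a small table of note ranges. The sketch below is one possible representation, not a format defined by any standard mentioned in this article; the region boundaries, sample names, and the `lookup` helper are all hypothetical. Notes away from a sample's root are reached by resampling at the equal-tempered ratio 2^(semitones/12).

```python
from typing import NamedTuple

class KeyRegion(NamedTuple):
    low_note: int     # lowest MIDI note served by this sample
    high_note: int    # highest MIDI note served by this sample
    root_note: int    # MIDI note at which the sample was recorded
    sample_name: str  # hypothetical identifier of the recording

PIANO_KEYMAP = [
    KeyRegion(36, 53, 48, "piano_low"),
    KeyRegion(54, 65, 60, "piano_mid"),
    KeyRegion(66, 84, 72, "piano_high"),
]

def lookup(keymap, note):
    """Return (sample_name, playback_ratio) for a MIDI note number."""
    for region in keymap:
        if region.low_note <= note <= region.high_note:
            ratio = 2.0 ** ((note - region.root_note) / 12.0)
            return region.sample_name, ratio
    raise KeyError(f"no sample mapped to MIDI note {note}")

print(lookup(PIANO_KEYMAP, 60))  # ('piano_mid', 1.0)
print(lookup(PIANO_KEYMAP, 84))  # ('piano_high', 2.0)
```

Wider regions mean fewer recordings and a smaller wavetable, but each sample is stretched further from its root pitch, which is one place where the size and quality trade-off shows up.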
Voicing includes adding time- and velocity-variant filters, amplitude envelopes that modulate the volume over time, pitch modulation, and the layering of sounds for synthesizer voices. For the final wavetable to sound correct with the standard MIDI files available, careful attention should be given to volume balancing and "mixing" the instrument set so that it plays well in a multi-timbral, musical setting.
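As an example of one such control structure, the following sketch builds a piecewise-linear attack/decay/sustain/release amplitude envelope and applies it to a block of samples. It is a simplified illustration: segment lengths are given in samples, the segments are linear, and the function names are hypothetical. Real synthesizers often use exponential segments and trigger the release from a note-off event.

```python
def adsr(n_attack, n_decay, sustain_level, n_sustain, n_release):
    """Return a list of gain values, one per sample."""
    env = []
    for i in range(n_attack):    # ramp up from 0 toward 1
        env.append(i / n_attack)
    for i in range(n_decay):     # fall from 1 toward the sustain level
        env.append(1.0 - (1.0 - sustain_level) * i / n_decay)
    env.extend([sustain_level] * n_sustain)  # hold while the note is down
    for i in range(n_release):   # fade from the sustain level toward 0
        env.append(sustain_level * (1.0 - i / n_release))
    return env

def apply_envelope(samples, env):
    """Scale each sample by the matching envelope value."""
    return [s * g for s, g in zip(samples, env)]

env = adsr(4, 2, 0.5, 3, 2)
print(env)
```

The sound designer's job at this stage is choosing values such as the segment lengths and sustain level for each instrument, so that it both resembles the original and sits well next to the other instruments.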
Small-footprint wavetables for mobile handsets and audio players require additional work to reduce the size of the wavetable while maintaining high-quality output. This work may involve pitch and time compression techniques, specialized looping, sampling rate reduction, equalization, and many other techniques. To ensure the best results, small-footprint wavetables should be optimized for the playback synthesizer and the final product application.
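The payoff from looping and sample-rate reduction is easy to estimate. The sketch below uses hypothetical figures (a 2-second note recorded at 32 kHz, versus a 200 ms attack plus a 50 ms loop stored at 16 kHz, both as 16-bit samples); the numbers are for illustration only and do not describe any particular wavetable.

```python
def wavetable_bytes(duration_ms, sample_rate_hz, bytes_per_sample=2):
    """Storage needed for one mono recording of the given length."""
    num_samples = duration_ms * sample_rate_hz // 1000
    return num_samples * bytes_per_sample

# Hypothetical: store the whole 2-second note at 32 kHz.
full = wavetable_bytes(2000, 32000)

# Hypothetical: store a 200 ms attack plus a 50 ms loop at 16 kHz,
# and let the synthesizer repeat the loop for as long as the note is held.
looped = wavetable_bytes(200 + 50, 16000)

print(full, looped, full // looped)  # 128000 8000 16
```

In this example the two techniques together shrink one sample by a factor of 16, at the cost of reduced bandwidth and the extra care needed to make the loop points inaudible.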
Related Standards
Unlike a typical codec, MIDI does not guarantee a specific output for a specific input. Synthesizers from different vendors will produce different outputs from the same MIDI file based on their own algorithms and samples. Some vendors have addressed this issue with proprietary formats such as SMAF (Yamaha) and CMX (Qualcomm/Faith). While these proprietary standards allow the content author more control over the sound, they limit content to specific platforms that support the proprietary standard.
In contrast, General MIDI (GM) is a joint MIDI Manufacturers Association (MMA)/Association of Musical Electronics Industry (AMEI) standard that defines a common set of 128 instruments and 47 percussion sounds and the means to select them on any platform that supports it. This gives the author of a music file some assurance that when his or her composition requires a violin, the platform will attempt to reproduce a violin sound. General MIDI 2 increases the number of sounds available and further defines the behaviors of a compliant platform. However, even the combination of General MIDI and Standard MIDI Files (SMF) still cannot assure the quality of the sound that will be reproduced on a particular platform.
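Selecting a General MIDI instrument comes down to a two-byte MIDI Program Change message: a status byte of 0xC0 combined with the channel number, followed by the program number. The sketch below builds that message; the helper name is hypothetical. It takes the 1-based program number used in the GM instrument list (Violin is 41) and converts it to the 0-based value sent on the wire.

```python
def program_change(channel: int, gm_program: int) -> bytes:
    """Build a MIDI Program Change message.

    channel is 0-15; gm_program is the 1-based General MIDI number (1-128).
    """
    if not 0 <= channel <= 15:
        raise ValueError("channel must be 0-15")
    if not 1 <= gm_program <= 128:
        raise ValueError("gm_program must be 1-128")
    return bytes([0xC0 | channel, gm_program - 1])

GM_VIOLIN = 41  # General MIDI program number for Violin

print(program_change(0, GM_VIOLIN).hex())  # c028
```

The message says only "use the violin program." What that violin actually sounds like is still left to each vendor's synthesizer and sample set.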
To address this limitation, the Downloadable Sounds (DLS) standard was jointly created by the MMA and AMEI to allow content authors to create files of instrument sounds that can be downloaded to a compliant synthesizer. DLS gives the author a standardized method to control the sound of the instruments used to reproduce a musical performance. DLS-2 increases the capability of DLS-compatible synthesizers and provides for both forward- and backward-compatibility. DLS-2 (under the moniker SASBF) was adopted by the MPEG standards body in a joint effort with the MMA as part of MPEG-4 Structured Audio.
Shortly after the DLS-1 standard was ratified, the MMA and AMEI released the eXtensible Music Format (XMF), which combines an SMF music file with a DLS file into a single encapsulated file. This format gives the author a way to deliver an audio performance in a single compact file that gives the listener a consistent playback experience on compatible platforms.
Given the push to open the mobile platform to more content, we can expect to see standards-based formats make significant inroads in the near future. Indeed, the 3rd Generation Partnership Project (3GPP) has been working with the MIDI organizations to standardize a new musical file format for mobile devices. To that end, a joint task group from the MMA, AMEI, and 3GPP approved the Mobile-DLS standard (mDLS) in September 2004. This is an extension of the Downloadable Sounds (DLS) standard intended for mobile applications.
Mobile-DLS is a subset of the DLS-2 standard that provides for different profiles based on the capabilities of the device. A Mobile-DLS file can be combined with a MIDI music file into a Mobile-XMF file, creating a single file that can be accurately reproduced on a compatible synthesizer. While Mobile-XMF does not fully specify the audio output of the synthesizer down to the bit level, it represents a big step towards giving users a consistent playback experience across different mobile platforms.
Finally, JSR-135 (the Mobile Media API) is a Java ME specification that provides a way for Java applications running on a mobile device to access the music synthesizer. Through predefined transport controls, this interface can be used in games to play audio sound tracks, or in music applications that allow the user to compose or "remix" audio.
Wrap Up
High-quality audio is creating another opportunity for mobile operators to differentiate their product offerings. Polyphonic ring tones allow users to personalize their mobile devices, for example by identifying callers by the ring tone that sounds. The audio synthesizer is also enabling new kinds of content, including multimedia games and music composition applications. Integrating a high-quality audio synthesizer into a mobile platform presents its own unique challenges, but the rewards can greatly boost the bottom line and improve listeners' audio experiences.
About the Author
Dave Sparks is a senior software architect with Sonic Network, Inc., with more than 25 years of experience in embedded systems design. He chaired the MMA committee that drafted the DLS standard, authored the DLS-2 standard, and served as liaison to the MPEG standards body during its adoption into the MPEG-4 standard. He can be reached at sax_man@pacbell.net.