[Tip] Create Audio Applications

  • Clicking Noises

    If there is a discontinuity in the audio, for example a jump from zero to half the maximum signal level in one sample, you will hear a clicking sound. These discontinuities are commonly caused by one of the following:

    • Discontinuity in Audio Data: either the sound file or waveform you are using has a discontinuity, or you’ve created one with some sort of processing. For example, if you change the volume suddenly, you may hear a click as the resulting waveform jumps from one value to another. The solution is usually to “ramp” your volume or other processing parameters so that the change is made gradually over several samples rather than all at once. There are a number of ways to do this; the easiest is to change the value linearly over the course of, say, 20 to 100 samples. So instead of jumping from a gain coefficient of 0.1 to 0.2, you go from 0.1 to 0.101 to 0.102 to … to 0.198 to 0.199 to 0.2.
    • Dropouts: if your software and CPU are unable to provide audio data fast enough for the hardware, the hardware is forced to substitute zeros or other data to fill in the gaps. PortAudio can detect these dropouts on some platforms, but it does not do so reliably on all platforms. Make sure you have ample CPU power for the processing you are trying to do, and make sure you follow the advice for your platform about latency and threading. Choosing a good buffer size can help a lot, both with keeping CPU usage down and with improving the reliability of playback. If you are unsure how to set the buffer size, try using paFramesPerBufferUnspecified, and PortAudio will do its best to use an optimal (and possibly changing) buffer size.
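
    The linear ramp described above could be sketched in C roughly as follows. The GainRamp type and the gain_ramp_* helpers are hypothetical names made up for this illustration; they are not part of PortAudio. The sketch assumes float samples and a ramp length chosen by the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper state for a linear gain ramp (not a PortAudio type). */
typedef struct {
    float  current;    /* gain applied to the most recent sample */
    float  target;     /* gain the ramp is heading toward */
    float  step;       /* signed per-sample increment */
    size_t remaining;  /* samples left before the target is reached */
} GainRamp;

/* Start at a fixed gain with no ramp in progress. */
static void gain_ramp_init(GainRamp *r, float gain)
{
    r->current = gain;
    r->target = gain;
    r->step = 0.0f;
    r->remaining = 0;
}

/* Request a new gain, reached linearly over ramp_samples samples. */
static void gain_ramp_set_target(GainRamp *r, float target, size_t ramp_samples)
{
    assert(ramp_samples > 0);  /* debug-only precondition check */
    r->target = target;
    r->step = (target - r->current) / (float)ramp_samples;
    r->remaining = ramp_samples;
}

/* Apply the (possibly ramping) gain in place to n samples. */
static void gain_ramp_process(GainRamp *r, float *buf, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        if (r->remaining > 0) {
            r->current += r->step;
            if (--r->remaining == 0)
                r->current = r->target;  /* land exactly on the target */
        }
        buf[i] *= r->current;
    }
}
```

    Calling gain_ramp_set_target(&r, 0.2f, 100) on a ramp that currently sits at 0.1 moves the gain by about 0.001 per sample, which matches the 0.1, 0.101, 0.102 progression above. Because the state lives in the struct, a ramp can continue across several buffers.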

  • Detecting Dropouts

    An effective way to detect dropouts is to record or play back 30 seconds to one minute of a low-frequency sine wave (about 100 Hz) and listen on a speaker. Any discontinuity in the sine wave will be readily apparent with a noticeable click in the audio. This has proven to be more effective than listening for dropouts in complex waveforms such as music or speech. The low frequency of the sine wave is fairly easy on the ears.

    Note that an extremely loud pop or click can be damaging not only to your speakers, but also your ears, so keep the volume low while you are writing new software!

  • Threading

    At the moment, the thread safety of PortAudio is technically unspecified and hence platform/implementation dependent.

    • Calls to open and close streams are definitely not thread safe on any platform. This means that if you plan to open and close streams from more than one thread, you will need to provide a locking mechanism to ensure that you don’t open or close streams in different threads at the same time.
    • In general, calls to ReadStream and WriteStream on different streams in different threads are probably safe. That means that if you are doing Blocking I/O, one thread may read and write to one stream, and another thread may read and write to another.

  • Changing the Volume of Audio

    Audio data can easily be made louder or softer by multiplying each sample by a constant. For example, to halve the amplitude (which corresponds to about 6 dB of gain reduction), simply multiply each sample by 0.5. Note that if you change this value abruptly, you might end up with the clicking noises described under Clicking Noises.
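
    As a rough illustration, the sketch below scales a buffer of 16-bit samples by a constant gain. The function name apply_gain_i16 is hypothetical, and the clamping step is an addition not discussed in the text: it only matters for gains above 1.0, where the product could otherwise overflow the 16-bit range. A gain of 0.5 corresponds to 20·log10(0.5) ≈ −6.02 dB.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Multiply each 16-bit sample by a constant gain, in place.
 * Hypothetical helper for illustration; not a PortAudio function. */
static void apply_gain_i16(int16_t *samples, size_t n, float gain)
{
    assert(samples != NULL || n == 0);  /* debug-only precondition check */
    for (size_t i = 0; i < n; ++i) {
        float scaled = (float)samples[i] * gain;
        /* Clamp so that gains above 1.0 saturate instead of wrapping around. */
        if (scaled > 32767.0f)  scaled = 32767.0f;
        if (scaled < -32768.0f) scaled = -32768.0f;
        samples[i] = (int16_t)scaled;   /* the cast truncates toward zero */
    }
}
```

    For example, apply_gain_i16(buf, n, 0.5f) halves the amplitude of every sample in buf.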

  • CPU Usage

    In many applications, it is important to keep the CPU usage to a minimum. Minimizing CPU usage may conflict with other objectives, such as minimizing latency, so it is up to you to decide which of these suggestions to use.

    Here are some tips for things that might improve CPU performance:

    • Use a high latency: Try using a large latency, which keeps lots of audio data buffered, so that the OS doesn’t need to constantly switch your audio app in and out of context. Of course, using too high a latency might cause cache misses, so you’ll have to experiment to see what’s best for your app.
    • Use a large buffer size: If you are using small buffer sizes, try something larger. Most platforms do well with something in the range of 128 to 1024 frames, and powers of two usually work well. Often, using paFramesPerBufferUnspecified will result in both good CPU performance and low latency, though this depends on the platform.
    • Try platform-specific flags: Some platforms provide special include files that expose platform-specific features. For example, on Mac OS X, you can use the functions and constants in pa_mac_core.h to adjust sample-rate conversion settings and other things.

  • Audio Files

    Reading and writing audio from a file is one of the first things many people want to do with PortAudio. There are generally two considerations to keep in mind:

    • Don’t do I/O in the callback: Because the callback is a sensitive place in most operating systems, it is a good idea to pass the data to another thread for file I/O rather than doing file I/O in the callback. Even if the OS docs say it’s safe, doing I/O in the callback can cause unbounded delays, which may cause dropouts in playback. If passing audio data between threads sounds hard to you, you’re right; it’s not easy. However, the task is made much easier by PortAudio’s Blocking I/O interface. (Users of the older version of PortAudio, V18, may be able to take advantage of the PABLIO interface.) Because the blocking interface takes care of transferring audio between threads for you, it is safe to do file I/O in the same thread as your blocking I/O calls.

  • Callbacks

    What is permissible in a PortAudio callback function?

    (This page is a summary of discussions from the mailing list.)

    Audio callbacks are often executed at a very high priority. In general the only things that should be happening in the callback routine are math and data movement. One should avoid:

    • memory allocation
    • file or device I/O
    • crossing language boundaries such as calling Java or Lua
    • calling graphics routines or other OS services

    If you need to gather input from MIDI, keyboards, a mouse or files then do that in a foreground thread and feed it into the audio callback via a command or data FIFO. If the audio callback needs to produce any output such as log files or graphics then that information should be put into an in-memory FIFO that is serviced by another foreground thread.

    • Avoid acquiring locks which could be held by lower priority threads. This can cause priority inversion. http://en.wikipedia.org/wiki/Priority_inversion
    • The execution time of your callback function must be guaranteed to be less than the duration of one callback buffer (framesPerBuffer frames). It can become unbounded if you acquire locks (directly or inside OS API calls), wait for something unbounded (like disk access or graphics card access), or perform computationally unbounded operations such as unbounded garbage collection sweeps or anything significant with time complexity over O(1) and a variable working set size.

    The above two points are basic principles of real-time programming. Most of the rest of the advice we are giving derives from these two points. Very little third-party code you will encounter on a desktop OS can give you the above guarantees because desktop OSes are generally not real-time OSes. There is plenty of code out there which uses safe O(1) algorithms, bounded time incremental GC, special memory allocators, but you need to research these things and not assume you are using them (you probably aren’t) because simplicity and performance trump real-time operation for most desktop OS tasks.
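
    To make the deadline concrete, the following sketch computes how long one callback buffer lasts and checks a measured callback time against a fraction of it. The helper names and the idea of a max_fraction safety margin are assumptions made for this illustration; PortAudio does not define them.

```c
#include <assert.h>

/* Duration of one callback buffer in milliseconds.
 * Hypothetical helper for illustration; not a PortAudio function. */
static double buffer_duration_ms(unsigned long frames_per_buffer,
                                 double sample_rate_hz)
{
    assert(sample_rate_hz > 0.0);  /* debug-only precondition check */
    return 1000.0 * (double)frames_per_buffer / sample_rate_hz;
}

/* Returns 1 if a measured callback time fits within max_fraction of the
 * buffer duration (for example 0.5 to leave half the period as headroom). */
static int within_budget(double elapsed_ms, unsigned long frames_per_buffer,
                         double sample_rate_hz, double max_fraction)
{
    double deadline_ms = buffer_duration_ms(frames_per_buffer, sample_rate_hz);
    return elapsed_ms <= max_fraction * deadline_ms;
}
```

    At 44.1 kHz, a 256-frame buffer lasts about 5.8 ms, so everything the callback does has to finish comfortably inside that window every time it is called.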

    On some platforms the vendor has made explicit stipulations about what you can and can’t do in the callback. On OS X, for example, Apple does so because the callback runs in a special real-time thread (they’re based on what I said above: basically, anything at the BSD level or higher is unsafe, some Mach stuff might be OK). On other platforms, if you set the latency high enough, you might be able to get away with calling a lot of unkosher stuff, but if you start up Microsoft Office while you’re doing audio you may get a surprise. From a portability point of view, in principle, on a target OS the callback could run in a special IOProc, interrupt or some other context where regular OS calls are not even available.

    PortAudio often uses a very high priority thread for the audio callback. Even setting real-time safety aside, it is usually not appropriate to make calls such as GUI drawing on such a thread, because this will reduce the overall responsiveness of your machine and potentially starve other tasks.

  • Transferring data to/from your application

    Transferring data between PortAudio and your application is typically accomplished using a ring buffer. A buffer is set up to hold approximately one-half second of audio data. During the callback function PortAudio reads data from or writes data to this buffer and keeps track of how much data has been read or written. A separate thread containing a timer “wakes up” at intervals of approximately 10 to 100 milliseconds and reads or writes this data from/to disk, performs graphics operations, memory allocation, etc. which involve calls to the operating system. The program pa_ringbuffer.c, available with the PortAudio source code, can be used for this purpose.

    If you are simply capturing audio and writing it to disk or reading audio from disk and playing it to a sound card or audio interface, the blocking interface is suitable. If you are also doing low-latency processing or monitoring then you should use the callback interface. In the callback, read or write your data from/to a ring buffer and then use another thread to perform the file I/O.
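
    The ring buffer idea can be sketched as a minimal single-producer, single-consumer queue built on C11 atomics. This is not the API of pa_ringbuffer.c; the RingBuffer type, the rb_* functions and the RB_CAPACITY value are hypothetical names chosen for this illustration. It assumes exactly one writer thread (for example the callback) and exactly one reader thread.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Power-of-two capacity so indices can be wrapped with a mask.
 * 32768 mono float samples is roughly 0.68 s at 48 kHz. */
#define RB_CAPACITY 32768u
#define RB_MASK     (RB_CAPACITY - 1u)

typedef struct {
    float          data[RB_CAPACITY];
    _Atomic size_t write_index;  /* advanced only by the producer */
    _Atomic size_t read_index;   /* advanced only by the consumer */
} RingBuffer;

static void rb_init(RingBuffer *rb)
{
    assert(rb != NULL);  /* debug-only precondition check */
    atomic_store(&rb->write_index, 0);
    atomic_store(&rb->read_index, 0);
}

/* Producer side: copies up to n samples in, returns how many were stored.
 * Never blocks and never allocates, so it is usable from the callback. */
static size_t rb_write(RingBuffer *rb, const float *src, size_t n)
{
    size_t w = atomic_load_explicit(&rb->write_index, memory_order_relaxed);
    size_t r = atomic_load_explicit(&rb->read_index, memory_order_acquire);
    size_t free_space = RB_CAPACITY - (w - r);
    if (n > free_space) n = free_space;
    for (size_t i = 0; i < n; ++i)
        rb->data[(w + i) & RB_MASK] = src[i];
    /* Publish the new samples only after they have been written. */
    atomic_store_explicit(&rb->write_index, w + n, memory_order_release);
    return n;
}

/* Consumer side: copies up to n samples out, returns how many were read. */
static size_t rb_read(RingBuffer *rb, float *dst, size_t n)
{
    size_t r = atomic_load_explicit(&rb->read_index, memory_order_relaxed);
    size_t w = atomic_load_explicit(&rb->write_index, memory_order_acquire);
    size_t available = w - r;
    if (n > available) n = available;
    for (size_t i = 0; i < n; ++i)
        dst[i] = rb->data[(r + i) & RB_MASK];
    /* Release the slots only after the samples have been copied out. */
    atomic_store_explicit(&rb->read_index, r + n, memory_order_release);
    return n;
}
```

    In a recording application, the callback would call rb_write with each input buffer, while the timer-driven thread calls rb_read and writes whatever it gets to disk. If rb_write returns fewer samples than requested, the consumer has fallen behind and audio is being lost.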

  • Regarding crossing language boundaries such as calling Java or Lua

    In general it should be avoided. In fact, though, Lua has a bounded-time GC (except in rare circumstances involving huge tables, and that’s well documented), so, like the SuperCollider language, it could be used in a PortAudio callback so long as you make sure it doesn’t violate the points I made above: i.e. avoid calling the OS-level memory allocator, doing file I/O, or doing other things which violate the above in your Lua user functions. (Lua AV used to run audio processing in the callback.) That said, running Lua in a PortAudio callback is definitely at the experimental end of the spectrum.

 
