I’ve recently been implementing the Speech Synthesis SDK on iOS at VocaliD and wanted to share some quick hints on working with Apple’s Audio Queue Services. This is not a thorough treatise, just a few bullet points that might be helpful.

  • If you want the callback to run for a specific buffer, you have to enqueue the buffer (AudioQueueEnqueueBuffer) and the Audio Queue must be running (AudioQueueStart).
    While this sounds obvious, one might also think that creating a buffer (AudioQueueAllocateBuffer) already registers it with the audio queue. This is not the case! So you normally need calls to AudioQueueEnqueueBuffer in two places: before starting the queue and in the callback. This means that you either have to wait until you can fill all buffers before starting, or enqueue silence before starting the queue.
  • Moreover, a buffer must be enqueued with size > 0 to ensure that the callback for this buffer is called again.
    When retrieving your data from a source, explicitly check for size 0. In my case I happily enqueued everything I got from the synthesizer. Later I noticed that the output became creaky and at some point stopped completely (with the Audio Queue still running). Debugging the issue showed that the callbacks for some buffers were no longer being called. This happened for buffers with size 0, which occurred from time to time, and each occurrence reduced the number of buffers in rotation.
  • If you have to wait for data, enqueue silence.
    In my case a background thread was generating speech. So when the callback for a buffer was called, there was often no data available yet.
    I tried different approaches to learn more about how the Audio Queues work:
    – Return from the callback: This is shown in some tutorials but results in the callback not being called again for this buffer, again reducing the number of rotating buffers.
    – Enqueue an empty buffer (size 0): I thought this might give my background thread some time to produce data, but as described previously, this also results in the callback not being called again for this buffer.
    – Waiting: The callback blocks the Audio Queue, and even short sleeps of several milliseconds can lead to a timeout. Once again, the callback is then not called again for this buffer.
    – Enqueue silence: I found that enqueueing 1 sample of silence worked reasonably well to give the synthesis thread enough time to produce data before the callback is called again.
  • Consider creating a new queue from time to time instead of reusing an existing one.
    This applies especially when using AudioQueueStop with inImmediate set to false (e.g. when there is a pause until the next playback), because it is not straightforward to find out whether the Audio Queue has really finished playing when you plan to call AudioQueueStart again.
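
The "size > 0" and "enqueue silence" hints above can be sketched in plain C. This is only a minimal illustration, not the code from the SDK: fill_or_silence is a hypothetical helper, 16-bit PCM samples are an assumption, and no AudioToolbox calls are made so that the logic can be read on its own. The idea is that the function never reports 0 bytes, so a buffer filled this way always has something to enqueue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper (not part of AudioToolbox).
 * Copies up to capacity_samples 16-bit samples from src into dst.
 * If no samples are available, it writes a single sample of silence
 * instead, so the returned byte count is never 0. */
size_t fill_or_silence(int16_t *dst, size_t capacity_samples,
                       const int16_t *src, size_t available_samples)
{
    assert(dst != NULL && capacity_samples > 0);

    /* Never copy more than the buffer can hold. */
    size_t n = available_samples < capacity_samples
                   ? available_samples
                   : capacity_samples;

    if (n == 0) {
        /* No data yet: one sample of silence keeps the buffer rotating. */
        dst[0] = 0;
        n = 1;
    } else {
        memcpy(dst, src, n * sizeof(int16_t));
    }

    /* Size in bytes of the valid audio data now in dst. */
    return n * sizeof(int16_t);
}
```

In an actual output callback, dst would presumably be the buffer's mAudioData, the capacity would be derived from mAudioDataBytesCapacity, and the return value would be stored in mAudioDataByteSize before calling AudioQueueEnqueueBuffer. The same helper could also be used to enqueue the initial buffers before AudioQueueStart.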