Proposed Enhancements to PortAudio API

002 - Improve Device Formats Query Interface

Enhancement Proposals Index, PortAudio Home Page

Updated: August 30, 2002

Status

This proposal is sufficiently well defined to be implemented.

Background

The current (V18) method of querying devices for supported sample formats, channels, and sample rates (Pa_GetDeviceInfo()) is considered to be weak. It is incapable of representing formats where stream parameters are interdependent (for example where full duplex is only supported for certain sample rates.) We have found that representing device capabilities using a static structure is not a good match for many host APIs where format discovery is performed by polling the driver. This is because populating the structure may involve many polling operations. The current PortAudio API is able to provide details about hardware which supports a continuous range of sample rates, however many host APIs only allow sample rate information to be gathered by polling for specific rates.

It has been noted that most host audio APIs have poor device capability querying mechanisms. Even the better APIs (ALSA is perhaps one) don't necessarily provide accurate information. Ideally this proposal should seek to maximise the amount of information that can be extracted from existing APIs while remaining expressive enough to take full advantage of APIs with more advanced capability querying systems should they become available in the future. In reality, the available design options consist of providing an expressive interface that may provide inaccurate information on some platforms, or providing a less expressive interface which is more likely to provide accurate information, but may limit the amount and detail of information compared to that supplied by the underlying host API. The latter design option has been chosen for this proposal.

Proposal

The existing (V18) PaDeviceInfo structure and Pa_GetDeviceInfo() function will be retained to provide simple attribute information about the device such as it's name, and for representing information that can be obtained without querying the host API multiple times, such as input and output channel counts:

typedef struct PaDeviceInfo {
    int structVersion; 
    const char *name;
    PaHostApiTypeCode hostApi;

    unsigned int maxInputChannels;
    unsigned int maxOutputChannels; 

    double defaultSampleRate;
} PaDeviceInfo;

The following fields have been removed from the above structure, relative to V18: numSampleRates, sampleRates, nativeSampleFormats.

Complete sample rate information will only be available by polling PortAudio using the Pa_IsFormatSupported() function defined below. The defaultSampleRate field is provided so that simple clients have access a known working sample rate, and so that host APIs that only support one possibly non-standard sample rate can expose it to clients. Implementations should use the following heuristics to determine which value to provide for defaultSampleRate when multiple rates are available:

If the implementation must poll for available rates, the following search order could be used to select the defaultSampleRate: 44100.0, 48000.0, 32000.0, 24000.0, 22050.0, 88200.0, 96000.0, 192000.0, 16000.0, 12000.0, 11025.0, 96000.0, 8000.0.

The concept of "native sample formats" - those that are directly supported by the underlying host API - will be removed from the PortAudio specification as PortAudio's conversion capabilities allow support for all PortAudio sample formats.

A new Pa_IsFormatSupported() function will be added, which allows clients to query whether a particular combination of formats for input and output is valid. It returns 0 if the format is supported, and a PortAudio error code if the format is not supported. The special symbol paFormatIsSupported is provided to compare against the return value.

#define paFormatIsSupported (0)

PaError Pa_IsFormatSupported(
   const PaStreamParameters *inputParameters,
   const PaStreamParameters *outputParameters,
   double sampleRate );

The Use Structs for Pa_OpenStream() Parameters proposal defines the PaStreamParameters structure which is used to pass device, number of channels and sample format information to Pa_IsFormatSupported().

Discussion

This proposal provides a mechanism for querying PortAudio for supported formats without requiring implementations to construct device information structures by polling the host api multiple times. The down side to this design is that some information that host APIs may make available without polling is hidden from PortAudio clients. Most significantly, if a PortAudio client wishes to display a list of supported sample rates, it's only option is to poll PortAudio. This was deemed to be satisfactory because in many cases PortAudio is unable to establish the available sample rates without polling itself. The alternative of providing sample rate information only when it is available was considered, but discarded as being confusing and error prone.

Originally a goal existed to provide clients with information about which sample formats were natively supported and which were emulated by PortAudio. The idea was to allow clients to bypass PortAudio's conversion capabilities (and overhead) by using natively formatted buffers. This goal has been removed for three main reasons. A) Both buffer block adaption and unusual formats (such as varying endianness in ASIO, and multiple stereo interleaved with MME) make it difficult to provide a direct path between client and host api except in rare cases. B) PortAudio aims to use best-of-breed buffer conversion routines, thus making it undesirable for clients to implement their own conversion layer on top of PortAudio. C) It is difficult to imagine a Portable software design that would want to determine its output sample buffer format based on host API native format constraints; usually internal constraints such as use of a floating point signal path, or the format of a soundfile being played back would be more important.

Although the retention of maxInputChannels and maxOutputChannels in the PaDeviceInfo structure is considered desirable by some clients, it has not been determined whether implementation of this feature is practical without polling some host APIs multiple times.

It was suggested that we could do things the way MME does: if Pa_OpenStream() is called with a NULL stream parameter then the stream isn't opened, but it is checked to see if the device supports the specified format - if the format is supported then paNoError would be returned, otherwise an error code would be returned. However, this has been ruled to be unsatisfactory, since querying for supported formats is really a different function from opening a stream.

Prior to the definition of the PaStreamParameters struct in the Use Structs for Pa_OpenStream() Parameters proposal, the following signature was proposed for Pa_IsFormatSupported():

PaError Pa_IsFormatSupported( PaDeviceIndex inputDevice, 
                        int numInputChannels, 
                        PaSampleFormat inputSampleFormat,
                        void *inputDriverInfo, 
                        PaDeviceIndex outputDevice, 
                        int numOutputChannels, 
                        PaSampleFormat outputSampleFormat, 
                        void *outputDriverInfo, 
                        double sampleRate );

Impact Analysis

This proposal will provide clients with more expressive methods for querying device capabilities, which should improve the utility of PortAudio. Clients which currently depend on PaDeviceInfo fields which have been removed (sample rate and native formates information) may have to make significant changes to their code.