Proposed Enhancements to PortAudio API

022 - Well Defined Sample Conversion Semantics

Enhancement Proposals Index, PortAudio Home Page

Updated: April 26, 2004

Status

This proposal is under consideration by the PortAudio development community.

Background

When the client requests samples in a different format from that provided by the host API, PortAudio automatically converts the sample's format. This happens, for example, if the client requests floating point samples from a host device which deals only with 16 bit linear samples. The PortAudio V19 conversion infrastructure includes generalised format conversion code which implementations use to convert between all supported sample formats (linear signed and unsigned 8 bit, signed 16, 24 and 32 bit, and 32 bit floating point). The client may also choose to have samples clipped and/or dithered during the conversion process.

Previously the methods employed to convert between different sample formats were implicit and were not clearly specified. This lead to uncertaintly about the correctness of some conversions in the V19 code. It has resulted in contention over the best approach to perform some conversions. Concerns have also been voiced that sample conversion should be deterministic. Doubts surround a number of aspects of the conversion process, including:

What to put in the low bits in conversions which increase the bit depth? (zeros, noise, duplicate of high bits)
How to scale when converting between floating point and integer formats? (integer formats have asymmetrical ranges around zero, floating point formats are symmetrical)
Should dithered conversions always clip? (they did in V18)

This proposal aims to answer the above questions (and perhaps others) by clearly specifying PortAudio's sample conversion semantics. Once a clear specification has been completed, future releases of PortAudio will adhere to it.

The issues of filling the contents of low order bits when increasing bit depth, and of clipping and dithering were discussed in this thread: http://techweb.rfa.org/pipermail/portaudio/2004-April/003477.html

A discussion of scaling issues for floating point formats began here: http://techweb.rfa.org/pipermail/portaudio/2004-April/003489.html

The issue of converting sample formats has been raised in other fora. The JACK mailing list discussed this last year: http://sourceforge.net/mailarchive/forum.php?thread_id=3061300&forum_id=3040 where consideration was also given to the format converters used by libsndfile. In 2001 converting from 16 to 24 bit was discussed at length on the music-dsp list: http://aulos.calarts.edu/pipermail/music-dsp/2001-May/009785.html

Requirements

The goal of this proposal is to define how all conversions work. Each conversion will work in one way, there will not be "user selectable" conversion methods because this will place excessive load on the PortAudio developers. The V19 conversion infrastructure already provides a mechanism for users to substitute their own conversion functions if the PortAudio conversion functions do not meet their needs.

Proposal

None yet.

Discussion

TODO: extract issues with float-int conversion from mailing list posts. No solution can fulfill everyone's needs.

It should be noted that PortAudio is intended as a layer above so-called "Host API" audio services. This means that PortAudio may forward some conversion capabilities to the Host API or its drivers. This proposal needs to be clear about the degree to which PortAudio takes responsibility for ensuring the semantics of sample format conversion.

Impact Analysis

The outcome of this specification should impact clients in a positive way in that sample format conversion will be deterministic. Clients which depended on the exact scaling factor for float-int-float conversions may require modfications if such scaling factors change.