WL Elements - ASIO interface bit depth issue

Product: WaveLab Elements 8.0.4 (32-bit)
WaveLab Master Section turned off.
Audio Interface: Focusrite Scarlett 6i6/ASIO
Byte ordering in this discussion will be big-endian.

When recording a 24-bit session of a 16-bit PCM stream fed to the Scarlett 6i6 via SPDIF in, I am getting recorded samples whose least significant byte (LSB) is 0x01 on samples with a negative value (below the center axis). For samples with a positive value, the LSB is always 0x00. See below for an example. According to Focusrite support, all samples should have an LSB of 0x00. I suspect this error originates in the WaveLab software, as it does not happen with Ableton Live.

The Focusrite's interface to the DAW is always 24-bit. Focusrite tech support has advised that when 16-bit PCM is fed into the SPDIF in of the 6i6, the interface left-shifts each sample and zero-fills the LSB so that it can be transferred as a 24-bit sample.
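
If I understand that description correctly, the packing amounts to something like the following (my own illustration in C, not Focusrite's code):

    #include <stdint.h>

    /* Pack a signed 16-bit S/PDIF sample into a 24-bit slot by shifting it up
       8 bits and zero-filling the new LSB (multiplying by 256 is the
       well-defined way to express the shift for negative values in C). */
    static int32_t pack_16_to_24(int16_t s16)
    {
        return (int32_t)s16 * 256;   /* e.g. 0xfffd (-3) -> 0xfffd00 (-768) */
    }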

The issue was discovered while I was recording a session at 16-bit: the recorded samples did not match the samples in the SPDIF PCM stream. So I redid the recording as a 24-bit session and examined the saved samples. When 24-bit samples have the LSB set to 0x01 and the session is recorded at 16-bit (or recorded at 24-bit and saved as 16-bit), the saved samples are skewed by -1 as a result of WaveLab's word-length reduction. At first I thought the issue came from the word-length reduction itself, but after extensive analysis of the sample content I determined that the reduction for 16-bit sessions, or for 24-bit sessions saved as 16-bit, is a simple integer divide by 256; with that algorithm the bogus LSB of 0x01 skews the result for negative samples, whereas a simple right shift (LSB truncation) would not.
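
To make that skew concrete, here is a small C illustration of the arithmetic difference (the divide-by-256 behaviour is only my inference about WaveLab from the data, not confirmed, and the usual arithmetic shift on signed ints is assumed):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        int32_t clean   = -768;  /* 24-bit 0xfffd00, as in the SPDIF stream    */
        int32_t tainted = -767;  /* 24-bit 0xfffd01, as found in the session   */

        /* C integer division truncates toward zero, so for negative samples it
           differs from an arithmetic right shift, which rounds toward -infinity. */
        printf("divide: %d -> %d, %d -> %d\n", clean, clean / 256, tainted, tainted / 256);
        printf("shift : %d -> %d, %d -> %d\n", clean, clean >> 8,  tainted, tainted >> 8);
        /* divide: -768 -> -3, -767 -> -2   (-2 is 0xfffe, the value actually saved)
           shift : -768 -> -3, -767 -> -3   (-3 is 0xfffd, the value I expect)      */
        return 0;
    }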

The following is output from a PCM diagnostic tool I use: samples from a 24-bit session, followed by the result of saving that session to a 16-bit file.

     24-bit Save:
                   LEFT                   RIGHT
     10681 [ 00 03 00 ^  -80.8 dB | 00 01 00 ^  -90.3 dB ]
     10682 [ ff fd 01 v  -80.8 dB | ff ff 01 v  -90.3 dB ]
     10683 [ 00 03 00 ^  -80.8 dB | ff ff 01 v  -90.3 dB ]
     10684 [ ff fd 01 v  -80.8 dB | ff ff 01 v  -90.3 dB ]
     10685 [ 00 03 00 ^  -80.8 dB | ff ff 01 v  -90.3 dB ]
     10686 [ ff fd 01 v  -80.8 dB | 00 00 00 -   -Inf dB ]
     10687 [ 00 03 00 ^  -80.8 dB | ff fe 01 v  -84.3 dB ]
     10688 [ ff fd 01 v  -80.8 dB | 00 01 00 ^  -90.3 dB ]
     10689 [ 00 03 00 ^  -80.8 dB | ff fc 01 v  -78.3 dB ]
     10690 [ ff fe 01 v  -84.3 dB | 00 04 00 ^  -78.3 dB ]

    ( A )   (   B   ) C  (  D   )  (   E   ) F  (  G   )
     

     16-bit Save from the above 24-bit session
                   LEFT                RIGHT
     10681 [ 00 03 ^  -80.8 dB | 00 01 ^  -90.3 dB ]
     10682 [ ff fe v  -84.3 dB | 00 00 -   -Inf dB ]
     10683 [ 00 03 ^  -80.8 dB | 00 00 -   -Inf dB ]
     10684 [ ff fe v  -84.3 dB | 00 00 -   -Inf dB ]
     10685 [ 00 03 ^  -80.8 dB | 00 00 -   -Inf dB ]
     10686 [ ff fe v  -84.3 dB | 00 00 -   -Inf dB ]
     10687 [ 00 03 ^  -80.8 dB | ff ff v  -90.3 dB ]
     10688 [ ff fe v  -84.3 dB | 00 01 ^  -90.3 dB ]
     10689 [ 00 03 ^  -80.8 dB | ff fd v  -80.8 dB ]
     10690 [ ff ff v  -90.3 dB | 00 04 ^  -78.3 dB ]

The columns above are defined as follows:
A : sample #
B : left channel sample value in hex (big-endian format; 3 bytes for 24-bit, 2 bytes for 16-bit)
C : left channel sample polarity indicator (^ = positive, v = negative, - = zero)
D : left channel sample level in dBFS
E : right channel sample value in hex (big-endian format)
F : right channel sample polarity indicator (^ = positive, v = negative, - = zero)
G : right channel sample level in dBFS

I present this situation and my observations to the community to see if anyone can provide insight into what I am seeing, or suggestions to help diagnose it further. Specifically, the issue I am concerned about is the injection of LSB = 0x01 into the 24-bit session for the negative samples of the 16-bit PCM SPDIF stream as captured by WaveLab.

I would really like to rule out completely that this data is being inserted by the interface or its drivers; to do that I would need to talk directly to the ASIO driver and see exactly what the audio interface is presenting, and at present I don't know what I can use to do so. When I got samples with all LSBs = 0x00 from Ableton Live, I immediately directed my suspicion toward WaveLab. But further analysis of the 24-bit samples captured by Ableton exposed some other issues, and I am not completely sure I can use it as a gold standard.

So I am here hoping that someone from Steinberg development sees this and may be able to provide some insight into whether there could be a bug in WL in this respect. This may in fact be a bug that has flown under the radar, as a sample variation of 1 would not typically be detectable, or would be obscured by dither. The fact that I am presenting computer-constructed PCM data to the interface allows me to validate the chain from source to recorded file and to detect something as subtle as a single-bit error.

Can you summarize this with an example…
if you record “A” in 24 bit, what do you see saved in 16 bit by WaveLab? (B)
And what do you expect? (C), instead of (B)?

Summarizing, using sample 10688 left channel from the OP.
“A” : Recording in 24-bit, where “A” is originally 16-bit 0xfffd (presumably transferred as 0xfffd00)
“B” : Saved in 16-bit: 0xfffe.
“C” : I am expecting: 0xfffd

I must note that when I record in 24-bit and save in 24-bit
“A” : as above
“B” : Saved in 24-bit: 0xfffd01
“C” : I am expecting: 0xfffd00


Darren

I did some tests without recording (I don’t have your hardware and drivers), just using the save operation to convert 24 bit to 16 bit (I artificially created a sample of 0xfffd00).
The results are the ones you expect (and that I expect too). IOW, the sample reduction of WaveLab is fine.
This means the bit change you mention happens upstream: in the ASIO driver or in the ASIO-to-WaveLab converter (samples from the driver are received by WaveLab in 32 bit float). For this last step, I have to ask someone who is currently on vacation. Maybe an answer next week.

Yes, I initially suspected the sample (word length) reduction was generating the error, but I wanted to be very sure of what I was seeing before I posted, and after extensive analysis I was able to rule it out (specifically for saving to 16-bit or recording at 16-bit). Unfortunately, that analysis did not take into consideration any extra int-to-float or float-to-int conversions.

I really don’t know what sample type the hardware’s ASIO driver is providing; I had been assuming it was an int. I had also been assuming that for non-32-bit-float sessions the data was processed and stored internally as ints, based on Elements’ 2 GB recording limit, but I wasn’t sure about that, as I’ve read that the effects chain is done in 32-bit float.

PG, regarding the ASIO-to-WaveLab converter and 32-bit floats: for a 24-bit session, is the sample converted from a 24-bit int from the interface’s ASIO driver to a 32-bit float and then back to a 24-bit or 16-bit int, or is it always maintained as 32-bit float in the session and converted only on save to disk?

The samples provided by the ASIO driver are either 24 bit or 32 bit float. If 32 bit float, then the conversion is internal to the driver.
A sample delivered by the driver is then collected by a Steinberg device called baios, which itself delivers it to WaveLab. If the sample collected from the driver is 32 bit float, then WaveLab gets the same value from baios. Else baios does the 32-bit-float/24-bit conversion. I have to ask someone if this is done correctly.

I haven’t done a lot of reflection or research on 32-bit float representation of PCM data. As I think about it, I can see at least two ways to represent a PCM sample as a 32-bit float (a short C sketch of both follows the list):

  1. a traditional conversion between data types (e.g. in C: float f32 = (float) int24)
    or
  2. a floating-point representation of the sample in the range -1.0 to 1.0
    or
    something else?
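
For illustration, the two candidates in C (my sketch only; the exact full-scale constant for option 2 is itself part of the question):

    #include <stdint.h>

    /* Option 1: plain type conversion -- the float carries the full-scale
       integer value, e.g. -768 becomes -768.0f. */
    static float int24_to_float_cast(int32_t s24)
    {
        return (float)s24;
    }

    /* Option 2: normalized representation -- the sample is scaled into the
       range -1.0 .. 1.0.  2^23 is shown here as the full-scale constant for
       signed 24-bit, but the choice of scaler is exactly what is in question. */
    static float int24_to_float_norm(int32_t s24)
    {
        return (float)s24 / 8388608.0f;   /* 8388608 = 2^23 */
    }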

What does WaveLab use?

The standard is #2

@PG: Thanks for clarifying that. I await your findings from the pending discussions with your colleague.

In the meantime, I am starting to suspect that “something” is “messing up” the multiplier or divisor constants used for converting between the asymmetric integer domain and the symmetric float representation; that is, the multipliers and divisors used may not be symmetric.

There are three multiplier/divisor choices on each side of the conversion (always use 2^n; always use 2^n-1; or use 2^n if negative, else 2^n-1, where n = number of bits), and there are multiple possibilities for how the result of the float-to-integer conversion is rounded. Overall, there are many ways to introduce this type of single-bit error.

For rounding the float 32 representation back to an n-bit sample, most discussion I have found refers to the round-half-away-from-zero method (add 0.5 if > 0, subtract 0.5 if < 0, then truncate). So in addition to dissimilar multipliers and divisors for these conversions, the possibility that this rounding is misapplied with respect to polarity also comes into play.

As an example with negative samples: take the 24-bit sample 0x’fffd00’ (-768) and convert it to float 32 and then back to 24-bit. Across all the permutations of multipliers, divisors and +/- 0.5 rounding, only four methods produce the result 0x’fffd01’:

   int -> float                 float -> int
   ------------------------------------------------
1  f32 = i1 / 2^n               i2 = f32 * 2^n + 0.5
2  f32 = i1 / 2^n               i2 = f32 * (2^n - 1)
3  f32 = i1 / 2^n               i2 = f32 * (2^n - 1) + 0.5
4  f32 = i1 / (2^n - 1)         i2 = f32 * (2^n - 1) + 0.5

Where
    i1  = the original 24-bit sample, i2 = the 24-bit sample converted back from float 32,
   f32 = the sample as float 32 (in the range -1.0 to 1.0), n = 24,
   and the float-to-int result is truncated after any +0.5 is applied

In all of these cases, the conversions are not applied symmetrically. Furthermore, cases 1, 3 and 4 simply apply the rounding incorrectly for negative samples, so I am left to assume case 2 is the primary suspect for introducing the error. I find it hard to believe such a case could be inside the WaveLab code and go undetected until now.
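
To check that case 2 really can produce the observed value, here is a quick C experiment using the constants from the table above (it only shows the arithmetic is capable of it; it says nothing about where in the chain it actually happens):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        int32_t i1  = -768;                         /* 24-bit 0xfffd00            */
        float   f32 = (float)i1 / 16777216.0f;      /* int -> float, scaler 2^24  */
        int32_t i2  = (int32_t)(f32 * 16777215.0f); /* float -> int, scaler 2^24-1,
                                                       plain truncation, no rounding */

        printf("%d -> %.9f -> %d\n", i1, (double)f32, i2);
        /* -768 -> -0.000045776 -> -767, i.e. 0xfffd00 comes back as 0xfffd01 */
        return 0;
    }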

There doesn’t seem to be an industry-wide accepted method for doing these conversions, although I suspect Steinberg is in a position to define a de facto standard. If such a de facto standard exists, I have yet to see its specification.

It is still not clear to me what all the conversion stages are, where and by what they are performed, and how each conversion is done. I hope that eventually enough information can be collected to construct a complete picture of the sample-processing chain from end to end and identify the conflicting operation.


In the meantime, while this waits to be resolved, I have written a word-length reduction module for shntool that lets me reduce WaveLab 24-bit recordings to 16-bit outside of WaveLab using simple LSB truncation. This reduction module is a workaround that gives me the results I expect.
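
For anyone interested, the core of that reduction is nothing more than dropping the low byte of each packed 24-bit frame (this is just the idea, assuming little-endian WAV-style packing, not the actual shntool module):

    #include <stddef.h>
    #include <stdint.h>

    /* Reduce packed little-endian 24-bit samples to 16-bit by LSB truncation:
       keep the upper two bytes, drop the low byte.  For two's-complement data
       this is equivalent to an arithmetic right shift by 8. */
    static void truncate_24_to_16(const uint8_t *in24, uint8_t *out16, size_t nsamples)
    {
        for (size_t i = 0; i < nsamples; i++) {
            out16[2 * i]     = in24[3 * i + 1];   /* middle byte -> new LSB */
            out16[2 * i + 1] = in24[3 * i + 2];   /* top byte    -> new MSB */
        }
    }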


Darren

In 32 bit floats, there are enough bits to prevent any loss when you do back and forth conversions. And you don’t need to add +0.5 if you instruct the CPU to do the right rounding automatically. cf. https://en.wikipedia.org/wiki/IEEE_754-1985#Rounding_floating-point_numbers

So, you can simply use:

f32 = i1 / 2^n

and

i2 = f32 * 2^n
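
A minimal C sketch of that symmetric round trip (just an illustration, not WaveLab code), using lrintf() so the FPU's default round-to-nearest mode does the rounding instead of a manual +0.5:

    #include <math.h>
    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        const float scale = 16777216.0f;   /* 2^24, the same constant both ways */
        int mismatches = 0;

        /* Round-trip every 24-bit value: int -> normalized float -> int. */
        for (int32_t i1 = -8388608; i1 <= 8388607; i1++) {
            float   f32 = (float)i1 / scale;
            int32_t i2  = (int32_t)lrintf(f32 * scale);
            if (i2 != i1)
                mismatches++;
        }
        printf("mismatches: %d\n", mismatches);   /* expect 0: no loss either way */
        return 0;
    }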

This just identifies an additional dimension to this: which compiler and which compiler options/directives were used to build the code, since the compiler is responsible for generating the machine code that invokes those CPU instructions.

I need to dig into this, as I find in a quick test (with gcc 3.4.3) that this equivalence does not hold for n = 16, positive integers, and a scaler of 2^n-1. There does appear to be equivalence when the scaler is 2^n, for both positive and negative values. Shouldn’t the scaler be 2^n-1 for positive and 2^n for negative samples?
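
A sketch of the kind of check I mean, using plain truncation for the float-to-int step (results will depend on the rounding actually used):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        const float scale = 65535.0f;   /* 2^16 - 1; change to 65536.0f to compare */
        int mismatches = 0;

        for (int32_t i1 = 0; i1 <= 32767; i1++) {     /* positive 16-bit values */
            float   f32 = (float)i1 / scale;
            int32_t i2  = (int32_t)(f32 * scale);     /* plain truncation       */
            if (i2 != i1)
                mismatches++;
        }
        printf("mismatches with scaler %.0f: %d\n", (double)scale, mismatches);
        /* With 2^16 - 1 and truncation, some values come back one low;
           with 2^16 the round trip is exact. */
        return 0;
    }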

Shouldn’t the scaler be 2^n-1 for positive and 2^n for negative samples?

No

I can appreciate the computer science behind that as a standard. I can only hope that all products apply the premise equally; I’ve found too many papers and code examples that do not.
