Can digital audio playback be improved by resolutions greater than 16/44.1?


Copied from another thread, courtesy of jgatie - full question was:

"Do you believe digital audio, outside of mastering/production techniques, can be improved by playback resolutions greater than 16/44.1?"


My answer is qualified in two ways:
1. Not unless the higher numbers - I think calling them resolutions is misleading - are in some way needed to capture what enhanced mastering/recording techniques contain.
2. There are many other ways that digital audio can/will improve, in the realms of speaker sound quality and room response capability.
Until of course the time comes when we get digital implants feeding sound signals directly to the brain!
I'm not the expert here, and I know a number on this board have disciplined and detailed opinions on this, but for fun, I'll weigh in early and then enjoy stalking from the sidelines...

From my experience, all I have read, and personal opinion, I'll answer this way:

Not in any way that makes any meaningful difference to actual listening in a real world.

I live in a real house, with an honest-to-God forced-air *furnace*... and a refrigerator that insists on humming and buzzing at odd times, and where people actually flush a toilet, and ... you get the point.

I enjoy music, yes. I love quality audio, yes. If people want to isolate amazing 'listening rooms' and convince themselves they can 'hear a difference' *Obviously!* when they listen to a "HiRes" file... great for them. Have at 'er all you like. But don't tell the rest of us we're stupid for listening to poor quality audio. Truth is, we're listening to pretty much all the human ear can discern now.

Sure, HiRes may sell a few extra files for product manufacturers (but I'm not convinced most of the original recording sources are of a format that can actually have anything but a bunch of empty 0s in the extra bit depth of the file) and I'm one that will be quick to point out that stats sell products... maybe the world eventually needs to be producing equipment that can deliver better than the human ear can hear in a real environment... but I don't think we should allow ourselves to be fooled that it is real... many many people don't care about facts... HiRes is better than 'normal res' and we obviously need TVs that display finer grain than the eye can perceive from a natural viewing distance. That is the art of marketing... and we all know that marketing and facts may have a 'loose' relationship.
"many many people don't care about facts... HiRes is better than 'normal res'."
I get what you are saying, but Hi Res music has lost the race: it was won first by wireless multi room audio supported via streaming, which in turn is under attack for mind/market share from voice and home automation integration. People are willing to sacrifice even some sound quality for these features; the brain can fill in the small sound quality gaps quite easily, but the lack of these features remains once the need for them is perceived, and brain games can't be a substitute.

If none of these developments had taken place, Hi Res would probably have been the next big thing for lack of anything else because people are always looking for it and would have drunk that Kool Aid for lack of an alternative.

Niche markets will always exist for everything of course - see vinyl and how that exists with little in its favour in terms of either user ease or sound quality!
Here's a thought that I'm surprised hasn't occurred to me before. It is standard practice in audio production to add dither (low-level random noise) to an audio signal when converting from 24 bit to 16 bit. It helps to mask the truncation error caused by rounding the low level 24 bit signal to the nearest 16 bit value. As I've mentioned elsewhere, adjacent 16 bit levels can be as much as 6dB apart at low volume. Truncation error is most noticeable when fading music to silence (and is presumably more obvious in headphones or with the furnace turned off 🙂 )

My thought is this: if adding low-level noise is required to reduce the audible effect of truncation to 16 bits, surely it would be preferable to retain more bits and eliminate the need to add random noise?
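
To make the mechanics concrete, here is a minimal sketch of the two approaches being contrasted - plain truncation versus dithered quantization. It assumes NumPy and float samples in the range -1 to 1; the ±1 LSB triangular (TPDF) dither shown is the textbook choice, though real tools may differ:

```
import numpy as np

def to_16bit_truncated(x):
    """Round a float array in [-1, 1] to the nearest 16 bit level (no dither).
    Low-level signals pick up correlated, potentially audible rounding error."""
    return np.round(x * 32767) / 32767

def to_16bit_dithered(x, rng=np.random.default_rng()):
    """Add +/-1 LSB triangular (TPDF) dither before rounding, so the
    rounding error becomes benign, signal-independent noise."""
    lsb = 1 / 32767
    dither = rng.triangular(-lsb, 0, lsb, size=x.shape)
    return np.round((x + dither) * 32767) / 32767
```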

This link has more detail.
My limited understanding of the subject:
Used appropriately, 16 bits can give you 96dB of dynamic range; more than enough for almost everything but the Big Bang.
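
(The 96dB figure is just quantization arithmetic - roughly 6dB per bit. A quick check, for anyone who wants it:)

```
import math

# Ideal dynamic range of an N-bit quantizer is about 20*log10(2**N) dB.
for bits in (16, 24):
    print(f"{bits} bit: ~{20 * math.log10(2 ** bits):.0f} dB")
# 16 bit: ~96 dB
# 24 bit: ~144 dB
```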

But because it's so difficult to set levels for an unpredictable performance, and because guessing wrong almost inevitably results in a sub-par recording, it's more sensible to use 24 bits when capturing audio live. There's no harm done (relative to a 16 bit recording) in reducing 24 bit files to 16 bits - as long as it's done nicely; nicely being a layman's term for doing dithering properly, I suppose. My guess is that doing this isn't rocket science to people in the industry, and doing it allows for a best of both worlds approach, the other world being the one where living with 16 bits is dictated by the need to be compatible with the 16/44 format.

Will music recorded in 24 bit, rendered competently down to 16, sound audibly worse/different than that which can be played as recorded, in the 24 bit format? That is the question to be answered.
16 bits gives you 90dB. Dithering buys you the 96dB. You are right about the recording process needing 24 bits. There IS harm done reducing this to 16 bits, which is why dithering is necessary. My argument stands - if dithering is necessary, then more than 16 bits is preferable on playback.
Come to think of it, I don't think much of subjective competency is involved here. Many here have said that if a hi res file is down sampled to a Sonos compatible format and played on Sonos front ended kit, it cannot be audibly distinguished from where the file is being played in the native format via hi res capable kit, all other things like the speakers in use being the same.

I haven't any ABX experience of this, but I haven't found anyone that has countered this claim via ABX either.
"My argument stands - if dithering is necessary, then more than 16 bits is preferable on playback."

The counter argument is that if the dithering effects cannot be audibly picked up, why are more than 16 bits necessary?
The point is that if dithering is perceived to be necessary (which I think we agree on), wouldn't it be preferable to increase the bit depth rather than add random noise to audio?
Given the constraint imposed by the format, I suppose there isn't the space to increase the bit depth. Preferable perhaps but not essential then, and good engineering is that which achieves the required outcome with the least expenditure of resources.
Here is a Sep 2015 view on the subject from Sonos:
Sonos only supports up to CD quality, while others are capable of playing back 24-bit tracks. Giles Martin, sound leader at Sonos, claims this shouldn't make a difference though.

"You can't upscale audio, it doesn't work so it becomes a numbers game…I can make a call to shift us [to 24-bit], it isn't hard, but the problem is on the experiential side, it has to be right. You have to make sure things don't drop out or stop. Even with Tidal we are on the edge right now because the pipe needs to get bigger."

Martin added: "I refuse and the company refuses to play this numbers game where you go 'we're better than you', 'how much better', 'we are 8 better than you'. It doesn't make any sense as far as the consumer stuff goes and I think that's where we are at right now. I think it would be great if people listened to CDs right now and then say 'you know what, we need a bit more' and then they can experience it if they want and decide whether they can hear the difference or not as most people can’t."
And, if you read the often quoted xiph.org material on the subject, 24 bit for listening isn't even preferred to 16 bit, it just isn't necessary, and therefore a wasted resource.

A quoted part from that site:
"Professionals use 24 bit samples in recording and production for headroom, noise floor, and convenience reasons.

16 bits is enough to span the real hearing range with room to spare. It does not span the entire possible signal range of audio equipment. The primary reason to use 24 bits when recording is to prevent mistakes; rather than being careful to center 16 bit recording-- risking clipping if you guess too high and adding noise if you guess too low-- 24 bits allows an operator to set an approximate level and not worry too much about it. Missing the optimal gain setting by a few bits has no consequences, and effects that dynamically compress the recorded range have a deep floor to work with.

An engineer also requires more than 16 bits during mixing and mastering. Modern work flows may involve literally thousands of effects and operations. The quantization noise and noise floor of a 16 bit sample may be undetectable during playback, but multiplying that noise by a few thousand times eventually becomes noticeable. 24 bits keeps the accumulated noise at a very low level. Once the music is ready to distribute, there's no reason to keep more than 16 bits."

For why there is no reason, see the material on the site and see if it convinces you:) - dither is addressed.
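
A back-of-envelope illustration of that headroom argument (a sketch only: the 6.02N + 1.76 dB figure is the textbook ideal for an N-bit quantizer, and the 20 dB safety margin is just an example):

```
# If an engineer leaves 20 dB of safety margin rather than recording
# near full scale, this is what remains above the quantization floor.
HEADROOM_DB = 20
for bits in (16, 24):
    ideal_range = 6.02 * bits + 1.76   # textbook dynamic range in dB
    print(f"{bits} bit with {HEADROOM_DB} dB headroom: "
          f"~{ideal_range - HEADROOM_DB:.0f} dB usable")
# 16 bit: ~78 dB usable; 24 bit: ~126 dB usable
```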
All I know about 24 bit recordings is from my excellently produced and noticeably more expensive 24 bit recorded CDs from JVC, marketed as XRCDs (Extended Resolution Compact Discs). Very good packaging, liner notes and mastering. But no better than the best CDs from other production houses for sound quality. They now sit in my NAS and sound just as good as they did when played through a high end SACD player with a pure copper chassis (sic), wired to a pre amp + amp, wired in turn to very good passive speakers via thick speaker cables. But in place of all those boxes and messy wires and a component rack is a well placed 1 pair + Sub, tuned to the room via Trueplay. The racks of CDs are gone, into a NAS that isn't to be seen.

This is digital audio at its best. Will it be improved? Of course - via better active speakers and better versions of Trueplay. Hopefully there is also still some scope for Sonos to further improve the sound quality from the existing hardware, improvements delivered free over the net, all the way from the US to India.

This to my mind is real progress, not to be shied from. IMO, it is close to magic. All that is missing is the glow of the valves:D.
Peter, more from xiph for you to ponder and/or research:
"16 bit audio can go considerably deeper than 96dB. With use of shaped dither, which moves quantization noise energy into frequencies where it's harder to hear, the effective dynamic range of 16 bit audio reaches 120dB in practice [13], more than fifteen times deeper than the 96dB claim.
16 bits is enough to store all we can hear, and will be enough forever."

So not just dither, but shaped dither!
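
For the curious, shaped dither is conceptually simple. Here is a toy first-order error-feedback quantizer (a sketch assuming NumPy; proper shaped dither also adds a TPDF dither term before the quantizer, omitted here for brevity) that tilts quantization noise toward high frequencies:

```
import numpy as np

def quantize_noise_shaped(x, bits=16):
    """Toy first-order noise shaping: feed each sample's quantization
    error into the next sample, which pushes the noise spectrum upward,
    away from the ear's most sensitive region. Expects floats in [-1, 1]."""
    step = 2.0 / (2 ** bits)
    err = 0.0
    out = np.empty_like(x)
    for i, sample in enumerate(x):
        v = sample - err               # subtract the previous error
        q = np.round(v / step) * step  # uniform quantizer
        err = q - v                    # error to shape into the next sample
        out[i] = q
    return out
```

Plotting the spectrum of `quantize_noise_shaped(x) - x` against plain rounding shows the same high-frequency tilt as Monty's plot.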
Thanks Kumar. That is fascinating - I never realized dither could increase the dynamic range that much. Having thought about it for a bit, I now understand how this works. It's interesting that Monty's plot cuts off at about 10kHz. I got to wondering what happens beyond that, and also what happens if there is more than one frequency. So here is an extended version of his plot. You can see the noise is shifted to frequencies that most of us can't hear, but those with excellent/young hearing might perceive increased very-high-frequency noise. Multiple frequencies increase the noise level slightly (see here), so real music, with all frequencies present, might increase this further.
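
For anyone wanting to reproduce such a plot, here is a sketch of the idea (a hypothetical two-tone test signal, assuming NumPy and matplotlib; this is not Monty's actual script). Swapping the plain quantizer for a dithered or noise-shaped one shows how the noise flattens or shifts upward in frequency:

```
import numpy as np
import matplotlib.pyplot as plt

fs = 44100
t = np.arange(fs) / fs
# Two tones instead of one, per the 'more than one frequency' question
x = 0.25 * np.sin(2 * np.pi * 1000 * t) + 0.25 * np.sin(2 * np.pi * 7000 * t)

step = 2.0 / 2 ** 16
quantized = np.round(x / step) * step        # undithered 16 bit rounding

freqs = np.fft.rfftfreq(len(x), 1 / fs)
noise = quantized - x
spectrum_db = 20 * np.log10(np.abs(np.fft.rfft(noise)) / len(x) + 1e-12)

plt.semilogx(freqs, spectrum_db)
plt.xlabel("Frequency (Hz)")
plt.ylabel("Quantization noise (dB, approx.)")
plt.show()
```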

So, based on this, I still contend that if dithering is necessary, it would be preferable to use more bits and not add potentially audible noise. I concede this is all really low level stuff.

Cheers, Peter.
Here is a recent scientific paper that gathers together many reputable high-res audio tests in a meta-analysis. The main conclusions (taken from the abstract) are:

"Results showed a small but statistically significant ability of test subjects to discriminate high resolution content, and this effect increased dramatically when test subjects received extensive training."

"The overall conclusion is that the perceived fidelity of an audio recording and playback chain can be affected by operating beyond conventional levels."

A careful reading will reveal that the author makes no finding about the causes of this result. However, it is also mentioned that most of the contributing studies varied the sample rate, not the bit depth. (See section 4.1 for these two statements).
Already discussed. Notice there is purposefully no mention of quality differences in the meta-analysis, only "differences". This slight significance can easily be explained by the presence of intermodulation distortion in the playback hardware, caused by the stresses of reproducing ultrasonics. This distortion has been tested to be audible, and unfortunately for the misleading commentary of the meta-analysis, it is always harmful to the quality of the sound.
Already discussed.
Where?

This slight significance can easily be explained by the presence of intermodulation distortion...
Intermodulation distortion was identified as a potential bias in three studies that, as far as I can see, were not used in the meta-analysis. See Table 2B and the second last para of section 2.4. Studies 25, 60 and 61 do not appear in Table 2.
"I can make a call to shift us [to 24-bit], it isn't hard, but the problem is on the experiential side, it has to be right. You have to make sure things don't drop out or stop. Even with Tidal we are on the edge right now because the pipe needs to get bigger."

So he's saying that Sonos could shift to 24-bit, but then they would have trouble guaranteeing their current reliability. They know this as Tidal is putting their technology 'on the edge right now'. Perfectly logical, given the much larger file sizes involved.
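
The 'bigger pipe' point is easy to quantify: raw PCM bitrate scales directly with sample rate and bit depth (a quick sketch, stereo and uncompressed assumed):

```
# Raw (uncompressed) PCM bitrate = sample rate * bit depth * channels
for rate_hz, bits in ((44100, 16), (96000, 24)):
    mbps = rate_hz * bits * 2 / 1_000_000
    print(f"{bits}/{rate_hz / 1000:g}: {mbps:.2f} Mbit/s")
# 16/44.1: 1.41 Mbit/s; 24/96: 4.61 Mbit/s - over three times the data
```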

It doesn't make any sense as far as the consumer stuff goes and I think that's where we are at right now.

So he's not even saying that there isn't a difference, just that it doesn't matter to consumers - in particular, potential Sonos customers, which is what they'll be mainly concerned about.
Many here have said that if a hi res file is down sampled to a Sonos compatible format and played on Sonos front ended kit, it cannot be audibly distinguished from where the file is being played in the native format via hi res capable kit, all other things like the speakers in use being the same.

Surely the test should be on the same (HiRes capable) player. There might have been a case for separate players when the 'bit-perfect' argument was valid, but surely not on modern Connects.
Here is a recent scientific paper that gathers together many reputable high-res audio tests in a meta-analysis. The main conclusions (taken from the abstract) are:

"Results showed a small but statistically significant ability of test subjects to discriminate high resolution content, and this effect increased dramatically when test subjects received extensive training."


The link didn't work when I tried it.

A couple of things: first, the reference to the many "reputable tests" is a surprise. I have yet to come across even one for hi res that does a decent job of level matched blind AB testing, let alone a full protocol ABX, so where are these many "reputable" tests hiding? It would be useful to see one with a complete description.

Second, of course, is what does discrimination mean in this case? In addition to correctly (statistically) picking out the difference, is there also an indication that a statistically adequate/significant percentage of testers also expressed a preference for the hi res version? Those that did not, or those that said either version is fine with them - if these form a large part of the sample, what does that convey?
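
For what "statistically picking out the difference" means in practice: an ABX result is usually judged against the chance of guessing, a one-sided binomial test. A self-contained sketch (the 12-of-16 example is hypothetical):

```
from math import comb

def abx_p_value(correct, trials):
    """Probability of scoring at least `correct` out of `trials`
    by pure guessing (p = 0.5) in an ABX test."""
    return sum(comb(trials, k) for k in range(correct, trials + 1)) / 2 ** trials

print(abx_p_value(12, 16))  # ~0.038, below the usual 0.05 threshold
```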
PS: never mind the link, I found another way to the summary report. And came across para 4.2 there that lays down all the issues identified with the "reputable" tests that need to be addressed for better conclusions in future! Or did I read 4.2 wrong?
I also thought I would see if Hydrogen Audio has discussed this report; and why am I not surprised at:
https://hydrogenaud.io/index.php/topic,112204.0.html
I haven't read the entire thread, just enough to get a good sense of it: Shannon-Nyquist applied to save resources here as well! :D
Already discussed.
Where?


The search sucks on this site. Asking where is futile.


Intermodulation distortion was identified as a potential bias in three studies that, as far as I can see, were not used in the meta-analysis. See Table 2B and the second last para of section 2.4. Studies 25, 60 and 61 do not appear in Table 2.


You are missing the point. It doesn't matter if a couple of the studies factored in IM. It matters that the meta-analysis never considers qualitative differences in favor of Hi-res music, only "differences". Therefore the positive results in those studies that do not take IM into consideration could very well be the result of IM. But even more than that, my question was

"Do you believe digital audio, outside of mastering/production techniques, can be improved by playback resolutions greater than 16/44.1?"

Not whether, by the barest of margins, differences can be detected when put through meta-analysis, but whether playback can be "improved". The study you cite, by its very conclusions, has found no improvements, only "differences". It has no bearing on this thread.
It is puzzling why so much engineering effort has been/is being poured into hi res, and endless hours spent in promoting it, including even doing meta-analyses of these promotional efforts (!!!), and then the hours spent in saying that it is pointless.

I doubt that so much has been spent by so many on so little in any other field - to tweak a Churchillian phrase!
One only needs to look at the technical summation of the Reiss meta-analysis vs the press release to see what is at work here:

Paper:
In summary, these results imply that, though the effect is perhaps small and difficult to detect, the perceived fidelity of an audio recording and playback chain "is affected" by operating beyond conventional consumer oriented levels. Furthermore, though the causes are still unknown, this perceived effect can be confirmed with a variety of statistical approaches and it can be greatly improved through training.

PR:
said Reiss. “Our study finds high-resolution audio has a small but important advantage in its quality of reproduction over standard audio content."

There is absolutely no evidence in the meta-analysis that there is an "advantage in its quality". None. Zero. "Is affected" does not equate to "advantage in its quality". Mr. Reiss is lying in his PR statement. Common sense states that his cherry picking of studies to include in the meta-analysis also suffers the same bias.