"The Beginner’s Guide to Hi-Res Audio"

  • 7 December 2021

The 13.4.1 S2 update added hi-res (Ultra HD) and Dolby Atmos audio support from Amazon Music Unlimited. With this update, Sonos released this great article about hi-res audio and how you can listen to it on Sonos. It’s a very detailed and well-written article:

https://blog.sonos.com/en-us/hi-res-audio-guide



This is why digital audio with increased sampling rates of 96kHz or even 192kHz would indeed provide a very noticeable benefit as it allows for more precise positioning and depth of the sound sources.

On that basis the difference between Red Book and Hi Res in any blind test should be like ‘night and day’, and yet: https://www.realhd-audio.com/?p=6993

The author nails it in this paragraph: “As I’ve often stated in these articles, it is the production path that establishes the fidelity of the final master. Things like how a track was recorded, what processing was applied during recording and mixing, and how the tracks were ultimately mastered. If all of these things are done with maximizing fidelity as the primary goal, a great track will result.”

Again, 16bit/44.1kHz is completely sufficient in terms of fidelity, as it keeps the quantization noise low enough (about -96dB relative to full scale) and reproduces all the audible frequencies (up to 22.05kHz).
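For anyone who wants to check the two numbers quoted above, here is a minimal sketch. It assumes an ideal N-bit quantizer, whose dynamic range is roughly 20*log10(2^N) dB, and the usual Nyquist limit of half the sample rate. The function names are made up for illustration.

```python
import math

def quantization_range_db(bits):
    """Approximate dynamic range of an ideal quantizer with the given bit depth."""
    return 20 * math.log10(2 ** bits)

def nyquist_hz(sample_rate_hz):
    """Highest frequency that can be represented at the given sample rate."""
    return sample_rate_hz / 2

print(round(quantization_range_db(16), 1))  # 96.3
print(round(quantization_range_db(24), 1))  # 144.5
print(nyquist_hz(44100))                    # 22050.0
```

So the -96dB figure is simply what 16 bits gives you, and 22.05kHz is half of 44.1kHz.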

I should have said that most current music productions do not take full advantage of the higher spatial resolution you get when using a 96kHz sampling frequency, and it’s debatable whether a rock/pop production would ever exploit it. With classical music, when done properly, you can definitely hear it.

You should read the methodology in detail. The test samples auditioned by several hundred individuals compared the full resolution originals with RedBook equivalents. The conclusion speaks for itself: “Hi-Res Audio or HD-Audio provides no perceptible fidelity improvement over a standard-resolution CD or file.”

The author of that study originally set up a recording label specifically for the production of high-resolution audio, from recording to delivery. He of all people would surely have wished that Hi Res was detectable. According to his findings it wasn’t.

He really made the point as to why a 192kHz sampling frequency is required if you want to accurately reproduce the gunfire of a laser blaster flying across a cinema theatre.

But is it necessary to render an improvement, audible in a controlled blind listening test, to a 2 channel music recording? 

 

Cite what??

 

Actual evidence for your claim.  

 

If you are referring to my last statement about the audible difference between currently available HD and UltraHD tracks, that is more a conclusion, based on those fundamentals of digital signal processing which I tried to briefly summarize before, rather than a claim.

How about the claim that “digital audio with increased sampling rates of 96kHz or even 192kHz would indeed provide a very noticeable benefit as it allows for more precise positioning and depth of the sound sources.”

Your "fundamentals of digital signal processing" are gibberish.  You made some claims that are not backed up by any mathematical or practical evidence.  All evidence points to there being no benefit of either 24 bits or sample rates over 48 kHz in the playback of digital audio.

You’ve clearly done enough research on the topic to know that what you’ve stated is far from accepted fact. You had to know that if you post a theory like this on a forum with others knowledgeable on the subject, it’s not going to be blindly accepted.

Shame on me :)

Your "fundamentals of digital signal processing" are gibberish.  You made some claims that are not backed up by any mathematical or practical evidence.  All evidence points to there being no benefit of either 24 bits or sample rates over 48 kHz in the playback of digital audio.

I basically agree with the statement above. And while it is fairly obvious (at least to me) that there is no audible benefit in increasing the bit depth from 16 to 24, I don’t think there is a similar conclusion on the increase in sampling rate, yet. For 2D stereo recordings and stationary sound sources there are some studies out there suggesting that there is no audible benefit from increasing the sampling rate above 48kHz. This is where we might just stop the discussion with a simple 16bit/44.1kHz is good enough, period.

OTOH the psychoacoustic properties around the detection of minimum audible angle and depth of a sound source from low to high frequencies are still very much under investiagtion and not entirely understood. All I was stating is that when the CD was developed the focus was primarily on accurate reconstruction of the amplitude spectrum, while the phase spectrum was of lower importance.

Cite for the following definitive statement in your original post, please:

While it's true that the human ear (and brain) cannot hear the amplitude of frequencies above, say, 18kHz our two ears can extremely well detect phase differences between frequencies that are much higher! So while we cannot hear those frequencies as tones, we can detect the tiny differences in runtime which it takes those inaudible frequencies to arrive at the left and right ear respectively. In other words, our spatial location capabilities are of much higher resolution than our frequency hearing capabilities. Btw, this effect is heavily used by 3D sound systems like Dolby Atmos or THX.

 

Because in your above quoted post, you seem to say the claim is “still very much under investiagtion (sic) and not entirely understood.”  So which is it, a definitive fact or something to be investigated?

BTW, THX is a quality standard, not a codec like Atmos or DTS.

Ahh, I see, good catch!

I think most audio and DSP engineers would agree with my statement that 44.1kHz is not enough to fully capture the audibly relevant phase properties of a complex music source, such as an orchestra in a concert hall. There are complex dynamic patterns of phase variations, and thus interaural phase differences, created by musicians moving their instruments as they perform, which are lost in a signal sampled at 44.1kHz. This I would call an accepted fact.

In a first but not necessarily sufficient step one could increase the sampling rate to capture more of this information. However, psychoacoustic experiments indicate, for example, that the perception of phase is non-linear across the spectrum. Also these effects seem to be time-variant and dependent on the source. So increasing the resolution of the phase evenly across the spectrum by increasing the sampling frequency alone might not be sufficient to deliver the desired result. This is something to be investigated.

Please cite where this is proven as “accepted fact”.  Just stating it does not make it so.  Quite frankly, I’ve been around this stuff for a long time and I’ve never heard an inkling about these supposed phase differences being lost at 44.1 kHz.  

 

So your definitive statements in your first post were mere bluster, thinking we’d accept your post at face value?  Sorry, this ain’t the forum for that type of BS.

Ahh, I see, good catch!

 

This is the way it was always going to turn out.  The conversation continues until you get caught in a trap.  There was never going to be another outcome.

What trap is that?  The poster contradicted their own earlier posts and I asked for an explanation.  Any trap the poster fell into was of their own making.

A small digression out of curiosity: why was an “odd” number, 44100, selected in the first place? If 40000 needed a margin of safety, why not 48000? Or even 44000?

Several reasons.  It was the Sony standard for PCM; it's the product of the squares of the first four primes (2*2*3*3*5*5*7*7), which makes calculations easier; and one of the things the CD consortium insisted on was that Beethoven's 9th Symphony would fit on one disc (but this was more related to the debate on the size of the disc).
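A quick sketch to check the factorization, for the curious. The `prime_factors` helper is just an illustrative name. The video-tape arithmetic at the end is the commonly cited origin of the figure (early PCM adaptors storing 3 samples per usable video line); it is not something stated in this thread, so treat it as background rather than gospel.

```python
from fractions import Fraction

def prime_factors(n):
    """Return the prime factors of n in ascending order, by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

print(prime_factors(44100))      # [2, 2, 3, 3, 5, 5, 7, 7]
print(prime_factors(48000))      # [2, 2, 2, 2, 2, 2, 2, 3, 5, 5, 5]

# Ratio between the two common rates, relevant when resampling between them.
print(Fraction(44100, 48000))    # 147/160

# Commonly cited video-based arithmetic (an assumption, see above):
# 3 samples/line * 245 lines * 60 fields (NTSC) or 294 lines * 50 fields (PAL).
print(3 * 245 * 60, 3 * 294 * 50)  # 44100 44100
```

Having lots of small factors means the rate divides evenly by many integers, which is handy for this kind of bookkeeping.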

 

 it's a product of prime numbers (2*2*3*3*5*5*7*7), which makes calculations easier

Who would have thought that?! Interesting.

Ahh, I see, good catch!

 

This is the way it was always going to turn out.  The conversation continues until you get caught in a trap.  There was never going to be another outcome.

Yeah, internet forums seem to attract those flat-earthers who use negative results to prove the non-existence of any given phenomenon.  I don’t really bother, especially if the most compelling counter-argument they present is that they have never heard about it, garnished with a hint at the many years of experience they have under their belt.

But for argument’s sake, you may want to read J. Robert Stuart’s paper “Coding for High-Resolution Audio Systems”, published in 2004 in the Journal of the Audio Engineering Society. Everything I have stated about sampling rate and high frequency content you can more or less find in Chapter 5 of this paper, and in particular section 5.1, “Psychoacoustic Data to Support Higher Sampling Rates”: “...It has been suggested that perhaps higher sampling rates are preferred because, somehow, the human hearing system will resolve small time differences which might imply a wider bandwidth in a linear system. In considering this it is important to distinguish between perceiving separate events which are very close together in time (implying wide bandwidth and fine monaural temporal resolution) and those events which help build the auditory scene, for which the relative arrival times are either binaural or well separated. In the first case, wider bandwidth is required to discriminate acoustic events that are closer together in time. This seems to be an alternative statement of the problem to determine the maximum bandwidth necessary for audible transparency...Events in time can be discriminated to within very fine limits, and with a resolution very substantially smaller than the sampling period. This point is crucial because provided we treat all channels identically to ensure no skew of directional information, there is no direct relationship between the attainable temporal resolution and the sampling interval.”

So independent of whether you follow the author’s hypotheses and findings or not, it is a well-known paper from an AES Fellow, so you cannot really say that you never heard about this stuff.

Following this paper, in 2007 there was an article by Meyer and Moran, also published in the AES Journal, with the objective of finding out whether there are any audible gains from hi-res audio playback by doing some extensive, formalized testing. Their conclusion was that, based on their test methodology, they could not find any significant preference for hi-res audio over the CD standard, even when using high-end headphones or speaker systems. However, they very correctly noted that “it is very difficult to use negative results to prove the inaudibility of any given phenomenon or process”. The most intriguing part of this article, however, was their final note on high-resolution recordings: “Though our tests failed to substantiate the claimed advantages of high-resolution encoding for two-channel audio, one trend became obvious...throughout our testing: virtually all of the SACD and DVD-A recordings sounded better than most CD’s - sometimes much better...Partly because[...]engineers and producers are given the freedom to produce recordings that sound as good as they can make them, without having to compress or equalize the signal to suit lesser systems.”

And here we go. I truly believe there are advances to be made in capturing and reproducing more accurately what our ears actually perceive in a concert hall. I am a big fan of innovation. And the proliferation of Hi-Res audio formats as we are witnessing right now is certainly one way to inspire more innovation to come forward in this field. Even if we are not (yet) experiencing it in the UltraHD tracks we get to listen to today.

 

Their conclusion was that, based on their test methodology, they could not find any significant preference for hi-res audio over the CD standard, even when using high-end headphones or speaker systems. However, they very correctly noted that “it is very difficult to use negative results to prove the inaudibility of any given phenomenon or process”. The most intriguing part of this article, however, was their final note on high-resolution recordings: “Though our tests failed to substantiate the claimed advantages of high-resolution encoding for two-channel audio, one trend became obvious...throughout our testing: virtually all of the SACD and DVD-A recordings sounded better than most CD’s - sometimes much better...Partly because[...]engineers and producers are given the freedom to produce recordings that sound as good as they can make them, without having to compress or equalize the signal to suit lesser systems.”

 

 

How does one resolve the internal contradiction in the quoted excerpt, except by attributing the “sounded better” thing to better mastering, which is said in the quote to be partly the reason for the better sound? No one here disputes that better mastering can deliver audibly better sound. But if that is partly the reason, what is the rest of it? That is left hanging in the air...and where mastering is the same and sound levels are accurately matched, these differences have not survived in any published blind test, where the better master is downsampled to CD format and compared against the original version.

We can of course wait for the black swan, but I for one am not holding my breath.

To repeat also: all this applies only to 2 channel audio, where all the information needed to the extent audible, is captured by the bit rate and sampling frequency of the CD format.

To take this even further, very few can reliably hear the difference between even the CD format and lossy 320k, or the 256k that Apple uses for its lossy files. The head of Apple Music is on record as saying that he/his team cannot pick between Apple lossless and Apple lossy for 2 channel audio, except perhaps on very high quality headphones. In that case, can Hi Res offer more?

Apple Spatial Audio and Dolby Atmos are an audibly different species, no doubt, and the information content needed to deliver them is a different matter.

This has been done to death all over the internet. However I’d make a few observations:

  • Bob Stuart is an originator of MQA, a ‘hi res’ format which has divided the industry. In part this is due to its lossy nature, in part because it’s a form of DRM extracting licence fees along the chain. 
  • The Meyer and Moran study has come in for criticism. An argument was that some of their ‘hi res’ content may not have actually had guaranteed high resolution provenance. 
  • The only large scale study, to my knowledge, is the one by Mark Waldrep referenced earlier. He too found fault with Meyer and Moran and, as an expert in high resolution recordings, took great care over the preparation of his test materials. By the sound of things @edchristoph has not delved into the full detail of this test which, as noted, concluded that ‘hi res’ added no perceptible fidelity improvement over Red Book.

The idea that there could be ‘something out there’ (unknown unknowns?) which Red Book fails to capture is a formula that’s been used down the ages by less than scrupulous salesmen to convince people to part with their money. Personally I’m not buying it. 

I think most audio and DSP engineers would agree with my statement that 44.1kHz is not enough to fully capture the audibly relevant phase properties of a complex music source, such as an orchestra in a concert hall. There are complex dynamic patterns of phase variations, and thus interaural phase differences, created by musicians moving their instruments as they perform, which are lost in a signal sampled at 44.1kHz. This I would call an accepted fact.

All of which just sums up to a scalar amplitude measured at each microphone, varying over time. This signal is fully and perfectly captured for all human audible frequencies (and beyond) by sampling at 44.1kHz.
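To make the timing side of this concrete, here is a minimal sketch (my own toy setup, not taken from any of the papers mentioned in this thread). It samples the same 1 kHz tone on two "channels" at 44.1kHz, with one channel delayed by 5 microseconds, which is well under one sample period of about 22.7 microseconds. It then recovers the delay from the phase difference of the sampled data. The tone frequency, block length and delay are all assumed values chosen so the arithmetic is exact.

```python
import cmath
import math

FS = 44_100        # CD sample rate in Hz
F = 1_000          # test tone in Hz (assumed; well inside the audio band)
N = 441            # 441 samples = exactly 10 periods of 1 kHz at 44.1 kHz
TRUE_DELAY = 5e-6  # 5 microseconds between the two channels (assumed)

def sample_tone(delay_s):
    """Sample a sine tone that arrives delay_s seconds late."""
    return [math.sin(2 * math.pi * F * (n / FS - delay_s)) for n in range(N)]

def phase_at_tone(samples):
    """Phase of the samples at frequency F (a single-bin DFT)."""
    acc = sum(x * cmath.exp(-2j * math.pi * F * n / FS)
              for n, x in enumerate(samples))
    return cmath.phase(acc)

left = sample_tone(0.0)
right = sample_tone(TRUE_DELAY)

# A later arrival shows up as a phase lag; convert that lag back to seconds.
estimated_delay = (phase_at_tone(left) - phase_at_tone(right)) / (2 * math.pi * F)

print(f"sample period:   {1e6 / FS:.3f} us")             # 22.676 us
print(f"estimated delay: {estimated_delay * 1e6:.3f} us")  # 5.000 us
```

The delay comes back intact even though it is a fraction of the sample period, so for band-limited signals the sample interval does not set the limit on inter-channel timing. It says nothing about what is audible, only about what the sampled data contains.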

I am not saying that current hi-res recordings are better sounding than CD-quality!

I am just saying that current 2D recordings in CD-quality are not adequately capturing/reproducing the binaural experience in a concert hall. You may come to the conclusion that for 2D audio formats, CD quality is as good as it gets and no further improvements are possible from there, fine. But this is also just an opinion based on the negative results of studies performing formal listening tests. As stated above, negative results deliver no proof for the non-existence of some phenomenon. They just prove that in this case CD-quality could adequately capture what’s in the corresponding hi-res recording under test. 

I for one believe that even for 2D audio there are advances to be made which are related to the reproduction of the phase spectrum as “the human hearing system will resolve small time differences which might imply a wider bandwidth in a linear system”.

It’s perfectly fine if you have a different opinion.

Please cite an academic reference supporting this thesis, and not one from any individual with a commercial interest.

 

I am just saying that current 2D recordings in CD-quality are not adequately capturing/reproducing the binaural experience in a concert hall.

Those of us who have been to live gigs in even small venues know that home audio today is a very limited version of that experience, and not just for reasons of the sound of the music. But the Hi Res 2 channel audio that is presently being marketed as such does not change that situation at all; it comes no closer to the real thing than CD does.

Will that change in the future? I don’t know. What is visible of course are the changes being brought by Atmos/Spatial audio and similar, but that isn’t 2 channel audio as is commonly understood.  

Who would have thought that?! Interesting.

 

Software Engineers think of this kind of stuff all the time.  It’s what they do.

Example:  A classic technique for sorting data spread across several files (the polyphase merge sort) divides the sorted runs among the files following the Fibonacci Sequence.  There’s also a search technique based on the Fib.  The search is proven to have O(log n) algorithmic complexity, which is as good as you can get for finding an item in sorted data by comparisons; the sort, like any comparison sort, is O(n log n).
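For the curious, here is a minimal sketch of the Fibonacci-based search on a sorted list. It is the textbook version as I understand it, with names of my own choosing, not code from any particular library. It narrows the range using Fibonacci numbers instead of halving, so it only needs addition and subtraction to pick the next probe.

```python
def fibonacci_search(arr, target):
    """Return the index of target in the sorted list arr, or -1 if absent."""
    n = len(arr)
    fib2, fib1 = 0, 1          # F(k-2) and F(k-1)
    fib = fib2 + fib1          # F(k), grown until it is at least n
    while fib < n:
        fib2, fib1 = fib1, fib
        fib = fib2 + fib1

    offset = -1                # everything at or before offset is ruled out
    while fib > 1:
        i = min(offset + fib2, n - 1)
        if arr[i] < target:
            # Discard the front part; step the Fibonacci numbers down by one.
            fib, fib1, fib2 = fib1, fib2, fib1 - fib2
            offset = i
        elif arr[i] > target:
            # Discard the back part; step the Fibonacci numbers down by two.
            fib, fib1, fib2 = fib2, fib1 - fib2, fib2 - (fib1 - fib2)
        else:
            return i

    # At most one candidate can remain unchecked.
    if fib1 and offset + 1 < n and arr[offset + 1] == target:
        return offset + 1
    return -1

data = [10, 22, 35, 40, 45, 50, 80, 82, 85, 90, 100]
print(fibonacci_search(data, 85))  # 8
print(fibonacci_search(data, 11))  # -1
```

The number of probes grows with the logarithm of the list length, the same order as binary search.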

That AES Fellow has a direct financial relationship to the promotion of high resolution audio formats.  You might as well consult Dr. Daffy on whether his Magic Elixir cures all ills. 

 

Are you actually trying to use Meyer and Moran’s results to suggest SACD and DVD-A recordings are superior to CD, when their results showed all differences were due to mastering and not higher resolution formats?  Pretty freaking bold move!  But as before, your bluster doesn’t work here.