"The Beginner’s Guide to Hi-Res Audio"

  • 7 December 2021
  • 92 replies
  • 2802 views

The 13.4.1 S2 update added hi-res (Ultra HD) and Dolby Atmos audio support from Amazon Music Unlimited. With this update, Sonos released this great article about hi-res audio and how you can listen to it on Sonos. It’s a very detailed and well-written article:

https://blog.sonos.com/en-us/hi-res-audio-guide


I am not saying that current hi-res recordings are better sounding than CD-quality!

I am just saying that current 2D recordings in CD-quality are not adequately capturing/reproducing the binaural experience in a concert hall. You may come to the conclusion that for 2D audio formats, CD quality is as good as it gets and no further improvements are possible from there, fine. But this is also just an opinion based on the negative results of studies performing formal listening tests. As stated above, negative results deliver no proof for the non-existence of some phenomenon. They just prove that in this case CD-quality could adequately capture what’s in the corresponding hi-res recording under test.

I for one believe that even for 2D audio there are advances to be made which are related to the reproduction of the phase spectrum as “the human hearing system will resolve small time differences which might imply a wider bandwidth in a linear system”.

It’s perfectly fine if you have a different opinion.

 

And until your opinion is backed up with scientific experimental proof, I’ll continue to laugh at it, no matter how much BS cut-and-paste word salad you spew.

It sounds more like you’re saying that a live acoustic performance is not accurately reproduced by a 2 channel recording at CD quality. I would agree with that. However, that does not mean that hi-res audio is the solution, particularly when it has been tested and failed. The factor that seems to be forgotten is that room acoustics, reflections, absorption, direction of audio, and likely visual cues come into play to affect what we hear. While you could reproduce some of that with psychoacoustic effects, timing modifications, etc., you still can’t quite reproduce it with 2 channels. Even then your brain is still aware that what it’s hearing is a recording rather than a live performance, and that surely factors in to some extent.

So it seems logical to me that instead of pushing higher resolution, it would make more sense to work on improving room acoustics, additional audio channels, speakers that operate closer to how instruments actually produce sound, etc. That’s not even realistic though, since rooms are built for purposes beyond mimicking the acoustics of a concert hall. Sure, there is room for growth in audio reproduction, but saying that it needs to come from higher resolution rather than from these other issues doesn’t make a ton of sense.

And your statement “Yeah, internet forums seem to attract those flat-earthers who use negative results to prove the non-existence of any given phenomenon.” is just silly in this context. You can absolutely prove that A doesn’t cause B when you repeatedly demonstrate that A doesn’t cause B. Negative results fail to prove anything only when we do not have the testing capability to examine an entire sample. For example, just because we have not seen life on other planets does not prove that there is no life on other planets, because we cannot test the entire population of planets. Flat earth is very different, as we have proven that the earth is round, and flat-earthers believe that those who did the tests are lying about their results.


 

Software Engineers think of this kind of stuff all the time.  It’s what they do.

 

Indeed. On the other hand sales folks concentrate on not leaving any money on the table. They are good and have no shame… 

So you get: 1.5-meter Ethernet cable for $499.

https://www.networkworld.com/article/2281260/denon-s-outrageous-price-for-ethernet-cable.html

“The manufacturer is Denon, and the target customer is the "audio enthusiast." Apparently "audio enthusiast" is Denonese for "sucker."”

https://web.archive.org/web/20100410235208/http://www.cs.ucc.ie/~ianp/CS2511/HAP.html

“...Localization accuracy is 1 degree for sources in front of the listener and 15 degrees for sources to the sides. Humans can discern interaural time differences of 10 microseconds or less.”

https://www.ece.ucdavis.edu/cipic/spatial-sound/tutorial/psychoacoustics-of-spatial-hearing/#azimuth

“...under optimum conditions, much greater accuracy (on the order of 1°) is possible...This is rather remarkable, since it means that a change in arrival time of as little as 10 microseconds is perceptible. (For comparison, the sampling rate for audio CD’s is 44.1 kHz, which corresponds to a sampling interval of 22.7 microseconds. Thus, in some circumstances, less than a one-sample delay is perceptible.)”

 

The accuracy with which this can be done depends on the circumstances. For speech in normally reverberant rooms, typical human accuracies are on the order of 10° to 20°. However, under optimum conditions, much greater accuracy (on the order of 1°) is possible if the problem is to decide merely whether or not a sound source moves. This is rather remarkable, since it means that a change in arrival time of as little as 10 microseconds is perceptible. (For comparison, the sampling rate for audio CD’s is 44.1 kHz, which corresponds to a sampling interval of 22.7 microseconds. Thus, in some circumstances, less than a one-sample delay is perceptible.)

What you removed from your quote is the part about typical accuracies of 10° to 20° in normally reverberant rooms, and the qualifier “if the problem is to decide merely whether or not a sound source moves”. The context matters quite a bit. Obviously, the vast majority of homes, where Sonos speakers live, do not offer optimum conditions. As well, the greater accuracy applies to determining whether the sound source moves, and it is very debatable whether that is useful information when listening to music. And it’s not like that motion can’t be simulated at greater time intervals, as that certainly can occur with 2 channel audio even in SD. So this could only matter in a highly controlled environment, maybe headphones, and where the audio source is trying to give the impression of movement.

And of course, your quote is from a footnote, and timing is not the only factor the ears/brain use for determining the location of a sound source, as the article states. ILD (interaural level difference) seems rather important to me. If you artificially modify timing in order to create a spatial illusion, how do you account for the difference in volume and frequency shift that each ear hears (especially without headphones)?
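
As a rough check on how the 1° and 10 microsecond figures in the quoted tutorial relate to each other, here is a minimal sketch using Woodworth's spherical-head approximation for the interaural time difference, ITD = (a/c)(θ + sin θ). The head radius of 8.75 cm and the speed of sound of 343 m/s are assumed typical values, not numbers taken from the tutorial, and the function name is made up for illustration.

```python
import math

HEAD_RADIUS_M = 0.0875       # assumed average head radius (not from the quoted sources)
SPEED_OF_SOUND_M_S = 343.0   # assumed speed of sound in air at about 20 °C


def itd_seconds(azimuth_deg):
    """Woodworth spherical-head estimate of the interaural time difference."""
    theta = math.radians(azimuth_deg)
    return (HEAD_RADIUS_M / SPEED_OF_SOUND_M_S) * (theta + math.sin(theta))


# Print the estimated ITD for a few source angles off centre.
for angle in (1, 15, 90):
    microseconds = itd_seconds(angle) * 1e6
    print(f"{angle:>2} degrees off centre: about {microseconds:.1f} microseconds")
```

Under these assumptions a 1° shift works out to roughly 9 microseconds, which is in line with the "10 microseconds" in the quote. The model only covers timing; it ignores the level differences (ILD) mentioned above.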

 

 

You (and that paper) have a fundamental misunderstanding of what sample rate means and how it applies to digital sampling. A change in arrival time of 10 microseconds due to positioning has no relationship to digital sample rate. There is no “gap” in the data in which a phase shift can be missed just because 10 microseconds is shorter than the 22.7 microsecond sampling interval.

How, you say? Well, as shown by Nyquist-Shannon, a band-limited signal that is sampled and then converted back to analog doesn’t have gaps or stair steps or any of the other silly representations. It is actually EXACTLY the same as the original band-limited analog signal as captured by the recording device.

Let’s say this again: for everything below half the sample rate, there is no data loss, none. Therefore, at 44.1 kHz, all audible sound is reproduced exactly as it was in analog form, and the ear hears all of it, including the phase shift. All that increasing the sample rate would do is extend the range of frequencies that are reproduced, and the ear doesn’t hear ANYTHING over 20 kHz, phase shifted or not.

 

So read my Google Fu:

Strictly speaking, the theorem only applies to a class of mathematical functions having a Fourier transform that is zero outside of a finite region of frequencies. Intuitively we expect that when one reduces a continuous function to a discrete sequence and interpolates back to a continuous function, the fidelity of the result depends on the density (or sample rate) of the original samples. The sampling theorem introduces the concept of a sample rate that is sufficient for perfect fidelity for the class of functions that are band-limited to a given bandwidth, such that no actual information is lost in the sampling process. It expresses the sufficient sample rate in terms of the bandwidth for the class of functions. The theorem also leads to a formula for perfectly reconstructing the original continuous-time function from the samples.

 

https://en.wikipedia.org/wiki/Nyquist%E2%80%93Shannon_sampling_theorem

 

You fell for the old audiophile nonsense that there are “gaps” in the data because each sample is a “slice in time”. It’s just not true. Sampling isn’t slicing the data, it is transforming the data, ALL of the data, into a format that can be stored digitally. The digital to analog step then transforms the data, ALL of the data, back to its original analog form. Nothing is lost, phase shifted or not.
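
The sub-sample timing point can be checked numerically. The sketch below is only an illustration under stated assumptions: a 1 kHz test tone (picked because 441 samples at 44.1 kHz hold exactly 10 periods), ideal sampling with no quantisation, and helper names invented for this example. It samples the same tone twice, once delayed by 10 microseconds, and then recovers the delay from the phase of the samples alone.

```python
import cmath
import math

FS = 44_100      # CD sample rate in Hz (sampling interval of about 22.7 microseconds)
F0 = 1_000       # assumed test tone in Hz, far below FS / 2
N = 441          # 10 ms of samples, exactly 10 periods of F0
DELAY = 10e-6    # 10 microseconds, shorter than one sampling interval


def sample_tone(delay_s):
    """Ideal samples of a sine wave that arrives delay_s seconds late."""
    return [math.sin(2 * math.pi * F0 * (n / FS - delay_s)) for n in range(N)]


def phase_at_f0(samples):
    """Phase of the F0 component, from a single-frequency DFT of the samples."""
    acc = sum(x * cmath.exp(-2j * math.pi * F0 * n / FS)
              for n, x in enumerate(samples))
    return cmath.phase(acc)


def estimate_delay(reference, delayed):
    """Time shift between two sampled tones, derived from their phase difference."""
    dphi = phase_at_f0(reference) - phase_at_f0(delayed)
    dphi = (dphi + math.pi) % (2 * math.pi) - math.pi   # wrap into [-pi, pi)
    return dphi / (2 * math.pi * F0)


reference = sample_tone(0.0)
delayed = sample_tone(DELAY)
recovered = estimate_delay(reference, delayed)
print(f"sampling interval: {1e6 / FS:.1f} microseconds")
print(f"recovered delay:   {recovered * 1e6:.3f} microseconds")
```

The recovered value should match the 10 microsecond delay to within floating-point error, even though the delay is less than half of one sampling interval. Real converters add quantisation and filter imperfections, which this sketch leaves out.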

It seems we’re now in the realm of Dirac impulses which, to my mind, don’t have much to do with music. :rolling_eyes:

Indeed the first reference says “Localisation performance is better for non-musical sounds (e.g., clicks, percussive noises, etc.) than for musical tones”. Presumably this accounts for the 1 degree figure.

Why is it that none of this elaborate theorising is needed to identify/justify HD video streams and to pick them over DVD quality on any HD quality screen played on a HD capable player of any price point, by eyes that are not capable of 20/20 vision? Probably because in the case of audio, there is a frantic effort to justify something that doesn't exist in a practical sense for any domestic use case? 

Does anyone here remember wapping high?:grin:

It's like night and day!  

You need trained ears and a $50,000 system!

 

I can guarantee that in my home, even late at night, no human, however trained, will be able to pick the difference in a blind listening test even on a USD 100,000 system. Because that still has to deliver sound after interacting with my room - with its acoustics and ambient sound levels - which is a typical domestic one.

Whereas the HD video I can pick on a cheap HD capable TV, even when I have left my glasses out of reach.

HD audio is just digital snake oil, consumed by the credulous, who need the HD mark to be visible on their player/app to even know that they are listening to HD audio. 


The Hi Res audio push seems inevitable regardless of whether it makes any real difference or not. It has become just another spec tick box you have to have if you want to sell audio products in 2022. My guess is the increased cost of the bandwidth is so insignificant to streaming music services that they are like “why not?” If it hooks a few more customers it will be worth it.

Most listeners won’t know or care. But there is a certain population out there who will be lured in by the promises. Just like there used to be a certain population of people who would argue bitterly about megapixels in cameras long after we passed the limits of what really mattered. Some people just love to bicker about numbers I guess.

It does make me chuckle when I see people fretting about whether they are hearing Hi Res audio on their Sonos Roam or other tiny speakers. Hell, even on a Port connected to a really nice Denon/Aperion Audio setup I can’t hear a difference, so what chance do they have on what’s essentially a 200 dollar bluetooth speaker?


@Kumar 

I could not agree more. Arguing about bigger numbers is meaningless if people can’t see or hear the difference. Every audio and video format eventually reaches a point of diminishing returns. And at some point they surpass the limits of what our eyes and ears can actually perceive, making these supposed quality increases strictly academic.

Audio formats reached that point a long time ago. It’s kind of crazy that a 40 year old standard like Red Book still represents the pinnacle of audio reproduction, but it really does. Video formats on the other hand still have room to grow. Hi Res audio played back on my $3K sound system does not impress me at all. But 4K HDR video played back on my 65” OLED blows me away.

Interesting point. I really don’t have a clue where the line is between video resolution and what your eyes can see. Perhaps part of the reason is that resolution isn’t the only advancement involved with video: there is also the size of the TV, how black the blacks are, frame rate, etc. As well, in many cases, customers can actually see the difference between TVs in a store much better than they can hear the difference between speakers.

I think a lot of the cost for higher bandwidth is further down the line with your service provider and local WiFi network.  This is why I would prefer to block resolutions beyond what I can hear.

 


Eh, I think most people always want more, and don’t really think about the limits of what they can possibly hear or use.   I wouldn’t say that I’m immune to that psychology either. 

 

 

4K HDR video played back on my 65” OLED blows me away.  

OLED and HDR are very fine. However with average vision the 4K part is only likely to be relevant if you’re sitting 9 feet or less from the screen. https://www.rtings.com/tv/reviews/by-size/size-to-distance-relationship

In many typical living room situations a viewer quite probably wouldn’t be able to distinguish 4K from FHD. 
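
For a rough idea of where a distance figure like that comes from, here is a back-of-the-envelope sketch. It assumes that 20/20 vision resolves about one arcminute and that the screen is a flat 16:9 panel; both are simplifying assumptions, and the function name is invented for this example rather than taken from the linked article.

```python
import math

ARCMINUTE_RAD = math.radians(1 / 60)   # assumed resolving limit of 20/20 vision


def max_useful_distance_ft(diagonal_in, horizontal_pixels, aspect=(16, 9)):
    """Distance (in feet) beyond which a single pixel spans less than one arcminute."""
    w, h = aspect
    width_in = diagonal_in * w / math.hypot(w, h)    # screen width from its diagonal
    pixel_pitch_in = width_in / horizontal_pixels    # width of one pixel in inches
    return pixel_pitch_in / math.tan(ARCMINUTE_RAD) / 12


# Compare Full HD and 4K on an assumed 65 inch screen.
for label, pixels in (("FHD (1920 wide)", 1920), ("4K (3840 wide)", 3840)):
    feet = max_useful_distance_ft(65, pixels)
    print(f"65 inch {label}: pixels blur together beyond about {feet:.1f} ft")
```

For a 65 inch screen this puts the point where FHD pixels stop being resolvable at about 8.5 feet, so from further away the extra 4K pixels should not be visible as added detail. That is in the same ballpark as the 9 feet mentioned above.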


The higher the audio bit rate, the more data you have to move over your network. For folks who already have network problems with non-HD audio, it will likely make things worse.
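
To put rough numbers on that, the sketch below computes the raw stereo PCM bit rate as sample rate × bit depth × channels. The three formats are just illustrative picks, and the function names are made up for this example. Streaming services normally send losslessly compressed audio, so real figures will be lower, but the ratio between formats should be broadly similar.

```python
def pcm_bitrate_kbps(sample_rate_hz, bit_depth, channels=2):
    """Uncompressed PCM bit rate in kilobits per second."""
    return sample_rate_hz * bit_depth * channels / 1000


def megabytes_per_hour(kbps):
    """Data moved in one hour at the given bit rate, in megabytes (10^6 bytes)."""
    return kbps * 1000 * 3600 / 8 / 1_000_000


# Illustrative formats only; not a statement of what any service or speaker uses.
FORMATS = {
    "16-bit / 44.1 kHz (CD quality)": (44_100, 16),
    "24-bit / 48 kHz": (48_000, 24),
    "24-bit / 192 kHz": (192_000, 24),
}

for name, (rate, depth) in FORMATS.items():
    kbps = pcm_bitrate_kbps(rate, depth)
    print(f"{name}: {kbps:.1f} kbps, about {megabytes_per_hour(kbps):.0f} MB per hour")
```

Before compression, 24-bit / 192 kHz stereo needs about 6.5 times the data of CD quality, which is the extra load a marginal WiFi network would have to carry.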

Agreed. And sitting close enough to pick the 4K on a large screen means turning to the left/right to catch all the action, which can get tiresome.

But the few times I have watched 4K streams on a 2011 make 50 inch plasma HDTV, the picture clarity is a marked improvement even on that TV as compared to HD streams. Perhaps the better mastering analogy applies to video as well. Of course, how much better than that they would be on a 4K capable screen, I don’t know. 

The difference isn’t as marked as it is in the case of DVD to HD, so this suggests that diminishing returns for video are also now in play, and I would be surprised to see any need for 8K and higher.



Audio formats reached that point a long time ago. It’s kind of crazy that a 40 year old standard like Red Book still represents the pinnacle of audio reproduction, but it really does.

Which is why you see all the frantic marketing of the latest audio kit, year after year, by the odious to the credulous - the HiFi specialist media being just as culpable. And their pandering to the all too common human condition of not being satisfied with what is available and at hand in the home.