Can digital audio playback be improved by resolutions greater than 16/44.1?




When someone tells me that I am not able to pick out the differences between two bits of kit because I don't have the ears trained to do so, and that my requiring some kind of test to rule out bias is unreasonable/unnecessary to those with trained ears, I draw my own conclusions about how easily swayed that person is.

And training seemed to be such a strong variable that it appeared to override everything else. Essentially (and I'm being loose with the language here), every study where participants were carefully trained in the task showed a strong ability to discriminate, every study where participants were untrained did not.


The issue of training is very interesting. Years ago a friend and colleague proudly showed me the stereo he bought as an undergraduate at Harvard in the late 1960s. Despite their perished surrounds and yellowing paper cones, he loved the sound. Having just moved overseas, I bought a new stereo, including some excellent Canadian Mirage speakers. I took them around to his place to compare the sound, and he still preferred his speakers! I left the speakers with him for a week, and when I picked them up he had new B&W speakers. It took a while for his ears to make the decision.

So the question is, to train or not to train and stay in ignorance? Which brings greater happiness? I don't have a clear answer, but I tend to prefer knowledge over ignorance. And I know my friend now enjoys orchestral music more because he can hear and discriminate all the instruments more clearly.
Yes, Hi Rez is much ado about nothing in the lives of the vast majority of us music lovers. Primarily of interest only to those trying to make a buck off of it, which includes a lot of shady companies and the magazines which support their claims without question. Objective journalism has no place in "high end" audio reporting these days, unfortunately.
I simply have to say thank you to @joshr for joining in on this conversation! It is fantastic to have someone with your familiarity with this topic and depth of research connect with such a community!

As I'm sure you have noticed, many of the most active participants in this forum have built fairly strong views on this topic (and other topics :P) and as a result, may be perceived as being somewhat .... aggressive... in their responses. 🙂 In my experience everyone on here does mean well and has a deep, honest interest in the discussion, and you have added significantly to that conversation.

Incidentally, having read much (with a lot yet to read), and while I appreciate what people *may* be able to do in a testing environment, I stand by my original view that in a consumer market, higher resolution does not improve audio playback in any way that makes a meaningful difference to actual listening in the real world. I go back to the discussion referenced in the What Hi-Fi article: that people don't really notice differences in their regular listening between MP3s and lossless, such that they are not moving away from MP3s... I currently don't see the broader population demanding a move to higher-res, as I don't perceive the typical individual truly discerning material differences in a real-world application. (And this is ignoring the potential communication bandwidth challenges that I perceive make product releases/updates extremely problematic at this point in time.)
If the HD version is just an upsample of CD quality content, then it's highly unlikely that there would be significant perceivable differences.

the kind you refer to - where both the HR and the CD are being sold at different prices, but are recorded at the same levels from the same master, and just down sampled for the CD version.


Just to be precise, in my comment I referred to upsampling of CD quality content to a high res format. This was a potential criticism of the Meyer and Moran study, and could imply that some of the stimuli used in the test were not really high resolution.
Downsampling from high resolution to CD quality is a different scenario, and the studies primarily focused on just that sort of thing.
If the HD version is just an upsample of CD quality content, then it's highly unlikely that there would be significant perceivable differences. Luckily, almost all studies avoided anything like this, and potential biases were noted and discussed when any such issues could have occurred.
I agree that in the study I was able to read about, this issue isn't a factor; but in the real world HR audio is almost always sold after remastering the recording used for the original CD. Because - I suggest - this is the only way for it to sound audibly different in the real world of normal homes and listening environments: to make people who already have the original CD buy it again, and to make those who do not buy it at a higher price than the original CD that is still on sale.

I doubt that there is anyone successfully selling HR music closely allied to the kind you refer to - where both the HR and the CD are being sold at different prices, but are recorded at the same levels from the same master, and just down sampled for the CD version.

IMO this has been a good discussion that has reinforced for me the philosophy that real world improvement in sound quality can be obtained - if required for the enjoyment of music - only via paying attention to who is doing the mastering, and to speaker quality and placement/room acoustics/room response DSP. That is where the real improvements are to be found.

Of course, when the day comes that the good masterings are only available in HR format, it will be a different matter. But as of now there are no trends that indicate that this sorry state of affairs will come to pass!


And training seemed to be such a strong variable that it appeared to override everything else.


If training is guidance on a range of parameters to be observed/heard, with words suggested for the two ends of the scales for each, it seems to me that it would be useful; anything more starts running into contested territory!
But to my thinking, the forced preference is the bigger issue - and I would propose that it shows up in the CD v HR results, where roughly equal numbers prefer one over the other. More than anything else, I suggest that this shows there is little to choose between the two and people say: if I have to pick one, I pick this. The fact that preferences expressed in that manner fall equally on either side of the divide has a message to convey about the nature of the outcome.
... which made the effect of other factors like duration seem weak at best.

As for 2, you would clearly be comparing something very different if you look at commercial HD and CD quality recordings, if anything else was done in mastering besides just sample rate and bit depth conversion. But this argument works both ways of course. If the HD version is just an upsample of CD quality content, then it's highly unlikely that there would be significant perceivable differences. Luckily, almost all studies avoided anything like this, and potential biases were noted and discussed when any such issues could have occurred.
Hi Kumar,
1. Would be quite interesting. I think the optimal durations of samples and intervals would depend on which cognitive processes play the strongest role for a given task. But anyway, I don't think there was sufficient data for me to do a proper analysis on this. A lot of studies simply didn't give a lot of information on these durations.
And training seemed to be such a strong variable that it appeared to override everything else. Essentially (and I'm being loose with the language here), every study where participants were carefully trained in the task showed a strong ability to discriminate, every study where participants were untrained did not.
Hi everyone, and @joshr thanks for joining us! This is a great discussion, and I'd just like to remind everyone to respect the rules of the community and refrain from personal attacks, or letting the heat of the discussion make you forget that this is a welcoming place where everyone is entitled to their opinion. Let's keep it friendly!

Thanks!

And since we are looking at the limits of perception, long samples might be needed for listeners to identify differences. I grouped studies into those with short samples / quick changeovers vs long samples / long changeovers and the latter had much stronger results.

Such is the nature of subjective testing at the limits of perception and dealing with real world conditions.

I take your points; a couple of responses to the above quoted:
1. What about long samples/quick changeovers? You haven't addressed this method - would this not be better than either of the other two you mention?
2. The limits of perception that CD quality sound already reaches - which make it hard to hear as different from/worse than HR - and the real world conditions that further compound this issue are not often considered by HR advocates. And far too often this difficulty is addressed by HR advocates by having a second variable in the mix, the HR master. Because that can often make the difference night and day, and the reason for that isn't easily picked up by the layman/market.

It really is necessary to come as close to a 100% elimination of bias to have a dependable result... tester bias cannot be ruled out.

Good point and I fully agree.

I don't see that it is a double blind one.
Another hi-res evaluation paper by some of the same authors stated 'an operator was in a blind area outside the vehicle. It was not a double-blind experiment, but the subject and the operator could not communicate each other.'
I spoke with one of the authors for clarification on other aspects, and they used similar methodologies in both papers. So it's likely only single blind, but with significant effort to reduce tester influence.

'I haven't been able to get enough clarity on the outcome that approx 57% prefer HR to CD' - ah, that's the nature of these sorts of perceptual evaluation tests. They are actually well established throughout psychophysics (and applied to things like taste tests, perfume preference testing etc.), but it's not your typical experimental and control groups as in evidence-based medicine. Think of the control as what would happen if participants were guessing. Then the outcome would be 50%. Note that a 50% result in a preference test with lots of trials does not necessarily imply that those participants can't discriminate, since it could be, for instance, that everyone can perceive the difference but half the participants prefer A and half prefer B. But a 100% result in those circumstances would be extremely unlikely to occur by chance.

For this experiment, the standard error of the mean slightly overlapped 50%, so the result was not considered significant. But it had a small number of participants and trials.
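For anyone who wants to see the arithmetic, here is a minimal sketch of that kind of check. The trial counts below are made up purely for illustration (they are not the actual numbers from the experiment); it just shows how a roughly 57% preference is weighed against the 50% guessing baseline.

```python
# Hypothetical counts, just to show the arithmetic: 80 forced-choice trials,
# 46 "prefer hi-res" responses (~57%). Not the actual numbers from the study.
from math import sqrt
from scipy.stats import binomtest

k, n = 46, 80
p_hat = k / n                        # observed preference proportion (~0.575)
sem = sqrt(p_hat * (1 - p_hat) / n)  # standard error of the proportion

result = binomtest(k, n, p=0.5)      # two-sided test against 50% (pure guessing)
ci = result.proportion_ci(confidence_level=0.95)

print(f"proportion = {p_hat:.3f} +/- {sem:.3f} (SEM)")
print(f"95% CI = [{ci.low:.3f}, {ci.high:.3f}], p = {result.pvalue:.3f}")
# With counts like these the interval straddles 0.5, i.e. the ~57% preference
# is not distinguishable from guessing at this sample size.
```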

'My understanding of human memory of sound quality is that it degrades in seconds and becomes unreliable' - yes, and that would weaken any ability to discriminate. On the other hand, there is also the effect of auditory sensory memory. Studies suggest that it actually persists for quite some time. If the higher resolution content is played first, then one might retain the perception of that content if the lower resolution is played after (assuming that there's anything there to be perceived). And since we are looking at the limits of perception, long samples might be needed for listeners to identify differences. I grouped studies into those with short samples / quick changeovers vs long samples / long changeovers and the latter had much stronger results. But I wouldn't read too much into this since other differences between studies might have been the cause of this.


'I haven't seen any test that adequately rules out bias and takes into account human auditory issues including memory duration, and proves a preference for HR' - even if people can hear a difference, and even if that difference is due to higher quality, I doubt you'll find rigorous and general proof of preference. Such is the nature of subjective testing at the limits of perception and dealing with real world conditions. I'll give a couple of examples. In one study, participants were asked which audio format was best quality, and the live feed was the one most often ranked worst, suggesting that perhaps people didn't associate quality with purity. And a classic 1956 study suggested that those who have grown up listening to low quality reproductions may develop a preference for them.

if as you say a lot of the studies were like this, I am afraid that any meta analysis of these is just as flawed in its conclusions.
This is the big challenge with any meta-analysis, but luckily there is a lot of good advice from meta-analysis experts on how to deal with it. When I first started looking into this, it became clear that potential issues in the studies were a problem. One option would have been to just give up, but then I'd be adding no rigour to a discussion simply because I felt it wasn't rigorous enough. And it's the same as not publishing because you don't get a significant result, only now on a meta scale. So I set some ground rules. I committed to publishing all results, regardless of outcome. I included all possible studies, even if I thought they were problematic (but discussed the potential biases in detail). I decided that any choices regarding analysis or transformation of data would be made a priori, regardless of the result of that choice. And I did sensitivity analysis looking at alternative choices (like whether to exclude studies or analyse them differently).
One interesting thing that I found, which I did not at all expect, was that most of the potential biases would introduce false negatives. That is, most of the issues were things like not using high res stimuli, or having a filter in the playback chain that removed all high frequency content, or using test methodologies that made it difficult for people to answer correctly even when they heard differences.
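To make the mechanics concrete, here is a toy sketch of the kind of pooling plus leave-one-out sensitivity check described above. The study numbers are entirely invented, and it uses a simple fixed-effect inverse-variance weighting of per-study proportions rather than the fuller models in the actual paper; it is only meant to show how such choices can be re-run and compared.

```python
# Toy inverse-variance pooling of per-study "proportion correct" results,
# plus a leave-one-out sensitivity check. All numbers are invented for
# illustration; they are not the studies or values from the meta-analysis.
import numpy as np

# (correct responses, total trials) for some hypothetical studies
studies = [(60, 100), (110, 200), (45, 80), (52, 100), (70, 120)]

def pooled_proportion(data):
    """Fixed-effect, inverse-variance weighted mean of per-study proportions."""
    props = np.array([k / n for k, n in data])
    totals = np.array([n for _, n in data])
    var = props * (1 - props) / totals      # binomial variance of each estimate
    weights = 1.0 / var
    pooled = np.sum(weights * props) / np.sum(weights)
    se = np.sqrt(1.0 / np.sum(weights))
    return pooled, se

p, se = pooled_proportion(studies)
print(f"pooled proportion correct = {p:.3f} +/- {se:.3f}")

# Sensitivity analysis: drop each study in turn and re-pool.
for i in range(len(studies)):
    p_i, _ = pooled_proportion(studies[:i] + studies[i + 1:])
    print(f"without study {i + 1}: pooled = {p_i:.3f}")
```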

One non-AES paper that can be downloaded by anyone is at https://www2.ia-engineers.org/iciae/index.php/icisip/icisip2013/paper/viewFile/160/146 . I don't by any means claim that this is the best, and it used an AB style paired comparison rather than ABX, but it should give you an idea of what a lot of the studies were like.

As you say, there are some issues with this report, and someone more experienced with ABX testing will do a better job of identifying these than I can. And while what I point out may seem like nit-picking to some, I believe it isn't, for the following reason:

Achieving 90% of all that is needed to eliminate bias does not mean there is only a 10% chance that bias will colour the outcome - given how powerful and universal human biases are, that possibility can still remain as high as 100%. It really is necessary to come as close as possible to a 100% elimination of bias to have a dependable result.

To start with the issues I see: first, the purpose of the study is such that tester bias cannot be ruled out.

Second, the test says it is a blind test, but I don't see that it is a double blind one. It isn't double blind AB, leave alone a double blind ABX. Since it is not double blind, tester influence on the results cannot be ruled out.

I haven't been able to get enough clarity on the outcome that approx 57% prefer HR to CD, although it is clear that an outcome of no preference for either - as in both sounding the same - cannot be obtained given the forced preference design of the test. Do 43% prefer CD to HR? I find that hard to swallow too! And do these numbers deliver any statistically valid outcome?

My understanding of human memory of sound quality is that it degrades in seconds and becomes unreliable. To overcome that, a rapid - less than a second - changeover between the tested samples is required to get a valid comparison outcome. This hasn't been employed.

Two important aspects DO seem to have been well addressed - mastering variability, and sound level matching.

So, while I have pointed out what my limited understanding tells me are flaws in the test protocol, I admit a robust peer review would do better in this effort, though it would probably need access to the testers.

I don't plan to subscribe to AES, but if anyone can pick and describe any test there that is a robust ABX to determine the audible value of HR, it would be very interesting; till then, I will still maintain that I haven't seen any test that adequately rules out bias and takes into account human auditory issues including memory duration, and proves a preference for HR.

And if as you say a lot of the studies were like this, I am afraid that any meta analysis of these is just as flawed in its conclusions. To use a relevant analogy, no amount of subsequent effort and technology can overcome the defects in the source master.
Apologies in advance if I don't respond to many other queries.
Short answer is that I do request corrections when anything I publish contains errors. But I wasn't the author of this; it was just a press release. 'Important advantage' sounds to me like opinion rather than a factual claim, I can see how it could be used as a layman's phrasing of 'statistically significant', and scouring the internet would be impossible.


I'm sorry, but this is stretching the bounds of reality. No matter how many semantic arguments you wish to put forth, not even a layman would equate 'statistically significant' with 'important advantage'. I stand by my point; stating this study showed an 'important advantage' isn't justifiable marketing spin, or a translation into layman's terms; it is a flat out lie.

And if I were quoted as such when my results said nothing of the sort (in fact they say quite the opposite), I personally would request a retraction. I don't care if it was presented as an opinion, a fact, or an edict from God. I've seen no retraction, so I am left to think that you agree with the "layman's phrasing" which, even to a layman, is a gross misrepresentation of the facts. This is pure marketing BS, and you seem to be a willing participant in that BS.

In short, it is astonishing that you would blame a mere misunderstanding, a marketing speak translation, or a dumbing down for laymen for what is in reality a lie, attributed directly to you. I'm left to wonder if AES's apparent support of all things Hi-Res has any bearing on your agreement with this "layman's phrasing".
Thank you! The Japanese test seems to have definitive conclusions and I will read the methodology in detail to understand how it was done.
Apologies in advance if I don't respond to many other queries.

why have you not requested a retraction?
Short answer is that I do request corrections when anything I publish contains errors. But I wasn't the author of this; it was just a press release. 'Important advantage' sounds to me like opinion rather than a factual claim, I can see how it could be used as a layman's phrasing of 'statistically significant', and scouring the internet would be impossible.


“I have [yet] to come across even one for hi res that does a decent job of doing level matched blind AB, leave alone a full protocol ABX… may I be pointed to even one ABX done in line with well established principle” – see the paper.

I have seen the paper. But is there any way to see a fully documented test itself? A link to one perhaps - that is the only way to it for most of us.
Unless I read the documented test, I can't conclude how good it is. I may still not be able to do that, but it will be more than anything I have been able to find till now. By good, I mean how well single variable ABX protocols have been complied with.
Thank you in advance.


Most of the studies analysed in the paper were published by the Audio Engineering Society. They are all available from http://www.aes.org/e-lib/ , and can be downloaded freely by AES members.
One non-AES paper that can be downloaded by anyone is at https://www2.ia-engineers.org/iciae/index.php/icisip/icisip2013/paper/viewFile/160/146 . I don't by any means claim that this is the best, and it used an AB style paired comparison rather than ABX, but it should give you an idea of what a lot of the studies were like.

“I have [yet] to come across even one for hi res that does a decent job of doing level matched blind AB, leave alone a full protocol ABX… may I be pointed to even one ABX done in line with well established principle” – see the paper.

I have seen the paper. But is there any way to see a fully documented test itself? A link to one perhaps - that is the only way to it for most of us.
Unless I read the documented test, I can't conclude how good it is. I may still not be able to do that, but it will be more than anything I have been able to find till now. By good, I mean how well single variable ABX protocols have been complied with.
Thank you in advance.
So, once again, the meta-analysis results do not state there is a qualitative advantage, and are thus irrelevant to this thread.

One question to the author: if the quote which stated the opposite was not what you said, why have you not requested a retraction? It's a pretty embarrassing statement, and a poor reflection of one's scholarship. If it were me, I'd be scouring the internet for every reference to that quote and requesting it be changed. Yet I see it in dozens of places touting the benefits of Hi-res music where none definitively exist.
Hi all,
I am the author of the meta-analysis paper being discussed, and I was asked to comment on some of the points made in this discussion.
The paper being referred to is available at http://www.aes.org/e-lib/browse.cfm?elib=18296 , and it links to additional resources with all the data and analysis.
Also, note that this was unfunded research. At no point has any of my research into high resolution audio or related topics ever been funded by industry or anything like that.
On to the specific comments:

“Dr. Reiss was the one making the PR statement… Dr. Reiss is lying in his PR statement” - I didn’t write the press release! Press releases are put forward by organisations with the aim of trying to get the press to cover their story, and as such are a combination of spin, marketing, opinion and fact. In this case, it was written by a press officer at my university, and then AES issued another similar one. The ‘advantage’ quote was based on a conversation that I had with the press officer, but it was not text directly from me (I just checked my email correspondence to confirm this). It most likely came from trying to translate the phrase ‘small but statistically significant ability of test subjects to discriminate’ to something that can be easily understood by a wide audience.

“explained by the presence of intermodulation distortion” – This was looked into in great detail, see the paper and supplemental material. First note that intermodulation distortion in these studies would primarily arise from situations where the playback chain was not fully high resolution, e.g., putting high resolution content through an amplifier that distorts high frequencies. Anyway, quite a lot of studies did look into this and other possible distortions (see Oohashi 1991, Theiss 1997, Nishiguchi 2003, Hamasaki 2004, Jackson 2014, Jackson 2016) and took measures to ensure it wasn’t an issue. This includes most studies that found a strong ability to discriminate high resolution content. In contrast, some studies that claim not to find a difference either make no mention of distortion or modulation (like Meyer 2007), or had low resolution equipment that might cause distortion (like Repp 2006).
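As a side illustration of the mechanism being discussed (my own toy example, not taken from any of the studies above): if two ultrasonic tones pass through a slightly nonlinear playback stage, the intermodulation products include a difference tone well inside the audible band, which is exactly the kind of artefact those studies tried to measure and exclude.

```python
# Toy demonstration of intermodulation folding ultrasonic content into the
# audible band. Purely illustrative; not data or code from any cited study.
import numpy as np

fs = 96_000                       # high-res sample rate
t = np.arange(fs) / fs            # one second of signal
f1, f2 = 23_000.0, 26_000.0       # two ultrasonic tones, inaudible on their own
x = 0.5 * np.sin(2 * np.pi * f1 * t) + 0.5 * np.sin(2 * np.pi * f2 * t)

# A weakly nonlinear playback stage (e.g. an amplifier that distorts at HF)
y = x + 0.1 * x**2

spectrum = np.abs(np.fft.rfft(y)) / len(y)
freqs = np.fft.rfftfreq(len(y), 1 / fs)

# Look only at the audible range: the f2 - f1 = 3 kHz difference tone dominates.
audible = (freqs > 20) & (freqs < 20_000)
peak = freqs[audible][np.argmax(spectrum[audible])]
print(f"strongest audible component: {peak:.0f} Hz")   # ~3000 Hz
```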

“I have [yet] to come across even one for hi res that does a decent job of doing level matched blind AB, leave alone a full protocol ABX… may I be pointed to even one ABX done in line with well established principle” – see the paper. There were a lot of studies that do double blind, level matched ABX testing. Many of those studies reported strong results. They all could suffer issues of course, but the point of the paper was to investigate all those studies.

“absolutely no evidence in the meta-analysis that there is an ‘advantage in its quality’” – I would not go as far as that. I neither claim that there is nor that there isn’t; ‘advantage’ is too subjective. However, many of the studies looked at preference, or at what sounded ‘closer to live’, or asked people to comment on subjective qualities of what they heard. They do suggest an advantage to audiophiles, but I would argue that the data is not rigorous or sufficient in this regard.

“his cherry picking of studies” – A strong motivation for doing the meta-analysis was to avoid cherry-picking studies. For this reason, I included all studies for which there was sufficient data for them to be used in meta-analysis. That way, I could try to avoid any of my own biases or judgement calls. Even if I thought it was a poor study, its conclusions seemed flawed or it disagreed with my own conceptions, if I could get the minimal data to do meta-analysis, I included it.

“chance of him actually including studies that find there is no difference, like the M&M study … slim to none… disclusion of seminal studies on Hi-res audio like the M&M study” – I did include the M&M study (Meyer 2007)! See Sections 2.2 and 3.7 and Tables 2, 3, 4 and 5. I couldn’t include it in the Continuous results because Meyer and Moran never reported their participant results, even in summary form, and no longer had the data (I asked them), but I was able to use their study for Dichotomous results and it didn’t change the outcome.

‘explain why studies that did factor in the existence of IM distortion were left out, whereas studies that didn't consider IM were included’ – see previous points. I included every study with sufficient data, some of which considered IM and some didn’t. The Ashihara study (references 25, 60 and 61) was a detection threshold test, demonstrating only that IM could be heard and could be a factor in discrimination tests. They also did not report results in a form that could be used for meta-analysis.
IMO(🆒) before dabbling in the meta domain, may I be pointed to even one ABX done in line with well established principles and accepted in the world of science that establishes:
1. That differences were heard by a statistically significant part of the sample to establish that these are audible (a sketch of the kind of significance arithmetic I mean follows below)
2. If so, from those that heard the difference, how many preferred it, how many did not, and how many found both just as good.
And this for any of the audiophile supported exotics like DACs/Hi Res/MQA etc. - anything except speakers, where I don't need to be convinced that audible differences will survive blind testing. Which is good, because speaker testing involves hard to achieve speaker handling to eliminate all but one variable - the speakers being compared - with each speaker having to be kept in the exact same place with reference to the room and listener; compared to this, upstream component ABX is a lot easier, though it will still need instruments and strict protocols.
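On point 1, here is a minimal sketch of the kind of significance check I have in mind for a single ABX run; the trial counts are hypothetical, purely to show how such a result would be judged against guessing.

```python
# Exact binomial check for one listener's ABX run: how likely is at least this
# many correct identifications if they were purely guessing? Hypothetical counts.
from math import comb

n_trials = 16    # ABX trials for one listener
n_correct = 12   # correct identifications of X

# P(at least n_correct successes in n_trials with p = 0.5)
p_value = sum(comb(n_trials, k) for k in range(n_correct, n_trials + 1)) / 2**n_trials
print(f"{n_correct}/{n_trials} correct -> p = {p_value:.4f} under guessing")
# ~0.038 here: such a run would usually be read as evidence of an audible
# difference, though a full protocol needs many listeners and pre-set criteria.
```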

Peer reviewed would be good, but by peers with credibility, not the tester's good friends/bar buddies/spouses - who, by definition, also qualify as peers. Scholarly peer review would be more like it.

Meta can only be step 2; it is meaningless without step 1.

Unless proved wrong, my claim is that not one such review exists in the world. And there is corroboration to that claim: someone here said - wrongly by the way - that if Sonos kit was in the audiophile league, Sonos would not have hesitated to say so in their marketing. So for sure if such a test existed it would have been very visible in advertising by now. Someone that knew of such evidence would not have been shy of advertising it to sell their product. Why hasn't anyone done so? Because no such test that establishes a difference exists. And because makers with even mediocre lawyers know what the consequences of false claims can be.
This study was a carefully crafted shill for the Hi-res audio industry...
You state this as fact rather than opinion. You need to provide solid evidence or withdraw the remark.

...and even so, it proved nothing.
Another statement of fact for which you will need to provide proof. In this case, it would be appropriate to contest a peer-reviewed paper with a dissenting publication.

I remind you of this statement you made recently:
As stated, opinions are fine. It's once someone starts making definitive claims that they need to start giving proof.


So preface it all with "in my opinion", just as everybody else already inferred. Except the "it proved nothing" statement, which is a clause describing the use of the study as a shill for the Hi-res industry, and it indeed provided no proof of any advantage for Hi-res audio. Your convenient snipping of the context of my posts does not go unnoticed.

Then go ahead and pat yourself on the back for switching the conversation to semantic childishness after you were made to look stupid about the PR quote. Then maybe we can get this conversation away from the kid's sandbox and back to discussing things like adults?

By the way, no comment on the PR quote and what it means with regards to any bias shown in the disclusion of seminal studies on Hi-res audio like the M&M study? Also, explain why studies that did factor in the existence of IM distortion were left out, whereas studies that didn't consider IM were included?

Or are you just going to play gotcha?
This study was a carefully crafted shill for the Hi-res audio industry...
You state this as fact rather than opinion. You need to provide solid evidence or withdraw the remark.

...and even so, it proved nothing.
Another statement of fact for which you will need to provide proof. In this case, it would be appropriate to contest a peer-reviewed paper with a dissenting publication.

I remind you of this statement you made recently:
As stated, opinions are fine. It's once someone starts making definitive claims that they need to start giving proof.
Let me point out, having been "quoted" in a few PR statements myself, that often the quote isn't what the person said, it's what the marketing people want the person to say.

I don't know whether he really said that, or was told to say that.

On the other hand, I'm enjoying this discussion. Thank you all for adding to my knowledge.