After deciding to sacrifice my unused Sonos speakers to the upgrade gods, it didn't take long to run into an issue.
When playing back music, I had issues that I could reproduce on demand:
If a group contains 4 or more devices, there is a delay of up to 12 seconds, or music drops in and out during the first 12 seconds, on all devices except the group controller when starting music or changing tracks. "Devices" here includes a sub and/or the speakers in stereo pairs; it does not mean 4 Sonos rooms.
Changing group members in the app gets slow after the above and while playing music. Track progress starts getting jumpy and cloud API calls take longer.
I have spent some time investigating and think I know what is causing my speakers to have issues while grouped, which also introduces random app behaviour, "cannot contact" messages and slow responses. More often than not, once the controller gets into this state it doesn't recover and requires fully closing and reopening to return to normal.
Why I didn’t instantly assume it was my network
For the past 3 years my Sonos has been completely trouble free, unless I was fiddling in my network settings and broke something myself. I have visibility on data about most parts of my network, so it’s usually obvious when something is wonky
Changes of note during the past year
I no longer have my Sonos Amps, Port or one of the Subs - the lack of a Port may have contributed
I have added some Yamaha gear which uses multicast, including from my HT AVR → Sub, so it is obvious when multicast is playing up because the AVR and sub separate.
I replaced my noisy all singing all dancing enterprise managed POE switches with more suitable, fanless managed SMB switches about a year ago. Far fewer settings to fiddle with and break things
Following the upgrade was the first time I had added a wifi network for a while. I’ve been on SonosNet for over a year. Maybe time to blame the missing Port again
The bookshelf + sub setup as a stereo pair + sub had been faultless prior to upgrading and the content I’m playing is the same as before the upgrade.
The lamps have been mainly single playback background music, but I did find time to quickly turn 2 into a stereo pair + sub and they played back without disruption the same things the bookshelf + sub struggled with.
A brief observation on SonoPad was that the track time progress would stop and, with a larger group, the playback queue would disappear until playback stabilised, as though it couldn’t retrieve it. If I changed tracks while the group was unstable, then occasionally group members would play part of the previous track before trying to switch streams, as though processing of the control messages was delayed.
Curious that Apple only sends ~3Mb/s, which is more than enough, rather than pushing at a higher rate as Qobuz does.
48kHz x 24 bits x 2 channels = 2.3Mb/s
Sprinkle a little overhead into the mix and, rounded up, it's around 3Mb/s
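The arithmetic above can be sketched in a few lines of Python. This is only the raw PCM rate; lossless codecs such as FLAC or ALAC normally compress below it, and protocol overhead adds a little on top. The function name `pcm_bitrate_mbps` and the 192/24 row are illustrative assumptions, not figures measured in this thread.

```python
def pcm_bitrate_mbps(sample_rate_hz: int, bit_depth: int, channels: int = 2) -> float:
    """Raw (uncompressed) PCM bit rate in megabits per second."""
    return sample_rate_hz * bit_depth * channels / 1_000_000


# The 48/24 row is the case worked out above; the other rows are
# illustrative assumptions added for comparison.
formats = [
    ("CD 44.1kHz/16 bit", 44_100, 16),
    ("Apple 48kHz/24 bit", 48_000, 24),
    ("Hires 192kHz/24 bit (assumed example)", 192_000, 24),
]

for label, rate, bits in formats:
    print(f"{label}: {pcm_bitrate_mbps(rate, bits):.2f} Mb/s")
```

For the 48kHz/24 bit stereo case this gives 2.30 Mb/s, which matches the rough 2.3Mb/s figure before overhead.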
I have no issues with groups, apart from the pause when adding older products to a higher bit rate stream.
In the interest of science, to stress things even more, I tried a non-typical use case. I started a separate Apple Music stream in each Room: 9 independent Apple Music 48/24 streams and an additional 5 independent Apple Music standard streams to older devices (P1s from 2013), including 2 paired P1s. I had no issues with skipping or with control of the rooms using the Sonos iOS App on iPhone (not SonoPad) during the test of about 5 minutes.
Maybe there's going to be a price to pay for the conscious lack of hardware progression prior to the last generation or two.
It’s a key thing to understand though… if that hardware is on the cusp of not being able to support the new ecosystem, and causes instability, then to be honest it strengthens the calls for the old S2 app to come back, and the new one be rebranded as S3, dropping support for that hardware.
Now I’m sure that would bring about further consternation and hand wringing, but at least people could get back to a working setup without then needing to go on an upgrade spree.
Maybe some of the apologists would change tack!
@Ian_S it is certainly something useful to try to understand. It is easy to say that flac, for example, has a low decoding overhead, that single digit Mbps network streams are nothing, and that dsp processing isn’t a huge overhead. It is not so easy for people to understand how much the multiple things going on at once inside an embedded device with a low power system on a chip can be impacted.
It is like the various "why no gigabit" conversations that go on across many devices, not even Sonos specific: they rarely consider whether the CPU needing to process the network traffic can cope with so much data, never mind whether the device itself ever needs that much bandwidth.
@133133 People are people. Everyone has different experiences and backgrounds so has different views.
My intent with this thread is to investigate why my previously stable system is now unstable playing back the same music. Will I make mistakes or incorrect assumptions? Sure. I will then adjust how I approach and investigate accordingly.
I chose to run on SonosNet previously because it kept the music playback traffic away from my AP, I have empty 2.4Ghz channels where I live and my WiFi AP has other demanding tasks to do. While there is a move away from SonosNet with new products removing all support, there is no obvious reason from the outside that my SonosNet setup which hasn’t given me any issues should become unstable to the point it can’t handle 3-4 speakers in a group or 2.1 setup.
It will play 1, 2 or 3 groups of 2 speakers without issues, but then I run out of speakers. Add an extra speaker to a group to make 3 or 4 and things go wonky.
Interesting, maybe Qobuz should start pushing a lower rate rather than the far higher 12-15Mbps that I have previously seen coming down.
Within the first 12 seconds of playing or changing streams, it is only the additional speakers which aren’t receiving the unicast stream which are disrupted. Playing to pairs or individual hasn’t been an issue, but with only 6 + sub I soon run out of speakers to pair up
Think I need to activate the Apple trial this weekend, maybe Tidal and/or Amazon as well.
And it is much appreciated even though I am S1 and unaffected. It is beginning to resemble the IS / IS NOT I asked for in another thread - a basic fault definition as a starting point to understand what Sonos are dealing with which I’d hope they have even though they treat customers with disdain and like mushrooms. Of course, if they were to agree with your findings that would be admission of a wholly unacceptable software checkout plan. It would also explain why many people seem unaffected if they never group so many zones. It feels like just one area of issue amongst many but someone in Sonos must read this input and twig the effect of whatever architecture changes they’ve made.
The thing is whilst a little PITA for songs not or no longer being instantaneous, if there was a message informing the user that the system was taking time to fill buffers prior to playing there would be a bit more acceptance that hey, at least we know wtf is happening. Guys in labs trying stuff out on ethernet wired systems with 2 zones and limited sources likely?
There are certainly hardware differences between the ikea and sonos speakers even if using the same board as suggested by the online teardowns. My ikea speakers don’t have on board microphones, they have their own hardware identifiers so appear as Sonos devices with Ikea images and descriptions. Their target price point was lower than the Sonos speakers.
There could well be differences with music providers and how they provide the streams.
With Amazon and Apple supporting adaptive streams along with the Sonos Adaptive bit rate, they can switch as needed during playback based on changing bandwidth and capability.
It could provide an explanation for people seeing the Amazon lossless HD switching on and off, without noticeable disruption to playback and means things are operating as expected by design.
Maybe Qobuz deliver differently using file based so my speakers can’t auto-switch and are doing the best they can with what is available, but it doesn’t interact well with the new firmware. It would provide an explanation for why there is the initial download burst, potentially of multiple chunks, to fill the buffer, before settling into a lower update rate to keep the buffer full until the track ends.
Unfortunately Qobuz withdrew their previously public api version and docs following people abusing it, so the details of how to access or what methods they provide for streaming are no longer readily available publicly.
It took longer than intended but I have finished the isolated testing I wanted to perform.
Isolated network setup. Wifi router with all test lan to wan traffic allowed. Wan blocked. Pi-Hole blocking disabled, all dns allowed. No STP, loop protection or igmp snooping enabled on the switch or router.
Music to use needs to be on local library, Qobuz and Apple Music. Track changing will be performed in the order listed from queue and after download completes for Nas and Qobuz
Indila - Mini World. CD lossless Tracks: Dernière Danse, Run Run, Mini World
Jessie Ware - That Feels Good. Hires lossless Tracks: Pearls, Begin Again, Shake the Bottle
Sade - Bring me home live. Stereo PCM from my bluray. Second local Hires lossless check Tracks: Your Love is King, Sweetest Taboo, No Ordinary Love
Sonos setup to test, which previously worked without issues and regularly used for all music local and Qobuz, Stereo Pair with Sub using Ikea gen 1 bookshelf with sub mini grouped with the Ikea gen 2 bookshelf
Tests to be repeated multiple times at different times of day on different days.
Ethernet Test - All devices wired with wifi disabled
SonosNet - Channel 11. Previously used for over a year without issues
Wifi - Channel 6. 2.4GHz with 20MHz width. Five out of seven speakers only support g/a so no point enabling n
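Summary of test results

| Album | Source | Ethernet: Group | Ethernet: Max Stable | SonosNet: Group | SonosNet: Max Stable | Wifi: Group | Wifi: Max Stable |
|---|---|---|---|---|---|---|---|
| Indila | Nas | stable | 7 | stable | 5 | stable | 5 |
| Indila | Qobuz | stable | 7 | unstable | 3 | unstable | 3 |
| Indila | Apple | stable | 7 | unstable | 3 | unstable | 3 |
| Jessie Ware | Nas | stable | 7 | unstable | 2 | unstable | 2 |
| Jessie Ware | Qobuz | stable | 6 | unstable | 2 | unstable | 2 |
| Jessie Ware | Apple | stable | 6 | unstable | 2 | unstable | 2 |
| Sade | Nas | stable | 7 | unstable | 2 | unstable | 2 |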
Max stable is the maximum number of devices with which no interruption or disturbance occurs. On SonosNet and Wifi that meant using a stereo pair without the sub or two individual speakers in a group.
stable: track changes and playback plays without any disruption or disturbances.
unstable: volume fades down/up on random speakers, group members stop playing but recover and continue playing correctly until another stream change.
While one off failures were generally ignored, there were three notable one off failures
When the first Jessie Ware track (Pearls) from Qobuz was played, the group coordinator disappeared from the network after ~12 seconds and the grouped speakers went silent while it played out its buffer. All other speakers were pingable. The group coordinator gave no network response. Speakers in the 2.1 setup were missing from the app. After 10 minutes it hadn't returned, so I rebooted it. After the reboot the app showed a pop-up for a new bookshelf speaker being found, with the 2.1 setup still missing. It required a full app close and restart to show the correct speakers.
During later testing, when the second Jessie Ware track (Begin Again) from Qobuz was played, the group coordinator went AWOL again after ~3 seconds. The App continued to update playback for ~2 minutes even though all speakers were silent. 'support/review' showed all speakers except the group coordinator. Eventually the app stopped showing playback and all 3 speakers in the 2.1 setup disappeared from the app. Again it required a reboot of the speaker to recover. The App recovered when the speaker had finished booting.
An unstable switch to lossless with Apple caused the Sub to go AWOL and it didn't recover when switching back to lossy. Playback became more stable as there was one less device. Again it required a reboot of the speaker to recover.
Other observations
Nas and Qobuz have the same file based delivery pattern. Qobuz delivered ~3Mbps faster than the device file rate read from the Nas.
Apple is chunks with pauses between them throughout the entire track, the same way video streams deliver.
Nas and Qobuz provide no indication of the type of content being played back. From memory, I think the old App used to show CD and HD or Hires.
Apple lossless is predictable when it will switch streams. It starts playing lossy. 60 seconds later it will switch to another lossy stream, which will randomly cause a stutter as it switches. 30 seconds later it will switch to lossless. If lossless is unstable it will switch back to lossy 30 seconds later. Time to attempt to switch back to lossless increases.
Changing streams on Apple will reset back to lossy if the speakers haven't been playing lossless for long enough. Once enough time is spent successfully playing lossless the group will always attempt lossless first. Changing the group requires going through the training again.
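The switching pattern described above can be written down as a small lookup. This is only a sketch of the timings observed in this test for the case where lossless stays stable, not Apple's documented behaviour. The names `OBSERVED_TIMELINE` and `stream_at` are made up for illustration, and the fallback to lossy and the growing retry delay are left out because no exact timings were recorded for them.

```python
# Seconds after playback starts -> stream type, as observed in this test
# when the switch to lossless succeeds. Not an official Apple schedule.
OBSERVED_TIMELINE = [
    (0, "lossy"),
    (60, "lossy (second stream)"),
    (90, "lossless"),
]


def stream_at(seconds: float, timeline=OBSERVED_TIMELINE) -> str:
    """Return the stream type expected at a given playback time."""
    current = timeline[0][1]
    for start, name in timeline:
        if seconds >= start:
            current = name
    return current


for t in (10, 75, 120):
    print(f"{t:>3}s: {stream_at(t)}")
```

Knowing roughly when a switch is due makes it easier to line up a stutter or dropout in a capture with a stream change rather than with something else on the network.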
Three groups of two devices (stereo pair, speaker and sub, two individual speakers grouped) would independently play different Hires music from any source without any issues.
Network traffic and behaviour during unstable period
Moved the speakers to SonosNet (it's only three). Laptop without IP collected mirror of the wired speaker port with wireshark.
When stable, regular STP, regular SSDP Notify of services available from each device.
When group members drop out and recover, an IGMP leave group is sent from each affected device. If they recover quickly, only one leave per device is sent. If it takes longer, multiple leave group messages are sent.
An IGMP Leave group doesn't necessarily mean the device has left the group, it is issued by a process running on the device. The speaker recovering suggests a replacement process has started.
Once stable an IGMP group membership report is sent.
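For anyone wanting to pick the same messages out of their own capture, the IGMP message type is the first byte of the IGMP payload. Below is a minimal sketch that classifies messages and counts leaves per device. The type values are the standard IGMP ones; the function names, the sample IP addresses and the group address `239.1.2.3` are hypothetical, and the checksum is not validated.

```python
import struct
from collections import Counter

# Standard IGMP message type values.
IGMP_TYPES = {
    0x11: "membership query",
    0x12: "v1 membership report",
    0x16: "v2 membership report",
    0x17: "v2 leave group",
    0x22: "v3 membership report",
}


def parse_igmp(payload: bytes):
    """Return (type_name, group_address) for an IGMP message.

    group_address is None for v3 reports, which use a different layout.
    """
    if len(payload) < 8:
        raise ValueError("IGMP message too short")
    msg_type, _max_resp, _checksum, group = struct.unpack("!BBH4s", payload[:8])
    name = IGMP_TYPES.get(msg_type, f"unknown (0x{msg_type:02x})")
    if msg_type == 0x22:
        return name, None
    return name, ".".join(str(b) for b in group)


def count_leaves(messages):
    """Count leave group messages per source IP.

    messages: iterable of (source_ip, igmp_payload_bytes).
    """
    leaves = Counter()
    for src, payload in messages:
        if parse_igmp(payload)[0] == "v2 leave group":
            leaves[src] += 1
    return leaves


# Hypothetical sample: one speaker leaves twice, another reports membership.
leave = bytes([0x17, 0, 0, 0, 239, 1, 2, 3])
report = bytes([0x16, 0, 0, 0, 239, 1, 2, 3])
sample = [("192.168.1.10", leave), ("192.168.1.10", leave), ("192.168.1.11", report)]
print(count_leaves(sample))
```

A device that shows several leaves before its next membership report would correspond to the slower recoveries described above.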
A timed curl to the support/review page takes 300-500ms on all devices when playback of hires content is stable. During unstable periods this increases to ~1 second for group members and 2-5 seconds for the group coordinator.
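A rough Python equivalent of the timed curl, for anyone who wants to repeat the measurement. It is a sketch under assumptions: the helper names `time_get` and `report` are made up, the speaker addresses are placeholders, and port 1400 is assumed to be where the speaker serves its support/review page.

```python
import time
import urllib.request


def time_get(url: str, timeout: float = 10.0, opener=urllib.request.urlopen) -> float:
    """Return the seconds taken to fetch url and read the whole body."""
    start = time.perf_counter()
    with opener(url, timeout=timeout) as response:
        response.read()
    return time.perf_counter() - start


def report(speaker_ips, port: int = 1400) -> None:
    """Print the response time of support/review for each speaker.

    port 1400 is an assumption; adjust it if your devices differ.
    """
    for ip in speaker_ips:
        url = f"http://{ip}:{port}/support/review"
        try:
            print(f"{ip}: {time_get(url) * 1000:.0f} ms")
        except OSError as err:
            # Covers timeouts and connection failures, e.g. a coordinator gone AWOL.
            print(f"{ip}: no response ({err})")
```

Calling `report(["192.168.1.10", "192.168.1.11"])` with your own speaker addresses, once during stable playback and once during an unstable period, gives numbers to compare against the figures above. The `opener` parameter exists only so the timing function can be exercised without a real speaker.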
In a larger group which never recovers, multiple Leave group messages keep being issued, the group coordinator stops responding to requests for support/review, the Sonos App shows an empty queue, playback status is meaningless and it starts issuing mdns and arp broadcasts to all speakers. Nothing excessive, but will repeat until it gets a response.
If the group is large enough, all speakers struggle to play, eventually give up trying to play the current track and switch to the next track.
Thoughts
With the latest firmware the communication between speakers with Hires content and larger groups of my speakers appears to break easily. With the response times from the group coordinator increasing to seconds, the group members appear to replace the existing process with a new process which manages to re-establish communication.
Even on SonosNet there should be no issue with three devices playing back any content, yet for my Ikea speakers they are unable to provide stable playback of Hires content.
With the wifi AP on my main network, a wifi router isolating the speakers from my network with no other devices apart from an ipad, SonosNet, 2.4Ghz or 5Ghz wifi, the same behaviour is observed around when instability occurs. Suggestions of spending £x00 on new wireless mesh network for £300 of speakers that used to work is not a sensible or realistic idea. For Qobuz a work around would be to change the Qobuz device setting to only allow a max of CD streams for playback, but this should be unnecessary.
The lack of indication in the App about the stream being CD or HD/Hires makes it difficult for an end user to see what is being sent to them so can't identify what they are playing and if certain formats are giving them issues. Apple similarly makes no distinction between CD lossless and Hires lossless, so unless someone knows the format released by the artist/publisher they just see it as an issue with lossless.
It seems like whatever changes occurred in the new firmware aren't working well with my speakers. The communication between them is easily interrupted with more complex media formats.
The million dollar question is why? I can speculate about multiple things that could cause it, but I think the answer lies within the devices and what is going on inside them, rather than external factors.
@Corry P I submitted 4 diagnostics reports over the weekend in case they would be useful for someone to look at.
Presumably you were using the new app to trigger everything?
If so, do you see different behaviour if you use one of the desktop controllers or a 3rd party app like Sonophone?
Over the period I did the test as controller apps I used
My laptop is Linux, so doesn’t have the desktop controller on it. My only Windows machine, which has the desktop controller on it, is plugged into the TV on the wrong side of the router I used to isolate the Sonos devices and not realistic to wire in.
The controller made no difference, but SonoPad was more reflective of the status when devices were unstable, whereas the Sonos App appeared to cache more and display the last known state for longer, giving the appearance that everything was ok.
The incorrect identification occurred on the 06/08/2024 app, so it may not have happened with the fixes in the latest release.
Hi @sigh
Thanks for the diagnostics.
I am 99% sure that this is the cause of the problems:
This indicates a large amount of multicast traffic not destined for a Sonos device reaching and flooding the Group Coordinator’s RX buffer.
The solution to this is usually to enable IGMP Snooping/Filtering on your network - either by enabling some router settings, or by fitting an IGMP-capable switch between the router and the wired Sonos component. Alternatively, use a packet sniffer to find the source of the flooding and take action to prevent it.
Aside from this, I’d also recommend making a wired speaker the Group Coordinator - I can see that the wired speaker is not part of the group, but if you were to not only include it, but were to make it the GC, you may see additional improvements - but, I’d take care of the multicast flooding first.
I hope this helps.
@Corry P Would the multicast flooding show up as dropped packets on the br0 interface?
Hi @Ian_S
Good question - I have no idea, so I asked someone else: they think so, but are not sure.
I guess we have different ways of looking at things - for example, I didn’t read any of this thread beyond the opening post. I figured it would have only served to confuse me, and I found the answer in the diagnostics where I thought I might.
Sorry I can’t be of more help.
Thanks @Corry P, the reason I ask is because I see a large number of packet drops on my Sonos speakers except for some weird reason one Play:3 … I have some instability and if there’s something other than Sonos causing a lot of multicast traffic it would be nice to be able to track it down and if possible stop it.
Hi @Ian_S
It’s always best to ask the speakers - please submit a support diagnostic and let me know here when you have. I’ll be happy to take a look for you.
Sent… There is a support case open…
Interesting, I did briefly try with igmp snooping enabled on the switch, but it didn’t improve things so switched it off again.
I’ll go back through the capture logs because I would have expected that much from another source to be easily visible and stand out like a sore thumb. The only external sources would be the iPad, router, switch or laptop without an IP running wireshark capturing the mirrored port.
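One way to go back through a capture for a flooding source is to count frames sent to multicast or broadcast destinations per source MAC. The sketch below assumes the capture has already been reduced to (source MAC, destination MAC) pairs; the function names and the sample addresses are hypothetical. A destination is a group address when the least significant bit of its first octet is set, which covers both IPv4 multicast (01:00:5e:...) and broadcast (ff:ff:ff:ff:ff:ff).

```python
from collections import Counter


def is_group_mac(mac: str) -> bool:
    """True if the MAC is a multicast or broadcast (group) address."""
    first_octet = int(mac.split(":")[0], 16)
    return bool(first_octet & 0x01)


def multicast_talkers(frames):
    """Return [(source_mac, frame_count), ...] for group-addressed frames,
    busiest source first.

    frames: iterable of (source_mac, destination_mac) pairs.
    """
    counts = Counter()
    for src, dst in frames:
        if is_group_mac(dst):
            counts[src] += 1
    return counts.most_common()


# Hypothetical sample: two multicast frames, one unicast, one broadcast.
sample_frames = [
    ("aa:aa:aa:aa:aa:01", "01:00:5e:7f:ff:fa"),
    ("aa:aa:aa:aa:aa:01", "01:00:5e:7f:ff:fa"),
    ("aa:aa:aa:aa:aa:02", "aa:aa:aa:aa:aa:01"),
    ("aa:aa:aa:aa:aa:03", "ff:ff:ff:ff:ff:ff"),
]
print(multicast_talkers(sample_frames))
```

The address pairs can be exported from a saved capture with something like `tshark -r capture.pcap -T fields -e eth.src -e eth.dst`, where `capture.pcap` is a placeholder file name. A real flood should put one source far ahead of everything else in the list.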
I’d avoided using the wired speaker that provides SonosNet in the group, so that it was dedicated purely to acting as the gateway.
I’ll also give things a shuffle around and see if that helps.
@sigh It is interesting with your test case as I’m assuming it was a dedicated network for testing just Sonos, and there shouldn’t have been anything else multi-casting … I’m equally confused by why in my network the Play:3 speaker appears immune to the large number of logical dropped packets that all my other speakers seem to get…
Hi @Ian_S
I see no evidence of multicast flooding on your network.
The Play:3 is definitely struggling a bit - its CPU usage is pegged at 100%, and I suspect this will be the reason behind any playback issues on it. It also has only 1660 bytes of free memory. Please try a reboot of the Play:3.
As for the rest of your Sonos system (well, and the Play:3 too), it seems the main issue is that the speakers are occasionally not getting connected to WiFi when they try - I see deauthorisation errors. This used to occur back before we supported 5GHz, but a router would try to steer a speaker on to 5GHz anyway. This is happening with multiple speakers and both APs. I’m not sure why this is happening, but I recommend that you give SonosNet a go by re-enabling WiFi on your ethernet-wired Port: Settings » [room with Port] » Port » Enable WiFi - this will result in fewer DEAUTHs and better connectivity, hopefully.
The answer may be in our Supported WiFi modes and security standards for Sonos products help page and in your router’s settings. Please ensure your router is conforming to our network requirements as an alternative to using SonosNet. It may help to reboot the router by switching it off for at least 30 seconds. Please reboot the second AP afterwards.
I don’t actually see any recent playback errors (1 three days ago, and others 22 days ago) - if you see lost packets but don’t hear any playback issues, I wouldn't worry too much about them.
I hope this helps.
Thanks @Corry P , and it begs lots more questions, however I’m not sure polluting this thread is the right place? Happy for you to move this to its own thread?