I'm a very new poster to the forums, but I have been reading them and using Sonos for 3 or 4 years. Having searched the forums, I have been unable to find a more detailed description of how the integration with Alexa works beyond that given in interesting posts like https://en.community.sonos.com/amazon-alexa-and-sonos-229102/sonos-alexa-and-local-nas-stored-music-6791479/index1.html



These posts describe the process along the lines of

1 - Echo hears you

2 - Echo sends voice to Alexa servers

3 - Alexa servers translate voice to text

4 - Alexa servers process the request, determining what music to play on what speakers

5 - Alexa sends request to Sonos cloud.

6 - Sonos cloud sends request to your speakers.



Steps 1 to 3 seem straightforward; it is after that point that the process is less clear.



The Sonos cloud (presumably this is a true cloud service?) cannot directly send the request to the local Sonos device(s) as the local firewall will block an incoming request, so life must be a bit more complicated.

Clearly something has to process the request, but that may be a different service depending upon the type of request, or it may be that the request is processed in parts, e.g. once Alexa has created the text it passes it to the Sonos cloud, which converts it into a command string that is then sent somewhere else. There are clearly a number of options. In all cases, though, the local Sonos device must somehow be instructed to do something.

It feels as though the request must be passed back to the Echo as a response containing a packaged command string. The Echo would then unpack the command and forward it to the specified Sonos device. That approach, though, would make it easy to deal with multiple music services, which appears not to be the case. So that is not the solution, which leaves the core questions: what happens once Alexa has converted the voice command to text, and how does the result get back to the Sonos device?



If I have failed to find the thread which answers this, I apologise, but I would be grateful to be pointed in the right direction.
I think it is more like:



1. Set up the Sonos Alexa skill - Alexa now assigns each Sonos unit its own unique device # in the Alexa app, just like each Echo speaker has its own device #.

2. Echo hears you

3. Echo sends voice to Alexa servers

4. Alexa translates voice to text

5. Alexa processes the request and determines what music to play and on which device #

6. Alexa pushes the song requested to your Sonos device (the Sonos device acting no differently than an Echo device).



That is why you can only use Amazon music services. Amazon can only push its allowable music services to your Sonos. Therefore, there is no way to play the other music sources that you have in your Sonos app (such as Apple Music, local library etc.). Alexa sees no music on your local system; it only serves music from Amazon to your device.



The Sonos speaker is then just an Echo with no microphone (unless, of course, you get the Sonos One with Alexa built in).
Thank you Chris.

You have not included the Sonos cloud in your sequence, which other posts have included. I had assumed that, as a minimum, Sonos had to keep a record of which external services are subscribed to, so that something (Alexa / AWS / Sonos) can validate that you have access to the service.



It is still the final part that puzzles me. Surely Alexa cannot push the music stream (Amazon / TuneIn / Spotify) directly to the Sonos device, because that would be blocked by the firewall. A simple test of starting a radio station from the Echo and then turning off the Echo shows that the stream to the Sonos is independent of the Echo, as the radio station keeps playing. I cannot be sure, but I think I paused and resumed the stream via the phone whilst the Echo was turned off. All this suggests to me that at the end of the voice process a command is somehow sent to the Sonos device, and then the Sonos gets on with doing what it is good at: selecting a music source and playing it very well.

Or am I completely barking up the wrong tree?
Actually, the Alexa app keeps a record of your services. You have to set them up in the Alexa app. With voice control you bypass your Sonos subscriber info. Alexa does not have access to the services in your Sonos app, only to those in the Alexa app.
A station plays to the Sonos the same way it plays to the Echo. I believe the Sonos device keeps a connection open to the Amazon server the way an Echo does and waits to be pinged.
Hi guys, if it helps, I have a thread here where I outline some of the basic process on how the integration works.



In short, your rundown is correct. The Sonos cloud does send a command to the speaker which in essence looks like "Alexa says go play this from here". And the player then goes and does it. This goes through secure traffic and what I've described is a very simplistic view of it. But the firewall at your router's level wouldn't be blocking it unless you were blocking traffic from Sonos servers to your devices.
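To make the "Alexa says go play this from here" idea concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the message fields, the `build_play_command` helper, the `Player` class and the example URL are hypothetical and are not taken from any Sonos or Amazon API. The point it tries to show is that the command only describes what to play; the player opens the stream itself, which would explain why playback continues after the Echo is switched off.

```python
import json


def build_play_command(player_id, service, stream_url):
    """Hypothetical cloud-side helper: describe what to play, not the audio itself."""
    return json.dumps({
        "player": player_id,
        "action": "play",
        "service": service,
        "url": stream_url,
    })


class Player:
    """Hypothetical speaker: receives a small command and fetches the stream on its own."""

    def __init__(self, player_id):
        self.player_id = player_id
        self.now_playing = None

    def handle(self, raw_command):
        command = json.loads(raw_command)
        if command["player"] != self.player_id:
            return False  # command addressed to a different player
        if command["action"] == "play":
            # A real player would open the stream from the service here.
            # Whoever sent the command is no longer involved after this point.
            self.now_playing = (command["service"], command["url"])
        return True


if __name__ == "__main__":
    kitchen = Player("kitchen")
    kitchen.handle(build_play_command("kitchen", "radio", "https://example.com/stream"))
    print(kitchen.now_playing)
```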
If you're super interested in the firewall side of it, I believe it works like this -

1) The Sonos device opens up a connection to Amazon Alexa service

2) Whenever a device requests information, the firewall waits for a response from the server that the request was sent to.

3) The Amazon service then sends information back to your firewall (NAT), which passes it on to the specific device. Even after information has been sent on the connection, it remains open and more can be sent. Since it is still open, Alexa can send information back at any time. (The protocol also supports saying "close this, I'm done.")

4) Now, you do want to know if the other side is still there. (Say your internet goes down.) So, to let the other side know that you're still there, you regularly send information back and forth to say "I'm still here." With Alexa, the AVS documentation asks the client to send a ping every five minutes while the connection is idle.
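The steps above can be sketched with plain sockets. This is only an illustration under stated assumptions: it uses unencrypted TCP on the loopback interface, whereas the AVS connection is HTTP/2 over TLS, and the function names and the "PING" / "PLAY" text messages are invented for the example. What it shows is the direction of the connection: the device dials out, and the cloud then pushes commands down that same connection, so nothing ever has to connect inwards through the firewall.

```python
import socket
import threading


def cloud_service(server_sock, commands):
    """Hypothetical cloud side: wait for a device to dial in, then push commands to it."""
    conn, _addr = server_sock.accept()
    with conn:
        for command in commands:
            conn.sendall((command + "\n").encode("utf-8"))
    # Leaving the with-block closes the connection ("close this, I'm done").


def device(host, port):
    """Hypothetical speaker side: open an outbound connection and read what arrives."""
    received = []
    with socket.create_connection((host, port)) as sock:
        with sock.makefile("r", encoding="utf-8") as reader:
            for line in reader:  # the loop ends when the cloud closes the connection
                received.append(line.strip())
    return received


def demo():
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]

    commands = ["PING", "PLAY radio-station", "PING"]
    cloud = threading.Thread(target=cloud_service, args=(server_sock, commands))
    cloud.start()

    received = device("127.0.0.1", port)
    cloud.join()
    server_sock.close()
    return received


if __name__ == "__main__":
    print(demo())
```

Note that only the cloud side listens on a port here; the device never does, which is the property that lets this pattern work from behind a home router.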



The Amazon Alexa connection approach is documented at https://developer.amazon.com/docs/alexa-voice-service/manage-http2-connection.html. My assumption would be that Sonos Cloud also uses a similar approach to communication.



You can see a bit more on the TCP protocol side of this at https://en.wikipedia.org/wiki/Transmission_Control_Protocol#Protocol_operation .
On the data side of this, the Amazon documentation has a good example of the normal messages passed back and forth with 'Alexa' devices at https://developer.amazon.com/docs/alexa-voice-service/audioplayer-overview.html#scenario-1-alexa-play-rock-music-from-iheartradio



Now, from the overview post on how this works, the Amazon Blog post (https://developer.amazon.com/blogs/alexa/post/865c3b71-a592-492a-89bb-4e0850e60b25/sonos-one-brings-enhanced-alexa-music-capabilities-to-customers), and the fact that the Alexa app shows two devices (one as a smart home device and one as an Alexa) for each Sonos One, I strongly suspect this process is not followed. I suspect each Sonos music player is being treated like a connected device (i.e. similar to a Nest Thermostat). So, the messages for something like "play x songs" are supposed to be sent to the Sonos Cloud, which then sends them to your Sonos device. (The parallel is asking Alexa to change the temperature on your thermostat. The Echo sends the request to Amazon Alexa, which then sends a message to Nest Cloud, which sends a message to the thermostat in your home. The thermostat then confirms that the temperature has been set, the confirmation is passed back to Amazon, and finally a message is sent to your Echo that says "I'm done.")
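The connected-device relay suspected above can be sketched as a chain of hops. This is a toy simulation, not the Smart Home Skill API: the class names (`VoiceService`, `VendorCloud`, `Device`) and the request strings are hypothetical, and real messages are structured directives rather than plain text. It only shows the order of the hops and where the confirmation travels.

```python
class Device:
    """Hypothetical device in the home (a thermostat or a speaker)."""

    def __init__(self, name):
        self.name = name
        self.state = None

    def apply(self, request):
        self.state = request
        return f"{self.name}: {request} done"


class VendorCloud:
    """Hypothetical vendor cloud (Nest Cloud or Sonos Cloud in the post's parallel)."""

    def __init__(self):
        self.devices = {}

    def register(self, device):
        self.devices[device.name] = device

    def forward(self, device_name, request, trace):
        trace.append("vendor cloud -> device")
        confirmation = self.devices[device_name].apply(request)
        trace.append("device -> vendor cloud (confirmation)")
        return confirmation


class VoiceService:
    """Hypothetical voice service that relays requests instead of talking to the device."""

    def __init__(self, vendor_cloud):
        self.vendor_cloud = vendor_cloud

    def handle(self, device_name, request):
        trace = ["echo -> voice service", "voice service -> vendor cloud"]
        confirmation = self.vendor_cloud.forward(device_name, request, trace)
        trace.append("vendor cloud -> voice service (confirmation)")
        trace.append("voice service -> echo (done)")
        return confirmation, trace


if __name__ == "__main__":
    cloud = VendorCloud()
    cloud.register(Device("living room"))
    confirmation, trace = VoiceService(cloud).handle("living room", "play radio")
    print(confirmation)
    for hop in trace:
        print(hop)
```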



I suspect a lot of the bugs with the Sonos One are due to the standard AVS messages (say, "play this stream") being sent to the Alexa device rather than to the Sonos Cloud (i.e. the player isn't being treated like a smart device). Assuming the expectation was that these messages are sent to the Sonos Cloud, these would be bugs on the Amazon side rather than the Sonos side.



More on the Alexa Smart Home API -

https://developer.amazon.com/docs/smarthome/smart-home-skill-api-message-reference.html