ZP100 not booting


Hi,

I've recently acquired a ZP100; however, it looks to be stuck in the boot process, with a single white light flashing endlessly.

It doesn't appear to respond to holding the mute button on bootup. On powerup all of the network port lights flash briefly, and the connection lights are solid if it is wired to a bridge via Ethernet.

I know this is an old product and I know it is my fault for purchasing it, but are there any other reset tricks, or a way to update the firmware, to get out of this loop?


Thanks everyone for your posts. Using what is here and a Bus Pirate, I've been able to recover two ZP100s that were loading the diagnostic firmware.

I have a third that boots past the serial connection and gets an IP address from my router. However, the white light stays blinking and it never gets to the point where I can add it to the rest of my Sonos system.

When booting, even though it grabs an IP and responds to pings, it doesn't serve up the web pages with status, etc.

A factory reset gets it blinking orange, but there it stays.

Is it possible to have it revert back to the diagnostic firmware so I can reflash it? Any other suggestions?

Thanks
You can't load an image with a Bus Pirate again, because without the diagnostic firmware you can't get to the console. Some have copied and loaded images by JTAG, but I haven't gotten that adventurous yet.

I had a unit that acted similarly to yours, except I couldn't find what IP address it was coming up with to try and ping it. It also didn't have the diagnostic firmware on it when I received it, and I couldn't add it to my Sonos network. I was able to replace the Ethernet IC (RTL8139CL+) and get it functioning again.
Guys I got everything working but I need help with a basic networking question:

I have the ZP directly connected to my computer. When I change my local IP to be the pinged address of 169.254.2.2, what subnet do I specify? Isn't 169.254.1.1 a different subnet?


Also, for others using this great thread: the wire configuration used when connecting the Bus Pirate varies with the type of probes you use.

The Sparkfun kit uses the pinout described by sdekock in post #56.

The Seeed Studio and Adafruit probe kits use:
Pin 1: orange
Pin 2: grey
Pin 3: black
Pin 4: brown

More info can be found here
A subnet mask of 255.255.0.0 puts both addresses (169.254.2.2 and 169.254.1.1) on the same subnet.
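If you want to double-check a mask before typing it into your network settings, here is a minimal sketch using Python's standard ipaddress module. The two addresses are the ones from the question; the helper name same_subnet is just made up for this example.

```python
import ipaddress

# Addresses taken from the question above.
computer = ipaddress.ip_address("169.254.2.2")
player = ipaddress.ip_address("169.254.1.1")


def same_subnet(a, b, netmask):
    """Return True if `b` lies in the network defined by `a` and `netmask`."""
    # strict=False lets us pass a host address instead of the network address.
    network = ipaddress.ip_network(f"{a}/{netmask}", strict=False)
    return b in network


print(same_subnet(computer, player, "255.255.0.0"))    # True
print(same_subnet(computer, player, "255.255.255.0"))  # False
```

So with 255.255.0.0 the computer and the ZP can talk directly, while a 255.255.255.0 mask would split them into separate networks.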
So has anyone had any luck reviving a ZP100 with no audio out? I've tried with a couple of these with no luck. Both originally had blown amps (the VCC legs were literally burnt off the ICs). I've replaced the amps and the audio controller IC, but no luck. They appear to play but produce no sound. I tried the line in as well, but that didn't work either. If no one has any other ideas, the TI DSP IC is my next try.

@tonesam Check the 0 ohm resistor R2133 next to one of the large capacitors. I have had a few units with exactly your symptoms, and on all of them R2133 was open circuit. It appears that the amp chip operates in a bridged mode, linking some of the amp outputs (config pin 24 of the amp chip). When the zero ohm resistor blows, the amps switch back to two separate amps. The linked driver pins then cause a shorted output, draw lots of current, and burn out the legs on the PSU. I have not tried a fix yet, so I'm not sure what causes the zero ohm resistor to blow.
Hi,

Can somebody please post a picture showing where the UART port is on the logic board?

Thanks in advance.
Hi,
The UART port is on the top logic board, near the wireless card on the front side. It has 4 holes:
1: 3.3V
2: RX
3: TX
4: GND
I've read through 99% of this thread and it really helped get me on the right track.
I was able to avoid dealing with the UART / serial connection and create a script right on my ZP100 via a web browser.
I assigned my computer a static IP of 169.254.1.2 and plugged it directly into one of the ZP100's built-in Ethernet ports.
I followed several of the instructions here to get the appropriate update file and saved it as "fw.upd", substituting my ZP100's serial, sonosid and householdid in the following base URL:
http://update-firmware.sonos.com/firmware/Gold/28.1-83251-v5.2-pcyakr-RC4/28.1-83040-1-1.upd?cmaj=27&cmin=2&cbld=80271&subm=3&rev=2&reg=1&serial=XXXXX&sonosid=XXXXX&householdid=XXXXXX
The easiest way to start a web server to serve the file was to let PHP serve content from the current directory with the following: sudo php -S 0.0.0.0:80
Here are the URLs I hit to create and run my script :)


http://169.254.1.1:1400/diag/cgi-bin/bin/echo touch /var/run/stopanacapa > /tmp/run.sh
http://169.254.1.1:1400/diag/cgi-bin/bin/echo ps "|" grep anac "|" awk "{print %5C$1}" "|" xargs kill -9 >> /tmp/run.sh
http://169.254.1.1:1400/diag/cgi-bin/bin/echo upgrade -fH http://169.254.1.2/fw.upd ">>" /jffs/log.txt >> /tmp/run.sh
http://169.254.1.1:1400/diag/cgi-bin/bin/echo rm /var/run/stopanacapa >> /tmp/run.sh
http://169.254.1.1:1400/diag/cgi-bin/bin/chmod 755 /tmp/run.sh
http://169.254.1.1:1400/diag/cgi-bin/tmp/run.sh


I assume I got lucky and my ZP100 came with the diagnostics firmware. I could run any command through "http://169.254.1.1:1400/diag/cgi-bin/" by appending the full path of the desired binary and passing arguments as you normally would on the Linux CLI. A few characters had to be double-quoted or written as hex values.

After running it, my ZP100 lit up, I hit the magic combo to connect to the Controller, and it was added to the existing setup within seconds!

Thanks to everyone contributing, it really helped!
Guys,

I was able to download my .upd file and have connected the zone player 100 to a Bus Pirate, which I also got into UART mode. I am confused as to what the next step should be and how to upload the downloaded firmware to the zone player. Can someone please help me out?

Thank you.

how to upload the downloaded firmware to the zone player

You should be able to pull it from the ZP itself via the terminal, since you have the Bus Pirate connected.
If you are able to get the console / terminal, do the following:
0. Start a web server on your computer so that you can "serve" the downloaded firmware file via HTTP. I've done it with PHP, and so can you if you have PHP installed: php -S 0.0.0.0:8080
1. Stop the anacapa process:
code:
$ touch /var/run/stopanacapa
# this prevents it from starting
code:
$ ps | grep anac | awk '{print $1}' | xargs kill -9
# this kills all anacapa processes
2. Run the upgrade tool:
code:
$ upgrade -fH http://169.254.1.2:8080/fw.upd
# make sure to use the correct IP/port and file name instead of mine :)
3. Allow anacapa to start (probably unnecessary):
code:
$ rm /var/run/stopanacapa
anapsix,

thank you so much for the detailed info.
I have a couple of other questions; I hope you won't mind answering.

1) Do I run PHP after I have put UART in bridge mode?
2) I got PHP by setting it up through the Windows programs "additional features" option. Where do I run the commands you mentioned? In the browser? Sorry for asking; this is all totally new to me, and I hope you won't mind helping.

Thanks again.
I have not had any success connecting to my ZP80 unit through the J1000 connection and was hoping someone could help.

I am getting no output from the board, but can connect to the board via the IP address and 1400 port and view the log. I have 2 units and have tried swapping the top board out, with the same results as noted below.

Using this USB to serial unit: http://www.amazon.com/gp/product/B011CDF3W8

The VCC pin is connected to pin 1, TX to pin 2, RX to pin 3, GND to pin 4.
Connecting with "screen /dev/cu.usbserial-AI02SZYS" succeeds, but no output appears when the ZP80 is plugged in.
I've also tried picocom using "picocom -b 9600 /dev/cu.usbserial-AI02SZYS".
I'm using a Mac running OS X 10.10.5, and it shows the device in the system report.
I've tried reversing the pinout, but the screen session terminates when power is supplied to the ZP80.

In the 2nd pic, you will see a set of green, red, black and white wires. Those are just being used to wedge the connector pins into the holes, since I wasn't getting a solid connection using the standard pins on the connection cable.
Sorry for such a delay in responding..

1) Do i run PHP after i have put UART in bridge mode?
2) php i got was by setting up in the windows program, additional features. Where will i run the above commands that you have mentioned? Is it in the browser

I start PHP in server mode only to serve files (the firmware update file), so that the upgrade program I run on the ZP100 can fetch it from somewhere.
In my case, it gets it from the directory served by PHP.
You can use Apache, Nginx or any other web server, even IIS if you are on Windows.
You should start your web server after you find a way to control the ZP, either through the UART terminal or, like I did, via the web browser hack.
You can test whether my way works for you by connecting to the ZP100's built-in switch, assigning your computer the IP 169.254.1.2 and pointing your browser to "http://169.254.1.1:1400/diag/cgi-bin/bin/echo yes we can".
If you see a page with "yes we can", you can use my method.
If you are able to get a terminal console with the UART method, that's just as good.
Makes sense?
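If you have Python 3 installed rather than PHP, the same "serve one folder over HTTP" job can be sketched with the standard library. This is only an illustration, not something tested against a ZP100: the function name make_server and the default port 8080 are my own choices, and the folder is assumed to be wherever you saved fw.upd.

```python
import functools
import http.server


def make_server(directory, host="0.0.0.0", port=8080):
    """Build an HTTP server that serves the files found in `directory`."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    return http.server.ThreadingHTTPServer((host, port), handler)


# Usage (blocks until you press Ctrl+C):
#     make_server(".").serve_forever()
```

Run it from the folder that holds the firmware file, then point the upgrade command at your computer's IP and the port you picked, just as with the PHP server.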
Hi everyone, I recently bought a second hand ZP100 and it's displaying the exact same issues as enton is describing. When I first got the unit home I added it to my existing system (without factory reset) and all seemed OK for a while. I then noticed that some of my other units were dropping out randomly and then the ZP100 disappeared so I disconnected it and tried to do a factory reset. This resulted in a continuous orange/white flashing light that never turns green. If I power on normally I just get a flashing white light that never turns green. With the unit plugged in via ethernet to my router I can see its IP address but can't access any of the status web pages etc. Any suggestions about where to start? Or is it a lost cause?!
To all the contributors, thank you so much for all the trial and error and documenting your solutions. I have a ZP100 with the same symptoms as most others with 3.2-29243-diag firmware. I can access it via the default IP 169.254.1.1:1400/status, etc. I have ordered a USB to UART card but in the meantime I finally finished reading the last few entries of this thread and found the URL solution. Yeah, it sure seemed to me that there ought to be a way.... BTW, I am not proficient in Linux.

@anapsix
I have the ZP100 connected to a local router configured for 169.254.x.x with internet access. My laptop is assigned 169.254.1.2. I am using XAMPP under Win7 as a web server on my laptop. Since I'm not a Linux guy, I don't understand your comment about the PHP server and running the command sudo php -S 0.0.0.0:80. I did, however, place the fw.upd file in the C:\xampp\apache\ directory, in the C:\xampp directory, and also in the root C:\

I can run "http://169.254.1.1:1400/diag/cgi-bin/bin/echo yes we can" and indeed see the response "yes we can".

As I run each of the URL scripts you provided above, I don't see any browser feedback except for the very last url. The response is:
awk: cmd. line:1: Unexpected end of string
wget: server returned error 404: HTTP/1.1 404 Not Found
Read failure 0
WGET exited with 1
Upgrade failed: (11) upgrade file download failed
pull_upgrade failed

When I look at the apache logs, it shows:
169.254.1.1 - - [27/Apr/2016:17:38:09 -0500] "GET /fw.upd HTTP/1.1" 404 1056 "-" "Wget"

Should I be seeing some sort of echo response from the first 5 url scripts?

Any idea what my error means? Is my web server just not finding the file or is it a format issue?
Sounds like your ZP100 can reach your machine, but the file is in the wrong folder. I don't recall how XAMPP is set up, but I think you should have a folder called "htdocs" somewhere under C:\xampp; you should put the file in there. Some web servers (IIS, for example) also need a MIME type configured for specific suffixes in order to serve the file, but I don't think Apache cares about that.
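A quick way to rule out the 404 before involving the ZP100 at all is to request the file yourself and look at the status code. Below is a rough sketch with Python's standard library; the helper name http_status is made up, and the example URL in the comment is simply the one used earlier in this thread.

```python
import urllib.error
import urllib.request


def http_status(url, timeout=5):
    """Return the HTTP status code the server answers with for `url`."""
    # Skip any system proxy: the firmware server sits on the local link.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx answers are raised as exceptions but still carry the code.
        return error.code


# Example: http_status("http://169.254.1.2/fw.upd")
# 200 means the file is being served; 404 means wrong folder or file name.
```

If the web server is not running at all, the call raises urllib.error.URLError instead of returning a code, which is a different problem from the 404 in the log above.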
@jishi - Thanks. Placing the file under htdocs resolved the problem of finding the update file.

I am still having problems though. I still see no feedback until running the last URL script above, but this time, instead of text in the browser window, I receive a download of the run.sh file. Opening it as a text file, I see:

awk: cmd. line:1: Unexpected end of string
upgrade
version 28.1-83040
compatible with Sonos Zone Player submodels 0-16 revisions 0-4294967294 (any region)
compatible with hardware feature set 1d
My hardware feature set is 0
Upgrade supports all my features
/-\|/-\|/- several lines of gibberish -\|/-\|/-\|/-\|/-\|/-\|
Upgrade file is good
Using new partition format mode
Destination section 0 generation 11
Operating in redundant partition mode (not changing partition table)
Executing upgrade script...failed
Upgrade failed: (35)
failure reading upgrade script file
pull_upgrade failed

No error logs from the Apache server; looks like it served the file OK.
Still investigating.
Got it! I have successfully run the upgrade via the URL script! I did, however, have to modify one of the URL scripts, which I believe has an error.
@anapsix - maybe you (or one of the other folks) could verify....

In the steps anapsix outlines above for a terminal-based upgrade, the command that kills the anacapa processes is shown as:
$ ps | grep anac | awk '{print $1}' | xargs kill -9

Comparing that with the url script for that step which is:
http://169.254.1.1:1400/diag/cgi-bin/bin/echo ps "|" grep anac "|" awk "{print %5C$1}" "|" xargs kill -9 >> /tmp/run.sh

It appeared to me that the URL script was missing the single quotation marks before and after the {print $1}, so I simply added them and BAM!
Success!

So, to (hopefully) clarify, the corrected URL scripts that I used to upgrade a ZP100 running 3.2-29243-diag firmware at the default IP of 169.254.1.1 are:

http://169.254.1.1:1400/diag/cgi-bin/bin/echo touch /var/run/stopanacapa > /tmp/run.sh
http://169.254.1.1:1400/diag/cgi-bin/bin/echo ps "|" grep anac "|" awk "'{print %5C$1}'" "|" xargs kill -9 >> /tmp/run.sh
http://169.254.1.1:1400/diag/cgi-bin/bin/echo upgrade -fH http://169.254.1.2/fw.upd ">>" /jffs/log.txt >> /tmp/run.sh
http://169.254.1.1:1400/diag/cgi-bin/bin/echo rm /var/run/stopanacapa >> /tmp/run.sh
http://169.254.1.1:1400/diag/cgi-bin/bin/chmod 755 /tmp/run.sh
http://169.254.1.1:1400/diag/cgi-bin/tmp/run.sh

Thanks to the group for sharing your knowledge!
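For anyone whose addresses or file name differ, here is a small sketch that prints the same six URLs with your own values filled in. It only assembles text following the pattern above, quoting included; it does not talk to the player, and the function and parameter names (diag_urls, player_ip, server, firmware) are invented for this example. The server value may carry a port, e.g. "169.254.1.2:8080".

```python
def diag_urls(player_ip="169.254.1.1", server="169.254.1.2", firmware="fw.upd"):
    """Return the six diag URLs from this thread with custom values filled in."""
    base = f"http://{player_ip}:1400/diag/cgi-bin"
    script = "/tmp/run.sh"
    return [
        f"{base}/bin/echo touch /var/run/stopanacapa > {script}",
        # Doubled braces produce literal { } around the awk program.
        f"""{base}/bin/echo ps "|" grep anac "|" awk "'{{print %5C$1}}'" "|" xargs kill -9 >> {script}""",
        f"""{base}/bin/echo upgrade -fH http://{server}/{firmware} ">>" /jffs/log.txt >> {script}""",
        f"{base}/bin/echo rm /var/run/stopanacapa >> {script}",
        f"{base}/bin/chmod 755 {script}",
        f"{base}{script}",
    ]


for url in diag_urls():
    print(url)
```

With the defaults it reproduces the list above exactly, so you can compare the output line by line before changing anything.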
As a final note for the upgrade, I used a command from way back in the thread to get the software download, and it was 28.1-83040-1-1.upd. I later followed the instructions to use a path from my current working system, substituting the proper sonosid, etc. That downloaded a file named 31.8-24090-1-16.upd. I probably should have used that file for the upgrade, but I stuck with the older 28.1 file.

After the successful upgrade, I tried to add the ZP100 to my current system and it really hosed everything up. It got to the update software part and gave an error 1101. I reset the system and tried again. As soon as I plugged in the ZP100, the rest of the Sonos system went down again. I unplugged the rest of the system and set up the ZP100 as a new system; it updated to 39.1-26010 and worked well. I then did a factory reset and tried to add it back to my original system. I plugged everything back in, and everything was acting fine with the ZP100 plugged in waiting to be added. I added it to the system and the controller updated the rest of my system to 39.1-2600. The ZP100 is working great!

One negative side effect (I think) is that since the system updated to 39.1, it is giving me some sort of error (warning): "Some Sonos players are using the wireless connection from your range extender device. You will be unable to play music in a group of rooms including such a player. To ensure playback in all grouped rooms, you will need a Sonos BOOST or a player permanently wired to your router."

I don't know if this is a result of the update to 39.1 or if it's because I powered down all the Sonos equipment and maybe the units associated differently when they restarted. I'll check with Sonos technical support later. Anyway, it's probably a subject for another thread, but I thought I would mention the issue of updating the ZP100 to 28.1 and then trying to add it to a currently updated system. It might be better to get the latest download first.
Good to know. FYI for anyone else doing this, I think it may be possible to keep serial access after the upgrade by running
"mdputil -wfF 3" before running the upgrade. YMMV, I'm not responsible if you brick the device.
hello everyone! thank you for this thread! with your help, i managed to connect my sonos cpu board to my computer through uart and did some inspections.

the history in a nutshell: my zp100 is a used unit, almost 9 or 10 years old. i've opened it multiple times to clean it or check the power supply... anyway, i never had any problem with it.
a while back it started to freeze randomly, regardless of whether it was playing music or in standby... there was no overheating...
so i decided to unplug it and let it rest for a few days, but now when i wanted to boot it up the white led just blinks endlessly. i cleaned it and checked the voltages and temperatures, no problem. PLUS the ethernet switch works on its own...

with the uart i got into the bootloader and did some tests. RAM is good, NAND has only one bad block, but it won't boot the kernel...
This is the output for the boot linux from nand:
code:

Rincon boot loader version 0.16-11080(ZP) (32M SDRAM). Press 'h' for help.
h - help
m - SDRAM test
i - print NAND device ID
n - NAND device scan
x - NAND device destructive test
y - NAND device dump first page
p - Program NAND device
b - Boot the Linux kernel from NAND device
d - Boot diagnostics from NAND device
> NAND ID is EC:75
32M NAND flash (Samsung K9F5608U0C) detected
NAND flash block 970 is bad
Section 0 is provisionally good, kernel on partition 1, generation 15
nand_load: bad page magic, page 54688
nand_load: file appears to extend past end of partition
Section 1 is no good
Attempting to boot kernel from partition 1


and there it freezes. i've never seen linux booting through uart (yet).

can i do something? if the nand is fried or empty... how in the world could it get empty or fried if it was working fine for like 9-10 years...?

thank you very much!

EDIT: i'm using uart without plugging in to mains power... am i doing it right?

EDIT2: when i don't scan my nand before booting, it sees my nand as good:
code:

SDRAM test complete
Attempting to autoboot from NAND device
NAND ID is EC:75
32M NAND flash (Samsung K9F5608U0C) detected
NAND flash block 970 is bad
Section 0 is provisionally good, kernel on partition 1, generation 15
Section 1 is provisionally good, kernel on partition 4, generation 14
Attempting to boot kernel from partition 1

but then freezes anyway.
I see something on my usb voltage meter. when it starts to boot, it uses the cpu and draws around 750-800mA. the bootloader freezes, and a few seconds later the current falls back to 500mA... it's like the cpu tried something then froze...
any idea? 😞
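Since the two logs above differ only in a couple of lines, it can help to boil each one down to the facts that matter. Here is a hedged sketch that pulls the bad blocks, the per-section verdicts and the boot partition out of pasted bootloader text. The regular expressions are written against the wording of the logs in this post only; other bootloader versions may phrase things differently, and the name summarize_boot_log is made up.

```python
import re


def summarize_boot_log(text):
    """Extract bad blocks, section verdicts and boot partition from a log."""
    bad_blocks = [int(n) for n in re.findall(r"NAND flash block (\d+) is bad", text)]
    sections = {
        int(number): verdict
        for number, verdict in re.findall(
            r"Section (\d+) is (provisionally good|no good)", text
        )
    }
    match = re.search(r"Attempting to boot kernel from partition (\d+)", text)
    boot_partition = int(match.group(1)) if match else None
    return {
        "bad_blocks": bad_blocks,
        "sections": sections,
        "boot_partition": boot_partition,
    }


# Lines copied from the first log in this post.
first_log = """\
NAND flash block 970 is bad
Section 0 is provisionally good, kernel on partition 1, generation 15
nand_load: bad page magic, page 54688
Section 1 is no good
Attempting to boot kernel from partition 1
"""

print(summarize_boot_log(first_log))
```

Running the same function on the second log makes the difference easy to spot: there section 1 is reported as provisionally good rather than no good.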
The Linux console is disabled by default in production builds of the ZP100, so not seeing anything is normal. It sounds like the kernel starts to boot and crashes. This could be due to a bunch of different things, but it sounds like the NAND is going (not uncommon on units of this age). I've occasionally had luck doing a factory reset and then trying to upgrade the unit, if you can get it to boot at all post-reset (I think this lets it mark any newly bad blocks while installing the new upgrade, but that's just a guess).

If you are really brave (and have access to the right hardware), you can try replacing the NAND chip with an equivalent of the same model. Make sure you dump the data off the old one using a flash programmer.

The ethernet switch has its own internal logic, so it will work even if the unit is hosed.
so you say i should keep trying until it might "accidentally" boot up, do a factory reset and see what happens?
no chance for hardware problems (except nand)?
sadly i don't have jtag hardware at all.

but i don't understand... it was working fine for ages... i mean, my unit was given to me by a hotel (because the hotel replaced all the audio with crestron) so this thing played music 24/7 for ages, and it also worked fine for 3 years at my place. and it just suddenly freezes and dies without any warning sign. weird.

so you say i should keep trying until it might "accidentally" boot up, do a factory reset and see what happens?

Basically. When new software is installed it gets written around bad blocks on the NAND. Block 970 is part of the JFFS portion of the chip (RW user data), which should get cleared on a factory reset.
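To relate the block and page numbers from the bootloader log to each other, a bit of arithmetic helps. The sketch below assumes the usual small-page geometry for this Samsung part (512 data bytes per page, 32 pages per block, 2048 blocks, which adds up to the 32M the bootloader reports) and assumes the bootloader counts pages and blocks in those native units. Neither assumption is confirmed anywhere in this thread, so check the datasheet before relying on it.

```python
# Assumed geometry, not confirmed in this thread.
PAGE_SIZE = 512        # data bytes per page
PAGES_PER_BLOCK = 32   # gives 16 KiB erase blocks
TOTAL_BLOCKS = 2048    # 2048 * 16 KiB = 32 MiB


def block_of_page(page):
    """Return the erase block that contains the given page number."""
    return page // PAGES_PER_BLOCK


def block_byte_range(block):
    """Return the first and last data byte offsets covered by a block."""
    start = block * PAGES_PER_BLOCK * PAGE_SIZE
    return start, start + PAGES_PER_BLOCK * PAGE_SIZE - 1


print(block_of_page(54688))    # 1709
print(block_byte_range(970))   # (15892480, 15908863)
```

Under these assumptions, page 54688 from the "bad page magic" message is the first page of block 1709, far away from bad block 970, so the two messages would be pointing at different areas of the chip.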


no chance for hardware problems (except nand)?

No idea. The things that I have had fail most commonly are NAND, RAM, and occasionally the DSP chip on the amplifier board. If the unit still won't boot after a factory reset (and the SDRAM test passes), my guess would be that the issue is the amp board. If you have another opened unit somewhere, you can try switching the computer board and see if it works then.

Another troubleshooting tool is to hook up a computer running Wireshark directly to the unit when it boots and see if it's requesting an IP address. I've had units fail between when the kernel boots (and dhcpd is initialized) and when the webserver (anacapad) is initialized.


but i don't understand... it was working fine for ages... i mean, my unit was given to me by a hotel (because the hotel replaced all the audio with crestron) so this thing played music 24/7 for ages, and it also worked fine for 3 years at my place. and it just suddenly freezes and dies without any warning sign. weird.


The thing is old. 10 years is well beyond the expected lifespan of consumer electronic components, so things eventually start to fail.


thank you for your help! now i'm trying to boot it up through uart. still the same thing happens.
i could get it to flash amber-white to start the factory reset, but it got stuck there. i waited for 10 minutes, then interrupted it...

the reason i think the linux part is defective is that it won't even boot when disconnected from the amp. or should it boot?...
i will check the amp board and try wireshark and see what happens.
i don't really want to trash this thing from one day to the next.