[rescue] A tale of two Sun A5000's
Mr Ian Primus
ian_primus at yahoo.com
Tue Mar 30 20:47:21 CDT 2010
Well, I've been tinkering with this Sun E4500 (running Linux), as well as a couple of A5000 disk arrays. The A5000s are both the 14-slot versions. I had one of the arrays working OK, but with occasional I/O errors that I attributed to flaky old 9 gig disks. Well, I got some shiny new used disks in the mail, removed all the old 9s and inserted one new disk. It throws some errors: buffer I/O errors and command aborts. I low-level format the disk with sg_format, partition it, and go to mkfs the newly created partition: command aborts, and the bus hangs. Try again, same thing. Try another disk, same thing.
Enter the other A5000. This one was dirty and beat up, and had a faulty display module, one missing power supply, and three bad fans. Replace the display module with a used one from eBay and it comes back to life. Throw my disks in that, connect it up... works perfectly (amid the din of worn-out fan bearings). So, now I know it's not the disks.
So far, I've swapped both my I/O boards between the arrays, as well as the GBIC. Testing with only one disk, in front slot 0, everything always works perfectly on the "beat up" array, but fails with command aborts and bus resets on the "nice" array - even when I keep the same disk, I/O boards and GBIC in the picture.
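The swap-and-test elimination described here can be written down as a small sketch. The part labels (such as "chassis:nice" and "ioboards:set-A") are illustrative names made up for this example, not identifiers from any Sun tool, and the two trials simply restate the results reported above.

```python
# Each trial records which parts were in use and whether I/O succeeded.
# All part labels are hypothetical names chosen for illustration.
trials = [
    ({"chassis:nice", "disk:new-1", "ioboards:set-A", "gbic:1"}, False),
    ({"chassis:beat-up", "disk:new-1", "ioboards:set-A", "gbic:1"}, True),
]


def suspects(trials):
    """Return parts present in every failing trial and in no passing trial."""
    failing = [parts for parts, ok in trials if not ok]
    passing = [parts for parts, ok in trials if ok]
    if not failing:
        return set()
    common = set.intersection(*failing)   # parts shared by all failures
    cleared = set().union(*passing)       # parts seen working somewhere
    return common - cleared


print(sorted(suspects(trials)))
```

With these two trials the only part left standing is the "nice" enclosure itself, which here stands for whatever stays with that chassis when the disk, I/O boards and GBIC are moved out of it.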
Paging through the configuration menus, both of them seem to be configured the same way. The only difference I can see between the two is that the working array runs firmware version 1.05 while the faulty array runs version 1.07.
So, what is there left to swap? The only thing I can think of is a firmware bug or something - but where does the firmware live on this beast? How can I flash it? Or do I have to change the interconnect module in the middle of the thing?