iPXE discussion forum

iSCSI boot shows error at boot time using OpenELEC
Hello there,

I posted this already in the OpenELEC forums but nobody seems to know anything about this issue.

I recently installed OpenELEC 3.1.1 testing on an iSCSI target (Synology Diskstation 411slim) which is connected via Gigabit ethernet to a TP-Link TL-WR1043ND. OpenELEC runs on an Asus E45M1-M PRO, also connected via Gigabit ethernet to the TP-Link. The TP-Link runs Gargoyle firmware v1.5.10. Its WLAN is bridged to an AVM FRITZ!Box 7330 which acts as gateway to the Internet.

Installation of OpenELEC was fine with the help from http://wiki.openelec.tv/index.php?title=...ot_-_iSCSI

Everything went fine until the database had been created (about 500 MB). Since then, I very often get this error at boot time:

Error in mount_flash: iscsistart: Unable to auto connect
System halted.

I recently updated to OpenELEC 3.1.2, but nothing changed. In 3 of 4 tries, I get the error at boot time and I have to reset the system. With a little bit of luck, the system then boots normally and everything is fine.

I am using iPXE "The Hard and Techy Way": http://wiki.openelec.tv/index.php?title=...P.C2.A0.21

That's my start script to boot OpenELEC:

echo Booting Openelec from iSCSI for ${initiator-iqn}
set root-path ${base-iscsi}:DiskStation
sanboot ${root-path} || goto retry_sanboot

Every idea is welcome!

Thank you very much!
Your config seems fine. What is the value of ${base-iscsi}? My guess is that the OpenELEC initrd is having some issues mounting the iSCSI target.

I'd suggest you perform a packet trace and see how the iSCSI communication is happening. It should show whether the client or the server is the one having the issue.

I do prefer to use NFS or HTTP for booting the kernel+initrd and use NFS for mounting /storage. It's been working without issues since 1.99. You seem to already be using my menu example, and the required kernel cmdline for NFS-based /storage is already documented there under the :openelec label.
Also, whoever wrote the OpenELEC wiki page http://wiki.openelec.tv/index.php?title=...P.C2.A0.21 should remove the reference to undionly.usb. It can't possibly work; you should use ipxe.usb instead. UNDI is a driver that drives the network card through the existing PXE stack. If you boot from USB, no network card has already been brought up by a PXE stack, so it won't work. See http://networkboot.org/fundamentals for more information about the BIOS boot process (there is a link to an UNDI Wikipedia article there).
Missing values:

set base-iqn iqn.2000-01.com.synology
set base-iscsi iscsi:DiskStation::::${base-iqn}
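
For anyone following along, here is a minimal sketch of how these values combine with the start script from the first post. The field layout in the comments is the general iSCSI SAN URI form that iPXE accepts. The final echo is only there for inspection and is not part of the original script.

```
#!ipxe
# Values taken from the posts in this thread
set base-iqn iqn.2000-01.com.synology
set base-iscsi iscsi:DiskStation::::${base-iqn}
set root-path ${base-iscsi}:DiskStation

# General form: iscsi:<server>:<protocol>:<port>:<LUN>:<target IQN>
# Empty fields fall back to their defaults.
# The first "DiskStation" is the server name; the trailing ":DiskStation"
# is the last part of the target IQN.
# This should expand to:
#   iscsi:DiskStation::::iqn.2000-01.com.synology:DiskStation
echo root-path is ${root-path}
```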

Over the next few days I'll try to do the packet trace.

Yes, I got your example menu structure; these are really good tips for understanding the whole thing. Thanks a lot for this one!

You mentioned the undionly.usb thing: I didn't use this; that's probably the "Lazy way" which advises to use USB sticks or similar to boot.
I notice you're just using a short DNS name (DiskStation) instead of an IP address or a FQDN. Have you tried with just the IP of the iSCSI server? Does it still fail? Depending on your DNS/DHCP setup, this _could_ be the issue.
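
A minimal sketch of what the IP-based variant could look like, assuming the same target name as in the posts above. The address 192.0.2.10 and the setting name iscsi-server are placeholders made up for illustration; substitute the real address of the iSCSI server.

```
#!ipxe
# 192.0.2.10 is a placeholder address, not taken from this thread
set iscsi-server 192.0.2.10
set base-iqn iqn.2000-01.com.synology
set base-iscsi iscsi:${iscsi-server}::::${base-iqn}
set root-path ${base-iscsi}:DiskStation

# Drop to the iPXE shell on failure so the error stays visible
sanboot ${root-path} || shell
```

With an address in the URI, iPXE does not need to resolve the server name through DNS before connecting to the target.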
Didn't get the chance to do the packet trace yet.

I changed every config from short DNS name to IP; I even recompiled iPXE with a new embedded script to use only IP addresses. Same result.

I tried a different NIC today: an Intel Pro 1000 GT PCI card. That's really weird now. With this NIC, I just get these messages after the iPXE boot menu:

Registered SAN device 0x80
Booting from SAN device 0x80

Here the whole thing hangs. That's it. I really thought the Intel NIC would get me closer to the goal.

After switching back to the internal Realtek NIC, I connected the HTPC directly to the Synology DiskStation to bypass the TP-Link router. Well, this made the whole thing a lot smoother. Booting is much more successful this way! Now I get the boot errors in just 1 of 4 tries.

So I think it _has_ something to do with the combination of the NIC and the switch between the NIC and the iSCSI server. Unfortunately, I don't have any way to try another NIC or switch.
You might want to verify that your NIC hardware is working properly. You can do a complete loopback test by going through the steps detailed at http://ipxe.org/dev/driver and report back if your cards work properly. If your cards check out properly you might have an issue with your iSCSI server and how it communicates. Not sure which software the Synology uses, but it might have some odd behavior iPXE gets confused by. Have you been able to iSCSI-boot anything else from that iSCSI server?
Tried to install Windows 7 x64 on the iSCSI target. I am getting the same digital signature error posted here:


The onboard NIC is the same: Realtek 8111E PCIe Gigabit LAN controller

I will do the driver test over the next few days.
Did the loop test and sometimes it gives me the following error for either one or both NICs after the "Link:up" message:

[RXE: 1 x "Operation not supported (http://ipxe.org/3c1be003)"]

The packet counter displayed by lotest increases steadily and rapidly.
(2013-09-10 17:24) tom_major wrote: Did the loop test and sometimes it gives me the following error for either one or both nics after the "Link:up" message:

[RXE: 1 x "Operation not supported (http://ipxe.org/3c1be003)"]

The packet counter displayed by lotest increases steadily and rapidly.

As long as the counter increases steadily in both directions (both receive and transmit) everything is supposed to be okay. To make sure you have no issues, you should also run the high-MTU and high-bandwidth tests described at the same location; they can sometimes give more clues.
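
As a rough sketch of the loopback test in both directions, assuming the two ports are cabled directly together, that iPXE sees them as net0 and net1, and that the iPXE build has the lotest command enabled (these interface names are assumptions, not taken from the thread):

```
#!ipxe
# Open both interfaces before testing
ifopen net0
ifopen net1

# Transmit from net0 and receive on net1; the test runs until interrupted
lotest net0 net1

# Then test the opposite direction
lotest net1 net0

# Show per-interface statistics, including any recorded errors
ifstat net0
ifstat net1
```

Running both directions matters because a fault may affect only the transmit or only the receive datapath of one card.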
High MTU test stops after a very short time when testing the transmit datapath. Sometimes it gives the following message:

Received spurious packet type 8808

Testing the receive datapath doesn't stop; the packet counter displayed by lotest increases steadily and rapidly.

Do you mean the large file transfer speed when you mention the high bandwidth test? I did the test, too, and it results in download times for the 512 MB image between 21.9 and 29.5 seconds.
A 512 MB image should take approx. 5 seconds on gigabit Ethernet (unless the HTTP server fetches from the network too, in which case timings are usually doubled), so you have an issue of some kind here. Did you do the tests with a direct cable between the two cards, or was there a switch or hub in between? A switch could cause other packets to arrive on the port that shouldn't be there, confusing lotest.

I think this issue is most likely better sent to the developer mailing list. There _could be_ an error in the Realtek driver, but I'm not knowledgeable enough to figure it out. I would suggest you create a table of all the tests, mark each as pass/fail, and include information about your test setup, what cards you're using and anything else you can think of that would be required to reproduce your failing tests.

Just forgot to ask, are you using ipxe.pxe or undionly.kpxe? If you're using undionly.kpxe some underlying PXE drivers do not implement what's needed for high MTU to work properly, and you might get failures. If so, try ipxe.pxe and see if it gives you the same problem. Also try to perform the tests from power-off to ensure warm reboot doesn't cause the NIC's EEPROM to be left in a funky state.
The lotests were done with a direct connection between two NICs in the same machine.

The high bandwidth test was done with a TP-Link TL-WR1043ND between the affected NIC and the web server. The 512 MB test file is stored locally on the web server (Synology DiskStation).

For the tests I used ipxe.usb booted from a USB stick. When booting OpenELEC I use undionly.kpxe booted over PXE, chainloading your menu script and then booting via iSCSI.
If you try to download the 512 MB image using e.g. wget from Linux, is it downloaded at close to wire speed (~ 5s on gigabit)? This is just to confirm that your NAS is capable of delivering full bandwidth on drivers that are expected to work properly. If it does not yield full speed, iPXE will obviously also have issues (maybe more so, as iPXE uses as few features as it can get away with, compared to full-featured drivers in normal operating systems).