iPXE discussion forum

Full Version: Booting ipxe from usb causes megaraid_sas driver to fail
We have an existing setup that uses the Broadcom NICs' PXE firmware to chainload iPXE, which then loads our image (Devuan ascii/beowulf). It works as expected.

Our new setup has iPXE on a USB stick and loads the same image via IPv6/DHCPv6, but during boot it gets stuck at: Probing EDD (edd=off to disable)
If we add edd=off to the kernel boot parameters it boots successfully, but the H800 (hardware RAID card) becomes unusable and an error appears in dmesg (1).

We have tried: a newer (4.19.0-6) and an older (4.9.0-11) kernel, recompiling iPXE and the initramfs multiple times, booting iPXE from USB in UEFI mode, upgrading the BIOS and the H800 firmware, different hosts (Dell R710 (Intel) and R815 (AMD)), and multiple H800 cards.

We have run out of ideas. Could somebody point us in a direction for solving this, or tell us where to continue debugging?


(1):

(the card does appear in lspci)
root@devuan:~# lspci | grep -i raid
05:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)
06:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)

udevd hangs:

[ 243.007077] Call Trace:
[ 243.007962] [<ffffffffbb817609>] ? __schedule+0x239/0x6f0
[ 243.008821] [<ffffffffbb817af2>] ? schedule+0x32/0x80
[ 243.009642] [<ffffffffbb81ae8d>] ? schedule_timeout+0x1dd/0x380
[ 243.010426] [<ffffffffbb818531>] ? wait_for_completion+0xf1/0x130
[ 243.011209] [<ffffffffbb2a60e0>] ? wake_up_q+0x70/0x70
[ 243.011985] [<ffffffffbb2956fe>] ? flush_work+0x10e/0x1c0
[ 243.012744] [<ffffffffbb291cb0>] ? destroy_worker+0x80/0x80
[ 243.013490] [<ffffffffbb295821>] ? work_on_cpu+0x71/0x90
[ 243.014227] [<ffffffffbb2915e0>] ? __usermodehelper_disable+0x130/0x130
[ 243.014960] [<ffffffffbb584070>] ? pci_device_shutdown+0x60/0x60
[ 243.015679] [<ffffffffbb5854e3>] ? pci_device_probe+0xf3/0x160
[ 243.016387] [<ffffffffbb6853f2>] ? driver_probe_device+0xe2/0x420
[ 243.017088] [<ffffffffbb68580d>] ? __driver_attach+0xdd/0xe0
[ 243.017800] [<ffffffffbb685730>] ? driver_probe_device+0x420/0x420
[ 243.018527] [<ffffffffbb683069>] ? bus_for_each_dev+0x69/0xb0
[ 243.019266] [<ffffffffbb68476a>] ? bus_add_driver+0x16a/0x260
[ 243.020021] [<ffffffffc0588000>] ? 0xffffffffc0588000
[ 243.020770] [<ffffffffbb6860b7>] ? driver_register+0x57/0xc0
[ 243.021515] [<ffffffffc0588000>] ? 0xffffffffc0588000
[ 243.022258] [<ffffffffc05880bf>] ? megasas_init+0xbf/0x1000 [megaraid_sas]
[ 243.023016] [<ffffffffbb20218e>] ? do_one_initcall+0x4e/0x180
[ 243.023794] [<ffffffffbb3ccb96>] ? __vunmap+0x76/0xc0
[ 243.024576] [<ffffffffbb382153>] ? do_init_module+0x5b/0x1f6
[ 243.025372] [<ffffffffbb305466>] ? load_module+0x25b6/0x2ac0
[ 243.026180] [<ffffffffbb301b90>] ? __symbol_put+0x60/0x60
[ 243.027001] [<ffffffffbb305bb6>] ? SYSC_finit_module+0xc6/0xf0
[ 243.027845] [<ffffffffbb203b7d>] ? do_syscall_64+0x8d/0x100
[ 243.028688] [<ffffffffbb81c3ce>] ? entry_SYSCALL_64_after_swapgs+0x58/0xc6

...
[ 259.458677] megaraid_sas 0000:05:00.0: Failed to init firmware
...
[ 259.459989] Workqueue: events work_for_cpu_fn
[ 259.460013] 0000000000000000 ffffffffbb5353d4 ffffbcc64d703c30 0000000000000000
[ 259.460059] ffffffffbb27a83b 0000000000000032 ffffbcc64d703c88 ffff9efb8802c200
[ 259.460104] 0000000000000032 ffff9efb8802c2d4 0000000000000246 ffffffffbb27a8bf
[ 259.460150] Call Trace:
[ 259.460171] [<ffffffffbb5353d4>] ? dump_stack+0x5c/0x78
[ 259.460198] [<ffffffffbb27a83b>] ? __warn+0xcb/0xf0
[ 259.460223] [<ffffffffbb27a8bf>] ? warn_slowpath_fmt+0x5f/0x80
[ 259.460265] [<ffffffffc056c9c8>] ? megasas_free_cmds+0x48/0x70 [megaraid_sas]
[ 259.460301] [<ffffffffbb2d6fe2>] ? __free_irq+0xa2/0x280
[ 259.460327] [<ffffffffbb2d7247>] ? free_irq+0x37/0x90
[ 259.460357] [<ffffffffc0566285>] ? megasas_destroy_irqs+0x45/0x80 [megaraid_sas]
[ 259.460397] [<ffffffffc0570915>] ? megasas_probe_one+0xa05/0x1d30 [megaraid_sas]
[ 259.460434] [<ffffffffbb29a9e2>] ? __kthread_create_on_node+0x132/0x180
[ 259.460468] [<ffffffffbb2b0dd3>] ? check_preempt_wakeup+0x103/0x210
[ 259.460501] [<ffffffffbb5840b4>] ? local_pci_probe+0x44/0xa0
[ 259.460529] [<ffffffffbb2915f6>] ? work_for_cpu_fn+0x16/0x20
[ 259.460557] [<ffffffffbb29486a>] ? process_one_work+0x18a/0x430
[ 259.460587] [<ffffffffbb294cda>] ? worker_thread+0x1ca/0x490
[ 259.460615] [<ffffffffbb294b10>] ? process_one_work+0x430/0x430
[ 259.460645] [<ffffffffbb29abc9>] ? kthread+0xd9/0xf0
[ 259.460671] [<ffffffffbb29aaf0>] ? kthread_park+0x60/0x60
[ 259.460700] [<ffffffffbb81c564>] ? ret_from_fork+0x44/0x70
[ 259.460728] ---[ end trace 7cb7858013a2460a ]---

Linux devuan 4.9.0-11-amd64 #1 SMP Debian 4.9.189-3 (2019-09-02) x86_64 GNU/Linux
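
When comparing the failing USB boot against the working PXE-chainloaded boot, it may help to pull only the driver's own messages out of a saved kernel log. The following is a minimal sketch, assuming the log was saved with something like `dmesg > file`; the function name `megaraid_lines` and the file paths are made up for illustration and are not part of any tool mentioned in this thread.

```shell
#!/bin/sh
# Hypothetical helper: print only the megaraid_sas lines from a saved dmesg log.
# Usage: megaraid_lines /path/to/dmesg.txt
megaraid_lines() {
    # -F matches the driver name as a literal string; "|| true" keeps the
    # exit status zero when the driver logged nothing at all.
    grep -F 'megaraid_sas' "$1" || true
}

# Example (paths are placeholders): save the log after each kind of boot,
# then compare the filtered output of the two files.
#   dmesg > /tmp/dmesg-usb.txt
#   megaraid_lines /tmp/dmesg-usb.txt
```

Running this against a log from each boot path and diffing the results should show whether the driver gets as far as "Failed to init firmware" in both cases or only in one.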
So this issue happens when you boot via iPXE, but not when you boot the exact same kernel directly from USB?

Or what I'm trying to ask is, why is this an "iPXE related problem"?

You are always booting EFI in this case?

What are the instructions to reproduce the issue?
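
To help answer the questions above, here is a rough sketch of two checks that could be run on the booted system under each boot path. It assumes a Linux system where `/sys/firmware/efi` exists only after a UEFI boot and `/proc/cmdline` holds the kernel command line; the function names `boot_mode` and `has_edd_off` are hypothetical, and the optional arguments exist only so the checks can be tried against test directories.

```shell
#!/bin/sh
# Hypothetical helper: report whether the running kernel was started via UEFI.
# The optional argument is a root prefix (default: the real root).
boot_mode() {
    root="${1:-}"
    if [ -d "$root/sys/firmware/efi" ]; then
        echo "UEFI"
    else
        echo "BIOS"
    fi
}

# Hypothetical helper: succeed if the kernel command line contains edd=off.
# The optional argument is the file to read (default: /proc/cmdline).
has_edd_off() {
    # Split the command line into one parameter per line, then require an
    # exact whole-line match so that e.g. "edd=offset" would not count.
    tr -s ' ' '\n' < "${1:-/proc/cmdline}" | grep -qx 'edd=off'
}

# Example:
#   echo "boot mode: $(boot_mode)"
#   has_edd_off && echo "edd=off is set"
```

Posting the output of both checks alongside the iPXE script used would make it clearer whether the two boot paths really differ only in how iPXE itself was started.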