2020-10-01, 20:31
Dear everybody,
I've been using iPXE for a couple of months to PXE-boot various UEFI-only PCs on the LAN in our HW service workshop. iPXE is pretty cool for that job.
There's a quirk that I've been watching for some time: on most motherboards that managed to boot, the boot process would stall at "iPXE initializing devices" for about half a minute. Today I've found some time to take a dive under the hood, and owing to iPXE's excellent debugging capabilities, I've been able to narrow down the problem to the following place:
In file src/crypto/drbg.c, in function drbg_instantiate(), there's a call to get_entropy_input(), and it's this call that takes 25+ seconds to return.
This can be traced further to src/include/ipxe/entropy.h, a header file that defines the inline function get_entropy_input(). I haven't inspected all the guts yet, but chances are that this ultimately calls a function called get_entropy_input_tmp(), defined in src/crypto/entropy.c.
The point of this whole show appears to be: to seed a random number generator with enough initial entropy. And for some reason, it takes a hideous amount of time.
I'm wondering how many people who complain about the boot process freezing at "iPXE initializing devices" are just not patient enough to wait 30 seconds to see if the machine finally boots. Without debugging compiled in, the last message on the screen refers to "devices", so you would naturally take that to mean network devices and end up blaming the NIC on your motherboard.
That 30-second pause is a disgrace.
What's this crypto stuff useful for, anyway? UEFI? Or some networking? I don't need HTTPS. Does NFS need crypto? Would the problem be solved if I just left out the config options that pull in crypto support?
For the record, for people coming after me, I'd like to mention the debugging options that led me to this conclusion. As I also need to embed a tiny script in the bootable iPXE binary (to load the menu script from a separate file), I have created the following script in /tftpboot/ipxe/embed_script.sh:
Code:
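#!/bin/sh
# Rebuild ipxe.efi with the embedded script and debug output enabled,
# then copy the result back to the directory this script was run from.
CWDIR=$(pwd)
cd /usr/src/ipxe/src || exit 1
# Remove the old binary so a failed build cannot leave a stale one behind.
rm -f bin-x86_64-efi/ipxe.efi
make bin-x86_64-efi/ipxe.efi EMBED="$CWDIR/embed.ipxe" DEBUG=init:3,rbg,drbg || exit 1
cp bin-x86_64-efi/ipxe.efi "$CWDIR/"
cd "$CWDIR" || exit 1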
And my embed.ipxe looks like this:
Code:
#!ipxe
dhcp
chain tftp://192.168.100.200/ipxe/menu.ipxe
I've actually added a few more DBGC debug messages to crypto/drbg.c to see which call takes time to return.
crypto/entropy.c does not seem to contain any DBGC messages, so I did not bother to investigate further... but I can certainly try if desired. The DBGC macro declared in compiler.h is likely available in entropy.c as well.
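For anyone wanting to repeat the exercise, here is a minimal standalone sketch of that kind of timing instrumentation: take a timestamp before and after the suspect call and print the difference. It is not the actual patch to drbg.c. The names slow_call_stub(), elapsed_ms() and timed_call() are made up for illustration, and it uses plain POSIX clock_gettime() and fprintf() so that it compiles outside the iPXE tree. Inside iPXE the print would presumably go through DBGC and the timestamp would come from iPXE's own timer code; that mapping is an assumption.

```c
#define _POSIX_C_SOURCE 199309L
#include <stdio.h>
#include <time.h>

/* Hypothetical stand-in for the slow call. In the case described above
 * this would be get_entropy_input() inside drbg_instantiate(). Here it
 * just sleeps for 50 ms so there is something to measure. */
static int slow_call_stub(void) {
    struct timespec delay = { .tv_sec = 0, .tv_nsec = 50L * 1000 * 1000 };
    nanosleep(&delay, NULL);
    return 0;
}

/* Approximate number of milliseconds between two monotonic timestamps. */
static long elapsed_ms(const struct timespec *start,
                       const struct timespec *end) {
    return (long)(end->tv_sec - start->tv_sec) * 1000L
         + (end->tv_nsec - start->tv_nsec) / 1000000L;
}

/* Call fn, report how long it took on stderr, return the elapsed ms. */
static long timed_call(const char *label, int (*fn)(void)) {
    struct timespec start, end;
    clock_gettime(CLOCK_MONOTONIC, &start);
    int rc = fn();
    clock_gettime(CLOCK_MONOTONIC, &end);
    long ms = elapsed_ms(&start, &end);
    fprintf(stderr, "%s returned %d after %ld ms\n", label, rc, ms);
    return ms;
}
```

Calling timed_call("slow_call_stub", slow_call_stub) from a main() prints one line per wrapped call. Wrapping each candidate call in turn is how a stall can be narrowed down to a single function.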
Admittedly, most of the machines that I've ever tried to boot via iPXE+UEFI were Atom based. Atom is just such a popular platform for industrial PC applications, and the latest generations are not nearly as sluggish as their 45nm/32nm predecessors. Atom from Bay Trail onwards is my favourite CPU family, and apparently I'm not alone :-)
Thanks for providing iPXE free of charge. If there's something I can do to debug this problem, please let me know.
Frank