iPXE discussion forum
DHCP timeout on reboot



DHCP timeout on reboot - FreeMinded - 2016-02-18 17:10

Hi iPXE community

I'm using iPXE installed on the local hard drive and chainloaded by Grub. If I boot the computer from a powered-off state, everything works as expected: Grub loads iPXE and the DHCP request is successful (it takes only about 2 seconds or less).

Now, if I reboot the computer, Grub loads iPXE but then the DHCP request times out. I have tried doing two DHCP requests consecutively, but both time out in the same way. I have also tried two different DHCP servers; it does not make a difference.

Does anyone have an idea what the problem could be and how I could work around it?

Regards,
FreeMinded


RE: DHCP timeout on reboot - NiKiZe - 2016-02-18 17:33

Are you using ipxe.lkrn or what kind of iPXE build are you using?

I suspect you have a NIC that does not get reset properly.

What output do the following commands give you?
Code:
show busid
dhcp
ifstat
Please run them one after another from a clean start of iPXE and we will try to debug the issue.


RE: DHCP timeout on reboot - FreeMinded - 2016-02-18 17:59

Hi NiKiZe

Yes, I'm using an ipxe.lkrn from rom-o-matic.eu with a custom script, which looks like this:
Code:
#!ipxe
dhcp || goto boot_local
chain --timeout 5 http://boot.controlserver.tld/boot.ipxe || goto boot_local

:boot_local
echo Contacting Boot Control Server failed, booting from local drive...
sleep 2
sanboot --no-describe --drive 0x80 ||
echo Booting from local drive failed, rebooting machine now...
sleep 5
reboot

The commands give this output (shortened):
Code:
iPXE> show busid
net0/busid:hex = 01:80:86:15:59

iPXE> dhcp
Configuring (net0 c0:3f:d5:63:ac:5c)...... ok

iPXE> ifstat
net0: c0:3f:d5:63:ac:5c using i218v on PCI00:19.0 (open)
  [Link:up, TX:13 TXE:0 RX:66 RXE:53]
  [RXE: 27 x "Operation not supported ... (..3c3f6303)"]
  [RXE: 2 x "Operation not supported ... (..3c086003)"]
  [RXE: 24 x "The socket is not connected ... (..380f6001)"]



RE: DHCP timeout on reboot - NiKiZe - 2016-02-18 18:07

Could you post the output from dhcp and then ifstat when it doesn't work?


RE: DHCP timeout on reboot - FreeMinded - 2016-02-18 21:56

(2016-02-18 18:07)NiKiZe Wrote:  Could you post the output from dhcp and then ifstat when it doesn't work?

Code:
iPXE> dhcp
Configuring (net0 c0:3f:d5:63:ac:5c).................... Error 0x040ee119 ...

iPXE> ifstat
net0: c0:3f:d5:63:ac:5c using i218v on PCI00:19.0 (closed)
  [Link:up, TX:4 TXE:0 RX:0 RXE:0]

Edit:
I also tried the command below with the same result (I guess it does exactly the same anyway)
Code:
iPXE> ifconf --configurator dhcp net0



RE: DHCP timeout on reboot - NiKiZe - 2016-02-18 22:26

Just tried to emulate this and my result has a bit more output to it from ifstat.

My guess is that when you reboot, the NIC is put into a power-save mode which is not reset by iPXE.

Just to be sure: running dhcp multiple times makes no difference, right?

What is the last thing running when you reboot: Windows, Linux, or something else?

You could test by patching intel.c and adding INTEL_NO_PHY_RST so that the line for your NIC becomes:
Code:
PCI_ROM ( 0x8086, 0x1559, "i218v", "I218-V", INTEL_NO_PHY_RST ),



RE: DHCP timeout on reboot - FreeMinded - 2016-02-18 23:24

(2016-02-18 22:26)NiKiZe Wrote:  My guess is that when you reboot, the NIC is put into a power-save mode which is not reset by iPXE.

Just to be sure: running dhcp multiple times makes no difference, right?

Yes, it doesn't matter how many times I run the dhcp command. The result is always the same.

(2016-02-18 22:26)NiKiZe Wrote:  What is the last thing running when you reboot: Windows, Linux, or something else?

Windows or Linux, but it doesn't make a difference. A reboot from iPXE also shows the same behaviour.

(2016-02-18 22:26)NiKiZe Wrote:  You could test by patching intel.c and adding INTEL_NO_PHY_RST so that the line for your NIC becomes:
Code:
PCI_ROM ( 0x8086, 0x1559, "i218v", "I218-V", INTEL_NO_PHY_RST ),

I'll try that on the weekend and report back.

Thanks a lot for your help!


RE: DHCP timeout on reboot - FreeMinded - 2016-02-22 10:11

(2016-02-18 22:26)NiKiZe Wrote:  You could test by patching intel.c and adding INTEL_NO_PHY_RST so that the line for your NIC becomes:
Code:
PCI_ROM ( 0x8086, 0x1559, "i218v", "I218-V", INTEL_NO_PHY_RST ),

I tried it, but with the patch dhcp also times out from a cold boot!

But this was the first time I have compiled iPXE myself, so I might have missed something along the way...

What I did is:
1. Downloaded the source from git
2. Installed the required tools on my machine (Kubuntu 15.10)
3. Applied the patch
4. Ran

Code:
make bin/ipxe.lkrn

It executed with no errors and filled the bin directory with 769 files plus a folder. Among them was ipxe.lkrn which I then copied to the target machine.


RE: DHCP timeout on reboot - NiKiZe - 2016-02-22 12:39

If you run a git diff, you only see that one line changed?

Let's try some debugging then.
make bin/ipxe.lkrn DEBUG=intel

This enables debug prints in intel.c and will hopefully give something useful.


RE: DHCP timeout on reboot - FreeMinded - 2016-02-22 14:27

Thanks NiKiZe for continued support!

(2016-02-22 12:39)NiKiZe Wrote:  If you run a git diff, you only see that one line changed?

yes, I get this...

Code:
@@ -1060,7 +1060,7 @@ static struct pci_device_id intel_nics[] = {
        PCI_ROM ( 0x8086, 0x1539, "i211", "I211", 0 ),
        PCI_ROM ( 0x8086, 0x153a, "i217lm", "I217-LM", INTEL_NO_PHY_RST ),
        PCI_ROM ( 0x8086, 0x153b, "i217v", "I217-V", 0 ),
-       PCI_ROM ( 0x8086, 0x1559, "i218v", "I218-V", 0),
+       PCI_ROM ( 0x8086, 0x1559, "i218v", "I218-V", INTEL_NO_PHY_RST ),
        PCI_ROM ( 0x8086, 0x155a, "i218lm", "I218-LM", 0),
        PCI_ROM ( 0x8086, 0x157b, "i210-2", "I210", 0 ),
        PCI_ROM ( 0x8086, 0x15a0, "i218lm-2", "I218-LM", INTEL_NO_PHY_RST ),

(2016-02-22 12:39)NiKiZe Wrote:  Let's try some debugging then.
make bin/ipxe.lkrn DEBUG=intel

This enables debug prints in intel.c and will hopefully give something useful.

The result is this...
http://ibin.co/2XqZzGvg0KXP

Does this help?
What actually surprises me is that with the self-compiled ipxe.lkrn, dhcp never succeeds, not even from a cold boot. Might there be something wrong with my make config? I didn't change anything.


RE: DHCP timeout on reboot - NiKiZe - 2016-02-22 18:43

(2016-02-22 14:27)FreeMinded Wrote:  The result is this...
http://ibin.co/2XqZzGvg0KXP

Does this help?
What actually surprises me is that with the self-compiled ipxe.lkrn, dhcp never succeeds, not even from a cold boot. Might there be something wrong with my make config? I didn't change anything.

In that case it's probably due to the patch, so remove it and retry.
It would be good to see the output of both a cold boot and a warm boot.

Just run dhcp followed by ifstat, to keep as much debug data as possible on screen.


RE: DHCP timeout on reboot - FreeMinded - 2016-02-25 11:13

(2016-02-22 18:43)NiKiZe Wrote:  In that case it's probably due to the patch, so remove it and retry.
It would be good to see the output of both a cold boot and a warm boot.

Just run dhcp followed by ifstat, to keep as much debug data as possible on screen.

Ohh, I somehow missed your answer...

Here are the outputs from a cold boot...
http://ibin.co/2YAgRTysilkM

...and from reboot
http://ibin.co/2YAgkDQqr9hG

This time I used a build from ROM-o-matic with debug for intel enabled.


RE: DHCP timeout on reboot - SebastianRoth - 2016-02-25 13:15

At first I thought the "MAC reset" after dhcp (on reboot) might be causing the issue. But as I understand the code (still kind of a newbie!) this comes from iPXE closing the devices as DHCP failed. So I am wondering if there is a difference in the DHCP sequence.

Maybe try adding debug for DHCP as well to see if there is any difference...
Code:
... DEBUG=intel,dhcp ...



RE: DHCP timeout on reboot - FreeMinded - 2016-02-25 21:56

(2016-02-25 13:15)SebastianRoth Wrote:  At first I thought the "MAC reset" after dhcp (on reboot) might be causing the issue. But as I understand the code (still kind of a newbie!) this comes from iPXE closing the devices as DHCP failed. So I am wondering if there is a difference in the DHCP sequence.

Maybe try adding debug for DHCP as well to see if there is any difference...
Code:
... DEBUG=intel,dhcp ...

Hi Sebastian

Thanks for joining the thread! This thing is starting to bend my mind but I think I finally found some interesting facts.

The DHCP timeout does NOT appear when rebooting with the iPXE command reboot, but it does appear when rebooting from Windows or Ubuntu. So something must be different when the reboot is invoked by iPXE.

Debug output DHCP successful
http://ibin.co/2YDoPoFOkJOS

Debug output DHCP timing out
http://ibin.co/2YDosDmWFdXD

I hope this helps to find a solution, fix, or workaround.


RE: DHCP timeout on reboot - SebastianRoth - 2016-02-29 15:04

It actually looks like it's sending out DHCP requests but is unable to properly receive the answers! That is consistent with what we saw in this picture you posted: http://ibin.co/2YAgkDQqr9hG (notice ifstat RX:0). Do you actually see the DHCP discovery packets arriving at your DHCP server (Wireshark) and being answered? I'd guess so, but please make sure.
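
The "sent but nothing received" pattern can also be checked mechanically on an ifstat counter line. The following is only a rough Python sketch, not part of iPXE: the function names are made up for illustration, and the line format is assumed to match the ifstat output posted earlier in this thread.

```python
import re

# Assumed counter format, taken from the ifstat output posted in this thread:
#   [Link:up, TX:4 TXE:0 RX:0 RXE:0]
COUNTER_RE = re.compile(r"\b(TX|TXE|RX|RXE):(\d+)")


def parse_ifstat_counters(line):
    """Return the TX/TXE/RX/RXE counters found in one ifstat line as a dict."""
    return {name: int(value) for name, value in COUNTER_RE.findall(line)}


def looks_receive_dead(counters):
    """True if packets were transmitted but nothing at all was received."""
    return (
        counters.get("TX", 0) > 0
        and counters.get("RX", 0) == 0
        and counters.get("RXE", 0) == 0
    )


# Counter lines copied from the two ifstat outputs in this thread.
after_reboot = parse_ifstat_counters("[Link:up, TX:4 TXE:0 RX:0 RXE:0]")
after_cold_boot = parse_ifstat_counters("[Link:up, TX:13 TXE:0 RX:66 RXE:53]")

print("reboot:", looks_receive_dead(after_reboot))
print("cold boot:", looks_receive_dead(after_cold_boot))
```

A receive path that is completely silent (no RX and no RXE) while TX is non-zero points at the NIC or link rather than at the DHCP server, which is why the capture on the server side is worth checking.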


RE: DHCP timeout on reboot - FreeMinded - 2016-03-22 21:41

Hi Sebastian

I'm back from holiday and on this issue again...

(2016-02-29 15:04)SebastianRoth Wrote:  It actually looks like it's sending out DHCP requests but is unable to properly receive the answers! That is consistent with what we saw in this picture you posted: http://ibin.co/2YAgkDQqr9hG (notice ifstat RX:0). Do you actually see the DHCP discovery packets arriving at your DHCP server (Wireshark) and being answered? I'd guess so, but please make sure.

Here is what happens on cold boot (see http://pastebin.com/6NfVqbtX for detailed Wireshark output)
From iPXE
DHCP Discover
DHCP Offer
DHCP Request
DHCP ACK
And then from Windows
DHCP Request
DHCP ACK

On a reboot the following happens (see http://pastebin.com/Mnm9RkrL for detailed Wireshark output)
From iPXE
DHCP Discover
DHCP Offer
DHCP Discover
DHCP Offer
DHCP Discover
DHCP Offer
And then from Windows
DHCP Request
DHCP ACK

So to me it looks like iPXE is not acting on the DHCP offers on reboots.
Any idea what could be the cause of this?

Side note: the iPXE script is configured to boot from the local disk if it's unable to contact our boot control server. Grub gets loaded again and boots into Windows. This is why we see a successful DHCP request from Windows in both cases.
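
To make the difference between the two captures explicit, here is a small Python sketch that reports how far a DHCP exchange got. It is only an illustration: the function name and the short message labels are made up, and the input lists are just the iPXE part of each capture as listed above (the later Windows exchange is left out).

```python
# The normal four-step DHCP exchange, in order.
HANDSHAKE = ["Discover", "Offer", "Request", "ACK"]


def last_stage_reached(messages):
    """Return the last handshake step seen in order, or None if none was seen.

    Messages that do not advance the handshake (for example a repeated
    Discover after an Offer) are skipped.
    """
    position = 0
    for message in messages:
        if position < len(HANDSHAKE) and message == HANDSHAKE[position]:
            position += 1
    return HANDSHAKE[position - 1] if position > 0 else None


# iPXE's part of the two captures described above.
cold_boot = ["Discover", "Offer", "Request", "ACK"]
reboot = ["Discover", "Offer", "Discover", "Offer", "Discover", "Offer"]

print("cold boot reached:", last_stage_reached(cold_boot))
print("reboot reached:", last_stage_reached(reboot))
```

On the reboot capture the exchange never gets past the Offer: the server answers, but the client keeps sending Discover instead of a Request, which fits the idea that the offers never make it up to iPXE.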


RE: DHCP timeout on reboot - FreeMinded - 2016-03-31 08:52

Hi all

Does anyone have an idea how I could continue on this issue? I'm really stuck here.

Thanks in advance!