Odd behavior running under Parallels
2013-07-29, 14:56 (This post was last modified: 2013-07-30 05:23 by Richard.)
Post: #1
All,

I'm seeing some general weirdness trying to use iPXE between VMs for testing. I could use some troubleshooting tips, particularly around what's happening under the hood during file download.

What I'm seeing:
With ipxe.iso, the initrd image will start to load, then hang. The point isn't consistent, but ranges from 65k to 150k into the file.
With pxelinux.0, the full 370MB image will load and boot, but fail for other reasons.
With undionly.kpxe, the process never gets as far as starting the initrd step; it hangs while undionly.kpxe itself is loading.

What I'm wondering:
When the initrd command runs, does it download the whole file first, or does part of the downloaded image start executing partway through the download process? i.e., do I have a simple issue here with a download failing, or is it possible that code inside the image is causing the failure?


I'm running CentOS6.4 on MacOS 10.8.3 using Parallels 8.0.18494, and trying to replicate this boot scheme as a start:
http://blog.braastad.org/?p=332
The goal is to iPXE and run these commands:
dhcp
initrd http://example.com/initrd0.img
kernel http://example.com/vmlinuz0 initrd=/initrd0.img root=/centos64-pxe.iso rootfstype=auto rw liveimg toram size=4096
boot

What I've tried:
Booting from the ipxe.iso demo file, Ctrl-B, and manually entering the commands, trying both HTTP and TFTP as sources
Booting an ipxe.iso file from rom-o-matic with the above script embedded
Chain loading the above in undionly.kpxe form
Booting the image via pxelinux.0
Running tcpdump during the above, and analyzing with Wireshark


I'm suspecting some issue here between iPXE and the Parallels virtual NIC. Among the symptoms, when loading fails (e.g., TFTP timeout) the NIC goes dead - subsequent tries show no packets in tcpdump, and ifopen or dhcp won't work until the interface is ifclose'd first.

I've hit a wall trying to debug this. Any pointers would be appreciated.

Thanks,
Richard
2013-07-29, 22:42 (This post was last modified: 2013-07-30 05:28 by Richard.)
Post: #2
RE: Odd behavior running under Parallels
More detail and some progress, for the record...

The base server running TFTP, HTTP, etc. is actually CentOS 6.3, while the ISO is being built using CentOS 6.4. The iPXE builds I'm using are all 1.0.0+. I rebuilt the whole environment under VMware Fusion 5.0.3 and ran into some similarities and some improvements. (Edit: Re-tested with CentOS 6.4 on the server, and TCP behaves the same.)

On the similarities front, booting the demo ipxe.iso and manually issuing the commands had the same result when using HTTP as the source. However, upon switching back to TFTP, I was able to get a successful boot! Painfully slow loading time, but it worked!

Now, I haven't a clue why TFTP isn't playing nice under Parallels. But I do have some clues about HTTP in both environments - CentOS 6's TCP stack appears to be misbehaving.

If I can believe what tcpdump is telling me, during an HTTP transfer the CentOS6 server is sending super jumbo frames... like 15kB! Not only is this bigger than every jumbo frame implementation I've seen (9216 bytes is usually the cap), but the server is also ignoring the 1460-byte MSS requested by the iPXE client. iPXE seems to hold its own for a while, accepting these goliath frames up to a point before falling over. Specifically, iPXE sends back an ACK for part of a jumbo frame (not normally how it's done), then goes unresponsive.

I've been doing Sniffer traces for many years, so I'm pretty confident in what I see in Wireshark (after tcpdump'ing to file). But I'm not a heavy tcpdump user, so I don't know if I can completely trust what tcpdump is reporting, or if it could possibly be merging packets - that seems unlikely, because everything seems consistent in the packet layouts. Normally, these goliath packets wouldn't even be possible in the real world, but inside a virtual network in Parallels or VMware I could see it being possible. (I did try changing the NIC MTU, and while it reports as being changed this doesn't change the large frame behavior.)
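For anyone wanting to quantify this from a saved capture rather than eyeballing it in Wireshark, here is a minimal sketch. It assumes the capture was written by tcpdump in the classic pcap format with an Ethernet link type, and it only looks at IPv4/TCP. The function name `oversized_tcp_segments` is made up for illustration; the 1460-byte default is the MSS mentioned above.

```python
import struct

MSS = 1460  # MSS advertised by the iPXE client, per the post above


def oversized_tcp_segments(pcap_bytes, mss=MSS):
    """Return (packet_number, tcp_payload_len) for each IPv4/TCP packet
    whose TCP payload is larger than mss. Packet numbers start at 1."""
    magic = pcap_bytes[:4]
    if magic == b"\xd4\xc3\xb2\xa1":
        endian = "<"  # little-endian classic pcap
    elif magic == b"\xa1\xb2\xc3\xd4":
        endian = ">"  # big-endian classic pcap
    else:
        raise ValueError("not a classic (microsecond) pcap file")

    # Link type lives in the last 4 bytes of the 24-byte global header.
    linktype = struct.unpack(endian + "I", pcap_bytes[20:24])[0]
    if linktype != 1:
        raise ValueError("only Ethernet captures are handled in this sketch")

    hits = []
    offset = 24
    number = 0
    while offset + 16 <= len(pcap_bytes):
        # Per-packet record header: ts_sec, ts_usec, incl_len, orig_len
        _sec, _usec, incl_len, _orig_len = struct.unpack(
            endian + "IIII", pcap_bytes[offset:offset + 16])
        offset += 16
        frame = pcap_bytes[offset:offset + incl_len]
        offset += incl_len
        number += 1

        if len(frame) < 14 + 20:
            continue  # too short for Ethernet + minimal IPv4 header
        if struct.unpack("!H", frame[12:14])[0] != 0x0800:
            continue  # not IPv4
        ip = frame[14:]
        ip_header_len = (ip[0] & 0x0F) * 4
        if ip[9] != 6 or len(ip) < ip_header_len + 20:
            continue  # not TCP, or TCP header was truncated by the snaplen
        ip_total_len = struct.unpack("!H", ip[2:4])[0]
        tcp_header_len = (ip[ip_header_len + 12] >> 4) * 4
        payload_len = ip_total_len - ip_header_len - tcp_header_len
        if payload_len > mss:
            hits.append((number, payload_len))
    return hits
```

To use it, read the capture file in binary mode and pass the bytes in, for example `oversized_tcp_segments(open("capture.pcap", "rb").read())`, where `capture.pcap` is a placeholder file name. The payload size is taken from the IP header's total-length field, so it reports what the capture recorded even if the snaplen truncated the packet data.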

So...
* With TFTP under Parallels 8.0.18494, there seems to be some sort of issue with iPXE; I'd suspect its client-side interaction with the virtual NIC, since the same TFTP server works fine under Fusion.
* With HTTP and CentOS 6, the OS is sending huge packets when it shouldn't. This is eventually choking the iPXE client, and happens under both Parallels 8.0.18494 and VMware Fusion 5.0.3. While it's exposed a weakness in iPXE, this is a CentOS/RHEL defect in my eyes; it's probably just never been discovered in the real world because a physical network wouldn't forward these giant packets (and TCP Path MTU discovery would cause them to retry at a smaller packet size).

Does this sound like anything that's been reported before? I'm not getting any hits on searches.

Thanks,
Richard
2013-07-30, 11:59 (This post was last modified: 2013-07-30 11:59 by robinsmidsrod.)
Post: #3
RE: Odd behavior running under Parallels
I've never heard anything like this reported on the various iPXE-related communities. Have you tried running through the checklist mentioned on http://ipxe.org/dev/driver to see what part exactly is giving errors? My guess is that high-speed or high MTU tests would fail.
2013-07-31, 09:52
Post: #4
RE: Odd behavior running under Parallels
Thanks. I'll explore this more as time allows, and report to the developer mailer if I find anything of interest.
2013-09-30, 19:26
Post: #5
RE: Odd behavior running under Parallels
I am actually seeing the same issue with Parallels as well. One thing of interest, though, is that the same iPXE ISO that fails when building on Parallels works perfectly fine for building a guest on ESXi (both use the same driver, 82545em).

The other strange thing is that when a real TFTP server answers the DHCP request generated by Parallels, it works without any problem and the download completes. It only fails when using the ISO, at random points in the download of the Linux kernel, or when chaining a boot sequence to another iPXE loader.
2013-09-30, 19:30
Post: #6
RE: Odd behavior running under Parallels
I am seeing the same issue on my Parallels as well. The same iPXE ISO works without any issue for an ESXi guest. I am using HTTP to download the boot kernel, and the ISO is using the 82545em driver.
2013-10-05, 14:33
Post: #7
RE: Odd behavior running under Parallels
kpatel: This seems like a compatibility issue between Parallels and iPXE. You'll need to discuss it with the developers on either IRC or the mailing-list. My guess is they'll need more debug details to figure it out. AFAIK, none of the core developers have Mac computers at their disposal.
2014-04-02, 23:10
Post: #8
RE: Odd behavior running under Parallels
Sorry for the long hibernation...

I am willing to provide whatever data is necessary to get this fixed. Let me know how I can go about making that data available.

Thanks.


The error message that is displayed on the screen is as follows (I am using dnsmasq as the TFTP server):

net0: XX:XX:XX:XX:XX:XX using 82545em on PCI00:05.0 (open)
[Link:up TX:0 TXE:0 RX:15 RXE:2]
[RXE: 2 x "Invalid Argument (http://ipxe.org/1c056002)"]
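When collecting several of these status lines for the developers, the counters can be pulled out mechanically. This is a minimal sketch that assumes the bracketed line always has exactly the layout shown above, and that RX counts good packets while RXE counts receive errors (my reading of the output, not confirmed in this thread). The function names are made up for illustration.

```python
import re

# Matches the bracketed counter line exactly as shown in the post above.
STAT_RE = re.compile(
    r"\[Link:(?P<link>\w+) TX:(?P<tx>\d+) TXE:(?P<txe>\d+) "
    r"RX:(?P<rx>\d+) RXE:(?P<rxe>\d+)\]"
)


def parse_ifstat(line):
    """Return the link state and counters as a dict, or None if no match."""
    match = STAT_RE.search(line)
    if match is None:
        return None
    stats = {name: int(match.group(name)) for name in ("tx", "txe", "rx", "rxe")}
    stats["link"] = match.group("link")
    return stats


def rx_error_ratio(stats):
    """Fraction of received packets that were errors.

    Assumes RX and RXE are disjoint counts (good vs. bad packets).
    """
    total = stats["rx"] + stats["rxe"]
    return stats["rxe"] / total if total else 0.0


if __name__ == "__main__":
    sample = "[Link:up TX:0 TXE:0 RX:15 RXE:2]"
    parsed = parse_ifstat(sample)
    print(parsed, rx_error_ratio(parsed))
```

Under that assumption, the sample line above works out to 2 receive errors out of 17 received packets.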