I seem to have encountered an issue with Dell's new 13G servers (Think R630, R730 etc).
Environment:
wimboot-2.4.1
BIOS Mode booting via undionly (Not EFI)
Dell 13G PowerEdge server
Things I have tried without success:
- Disabling USB 3.0
- Enabling and disabling the Legacy video misc option
- Disabling MMIO
- Upgrading to the latest available BIOS
- Injecting the Matrox video driver into my PE Image
Things I have done that are successful:
- Turning my WIM image into a bootable ISO with oscdimg and booting it directly from the iDRAC virtual CD drive. Works flawlessly every time.
- Booting the same image on a 12G server via iPXE. That also works. It only seems to be these 13G servers that I have issues with.
Behavior I see:
After this last screen with the Windows progress bar, it just goes black. I assume the system has hung at this point, because right after the PE environment loads it should pull some files back from my razor server, and I never see that happen.
How best can I help troubleshoot this so I can provide the development community with meaningful information; since I appear to be getting to a fairly advanced point in the loading process?
(2015-07-24 04:37)theason Wrote: How best can I help troubleshoot this so I can provide the development community with meaningful information; since I appear to be getting to a fairly advanced point in the loading process?
How much RAM is in this server? From the addresses mentioned in the screenshots, it looks to be around 2G, which seems implausibly low for a server.
Could you obtain a copy of the system memory map? (You can obtain this from the "dmesg" output if you boot into Linux, or by building iPXE with DEBUG=memmap).
Thanks,
Michael
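For anyone following along, here is a rough sketch of how the e820 lines in a dmesg capture could be summarised to answer the RAM question above. It assumes the bracketed "BIOS-e820: [mem 0x...-0x...] type" line format printed by recent Linux kernels (older kernels print a different layout). The sample map is made up for illustration and is not taken from any server in this thread; the names parse_e820, usable_bytes and SAMPLE_DMESG are hypothetical.

```python
import re

# Assumed line format (recent Linux kernels):
#   BIOS-e820: [mem 0x0000000000000000-0x000000000009bfff] usable
E820_LINE = re.compile(
    r"BIOS-e820: \[mem 0x([0-9a-fA-F]+)-0x([0-9a-fA-F]+)\] (.+?)\s*$"
)

TWO_GB = 0x80000000  # the 2GB mark discussed in this thread

# Hypothetical sample, NOT taken from a real server in this thread.
SAMPLE_DMESG = """\
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009bfff] usable
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x000000007a288fff] usable
[    0.000000] BIOS-e820: [mem 0x000000007a289000-0x000000007af0afff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000407fffffff] usable
"""


def parse_e820(dmesg_text):
    """Return a list of (start, end_inclusive, type) tuples."""
    regions = []
    for line in dmesg_text.splitlines():
        match = E820_LINE.search(line)
        if match:
            start = int(match.group(1), 16)
            end = int(match.group(2), 16)
            regions.append((start, end, match.group(3)))
    return regions


def usable_bytes(regions, below=None):
    """Sum the usable bytes, optionally counting only bytes below an address."""
    total = 0
    for start, end, kind in regions:
        if kind != "usable":
            continue
        # end is inclusive, so the exclusive limit is end + 1
        limit = end + 1 if below is None else min(end + 1, below)
        if limit > start:
            total += limit - start
    return total


if __name__ == "__main__":
    regions = parse_e820(SAMPLE_DMESG)
    print("usable total:    ", hex(usable_bytes(regions)))
    print("usable below 2GB:", hex(usable_bytes(regions, below=TWO_GB)))
```

Feeding it a real capture (for example the text of a saved dmesg file) instead of SAMPLE_DMESG would show at a glance how much usable memory sits below and above the 2GB mark.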
Sure thing. You've got the 2 part right; it's actually 256GB of RAM.
Here's a dmesg from the razor microkernel that boots via iPXE quite nicely.
http://hastebin.com/ucihajesof.vhdl
Just a little bump. I have a workaround, but booting from an ISO makes my automation completely lack coolness factor.
(2015-08-08 00:25)theason Wrote: Just a little bump. I have a workaround, but booting from an ISO makes my automation completely lack coolness factor.
I've spent the past two days debugging what seems to be a similar issue, in which the 32-bit PAE code path in winload.exe seems to assume that all memory above the 2GB mark (0x80000000) is unused by earlier boot stages. However, looking at your screenshots, I don't think this fix will make any difference in your case, since the initrd is already placed below the 2GB point. There may be other fixed addresses used by winload.exe that also need to be avoided.
I have the same problem on my Dell servers. I can confirm that the new wimboot (casper 2.5.0) does not fix the problem on Dell servers.
Any other debug information that might help?
Having similar issues. What can we do to help get this fixed?
(2015-10-07 19:17)SoulA Wrote: Having similar issues. What can we do to help get this fixed?
In decreasing order of simplicity:
1. Provide me with local access to a server exhibiting the problem.
2. Find a way to reproduce this problem in a VM, so that I can reproduce the problem locally.
3. Hook up a kernel debugger, edit the BCD to enable both bootmgr and winload debugging, and figure out what is happening.
Michael
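As a hedged illustration of option 3, the sketch below only builds and prints the bcdedit command lines that would typically turn on boot debugging for both bootmgr and the Windows loader entry in an offline BCD store. It runs nothing itself. The store path, the "{default}" loader identifier, and the serial port and baud values are assumptions to adapt to your own image, and the helper name bcd_debug_commands is hypothetical.

```python
def bcd_debug_commands(store, loader_id="{default}", port=1, baud=115200):
    """Build bcdedit argument lists for boot debugging over a serial port.

    store     -- path to the offline BCD file (assumption: the BCD you serve)
    loader_id -- BCD identifier of the winload entry (assumed "{default}")
    """
    base = ["bcdedit", "/store", store]
    return [
        # Enable the boot debugger in bootmgr itself
        base + ["/bootdebug", "{bootmgr}", "on"],
        # Enable the boot debugger in the OS loader (winload)
        base + ["/bootdebug", loader_id, "on"],
        # Tell the debugger transport which serial port and speed to use
        base + ["/dbgsettings", "serial", f"debugport:{port}", f"baudrate:{baud}"],
    ]


if __name__ == "__main__":
    # Hypothetical path; point this at your own BCD file.
    for cmd in bcd_debug_commands(r"C:\winpe\media\Boot\BCD"):
        print(" ".join(cmd))
```

The printed lines are meant to be reviewed and then run by hand on a Windows machine that has bcdedit available, with a debugger attached to the matching serial port on the other end.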
(2015-10-07 22:13)mcb30 Wrote: (2015-10-07 19:17)SoulA Wrote: Having similar issues. What can we do to help get this fixed?
In decreasing order of simplicity:
1. Provide me with local access to a server exhibiting the problem.
2. Find a way to reproduce this problem in a VM, so that I can reproduce the problem locally.
3. Hook up a kernel debugger, edit the BCD to enable both bootmgr and winload debugging, and figure out what is happening.
Michael
My company would be willing to provide you with DRAC access to an R730 that you can mess with. That should be enough to hopefully accomplish option #1? Let me know.
(2015-10-11 21:19)SoulA Wrote: My company would be willing to provide you with DRAC access to an R730 that you can mess with. That should be enough to hopefully accomplish option #1? Let me know.
Thank you for the offer! Unfortunately DRAC serial ports are (from experience) unusable for remote debugging, since they silently drop characters if you attempt to go above 9600 baud.
Michael
(2015-07-29 20:27)mcb30 Wrote: (2015-07-24 04:37)theason Wrote: How best can I help troubleshoot this so I can provide the development community with meaningful information; since I appear to be getting to a fairly advanced point in the loading process?
How much RAM is in this server? From the addresses mentioned in the screenshots, it looks to be around 2G, which seems implausibly low for a server.
Could you obtain a copy of the system memory map? (You can obtain this from the "dmesg" output if you boot into Linux, or by building iPXE with DEBUG=memmap).
Thanks,
Michael
I don't think RAM can be the problem, because I have the same issue with 16GB of RAM in my 13G server.
Hello
Has there been a resolution for this issue? I'm encountering the same exact behavior in VMware on Version 10 hardware.
Thanks
(2017-02-24 21:28)kristianreese Wrote: Has there been a resolution for this issue? I'm encountering the same exact behavior in VMware on Version 10 hardware.
There was one issue where more than 3GB of RAM caused problems on some hardware. A workaround was implemented in wimboot which should handle that, but as a test you can always run the VM with 3GB of RAM or less.
I get the same on some non-server hardware as well, even with the latest wimboot.
This issue was resolved for me some time ago after a Dell BIOS update.
I was having similar issues with Windows Server 2008 R2.
I found that I needed to enable the "Load Legacy Video Option ROM" setting in the Miscellaneous Settings section of the system BIOS.
See this screenshot:
http://imgur.com/a/DxFGC