====== Stefan Hajnoczi: GDB Remote Debugging ======

===== Week 4 =====

**Milestones:**

  * Get latest GDB stub work into mainline.
  * Modern bzImage prefix for gPXE.

==== Mon Jun 16 ====

**The ''gdbstub2'' branch is now ready for mainline review**. Diffs against gPXE ''master'' are [[http://etherboot.org/share/stefanha/gdbstub2.diff|here]]. Once it is merged I will update the documentation and encourage others to use GDB.

**gPXE needs modern bzImage support so that GRUB, lilo, and SYSLINUX can load it**. This is my next piece of work after the GDB stub. There is already code in etherboot to make a bzImage. The old code doesn't work by default on today's popular bootloaders since the Linux bzImage header it supplies is outdated. I am investigating what needs to be done for GRUB, lilo, SYSLINUX, etherboot, and gPXE to load a gPXE bzImage.

==== Tue Jun 17 ====

Git commit:

  * [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commit;h=bfd885802fd6af9938f2b703f6c48a9259cd7657|[bzImage] Make gpxe.lkrn a zImage 2.07]]

**I am trying out bootloaders on ''gpxe.lkrn'' images**. We were afraid that the outdated Linux zImage prefix no longer works with modern bootloaders. Here are results for unmodified gPXE (I have not yet attempted to implement bzImage):

  * **GRUB** boots ''gpxe.lkrn'' successfully. Here is a script to create a GRUB/gPXE boot floppy: <code>
#!/bin/sh
set -e
dd if=/dev/zero of=grub.img bs=1024 count=1440
losetup /dev/loop0 grub.img
mkfs /dev/loop0
mount /dev/loop0 /mnt
mkdir -p /mnt/boot/grub
cp /boot/grub/stage1 /boot/grub/stage2 /mnt/boot/grub/
cat >/mnt/boot/grub/menu.lst <<EOF
title=gPXE
root (fd0)
kernel /boot/gpxe.lkrn
EOF
cp bin/gpxe.lkrn /mnt/boot/
umount /mnt
grub --device-map=/dev/null <<EOF
device (fd0) /dev/loop0
root (fd0)
setup (fd0)
quit
EOF
losetup -d /dev/loop0
</code>
  * **SYSLINUX** boots ''gpxe.lkrn'' successfully.
Here is a script to create a boot floppy: <code>
#!/bin/sh
set -e
dd if=/dev/zero of=syslinux.img bs=1024 count=1440
mkfs.msdos syslinux.img
mount -o loop syslinux.img /mnt
cp bin/gpxe.lkrn /mnt/gpxe.zi
cat >/mnt/SYSLINUX.CFG <<EOF
default gpxe.zi
EOF
umount /mnt
syslinux syslinux.img
</code>
  * **lilo** fails to boot ''gpxe.lkrn''. QEMU stops with a triple fault. I still need to look into this. Here is a script to create a boot floppy: <code>
#!/bin/sh
set -e
dd if=/dev/zero of=lilo.img bs=1024 count=1440
losetup /dev/loop0 lilo.img
mkfs /dev/loop0
mount /dev/loop0 /mnt
mkdir /mnt/etc /mnt/boot
cp bin/gpxe.lkrn /mnt/gpxe.zi
cat >/mnt/etc/lilo.conf <<EOF
boot =/dev/loop0
disk =/dev/loop0
bios =0x00
# 1.44MB disk geometry
sectors =18
heads =2
cylinders =80
install =/mnt/boot/boot.b
map =/mnt/boot/map
backup =/dev/null
image =/mnt/gpxe.zi
EOF
/tmp/lilo/sbin/lilo -C /mnt/etc/lilo.conf
umount /mnt
losetup -d /dev/loop0
</code>
  * **gPXE** fails to boot ''gpxe.lkrn'' since it supports only the newer bzImage format and not the old zImage format. Testing was easy: <code>
qemu -bootp gpxe.lkrn -tftp bin bin/gpxe.usb
</code>
  * **Etherboot 5.4.3** boots ''gpxe.lkrn'' successfully. I used [[http://freshmeat.net/projects/wraplinux/|wraplinux]] to make an NBI file from ''gpxe.lkrn''.

**Updated ''lkrnprefix.S'' to zImage 2.07**. The image is still only a zImage since the non-real-mode code loads at 0x10000. A bzImage loads non-real-mode code at 0x100000, i.e. immediately above the first 1 MB of memory. Perhaps ''gpxe.lkrn'' can be a full bzImage, but I think that the A20 line will prevent us from accessing 0x100000.

  * **GRUB** boots successfully.
  * **Lilo** still fails. I need to investigate this; I am probably not using it properly.
  * **SYSLINUX** boots successfully.
  * **gPXE** boots successfully with a small patch to ''bzimage.c''. Need to discuss this with mcb30.
  * **Etherboot** boots successfully.

==== Wed Jun 18 ====

**Lilo still triple-faults when loading ''gpxe.lkrn''**.
I set up a virtual machine with [[http://damnsmalllinux.org/|Damn Small Linux]] to ensure a clean environment. The DSL kernel boots successfully while ''gpxe.lkrn'' fails. Here is the triple fault information from QEMU:

<code>
qemu: fatal: triple fault
EAX=60000000 EBX=0000fee8 ECX=00002900 EDX=00001d8a
ESI=0001ffff EDI=0000ff51 EBP=0000f9c4 ESP=0000f96e
EIP=0000074c EFL=00000002 [-------] CPL=0 II=0 A20=1 SMM=0 HLT=0
ES =0018 00000000 ffffffff 00cf9300
CS =0008 0000f600 0000ffff 00009b00
SS =0010 00090000 0000ffff 00009309
DS =0018 00000000 ffffffff 00cf9300
FS =0018 00000000 ffffffff 00cf9300
GS =0018 00000000 ffffffff 00cf9300
LDT=0000 00000000 0000ffff 00008000
TR =0000 00000000 00000000 00000000
GDT=     0009f99c 0000001f
IDT=     00000000 000003ff
CR0=60000011 CR2=00000000 CR3=00000000 CR4=00000000
CCS=00000000 CCD=0000f97e CCO=ADDB
FCW=037f FSW=0000 [ST=0] FTW=00 MXCSR=00001f80
FPR0=0000000000000000 0000 FPR1=0000000000000000 0000
FPR2=0000000000000000 0000 FPR3=0000000000000000 0000
FPR4=0000000000000000 0000 FPR5=0000000000000000 0000
FPR6=0000000000000000 0000 FPR7=0000000000000000 0000
XMM00=00000000000000000000000000000000 XMM01=00000000000000000000000000000000
XMM02=00000000000000000000000000000000 XMM03=00000000000000000000000000000000
XMM04=00000000000000000000000000000000 XMM05=00000000000000000000000000000000
XMM06=00000000000000000000000000000000 XMM07=00000000000000000000000000000000
Aborted
</code>

I don't see an obvious clue in the crash dump, so I will hold off on debugging this until I have spoken with mcb30 about bzImage. If we decide to go in a different direction, time spent debugging this would be wasted.

In the meantime I'll investigate real-mode GDB debugging. I already tried ''set architecture i8086'' for 16-bit disassembly. GDB still treats memory as a flat 32-bit space, so the GDB stub will probably need to do some address translation.

Next steps:

  * Update [[:dev:gdbstub|GDB stub page]] and screencast when UDP code is merged into mainline.
See [[http://grub.enbug.org/DebuggingWithGDB|GRUB GDB wiki page]] for inspiration.
  * gPXE bzImage support.
  * Real-mode GDB stub.
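
As a rough illustration of the address translation mentioned under Wed Jun 18: GDB works with flat addresses, while 16-bit code is addressed as a segment base plus an offset. The sketch below is not the gPXE stub code, and the helper names ''real_to_linear'' and ''base_to_linear'' are hypothetical. It only shows the arithmetic, using the CS base and EIP values from the QEMU dump as sample input, and it assumes no wrap-around at 1 MB.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not part of gPXE or GDB.

# Real mode: linear address = (segment << 4) + offset.
# No 1 MB wrap-around is applied here (that depends on the A20 state).
real_to_linear() {
    printf '0x%x\n' $(( ($1 << 4) + $2 ))
}

# Segment base as QEMU prints it (second column of a segment line)
# plus an offset such as EIP or ESP.
base_to_linear() {
    printf '0x%x\n' $(( $1 + $2 ))
}

# Instruction pointer from the dump: CS base 0000f600, EIP 0000074c.
base_to_linear 0x0000f600 0x0000074c   # prints 0xfd4c

# A real-mode segment:offset pair, 9000:f96e.
real_to_linear 0x9000 0xf96e           # prints 0x9f96e
```

With a flat address in hand, a GDB session using ''set architecture i8086'' could inspect that location directly (for example ''x/i 0xfd4c''). Whether the stub should do this translation itself or leave it to the user is still an open design question.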