====== Stefan Hajnoczi: GDB Remote Debugging ======

===== Week 6 =====

**Milestones:**
  * [b44] Tested and clean for mainline review.
  * [gpxelinux.0] Merge Award BIOS return-to-PXE workaround.

==== Tue Jul 8 ====

Git commits:
  * [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commit;h=9a4daaf7d1d3cbe8302aace77854ee99b2696b25|[e820] Full clipping of regions into fragments]]
  * [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commit;h=2e9dc72974b5ee0183e6268c3d1a747552c0c964|[e820] Clean up new e820 memory mangler]]

**Progress on e820 memory map mangler**. I finally made the push for an e820 memory map mangler that can clip regions into fragments. The existing e820 memory map mangler works well when gPXE hides the beginning and/or end of a memory region. The new mangler supports hidden memory regions anywhere, and any number of them. In the worst case, this means splitting a memory region into two or more fragments.

The existing e820 mangler has the nice property that it works on-the-fly. It does not need to take a snapshot of the entire e820 memory map. Instead, it does the necessary clipping at each point during a sequence of e820 calls.

[[http://syslinux.zytor.com/memdisk.php|MEMDISK]] has a different e820 mangler. It takes a snapshot of the entire e820 memory map and performs clipping once. The real benefit I see is that the actual e820 handler code is very simple; it just reads the next memory region from the map. If we did something similar in gPXE, it would mean that all the clipping and hiding code would be written in C, with only a small e820 handler in 16-bit assembly.

In the end, I didn't opt for the MEMDISK approach since you need to worry about storage for the e820 memory map snapshot. It would also involve rewriting more of our memory map code than simply extending what is already there.
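To make the trade-off concrete, here is a rough Python sketch of the snapshot style of mangler: clip everything once up front, then let the handler do nothing but return the next entry. This is not MEMDISK's code. The names ''subtract'', ''take_snapshot'', ''e820_handler'' and ''walk'' are hypothetical, and regions are modelled as half-open ''(start, end)'' tuples. The continuation value mimics the int 15h, e820 convention of passing 0 to start and getting 0 back with the last entry.

```python
from typing import List, Optional, Tuple

Range = Tuple[int, int]  # half-open [start, end); an assumption of this sketch


def subtract(region: Range, hidden: List[Range]) -> List[Range]:
    """Return the parts of `region` that are not covered by any hidden range."""
    pieces = []
    cursor, end = region
    for h_start, h_end in sorted(hidden):
        if h_end <= cursor or h_start >= end:
            continue  # this hidden range does not touch what is left
        if h_start > cursor:
            pieces.append((cursor, h_start))
        cursor = max(cursor, h_end)
    if cursor < end:
        pieces.append((cursor, end))
    return pieces


def take_snapshot(e820_map: List[Range], hidden: List[Range]) -> List[Range]:
    """Clip the whole map once; the result has to be stored somewhere."""
    snapshot = []
    for region in e820_map:
        snapshot.extend(subtract(region, hidden))
    return snapshot


def e820_handler(snapshot: List[Range], cont: int) -> Tuple[Optional[Range], int]:
    """Return entry number `cont` and the next continuation value (0 = done)."""
    if cont >= len(snapshot):
        return None, 0
    nxt = cont + 1
    return snapshot[cont], (nxt if nxt < len(snapshot) else 0)


def walk(snapshot: List[Range]) -> List[Range]:
    """Call the handler the way an OS would, until the continuation is 0."""
    entries, cont = [], 0
    while True:
        entry, cont = e820_handler(snapshot, cont)
        if entry is not None:
            entries.append(entry)
        if cont == 0:
            return entries
```

The handler is trivial, but ''snapshot'' is exactly the storage that the on-the-fly mangler avoids having to allocate.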
**The new algorithm works as follows**:

<code>
def int_e820():
    for real_region in e820_memory_map:
        nfrags = 0
        # Walk adjacent (current, next) pairs of hidden regions
        for i in range(len(hidden_regions) - 1):
            region = Region(real_region.start, real_region.end)
            clipped = False
            if hidden_regions[i].end_overlaps(region):
                region.start = hidden_regions[i].end
                clipped = True
            if hidden_regions[i + 1].start_overlaps(region):
                region.end = hidden_regions[i + 1].start
                clipped = True
            if hidden_regions[i].completely_overlaps(region):
                region.start = region.end
                clipped = True
            if clipped:
                nfrags += 1
                if not region.is_empty():
                    yield region
        # If no fragments were clipped, return the original region
        if nfrags == 0:
            yield real_region
</code>

For every e820 region, the algorithm steps through each hidden region. More precisely, it clips against a pair: the "current" hidden region and the "next" hidden region. The concept of ordered current and next hidden regions requires that hidden regions are sorted by start address. The e820 region is clipped to the end of the current region and the start of the next region. If there was an overlap and a fragment was clipped, then the fragment is returned, unless it is empty. If all hidden regions have been checked but no fragments were clipped, then the original region is unchanged and must be returned.

The hidden regions list has a [0x0, 0x0) region at the beginning and a [0xffffffff, 0xffffffff) region at the end. These dummy values make clipping against the first and last hidden regions easy; otherwise we would need special cases.

The pseudocode above is written with a continuous thread of control. However, the actual int 15h, e820 handler is called once for each fragment, so the full assembly code needs to manage iteration state manually and resume where the previous call left off.
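For experimenting with the algorithm outside of gPXE, the pseudocode can be fleshed out into runnable Python. This is a sketch under assumptions: the pseudocode does not define ''end_overlaps'', ''start_overlaps'' or ''completely_overlaps'', so the definitions below are one plausible reading (half-open ranges, with the method called on the hidden region). The function name ''clip_e820'' is hypothetical, and the sentinel values are the ones described above.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass
class Region:
    """Half-open address range [start, end)."""
    start: int
    end: int

    def is_empty(self) -> bool:
        return self.start >= self.end

    # The predicates below are guesses at the pseudocode's helpers.
    # `self` is the hidden region, `other` is the e820 region.
    def end_overlaps(self, other: "Region") -> bool:
        """The end of this region falls inside `other`."""
        return other.start < self.end <= other.end

    def start_overlaps(self, other: "Region") -> bool:
        """The start of this region falls inside `other`."""
        return other.start <= self.start < other.end

    def completely_overlaps(self, other: "Region") -> bool:
        """This region covers all of `other`."""
        return self.start <= other.start and other.end <= self.end


def clip_e820(e820_map: List[Region], hidden: List[Region]) -> Iterator[Region]:
    """Yield the visible fragments of each e820 region."""
    # Sort by start address and add the dummy first and last entries.
    guarded = ([Region(0x0, 0x0)]
               + sorted(hidden, key=lambda r: r.start)
               + [Region(0xffffffff, 0xffffffff)])
    for real in e820_map:
        nfrags = 0
        for cur, nxt in zip(guarded, guarded[1:]):
            region = Region(real.start, real.end)
            clipped = False
            if cur.end_overlaps(region):
                region.start = cur.end
                clipped = True
            if nxt.start_overlaps(region):
                region.end = nxt.start
                clipped = True
            if cur.completely_overlaps(region):
                region.start = region.end
                clipped = True
            if clipped:
                nfrags += 1
                if not region.is_empty():
                    yield region
        # Nothing was clipped, so the region passes through unchanged.
        if nfrags == 0:
            yield real
```

With two hidden regions inside one large e820 region, the generator produces three fragments: one before the first hidden region, one between the two, and one after the second.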
==== Wed Jul 9 ====

Git commits:
  * [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commit;h=b658f32e59ca482e03b7caf978b6926589708f88|[prefix] Return-to-PXE when int 18h is broken]]
  * [[http://git.etherboot.org/?p=people/stefanha/gpxe.git;a=commit;h=17fdb265ffb4685b496c7ca16d18da2620b9b8fb|[prefix] Stack overwrite check before return to PXE]]

**Ported gpxelinux.0 changes** to my new ''gpxelinux'' branch. SYSLINUX uses ''undionly.kpxe'' (keep UNDI loaded, no PCI support) with PXELINUX as an embedded image. HPA has added a workaround for buggy BIOSes that do not support int 18h from PXE NBPs. We want to merge the workaround into mainline.

On Monday, mcb30 and I discussed fingerprinting the buggy BIOS so gPXE can decide whether to exit via int 18h or by returning to PXE. A flag gets set when a buggy BIOS is detected. On exit, we check this flag and return via PXE if necessary.

Overall, the steps to get ''gpxelinux.0'' cleanly merged are:
  - Merge return-to-PXE code from ''gpxelinux.0'', add buggy BIOS fingerprinting.
  - Detect an overwritten stack. We cannot return to PXE if the stack has been corrupted (say, in the attempt to boot an image).
  - Add more control over ''shutdown()'' to distinguish between passing control to a successfully loaded image and asking for the next device to boot on failure. When passing control to an image that does not need gPXE services, we will unload everything. When asking for the next device to boot, we may keep the underlying PXE and UNDI.

**Detecting an overwritten stack** is a bit weird. We want to return to PXE because int 18h is broken. In order to return, we need to make sure the stack has not been overwritten. If we determine the stack is unusable, then we are stuck: int 18h is broken and return to PXE is impossible! The policy I have coded for now is to reboot the machine.

Next steps:
  * [b44] Performance testing.
  * [b44] Cleanup & testing.
  * [bzImage] Expand the heap size to the full 64K segment when loading a bzImage kernel with version 2.02 or higher.
  * [GDB] Real-mode remote debugging.
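Looking back at the Wed Jul 9 entry, the exit policy (buggy BIOS flag plus stack check) boils down to a small decision table. The Python below is only an illustration of that policy, not the code in the commits above. ''ExitPath'' and ''choose_exit_path'' are hypothetical names, and how the two inputs are actually detected is left out.

```python
from enum import Enum


class ExitPath(Enum):
    INT18 = "int 18h"
    RETURN_TO_PXE = "return to PXE"
    REBOOT = "reboot"


def choose_exit_path(int18_broken: bool, stack_overwritten: bool) -> ExitPath:
    """Pick how to hand control back when asking for the next boot device."""
    if not int18_broken:
        # Normal BIOS: int 18h works, so the stack state does not matter.
        return ExitPath.INT18
    if not stack_overwritten:
        # Buggy BIOS, but the original stack is intact: return to PXE.
        return ExitPath.RETURN_TO_PXE
    # Buggy BIOS and a corrupted stack: neither exit is usable.
    return ExitPath.REBOOT
```

Only the last row is a real policy choice; rebooting is the fallback currently coded for the case where both exits are unavailable.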