====== Differences ====== This shows you the differences between two versions of the page.
soc:2008:mdeck:journal:week7 [2008/07/10 11:22] mdeck |
soc:2008:mdeck:journal:week7 [2008/07/13 12:23] (current) mdeck |
||
---|---|---|---|
Line 128: | Line 128: | ||
Thus, I now have nailed down at least //one// bug, and now I can determine what's going wrong. | Thus, I now have nailed down at least //one// bug, and now I can determine what's going wrong. | ||
+ | |||
+ | * [[http://git.etherboot.org/?p=people/mdeck/gpxe.git;a=commit;h=9f561a19282078cc0346487d2a2b34060e1a3f62|[Drivers-eepro100] Bug fixes]] | ||
+ | |||
+ | The end of ''ifec_tx_wake()'' performs different operations depending on whether the CU is active or suspended. After some consideration, it seems that even if the CU is active, a RESUME should still be issued: this causes the CU to re-read the current TCB's S-bit. Thus, once that bit has been cleared, the CU will continue on and process the newly appended transmit command. | ||
+ | |||
+ | Without that RESUME, if the card was active before the tx, it would suspend before processing the new TCB. This means the card is suspended at a TCB prior to the ''tcb_head''. This could happen multiple times, moving the TCB at which the card is actually suspended closer to ''tcb_tail''. I think eventually the tail would pass the suspended TCB, and the head may write into the next TCB, which is then transmitted at the next ''ifec_net_transmit()''. This is speculation, as there may be some other way this corruption was occurring. | ||
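The reasoning above can be sketched as a toy model in plain C. This is not driver code and not taken from the hardware documentation: ''toy_tcb'', ''toy_cu'', ''cu_fetch()'', ''cu_finish()'', ''cu_resume()'' and ''run_scenario()'' are all hypothetical names, and the model simply assumes what the journal entry supposes, namely that the CU samples a TCB's S-bit once when it fetches the TCB and only re-reads it on RESUME.

```c
#include <assert.h>
#include <stdbool.h>

#define RING 4  /* hypothetical ring size for the toy model */

struct toy_tcb {
    bool s_bit;  /* suspend bit as it currently sits in memory */
    bool sent;   /* set once the toy CU has "transmitted" this TCB */
};

struct toy_cu {
    struct toy_tcb ring[RING];
    int cur;         /* TCB the CU is executing, or is suspended at */
    bool latched_s;  /* S-bit value sampled when cur was fetched */
    bool active;     /* true = active, false = suspended */
};

/* CU fetches a TCB and samples its S-bit once (model assumption). */
static void cu_fetch(struct toy_cu *cu, int idx) {
    assert(idx >= 0 && idx < RING);
    cu->cur = idx;
    cu->latched_s = cu->ring[idx].s_bit;
    cu->active = true;
}

/* CU finishes the current TCB, then suspends or moves on. */
static void cu_finish(struct toy_cu *cu) {
    cu->ring[cu->cur].sent = true;
    if (cu->latched_s)
        cu->active = false;                  /* suspend at cur */
    else
        cu_fetch(cu, (cu->cur + 1) % RING);  /* continue to next TCB */
}

/* RESUME: re-read the S-bit of the current TCB (model assumption). */
static void cu_resume(struct toy_cu *cu) {
    cu->latched_s = cu->ring[cu->cur].s_bit;
    if (!cu->active && !cu->latched_s)
        cu_fetch(cu, (cu->cur + 1) % RING);
}

/* Append one TCB while the CU is still busy with the old head.
 * Returns how many TCBs were sent once the CU stops. */
static int run_scenario(bool resume_even_if_active) {
    struct toy_cu cu = {0};
    int sent = 0;

    cu.ring[0].s_bit = true;   /* old head, queued with S set */
    cu_fetch(&cu, 0);          /* CU is now active on TCB 0 */

    cu.ring[1].s_bit = true;   /* driver appends a new head ... */
    cu.ring[0].s_bit = false;  /* ... and clears S on the old one */
    if (!cu.active || resume_even_if_active)
        cu_resume(&cu);

    while (cu.active)          /* let the CU run until it suspends */
        cu_finish(&cu);

    for (int i = 0; i < RING; i++)
        sent += cu.ring[i].sent;
    return sent;
}
```

Under these assumptions, ''run_scenario(false)'' leaves the CU suspended at the old head with the appended TCB unsent, while ''run_scenario(true)'' sends both. Whether the real hardware latches the S-bit this way is exactly the open question in the text.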
+ | |||
+ | The bottom of ''ifec_tx_wake()'' was changed as follows: | ||
+ | <file> | ||
+ | |||
+ | /* Resume if suspended. */ | ||
+ | switch ( ( inw ( ioaddr + SCBStatus ) >> 6 ) & 0x3 ) { | ||
+ | case 0: /* Idle - We should not reach this state. */ | ||
+ | DBG ( "\nifec_net_transmit: tx idle!\n" ); | ||
+ | ifec_scb_cmd ( netdev, virt_to_bus ( tcb ), CUStart ); | ||
+ | ifec_scb_cmd_wait ( netdev ); | ||
+ | return; | ||
+ | case 1: /* Suspended */ | ||
+ | DBG ( "s" ); | ||
+ | break; | ||
+ | default: /* Active */ | ||
+ | DBG ( "a" ); | ||
+ | } | ||
+ | ifec_scb_cmd_wait ( netdev ); | ||
+ | outl ( 0, ioaddr + SCBPointer ); | ||
+ | a->tcb_head->command &= ~CmdSuspend; | ||
+ | /* Immediately issue Resume command */ | ||
+ | outb ( CUResume, ioaddr + SCBCmd ); | ||
+ | ifec_scb_cmd_wait ( netdev ); | ||
+ | } | ||
+ | </file> | ||
+ | As you can see, the RESUME is issued even if the card is active. | ||
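For reference, the two-bit field that the switch in the listing tests can be pulled out into a small helper. This is only an illustrative sketch: the enum and the function ''toy_decode_cu_state()'' are hypothetical names, and the mapping (0 idle, 1 suspended, anything else treated as active) is just what the switch and its comments do, not a statement from the chip documentation.

```c
#include <stdint.h>

/* Hypothetical names; the values mirror the switch cases in the listing. */
enum toy_cu_state {
    TOY_CU_IDLE,
    TOY_CU_SUSPENDED,
    TOY_CU_ACTIVE
};

/* Decode the CU state from a 16-bit SCB status word, using the same
 * shift and mask as the driver code: bits 7:6 of the status word. */
static enum toy_cu_state toy_decode_cu_state(uint16_t scb_status) {
    switch ((scb_status >> 6) & 0x3) {
    case 0:
        return TOY_CU_IDLE;
    case 1:
        return TOY_CU_SUSPENDED;
    default:
        /* Values 2 and 3 are lumped together as "active",
         * as in the default case of the listing. */
        return TOY_CU_ACTIVE;
    }
}
```

With the changed code, the suspended and active cases both fall through to the same RESUME, so the decoded state only matters for the idle case and for the debug output.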
+ | Additionally, I removed a line from ''ifec_tx_process()'': | ||
+ | <file> | ||
+ | static void ifec_tx_process ( struct net_device *netdev ) { | ||
+ | struct ifec_private *priv = netdev->priv; | ||
+ | struct ifec_tcb *tcb = priv->active->tcb_tail; | ||
+ | s16 status; | ||
+ | |||
+ | /* Check status of transmitted packets */ | ||
+ | while ( ( status = tcb->status ) && tcb->iob ) { | ||
+ | if ( status & TCB_U ) { | ||
+ | DBG ( "ifec_tx_process : tx error!\n " ); | ||
+ | netdev_tx_complete_err ( netdev, tcb->iob, -ENOMEM ); | ||
+ | } else { | ||
+ | netdev_tx_complete ( netdev, tcb->iob ); | ||
+ | } | ||
+ | DBGIO ( "tx completion\n" ); | ||
+ | |||
+ | tcb->iob = NULL; | ||
+ | tcb->status = 0; | ||
+ | // tcb->command &= ~CmdSuspend; /* Allow controller to resume. */ | ||
+ | |||
+ | priv->active->tcb_tail = tcb->next; /* Next TCB */ | ||
+ | tcb = tcb->next; | ||
+ | } | ||
+ | } | ||
+ | </file> | ||
+ | This ensures the suspend bit is cleared only in the ''ifec_tx_wake()'' routine; the removed line was redundant. | ||
+ | |||
+ | === 13 July === | ||
+ | |||
+ | Lacking iSCSI packet captures to look at, I decided to try booting over AoE. This involves enough driver activity that I hope it will expose a bug. | ||
+ | |||
+ | Booting a Windows image over AoE got stuck at the Windows splash screen. I then tried booting this image in Safe Mode. Every .sys driver loads until the boot reaches aoe32.sys, where the system freezes. I don't know enough about the AoE driver to determine what could be causing this. | ||
+ | |||
+ | I then compiled the legacy eepro100 driver and attempted the same AoE boot with it. The boot sequence was exactly the same, with the machine freezing upon loading aoe32.sys. I'll need to get a working AoE image to test this properly. |