====== Differences ====== This shows you the differences between two versions of the page.

soc:2010:cooldavid:journal:week0 [2010/05/26 08:56]
cooldavid created
soc:2010:cooldavid:journal:week0 [2010/05/26 10:10] (current)
cooldavid
Line 1: Line 1:
-=== Before Apr. 22th === +=== Week 0: TCP performance test and tuning === 
-== Debug boot crash problem == +=== May 13th to May 23rd === 
-I had problem for boot gPXE with my testing computer. +  * Implement TCP receive queue 
-It keeps reboot(crash) right after gPXE relocats itself +  * Implement TCP SACK [RFC 2018] 
-to high memory location.+  * Implement TCP Window Scale [RFC 1323]
  
-I've traced partial codes of ''​src/​arch/​i386/​prefix/​*prefix.S''​ +The test results can be found at these notes: 
-And dumped the e820 memory maps as stefanha suggestes. +  * [[soc:​2010:​cooldavid:​notes:​tcpoorxtest|TCP out-of-order receive dstat test logs]] 
-The addresses all looks normal, no overlapps, and the gPXE loading +  * [[soc:2010:cooldavid:notes:bkobench|Benchmarks for downloading image from boot.kernel.org]]
-address seems reasonable.+
  
-== Dumped message ​== +== Found TCP FIN ACK number issue == 
-<​code>​ +Upper layer protocols stop the TCP xfer by calling tcp_xfer_close().
-NVIDIA Boot Agent 215.0503 +
-Copyright (C) 2001-2005 NVIDIA Corporation +
-Copyright (C) 1997-2000 Intel Corporation+
  
-CLIENT MAC ADDR: 00 15 F2 3A BE 48  GUID: 00501D51-548A-0D10-A3D8-AB7EEAB94556 +If the call path is
-CLIENT IP: 192.168.201.208 ​ MASK: 255.255.255.0 ​ DHCP IP: 192.168.201.186 +     tcp_rx()->tcp_rx_data()->...Upper layer...->​tcp_xfer_close().
-GATEWAY IP: 192.168.201.186 +
-PXE->EB: !PXE at 99AB:0080, entry point at 99AB:0E57 +
-         UNDI code segment 99AB:5BD3, data segment 98B3:​0F80 ​(610-638kB) +
-         UNDI device is PCI 00:0A.0, type DIX+802.3 +
-         610kB free base memory after PXE unload +
-Fetching system memory map +
-FBMS base memory size 596 kB [0,95000) +
-INT 15,e801 extended memory size 15360+64*32511=2096064 kB [100000,​7fff0000) +
-INT 15,e820 region [0,9f800) type 1 +
-INT 15,e820 region [f0000,​100000) type 2 +
-INT 15,e820 region [fec00000,​100000000) type 2 +
-INT 15,e820 region [e0000000,​f0000000) type 2 +
-INT 15,e820 region [7fff3000,​80000000) type 3 +
-INT 15,e820 region [7fff0000,​7fff3000) type 4 +
-INT 15,e820 region [9f800,​a0000) type 2 +
-INT 15,e820 region [100000,​7fff0000) type 1 +
-Relocate: currently at [400000,​435c98) +
-...need 35ca7 bytes for 16-byte alignment +
-Considering [0,9f800) +
-...usable portion is [0,9f800) +
-Considering [100000,​7fff0000) +
-...end truncated to 7ff00000 ​(avoid ending in odd megabyte) +
-...usable portion is [100000,​7ff00000) +
-...new best block found. +
-Relocating from [400000,​435c98) to [7feca360,​7feffff8) +
-</​code>​+
  
-== Solved with cleaning pin connectors == +     This happens at least in HTTP, once the expected
-After about a week's struggle, I removed all DIMMs and +     length of data has been received.
-add-on cards, and cleaned all pin connectors. Fount that +
-it boots fine without any software patch.+
  
 +Sending FIN in tcp_xfer_close() in this case produces a wrong ACK
 +number, since the ACK number has not been updated by that time.
 +A later attempt to send a correct one in tcp_rx() fails too, because
 +the TCP state has already changed.
  
 +This patch lets the TCP stack send the FIN in the normal tcp_rx() flow if the
 +packet contains the last data of the session.
 +
 +  * [[http://​git.etherboot.org/?​p=people/​cooldavid/​gpxe.git;​a=commitdiff;​h=d3661b67b1fa15c17b855b7e5cad47c1391835be|The git commit of my fix]]
 +  * [[https://​git.ipxe.org/​ipxe.git/​commitdiff/​9ff822969300a086ad13036c019aa981336017ed|mcb30'​s patch]]
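
The ordering problem described above can be illustrated with a small toy model. This is only a sketch, not the gPXE code: every name in it (''toy_tcp'', ''toy_rx()'', ''toy_close()'', the ''defer_fin'' switch) is hypothetical, and it assumes, as the notes say, that the ACK number is updated only after the upper layer has been called from the receive path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names come from gPXE. */
struct toy_tcp {
    uint32_t rcv_ack;      /* next sequence number expected from the peer */
    bool close_requested;  /* upper layer has asked to close */
    bool fin_sent;         /* a FIN has been sent; the state has moved on */
    uint32_t fin_ack;      /* ACK number carried by the FIN that was sent */
};

/* Send a FIN carrying the current ACK number. Only the first call has
 * any effect, modelling "the TCP state already changed". */
static void toy_send_fin(struct toy_tcp *tcp) {
    if (tcp->fin_sent)
        return;
    tcp->fin_ack = tcp->rcv_ack;
    tcp->fin_sent = true;
}

/* Close request from the upper layer. With defer_fin == false the FIN is
 * sent immediately (the problematic ordering); with defer_fin == true the
 * FIN is left to the receive path. */
static void toy_close(struct toy_tcp *tcp, bool defer_fin) {
    tcp->close_requested = true;
    if (!defer_fin)
        toy_send_fin(tcp);
}

/* Upper layer: closes once the last expected data has arrived, like an
 * HTTP download that has received its full length. */
static void toy_deliver(struct toy_tcp *tcp, bool last, bool defer_fin) {
    if (last)
        toy_close(tcp, defer_fin);
}

/* Receive path: deliver data first, update the ACK number afterwards,
 * then send a pending FIN if the upper layer asked to close. */
static void toy_rx(struct toy_tcp *tcp, uint32_t len, bool last,
                   bool defer_fin) {
    assert(tcp);
    toy_deliver(tcp, last, defer_fin);
    tcp->rcv_ack += len;
    if (tcp->close_requested)
        toy_send_fin(tcp);
}

/* Feed one final 500-byte segment and report the ACK number that the
 * FIN carried. */
static uint32_t toy_fin_ack_after_last_segment(bool defer_fin) {
    struct toy_tcp tcp = { .rcv_ack = 1000 };
    toy_rx(&tcp, 500, true, defer_fin);
    return tcp.fin_ack;
}
```

In this model, ''toy_fin_ack_after_last_segment(false)'' yields a FIN that still acknowledges the old sequence number, and the later attempt in ''toy_rx()'' cannot correct it because the FIN has already been sent. With ''true'', the FIN is sent from the receive path after the ACK number has been advanced.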
