====== Differences ====== This shows you the differences between two versions of the page.
soc:2009:oremanj:journal:week4 [2009/06/17 02:21] rwcr |
soc:2009:oremanj:journal:week4 [2009/06/21 21:00] (current) rwcr |
||
---|---|---|---|
Line 21: | Line 21: | ||
* [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=66f79b57a1ea0354b8339fc508c655f2d46d0ec9| | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=66f79b57a1ea0354b8339fc508c655f2d46d0ec9| | ||
[Makefile] Remove -Wformat-nonliteral command-line option]] | [Makefile] Remove -Wformat-nonliteral command-line option]] | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=4f1e431b1a73d51900e18cadd975a7a2bf6f93c4| | ||
+ | [802.11] Add status-printing callback]] | ||
* [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=e418cc6fab843f4831d8f016f75cf6813be4b166| | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=e418cc6fab843f4831d8f016f75cf6813be4b166| | ||
[802.11] Clean up channel and rate handling]] | [802.11] Clean up channel and rate handling]] | ||
Line 41: | Line 43: | ||
==== Tuesday, 16 June ==== | ==== Tuesday, 16 June ==== | ||
+ | * On branch **wireless**: | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=56c50206ab6c3feda25d11f9e15ce2546288da2e| | ||
+ | [802.11] Recognize retransmitted packets]] | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=502f419feac76bbb69574a28ef81983b8eb84004| | ||
+ | [drivers rtl8180] Provide retry information with TX completion]] | ||
* On branch **mainline-review**: | * On branch **mainline-review**: | ||
* [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=a9a0567225493046f70e6252e21ab5c6d8219e87| | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=a9a0567225493046f70e6252e21ab5c6d8219e87| | ||
Line 73: | Line 80: | ||
This could be a bug in gPXE's TCP stack (unlikely), an rtl8180 driver-level issue causing it to resubmit stale received packets, memory corruption somewhere, or something to do with 802.11's longer link-layer header. Tomorrow I try to figure out which one it is. Wheee! | This could be a bug in gPXE's TCP stack (unlikely), an rtl8180 driver-level issue causing it to resubmit stale received packets, memory corruption somewhere, or something to do with 802.11's longer link-layer header. Tomorrow I try to figure out which one it is. Wheee! | ||
+ | |||
+ | ==== Wednesday, 17 June ==== | ||
+ | It was duplicate ACKs: a silly bug (signed versus unsigned) in the 802.11 layer caused the duplicate RX elimination code to work only half the time, and gPXE's TCP stack did not elegantly handle the duplicate ACKs thus generated. (802.11 can generate duplicate packets when a packet is received but its link-layer ACK is not, causing a retransmission which is also received.) I've patched the issue in both the 802.11 layer and the TCP stack, since TCP is meant to be resilient against such things. RFC 793 allows my fix: "If the ACK is a duplicate, it can be ignored" (p.72). | ||
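A minimal sketch of the two checks described above, not the actual gPXE code. It assumes duplicates are detected by comparing the 802.11 sequence control field of a frame marked as a retry against the last one seen, and that a TCP ACK is a duplicate when it falls below the oldest unacknowledged sequence number (the RFC 793 condition SEG.ACK < SND.UNA). The names `dedup_state`, `rx_is_duplicate` and `tcp_ack_is_duplicate` are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Retry flag in the 802.11 frame control field (bit 11). */
#define FC_RETRY 0x0800

/* Hypothetical receiver state: the last sequence control value seen. */
struct dedup_state {
	bool have_last;
	uint16_t last_seq_ctl;
};

/* Return true if this frame is a retransmission of the previous one.
 * Keeping everything unsigned and the same width avoids the kind of
 * signed/unsigned mismatch that can make such a comparison misfire. */
static bool rx_is_duplicate(struct dedup_state *st, uint16_t frame_ctl,
			    uint16_t seq_ctl)
{
	bool dup = (frame_ctl & FC_RETRY) && st->have_last &&
		   st->last_seq_ctl == seq_ctl;

	st->last_seq_ctl = seq_ctl;
	st->have_last = true;
	return dup;
}

/* RFC 793 duplicate test (SEG.ACK < SND.UNA). Sequence numbers wrap
 * modulo 2^32, so the difference is interpreted as a signed value. */
static bool tcp_ack_is_duplicate(uint32_t seg_ack, uint32_t snd_una)
{
	return (int32_t)(seg_ack - snd_una) < 0;
}
```

A caller would drop the frame when `rx_is_duplicate()` returns true, and skip ACK processing (rather than treating it as an error) when `tcp_ack_is_duplicate()` returns true.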
+ | |||
+ | I also found an unrelated bug in rtl8180 that caused it to cycle through its whole TX ring whenever one packet was completed, reporting the spurious TX completions with iob set to NULL. I believe the Linux driver does this too. No symptoms, but it's best to fix such things. | ||
+ | |||
+ | Thus, commits: | ||
+ | * On branch **wireless**: | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=5e27fab092e4be7ee2bcfb466c36e90b9895d2bc| | ||
+ | [802.11] Fix packet duplication elimination state]] | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=0b6003f7167d2d80876053c8908e394cf5dd246e| | ||
+ | [drivers rtl8180] Only report TX status once per packet]] | ||
+ | * On branch **mainline-review**: | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=8041741323b40d9f5c482d3c6e1391bee7be759d| | ||
+ | [tcp] Ignore duplicate ACKs in TCP ESTABLISHED state]] [+16 bytes] | ||
+ | |||
+ | Remaining things for this week: rate control, answers to questions from yesterday's entry, and pushing 802.11 code to mainline-review after I get the first two sorted out. Figuring out the iSCSI issue took much longer than I anticipated, but hopefully I'll still be able to get everything done. | ||
+ | |||
+ | ==== Thursday, 18 June ==== | ||
+ | Lots of progress today! I didn't get to rate control, but every other outstanding issue that I know about has been fixed. | ||
+ | |||
+ | Commits: | ||
+ | * On branch **mainline-review**: | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=4602299f4b96f7692766553d5972066dfd567b4e| | ||
+ | [netdevice] Add netdev->link_rc field for errors encountered during link-up]] [+20 bytes for rtl8139, +124 bytes for everything] | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=30df822acbf6a207201f111d886effa0e4fc97d3| | ||
+ | [ifmgmt] Move link-up status messages from autoboot() to iflinkwait()]] [+56 bytes] | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=22c261e77bda0984f4cb052037008f487a4bcaa6| | ||
+ | [hci] Expose ifcommon_exec() in a local header so wireless commands can use it]] [free] | ||
+ | * On branch **wireless**: Everything new in mainline-review, plus | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=e180c836140e375a328719730a2cc5a395ed3ce5| | ||
+ | [802.11] Clean up, document, and add helper inline functions to ieee80211.h]] | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=136432c8d49bddd54123d542c61494a4a9f73dff| | ||
+ | [802.11] Modify 802.11 layer to use ieee80211_next_ie() to step through IEs]] | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=1d485938737ab16799915d219187ca607c509d2d| | ||
+ | [netdevice, Makefile, 802.11] Revert print_status callback changes]] | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=171dd62083bcc0b425960b1dd72d7cfdd515e2f6| | ||
+ | [802.11] Add all 802.11 status and reason codes]] | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=ed9c48f52bea567b69411f63dd66cde71e251761| | ||
+ | [802.11] Revamped probe, fully asynchronous association, better error handling, much cleanup]] | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=9d5d094312e3f8f00bbc69280a9f4922d1319800| | ||
+ | [iwmgmt] Add functions for common user-level wireless tasks]] | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=3524902cf7ab246c86cd1a42177558cd3991d73b| | ||
+ | [hci] Add command interface to user-level wireless tasks]] | ||
+ | |||
+ | Wheee! | ||
+ | |||
+ | I heard from Michael concerning my questions above. He suggested I try to make association fully asynchronous, implementing some kind of link-up error indicator in ''net_device'' to handle the problem of errors never showing up. His original suggestion was to //replace// the link-up bit with the link-up rc value, using ''-EINPROGRESS'' to indicate that link-up was still in progress; I chose not to do it that way because of a subtle downside to gPXE's error-reporting system: | ||
+ | |||
+ | If ''netdevice'' sets rc to ''-EINPROGRESS'', it's setting a value for that error that has been defined (by errno.h) to include a constant showing that the error came from ''netdevice.c''. If ''ifmgmt'' then wants to check whether the error is ''-EINPROGRESS'', its comparison will be against its //own// ''EINPROGRESS'', with a field showing it came from ''ifmgmt.c'', and it will thus never conclude that the net_device's error code is really ''EINPROGRESS''. The error-reporting system is optimized for the assumption that errors will either be handled within the file that originated them, dropped at some layer, or propagated all the way back to the user for display. Indeed, a quick grep of the gPXE source showed not a single error equality comparison for an error originating outside the file the comparison was in. This tradeoff is very well-suited to gPXE's use cases, but it means we have to be careful about how we use the error codes. :-) | ||
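The pitfall can be illustrated with a simplified model. This is a sketch only: the bit layout, the file identifiers and the base code value are invented for illustration and are not gPXE's real error encoding.

```c
/* Hypothetical encoding: low 8 bits carry the base error code,
 * the next bits carry an identifier for the originating file. */
#define SKETCH_EINPROGRESS 0x44u
#define FILE_NETDEVICE     0x01u
#define FILE_IFMGMT        0x02u

#define FILE_ERR(file_id, code) ((int)(((file_id) << 8) | (code)))
#define ERR_BASE(rc)            (((unsigned int)-(rc)) & 0xffu)

/* netdevice.c reports "in progress" tagged with its own file ID. */
static int netdevice_report_in_progress(void)
{
	return -FILE_ERR(FILE_NETDEVICE, SKETCH_EINPROGRESS);
}

/* ifmgmt.c compares against its *own* tagged constant: never matches. */
static int ifmgmt_naive_is_in_progress(int rc)
{
	return rc == -FILE_ERR(FILE_IFMGMT, SKETCH_EINPROGRESS);
}

/* Comparing only the base code would match, at the cost of extra code. */
static int ifmgmt_masked_is_in_progress(int rc)
{
	return ERR_BASE(rc) == SKETCH_EINPROGRESS;
}
```

In this model the naive comparison fails for any error that crosses a file boundary, which is why a separate link-up bit is simpler than testing ''link_rc'' against ''-EINPROGRESS''.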
+ | |||
+ | Michael also explained a bit about a conceptual separation in gPXE between kernel-ish code and user-ish code; user-ish code uses ''printf()'' to report status and errors, while kernel-ish code uses ''DBG()'' (which is normally invisible) and reports errors via return codes. I realized that my earlier attempt at including wireless status violated that separation by putting a function using ''printf()'' in the 802.11 stack directly; so I scrapped it and worked out a mechanism that uses a wireless-specific command (iwstat, to which I added iwlist and iwassoc). | ||
+ | |||
+ | The ''netdev->link_rc'' addition produces a code size increase in ''gpxe.lkrn'' that is disproportionate to its real impact; it adds only a few bytes at a time, but they come in at every driver's use of ''netdev_link_down'' or ''netdev_link_up'' (to keep the ''link_rc'' field consistent). For the real size-critical case of a ROM with support for only one driver, the size impact will be negligible (~20 uncompressed bytes for rtl8139). | ||
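As a rough sketch of why every call site grows slightly, assume the link-up bit is kept and a separate rc field is updated alongside it. The structure and function names below are hypothetical stand-ins, not the real ''net_device'' API, and the choice of rc values is an assumption.

```c
#include <errno.h>

#define SKETCH_LINK_UP 0x01u

/* Hypothetical stand-in for the relevant part of a network device. */
struct netdev_sketch {
	unsigned int state;  /* flag bits, including the link-up bit */
	int link_rc;         /* 0 when link is up, else reason it is not */
};

/* Marking the link up also clears any stored error. */
static void sketch_link_up(struct netdev_sketch *nd)
{
	nd->state |= SKETCH_LINK_UP;
	nd->link_rc = 0;
}

/* Marking the link down records why, so user-level code can report it. */
static void sketch_link_down(struct netdev_sketch *nd, int rc)
{
	nd->state &= ~SKETCH_LINK_UP;
	nd->link_rc = rc;
}

/* The up/down test uses the bit, so no error-code comparison is needed. */
static int sketch_link_ok(const struct netdev_sketch *nd)
{
	return (nd->state & SKETCH_LINK_UP) != 0;
}
```

Each helper touches two fields instead of one, which is where the few extra bytes per driver call would come from in this model.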
+ | |||
+ | Tomorrow is rate control, figuring out a subset of 802.11 error codes to include human-readable definitions for, and cleaning all of this up to push it to mainline-review in time for Saturday's meeting. Hopefully I can manage it :-) | ||
+ | |||
+ | ==== Friday, 19 June ==== | ||
+ | * On branch **mainline-review**, something I did a while ago and forgot to port over: | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=6d63a4a5f928b46422e2eb79837a8aba103e5bb7| | ||
+ | [dhcp] Await link-up before starting DHCP]] | ||
+ | * On branch **wireless**, a bunch of small changes and one major new feature: | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=1a4ed46cb6a83a15828b250ebd0296775fd21ef4| | ||
+ | [802.11] Expose channel-changing functionality]] | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=48d2edc207edcee585b4a317e253e79af74fb290| | ||
+ | [iwmgmt] Make iwlist preserve existing associations, and be non-active]] | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=c110891fe4dd13f889792a2755e1637a5ecf7dfc| | ||
+ | [hci] Call iwlist without active argument, since it has been removed]] | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=fb129f1dcea8a81ec32ebd4c08be7c28716963b3| | ||
+ | [hci] Add wireless error lists]] | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=5ed61a78c6c8e412cf785587bbc7bd5eb46cbaa6| | ||
+ | [802.11] Add rate control support; fix two bugs; remove high rate bits]] | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=2f64f4fe523ce8d69c84077a0a29b3e1c6ef6b84| | ||
+ | [iwmgmt] Remove use of NET80211_RATE_VALUE]] | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=302ccbb4f84b97c655a009aa9708d946d1fbe95e| | ||
+ | [drivers rtl8180] Remove NET80211_RATE_VALUE; pass RX rate to netdev_rx()]] | ||
+ | |||
+ | I got rate control working today, and squashed a few other outstanding bugs that I found. With the rate-control algorithm in place, I was able to transfer a 13MB file over TFTP in 15 seconds on an 802.11g network with the access point about 12 feet away. For comparison, the same transfer over wired gigabit Ethernet took 4 seconds. I think the level this RC algorithm achieves is probably more than adequate for gPXE's performance requirements; there are various constants that can be tuned, but I'm not going to worry about that side of things right now. | ||
+ | |||
+ | I designed gPXE's rate-control algorithm mostly from scratch, based on my thoughts about how it would work well; a few aspects (such as the overriding rate decrease if we get 3 failed TXes) are based on Linux implementations. We keep data for every rate that could be used (<16 for all practical purposes), for the TX and RX paths separately, updating the TX information for our current TX rate when we receive TX completion status on a packet, and the RX information for a received packet's RX rate whenever we receive a data packet (management packets are generally sent at 1Mbps and so would skew the results). The data for each (rate, direction) combination is kept in a simple 32-bit integer, with two bits per packet (3 = OK, 2 = retried once, 1 = retried multiple times, 0 = didn't get through); when new packets are received old data is automatically shifted off the end, so that we always keep information for at most 16 packets for each (rate, direction) combination. The average of TX and RX qualities for a given rate, weighted by number of packets of data available for each and weighting TX packets more heavily than RX packets (they're more reliable), is munged into a "goodness" value for that rate between 0 and 99. Whenever the current rate's "goodness" falls below 85, we switch to the fastest rate with "goodness" over 85, or the rate with best "goodness" if none is over 85. | ||
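The history word and rate selection described above can be sketched as follows. This is an illustration under stated assumptions, not the contents of rc80211.c: the 2:1 weighting of TX over RX samples, the scaling to 0..99, the treatment of a rate with no data as goodness 0, and treating exactly 85 as acceptable are all guesses, and every name is hypothetical.

```c
#include <stdint.h>

#define HIST_MAX       16  /* 32 bits / 2 bits per packet */
#define GOOD_THRESHOLD 85

/* Per (rate, direction) history: newest sample in the low two bits. */
struct rate_hist {
	uint32_t bits;
	unsigned int count;  /* valid samples, at most HIST_MAX */
};

/* Record one packet: 3 = OK, 2 = one retry, 1 = several, 0 = lost.
 * The oldest sample falls off the top of the word automatically. */
static void hist_add(struct rate_hist *h, unsigned int quality)
{
	h->bits = (h->bits << 2) | (quality & 3u);
	if (h->count < HIST_MAX)
		h->count++;
}

/* Sum of the stored quality values. */
static unsigned int hist_sum(const struct rate_hist *h)
{
	unsigned int sum = 0, i;

	for (i = 0; i < h->count; i++)
		sum += (h->bits >> (2 * i)) & 3u;
	return sum;
}

/* Weighted average of TX and RX quality, scaled to 0..99.
 * TX samples count double (assumed weighting). */
static unsigned int goodness(const struct rate_hist *tx,
			     const struct rate_hist *rx)
{
	unsigned int num = 2 * hist_sum(tx) + hist_sum(rx);
	unsigned int den = 3 * (2 * tx->count + rx->count);

	return den ? (num * 99) / den : 0;
}

/* Rates are indexed slowest to fastest. Pick the fastest acceptable
 * rate, or the best-scoring one if none is acceptable. */
static int pick_rate(const unsigned int *good, int nrates)
{
	int i, best = 0;

	for (i = nrates - 1; i >= 0; i--)
		if (good[i] >= GOOD_THRESHOLD)
			return i;
	for (i = 1; i < nrates; i++)
		if (good[i] > good[best])
			best = i;
	return best;
}
```

A caller would invoke `pick_rate()` only when the current rate's goodness drops below the threshold, passing the goodness of every supported rate.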
+ | |||
+ | I got lucky with this one: the algorithm worked well as I designed it with only minor modifications, despite the fact that most of the numeric parameters would best be classified as educated guesses. :-) And it's quite small: | ||
+ | oremanj@xenon /home/oremanj/dev/gpxe % size src/bin/rc80211.o linux-2.6.30/net/mac80211/rc80211_minstrel.o | ||
+ | text data bss dec hex filename | ||
+ | 602 0 0 602 25a src/bin/rc80211.o | ||
+ | 3472 96 0 3568 df0 linux-2.6.30/net/mac80211/rc80211_minstrel.o | ||
+ | |||
+ | With the addition of rate control, and a few other bugfixes that came up while I was testing today, I think the wireless code is just about ready for mainline review. I'm going to go through it over the weekend and make sure nothing is missing documentation, remove whitespace that has crept onto line ends, and so forth; I plan to have everything I've worked on thus far over the summer in my mainline-review branch by Monday. | ||
+ | |||
+ | A quick size sanity check on the wireless code: | ||
+ | 220 24 0 244 f4 bin/iwmgmt_cmd.o | ||
+ | 602 0 0 602 25a bin/rc80211.o | ||
+ | 1479 56 0 1535 5ff bin/iwmgmt.o | ||
+ | 7282 100 24 7406 1cee bin/net80211.o | ||
+ | |||
+ | And for the rtl8180/rtl8185 driver: (only one of the RF handlers is required for any given card, in addition to rtl8180.o) | ||
+ | 3096 200 0 3296 ce0 bin/rtl8180.o | ||
+ | 1264 24 0 1288 508 bin/rtl8180_grf5101.o | ||
+ | 609 24 0 633 279 bin/rtl8180_max2820.o | ||
+ | 8174 24 0 8198 2006 bin/rtl8180_rtl8225.o | ||
+ | 985 24 0 1009 3f1 bin/rtl8180_sa2400.o | ||
+ | |||
+ | It's big by gPXE standards, but not enormous, and I think the size is reasonable given the complexity of the 802.11 protocol. And with the mucurses stuff (login and config) taken out but everything else default including iSCSI linked in, it meets the real test: | ||
+ | 68 -rw-r--r-- 1 oremanj oremanj 65024 2009-06-20 01:59 bin/rtl8180--rtl8180_rtl8225.rom | ||
+ | Under 64k---yippee! (With a completely default config it's 66,560 bytes.) Of course, the real challenge will be squeezing encryption support in there... but I've got the rest of the summer to figure that out ;-) | ||
+ | |||
+ | ==== Saturday, 20 June ==== | ||
+ | Did a bunch of cleanup with no feature changes, and everything has been pushed to mainline-review. :-) | ||
+ | * On branch **wireless**, merged in mainline-review such that wireless and mainline-review now represent the same tree. | ||
+ | * On branch **mainline-review**: | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=4edf4718760dfb35c6a0c811fc2e019fb176e9fc| | ||
+ | [802.11] Add support for 802.11 devices with software MAC layer]] [+8,006 bytes] | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=b23fba30c9847b8fabf651d50cf6e4e323753548| | ||
+ | [rtl818x] Add driver for Realtek 8180/8185 wireless cards]] [+3,298 + (2,930 for rtl8180; 8,198 for rtl8185) bytes] | ||
+ | * [[http://git.etherboot.org/?p=people/oremanj/gpxe.git;a=commit;h=dcd4ae5d0edbc9abd429bce50f0e58726cdfe00b| | ||
+ | [iwmgmt] Add user-level 802.11 management commands and common error tables]] [+1,779 bytes] | ||
+ | |||
+ | I took the opportunity to make the Realtek naming sane: | ||
+ | % make bin/rtl8180.lkrn # for an 8180 chipset with any of the 802.11b RF modules | ||
+ | % make bin/rtl8185.lkrn # for an 8185 chipset with its 8225 RF module | ||
+ | % make bin/rtl818x--rtl8180_sa2400.lkrn # 8180 chipset with a specific RF module, for the size-pressed | ||
+ | % make bin/10ec8185.rom # rtl8185 generic PCI card ROM (needs to be piggybacked on e.g. an r8169) | ||
+ | The main driver code is now called "rtl818x" to signify that it works for both 8180 and 8185. It was getting quite confusing having some things named rtl8180 and some named rtl818x. The original distinction between the two (from the Linux driver) was that the rtl818x structures also applied to the rtl8187 USB device; I doubt gPXE is ever going to support wireless USB devices, so we don't have to follow that lead. ''rtl8180.c'' and ''rtl8185.c'' are wrapper files (zero bytes compiled) that use ''REQUIRE_OBJECT()'' to pull in the necessary ''rtl818x.o'' main driver and whatever RF modules are required for the type of card they represent; each also contains a dummy list of ''PCI_ROM'' lines for that card, to enable the form of ''make'' shown in the last line above. The real NIC list in ''rtl818x.c'' is presented such that ''parserom.pl'' will not be confused by it. | ||
+ | |||
+ | I also moved the wireless code into net/80211/ (from the root of net/), in recognition that there will be several more files appearing there shortly to support encryption. | ||
+ | |||
+ | Commits ready for mainline review, in reverse order: | ||
+ | oremanj@xenon /home/oremanj/dev/gpxe/src % git log --pretty=oneline mainline-review | head -n 12 | ||
+ | dcd4ae5d0edbc9abd429bce50f0e58726cdfe00b [iwmgmt] Add user-level 802.11 management commands and common error tables | ||
+ | b23fba30c9847b8fabf651d50cf6e4e323753548 [rtl818x] Add driver for Realtek 8180/8185 wireless cards | ||
+ | 4edf4718760dfb35c6a0c811fc2e019fb176e9fc [802.11] Add support for 802.11 devices with software MAC layer | ||
+ | 6d63a4a5f928b46422e2eb79837a8aba103e5bb7 [dhcp] Await link-up before starting DHCP | ||
+ | 22c261e77bda0984f4cb052037008f487a4bcaa6 [hci] Expose ifcommon_exec() in a local header so wireless commands can use it | ||
+ | 30df822acbf6a207201f111d886effa0e4fc97d3 [ifmgmt] Move link-up status messages from autoboot() to iflinkwait() | ||
+ | 4602299f4b96f7692766553d5972066dfd567b4e [netdevice] Add netdev->link_rc field for errors encountered during link-up | ||
+ | 8041741323b40d9f5c482d3c6e1391bee7be759d [tcp] Ignore duplicate ACKs in TCP ESTABLISHED state | ||
+ | a9a0567225493046f70e6252e21ab5c6d8219e87 [image] Modify imgfree command to accept an argument | ||
+ | f77b486f42b2b12604ce94d65cbba33b55a589e5 [netdevice] Adjust maximum link-layer header length for 802.11 | ||
+ | d429b31ac28760004e753dc79178400d507975e2 [netdevice] Add netdev argument to link-layer push and pull handlers | ||
+ | 18e6470d06d8846d531d97d881be6f1278bd2f15 [nvs] Add init function for Atmel 93C66 EEPROM | ||
+ | |||
+ | Next up: encryption... |