Date: Mon, 06 May 2013 20:37:25 +1000 From: Lawrence Stewart <lstewart@freebsd.org> To: "Angelogiannopoulos, Aris" <Aris.Angelogiannopoulos@netapp.com> Cc: "freebsd-net@freebsd.org" <freebsd-net@freebsd.org> Subject: Re: Siftr inflight byte question Message-ID: <518787E5.1030500@freebsd.org> In-Reply-To: <B37E016C3006B044970756E725F25D72DC3922@SACEXCMBX06-PRD.hq.netapp.com> References: <B37E016C3006B044970756E725F25D72DC3909@SACEXCMBX06-PRD.hq.netapp.com> <51876A60.10209@swin.edu.au> <B37E016C3006B044970756E725F25D72DC3922@SACEXCMBX06-PRD.hq.netapp.com>
next in thread | previous in thread | raw e-mail | index | archive | help
[ccing freebsd-net@ so my problem description enters the collective subconscious in case I forget about this again] For everyone tuning in, Aris asked me the apt question of why siftr(4)'s "# inflight bytes" field doesn't take into account sacked bytes. On 05/06/13 18:45, Angelogiannopoulos, Aris wrote: > On 05/06/13 18:31, Lawrence Stewart wrote: >> On 05/06/13 18:18, Angelogiannopoulos, Aris wrote: >>> I noticed that the sent_inflight_bytes variable doesn’t take >>> into account the SACKd bytes in case SACK is used. >>> >>> Is there a specific reason for this? >> >> There is a very good reason - I was too lazy at the time ;) >> >> More specifically, calculating the number of bytes SACKed by the >> receiver is more difficult than pulling simple variables out of the >> TCP control block and I was new to the TCP stack when I wrote >> SIFTR. I've simply never bothered to go back and improve things in >> this regard. > > isn't it as simple as (tp->snd_nxt - tp->snd_fack) + > tp->sackhint.sack_bytes_rexmit ? This is the way the stack is doing > it in case of more than three duplicate acks. You would hope so given the comment accompanying that calculation in tcp_do_segment(), but I'm pretty sure that when I looked into this many moons ago, I found that calculation to be incorrect. I should obviously have gotten to the bottom of the problem back then, but from memory it was not crucial to what I was working on at the time and I subsequently forgot about it. It's good to come back to this now as it should be fixed. >>> Should we change it to better reflect the conditions on the >>> channel? >> >> Ideally, yes. Feel like having a go and submitting a patch? I can >> give you some hints on how to do it if you want a concrete starting >> point. > > Sure why not? My write up below is partially from memory and from a 2 min look at the code just now i.e. parts may be incorrect - please use it as a starting point only... So snd_fack effectively acts as a proxy for snd_una while SACK recovery is happening i.e. it tracks the highest sequence number we've been cumulatively SACKed for (left edge of the window). sack_bytes_rexmit counts the number of bytes we've retransmitted from the known holes during the current recovery episode. snd_max is the highest sequence number we've sent (right edge of the window). Another important bit of info is that the sender-side sack scoreboard stores holes (bytes missing at the receiver), whereas the sack blocks that come in on the wire refer to received byte ranges (bytes successfully received at the receiver). Therefore from the senders perspective, I believe the "pseudo" calculation for bytes in flight during a SACK recovery episode should look something like: (snd_max - snd_fack) - total_hole_bytes + sack_bytes_rexmit 3 of those variables are already around and (hopefully) usable, but I don't believe we currently track a variable equivalent to total_hole_bytes. Hopefully someone will chime in and correct me if any of the above is misguided. Otherwise, your challenge, should you choose to accept it, is to patch SIFTR to perform the above calculation, and to verify that the reported result is correct - perhaps by running some controlled lab experiments where you can review the value reported by SIFTR against the known SACK blocks and sequence numbers on the wire during the recovery episode. You could choose to make total_hole_bytes a sackhint like sack_bytes_rexmit, but I suggest you create a first version of the patch which manually walks the sack scoreboard each time through siftr_siftdata() to sum the # bytes in each hole to ensure you don't inadvertently introduce accounting bugs while developing the patch. Patch v2 can include total_hole_bytes as a sackhint. Cheers, Lawrence
Want to link to this message? Use this URL: <https://mail-archive.FreeBSD.org/cgi/mid.cgi?518787E5.1030500>