From owner-freebsd-bugs Sat Feb 10 16:33:21 1996
Return-Path: owner-bugs
Received: (from root@localhost) by freefall.freebsd.org (8.7.3/8.7.3) id QAA24614 for bugs-outgoing; Sat, 10 Feb 1996 16:33:21 -0800 (PST)
Received: from tfs.com (tfs.com [140.145.250.1]) by freefall.freebsd.org (8.7.3/8.7.3) with SMTP id QAA24607 for ; Sat, 10 Feb 1996 16:33:17 -0800 (PST)
Received: by tfs.com (smail3.1.28.1)
Message-Id:
Date: Sat, 10 Feb 96 16:33 PST
From: julian@TFS.COM (Julian Elischer)
To: bugs@freebsd.org
Subject: krlogin/krlogind usage of OOB data is broken (fwd)
Newsgroups: comp.protocols.kerberos,comp.bugs.4bsd
In-Reply-To: <4fiobt$590@jik.datasrv.co.il>
Organization: TRW Financial Systems, Oakland, CA
Sender: owner-bugs@freebsd.org
Precedence: bulk

just in case this is relevant

------- start of forwarded message -------
Path: tfs.com!agate!howland.reston.ans.net!swrinde!newsfeed.internetmci.com!bloom-beacon.mit.edu!jik.datasrv.co.il!jik.datasrv.co.il!jik
From: jik@annex-1-slip-jik.cam.ov.com (Jonathan Kamens)
Newsgroups: comp.protocols.kerberos,comp.bugs.4bsd
Subject: krlogin/krlogind usage of OOB data is broken
Date: 10 Feb 1996 18:32:09 GMT
Organization: jik's Linux box
Lines: 99
Sender: jik@jik.datasrv.co.il (Jonathan Kamens)
Message-ID: <4fiobt$590@jik.datasrv.co.il>
NNTP-Posting-Host: jik.datasrv.co.il
Xref: tfs.com comp.protocols.kerberos:5007 comp.bugs.4bsd:143

(Comp.bugs.4bsd folks: This posting relates to a problem in how the Kerberos rlogin and rlogind programs use out-of-band data. However, I believe that the problem I'm describing here is shared by the stock BSD rlogin and rlogind programs, which is why I'm cross-posting to comp.bugs.4bsd.)

I understand that Sam Hartman has done a considerable amount of work on rlogin and rlogind, trying to get their handling of out-of-band data to work properly.
I've done similar work independently of Sam, and I suspect that my changes are quite different from the ones he's implemented; nevertheless, I've come to the conclusion that the way the protocol uses OOB data is broken in at least one way that simply cannot be fixed without changing the protocol.

The basic problem I'm encountering is this: What happens if krlogind sends an OOB message to krlogin, and then it sends a *second* OOB message before krlogin has processed the first one? This *can* and *does* happen. For example, when I krlogin from my Linux box at home to an AIX box at work over a SLIP link, the AIX box sends three different OOB messages while initializing the connection, and network congestion can easily cause all of them to get to my Linux box in consecutive packets, too quickly for it to deal with each of them before the next one arrives.

Unfortunately, the Linux kernel (and, I believe, many other UNIX kernels as well) allows only one pending OOB message at a time. If a second message is received while the first one is still pending, the first one becomes part of the normal data stream, and the OOB mark is moved to the second one. This does appear to be legal, according to the BSD documentation about OOB data.

Consider what occurs if this happens with krlogin/krlogind -- if krlogind sends multiple OOB messages consecutively, then krlogin will process one of them, but the rest will simply be part of the data stream, thus causing one or more garbage characters to appear on the user's screen. If the connection is being encrypted, the results are much worse -- the OOB messages that enter the normal data stream corrupt it, which usually causes krlogin to complain and close the connection.

I came up with three hacks to reduce the likelihood of this problem, but they're all real hacks, and even all of them together don't work 100% of the time.
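To make the failure mode concrete, here is a toy Python model of a receiver that tracks only one OOB byte at a time, as described above. It is only a sketch of that behaviour, not kernel code; the class name, method names, and the byte values used in the demo are all hypothetical.

```python
class SingleMarkReceiver:
    """Toy model: only one out-of-band byte can be pending at a time."""

    def __init__(self):
        self.inline = bytearray()  # what ordinary read() calls would see
        self.pending = None        # (offset into inline, byte value) or None

    def receive_data(self, data):
        self.inline += data

    def receive_oob(self, value):
        if self.pending is not None:
            # The unread OOB byte loses its mark and falls back into the
            # normal stream at the position where it originally arrived.
            offset, old = self.pending
            self.inline.insert(offset, old)
        self.pending = (len(self.inline), value)

    def read_oob(self):
        if self.pending is None:
            return None
        _, value = self.pending
        self.pending = None
        return value


def demo():
    rx = SingleMarkReceiver()
    # Three OOB bytes arrive back to back before the client reads any.
    for value in (0x80, 0x10, 0x02):   # hypothetical command bytes
        rx.receive_oob(value)
    rx.receive_data(b"login: ")
    return rx.read_oob(), bytes(rx.inline)
```

In this model `demo()` delivers only the last byte as OOB data; the first two show up ahead of the prompt in the normal stream, which is the garbage the user sees (or, with encryption, the bytes that corrupt the stream).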
First of all, I modified the protocol() function in krlogind so that any single run through the protocol() loop only causes a single OOB byte to be sent, with all the commands that need to be sent OR'd together in it. This appears to be OK since (a) krlogin treats the OOB byte as a mask, and checks it to see which bits are set, and (b) the various commands sent as OOB bytes are bit-wise exclusive of each other. I confess that of the three hacks I came up with, this is the one I'm least sure about, so if anyone can confirm or deny that this is a reasonable thing to do, I'd love to hear it. For the Linux -> AIX case I mentioned above, this reduces the number of OOB bytes sent by the AIX box from three to two.

Second, I modified krlogind so that it never sends two OOB messages less than five seconds apart. In *most* cases, this gives the client time to process the first OOB message before the second one is sent. But of course, it introduces delays when initiating some connections.

Sometimes network congestion or whatever makes the five-second pause by krlogind meaningless, and besides, sometimes krlogin will have to talk to a krlogind which hasn't been modified in this way. So I put a third hack in des_read() in krlogin. When des_read() reads the length of the next encrypted data block off the net, and that length is absurd, it checks to see if the first byte of the length contains a valid OOB message. If it does, it processes it as an OOB message, shifts the three remaining bytes of the length up one, and then reads a new byte to replace the one that was treated as OOB. In the case of the Linux -> AIX connection I mentioned above, it ends up doing this twice, since the Linux box gets three OOB messages in quick succession and only ends up dealing with one of them as OOB data. I figured that this doesn't really pose a threat to the encrypted data stream, since if there's really a problem with it the problem will turn up later anyway.
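The first and third hacks can be sketched roughly as follows, in Python rather than the C of the real programs. This is an illustration under assumptions, not the actual krlogind or des_read() code: the control-bit names and values are the ones I recall from the BSD headers and should be double-checked, and `MAX_BLOCK`, `coalesce`, `looks_like_oob`, `read_block_length`, and the big-endian 4-byte length are hypothetical choices made for the example.

```python
# Control bits as recalled from the BSD headers; treat the values as assumptions.
TIOCPKT_FLUSHWRITE = 0x02
TIOCPKT_NOSTOP = 0x10
TIOCPKT_DOSTOP = 0x20
TIOCPKT_WINDOW = 0x80
KNOWN_BITS = (TIOCPKT_FLUSHWRITE | TIOCPKT_NOSTOP
              | TIOCPKT_DOSTOP | TIOCPKT_WINDOW)

MAX_BLOCK = 8192  # hypothetical limit above which a length is "absurd"


def coalesce(commands):
    """Hack 1: OR all commands queued in one loop pass into a single byte."""
    mask = 0
    for command in commands:
        mask |= command
    return mask


def looks_like_oob(byte):
    """True if the byte is non-zero and sets only known control bits."""
    return byte != 0 and (byte & ~KNOWN_BITS) == 0


def read_block_length(read_byte, handle_oob):
    """Hack 3: read a 4-byte length, peeling off stray OOB bytes first.

    read_byte() returns the next byte of the stream as an int;
    handle_oob(byte) processes a control byte found in the stream.
    """
    header = bytearray(read_byte() for _ in range(4))
    while True:
        length = int.from_bytes(header, "big")  # byte order is an assumption
        if length <= MAX_BLOCK:
            return length
        if not looks_like_oob(header[0]):
            raise ValueError("absurd length and no OOB byte to blame")
        handle_oob(header[0])
        # Shift the remaining three bytes up and read one replacement byte.
        header = header[1:] + bytes([read_byte()])
```

For example, feeding it a stream in which two control bytes precede a real length of 16 peels both off before returning:

```python
seen = []
stream = iter(bytes([0x80, 0x02, 0x00, 0x00, 0x00, 0x10]))
length = read_block_length(stream.__next__, seen.append)
# length is 16 and seen holds the two control bytes, in order
```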
However, that third hack, the one in krlogin, will only work when an encrypted session is being used. Non-encrypted sessions will still end up with some OOB messages not getting processed and ending up as garbage in the data stream. Furthermore, even with these hacks, I've still seen instances where des_read() gets unexpected values when it tries to read the length off the net, or where the encrypted data is not available for some reason when it tries to read it.

As far as I can tell, the only way to make this work reliably is to require hand-shaking -- when krlogind sends OOB data to krlogin, krlogin needs to send OOB data back to krlogind to tell it when it has processed the data, and krlogind needs to wait for that ACK before sending any more OOB data. This is, I believe, how telnet/telnetd handle their OOB data.

Unfortunately, this would require changing the krlogin/krlogind protocol (and I realize that "protocol" is a strong word) in a way that would make the new krlogin incompatible with the old krlogind and vice versa. The closest thing that I can come up with to modifying the protocol in a backward-compatible way is to have krlogind set a bit in the first OOB byte it sends, to tell krlogin, "I know how to deal with OOB ACK messages, so you should ACK every OOB message you receive." Unfortunately, I can't figure out a protocol-compatible way for krlogin to tell krlogind that it knows how to deal with this bit, so after sending this bit to krlogin, krlogind has no way of knowing whether it should wait for the ACK from krlogin.

I would appreciate any input that people might have into this problem. Am I right that there's a problem? Has it always been there?
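For reference, the server side of the ACK hand-shaking proposed above might look something like the following Python sketch. It only shows the "one unacknowledged OOB byte at a time" bookkeeping; the class `AckedOobSender` and its methods are hypothetical, and nothing here addresses the backward-compatibility problem of finding out whether the peer will ACK at all.

```python
from collections import deque


class AckedOobSender:
    """Hold back OOB bytes until the peer has ACKed the previous one."""

    def __init__(self, send_oob):
        self._send_oob = send_oob    # callable that transmits one OOB byte
        self._queue = deque()        # commands waiting for their turn
        self._awaiting_ack = False   # True while one byte is unacknowledged

    def send(self, command):
        self._queue.append(command)
        self._pump()

    def on_ack(self):
        # Called when the peer reports it has processed the last OOB byte.
        self._awaiting_ack = False
        self._pump()

    def _pump(self):
        if self._awaiting_ack or not self._queue:
            return
        self._send_oob(self._queue.popleft())
        self._awaiting_ack = True
```

With this in place, three commands queued in quick succession go out one per ACK, so the receiver never has two OOB bytes pending at once.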
Is there any way to solve it, short of either (a) modifying the protocol in a way that isn't backward-compatible, or (b) ditching krlogin/krlogind altogether and using ktelnet/ktelnetd instead? (Yes, I'd love to do that, but first of all, some of our customers demand krlogin/krlogind, and second, I've heard rumors that the security negotiation in ktelnet/ktelnetd is vulnerable.)

Thanks.

------- end of forwarded message -------