Date:      Fri, 18 Jan 2008 21:16:38 +0200
From:      John Hay <jhay@meraka.org.za>
To:        "M. Warner Losh" <imp@bsdimp.com>
Cc:        freebsd-arm@freebsd.org, des@freebsd.org
Subject:   Re: sshd broken on arm?
Message-ID:  <20080118191638.GA30155@zibbi.meraka.csir.co.za>
In-Reply-To: <20080118.120152.-345488389.imp@bsdimp.com>
References:  <4790D750.4060702@errno.com> <20080118.101747.-579326832.imp@bsdimp.com> <20080118185634.GA28843@zibbi.meraka.csir.co.za> <20080118.120152.-345488389.imp@bsdimp.com>

On Fri, Jan 18, 2008 at 12:01:52PM -0700, M. Warner Losh wrote:
> In message: <20080118185634.GA28843@zibbi.meraka.csir.co.za>
>             John Hay <jhay@meraka.org.za> writes:
> : On Fri, Jan 18, 2008 at 10:17:47AM -0700, M. Warner Losh wrote:
> : > In message: <4790D750.4060702@errno.com>
> : >             Sam Leffler <sam@errno.com> writes:
> : > : John Hay wrote:
> : > : > On Thu, Jan 17, 2008 at 12:58:54PM +0200, John Hay wrote:
> : > : >   
> : > : >> Hi Guys,
> : > : >>
> : > : >> I just did a new build using RELENG_7 for the arm (Avila boards) and then
> : > : >> found that I cannot ssh into them. The sshd crash with a bus error just
> : > : >> after you entered your username and password. My build of mid November
> : > : >> did not do it. Anybody got ideas?
> : > : >>
> : > : >> The last part of "sshd -Dddd" on the arm board looks like this:
> : > : >>
> : > : >> debug1: server_input_channel_req: channel 0 request pty-req reply 0
> : > : >> debug1: session_by_channel: session 0 channel 0
> : > : >> debug1: session_input_channel_req: session 0 req pty-req
> : > : >> debug1: Allocating pty.
> : > : >> debug3: mm_request_send entering: type 25
> : > : >> debug3: monitor_read: checking request 25
> : > : >> debug3: mm_answer_pty entering
> : > : >> debug1: session_new: init
> : > : >> debug1: session_new: session 0
> : > : >> debug3: mm_pty_allocate: waiting for MONITOR_ANS_PTY
> : > : >> debug3: mm_request_receive_expect entering: type 26
> : > : >> debug3: mm_request_receive entering
> : > : >> debug3: mm_request_send entering: type 26
> : > : >> ssh_mm_receive_fd: recvmsg: expected received 1 got 0
> : > : >> debug1: do_cleanup
> : > : >> debug1: PAM: cleanup
> : > : >> Bus error (core dumped)
> : > : >> debug3: PAM: sshpam_thread_cleanup entering
> : > : >>     
> : > : >
> : > : > Ok, I found the problem. It looks like something changed and now the
> : > : > alignment for the char tmp[...] array in monitor_fdpass.c:mm_send_fd
> : > : > and monitor_fdpass.c:mm_receive_fd is different and the arm processors
> : > : > do not like it. Attached is my quick fix.
> : > : >
> : > : > One question that I have is if we should just fix all of these "problems"
> : > : > or should something be changed so that these things are aligned again? In
> : > : > the last month or two I have come across quite a few of these things that
> : > : > used to work on the arm and now do not anymore because of alignment
> : > : > changes.
> : > : >
> : > : > (I have cc'ed des@ because his name pitch up a lot in the openssh cvs logs.
> : > : > :-)
> : > : >   
> : > : 
> : > : This used to work fine so the problem is elsewhere.  Sounds like a 
> : > : toolchain or header change is the root cause.
> : > 
> : > Or some subtle change in the kernel that isn't using the macro (or is
> : > now and didn't used to be).
> : 
> : Hmmm Just to make sure that I'm on the right page. On FreeBSD ARM one
> : is not supposed to be able to access unaligned memory? Ie. an int that
> : does not start on an address that is a multiple of 4.
> : 
> : In a C function if you have something like "char tmp[4]", can you assume
> : that the compiler will align it on a 4 byte boundary or can it do it on
> : a byte boundary?
> : 
> : If one cannot access unaligned ints and char arrays are not int aligned,
> : then we were just lucky that the code worked at some stage.
> 
> You are correct.  The fact that it seemed to work meant that we were
> either getting lucky before, or there was some critical code on the
> kernel side that has accidentally been removed...

I don't think the kernel makes a difference. The bus error happens on
line 64 of openssh/monitor_fdpass.c, which is before sendmsg() is
called, so the kernel is not involved yet. The only exception would be
if the kernel now aligns the stack differently.
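
To illustrate the alignment question, here is a minimal sketch of one
way to guarantee that the control-message buffer is aligned for a
struct cmsghdr: wrap the char array in a union with a cmsghdr member.
This is only an illustration, not the attached quick fix. The names
fd_cmsg, cmsg_aligned and fd_cmsg_fill are made up for the example, and
the fill routine only approximates what an fd-passing sender does
before sendmsg().

```c
#include <assert.h>
#include <stdalign.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Hypothetical type: control buffer with room for one file descriptor.
 * The struct cmsghdr member forces the whole union to cmsghdr
 * alignment; a bare "char tmp[...]" only guarantees byte alignment.
 */
union fd_cmsg {
	struct cmsghdr hdr;
	char buf[CMSG_SPACE(sizeof(int))];
};

/* Return 1 if p is suitably aligned for a struct cmsghdr, else 0. */
static int
cmsg_aligned(const void *p)
{
	return ((uintptr_t)p % alignof(struct cmsghdr)) == 0;
}

/*
 * Hypothetical helper: set up msg so that its control data carries fd
 * as an SCM_RIGHTS message stored in c. Writing the header fields is
 * safe because c is aligned by construction.
 */
static void
fd_cmsg_fill(union fd_cmsg *c, struct msghdr *msg, int fd)
{
	struct cmsghdr *cmsg;

	memset(c, 0, sizeof(*c));
	memset(msg, 0, sizeof(*msg));
	msg->msg_control = c->buf;
	msg->msg_controllen = sizeof(c->buf);
	assert(cmsg_aligned(msg->msg_control));

	cmsg = CMSG_FIRSTHDR(msg);
	cmsg->cmsg_len = CMSG_LEN(sizeof(int));
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;
	/* memcpy avoids an int store through a possibly odd pointer. */
	memcpy(CMSG_DATA(cmsg), &fd, sizeof(fd));
}
```

A caller would declare "union fd_cmsg c; struct msghdr msg;" on the
stack and call fd_cmsg_fill(&c, &msg, fd) before sendmsg(). With a
plain char array the same header stores only work if the compiler
happens to place the array on a 4 byte boundary.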

> : > John, I don't suppose you'd have time for a binary search?
> : 
> : I'll see what I can do, but it will be slow going.
> 
> Bad sshd with kernels going back in time should be sufficient...

That should go faster. :-) I'll look at it tomorrow between power failures.
South Africa has a power shortage at the moment and they are load shedding
us in 3-6 hour sessions. :-(((

John
-- 
John Hay -- John.Hay@meraka.csir.co.za / jhay@FreeBSD.org


