Date:      Thu, 30 Nov 2017 15:39:51 +0100
From:      "Hartmann, O." <ohartmann@walstatt.org>
To:        Hans Petter Selasky <hselasky@FreeBSD.org>
Cc:        src-committers@freebsd.org, svn-src-all@freebsd.org, svn-src-head@freebsd.org
Subject:   Re: svn commit: r326376 - head/sys/kern
Message-ID:  <20171130153946.0288b55f@hermann>
In-Reply-To: <201711292328.vATNSeOM046518@repo.freebsd.org>
References:  <201711292328.vATNSeOM046518@repo.freebsd.org>

On Wed, 29 Nov 2017 23:28:40 +0000 (UTC)
Hans Petter Selasky <hselasky@FreeBSD.org> wrote:

> Author: hselasky
> Date: Wed Nov 29 23:28:40 2017
> New Revision: 326376
> URL: https://svnweb.freebsd.org/changeset/base/326376
>
> Log:
>   The sched_add() function is not only used when the thread is initially
>   started, but also by the turnstiles to mark a thread as runnable for
>   all locks, for instance sleepqueues do:
>   setrunnable()->sched_wakeup()->sched_add()
>
>   In r326218 code was added to allow booting from non-zero CPU numbers
>   by setting the ts_cpu field inside the ULE scheduler's sched_add()
>   function. This had an undesired side-effect that prior sched_pin() and
>   sched_bind() calls got disregarded. This patch fixes the
>   initialization of the ts_cpu field for the ULE scheduler to only
>   happen once when the initial thread is constructed during system
>   init. Forking will then later on ensure that a valid ts_cpu value gets
>   copied to all children.
>
>   Reviewed by:	jhb, kib
>   Discussed with:	nwhitehorn
>   MFC after:	1 month
>   Differential revision:	https://reviews.freebsd.org/D13298
>   Sponsored by:	Mellanox Technologies
>
> Modified:
>   head/sys/kern/sched_ule.c
>
> Modified: head/sys/kern/sched_ule.c
> ==============================================================================
> --- head/sys/kern/sched_ule.c	Wed Nov 29 21:16:14 2017	(r326375)
> +++ head/sys/kern/sched_ule.c	Wed Nov 29 23:28:40 2017	(r326376)
> @@ -1405,7 +1405,6 @@ sched_setup(void *dummy)
>  
>  	/* Add thread0's load since it's running. */
>  	TDQ_LOCK(tdq);
> -	td_get_sched(&thread0)->ts_cpu = curcpu; /* Something valid to start */
>  	thread0.td_lock = TDQ_LOCKPTR(TDQ_SELF());
>  	tdq_load_add(tdq, &thread0);
>  	tdq->tdq_lowpri = thread0.td_priority;
> @@ -1642,6 +1641,7 @@ schedinit(void)
>  	ts0->ts_ltick = ticks;
>  	ts0->ts_ftick = ticks;
>  	ts0->ts_slice = 0;
> +	ts0->ts_cpu = curcpu;	/* set valid CPU number */
>  }
>  
>  /*
> @@ -2453,7 +2453,6 @@ sched_add(struct thread *td, int flags)
>  	 * Pick the destination cpu and if it isn't ours transfer to the
>  	 * target cpu.
>  	 */
> -	td_get_sched(td)->ts_cpu = curcpu; /* Pick something valid to start */
>  	cpu = sched_pickcpu(td, flags);
>  	tdq = sched_setcpu(td, cpu, flags);
>  	tdq_add(tdq, td, flags);
>  	tdq_add(tdq, td, flags);
> _______________________________________________
> svn-src-head@freebsd.org mailing list
> https://lists.freebsd.org/mailman/listinfo/svn-src-head
> To unsubscribe, send any mail to
> "svn-src-head-unsubscribe@freebsd.org"

With that patch committed, I was able to boot at least a single-user
kernel (r326394). The kernel immediately crashes when trying to start
X11 (the laptop is a Lenovo E540 with an Intel Haswell i5-4200M CPU; the
Optimus nVidia GPU is not used, i915kms.ko is loaded).

The binary world is now r326394, the kernel successfully running is
r325893. Anything > r326218 is failing!

And it is even worse for me. On the three crashed servers I'm unable to
boot any(!) kernel. All three systems boot off Samsung SSDs: two have an
850 Pro (256 GB), one server has an older Samsung 830 Pro (128 GB). Any
kernel booted on those boxes after they were corrupted, whether
r326394 GENERIC or the custom kernels adjusted to each machine,
boots and then gets stuck at a certain point after initializing
USB. Forever. They still seem to be active in some way, since plugging and
unplugging devices is reported.

A weird thing is, I have installed different kernels from packages
built on the remaining machine, dropped at /boot/kernel.SOMENAME.

The loader doesn't offer me those additional kernels: when exiting the
loader menu with option 3 and typing:

load /boot/kernel.GENERIC/kernel

at the loader prompt (OK), I get something like "no filename provided"
or similar. Huh? What the ... does this mean? When inspecting the
filesystem with this minimalistic CURRENT from 21.11.2017 (USB flash
image), the folder boot/kernel.GENERIC/ is well populated and the
GENERIC kernel is also there and seems fine.

What happened here?

From another post I got the idea that the "patch" by N. Whitehorn
corrupted or destroyed SSDs - so how can this be fixed or: how can I
check whether the SSD system has been destroyed?

For the record: after a by-the-book update to r32635X (new kernel
booted in single-user mode), a kernel crash occurred while installing
world. After that, I was presented with a halted BTX loader screen on
all those SSD-driven systems.

On two of the systems I have a complete /usr/obj on ZFS volumes. With
the CURRENT images provided I can neither installworld/installkernel
this world nor rebuild it with the proper fixes. Please advise how to
perform an installworld/installkernel. Somewhere FreeBSD must have some
disaster recovery image that is not crippled and stripped of the
compiler suite.

Kind regards and thanks in advance,

oh


