Date:      Mon, 15 Jan 2018 06:44:06 +0000
From:      Grzegorz Junka <list1@gjunka.com>
To:        Warner Losh <imp@bsdimp.com>
Cc:        freebsd-questions@freebsd.org, freebsd-drivers@freebsd.org
Subject:   Re: Server doesn't boot when 3 PCIe slots are populated
Message-ID:  <0e582bdb-e1f9-438c-3da2-2bcdc950aab5@gjunka.com>
In-Reply-To: <CANCZdfqZ-dogHXBdoyMPLOPs_R-vD%2BwLM-r6sm6ypesd0Nvp4A@mail.gmail.com>
References:  <ecce3fa6-3909-0947-685c-8a412684e99c@gjunka.com> <CAOgwaMsf9zByJYhL3KqpUMW5qKAzQEHpDWcwejY-uK=9swWbUQ@mail.gmail.com> <3d0ad00c-5214-71b0-017b-c2d5ba608e37@gjunka.com> <CAOgwaMsOKrGfGNmRt-C9Skjssj8JPAtFpk8bwG9v55LmaWdoVw@mail.gmail.com> <8df1e967-01e0-d3c2-e14c-64c7fc8c66b0@gjunka.com> <CANCZdfqZ-dogHXBdoyMPLOPs_R-vD%2BwLM-r6sm6ypesd0Nvp4A@mail.gmail.com>

On 15/01/2018 06:18, Warner Losh wrote:
>
>
> On Jan 14, 2018 11:05 PM, "Grzegorz Junka" <list1@gjunka.com> wrote:
>
>
>     On 14/01/2018 16:18, Mehmet Erol Sanliturk wrote:
>
>
>
>         On Sun, Jan 14, 2018 at 5:46 PM, Grzegorz Junka
>         <list1@gjunka.com> wrote:
>
>
>             On 13/01/2018 17:56, Mehmet Erol Sanliturk wrote:
>
>
>
>                 On Sat, Jan 13, 2018 at 7:21 PM, Grzegorz Junka
>                 <list1@gjunka.com> wrote:
>
>                     Hello,
>
>                     I am installing a FreeBSD server based on Supermicro H8SML-iF.
>                     There are three PCIe slots to which I installed 2 NVMe drives and
>                     one network card Intel I350-T4 (with 4 Ethernet slots).
>
>                     I am observing a strange behavior where the system doesn't boot if
>                     all three PCIe slots are populated. It shows this message:
>
>                     nvme0: <Generic NVMe Device> mem 0xfd8fc000-0xfd8fffff irq 24 at device 0.0 on pci1
>                     nvme0: controller ready did not become 1 within 30000 ms
>                     nvme0: did not complete shutdown within 5 seconds of notification
>
>                     Then I see a kernel panic/dump and the system
>                     reboots after 15 seconds.
>
>                     If I remove one card, either one of the NVMe drives or the network
>                     card, the system boots fine. Also, if in BIOS I set PnP OS to YES
>                     then sometimes it boots (but not always). If I set PnP OS to NO,
>                     and all three cards are installed, the system never boots.
>
>                     When the system boots OK I can see that the network card is
>                     reported as 4 separate devices on one of the PCIe slots. I tried
>                     different NVMe drives as well as changing which device is
>                     installed to which slot but the result seems to be the same in any
>                     case.
>
>                     What may be the issue? Amount of power drawn by the hardware? Too
>                     many devices not supported by the motherboard? Too many interrupts
>                     for the FreeBSD kernel to handle?
>
>                     Any help would be greatly appreciated.
>
>                     GregJ
>
>                     _______________________________________________
>
>
>
>
>
>                 From my experience from other trade marked main boards , an
>                 action may be to check manual of your server board to see
>                 whether there are rules about use of these slots : Sometimes
>                 differently shaped slots are supplied with same ports : If one
>                 slot is occupied , the other slot should be left open , or
>                 rules about not to insert such a kind of device into a slot ,
>                 for example , graphic cards .
>
>
>                 Mehmet Erol Sanliturk
>
>
>             I checked the manual but couldn't find any restrictions regarding
>             PCIe ports. It only says how many lanes are available in each
>             slot. Would there be any obvious BIOS setting that could cause
>             this issue? I tried after resetting BIOS to default settings but
>             maybe something is set incorrectly by default?
>
>             GregJ
>             _______________________________________________
>
>
>
>
>
>         http://www.supermicro.com/Aplus/motherboard/Opteron3000/SR56x0/H8SML-iF.cfm
>         H8SML-iF
>
>
>         On the above page , click "OS Compatibility"
>
>
>         On the following page , click "SR5650"
>
>         http://www.supermicro.com/Aplus/support/resources/OS/OS_Comp_SR5650.cfm
>         OS Compatibility Chart
>
>
>         On the column ( third )
>
>         H8SML-7F
>         H8SML-7
>         H8SML-iF
>         H8SML-i
>
>
>         only the following FreeBSD versions are listed :
>
>         FreeBSD 8.0
>         FreeBSD 9.1
>
>         From this list , it may be said that , this mother board date
>         is old , means , it seems that the new OS versions are not
>         tested after currently tested OS versions .
>
>
>         To check interaction between operating system and your
>         Supermicro H8SML-iF , select one of the suitable operating
>         system ( Unix class OSes are more suitable ) for you and
>         tested on this card , and try to install it as you like your
>         installed components . If it boots successfully , it means
>         that there is an incompatibility between your FreeBSD and the
>         main board . If no one of them boots , then you may conclude
>         that , there is a problem in your settings .
>
>
>         BIOS settings are important , because , OS communicates with
>         the main board through these settings .
>
>
>         In manual ( downloaded from the above page :
>         Manual Revision 1.0c
>         Release Date: March 12, 2014 ) , page 4-9  , "PCI/PnP
>         Configuration" is defined .
>         If PnP is selected YES. OS adjusts some device settings  . If
>         NO is selected , BIOS adjusts some device settings . When BIOS
>         adjusted device settings are not conforming to OS parameters ,
>         the result will be "FAIL" .
>
>         Therefore , more suitable selection is YES .
>
>
>         Another point is that , there are many more BIOS selectable
>         parameters and jumpers about PCI slots and others  .
>         There are some BIOS settings for PCI slots :
>
>         PCI X4 Slot 6 ( page 4-9 )
>         PCI x8 Slot 7 ( page 4-10 )
>
>
>
>         Please review these BIOS settings in your manual and set them
>         with respect to your requirements .
>
>
>     Thanks Mehmet for looking into this. It's an old motherboard but
>     my point is that it boots fine when either: one NVMe and the
>     network card, or both NVMe are installed, but not when all three
>     are installed. How would that be related to FreeBSD compatibility?
>     The chipset and all devices that I am trying to install are
>     supported by FreeBSD 11.x.
>
>     I just tried booting into a Debian live system and it also didn't
>     enumerate NVMe drives properly. This means that it's not FreeBSD
>     related and is no longer relevant for this list. I will try to
>     play with BIOS settings to see if I can make it work that way.
>     Thanks for all the help.
>
>
>
> Nvme drives are weird about power. I distrust the power estimate of 
> 5-9w earlier in the thread... given the oddity with debian, it's not 
> too crazy to think that. How far does FreeBSD boot though?
>

I tried a different power supply but the outcome was exactly the 
same. Sometimes FreeBSD boots fine but one of the NVMe drives is not 
visible (i.e. grepping dmesg shows only one NVMe). When it doesn't 
work, it boots up to the point of enumerating drives (SATA, USB, NVMe), 
then stops at the first NVMe and reboots.
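
To make the "only one NVMe shows up" check repeatable between boots, 
something like the sketch below could be used. It is only an 
illustration: the function name summarize_nvme and the two patterns are 
my own, based on the nvme0 messages quoted earlier in this thread, and 
I am assuming dmesg lines start with the device name (no timestamps).

```python
import re

# Patterns are assumptions derived from the boot messages quoted in this thread.
ATTACH_RE = re.compile(r"^(nvme\d+): <[^>]*>", re.MULTILINE)
FAIL_RE = re.compile(
    r"^(nvme\d+): controller ready did not become 1", re.MULTILINE
)


def summarize_nvme(dmesg_text):
    """Return (attached, failed) sets of NVMe controller names.

    attached: controllers that printed a probe/attach line.
    failed:   controllers that timed out waiting for "controller ready".
    """
    attached = set(ATTACH_RE.findall(dmesg_text))
    failed = set(FAIL_RE.findall(dmesg_text))
    return attached, failed


# Sample text copied from the failing boot described in the first message.
SAMPLE = (
    "nvme0: <Generic NVMe Device> mem 0xfd8fc000-0xfd8fffff irq 24 "
    "at device 0.0 on pci1\n"
    "nvme0: controller ready did not become 1 within 30000 ms\n"
    "nvme0: did not complete shutdown within 5 seconds of notification\n"
)

attached, failed = summarize_nvme(SAMPLE)
print("attached:", sorted(attached))
print("failed to become ready:", sorted(failed))
```

On a boot that gets far enough to log in, the saved output of dmesg can 
be passed to summarize_nvme() instead of SAMPLE; with both drives 
working I would expect two names in "attached" and none in "failed".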

The funny thing is that very often it's enough to pull out one of the 
cards and put it back in. Then the system boots fine with all three 
cards. I had that a few times. Once it has booted it keeps working: I 
can restart the system and it boots every time. As soon as I power off, 
unplug it from the mains, wait a few minutes and power it on again, the 
issue comes back: it can't boot because the NVMe can't be enumerated.

I thought it might be caused by the hardware being too cold. I left the 
server on overnight once but it didn't boot up; it kept trying and 
restarting the whole night.

GregJ




