Date:      Mon, 15 Jan 2018 19:59:35 +0300
From:      Mehmet Erol Sanliturk <m.e.sanliturk@gmail.com>
To:        galtsev@kicp.uchicago.edu
Cc:        Grzegorz Junka <list1@gjunka.com>,  FreeBSD Questions Mailing List <freebsd-questions@freebsd.org>, Warner Losh <imp@bsdimp.com>, freebsd-drivers@freebsd.org
Subject:   Re: Server doesn't boot when 3 PCIe slots are populated
Message-ID:  <CAOgwaMuah3D46qu9efp_nNA7EDoFRyO-7KS9%2BxwJ5xkGBHxi%2Bg@mail.gmail.com>
In-Reply-To: <57715.108.68.169.115.1516033864.squirrel@cosmo.uchicago.edu>
References:  <ecce3fa6-3909-0947-685c-8a412684e99c@gjunka.com> <CAOgwaMsf9zByJYhL3KqpUMW5qKAzQEHpDWcwejY-uK=9swWbUQ@mail.gmail.com> <3d0ad00c-5214-71b0-017b-c2d5ba608e37@gjunka.com> <CAOgwaMsOKrGfGNmRt-C9Skjssj8JPAtFpk8bwG9v55LmaWdoVw@mail.gmail.com> <8df1e967-01e0-d3c2-e14c-64c7fc8c66b0@gjunka.com> <CANCZdfqZ-dogHXBdoyMPLOPs_R-vD%2BwLM-r6sm6ypesd0Nvp4A@mail.gmail.com> <0e582bdb-e1f9-438c-3da2-2bcdc950aab5@gjunka.com> <CAOgwaMvusKzt%2BYvmKeuyox0c=wgqEv9UP475Eacm2B0OkF7OrQ@mail.gmail.com> <57715.108.68.169.115.1516033864.squirrel@cosmo.uchicago.edu>

On Mon, Jan 15, 2018 at 7:31 PM, Valeri Galtsev <galtsev@kicp.uchicago.edu>
wrote:

>
> On Mon, January 15, 2018 3:44 am, Mehmet Erol Sanliturk wrote:
> > On Mon, Jan 15, 2018 at 9:44 AM, Grzegorz Junka <list1@gjunka.com>
> wrote:
> >
> >>
> >> On 15/01/2018 06:18, Warner Losh wrote:
> >>
> >>>
> >>>
> >>> On Jan 14, 2018 11:05 PM, "Grzegorz Junka" <list1@gjunka.com> wrote:
> >>>
> >>>
> >>>     On 14/01/2018 16:18, Mehmet Erol Sanliturk wrote:
> >>>
> >>>
> >>>
> >>>         On Sun, Jan 14, 2018 at 5:46 PM, Grzegorz Junka
> >>>         <list1@gjunka.com> wrote:
> >>>
> >>>
> >>>             On 13/01/2018 17:56, Mehmet Erol Sanliturk wrote:
> >>>
> >>>
> >>>
> >>>                 On Sat, Jan 13, 2018 at 7:21 PM, Grzegorz Junka
> >>>                 <list1@gjunka.com> wrote:
> >>>
> >>>                     Hello,
> >>>
> >>>                     I am installing a FreeBSD server based on
> >>>                     Supermicro H8SML-iF. There are three PCIe slots
> >>>                     to which I installed 2 NVMe drives and one network
> >>>                     card Intel I350-T4 (with 4 Ethernet slots).
> >>>
> >>>                     I am observing a strange behavior where the system
> >>>                     doesn't boot if all three PCIe slots are
> >>>                     populated. It shows this message:
> >>>
> >>>                     nvme0: <Generic NVMe Device> mem
> >>>                     0xfd8fc000-0xfd8fffff irq 24 at device 0.0 on pci1
> >>>                     nvme0: controller ready did not become 1 within
> >>>                     30000 ms
> >>>                     nvme0: did not complete shutdown within 5 seconds
> >>>                     of notification
> >>>
> >>>                     Then I see a kernel panic/dump and the system
> >>>                     reboots after 15 seconds.
> >>>
> >>>                     If I remove one card, either one of the NVMe
> >>>                     drives or the network card, the system boots fine.
> >>>                     Also, if in BIOS I set PnP OS to YES then
> >>>                     sometimes it boots (but not always). If I set PnP
> >>>                     OS to NO, and all three cards are installed, the
> >>>                     system never boots.
> >>>
> >>>                     When the system boots OK I can see that the
> >>>                     network card is reported as 4 separate devices on
> >>>                     one of the PCIe slots. I tried different NVMe
> >>>                     drives as well as changing which device is
> >>>                     installed to which slot but the result seems to be
> >>>                     the same in any case.
> >>>
> >>>                     What may be the issue? Amount of power drawn by
> >>>                     the hardware? Too many devices not supported by
> >>>                     the motherboard? Too many interrupts for the
> >>>                     FreeBSD kernel to handle?
> >>>
> >>>                     Any help would be greatly appreciated.
> >>>
> >>>                     GregJ
> >>>
> >>>                     _______________________________________________
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>                 From my experience with main boards of other brands,
> >>>                 one action may be to check the manual of your server
> >>>                 board to see whether there are rules about the use of
> >>>                 these slots. Sometimes differently shaped slots share
> >>>                 the same ports: if one slot is occupied, the other
> >>>                 slot should be left open. There may also be rules
> >>>                 against inserting certain kinds of device into a
> >>>                 slot, for example graphics cards.
> >>>
> >>>
> >>>                 Mehmet Erol Sanliturk
> >>>
> >>>
> >>>             I checked the manual but couldn't find any restrictions
> >>>             regarding PCIe ports. It only says how many lanes are
> >>>             available in each slot. Would there be any obvious BIOS
> >>>             setting that could cause this issue? I tried after
> >>>             resetting BIOS to default settings but maybe something is
> >>>             set incorrectly by default?
> >>>
> >>>             GregJ
> >>>             _______________________________________________
> >>>
> >>>
> >>>
> >>>
> >>>
> >>>         http://www.supermicro.com/Aplus/motherboard/Opteron3000/SR56x0/H8SML-iF.cfm
> >>>         H8SML-iF
> >>>
> >>>
> >>>         On the above page , click "OS Compatibility"
> >>>
> >>>
> >>>         On the following page , click "SR5650"
> >>>
> >>>         http://www.supermicro.com/Aplus/support/resources/OS/OS_Comp_SR5650.cfm
> >>>         OS Compatibility Chart
> >>>
> >>>
> >>>         On the column ( third )
> >>>
> >>>         H8SML-7F
> >>>         H8SML-7
> >>>         H8SML-iF
> >>>         H8SML-i
> >>>
> >>>
> >>>         there are listed only
> >>>
> >>>         FreeBSD 8.0
> >>>         FreeBSD 9.1
> >>>
> >>>         From this list, it may be said that this mother board is old:
> >>>         it seems that newer OS versions have not been tested since the
> >>>         versions listed there.
> >>>
> >>>
> >>>         To check the interaction between the operating system and your
> >>>         Supermicro H8SML-iF, select a suitable operating system that
> >>>         was tested on this board (Unix-class OSes are more suitable)
> >>>         and try to install it with your components installed as you
> >>>         like. If it boots successfully, it means that there is an
> >>>         incompatibility between your FreeBSD and the main board. If
> >>>         none of them boots, then you may conclude that there is a
> >>>         problem in your settings.
> >>>
> >>>
> >>>         BIOS settings are important , because , OS communicates with
> >>>         the main board through these settings .
> >>>
> >>>
> >>>         In the manual (downloaded from the above page: Manual Revision
> >>>         1.0c, Release Date: March 12, 2014), page 4-9, "PCI/PnP
> >>>         Configuration" is defined.
> >>>         If YES is selected for PnP, the OS adjusts some device
> >>>         settings. If NO is selected, the BIOS adjusts some device
> >>>         settings. When the device settings adjusted by the BIOS do not
> >>>         conform to the OS parameters, the result will be "FAIL".
> >>>
> >>>         Therefore, the more suitable selection is YES.
> >>>
> >>>
> >>>         Another point is that there are many more BIOS-selectable
> >>>         parameters and jumpers for the PCI slots and other components.
> >>>         There are some BIOS settings for PCI slots:
> >>>
> >>>         PCI X4 Slot 6 ( page 4-9 )
> >>>         PCI x8 Slot 7 ( page 4-10 )
> >>>
> >>>
> >>>
> >>>         Please review these BIOS settings in your manual and set them
> >>>         with respect to your requirements .
> >>>
> >>>
> >>>     Thanks Mehmet for looking into this. It's an old motherboard but
> >>>     my point is that it boots fine when either: one NVMe and the
> >>>     network card, or both NVMe are installed, but not when all three
> >>>     are installed. How would that be related to FreeBSD compatibility?
> >>>     The chipset and all devices that I am trying to install are
> >>>     supported by FreeBSD 11.x.
> >>>
> >>>     I just tried booting into a Debian live system and it also didn't
> >>>     enumerate NVMe drives properly. This means that it's not FreeBSD
> >>>     related and is no longer relevant for this list. I will try to
> >>>     play with BIOS settings to see if I can make it work that way.
> >>>     Thanks for all the help.
> >>>
> >>>
> >>>
> >>> Nvme drives are weird about power. I distrust the power estimate of
> >>> 5-9w earlier in the thread... given the oddity with debian, it's not
> >>> too crazy to think that. How far does FreeBSD boot though?
> >>>
> >>>
> >> I tried with a different power supply but the outcome was exactly the
> >> same. Sometimes FreeBSD boots fine but one of the NVMe drives is not
> >> visible (i.e. dmesg grep shows only one NVMe). When it doesn't work it
> >> boots up to the point of enumerating drives (SATA, USB, NVMe). Then it
> >> stops at the first NVMe and reboots.
> >>
> >> The funny thing is that very often it's enough to pull out one of the
> >> cards and put it back in. Then the system boots fine with all three
> >> cards. I had that a few times. Once it's booted it works, I can restart
> >> the system and it boots every time. As soon as I power off, unplug from
> >> the power main, wait a few minutes and power it on again, the issue
> >> comes back - can't boot as NVMe can't be enumerated.
> >>
> >> I thought it might be caused by the hardware being too cold. I left the
> >> server once overnight but it didn't boot up, it was trying and
> >> restarting the whole night.
> >>
> >> GregJ
> >>
> >>
> >> _______________________________________________
> >>
> >>
> >
> >
> >
> > The above explanation brings to mind the "impedance mismatch in
> > electronics" problem.
>
> Hm, I wouldn't say so. First of all, I will seriously doubt that sane
> cards are out of specs as far as impedance is concerned.
>
> But before going further, let's make sure we talk about the same thing. I
> assume impedance mismatch refers to the impedance of the load attached to
> a transmission line differing from the impedance of the transmission line
> itself. In such a case part of the transmitted signal is reflected from
> the load back into the transmission line. This can make a mess, as the
> transmitted signal is mixed with the reflected one at different positions
> of the loads along the same transmission line. One has to have a really
> large mismatch (over 20% at least) for that to matter. Many of us remember
> this in at least two computer-related cases: 1. we used terminators at the
> end of SCSI cables (or attached a "self-terminating" SCSI device to the
> end of the line). 2. In some system boards in which memory buses had no
> terminators the manual would say to populate slots beginning from the
> farthest away from the CPU (to defeat reflection from the open end of the
> memory bus lines).
>
> I have never heard of anything like that on PCI express bus. If I am
> wrong, could you give some pointer so I can read about it.
>
> Thanks in advance for pointers! (I know: you learn something every day -
> which I bet I am about to ;-)
>
> Valeri
>
> >
> > ( Please search
> >
> >
> > impedance mismatch  in electronics
> > impedance matching  in electronics
> >
> >
> > in Internet if you want explanations about them . )
> >
> >
> > When all of these cards are inserted into slots simultaneously , their
> > accumulated electronic effect may distort behaviour of your mother board
> > circuits or attached card circuit(s) .
> >
> >
> > Therefore, if you can find another NVMe and/or network card, please test
> > their effect.
> > Such tests may be inconclusive because mother board circuits may be
> > negatively affected by "properly" operating add-on cards when they are
> > inserted together.
> >
> >
> > If it is feasible for you, you may use USB-attached network card(s) to
> > eliminate the PCIe network card.
> > Or you may use one more capable NVMe card instead of two smaller NVMe
> > cards, or you may use only one of them, and/or select a SATA SSD.
> > Such a choice would save your investment and produce a working server
> > with a "little" loss when compared to "all".
> >
> >
> >
> >
> > Mehmet Erol Sanliturk
> > _______________________________________________
> > freebsd-questions@freebsd.org mailing list
> > https://lists.freebsd.org/mailman/listinfo/freebsd-questions
> > To unsubscribe, send any mail to
> > "freebsd-questions-unsubscribe@freebsd.org"
> >
>
>
> ++++++++++++++++++++++++++++++++++++++++
> Valeri Galtsev
> Sr System Administrator
> Department of Astronomy and Astrophysics
> Kavli Institute for Cosmological Physics
> University of Chicago
> Phone: 773-702-4247
> ++++++++++++++++++++++++++++++++++++++++
>



The problem of "impedance matching" arises between any two interacting
circuits: whenever one circuit feeds its "output" into another circuit as
"input", the problem exists regardless of the subject and kind of the
circuits. Obviously, the resulting behaviour is not exactly the same in
every case.



If you search the Internet for the following phrase, you will find a large
number of links:


impedance matching circuit design




If we consider the slots of a computer main board, the following may occur:

Assume a slot has a voltage threshold for triggering input into an add-on
card, i.e., the add-on card responds only when it senses a voltage equal to
or greater than that threshold. Lower voltages will not trigger the add-on
card.

Assume an add-on card works when installed alone.
Assume a second add-on card also works when installed alone.

When both of these add-on cards are inserted into slots, the power they
draw together lowers the voltage level of the surrounding circuit more than
a single card does.
If this lowered voltage falls below the threshold of one or both of the
cards, the affected card(s) will not sense the signals from the surrounding
circuits. Therefore, it (they) will not respond to signals requesting
action.
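
The reasoning above can be put into numbers with a very rough sketch. It
treats the supply rail as an ideal source behind a small series resistance
and each add-on card as a plain resistive load. Every value in it (the 3.3 V
rail, the resistances, the threshold) and the name rail_voltage are
assumptions chosen only for illustration, not measurements of this board.

```python
# Sketch only: a supply rail modelled as an ideal source behind a series
# resistance, with each add-on card treated as a plain resistive load.
# All numbers below are assumed for illustration, not measured values.

def rail_voltage(v_source, r_source, card_loads):
    """Voltage at the slots when the given cards load the rail in parallel."""
    if not card_loads:
        return v_source  # an unloaded rail sits at the source voltage
    # Parallel loads combine as the reciprocal of the summed reciprocals.
    r_parallel = 1.0 / sum(1.0 / r for r in card_loads)
    # Voltage divider between the series resistance and the combined load.
    return v_source * r_parallel / (r_source + r_parallel)

V_SOURCE = 3.3    # volts, assumed nominal rail
R_SOURCE = 0.05   # ohms, assumed series resistance of the rail
CARD = 1.0        # ohms, assumed equivalent load of one card
THRESHOLD = 2.95  # volts, assumed minimum a card needs to respond

for count in (1, 2, 3):
    v = rail_voltage(V_SOURCE, R_SOURCE, [CARD] * count)
    state = "ok" if v >= THRESHOLD else "below threshold"
    print(f"{count} card(s): {v:.3f} V -> {state}")
```

With these assumed values one or two cards stay above the threshold, while
the third card pulls the rail below it. Real cards are not simple resistors,
so this only shows the direction of the effect.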


In one of the previous messages,


https://lists.freebsd.org/pipermail/freebsd-questions/2018-January/280455.html


it is said that


"

I am observing a strange behavior where the system doesn't boot if all
three PCIe slots are populated. It shows this message:

nvme0: <Generic NVMe Device> mem 0xfd8fc000-0xfd8fffff irq 24 at device
0.0 on pci1
nvme0: controller ready did not become 1 within 30000 ms
nvme0: did not complete shutdown within 5 seconds of notification

Then I see a kernel panic/dump and the system reboots after 15 seconds.

If I remove one card, either one of the NVMe drives or the network card,
the system boots fine.


"


The message quoted above may be a good example of this.



Mehmet Erol Sanliturk


