Date:      Tue, 12 Aug 1997 18:10:50 -0700 (PDT)
From:      Julian Elischer <julian@FreeBSD.ORG>
To:        hackers@FreeBSD.ORG
Subject:   2.2.2+ crash.. more info
Message-ID:  <199708130110.SAA21060@freefall.freebsd.org>

We have several hundred BSD machines here.. we see this one often enough
for me to recognise it..

the plot thickens..
I have discovered the following:
1/ the code that crashes:
  scanning the queues in switch:

looking at the queue array:
  }, {
    ph_link = 0xf01be350, 
    ph_rlink = 0xf01be350
  }, {
    ph_link = 0x0,  <----- a few instructions before, this ALSO was 0xf07e1000
    ph_rlink = 0xf07e1000
  }, {
    ph_link = 0xf01be360, 
    ph_rlink = 0xf01be360
  }, {
    ph_link = 0xf01be368, 
    ph_rlink = 0xf01be368
  },

one entry is bogus. from the registers we can see that its ph_link
was 0xf07e1000 shortly before.
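
for anyone who wants to scan a dumped queue array mechanically, here is
a rough sketch of the check I am doing by eye. it assumes each head is a
two-pointer circular list node that points at itself when empty (which
is what the healthy entries above look like). the struct is a simplified
stand-in using the field names from the dump, and first_bogus_head() is
a made-up helper, not anything in the kernel.

```c
#include <stddef.h>

/* Simplified stand-in for a run-queue head; field names taken from the dump. */
struct prochd {
    struct prochd *ph_link;   /* forward link  */
    struct prochd *ph_rlink;  /* backward link */
};

/*
 * Hypothetical helper: return the index of the first head whose links are
 * NULL or whose neighbours do not point back at it, or -1 if all heads
 * look consistent. An empty head points at itself, so it passes.
 */
int first_bogus_head(const struct prochd *qs, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        const struct prochd *h = &qs[i];

        if (h->ph_link == NULL || h->ph_rlink == NULL)
            return (int)i;
        if (h->ph_link->ph_rlink != h || h->ph_rlink->ph_link != h)
            return (int)i;
    }
    return -1;
}
```

run over the entries shown above, a check like this would stop at the
head with ph_link = 0x0.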

looking at the registers we can see what proc struct is being looked at..

$11 = {
  p_procq = {
    tqe_next = 0x0, 
    tqe_prev = 0xf01be7e8
  }, 
  p_list = {
    le_next = 0xf07e1200, 
    le_prev = 0xf07d2808
  }, 
...
this is where the NULL came from 

but wait!
this looks like an entry in a sleep queue..
sure enough!

in the array of sleep queues...
  }, {
    tqh_first = 0x0, 
    tqh_last = 0xf01be7d8
  }, {
    tqh_first = 0x0, 
    tqh_last = 0xf01be7e0
  }, {
    tqh_first = 0xf07e1000,  <--------- !!!!!
    tqh_last = 0xf07e1000
  }, {
    tqh_first = 0x0, 
    tqh_last = 0xf01be7f0
  }, {
    tqh_first = 0x0, 

so why was this sleeping?
looking in the proc struct again..
  p_wchan = 0xf272f698, 
  p_wmesg = 0xf015bead "swread", 

Since the process's proc structure looks like a sleeping process,
it was probably put onto the sleep queue last, when it was already on
the runnable queue.
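
to make the suspected sequence concrete, here is a small user-space
sketch. it assumes (as the dumps suggest, but I have not confirmed from
the source yet) that the run queue and the sleep queue use the same two
link words at the start of the proc structure. all the names here
(struct links, runq_*, sleepq_*, demo_double_enqueue) are made up for
illustration; the tail insert just follows the usual TAILQ logic.

```c
#include <assert.h>
#include <stddef.h>

/* The two link words shared by both queues (assumed layout). */
struct links {
    struct links *next;  /* run queue forward link, or tqe_next */
    void *prev;          /* run queue back link, or tqe_prev    */
};

struct sleepq {
    struct links *tqh_first;
    struct links **tqh_last;
};

/* Circular run queue: an empty head points at itself. */
static void runq_init(struct links *h) { h->next = h; h->prev = h; }

static void runq_insert_tail(struct links *h, struct links *p)
{
    struct links *last = h->prev;

    p->next = h;
    p->prev = last;
    last->next = p;
    h->prev = p;
}

/* Tail queue: an empty head has tqh_last pointing at its own tqh_first. */
static void sleepq_init(struct sleepq *q)
{
    q->tqh_first = NULL;
    q->tqh_last = &q->tqh_first;
}

static void sleepq_insert_tail(struct sleepq *q, struct links *p)
{
    p->next = NULL;           /* overwrites the run queue forward link */
    p->prev = q->tqh_last;    /* overwrites the run queue back link    */
    *q->tqh_last = p;
    q->tqh_last = &p->next;
}

/* Returns 1 if the run queue head ends up with a NULL forward link. */
int demo_double_enqueue(void)
{
    struct links runq, p;
    struct sleepq sq;

    runq_init(&runq);
    sleepq_init(&sq);
    runq_insert_tail(&runq, &p);   /* process becomes runnable         */
    sleepq_insert_tail(&sq, &p);   /* suspected bug: queued again      */

    struct links *first = runq.next;
    assert(first == &p);           /* head still points at the process */
    runq.next = first->next;       /* first step of unlinking it       */

    return runq.next == NULL && runq.prev == &p;
}
```

in this sketch the head is left with a NULL forward link and a back link
still pointing at the process, and the sleep queue head holds the
process in both fields, which is the same shape as the dumps above.
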
how can this happen?

some ideas:
it was halfway through being woken up when the scheduling occurred?
and still looked like a sleeping process? unlikely..

it was put onto the sleep queue accidentally by interrupt code
that just had its proc address by accident?
unlikely.

somehow a wakeup occurred during the tsleep call?
sounds unlikely..

code examinations will follow with more info..
if this strikes anyone as familiar, do chime in!

julian



