Date:      Fri, 19 Jun 2015 17:10:11 -0700
From:      Dieter BSD <dieterbsd@gmail.com>
To:        freebsd-hackers@freebsd.org
Subject:   Re: Realtime process CPU starvation
Message-ID:  <CAA3ZYrDo6duhBZaODVzNQJuiV375z9vmcFR2yirn3SPqoTVJoQ@mail.gmail.com>

Chris typed:
> I have a process running at realtime priority 0
> under FreeBSD 10.0. The main thread needs to run every 10 ms,

#include <standard_useless_response.h>
FreeBSD is not a true real time OS.  Please submit your patch
to make it one.

> How can a thread with the highest realtime priority not run for such a
> lengthy period?

Some device drivers do things like

for(ever)
  {
  DELAY(MAXINT);
  }

which does wonders for true-real-time jobs.  :-(
(and sucks even for non-real-time jobs)

I have also found that kernel printf(9) can spoil true-real-time jobs.
Hint: log(9) is your friend.

I run a lot of true-real-time jobs (each needs service within something like 4 ms)
on a uniprocessor, and most of the time they run fine.  Even running
several of these jobs at once.  Carefully crafted userland program
running at rtprio 5.  It took a *lot* of work finding the problems and
a lot of kernel hacking (mostly device drivers) to get this far.  This
is with 8.2.  Tried 10.1 but it is severely broken (in non-real-time ways)
and is unusable.  :-(  But even now I still get a true-real-time failure
occasionally (and data lost forever :-( ) and no clue why.  :-(

> ULE scheduler

Has problems.  I frequently run jobs that should be either cpu-bound
or disk-bound, but observe a significant percentage of cpu idle
time *and* the disk i/o rate is far lower than it should be.  No clue
what is going on there.  Fortunately these jobs are not true-real-time.
I guess I'll have to call these jobs "scheduler-bound".

> I was suspicious of the ZFS filesystem

How much grief would it be to try a different filesystem (e.g. FFS)?
(I run FFS with softdep.)

> During this 500 ms, two of the four CPUs were idle.

Hmmm, could the scheduler be trying to always run your true-real-time
job on the same cpu and it is waiting for that cpu to become free?
Is there a way to ask the scheduler to run your rt job on cpu0 and
zfs on cpu1?  That might help rule zfs in or out as a problem.
(Clearly I am not a scheduler wizard.)
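
On the affinity question: FreeBSD does let a process restrict itself to
one CPU through cpuset_setaffinity(2).  A minimal sketch, assuming the rt
job pins itself at startup.  pin_self_to_cpu is a made-up helper name, and
the non-FreeBSD branch is only there so the file compiles elsewhere.  This
constrains the calling process only; it does not control where kernel
threads (ZFS or otherwise) run.

```c
#include <errno.h>
#include <sys/types.h>

#ifdef __FreeBSD__
#include <sys/param.h>
#include <sys/cpuset.h>
#endif

/*
 * Hypothetical helper: restrict the calling process to a single CPU.
 * Returns 0 on success, -1 with errno set on failure.
 */
int
pin_self_to_cpu(int cpu)
{
	if (cpu < 0) {
		errno = EINVAL;
		return (-1);
	}
#ifdef __FreeBSD__
	cpuset_t mask;

	CPU_ZERO(&mask);
	CPU_SET(cpu, &mask);
	/* An id of -1 with CPU_WHICH_PID means "the current process". */
	return (cpuset_setaffinity(CPU_LEVEL_WHICH, CPU_WHICH_PID, -1,
	    sizeof(mask), &mask));
#else
	/* Assumption: no equivalent is wired up here on other systems. */
	errno = ENOSYS;
	return (-1);
#endif
}
```

The same experiment can be tried without recompiling by using cpuset(1)
on a running process, e.g. "cpuset -l 0 -p <pid>".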

Maybe see if there is a way to instrument the scheduler without the
instrumentation creating its own rt or scheduling problem?
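
Short of instrumenting the scheduler itself, a userland loop can at least
record how late each periodic wake-up is.  The rough sketch below writes
into a preallocated array and does no I/O in the timed path, so the
measurement should add little disturbance of its own.  It assumes POSIX
clock_gettime() and nanosleep(); the function names are illustrative, not
from Chris's program.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

#define NS_PER_SEC 1000000000LL

/* Current CLOCK_MONOTONIC reading in nanoseconds. */
int64_t
now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return ((int64_t)ts.tv_sec * NS_PER_SEC + ts.tv_nsec);
}

/*
 * Wake up once per period_ns for n cycles.  For each cycle, store how many
 * nanoseconds past its deadline the wake-up happened in late_ns[i].
 * Returns the worst lateness seen.  Nothing is printed inside the loop.
 */
int64_t
measure_wakeup_lateness(int64_t period_ns, int64_t *late_ns, size_t n)
{
	int64_t worst = 0;
	int64_t deadline, remaining, late;

	assert(period_ns > 0 && late_ns != NULL);
	deadline = now_ns() + period_ns;
	for (size_t i = 0; i < n; i++) {
		/* Sleep until the deadline; retry if woken early. */
		while ((remaining = deadline - now_ns()) > 0) {
			struct timespec ts = {
				.tv_sec = (time_t)(remaining / NS_PER_SEC),
				.tv_nsec = (long)(remaining % NS_PER_SEC)
			};
			nanosleep(&ts, NULL);
		}
		late = now_ns() - deadline;
		late_ns[i] = late;
		if (late > worst)
			worst = late;
		/* Fixed schedule: deadlines do not drift with lateness. */
		deadline += period_ns;
	}
	return (worst);
}
```

For the 10 ms case in this thread, something like
measure_wakeup_lateness(10000000, buf, 1000) run at realtime priority, with
the buffer dumped only after the run, would show whether and when the
500 ms gaps appear.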
