Date:      Mon, 24 Apr 2000 13:25:39 +0200 (CEST)
From:      Alexander Leidinger <Alexander@Leidinger.net>
To:        eischen@vigrid.com
Cc:        current@freebsd.org
Subject:   Re: pthread_cond_broadcast() not delivered
Message-ID:  <200004241125.NAA01320@Magelan.Leidinger.net>
In-Reply-To: <200004231809.OAA29230@pcnet1.pcnet.com>

On 23 Apr, Daniel Eischen wrote:

>> (14) netchild@ttyp2% uname -a
>> FreeBSD Magelan.Leidinger.net 5.0-CURRENT FreeBSD 5.0-CURRENT #14:
>> Fri Apr 21 17:28:37 CEST 2000     root@:/big/usr/src/sys/compile/WORK
>>  i386
>> 
>> I've an application which uses pthread_cond_{wait,broadcast}() and
>> the debug output gives me the impression that the broadcast did not
>> get delivered anymore.
>> 
>> I run this program only occasionally, but with 4-current (last year)
>> it worked, and I haven't changed anything mutex-/cond-related in it
>> since then.
>> 
>> I've attached a short test-prog (1.7k) which shows the same behavior,
>> compile it with "cc -D_THREAD_SAFE -pthread test.c" and run
>> "./a.out".
> 
> If you want it to work correctly, you have to make the second thread
> release the mutex.  Look at it more closely:
> 
>     void *
>     second_thread(void *arg)
>     {
>       /* synchronize */
>       fprintf(stderr, "Second: lock.\n");
>       pthread_mutex_lock(main_mutex);
> 
>       fprintf(stderr, "Second: broadcast.\n");
>       pthread_cond_broadcast(main_cond);
> 
>       fprintf(stderr, "Second: unlock.\n");
>       pthread_mutex_lock(main_mutex);
>       ^^^^^^^^^^^^^^^^^^
[...]
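
For reference, the fix pointed out above amounts to ending the signalling
thread with an unlock instead of a second lock. A minimal sketch, assuming
the mutex and condition variable are passed in as pointers; the function
name broadcast_and_release() is hypothetical and not part of the original
test program:

```c
#include <pthread.h>
#include <stdio.h>

/* Hypothetical helper, not taken from the original test program.
 * Locks the mutex, wakes all waiters, then releases the mutex. */
int broadcast_and_release(pthread_mutex_t *mutex, pthread_cond_t *cond)
{
    int rc, unlock_rc;

    fprintf(stderr, "Second: lock.\n");
    rc = pthread_mutex_lock(mutex);
    if (rc != 0)
        return rc;

    fprintf(stderr, "Second: broadcast.\n");
    rc = pthread_cond_broadcast(cond);

    /* Unlock (rather than lock a second time) so that the woken
     * waiters can reacquire the mutex and return from cond_wait. */
    fprintf(stderr, "Second: unlock.\n");
    unlock_rc = pthread_mutex_unlock(mutex);

    return rc != 0 ? rc : unlock_rc;
}
```

A waiter blocked in pthread_cond_wait() on the same mutex can only return
once that mutex has been released, which is why the final call matters.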

Yes, sorry, that was a flaw in my test-prog. With the fix the test-prog
works now, but my app still does not. I have checked every lock/unlock
against the corresponding fprintf(), and the sequence is consistent:
---snip---
[prefill buffer 0-14 and start Output-thread]
Decode: (1) lock buffer.
Decode: (2) lock buffer 15.
Decode: before cond_wait.
Output: (1) lock buffer.
Output: before broadcast.
Output: after broadcast.
Output: (2) lock buffer 0.
Output: (3) unlock buffer.
Output: write buffer 0.
Output: (5) unlock buffer 0
Output: (6) lock buffer.
Output: (2) lock buffer 1.
Output: (3) unlock buffer.
Output: write buffer 1.
Output: (5) unlock buffer 1
Output: (6) lock buffer.
Output: (2) lock buffer 2.
Output: (3) unlock buffer.
Output: write buffer 2.
Output: (5) unlock buffer 2
[... buffer 3-13]
Output: (6) lock buffer.
Output: (2) lock buffer 14.
Output: (3) unlock buffer.
Output: write buffer 14.
Output: (5) unlock buffer 14
Output: (6) lock buffer.
Output: (2) lock buffer 15.
[deadlock]
---snip---
(After buffer 15 it has to wrap around and start with buffer 0 again.)

The corresponding code (Decode-thread):
---snip---
#if 1
fprintf(stderr, "Decode: (1) lock buffer.\n");
#endif
  pthread_mutex_lock(output->mutex);

/* [create output thread] */

#if 1
fprintf(stderr, "Decode: (2) lock buffer %d.\n", which_buffer);
#endif
  pthread_mutex_lock(output->buffer[which_buffer].mutex);
#if 1
fprintf(stderr, "Decode: before cond_wait.\n");
#endif
  pthread_cond_wait(&output->output_startet, output->mutex);
#if 1
fprintf(stderr, "Decode: (3) unlock buffer.\n");
#endif
  pthread_mutex_unlock(output->mutex);
---snip---
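
A start-up handshake like the one above is usually made more robust by
guarding pthread_cond_wait() with a flag that is checked in a loop, since
POSIX permits spurious wakeups and a broadcast sent before the wait begins
is not remembered. The following is only a sketch of that general pattern,
not the code from the app; struct start_gate, its started flag, and the
functions gate_wait(), gate_open() and opener_thread() are all hypothetical
names:

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-in for the "output started" handshake. */
struct start_gate {
    pthread_mutex_t mutex;
    pthread_cond_t  cond;
    bool            started;   /* predicate protected by mutex */
};

/* Block until the gate has been opened. The loop re-checks the flag,
 * so spurious wakeups and an early broadcast are both harmless. */
void gate_wait(struct start_gate *g)
{
    pthread_mutex_lock(&g->mutex);
    while (!g->started)
        pthread_cond_wait(&g->cond, &g->mutex);
    pthread_mutex_unlock(&g->mutex);
}

/* Set the flag and wake every waiter, then release the mutex. */
void gate_open(struct start_gate *g)
{
    pthread_mutex_lock(&g->mutex);
    g->started = true;
    pthread_cond_broadcast(&g->cond);
    pthread_mutex_unlock(&g->mutex);
}

/* Thread entry point that simply opens the gate it is given. */
void *opener_thread(void *arg)
{
    gate_open(arg);
    return NULL;
}
```

With this shape the waiter holds no other mutex while it sleeps, which
also keeps the handshake independent of the per-buffer locks.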

and (Output-thread):
---snip---
#if 1
fprintf(stderr, "Output: (1) lock buffer.\n");
#endif
  pthread_mutex_lock(output->mutex);
  /* we are in sync, awake it */
#if 1
fprintf(stderr, "Output: before broadcast.\n");
#endif
  ret = pthread_cond_broadcast(&output->output_startet);
#if 1
fprintf(stderr, "Output: after broadcast.\n");
#endif

  while((output->num_bytes == 0) || (output->num_bytes > bytes_written))
  {
#if 1
fprintf(stderr, "Output: (2) lock buffer %d.\n", which_buffer);
#endif
    pthread_mutex_lock(output->buffer[which_buffer].mutex);
#if 1
fprintf(stderr, "Output: (3) unlock buffer.\n");
#endif
    pthread_mutex_unlock(output->mutex);
---snip---

Everything looks fine here, and it worked a while ago. The only code
change since then is in the "Output: write buffer %d" part. My current
impression is that the Output-thread locks and unlocks output->mutex so
quickly that the Decode-thread never manages to get the lock on it.
After restarting the app a few times it sometimes works, so the problem
seems to be timing-related. I will replace the lock in the "Output: (2)
lock buffer %d" part with a trylock, usleep() briefly if it returns
EBUSY, and see how that behaves.

Sorry to have bothered the list with it,
Alexander.

-- 
            It is easier to fix Unix than to live with NT.

http://www.Leidinger.net                  Alexander+Home @ Leidinger.net
  GPG fingerprint = 7423 F3E6 3A7E B334 A9CC  B10A 1F5F 130A A638 6E7E
