An.H.Nguyen <AnNguyen251@...> wrote:
> Peer sync'ing will work for a period of time, then start producing "peer
> queue overflow" errors. This happens on different servers at different
> times. What we've noticed is that the error will affect a particular
> server, but the other servers will continue to sync with each other with
> no problem; so the problem appears to be localized to the server
> initiating the synchronization. (And eventually, all the servers will
> end up with queue overflow errors.)
milter-greylist puts records to be sent on a queue, and a syncer thread
is responsible for emptying the queue. If the syncer thread gets stuck,
the queue will grow to the limit, and you'll get the error message.
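
To make the mechanism concrete, here is a minimal single-threaded sketch of a bounded queue. It is not the code from sync.c: demo_queue, DEMO_QUEUE_MAX, demo_enqueue() and demo_drain() are names invented for this illustration, and the limit of 4 is arbitrary. It only shows why a consumer that stops draining makes the producer hit the limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; this is not milter-greylist's sync.c. */
#define DEMO_QUEUE_MAX 4

struct demo_queue {
	int items[DEMO_QUEUE_MAX];
	size_t head;	/* index of the oldest record */
	size_t count;	/* records currently waiting */
};

/*
 * Producer side: returns 0 on success, -1 when the queue is full,
 * which is the situation reported as an overflow.
 */
int
demo_enqueue(struct demo_queue *q, int record)
{
	assert(q != NULL);
	if (q->count == DEMO_QUEUE_MAX)
		return -1;
	q->items[(q->head + q->count) % DEMO_QUEUE_MAX] = record;
	q->count++;
	return 0;
}

/*
 * Consumer side: what a healthy syncer would do on each pass.
 * Returns the number of records removed from the queue.
 */
size_t
demo_drain(struct demo_queue *q)
{
	size_t sent;

	assert(q != NULL);
	sent = q->count;
	q->head = (q->head + q->count) % DEMO_QUEUE_MAX;
	q->count = 0;
	return sent;
}
```

As long as demo_drain() keeps being called, demo_enqueue() never fails. If it stops being called, the fifth demo_enqueue() returns -1, and every later one does too until the queue is drained again.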
There may be a race condition hidden somewhere that causes the syncer
thread to hang. To debug that, you'll have to add
    mg_log(LOG_DEBUG, "%s() %s:%d", __func__, __FILE__, __LINE__);
lines throughout sync.c:sync_sender(). When the thread stops
operating, check the last debug message so that we can get an idea of
where it got stuck.
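
A small sketch of how such checkpoints pinpoint a stall. This assumes a stand-in logger so that it compiles on its own: demo_last_msg, DEMO_TRACE() and demo_sender() are made up for the example, and in the real code each checkpoint would be the mg_log() call instead.

```c
#include <stdio.h>
#include <string.h>

/*
 * Stand-in for the real logger: it only remembers the most recent
 * checkpoint message. Hypothetical, for illustration only.
 */
static char demo_last_msg[256];

#define DEMO_TRACE() \
	(void)snprintf(demo_last_msg, sizeof(demo_last_msg), \
	    "%s() %s:%d", __func__, __FILE__, __LINE__)

/* Hypothetical function standing in for sync_sender(). */
void
demo_sender(int stall)
{
	DEMO_TRACE();		/* checkpoint 1: function entered */
	if (stall)
		return;		/* pretend the thread got stuck here */
	DEMO_TRACE();		/* checkpoint 2: past the risky step */
}
```

After demo_sender(1) the last message carries the line number of checkpoint 1, so the stall lies between checkpoints 1 and 2. The more checkpoints there are, the narrower that window gets.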
Another idea: there is a loop over all the peers:
    LIST_FOREACH(peer, &peer_head, p_list) {
You can add a log here:
    mg_log(LOG_DEBUG, "%s sync with %s", __func__, peer->p_name);
so that we can check that the peer list does not get corrupted.
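
For illustration, a self-contained sketch of that kind of check, assuming a cut-down peer record. struct demo_peer, demo_peerlist and demo_log_peers() are invented names, fprintf() stands in for mg_log(), and comparing the visit count with the number of configured peers is an extra idea rather than something the existing code does.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/queue.h>

/* Hypothetical, cut-down peer record; the real struct has more fields. */
struct demo_peer {
	const char *p_name;
	LIST_ENTRY(demo_peer) p_list;
};
LIST_HEAD(demo_peerlist, demo_peer);

/*
 * Walk the list, log each peer, and return how many were visited.
 * A count that differs from the number of configured peers hints
 * that the list was modified or corrupted.
 */
int
demo_log_peers(struct demo_peerlist *head, FILE *out)
{
	struct demo_peer *peer;
	int seen = 0;

	assert(head != NULL && out != NULL);
	LIST_FOREACH(peer, head, p_list) {
		fprintf(out, "%s sync with %s\n", __func__, peer->p_name);
		seen++;
	}
	return seen;
}
```

If the log shows a peer name that is garbage, or the count changes between passes without a configuration reload, the list is a good suspect.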
Sorry, I can't help more, as I have never seen that problem occur on my own systems.
--
Emmanuel Dreyfus
http://hcpnet.free.fr/pubz
manu@...
Re: [milter-greylist] RE: Cause of "peer queue overflow" errors?
2007-03-22 by manu@netbsd.org