Yahoo Groups archive

Milter-greylist

high load / high perf. / low memory

2004-09-10 by l_facq

I have been thinking about this:

 using a hash function like md5 on the ip/from/to info
 could be a good way to
 - speed up the matching process (to be checked)
 - limit memory fragmentation (compared to the future string allocation
replacing the fixed ADDRLEN size)
 - limit memory consumption
 - limit the .db size and speed up the dump/reload process

 It is a bit like the SYN cookie mechanism:

 when a new attempt arrives, compute something like
 hash = md5(apply_subnetmatch(ip),from,to)
 and just check whether this 'hash' value exists in the whitelist table ...
 or create it.

 In this hash table, the real values (ip,from,to) are optional
attributes. You can keep them for readability / debugging purposes
 or drop them for performance.
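
 A minimal sketch of the lookup described above, written in Python rather than milter-greylist's C. Only `apply_subnetmatch` and the md5-over-(ip,from,to) idea come from the message; the /24 default mask, the NUL separator, and the `greylist` dict are assumptions made for illustration.

```python
import hashlib
import ipaddress


def apply_subnetmatch(ip, prefix_len=24):
    """Mask an address down to its network (the /24 default is an assumption)."""
    net = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return str(net.network_address)


def triplet_hash(ip, sender, rcpt, prefix_len=24):
    """Return the 16-byte MD5 digest of the (masked ip, from, to) triplet."""
    # A NUL separator keeps ("a", "bc") and ("ab", "c") from hashing alike.
    key = "\0".join((apply_subnetmatch(ip, prefix_len), sender, rcpt))
    return hashlib.md5(key.encode("utf-8")).digest()


# digest -> first-seen timestamp; the ip/from/to strings are not kept.
greylist = {}


def seen_before(ip, sender, rcpt, now):
    """Return True if the triplet is already known, otherwise record it."""
    digest = triplet_hash(ip, sender, rcpt)
    if digest in greylist:
        return True
    greylist[digest] = now
    return False
```

 Each entry then costs 16 bytes of key plus a timestamp, whatever the length of the addresses.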

 You can also dump only the hashes instead of the full ip/from/rcpt in
greylist.db and reload them from this file too. ip,from,to are optional.
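
 Dumping and reloading only the hashes could look roughly like this. The one-line-per-entry "hexdigest timestamp" layout is a made-up format for illustration, not the actual greylist.db format, and the function names are hypothetical.

```python
import io


def dump_hashes(table, fp):
    """Write one 'hexdigest first_seen' line per entry."""
    for digest, first_seen in table.items():
        fp.write(f"{digest.hex()} {first_seen}\n")


def load_hashes(fp):
    """Rebuild the table from a dump written by dump_hashes()."""
    table = {}
    for line in fp:
        hexdigest, first_seen = line.split()
        table[bytes.fromhex(hexdigest)] = int(first_seen)
    return table


# Round trip through an in-memory file.
original = {bytes(range(16)): 1094774400}
buf = io.StringIO()
dump_hashes(original, buf)
buf.seek(0)
restored = load_hashes(buf)
```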
 
 Of course, having the ip/from/to info is easier to read!!
 But on a BIG mail server with millions of users... memory consumption
can be a problem.

 The hash could also be computed only on ip or ip+from, depending on
the lazy/strict policy we want.
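
 As a sketch of that last point, the set of fields fed to the hash can be chosen per policy. The policy labels and the `policy_hash` helper below are hypothetical.

```python
import hashlib

# Fields that go into the hash for each (hypothetical) policy label.
POLICY_FIELDS = {
    "ip": ("ip",),
    "ip+from": ("ip", "sender"),
    "ip+from+to": ("ip", "sender", "rcpt"),
}


def policy_hash(policy, ip, sender, rcpt):
    """Hash only the fields that the chosen policy cares about."""
    values = {"ip": ip, "sender": sender, "rcpt": rcpt}
    key = "\0".join(values[name] for name in POLICY_FIELDS[policy])
    return hashlib.md5(key.encode("utf-8")).hexdigest()
```

 With the "ip" policy, any sender or recipient behind an already known address maps to the same entry; with "ip+from+to", every distinct triplet gets its own.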


 L.

 --
Laurent FACQ - facq@... (05.40.00.65.34) - Reseau REAUMUR /
Bordeaux

Re: [milter-greylist] high load / high perf. / low memory

2004-09-10 by manu@netbsd.org

l_facq <facq@...> wrote:

>  I have been thinking about this:
> 
>  using a hash function like md5 on the ip/from/to info
>  could be a good way to
>  - speed up the matching process (to be checked)
>  - limit memory fragmentation (compared to the future string allocation
> replacing the fixed ADDRLEN size)
>  - limit memory consumption
>  - limit the .db size and speed up the dump/reload process

milter-greylist supports regex matching; you can't do that if you store
hashes.

If performance were a real problem, we could store both the strings
and the hash, but nobody has reported a performance issue with walking
the greylist yet.

>  But on a BIG mail server with millions of users... memory consumption
> can be a problem.

You mean if hotmail or yahoo want to use milter-greylist? I guess they
have the resources to propose a patch :)

The first performance problem you'll encounter on a big setup is not the
greylist walk, it's the text dump. I guess it could be solved by
partially dumping to multiple files, or by using a real database
backend.
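
To picture the "real database backend" option, here is a rough sketch using SQLite through Python's sqlite3 module. The table layout and function names are assumptions for illustration; nothing here reflects existing milter-greylist code.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open (or create) a greylist table keyed on the full triplet."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS greylist ("
        " ip TEXT, sender TEXT, rcpt TEXT, first_seen INTEGER,"
        " PRIMARY KEY (ip, sender, rcpt))"
    )
    return db


def check_or_add(db, ip, sender, rcpt, now):
    """Return the first-seen time of the triplet, inserting it if new."""
    row = db.execute(
        "SELECT first_seen FROM greylist WHERE ip=? AND sender=? AND rcpt=?",
        (ip, sender, rcpt),
    ).fetchone()
    if row is not None:
        return row[0]
    db.execute(
        "INSERT INTO greylist VALUES (?, ?, ?, ?)", (ip, sender, rcpt, now)
    )
    db.commit()
    return now
```

Each new triplet is written as it arrives, so there is no periodic full text dump to wait for.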

-- 
Emmanuel Dreyfus
Il y a 10 sortes de personnes dans le monde: ceux qui comprennent 
le binaire et ceux qui ne le comprennent pas.
manu@...

Re: high load / high perf. / low memory

2004-09-10 by l_facq

--- In milter-greylist@yahoogroups.com, manu@n... wrote:
> l_facq <facq@u...> wrote:
> 
> >  I have been thinking about this:
> > 
> >  using a hash function like md5 on the ip/from/to info
> >  could be a good way to
> >  - speed up the matching process (to be checked)
> >  - limit memory fragmentation (compared to the future string allocation
> > replacing the fixed ADDRLEN size)
> >  - limit memory consumption
> >  - limit the .db size and speed up the dump/reload process
> 
> milter-greylist supports regex matching; you can't do that if you store
> hashes.

 As far as I understand/imagine the process, I think this regex
matching could be done *before* hashing:

 1 regex match on from/to => exit if ok
 2 hash
 3 search this hash in db
 4 if found => mail ok
 5 not found => create an entry

 I didn't look at the whole process in detail, so maybe I'm wrong.
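
 The five steps above can be sketched as follows. The exception patterns, the `known` set and the return strings are placeholders invented for illustration; the sketch also leaves out the greylisting delay check, since the steps do not mention it.

```python
import hashlib
import re

# Hypothetical exception patterns; real ones would come from the config.
SENDER_EXCEPTIONS = [re.compile(r"@example\.org$")]
RCPT_EXCEPTIONS = [re.compile(r"^postmaster@")]

known = set()  # digests of the triplets already seen


def handle(ip, sender, rcpt):
    # 1. regex match on from/to: exit early while the strings are at hand
    if any(p.search(sender) for p in SENDER_EXCEPTIONS):
        return "whitelisted"
    if any(p.search(rcpt) for p in RCPT_EXCEPTIONS):
        return "whitelisted"
    # 2. hash the triplet
    key = "\0".join((ip, sender, rcpt))
    digest = hashlib.md5(key.encode("utf-8")).digest()
    # 3. and 4. search the hash in the db; if found, the mail is ok
    if digest in known:
        return "ok"
    # 5. not found: create an entry
    known.add(digest)
    return "greylisted"
```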

[...]
> greylist walk, it's the text dump. I guess it could be solved by
> partially dumping to multiple files, or by using a real database
> backend.

 On-the-fly compression (zlib) could be an easy way to reduce disk access
(if this is the bottleneck).
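
 A small sketch of what on-the-fly compression of the dump might look like with Python's zlib module. The line-oriented dump and both function names are assumptions; whether this helps depends on disk access really being the bottleneck.

```python
import zlib


def compress_dump(lines):
    """Compress dump lines incrementally, yielding compressed chunks."""
    comp = zlib.compressobj()
    for line in lines:
        chunk = comp.compress((line + "\n").encode("utf-8"))
        if chunk:  # the compressor buffers input, so chunks may be empty
            yield chunk
    yield comp.flush()


def decompress_dump(chunks):
    """Inverse of compress_dump(): return the list of dump lines."""
    decomp = zlib.decompressobj()
    data = b"".join(decomp.decompress(chunk) for chunk in chunks)
    data += decomp.flush()
    return data.decode("utf-8").splitlines()
```

 The chunks can be written to the dump file as they are produced, so the uncompressed text never has to be held in memory as a whole.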

 LF.

--
Laurent FACQ - Réseau REAUMUR / Université Bordeaux I

Re: high load / high perf. / low memory

2004-09-11 by l_facq

--- In milter-greylist@yahoogroups.com, "l_facq" <facq@u...> wrote:
> --- In milter-greylist@yahoogroups.com, manu@n... wrote:
> > l_facq <facq@u...> wrote:
> > 
> > >  I have been thinking about this:
> > > 
> > >  using a hash function like md5 on the ip/from/to info
> > >  could be a good way to
> > >  - speed up the matching process (to be checked)
> > >  - limit memory fragmentation (compared to the future string allocation
> > > replacing the fixed ADDRLEN size)
> > >  - limit memory consumption
> > >  - limit the .db size and speed up the dump/reload process
> > 
> > milter-greylist supports regex matching; you can't do that if you store
> > hashes.
> 
>  As far as I understand/imagine the process, I think this regex
> matching could be done *before* hashing:
> 
>  1 regex match on from/to => exit if ok
>  2 hash
>  3 search this hash in db
>  4 if found => mail ok
>  5 not found => create an entry
> 
>  I didn't look at the whole process in detail, so maybe I'm wrong.

 Well, all the regex function calls are in the except_ functions,
 and the except_ functions are only called from milter-greylist.c:

milter-greylist.c:      if ((priv->priv_whitelist = except_sender_filter(SA(&priv->priv_addr),
milter-greylist.c:      if ((priv->priv_whitelist = except_rcpt_filter(rcpt,
milter-greylist.c:      except_init();

 These checks are done while all the info is available; after that,
 only the hash is needed, in the db in particular.

 So hashing the info is not a problem with regard to regex matching.
> 
> [...]
> > greylist walk, it's the text dump. I guess it could be solved by
> > partially dumping to multiple files, or by using a real database
> > backend.
> 
>  On-the-fly compression (zlib) could be an easy way to reduce disk access
> (if this is the bottleneck).
> 
>  LF.
> 
> --
> Laurent FACQ - Réseau REAUMUR / Université Bordeaux I

Re: [milter-greylist] Re: high load / high perf. / low memory

2004-09-11 by Cyril Guibourg

"l_facq" <facq@...> writes:

>  As far as I understand/imagine the process, I think this regex
> matching could be done *before* hashing:
>
>  1 regex match on from/to => exit if ok
>  2 hash
>  3 search this hash in db
>  4 if found => mail ok
>  5 not found => create an entry
>
>  I didn't look at the whole process in detail, so maybe I'm wrong.

This was already discussed a while ago.
See: http://groups.yahoo.com/group/milter-greylist/message/108 for other
inputs.
