On Tue, Jun 01, 2004 at 06:32:49PM +0200, Enrico Scholz wrote:
> > This impacts the memory usage.
> * Is this really a problem with today's hardware (1GB+ RAM)?
My setup has 256 MB, so yes, memory usage can be a problem for me :)
> > 2) ensure regex are matched on the address before it is truncated
> Sounds good, and is probably the best choice. Doing this would allow
> storing hashes in the database: the comparisons with settings in the
> configuration file (rcpt/from) happen on the real addresses (pointers
> given by the milter interface), and the lookup in the database is done
> with hashes.
>
> This would make milter-greylist very efficient:
I have no idea how costly computing a cryptographic hash is. Are you sure
it won't consume more CPU if we go that way?
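
One rough way to get a number for the CPU question is to time the hash
directly. The sketch below is only an illustration in Python, not
milter-greylist code: triplet_hash() and seconds_per_hash() are
hypothetical names, the addresses are placeholders, and the
"relay#from#rcpt" key layout is taken from the sha1() proposal quoted
further down.

```python
import hashlib
import timeit


def triplet_hash(relay: str, sender: str, rcpt: str) -> bytes:
    """Return the 20-byte binary SHA-1 of "relay#from#rcpt" (assumed key layout)."""
    key = "%s#%s#%s" % (relay, sender, rcpt)
    return hashlib.sha1(key.encode("utf-8")).digest()


def seconds_per_hash(runs: int = 10000) -> float:
    """Average wall-clock seconds spent in one triplet_hash() call."""
    total = timeit.timeit(
        lambda: triplet_hash("192.0.2.1", "alice@example.org", "bob@example.net"),
        number=runs,
    )
    return total / runs


if __name__ == "__main__":
    # The figure depends entirely on the machine, so none is quoted here.
    print("%.2f microseconds per hash" % (seconds_per_hash() * 1e6))
```

The result only means something next to the cost of the rest of the
per-message work, so it would have to be compared on the same machine.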
> * generate *one* hash over the entire triple, e.g.
> | sha1("%s#%s#%s", relay, from, rcpt) --> 40 bytes ASCII resp. 20 bytes binary
You can't do that: you lose the ability to perform subnet matching. At most,
you can hash from and rcpt addresses, but not the IP. Or you store the IP
next to the hash, and each time the config file is changed, you regenerate
the hashes. That's more complicated (I've done it for Berkeley DB support,
it was a pain to debug).
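
To make the first option concrete (hash only from and rcpt, keep the IP
in clear), here is a minimal sketch in Python. It is not how
milter-greylist stores entries: make_entry() and matches() are
hypothetical helpers, the "from#rcpt" key layout is an assumption, and
the default prefix length of 32 assumes IPv4 relays.

```python
import hashlib
import ipaddress


def make_entry(relay, sender, rcpt):
    """Keep the relay IP in clear; hash only the from/rcpt pair."""
    key = "%s#%s" % (sender, rcpt)
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return (ipaddress.ip_address(relay), digest)


def matches(entry, relay, sender, rcpt, prefixlen=32):
    """True if from/rcpt hash to the stored digest and relay lies in the
    subnet of the stored IP with the given prefix length."""
    stored_ip, stored_digest = entry
    _, digest = make_entry(relay, sender, rcpt)
    # strict=False lets us build the network from a host address.
    net = ipaddress.ip_network("%s/%d" % (stored_ip, prefixlen), strict=False)
    return digest == stored_digest and ipaddress.ip_address(relay) in net
```

Because the IP is never part of the hash, the prefix length can change
with the configuration without regenerating anything:

```python
entry = make_entry("192.0.2.10", "alice@example.org", "bob@example.net")
matches(entry, "192.0.2.77", "alice@example.org", "bob@example.net", prefixlen=24)  # same /24
```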
> - I would recommend sha1 over md5 since it is more collision resistant
I was wondering which hash would be best to use here. I suspect we don't care
about the security properties of hashes (i.e. that it is difficult to create
two identical hashes from different data on purpose), but we need efficiency
and collision resistance.
> * do a binary search over these lists (insert is more expensive than
> the current linked-list implementation, but there should be far more
> 'lookup' than 'insert' operations)
I thought about this too, but before going down that path, we need to measure
how long a lookup takes. It's useless to optimize something that is cheap
compared to other problems. I suspect the database dump to be the biggest
CPU hog. Would you like to run some tests?
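
For anyone who wants to run such tests, the sorted-list idea can be
prototyped in a few lines. This is a sketch in Python under assumptions,
not the milter-greylist data structure: SortedTable is a hypothetical
class, keys are assumed to be 20-byte SHA-1 digests, and each entry
carries only a timestamp.

```python
import bisect
import hashlib


class SortedTable:
    """Digests kept in sorted order, with timestamps in a parallel list."""

    def __init__(self):
        self._keys = []    # sorted list of digests (bytes)
        self._stamps = []  # _stamps[i] belongs to _keys[i]

    def insert(self, digest, stamp):
        """Binary search for the slot, then shift the tail: O(n) per insert."""
        i = bisect.bisect_left(self._keys, digest)
        if i < len(self._keys) and self._keys[i] == digest:
            self._stamps[i] = stamp  # already present: refresh the timestamp
        else:
            self._keys.insert(i, digest)
            self._stamps.insert(i, stamp)

    def lookup(self, digest):
        """Binary search: O(log n). Returns the timestamp, or None if absent."""
        i = bisect.bisect_left(self._keys, digest)
        if i < len(self._keys) and self._keys[i] == digest:
            return self._stamps[i]
        return None


table = SortedTable()
key = hashlib.sha1(b"192.0.2.1#alice@example.org#bob@example.net").digest()
table.insert(key, 1086107569)
```

Timing lookup() against a plain linear scan over the same keys (for
instance with timeit) would show whether the lookup matters at all
compared to the database dump.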
> Advantages:
> * reduced memory consumption (20 bytes + timestamp per entry)
> * faster (O(log n) for lookup)
> * enhanced privacy
I disagree with the last advantage: the log file is only read by the
administrator, and the ability to easily debug things is probably more
useful than the ability to hide what was greylisted from the sysadmin.
If you follow the privacy path too far, you configure syslog to send
mail.* to /dev/null...
--
Emmanuel Dreyfus
manu@...
Re: [milter-greylist] Re: is this a DoS?
2004-06-01 by Emmanuel Dreyfus