Yahoo Groups archive

Milter-greylist

Re: [milter-greylist] Re: Limiting resident memory usage

2006-11-02 by eclark

Matt, that depends entirely on how greedy the regex in question is, and on how efficient one stack is versus another. I would bet money that the regex evaluation in the milter is vastly inferior to the DNS resolution stack that the milter is compiled against. Moreover, network latency is a non-issue when you mirror RBLs locally inside your own network, or regularly populate local sendmail databases with content pulled in automatically.

The bigger point is that the broader the stroke you can cut, the more you can knock out at once, which ultimately means less resource consumption, even if you are using remote network connectivity to do some of your work (as remote connectivity is limited to what got past the initial greylist in the first place).

I can speak firsthand about the ailments that are caused by RBLs; we have seen mailservers brought to their knees running sendmail RBLs as if it were nothing, with no additional filtering at all. Drop name service to a mailserver, and your RBL lookups will wait for an extended period until the process dies, and back up the run queue in no time.

No, RBLs are not an end-all solution, and were not suggested as one. They were pointed out as a replacement for an unknown number of expensive regexes. You are making the fatal error of assuming the only expressions being evaluated were the two pasted; I seriously doubt this is the case, and figure there are probably many more in his conf as well. At what point would you concede that alternate, broader methods of checking would be superior to a list of expressions? 10 greedy regexes? 15? 5? If replacing an entire greylisting mechanism made of 15 or more greedy expressions with one local hostname lookup against a mirror in your immediate network is considered less effective, then you truly have me stumped as to what might be considered more efficient.
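
For anyone unsure what a single RBL check actually asks the resolver to do, here is a minimal Python sketch of the usual DNSBL convention: reverse the IPv4 octets, append the list's zone, and treat any answer as "listed". The zone `dnsbl.example.net`, the function names, and the injectable `resolver` argument are assumptions made up for illustration; none of this is taken from milter-greylist or sendmail.

```python
import ipaddress
import socket


def dnsbl_query_name(ip, zone):
    """Build the hostname that is looked up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone.strip('.')}"


def is_listed(ip, zone, resolver=socket.gethostbyname):
    """Return True if the lookup yields an address, False if it fails to resolve.

    `resolver` is a hypothetical hook so the check can be pointed at a
    stub in tests; by default it goes through the system resolver.
    """
    try:
        resolver(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # No such name: the address is not on the list. Note that a dead
        # name server also ends up here, but only after the resolver
        # timeout has expired, which is the stall described above.
        return False
    return True


# Hypothetical zone name, for illustration only.
print(dnsbl_query_name("192.0.2.1", "dnsbl.example.net"))
# prints "1.2.0.192.dnsbl.example.net"
```

Pointing the default resolver at a mirror inside your own network keeps the lookup local, which is the setup argued for above.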
 

On Thu, 02 Nov 2006 16:14:21 -0500, Matt Kettler <mkettler@...> wrote:
> manu@... wrote:
>> Matt Kettler <mkettler@...> wrote:
>>
>>> Wait.. you think the *regex* is too resource intensive, but advocate using
>>> RBLs instead?
>>>
>>> Are you completely out of your MIND???!!!
>>>
>>>
>>> An RBL is a NETWORK TEST. You have to create a UDP socket, send a request,
>>> wait for a reply, parse the reply..
>>
>> That's it: you wait. That means the thread is sleeping and the CPU works
>> somewhere else. Perhaps in another milter-greylist thread, perhaps in
>> another process.
> 
> Yes, I know that no CPU is being used during this time. I addressed both time
> AND real CPU clock cycles.
> 
>>
>> Indeed a DNS lookup increases latency, but it does not load the CPU as
>> regexp computation does.
>>
> 
> I'd argue it does load the CPU more than a regex, although to a lesser degree
> than it increases latency. (ie: yes, it's obvious there's a massive increase in
> latency, but I say there's also a smaller increase in CPU load).
> 
> Yes, you do wait while waiting for the response to come back, and while doing
> that you are consuming no CPU cycles.
> 
> However, the number of actual CPU cycles burned building the query, sending it,
> receiving the reply and parsing that is by far higher than the regex is.
> 
> Remember, you don't just count the code that milter-greylist is running.
> Consider all the code in the resolver library, OS IP stack, and NIC driver. That
> all has to run too. And that all takes CPU time as well.
> 
> Sure, regexes are expensive compared to a binary compare, but they're not *THAT*
> expensive.
> 
> Parsers are expensive too, and that's exactly what the resolver is going to have
> to do with the DNS reply it gets.
> 
> Think about it. Picture in your head all the things your computer does to create
> a DNS query in a UDP packet, send it, receive a response, and process the
> response.
> 
> No, really. think about all of it.
> 
> Sending the query:
> buffer allocation, DNS query formatting, context switch to kernel, IP/UDP header
> addition, ethernet header addition, NIC programming.
> 
> <sleep for free>
> 
> Receiving the response:
> 
> interrupt handler, kernel thread wake, (possible memcpy depending on NIC and
> kernel behaviors), ethernet header parsing, IP header parsing, UDP header
> parsing, match against existing socket handles, wake user thread, context
> switch to user space, buffer allocation, context switch to kernel, memcpy data
> to user app buffer, context switch to user space, DNS response format parsing,
> buffer deallocation.
> 
> There's a LOT of work going on under the covers here. That's not cheaper than a
> pair of short regexes. If you think it is, you're ignoring a large number of
> these steps which are all wrapped up in a library for you.
> 
> All the context switches alone are likely on-par in clock-cycles burned with the
> regex evaluation. Those are not at all inexpensive because the entire CPU state
> has to be saved off into a task descriptor. There's at least 3 context switches
> involved here, and that's before you actually do any real work.
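
Two of the user-space steps in the lists above, "DNS query formatting" and "DNS response format parsing", can be sketched in a few lines of Python. This is only a rough illustration following the standard DNS wire format (12-byte header, length-prefixed labels, QTYPE and QCLASS); the function names, the fixed query ID, and the example hostname are assumptions, and a real resolver library does a good deal more (retries, name compression, answer-section parsing). It does not settle which side of the cost argument is right; it just makes the per-lookup work visible.

```python
import struct


def build_dns_query(hostname, query_id=0x1234):
    """Format a single A-record question as a DNS query packet."""
    # Header: ID, flags (0x0100 = recursion desired), QDCOUNT=1, rest zero.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    qname = b""
    for label in hostname.rstrip(".").encode("ascii").split(b"."):
        if not 0 < len(label) <= 63:
            raise ValueError("each DNS label must be 1 to 63 bytes long")
        qname += bytes([len(label)]) + label  # length-prefixed label
    qname += b"\x00"  # root label terminates the name
    # QTYPE=1 (A record), QCLASS=1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def parse_dns_header(packet):
    """Pull the fields a caller checks first out of a reply header."""
    if len(packet) < 12:
        raise ValueError("DNS packet shorter than its 12-byte header")
    query_id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", packet[:12])
    return {
        "id": query_id,
        "is_response": bool(flags & 0x8000),
        "rcode": flags & 0x000F,  # 0 = no error, 3 = name does not exist
        "answers": ancount,
    }


# Hypothetical lookup name, for illustration only.
packet = build_dns_query("1.2.0.192.dnsbl.example.net")
print(len(packet))  # prints 45
```

Everything below this level (socket calls, kernel IP stack, NIC driver) is outside the sketch.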
