Duped Messages (was Re: Please split this list!)

Joel Newkirk freerunner at newkirk.us
Mon Sep 1 15:23:21 CEST 2008


Rui Miguel Silva Seabra wrote:
> Since we're talking about the mailing lists, I still receive (randomly),
> three repeated mails, two repeated mails, etc...
>
> This mail from Vasco... I received it three times already! :)
>
> Rui
>
> On Mon, Sep 01, 2008 at 12:24:03PM +0100, vasco.nevoa at sapo.pt wrote:
>   

If you examine the email headers, you can see that the 'Received' header 
documenting where the mail server handling the list (sita.openmoko.org, 
running Exim 4.63) received the message from the sender's mailserver 
(i.e., their ISP's mailserver) differs from one copy to the next. This 
means that either the sending server is failing to recognize that the 
message has been delivered and resends it, or the receiving server is 
failing to send the acknowledgement of receipt at the end of the SMTP 
transaction (at least in a timely fashion), so the sending server 
automatically retries.

Received: from relay1.ptmail.sapo.pt ([212.55.154.21] helo=sapo.pt)
	by sita.openmoko.org with smtp (Exim 4.63)
	(envelope-from <vasco.nevoa at sapo.pt>) id 1Ka7yh-0002Ow-Ha
	for community at lists.openmoko.org; Mon, 01 Sep 2008 13:55:37 +0200

Received: from relay1.ptmail.sapo.pt ([212.55.154.21] helo=sapo.pt)
	by sita.openmoko.org with smtp (Exim 4.63)
	(envelope-from <vasco.nevoa at sapo.pt>) id 1Ka7fK-0005dL-Pf
	for community at lists.openmoko.org; Mon, 01 Sep 2008 13:55:37 +0200

Received: from relay1.ptmail.sapo.pt ([212.55.154.21] helo=sapo.pt)
	by sita.openmoko.org with smtp (Exim 4.63)
	(envelope-from <vasco.nevoa at sapo.pt>) id 1Ka7ZO-0003bU-8Q
	for community at lists.openmoko.org; Mon, 01 Sep 2008 13:55:21 +0200


Notice that the message ID (the 'id' field in each Received header, 
which is Exim's own ID for the message, not the Message-ID: header) 
differs between the three copies. This tells us that the failure occurs 
at this point, when the sapo.pt mailserver delivers to the 
sita.openmoko.org mailserver.  If it were the mailing list server 
sending dupes, this header would be identical among all copies.
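
The comparison above can be sketched in a few lines of Python. This is 
only an illustration, assuming the three Received headers quoted above 
have been pasted in as strings; the function names and the regular 
expression are my own and not part of any mail tool, and the 6-6-2 ID 
layout is simply what these Exim 4.63 headers show.

```python
import re

# Matches the ID Exim writes into its Received: header, e.g.
# "id 1Ka7yh-0002Ow-Ha". The 6-6-2 layout is taken from the headers
# quoted above; other Exim versions may format it differently.
EXIM_ID = re.compile(r"\bid\s+([0-9A-Za-z]{6}-[0-9A-Za-z]{6}-[0-9A-Za-z]{2})\b")

# The three Received: headers from the duplicate copies, as archived.
RECEIVED = [
    "Received: from relay1.ptmail.sapo.pt ([212.55.154.21] helo=sapo.pt)\n"
    "\tby sita.openmoko.org with smtp (Exim 4.63)\n"
    "\t(envelope-from <vasco.nevoa at sapo.pt>) id 1Ka7yh-0002Ow-Ha\n"
    "\tfor community at lists.openmoko.org; Mon, 01 Sep 2008 13:55:37 +0200",
    "Received: from relay1.ptmail.sapo.pt ([212.55.154.21] helo=sapo.pt)\n"
    "\tby sita.openmoko.org with smtp (Exim 4.63)\n"
    "\t(envelope-from <vasco.nevoa at sapo.pt>) id 1Ka7fK-0005dL-Pf\n"
    "\tfor community at lists.openmoko.org; Mon, 01 Sep 2008 13:55:37 +0200",
    "Received: from relay1.ptmail.sapo.pt ([212.55.154.21] helo=sapo.pt)\n"
    "\tby sita.openmoko.org with smtp (Exim 4.63)\n"
    "\t(envelope-from <vasco.nevoa at sapo.pt>) id 1Ka7ZO-0003bU-8Q\n"
    "\tfor community at lists.openmoko.org; Mon, 01 Sep 2008 13:55:21 +0200",
]


def exim_ids(received_headers):
    """Return the Exim ID found in each header (None where there is none)."""
    ids = []
    for header in received_headers:
        match = EXIM_ID.search(header)
        ids.append(match.group(1) if match else None)
    return ids


def accepted_separately(received_headers):
    """True when the copies carry more than one distinct ID, meaning the
    list server accepted them in separate SMTP transactions."""
    ids = [i for i in exim_ids(received_headers) if i is not None]
    return len(set(ids)) > 1


print(exim_ids(RECEIVED))
print(accepted_separately(RECEIVED))
```

If the list software itself were fanning out duplicates, every copy 
would carry the same ID and accepted_separately() would return False.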

Also, it doesn't depend on the sending mailserver software: I've noted 
it happening with qmail as sender, exim, gmail.com, and others (even 
mail.openmoko.org sometimes, such as Andy Green's reply to the '3G 
modem' thread).  Based on past experience as admin of a cluster of 
mailservers that sometimes exceeded 1 million incoming SMTP connections 
per day, I suspect that either spam filtering or some testing for the ML 
(i.e., 'is this sender permitted to post?') takes place AFTER receiving 
the message but BEFORE telling the sending server that the message was 
received, and is periodically taking longer than the sending server's 
SMTP timeout, so the sender gives up on the connection and tries again - 
meanwhile the exim server handling the ML eventually accepts the 
message.  My suspicion is that the load on the (virtual?) server hosting 
the ML is getting to a level where processing messages sometimes takes 
longer than some sending servers are willing to wait.  Properly, in such 
a situation, the sending server is supposed to resend, and the receiving 
server is supposed to discard the message it failed to fully receive 
before the connection was broken.
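
To make the suspected race concrete, here is a toy model in Python. It 
is a sketch of the scenario described above, not of Exim's actual 
behaviour: the function name, the discard_on_disconnect switch, and all 
the timings are hypothetical values chosen for illustration.

```python
def copies_accepted(processing_times, sender_timeout,
                    discard_on_disconnect=False):
    """Model one message being retried until the sender sees an ack.

    processing_times: seconds the receiver spends on each successive
        attempt (filtering, sender checks) before it would acknowledge.
    sender_timeout: seconds the sender waits for that acknowledgement.
    discard_on_disconnect: if True, the receiver drops a message whose
        connection broke before the acknowledgement was sent.

    Returns how many copies the receiver ends up accepting.
    """
    copies = 0
    for seconds in processing_times:
        sender_gave_up = seconds > sender_timeout
        if sender_gave_up and discard_on_disconnect:
            continue  # proper behaviour: nothing is kept, sender retries
        copies += 1   # receiver accepts (queues) this copy
        if not sender_gave_up:
            break     # sender saw the acknowledgement and stops retrying
    return copies


# Hypothetical timings: two slow attempts, then a fast one.
print(copies_accepted([700, 650, 5], sender_timeout=600))
print(copies_accepted([700, 650, 5], sender_timeout=600,
                      discard_on_disconnect=True))
```

With these made-up numbers the first call yields three copies, matching 
the triple delivery seen on the list, while the second yields one.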

Unfortunately, my mailserver experience is with surgemail, qmail, 
sendmail, and some exchange (ick); I've never worked with Exim, so I 
can't suggest anything specific to check in the server config.  (When 
I've seen this caused by the receiving mailserver, it was most often 
qmail, and was caused by improperly configured/limited spawning that 
exceeded available RAM instead of deferring excess inbound connections. 
Once a server dips into swap, all bets are off regarding timely 
responses.)

j





