[cairo] Server RAID failure and various mailing list problems

Carl Worth cworth at cworth.org
Mon Aug 15 13:40:31 PDT 2005


The machine that hosts cairographics.org recently suffered some disk
failure[1].

So far, I've verified that the cairo CVS repository remains intact[2].

But the mailing lists had several problems. Here are the problems I've
identified so far and the progress we've been able to make on
repairing them:

1) Delivery of new messages was broken for about a day.

	If you're receiving this message, then delivery is now working
	again. If you had sent a message and received a rejection,
	please try again.

2) New subscriptions were broken for about a day.

	This should be fixed now (and I'll double-check it). If you
	had this problem then I don't know why I'm even mentioning it,
	since you won't be receiving this answer.

3) The existing subscription list was corrupted and the system
   automatically reverted to an old copy (from some hard-to-identify
   time). The old copy was missing on the order of 200 subscriptions,
   and would certainly also include addresses that people had since
   removed.

	For the missing subscriptions, I grepped[3] through the mail
	logs from Saturday and added back all the addresses I could
	identify as having received some list messages that went
	out. If I missed some, unfortunately I don't have any way to
	notify the affected individuals.

	For the previously-removed subscriptions, I also don't have
	any information on which addresses those are. So, I apologize
	if you have received this message in error. Please feel free
	to unsubscribe again [4] and we'll try to avoid this problem
	occurring again in the future.

4) The mailing list archives were definitely corrupted. As of this
   morning, several of the index pages from the last few months were
   filled with garbage.

	I ran the command to rebuild the archives from the giant mbox
	file that mailman maintains. After the rebuild, the archive
	indexes look much better. They're not quite perfect, as can be
	seen by the few "No subject" messages with partial contents at
	the end of this page:

	http://lists.freedesktop.org/archives/cairo/2005-August/thread.html

	I personally don't plan to put much more effort into repairing
	the archives further. But if anyone wants to look at the .mbox
	files and find the corrupted messages, we could replace them
	with known-good messages from private copies. I can provide
	known-good messages from my private copy, and I can also
	supply the giant .mbox file that mailman uses if the per-month
	mbox files it provides on the web page are insufficient for
	tracking down the problems.
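
	For anyone who does want to hunt for the damaged messages, a
	rough starting point is to list the messages in an mbox file
	whose header block has no Subject line. The sketch below is a
	heuristic only: the function name find_no_subject is
	hypothetical, and it assumes that every line starting with
	"From " is a message separator, which may not hold in a
	corrupted file.

```shell
# Hypothetical helper: print the line number of the "From " separator of
# every message in an mbox file whose headers contain no Subject line.
# Usage: find_no_subject archive.mbox
find_no_subject() {
	awk '
		# A line starting with "From " is treated as the start of a message.
		/^From / {
			# Report the previous message if it had no Subject header.
			if (n > 0 && !has_subject) print start
			n++; start = NR; in_headers = 1; has_subject = 0
			next
		}
		# The first blank line ends the header block of the current message.
		in_headers && /^$/ { in_headers = 0 }
		# Only a Subject line inside the header block counts.
		in_headers && /^[Ss]ubject:/ { has_subject = 1 }
		# Report the last message in the file as well.
		END { if (n > 0 && !has_subject) print start }
	' "$1"
}
```

	The printed line numbers can then be opened in an editor to
	inspect each suspect message by hand.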

-Carl

[1] http://lists.freedesktop.org/archives/sitewranglers/2005-August/001025.html

[2] I had a pre-failure, rsync-based copy of the cairo CVS repository
which I compared against a post-failure version. The differences that
I saw all seem to correlate with recent commits.
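
A comparison of this kind can be sketched with a recursive diff. The
tool actually used is not named here, so the function name
compare_trees and both directory arguments below are assumptions
chosen for illustration.

```shell
# Hypothetical helper: list the files that differ between two copies of a tree.
# Usage: compare_trees /path/to/pre-failure-copy /path/to/post-failure-copy
compare_trees() {
	# -r descends into subdirectories; -q reports only which files differ
	# or exist on one side only, without printing the differing content.
	# diff exits with status 1 when it finds any difference.
	diff -rq "$1" "$2"
}
```

Each reported file can then be checked against the recent commit
history to decide whether the difference is expected.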

[3] For the benefit of anyone who needs to fix similarly affected
lists, here's what I did. I took the Message-Id from the header of a
message that previously went out with the correct subscription
list. Then I found the log file that corresponds to that date and
un-gzip-ed it. Finally I ran the following commands:

	export MSGID=1124029513.30753.7.camel@localhost.localdomain
	export LOG=mail.log.1
	# Find the queue IDs logged for that Message-Id, then list every recipient of each one.
	for id in $(grep "message-id=<$MSGID>" "$LOG" | sed -e 's/^.*: \([0-9A-F]*\): message-id.*/\1/'); do
		grep "$id: to=" "$LOG" | sed -e "s/^.*$id: to=<\([^>]*\)>.*/\1/"
	done | sort -u
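
The same steps can be wrapped in a reusable function, sketched below.
The name list_recipients is hypothetical. The sketch makes the same
assumption as the commands above about the log layout, namely that
each relevant line contains a queue ID followed by either
"message-id=<...>" or "to=<...>". It uses grep -F so that the dots in
the Message-Id are matched literally.

```shell
# Hypothetical wrapper around the pipeline above.
# Usage: list_recipients MESSAGE-ID LOGFILE
list_recipients() {
	msgid=$1
	log=$2
	# Collect the queue ID of every log line that carries this Message-Id.
	# -F matches the string literally rather than as a regular expression.
	grep -F "message-id=<$msgid>" "$log" |
		sed -e 's/^.*: \([0-9A-F]*\): message-id.*/\1/' |
		sort -u |
		while read -r id; do
			# For each queue ID, keep only the address inside to=<...>.
			grep -F "$id: to=<" "$log" |
				sed -e "s/^.*$id: to=<\([^>]*\)>.*/\1/"
		done | sort -u
}
```

For example, list_recipients "$MSGID" "$LOG" prints one address per
line, sorted and without duplicates.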

[4] http://cairographics.org/cgi-bin/mailman/listinfo/cairo