[cairo] Server RAID failure and various mailing list problems
Carl Worth
cworth at cworth.org
Mon Aug 15 13:40:31 PDT 2005
The machine that hosts cairographics.org recently suffered some disk
failure[1].
So far, I've verified that the cairo CVS repository remains intact[2].
But the mailing lists had several problems. Here are the problems I've
identified so far and the progress we've been able to make on
repairing them:
1) Delivery of new messages was broken for about a day.
If you're receiving this message, then delivery is now working
again. If you had sent a message and received a rejection,
please try again.
2) New subscriptions were broken for about a day.
This should be fixed now (and I'll double-check it). If you
had this problem then I don't know why I'm even mentioning it,
since you won't be receiving this answer.
3) The existing subscription list was corrupted and the system
automatically reverted to an old copy (from some hard-to-identify
time). The old copy was missing on the order of 200 subscriptions,
and would certainly also include addresses that people had since
removed.
For the missing subscriptions, I grepped[3] through the mail
logs from Saturday and added back all the addresses I could
identify as having received some list messages that went
out. If I missed some, unfortunately I don't have any way to
notify the affected individuals.
For the previously-removed subscriptions, I also don't have
any information on which addresses those are. So, I apologize
if you have received this message in error. Please feel free
to unsubscribe again [4] and we'll try to avoid this problem
occurring again in the future.
4) The mailing list archives were definitely corrupted. As of this
morning, several of the index pages from the last few months were
filled with garbage.
I ran the command to rebuild the archives from the giant mbox
file that mailman maintains. After the rebuild, the archive
indexes look much better. They're not quite perfect, as can be
seen by the few "No subject" messages with partial contents at
the end of this page:
http://lists.freedesktop.org/archives/cairo/2005-August/thread.html
I personally don't plan to put much more effort into repairing
the archives further. But if anyone wants to look at the .mbox
files and find the corrupted messages, we could replace them
with known-good messages from private copies. I can provide
known-good messages from my private copy, and I can also
supply the giant .mbox file that mailman uses if the per-month
mbox files it provides on the web page are insufficient for
tracking down the problems.
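
For anyone who does want to hunt for the damaged messages, here is a rough sketch of one way to start. It assumes the corrupted messages are the ones that show up as "No subject", meaning they have no Subject: header at all, and that the mbox file separates messages with lines beginning with "From " (with such lines inside message bodies escaped as ">From "). The function name find_no_subject is made up for this example, and it has not been run against the real cairo mbox.

```shell
# Hypothetical helper: print the line number of the "From " separator
# for every message in an mbox file that has no Subject: header.
# Assumes "From " at the start of a line always begins a new message.
find_no_subject() {
    awk '
        /^From / {
            # A new message starts: report the previous one if it had no subject.
            if (in_msg && !has_subject) print start
            in_msg = 1; in_hdr = 1; has_subject = 0; start = NR
            next
        }
        # The first blank line ends the header section of the current message.
        in_hdr && /^$/ { in_hdr = 0 }
        # Header names are case-insensitive, so compare in lower case.
        in_hdr && tolower($0) ~ /^subject:/ { has_subject = 1 }
        END { if (in_msg && !has_subject) print start }
    ' "$1"
}
```

Running find_no_subject on one of the per-month mbox files would give a list of line numbers to inspect by hand. A message can be damaged and still have a Subject: header, so this only narrows the search.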
-Carl
[1] http://lists.freedesktop.org/archives/sitewranglers/2005-August/001025.html
[2] I had a pre-failure, rsync-based copy of the cairo CVS repository
which I compared against a post-failure version. The differences that
I saw all seem to correlate with recent commits.
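
A comparison of this kind can be sketched with a recursive diff. This is only an illustration, not necessarily the exact command used for the check above. The function name compare_trees and both directory arguments are placeholders.

```shell
# Hypothetical helper: list files that differ between two copies of a tree.
# $1 is the pre-failure copy, $2 the post-failure copy (both placeholders).
compare_trees() {
    old_copy=$1
    new_copy=$2
    # -r recurses into subdirectories; -q reports only which files differ
    # or exist on one side only, without printing the differences.
    diff -rq "$old_copy" "$new_copy"
}
```

The exit status is 0 when the two trees are identical and non-zero otherwise, and each reported file can then be checked against the commit history.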
[3] For the benefit of anyone who needs to fix similarly affected
lists, here's what I did. I took the Message-Id from the header of a
message that previously went out with the correct subscription
list. Then I found the log file that corresponds to that date and
un-gzip-ed it. Finally I ran the following commands:
export MSGID='1124029513.30753.7.camel@localhost.localdomain'
export LOG=mail.log.1
# Find the queue IDs logged for this Message-Id, then list every recipient of each.
for id in $(grep "message-id=<$MSGID>" "$LOG" | sed -e 's/^.*: \([0-9A-F]*\): message-id.*/\1/'); do
grep "$id: to=" "$LOG" | sed -e "s/^.*$id: to=<\([^>]*\)>.*/\1/"
done | sort | uniq
[4] http://cairographics.org/cgi-bin/mailman/listinfo/cairo