I won’t say “Python Sucks” – that would be a terribly irresponsible thing to say. I will say that it’s “challenging”. It’s certainly frustrating to deal with as a Perl programmer.
On the one hand, I was very pleasantly surprised at just how little code I had to write in order to retrieve all of my WordPress comment notifications. Saving them to mbox format was also simple enough, although Mutt’s idea of what constitutes mbox format set me back for a while. There is, of course, no actual defined standard for mbox, but that’s another rant…
Oh, and this site was defaced this morning. Which was nice. Something to do with a file called .wp-rocn.php. Google shows 0 results – maybe a zero-day vuln? I’ve tightened things up a little here, and made backups. We’ll see what happens…
Anyway, I’m currently dumping all the relevant Gmail messages into a great big mbox file, which I’ll then interrogate with Perl, and (probably) generate a SQL script from it that I can feed into the WordPress database to restore all the old comments.
Here’s my code, as promised, for real Python developers to point and laugh at:
#!/usr/bin/python
import libgmail

ga = libgmail.GmailAccount("my_email@gmail.com", "my_password")
ga.login()
query = ga.getMessagesByQuery('"[Barry Price] Comment"', True)
f = open("wp.mbox", "w")
for thread in query:
    for msg in thread:
        lines = msg.source.split('\n')
        i = 0
        for line in lines:
            i += 1 # why in Cthulhu's name does i++ not work?!
            if i < 2: # couldn't remember if comparison was '=' or '==' :)
                # skip the first line, cos it's always just whitespace.
                # the library author says he may fix this in the future.
                # thanks, library author.
                # Let's write a "From" line instead to start the mbox record.
                # The date here is irrelevant, but Mutt insists on having one in this format
                f.write("From wordpress@example.com Mon May 18 12:00:00 2009\n")
                continue
            else:
                # save the line to the mbox file; split() ate the newline, so put it back
                f.write(line + "\n")
        f.write("\n") # blank line to end the mbox record
f.close() # and close the mbox file. All done.
See, not much actual code at all. I’m left with the file “wp.mbox” which can be interrogated in a language I’m far more comfortable with for Stage Two.
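For comparison, here is a minimal sketch of the same save step using the mailbox module from Python's standard library. That module writes its own "From " separator line and escapes any body line that starts with "From ". This is Python 3 and it is not what the script above does: the function name save_to_mbox, and the idea that each message's raw source arrives as bytes, are assumptions made for illustration. Whether Mutt is happy with the resulting file is untested here.

```python
import mailbox


def save_to_mbox(path, raw_messages):
    """Append raw message sources (bytes) to the mbox file at `path`.

    Hypothetical helper, not part of libgmail or of the script above.
    """
    box = mailbox.mbox(path)  # the file is created if it does not exist
    try:
        for raw in raw_messages:
            # lstrip() drops the stray leading whitespace line, so the
            # message starts at its first header. mailbox then adds a
            # generated "From " separator line of its own.
            box.add(raw.lstrip())
        box.flush()
    finally:
        # No locking is done here; a script sharing the file with a
        # running mail client would want box.lock() / box.unlock().
        box.close()
```

Calling save_to_mbox("wp.mbox", sources) with a list of raw message sources would stand in for the whole nested loop, at the cost of a "From MAILER-DAEMON" separator rather than a hand-picked sender and date.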
It could probably be written a lot more efficiently, but it was a job to make it run at all. The version of the libgmail library in MacPorts has a bug whereby it makes rather a mess of dealing with any email that isn’t pure ASCII. In fact, it crashes completely.
Enquiring about this in #python on freenode met with some very huffy developers who insisted that Python’s fetish for crashing at the sight of an “é” is completely rational behaviour. I disagreed, and spent an hour or so hacking at the library, trying to persuade it that the email source should be treated as binary data, rather than cast as an ASCII string. This did not go particularly well.
In a flash of desperation, I copied my script across to my Debian box, installed the Debian versions of python and libgmail, and found that it ran perfectly. Thanks, Debian guys. Whether they patched the upstream version of libgmail to behave in a sane manner, or did the same to Python itself, I don’t know. But it works.
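The underlying complaint can be reproduced without libgmail at all. The following is a small Python 3 sketch, not the library's actual code: decoding bytes as ASCII fails at the first byte above 0x7F, whereas treating the message source as binary data and never decoding it sidesteps the question entirely. The helper names ascii_problem and append_raw are made up for this example.

```python
def ascii_problem(data):
    """Return None if `data` is pure ASCII, else describe the first bad byte."""
    try:
        data.decode("ascii")
    except UnicodeDecodeError as err:
        # err.start is the offset of the first byte that is not ASCII
        return "byte 0x%02x at offset %d" % (data[err.start], err.start)
    return None


def append_raw(path, data):
    """Append message source to a file untouched, with no decoding at all."""
    with open(path, "ab") as f:  # binary mode: bytes in, bytes out
        f.write(data)
```

With these, ascii_problem("café".encode("utf-8")) reports the offending byte instead of crashing, and append_raw("wp.mbox", source_bytes) writes the "é" out exactly as it came in.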
w00t.
270 comments restored so far – more than half. A few duplicates, but it mostly seems to have worked.
I’m off down the pub now – I’ll finish the job off tomorrow!
And, we’re done:
197 Posts
5 Pages
12 Categories
360 Tags
425 Comments
425 Approved
0 Pending
0 Spam