Blogger pages parsing tools



Audience/Usefulness

This code is only useful if you use Blogger and have configured it to publish your blog to your own web server via FTP or SFTP.

I am providing two different tools. blogsnippet lets you give out a link to a single blog entry you wrote without giving access to your entire blog, whereas integrate_blog lets you build a special-interest page from some of your blog entries.
In both cases, the scripts take Blogger's output and generate new pages from it.

Also, this code is generally only useful if you know what you're doing. Minor hand tweaking may be required for your setup.

blogsnippet

Ok, an example is worth a thousand words, so this is the result.
The page that you see was autogenerated from the perma date by extracting the matching entry from my blog archives. The end result shows the post I wrote while stripping all information that might let you get back to the rest of my blog.
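
As a rough sketch of the idea (this is not the code from the tarball), the extraction could look like the following. It assumes each post in the archive file starts with an `<a name="PERMA"></a>` anchor and runs until the next such anchor; that layout, the function names `extract_entry` and `strip_blog_links`, and the `blog_base` parameter are all hypothetical, so adjust them to whatever your own Blogger template produces.

```python
import re


def extract_entry(archive_html, perma):
    """Return the HTML of the post whose anchor matches perma, or None.

    Assumption: posts are delimited by <a name="..."></a> anchors.
    """
    pattern = re.compile(
        r'<a name="%s"></a>(.*?)(?=<a name="|\Z)' % re.escape(perma),
        re.DOTALL,
    )
    match = pattern.search(archive_html)
    return match.group(1).strip() if match else None


def strip_blog_links(entry_html, blog_base):
    """Replace links that point under blog_base with their link text only."""
    link = re.compile(
        r'<a\s[^>]*href="%s[^"]*"[^>]*>(.*?)</a>' % re.escape(blog_base),
        re.DOTALL | re.IGNORECASE,
    )
    return link.sub(r"\1", entry_html)
```

A regular expression is good enough for a sketch like this, but it will break on unusual markup; a real HTML parser is the safer choice if your template is more complicated.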

You can look at the ancient version I wrote in Perl, mostly for chuckle value, as an example of how not to use Perl and regular expressions (but it was fast to write).
For real use, you probably want the version I rewrote in Python. I used it as an excuse to learn Python (it was my first real script), so you'll be able to make fun of my weak Python skills :) You really want to grab the tarball, though, so that you also get the required extract_blog_entry.py module.

Make sure to look for CHANGEME in the source and edit appropriately.

integrate_blog

integrate_blog serves a different purpose: it lets you update a page by picking portions of your blog that you don't mind sharing. For a good example, you can look at my flying page.
This code parses your blog archive files, finds the perma you gave (unix time, or the filename used for new blog archives), sanitizes that blog entry, and appends it to the target file you gave. For the insert/merge to work properly, you need a template file with proper markers (save to disk or view source).
Here's how you would use it:
integrate_blog.py flying-weekend-mammoth-flight index.html
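
To give an idea of what the merge step does, here is a minimal sketch, assuming the template marks the insertion area with a pair of HTML comments. The marker strings and the `append_entry` function are made up for illustration; the real markers are the ones in the template file mentioned above.

```python
# Hypothetical markers; the real template uses its own.
BEGIN_MARK = "<!-- BLOG ENTRIES BEGIN -->"
END_MARK = "<!-- BLOG ENTRIES END -->"


def append_entry(page_html, entry_html):
    """Insert entry_html just before END_MARK, after any entries already merged."""
    begin = page_html.find(BEGIN_MARK)
    end = page_html.find(END_MARK)
    if begin == -1 or end == -1 or end < begin:
        raise ValueError("template markers not found or out of order")
    return page_html[:end] + entry_html + "\n" + page_html[end:]
```

Inserting before the end marker means repeated runs keep entries in the order you added them, and a page without markers fails loudly instead of being silently mangled.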

The code is in integrate_blog.py, which also requires the extract_blog_entry.py module, so you probably want the tarball.

Make sure to look for CHANGEME in the source and edit appropriately.

