vsbabu.org:Joy of Python: Blogrolling

Simple Python script to get blogrolling rolling.

This is a small script aimed mainly at blog owners. Through Fozbaca’s site I learned about Blogrolling.com, where you can create an account and maintain a list of the blogs you want to watch. I registered promptly, but by the time I had typed in my third link I was tired of it, especially at the thought of having to put some of their code on my site too.

Why bother? Python has a really good HTTP library (urllib) that can be used to build pretty much the same functionality.
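
The core of the job is small: read a page's Last-Modified header and turn it into seconds since the epoch. Here is a minimal sketch of that step in present-day Python 3 (the script in this post is written for the Python 2 of 2002, where urllib.urlopen did the fetching). The names header_to_epoch and fetch_last_modified are made up for illustration, and sending a HEAD request is an assumption: some servers only answer GET properly.

```python
import urllib.request
from email.utils import parsedate_to_datetime


def header_to_epoch(value):
    """Convert an HTTP date string to seconds since the epoch, or None."""
    if not value:
        return None
    try:
        return parsedate_to_datetime(value).timestamp()
    except (TypeError, ValueError):
        # Unparseable date: treat it the same as a missing header.
        return None


def fetch_last_modified(url, timeout=10):
    """Return the Last-Modified time of url as epoch seconds, or None."""
    # Assumption: HEAD is enough, since only the headers are needed.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            # Header lookup is case-insensitive.
            return header_to_epoch(response.headers.get("Last-Modified"))
    except OSError:
        # URLError, HTTPError and socket timeouts are all OSError subclasses.
        return None


# The parsing step can be tried without touching the network.
print(header_to_epoch("Mon, 26 Aug 2002 17:40:00 GMT"))  # 1030383600.0
```

Keeping the parsing separate from the fetching makes the date handling easy to check on its own, and a broken link simply comes back as None.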


"""
This is a little script to help you do what www.blogrolling.com
can do. Define the blogs you are interested in, put the script
as a scheduled job or as a cgi and include the output in the appropriate
section of your blog template. (I wouldn’t recommend making this run
for every page load, since that will slow down your page quite a bit.)

You can see this in action on the front page of my blog at
http://vsbabu.org/mt/

S Babu:
joys of python: a series of small and often silly scripts in
python that do something while explaining some nice features of
python. Absolutely newbie material.

http://vsbabu.org/mt/archives/categories/python/

All small values of joy are marked by comments starting with #joy:
"""
#joy: no more separate comment block. Triple quoted starting comment
#     is python’s documentation string. You can access it as __doc__


import rfc822
import time
import urllib

class Blog:
    """ Blog: a simple class that stores the attributes of a typical blog
    """
    #joy:documentation strings for a class or a function or the whole script
    #    can be added within triple quotes. And these are available within the
    #    predefined variable __doc__

    def __init__(self, url, title=None, description=None, rdf=None, priority=50):
        """ constructor to create a new instance"""
        self.url = url
        self.title = title
        self.description = description
        #this could potentially be enhanced by using Mark Pilgrim’s rss
        #autodiscovery tool

        self.rdf = rdf
        self.priority = priority
        #I’m using rdf to check for modified date since some servers
        #utilizing SSI or PHP do not return a modified date

        self.modified_date = self.update_modified_date(self.rdf)
        
    def update_modified_date(self, url=None):
        """gets the modified time of the blog. One may optionally specify
           another url to check too. Returns None if the time is unknown."""
        #see time module’s documentation. I’m storing modified_date
        #as seconds since epoch

        if url is None:
            url = self.url
        self.modified_date = None
        try:
            s = urllib.urlopen(url)
        except IOError:
            #this is an indication of a broken link.
            #Exercise: handle this nicely!
            return None
        try:
            #header names are case-insensitive, so one lookup is enough.
            #getdate_tz() keeps the time zone, which plain getdate() drops
            dt = s.info().getdate_tz('Last-Modified')
        finally:
            s.close()
        if dt is not None:
            self.modified_date = rfc822.mktime_tz(dt)
        return self.modified_date

    def show_html_string(self, modified_since=3600, current_time=None):
        """returns a string that can be used as an HTML link.
        
        If the blog has been modified within the specified
        number of seconds, append a * to it too."""
        if current_time is None:
            current_time = time.time()
        s = """<a href="%s" title="%s">%s</a>""" % (self.url, self.description, self.title)
        #modified_date is None when the server did not report one
        if self.modified_date is not None and \
           current_time - self.modified_date < modified_since:
            s = s + "*"
        return s
        
#joy:I can mark the main part of the script by simply comparing
#    the predefined variable __name__ with '__main__'.
#    Exercise: try printing __name__ within a function.
#    Other scripts can reuse my class by just doing a
#    from thisfile import Blog
#
#    This feature can be used to define test routines for your
#    module.

if __name__ == '__main__':

    #define a list of blogs that I visit often

    blogroll = []

    #joy:python’s lists are really too good. you can append()
    #    to a list. Or insert(). Or remove(). Check the documentation
    #    for more functions available for lists.

    blogroll.append(Blog("http://www.fozbaca.org/",
                         "Fozbaca",
                         "This is fozbaca’s blog",
                         "http://fozbaca.org/rss.xml"))
    blogroll.append(Blog("http://pythonowns.blogspot.com/",
                         "Jarno Virtanen",
                         "Python Owns Us",
                         "http://db.cs.helsinki.fi/%7Ejajvirta/pythonowns.rss"))
    blogroll.append(Blog("http://www.brunningonline.net/simon/blog/",
                         "Simon Brunning",
                         "Small Values of Cool",
                         "http://www.brunningonline.net/simon/blog/index.xml"))
    blogroll.append(Blog("http://www.diveintomark.org/",
                         "Mark Pilgrim",
                         "Dive into everything!",
                         "http://www.diveintomark.org/xml/rss.xml"))
    blogroll.append(Blog("http://www.windley.com/",
                         "Phil Windley",
                         "CIO of Utah",
                         "http://www.windley.com/rss.xml"))
    blogroll.append(Blog("http://weblog.xlle.com/",
                         "Nairomia",
                         "Sachin Nair",
                         "http://weblog.xlle.com/index.rdf"))
    blogroll.append(Blog("http://weblog.infoworld.com/udell/",
                         "Jon Udell",
                         "Radio",
                         "http://weblog.infoworld.com/udell/rss.xml"))
    blogroll.append(Blog("http://radio.weblogs.com/0106123/",
                         "Jeffrey Shell",
                         "Industrie Toulouse",
                         "http://radio.weblogs.com/0106123/rss.xml"))

    #joy: sort a list very easily by providing our own comparison function.
    # lambda is a means to define inline functions. cmp() compares two
    # values (here, strings). I’m sorting my list by title

    blogroll.sort(lambda x,y: cmp(x.title, y.title))

    current_time = time.mktime(time.localtime())

    for blog in blogroll:
        print blog.show_html_string(3600, current_time), "<br/>"
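
For readers on Python 3, the last few lines need small changes: list.sort no longer accepts a comparison function, cmp() is gone, and print is a function. The sketch below shows one way the sorting and link rendering might look there. Entry is a hypothetical stand-in for the Blog class so that the example runs without network access, and escaping the values with html.escape is an addition that the original script does not make.

```python
import html
import time
from collections import namedtuple

# Hypothetical stand-in for Blog: just the fields the link needs.
Entry = namedtuple("Entry", "url title description modified_date")


def link_html(entry, modified_since=3600, current_time=None):
    """Return an HTML link, with '*' appended if the entry changed recently."""
    if current_time is None:
        current_time = time.time()
    link = '<a href="%s" title="%s">%s</a>' % (
        html.escape(entry.url, quote=True),
        html.escape(entry.description or "", quote=True),
        html.escape(entry.title or entry.url),
    )
    # modified_date is None when no usable Last-Modified header was found.
    if (entry.modified_date is not None
            and current_time - entry.modified_date < modified_since):
        link += "*"
    return link


entries = [
    Entry("http://www.windley.com/", "Phil Windley", "CIO of Utah", 1000.0),
    Entry("http://www.fozbaca.org/", "Fozbaca", "This is fozbaca's blog", None),
]

# A key function replaces sort(lambda x, y: cmp(x.title, y.title)).
entries.sort(key=lambda e: e.title)

for entry in entries:
    print(link_html(entry, 3600, current_time=2000.0), "<br/>")
```

Passing current_time in explicitly, as the original loop does, keeps every link in one run measured against the same clock reading.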

Posted: August 26, 2002 05:40 PM
python
