
Satheesh Babu
2001/02/19

How to use Zope, some Python, rsync and a shell script to make and maintain a website. Think of it as a $0 alternative to costly web content management software.

You should really use the scripts I have in the next article. This one was written when I was very busy and had to put something up quickly.

Philosophy

Face it, most sites are static as far as a visitor is concerned. The consultants you hired might tell you the benefits of dynamically generated pages over static ones. The point is you don't need to dynamically generate pages for each visitor. Generate each page once, and store it somewhere! It makes the site a lot faster. And it is independent of the database. Burn the site onto CDs. Whatever!

Implementation plan

So, you keep your data in a database (Zope's ZODB). You make your pages from the database using a scripting language (Zope's DTML). A Python script (slurp.py) saves the pages into a staging area (zsoft) in your file system. Then you synchronize the staging area, using rsync, with your space (www) on your ISP's server (vsbabu.csoft.net).

I've followed a similar plan to make a publishing tool for my client. There the database is an Oracle 8i one. It could run very well from ZODB; however, I did not want to tie the client's data to ZODB and Zope. Because the data is in Oracle, it is quite easy to reimplement what Zope DTML does in ASP or PHP.

Questions

1. If my DTML Methods have changed, how do I force the download of the files that use them?
Most of the time I don't change the methods. If I do end up changing headers or footers, I download all the files once.
2. How about visually editing documents?
You can try the IEMethod product, which is a DTML Document modified with an IE DHTML Control editor. It is quite easy to set up and customize. However, it works only if you use IE 4.0 or above on Windows to access Zope.
3. If I delete a document, does the script delete it from the staging area?
No. It is meant to manage a personal site. It is a simple system. (Add more excuses here, till I get time to add a simple mechanism to do this!) Right now, my site has around 360 objects and I rarely delete stuff. When I do, I manually delete it from the staging area too.
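
For question 3, one possible mechanism is sketched below. It is not part of the scripts described here; it assumes Python 3, and it assumes you can obtain a complete list of live URLs from Zope (for example a URL list covering all documents). The helper names `url_to_relpath` and `find_stale_files` are hypothetical. The sketch only reports staged files that no live URL maps to; deleting them is left to you.

```python
import os
from urllib.parse import urlsplit


def url_to_relpath(url):
    """Map a URL to a path relative to the staging area.

    Assumption: the staged file mirrors the URL path, and the query
    string is ignored.
    """
    path = urlsplit(url).path.lstrip("/")
    return os.path.normpath(path) if path else "index.html"


def find_stale_files(staging_dir, live_urls):
    """Return staged files (relative paths) that no live URL maps to."""
    expected = {url_to_relpath(url) for url in live_urls}
    stale = []
    for root, _dirs, files in os.walk(staging_dir):
        for name in files:
            rel = os.path.relpath(os.path.join(root, name), staging_dir)
            if rel not in expected:
                stale.append(rel)
    return sorted(stale)
```

Files that live only in the staging area, such as a counter file or logs, will show up in the report too, so review the list before removing anything.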

Assumptions

  1. My Zope installation is on a local machine called ctx on port 8080. So, I call this http://ctx:8080/. You need to change this in the scripts to your own installation location.
  2. I run this on a Linux machine. I also use SSH to connect to my ISP.
  3. My staging area is a directory called zsoft. I keep the scripts in the parent directory of zsoft.

Components

  1. Content Management using Zope
  2. Setting a Zope catalog
  3. A Python script to download content to a staging area
  4. RSYNC to upload content to your hosting provider
  5. A shell script to glue it all!

Content Management using Zope

First, go to www.zope.org and download Zope for your platform. Install it. Read the tutorial. For the impatient, here's the menu.

I could write out here everything I do to my site. However, playing with Zope yourself is a better approach, because you have a hell of a lot more freedom.

Setting a Zope catalog

At the root level of your site, add a ZCatalog object. I've called mine downLoadables. A catalog makes it very easy to search your documents. However, we are making a static site here, so we are going to use it to figure out which documents changed in the last X days.

Inside the downLoadables catalog, add a DTML Method called updateCatalog. Add the following code to it.

<html>
<head>
<title>Clear and add contents to catalog</title>
</head>
<body>
<dtml-with "PARENTS[0]">
<dtml-call "manage_catalogClear(REQUEST,RESPONSE,BASE0)">
<dtml-call "manage_catalogFoundItems(REQUEST,RESPONSE,BASE0,BASE0,
['DTML Document','Image','File','Knowledge Document'])">
<dtml-comment>
    Ideally it should be URL1 or 2. However, this will need
  the catalog manage screens to be opened up. So, just
  give some url - I've set it to the root of my Zope; and 
  ignore the error.
    In my situation I need to index only the DTML Docs,
  images, files and Knowledge Documents (my own document
  ZClass). I don't care for DTML Methods, because any
  method I use is embedded in the DTML Docs I want to
  publish. Here's an example.
   <dtml-call "manage_catalogClear(REQUEST,RESPONSE,URL1)">
   <dtml-call "manage_catalogFoundItems(REQUEST,RESPONSE,URL2,
   URL1, ['DTML Document','Image','File','Knowledge Document'])">
</dtml-comment>
</dtml-with>
</body>
</html>

See the comments for some notes. Specifically, Knowledge Document is my own ZClass type.

Click on the Proxy tab and choose the Manager role. This gives any Tom, Dick or Harry the power to execute this DTML Method without logging in to Zope, by going to http://ctx:8080/downLoadables/updateCatalog, which in turn clears and rebuilds the catalog. Giving proxy roles is a *bad* thing, but we started this as a *local* content management system!

A Python script to download content to a staging area

Please download the Python script first. I'll assume that you saved it as slurp.py, in the parent directory of the staging area. I've verified that it works under Windows and Linux, with Python 1.5.2, ActivePython 2.0 and even the stock Python that comes with Zope.
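
If you just want a feel for what such a downloader does, here is a minimal sketch. It is not the actual slurp.py: it assumes Python 3 and its standard library, and the names `local_path_for`, `fetch` and `read_url_list` are made up for illustration. It saves each URL under the current directory at a path mirroring the URL path, and accepts `-i FILE` to read one URL per line, matching how the script is invoked later in this article.

```python
import os
import sys
from urllib.parse import urlsplit
from urllib.request import urlopen


def local_path_for(url):
    """Mirror the URL path under the current directory (query string dropped)."""
    path = urlsplit(url).path.lstrip("/")
    if not path or path.endswith("/"):
        path += "index.html"  # assumption: folder URLs are saved as index.html
    return os.path.join(*path.split("/"))


def fetch(url, opener=urlopen):
    """Download one URL and write it to its mirrored local path."""
    target = local_path_for(url)
    parent = os.path.dirname(target)
    if parent:
        os.makedirs(parent, exist_ok=True)
    with opener(url) as response:
        data = response.read()
    with open(target, "wb") as out:
        out.write(data)
    return target


def read_url_list(filename):
    """Read one URL per line, skipping blank lines."""
    with open(filename) as handle:
        return [line.strip() for line in handle if line.strip()]


def main(argv):
    if len(argv) == 2 and argv[0] == "-i":
        urls = read_url_list(argv[1])
    else:
        urls = argv
    for url in urls:
        print("saved", fetch(url))


if __name__ == "__main__":
    main(sys.argv[1:])
```

Run from inside the staging area, a call with `http://ctx:8080/URLFILE.txt?since_when=1` would save `URLFILE.txt` in the current directory under these assumptions.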

RSYNC to upload content to your hosting provider

Now, go read my other techbit about using RSYNC to synchronize the staging area with my provider's server. As far as I know, RSYNC is available for many UNIX/Linux systems. I'm not aware of an RSYNC for Windows. You can get equivalent tools like Repliweb and SureSync. Or, if you are not worried about security, you can use FTP mirroring. See my Python script for that.

A shell script to glue it all!

Finally, make a shell script - I call mine dnld.sh - with the code given below. If you have only Windows machines, you can write an equivalent batch file.

# rebuild catalog
# Get the URLs that have changed. If standard headers have changed,
# you'll need to run this manually for since_when=100 or so.
cd zsoft || exit 1
# update the catalog
python ../slurp.py http://ctx:8080/downLoadables/updateCatalog
rm -f downLoadables/updateCatalog
rmdir downLoadables
#get the list of files that were updated in the last day
#(quoted, so the shell does not treat the ? as a wildcard)
python ../slurp.py "http://ctx:8080/URLFILE.txt?since_when=1"
# I'll keep the file to see which files were uploaded
# Use that file list to do further downloading
python ../slurp.py -i URLFILE.txt
#upload using rsync
#This is one command; the trailing backslashes continue it across lines
rsync --verbose --progress --stats --compress --rsh=/usr/bin/ssh \
  --recursive --times --perms --links --delete \
  --exclude "counter.txt" \
  --exclude "*.log" \
  --exclude "URLFILE.txt" * vsbabu.csoft.net:www/
######
#if you are adventurous, run htmltidy! (before rsync)
#find . -name '*.html' -exec tidy -config ~/tidy.cfg {} \;
#find . -name '*.php3' -exec tidy -config ~/tidy.cfg {} \;
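
If you would rather not write a batch file, the same sequence can be driven from Python. The following is only a sketch under assumptions: Python 3.5 or later, slurp.py in the parent directory of the staging area, and rsync and ssh installed at the paths used above. The function names are hypothetical. The commands are built as argument lists, so nothing here depends on a shell.

```python
import glob
import os
import subprocess
import sys

ZOPE = "http://ctx:8080"          # the article's local Zope; change to yours
REMOTE = "vsbabu.csoft.net:www/"  # the article's rsync destination
EXCLUDES = ["counter.txt", "*.log", "URLFILE.txt"]


def slurp_commands(zope=ZOPE, since_when=1):
    """The three slurp.py calls from dnld.sh, in order."""
    slurp = [sys.executable, os.path.join("..", "slurp.py")]
    return [
        slurp + [zope + "/downLoadables/updateCatalog"],
        slurp + ["%s/URLFILE.txt?since_when=%d" % (zope, since_when)],
        slurp + ["-i", "URLFILE.txt"],
    ]


def rsync_command(sources, remote=REMOTE, excludes=EXCLUDES):
    """The rsync call from dnld.sh as an argument list."""
    cmd = ["rsync", "--verbose", "--progress", "--stats", "--compress",
           "--rsh=/usr/bin/ssh", "--recursive", "--times", "--perms",
           "--links", "--delete"]
    for pattern in excludes:
        cmd += ["--exclude", pattern]
    return cmd + list(sources) + [remote]


def publish(staging="zsoft", since_when=1):
    """Run the whole sequence from inside the staging directory."""
    os.chdir(staging)
    update, changed, download = slurp_commands(since_when=since_when)
    subprocess.run(update, check=True)
    # dnld.sh removes the page saved by the catalog update; do the same.
    leftover = os.path.join("downLoadables", "updateCatalog")
    if os.path.exists(leftover):
        os.remove(leftover)
        try:
            os.rmdir("downLoadables")
        except OSError:
            pass  # directory not empty; the shell script carries on too
    subprocess.run(changed, check=True)
    subprocess.run(download, check=True)
    # No shell expands "*" here, so expand it ourselves.
    subprocess.run(rsync_command(sorted(glob.glob("*"))), check=True)
```

Calling `publish()` from the directory that holds slurp.py and zsoft would run the steps in the same order as dnld.sh; with `check=True`, a failing step stops the run before rsync is reached.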

References

  1. www.zope.org
  2. www.python.org
  3. rsync.samba.org
  4. How-To: Mirroring Zope with WGET
  5. How-To: static DTML documents from database records