Headlines in RSS and Perl
Eyeballs are wonderfully important commodities to a website. Once they're at your site, you can treat them with the reverence reserved for royalty (what an awesome amount of alliteration, eh?). One way of getting eyeballs to your site is by allowing your headlines to be placed on other sites or portals. The easiest, and most popular, way to do this is through RSS, a syndication format originally implemented by Netscape, and then collectively improved upon by Dave Winer and Netscape.

With Perl and some easily downloadable modules from CPAN, you can quickly implement your own headlines in RSS.

What You'll Need
- Perl 5 or above.
- XML::RSS (available from cpan.org)
- The ability to run CGI scripts.
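
Before going further, it can help to confirm that XML::RSS is actually visible to the Perl you plan to run. The following is only a minimal sketch: module_version is a hypothetical helper name invented for this illustration, and 'strict' is checked alongside XML::RSS simply because it ships with Perl and makes a handy control.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from XML::RSS): returns the module's version
# if it can be loaded, or undef if it cannot.
sub module_version {
    my ($module) = @_;

    # Only accept plain module names such as XML::RSS.
    return unless $module =~ /\A\w+(?:::\w+)*\z/;

    # Turn XML::RSS into XML/RSS.pm, the form require expects for a path.
    (my $file = "$module.pm") =~ s{::}{/}g;
    return unless eval { require $file; 1 };

    my $version = $module->VERSION;
    return defined $version ? $version : 'unknown';
}

for my $module (qw(strict XML::RSS)) {
    my $version = module_version($module);
    print defined $version
        ? "$module is installed (version $version)\n"
        : "$module is missing; install it from CPAN\n";
}
```

If XML::RSS is reported as missing, install it from CPAN before trying the script in the next section.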
Creating Your Own RSS File
Below, we'll go through the simple code needed to create your own headlines in RSS. I'm going to assume that you already have an HTML document which contains your headlines for viewing through a browser. Using the fun text manipulation available in Perl, we'll parse through that HTML file and place the headlines into an RSS data structure. Finally, we'll save the file.

#!/usr/bin/perl -wT

# Here, we're calling two modules or additions to Perl. The first one we
# installed on the server specifically for this occasion, but 'use
# strict' is a helpful little addition if you want to write good code.
# It catches a lot of the human idiocy that plagues horrible programs.
use XML::RSS;
use strict;

# create some variables near the top of the script so we don't
# have to search through the script later to change them.
my $headlines_file = "/path/to/headlines.html";
my $rss_output     = "/path/to/headlines.rss";

# we create a new rss object.
my $rss = XML::RSS->new(version => '0.9');

# these are just little descriptive elements of the RSS channel we're
# about to make. we add them to the $rss object we made above.
$rss->channel(
    title       => "Your Site Name",
    link        => "http://www.yoursitename.com/",
    description => "Headlines from Your Site Name!",
);

# we open the file with our headlines in it.
open(my $fh, '<', $headlines_file)
    or die "Couldn't open $headlines_file: $!";

# a traditional while loop. for each line of our $headlines_file
# we start looking for markers within the code that tell us a
# headline is coming. in this example, we're assuming that headlines
# follow html like <h2><a href="link">headline</a></h2>.
while (<$fh>) {

    # look in the current line to see if our headline is in there.
    # if it is, then we grab the link and headline. [^"]* and .*? stop
    # at the first closing quote and the first </a>, so other markup
    # on the same line isn't swallowed into the match.
    if (my ($link, $headline) = /<h2><a href="([^"]*)">(.*?)<\/a><\/h2>/i) {

        # if both $link and $headline were created, then we
        # add them to the $rss object using the add_item feature
        # of XML::RSS.
        if ($link and $headline) {
            $rss->add_item(title => $headline, link => $link);
        }

        # move on to our next line.
        next;
    }
}

# if we're this far, we've finished our file.
close($fh);

# finally, save the rss file.
$rss->save($rss_output);

# and quit.
exit;
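
Since the headline-matching pattern is the part you're most likely to customize, it can be handy to try it on its own before wiring it into the full script. Here's a minimal sketch that needs no modules at all: extract_headlines is a hypothetical helper name, and the sample lines are made up for illustration. It uses a slightly stricter pattern than a plain .* so that a match stops at the first closing quote and the first </a>.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of XML::RSS): pull [link, headline]
# pairs out of a list of HTML lines.
sub extract_headlines {
    my @lines = @_;
    my @found;
    for my $line (@lines) {
        # [^"]* stops at the first closing quote and .*? at the first
        # </a>, so extra markup on the same line is not swallowed.
        if ($line =~ m{<h2><a href="([^"]*)">(.*?)</a></h2>}i) {
            my ($link, $headline) = ($1, $2);
            push @found, [$link, $headline]
                if length $link and length $headline;
        }
    }
    return @found;
}

# Made-up sample input, assuming the <h2><a href="...">...</a></h2> layout.
my @sample = (
    '<h1>Not a headline</h1>',
    '<h2><a href="the_output.html">The Output</a></h2>',
    '<p>Some body text.</p>',
    '<H2><A HREF="conclusion.html">Conclusion</A></H2>',
);

for my $pair (extract_headlines(@sample)) {
    my ($link, $headline) = @$pair;
    print "$headline -> $link\n";
}
```

Once the pattern picks up the right lines from your own HTML, the same regular expression can be dropped into the while loop of the main script.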
The Output
Running this script on this document (using some dummy links to fake some pages) created the following RSS file:

<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://my.netscape.com/rdf/simple/0.9/">

<channel>
<title>Your Site Name</title>
<link>http://www.yoursitename.com/</link>
<description>Headlines from Your Site Name!</description>
</channel>

<item>
<title>What You'll Need</title>
<link>what_you'll_need.html</link>
</item>

<item>
<title>Creating Your Own RSS File</title>
<link>creating_your_own_rss.html</link>
</item>

<item>
<title>The Output</title>
<link>the_output.html</link>
</item>

<item>
<title>Conclusion</title>
<link>conclusion.html</link>
</item>

</rdf:RDF>
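
One way to check a generated file is to read it back in. The sketch below assumes XML::RSS is installed and uses its parse method on a cut-down copy of the output above, held in a string so the example stands alone; for a file on disk, parsefile is the equivalent call. Treat it as an illustration rather than the definitive way to consume a feed.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use XML::RSS;

# A cut-down copy of the RSS 0.9 output shown above.
my $xml = <<'END_RSS';
<?xml version="1.0"?>
<rdf:RDF
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns="http://my.netscape.com/rdf/simple/0.9/">
<channel>
<title>Your Site Name</title>
<link>http://www.yoursitename.com/</link>
<description>Headlines from Your Site Name!</description>
</channel>
<item>
<title>The Output</title>
<link>the_output.html</link>
</item>
</rdf:RDF>
END_RSS

# Parse the string into a fresh XML::RSS object.
my $rss = XML::RSS->new;
$rss->parse($xml);

# The channel fields and items are exposed as plain hash data.
print "Channel: $rss->{'channel'}->{'title'}\n";
for my $item (@{ $rss->{'items'} }) {
    print "$item->{'title'} -> $item->{'link'}\n";
}
```

If the titles and links printed here match what's in your HTML, the headline pattern in the main script is doing its job.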
Conclusion
That's all there is to it, really. After customizing the headline / link searching code above to your own needs, it's a simple matter of running this script whenever your headlines are updated (some people like setting up an hourly cron).

The above explanation is really a more verbose version of what's already contained within the XML::RSS documentation.

If you're looking to output a different version of RSS or read in other people's headlines, I heartily suggest examination of the documentation