Headlines In RSS And Perl

Morbus Iff

Eyeballs are wonderfully important commodities to a website. Once they're at your site, you can treat them with the reverence reserved for royalty (what an awesome amount of alliteration, eh?). One way of getting eyeballs to your site is by allowing your headlines to be placed on other sites or portals. The easiest, and most popular, way to do this is through RSS, a syndication format originally implemented by Netscape, and then collectively improved upon by Dave Winer and Netscape.

With Perl and some easily downloadable modules from CPAN, you can quickly implement your own headlines in RSS.

What You'll Need

  • Perl 5 or above.
  • XML::RSS (available from cpan.org)
  • The ability to run CGI scripts.
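
Before going any further, it may be worth confirming that XML::RSS is actually installed where your script will run. The following is a minimal sketch using only core Perl; the helper name module_status is made up for this illustration and is not part of any module.

```perl
#!/usr/bin/perl -w
use strict;

# module_status is a hypothetical helper: it tries to load a module by
# name and reports whether Perl could find it.
sub module_status {
    my ($module) = @_;

    # turn a module name into the file name require expects,
    # e.g. XML::RSS becomes XML/RSS.pm.
    (my $file = "$module.pm") =~ s{::}{/}g;

    if (eval { require $file; 1 }) {
        my $version = $module->VERSION;
        return "$module is installed"
             . (defined $version ? " (version $version)" : "");
    }
    return "$module is missing; install it from CPAN";
}

print module_status("XML::RSS"), "\n";
```

If the last line reports the module as missing, install it from CPAN before trying the script in the next section.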

Creating Your Own RSS File

Below, we'll go through the simple code needed to create your own headlines in RSS. I'm going to assume that you already have an HTML document which contains your headlines for viewing through a browser. Using the fun text manipulation available in Perl, we'll parse through that HTML file and place the headlines into an RSS data structure. Finally, we'll save the file.

#!/usr/bin/perl -wT

# Here, we're loading two modules, or additions to Perl. The first one we
# installed on the server specifically for this occasion, but 'use
# strict' is a helpful little addition if you want to write good code.
# It catches a lot of the human idiocy that plagues horrible programs.
use XML::RSS;
use strict;

# create some variables near the top of the script so we don't
# have to search through the script later to change them.
my $headlines_file = "/path/to/headlines.html";
my $rss_output     = "/path/to/headlines.rss";

# we create a new rss object.
my $rss = XML::RSS->new(version => '0.9');

# these are just little descriptive elements of the RSS channel we're
# about to make. we add them to the $rss object we made above.
$rss->channel(
    title       => "Your Site Name",
    link        => "http://www.yoursitename.com/",
    description => "Headlines from Your Site Name!",
);

# we open the file with our headlines in it.
open(FILE, '<', $headlines_file) or die "Couldn't open $headlines_file: $!";

# a traditional while loop. for each line of our $headlines_file
# we start looking for markers within the code that tell us a
# headline is coming. in this example, we're assuming that headlines
# follow html like <h2><a href="link">headline</a></h2>.
while (<FILE>) {

    # look in the current line to see if our headline is in there.
    # the /i makes this test case-insensitive, just like the capture below.
    if (/<h2><a href="[^"]*">.*?<\/a><\/h2>/i) {

        # if it is, then we grab the link and headline. the captures
        # stop at the first closing quote and the first </a>, so a line
        # holding two headlines doesn't get swallowed as one.
        my ($link, $headline) = /<h2><a href="([^"]*)">(.*?)<\/a><\/h2>/i;

        # if both $link and $headline were created, then we
        # add them to the $rss object using the add_item feature
        # of XML::RSS.
        if ($link and $headline) {
            $rss->add_item(title => $headline, link => $link);
        }

        # move on to our next line.
        next;
    }
}

# if we're this far, we've finished our file.
close(FILE);

# finally, save the rss file.
$rss->save($rss_output);

# and quit.
exit;
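
If you'd like to try the headline matching on its own before pointing the script at a real file, something like the following may help. It is only a sketch: extract_headline is a hypothetical helper, the sample lines are invented, and the pattern is a variant of the one in the script above written with non-greedy captures and the /i flag. It needs nothing beyond core Perl.

```perl
#!/usr/bin/perl -w
use strict;

# extract_headline is a hypothetical helper that looks for
# <h2><a href="link">headline</a></h2> in a single line of HTML.
# it returns (link, headline) on a match and nothing otherwise.
sub extract_headline {
    my ($line) = @_;
    if ($line =~ /<h2><a href="([^"]*)">(.*?)<\/a><\/h2>/i) {
        return ($1, $2);
    }
    return;
}

# sample lines invented for this illustration.
my @lines = (
    '<h2><a href="the_output.html">The Output</a></h2>',
    '<H2><A HREF="conclusion.html">Conclusion</A></H2>',
    '<p>Just a paragraph, not a headline.</p>',
);

for my $line (@lines) {
    my ($link, $headline) = extract_headline($line);
    if (defined $link) {
        print "$headline -> $link\n";
    }
    else {
        print "no headline found\n";
    }
}
```

Swapping your own markup into the pattern here is a cheap way to check your customizations before they go anywhere near the RSS file.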

The Output

Running this script on this document (using some dummy links to fake some pages) created the following RSS file:

<?xml version="1.0"?>
<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns="http://my.netscape.com/rdf/simple/0.9/">

  <channel>
    <title>Your Site Name</title>
    <link>http://www.yoursitename.com/</link>
    <description>Headlines from Your Site Name!</description>
  </channel>

  <item>
    <title>What You'll Need</title>
    <link>what_you'll_need.html</link>
  </item>

  <item>
    <title>Creating Your Own RSS File</title>
    <link>creating_your_own_rss.html</link>
  </item>

  <item>
    <title>The Output</title>
    <link>the_output.html</link>
  </item>

  <item>
    <title>Conclusion</title>
    <link>conclusion.html</link>
  </item>

</rdf:RDF>

Conclusion

That's all there is to it, really. After customizing the headline and link searching code above to your own needs, it's a simple matter of running this script whenever your headlines are updated (some people like setting up an hourly cron job).

The above explanation is really a more verbose version of what's already contained within the XML::RSS documentation.

If you're looking to output a different version of RSS or read in other people's headlines, I heartily suggest examining the XML::RSS documentation.
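
As a taste of the reading side, here is a minimal sketch that parses a small RSS 0.9 document and prints its headlines. It assumes XML::RSS is installed, the feed text is invented for this illustration, and the XML::RSS documentation remains the authority on the details (it also describes parsefile for reading a saved copy, and the other values the version option accepts).

```perl
#!/usr/bin/perl -w
use strict;
use XML::RSS;

# a tiny RSS 0.9 document, invented for this illustration. in practice
# this would be the contents of someone else's feed.
my $feed = <<'END_RSS';
<?xml version="1.0"?>
<rdf:RDF
 xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
 xmlns="http://my.netscape.com/rdf/simple/0.9/">
<channel>
<title>Your Site Name</title>
<link>http://www.yoursitename.com/</link>
<description>Headlines from Your Site Name!</description>
</channel>
<item>
<title>Conclusion</title>
<link>conclusion.html</link>
</item>
</rdf:RDF>
END_RSS

# parse the feed into an rss object.
my $rss = XML::RSS->new;
$rss->parse($feed);

# the channel details and items are then available as plain
# hash and array data inside the object.
print "Channel: $rss->{channel}{title}\n";
foreach my $item (@{ $rss->{items} }) {
    print "$item->{title} -> $item->{link}\n";
}
```

From there, turning each item into a line of HTML for your own page is a matter of changing the print statement.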

Evolt.org is an all-volunteer resource for web developers made up of a discussion list, a browser archive, and member-submitted articles. This article is the property of its author; please do not redistribute or use elsewhere without checking with the author.