Since the first suit told the first web designer 'Update our site faster,' content management has been vital for professional sites. Why does it matter? What do you need to be doing about it?
It started small, like so many things. I was developing a wee site for a company, and suggested a press releases page. "Sure," they said, "but how do we get press releases up there?" I really didn't want to be a typist for them, adding each release as it came along. So I put together a simple CGI script which took in the content of a form - with fields for headline, summary and body - wrote a new file for the release, and added the headline and summary to the main releases page with the date. And because you don't want any Tom, Dick or Harry adding releases, I added the most basic security there is: a password field which had to match the password hard-coded into the script.
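For readers who want to picture that script, here is a minimal sketch of the same idea in Python. It is not the original CGI: the function name `publish_release`, the `release-*.html` filename scheme, the `index.html` page and the `PASSWORD` value are all assumptions made for illustration.

```python
import hmac
import html
from datetime import date
from pathlib import Path

# Hypothetical value: the original script's password and file layout are not known.
PASSWORD = "change-me"


def publish_release(site_dir, password, headline, summary, body, today=None):
    """Write one release to its own file and add it to the releases index page."""
    # Compare against the hard-coded password in constant time.
    if not hmac.compare_digest(password.encode(), PASSWORD.encode()):
        raise PermissionError("wrong password")

    site = Path(site_dir)
    site.mkdir(parents=True, exist_ok=True)
    today = today or date.today()

    # Auto-generated filename: the date plus a running count of existing releases.
    count = len(list(site.glob("release-*.html"))) + 1
    filename = f"release-{today.isoformat()}-{count:03d}.html"

    # Escape everything typed into the form before it becomes HTML.
    page = (
        f"<h1>{html.escape(headline)}</h1>\n"
        f"<p><em>{html.escape(summary)}</em></p>\n"
        f"<p>{html.escape(body)}</p>\n"
    )
    (site / filename).write_text(page, encoding="utf-8")

    # Append the dated headline and summary to the main releases page.
    entry = (
        f'<p>{today.isoformat()}: <a href="{filename}">{html.escape(headline)}</a>'
        f" - {html.escape(summary)}</p>\n"
    )
    with open(site / "index.html", "a", encoding="utf-8") as index:
        index.write(entry)
    return filename
```

Even in this small form you can see the weak spots: one shared password, no approval step, and a directory that fills up with generated filenames.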
This scored me a fair lump of cash (to my very, very small freelance bank account), and made me a guru to the client. Not bad for half a day's work. Even better, I now had a working system I could apply to any CGI based site with a minimum of work, but for the same fee. So even at this basic level, a CMS can make your life less boring and better paid.
But there were significant problems with this system which will become evident:
- There was no system for approval of new content, just a one stop shop
- It only applied to a small section of the site (I never did get the call "Can you make the whole site like this", but worrying about it kept me awake nights for a long, long time)
- Security was, erm, less than ideal
- While writing static files is good for performance, filling your filesystem with auto-generated filenames is never going to make life easy for you
- It didn't do images. At all.
Spin forward a couple of years
I was working for a large corporate, with an extensive intranet. We'd been a bit sensible about this. Instead of having a central team dedicated to updating every department's site, we'd devolved the content production to each department. It's their content; they understand it better than us, right?
Being a corporate, we'd had FrontPage imposed on us. In this environment, it almost made sense - the FP extensions on the server and clear publishing path from development to production made it harder for the business units to screw up. But not impossible. We'd provided full FP training for anyone involved in the process, and given very, very clear instructions on design guidelines (see Palyne's excellent article about why this matters). However, there were still some users who insisted on using magenta text, adding useless hit counters and so on. Worse, there was nothing to prevent any of the following gotchas:
- Content was rarely reviewed by many departments, and just sat there even when it was no longer true
- Anyone in the department with rights to publish material could do so unsupervised - there was at least one incident of defamatory content hidden away
- If Alice went to publish her content, Bob's content could be half-finished, but still went live (at least we separated publishing by department, but it was still very difficult)
- There was no way to prepare content in advance and schedule its launch automatically, which made it harder to spread work into an even flow; work had to be done the day before its launch, whatever the other workloads.
- Each department still had staff who wouldn't dirty their hands with publishing material, instead relying on the admin staff who didn't necessarily understand (or even care) enough about it to get it right.
- As with all static HTML sites, content was hardwired to presentation, particularly for navigation. The weekend I worked 22 hours to merge two sections because I had to hand-edit the navigation in every single damn page is unlikely to ever leave my memory.
So it clearly wasn't good enough. We went out into the market to see if there was any off-the-shelf product which would resolve these issues. But there wasn't (or if there was, it didn't work sufficiently robustly on our NT/ASP infrastructure).
Spin forward again
When the same client's senior management finally woke up to the fact that their site - while up to date in content - looked a bit tired (it had been designed 2 years before), we took the opportunity to save the client large sums of cash. Rather than every business area of the client (maybe 30 of them) faxing or emailing changes to a central web team to put in the publish queue, then have it sent back for checking, each business manager would be able to make their own changes, and pass them through their own internal signoffs automatically. Here were some of the key requirements:
- Content must be able to be edited without knowing any HTML, using a standard browser interface
- Meta-content such as launch and expiry dates, content owner etc must be captured
- Changes must go through an automatic signoff procedure - the system should notify each person in the workflow by email. This applies to changes in meta-content as well as to the content itself.
- Changes must only go live at launch date, and content owners must be notified of upcoming expiries
- Changes to site structure must automatically update the navigation structure
- Images must be uploadable
- Each change must be independent of all other changes.
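The launch and expiry requirements in particular are easy to picture in code. What follows is a rough sketch only, assuming a hypothetical `ContentItem` record and a 14-day warning window; neither comes from the project described here.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class ContentItem:
    title: str
    owner_email: str
    launch: date
    expiry: date
    approved: bool = False

    def is_live(self, today):
        # Live only once signed off, from the launch date up to (not including) expiry.
        return self.approved and self.launch <= today < self.expiry


def expiring_soon(items, today, warn_days=14):
    """Return approved items whose expiry falls within the next warn_days days."""
    cutoff = today + timedelta(days=warn_days)
    return [item for item in items if item.approved and today <= item.expiry <= cutoff]
```

A nightly job could call `expiring_soon` and email each `owner_email`, while the page templates only ever show items for which `is_live` is true.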
We quickly realised that none of this was going to happen without a database & template based site, otherwise the content would be too hard-wired to the presentation. As the site really did need solid reliability, we weren't going to mess about with toy databases; we chose Oracle. We also needed a product to do all of this - writing one from scratch was a non-starter as we wanted it to be good, affordable and delivered quickly. The choices were narrowed down to Allaire Spectra and Interwoven TeamSite. Both would do the job, but TeamSite was thought to be a better strategic choice as our Tech Strategy people wanted greater flexibility of application server (Spectra needs ColdFusion). See the review section below for more on these and other choices.
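To show what separating content from presentation buys you, here is a small sketch in which the navigation is generated from the site structure at render time rather than typed into every page. The `SECTIONS` list stands in for rows from the database, and the `PAGE` template and function names are invented for the example; none of this is TeamSite or Spectra code.

```python
from html import escape
from string import Template

# Hypothetical site structure; in a real system this would be read from the database.
SECTIONS = [
    {"slug": "news", "title": "News"},
    {"slug": "products", "title": "Products & Services"},
]

# One template shared by every page: change it once and the whole site changes.
PAGE = Template("<html><body>\n<ul>\n$nav</ul>\n<div>$content</div>\n</body></html>")


def render_nav(sections, current_slug):
    """Build the navigation list from the structure, marking the current section."""
    lines = []
    for section in sections:
        cls = ' class="current"' if section["slug"] == current_slug else ""
        lines.append(
            f'<li{cls}><a href="/{section["slug"]}/">{escape(section["title"])}</a></li>\n'
        )
    return "".join(lines)


def render_page(sections, current_slug, content_html):
    """Combine the generated navigation and the stored content into one page."""
    return PAGE.substitute(nav=render_nav(sections, current_slug), content=content_html)
```

Merging two sections then means changing the structure data once, not hand-editing the navigation on every page.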
Defining content and workflows
Once you've developed your business requirements and chosen your product, you'll need to start thinking about what content you have, and how it gets to the site. Remember that ideally, every single type of content will need defining. This will include
- all your images, both standard and individual (like banner ads), plus their alt texts
- your navigation
- any legal text - particularly if it relates to specific information on the site
- calculators or other reusable pieces of scripting - where do calculators derive their data from?
- and don't forget the main content which users are there for
and so on. Next, you'll have to work out how these all get to the live site. You may need a workflow for every single one of them, although in practice you'll produce a few which get used for multiple content objects. If you have a site contributed to by several disparate areas, each may require its own workflow.
Essentially, a workflow has three components:
- Someone initiates it. They decide which workflow to use, and allocate the tasks to each person on it, and note what changes are to be made (eg please change 'foo' on this page to 'bar')
- Someone actually makes the change - writes the text and puts it into the system, or produces and uploads the image
- Someone (or several layers of people) approve the change.
What you need to do in defining a workflow is to specify for each stage's task:
- What gets done here
- Who can perform it (can be 'any one of these people', or 'two of the following in parallel')
- What is required to complete the task
- What happens when it's completed (default is proceeds to next)
- What happens if it fails.
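Those five questions map fairly naturally onto a small record per stage. The sketch below is one possible shape, not the format any particular product uses; the `Stage` fields and the `stage_complete` helper are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Stage:
    task: str                       # what gets done here
    performers: frozenset           # who can perform it
    required: int = 1               # how many of them must sign off to complete it
    on_pass: Optional[int] = None   # stage to go to on success; None means the next one
    on_fail: Optional[int] = None   # stage to return to on failure; None means stay put


def stage_complete(stage, signed_off):
    """True once enough distinct eligible people have signed off the stage."""
    eligible = set(signed_off) & stage.performers
    return len(eligible) >= stage.required
```

With `required=1` this gives "any one of these people"; with `required=2` it gives "two of the following in parallel".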
So a typical workflow might be:
- Any one of Alice, Bob or Caroline initiates and adds instructions
- David checks the proposed change for legal compliance. If it fails, return to stage 1. If it passes, proceed.
- Any one of Emily, Frank or Georgia can make the change in text & launch date
- Alice or Bob checks the change in text for accuracy. If it fails, return to stage 3. If it passes, proceed.
- Caroline checks the launch date. If it fails, return to stage 3. If it passes, proceed.
- Content goes live on launch date
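As a rough illustration, the workflow above can be written down as data and replayed against a list of events. This is only a sketch: the `WORKFLOW` table and the `run` function are made up for the example, and stages are numbered from 1 to match the list. Going live on the launch date is left to a scheduler once the item is approved.

```python
# Each stage: (task, people allowed to do it, stage to return to on failure).
WORKFLOW = [
    ("initiate and add instructions", {"Alice", "Bob", "Caroline"}, None),
    ("legal compliance check", {"David"}, 1),
    ("make change to text and launch date", {"Emily", "Frank", "Georgia"}, None),
    ("check text for accuracy", {"Alice", "Bob"}, 3),
    ("check launch date", {"Caroline"}, 3),
]


def run(workflow, events):
    """Replay (person, passed) events; return the current stage, or 'approved'."""
    stage = 1
    for person, passed in events:
        if stage > len(workflow):
            break  # already approved; ignore any further events
        task, allowed, on_fail = workflow[stage - 1]
        if person not in allowed:
            raise PermissionError(f"{person} cannot perform stage {stage}: {task}")
        if passed:
            stage += 1
        elif on_fail is not None:
            stage = on_fail
        # Stages with no failure route simply stay where they are.
    return "approved" if stage > len(workflow) else stage
```

For example, `run(WORKFLOW, [("Bob", True), ("David", True), ("Frank", True), ("Alice", False)])` leaves the item back at stage 3, waiting for the text to be redone.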
Remember that any workflows you define before launch are only going to last until you're using them and find the things which make you mad with them. It's a good idea to only define a few to start with as initial starting points. From the day you go live, you'll be evaluating them, so you can develop variations for the situation where "this particular bit of content doesn't fit workflow N". It's an evolutionary thing.
So, content management. It takes a hell of a lot of thought early on in the process, but it will pay you back in large lumps of time to go diving/drinking/driving/whatever (though separately, eh?) once you go live. And that's what good development is all about.
Content Management Tools:
- Broadvision
Horrifically expensive to buy (don't expect change out of £1m) and to develop for (developers get £100k a year each), this is the big daddy of them all. Designed to give massive amounts of customisation and scalability, and to talk to big-iron mainframes, this is what many ecommerce companies are spending their venture capital cash on. Forrester Research have been complaining lately that it doesn't support enough in the way of standard interfaces, particularly Enterprise JavaBeans. It does have a CORBA layer, however.
- Vignette StoryServer
Not as expensive as Broadvision, but you're still looking at £70k per server. I did hear of a user who needed 6 Solaris E450s (£150k per box or so) to handle loads as small as 30k users a day, which can't be good. It uses Tcl as its development platform, and if you're a Tcl developer, it puts you in a very small, very well paid group. Users include Shell Chemicals and Mercedes, plus a whole bunch of news sites. Phil Greenspun has suggested that if you're a news site, you'll be drawn into its way of working.
- Allaire Spectra
Pretty affordable (£10k per server, but you need ColdFusion Enterprise too) and designed to handle workflows, personalisation and syndication. The development platform is CF, which has a sizable third party developer community. Will talk to almost anything sensible (ie ODBC, COM, DCOM, CORBA, EJB, WDDX etc). Right from the announcement, it's been making waves - have a look at what WebMonkey said about Spectra. Customers include fitforall.com, PricewaterhouseCoopers and about 50 others.
- Interwoven TeamSite
Not at all cheap for large user groups (it's based on number of licenses), but will talk to a wide range of application servers (even deploy with Server Side Includes, and handle static files). You'll also really need to buy their excellent deployment product, OpenDeploy. Strong on workflows (it uses Perl and XML as development tools), but weak on content entry. Until they update the entry side, you'll need to hook up something like eWebEditPro to it, which isn't going to be fun. Customers include Yahoo and eBookers.com. There are almost no third party developers, so you'll need to pay Interwoven to supply consultants. Could be a lengthy wait, particularly if you're in Europe.
- evolt.org
That's this site. If you've ever submitted an article, you'll know that there's a delay before it hits the front page. Or it might not get there at all - you may get a note back from an admin member asking you to revise it. That's because there's a signoff step in the workflow. The workflow for comments is much simpler - there's no signoff at all. It's simple but effective for a small homogeneous site like ours. It cost us no cash at all as it's built by evolt members, developed from a very good system produced by our own Walker, and runs on donated servers (ColdFusion & Oracle on Linux - it used to run on CF & Access on NT). Want to help develop it further? Sign up for thesite mailing list.
Martin is currently (March 2000) working with the UK's second biggest bank, developing a content management system to manage their Internet site.