Using Custom Tags To Ease Content Management
Some typical questions from customers who want a website developed are:
- I want to manage the content myself. Will you have a system for it?
- I don't know HTML or programming. Can I still manage my content?
- I can't understand these HTML tags. Is there an easier way out?
Custom tags embedded in the content are one answer:
- A lay user can manage content and layout features very easily with very little knowledge of HTML.
- Managing the content becomes much easier.
- The same concept can be reused and replicated in other website projects.
In the image example that follows, I used a global setting for things like "hspace" and "vspace" for images, even though this could have been incorporated into the form. I could also have calculated things like the height and width of the image when the user uploaded it, but let me leave that as another exercise.
Recreating the <img> Tag
Let's take up the simple example of the image tag:
<img src="xyz001.jpg" alt="xyz image" align="right">
This is a typical way the image tag would be found. In a page-length article on a website, images typically come within the text, so it makes sense to allow users to place an image where they want and to place as many images as they want. So what I did was reduce the image tag to:
<g n="index">l</g>
The tag, <g>, is not an HTML tag, but our very own tag that our server-side script recognises. Let's examine the various parts of our tag for this example:
<g> </g>
- This just indicates to our parsing program the beginning and ending of our tag.
n="index"
- This attribute indicates to the parser which image the user has requested, but the index need not be the image name. We will come to that later.
<g n="index">l</g>
- The 'l' between the <g></g> is an alignment indicator which asks the image to align left ('l') or right ('r') in the resulting <img> tag.
Just a small clarification here, which I decided to add after reading some of the comments below: I have used a tag called <g> in this example, which some people might consider too cryptic. I could very well have used something like <image name="x">left</image>. The parser described below will handle something like that just by changing the input parameters.
Applying our Custom Tag
So the content entry users have a form (I don't know what most people use… probably VB forms, Access forms or HTML forms like evolt.org) where they enter the content. In the case where I used this technique, the data store was a Domino database, so I used Domino forms (pretty rare… huh!). The same technique can be easily duplicated on other systems. Some typical content entered by the user would be:
X company reached IPO on Feb 14th, but the CEO was disbarred from attending the conference by the police as he had 6 arms<g n="CEO">l</g>. But federal investigator Mr. Mulder came up with an alternate theory…. Blah… blah… Blah… blah… Blah… blah… Blah… blah… Blah… blah… which conclusively<g n="Mulder">r</g> proved that the CEO was an alien.
The parsing program goes through this and transforms the <g> tags to meaningful HTML <img> tags with the correct image name, which is then rendered in the browser, so the converted output would be:
X company reached IPO on Feb 14th, but the CEO was disbarred from attending the conference by the police as he had 6 arms<img src="path/ceo.180x100.jpg" align="left" alt="CEOs at Lunch">. But federal investigator Mr. Mulder came up with an alternate theory…. Blah… blah… Blah… blah… Blah… blah… Blah… blah… Blah… blah… which conclusively<img src="path/mulder.jpg" align="right" alt="Agent Mulder"> proved that the CEO was an alien.
The two main components that accomplish this are:
- A centralized image storage system.
- An image parser.
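Before looking at the two components, here is a minimal sketch of how the custom tags in the sample content above can be picked apart into an index name and an alignment letter. It uses a regular expression rather than the index-based scanning of the parser listed later in this article; the class name CustomTagScan and the pattern are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CustomTagScan {
    // Matches <g n="index">l</g> or <g n="index">r</g>.
    // Group 1 is the index name, group 2 the alignment letter.
    static final Pattern G_TAG =
            Pattern.compile("<g\\s+n=\"([^\"]*)\">\\s*([lr])\\s*</g>");

    // Returns one "index:alignment" entry per tag, in document order.
    static List<String> scan(String content) {
        List<String> found = new ArrayList<>();
        Matcher m = G_TAG.matcher(content);
        while (m.find()) {
            found.add(m.group(1) + ":" + m.group(2));
        }
        return found;
    }

    public static void main(String[] args) {
        String content = "as he had 6 arms<g n=\"CEO\">l</g>. But ... which conclusively"
                + "<g n=\"Mulder\">r</g> proved that the CEO was an alien.";
        System.out.println(scan(content)); // prints [CEO:l, Mulder:r]
    }
}
```

Everything outside the matched tags is left alone, which is what lets the user mix the tags freely with ordinary copy.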
A Centralized Image Store
I have a database table and form system (let's call it an image store) which is managed by the content entry users. They upload an image through a user-friendly form and enter the image attributes there, and everything is stored in the backend database. The parser uses the n="index" attribute to locate the requested image in this table.
The image entry form
A typical procedure for entering the image would be like this:
- Add the image…
The image can be put in a database or in a file system, though in the case of a file system we would also store the path information. (In my case, it was in a database for ease of management, and the database had to be replicated as well: two birds with one stone.) Typically the image has an unfriendly name, like img001_180x100.jpg. Users find these names very difficult to remember or to search for.
- Specify an index name for the image…
This is a simple, human-readable name which the content entry user puts in our custom tag to identify the image: <g n="indexname">.
- Specify an "alt" text for the image…
This eases things for users, as they do not have to enter "alt=" text manually every time they put an image tag in the content. The parser picks it up automatically. There is an additional advantage, since alt texts are stored centrally: for an image used in many places on a website, changing its alt text everywhere means changing it in just one place.
- Enter a short description about the image…
I found this, along with alt text, very useful, since it allows building an intuitive image search on a website.
The image store table
My table structure for the image manager would look like this:

Img_index | Image | Image_AltText | Image_desc |
---|---|---|---|
Fishcatch | 8.jpg | Fish catch at hemingway | Fishing for sailfish at hemingways in the bay of biscay |
Ceo | Ceo180x100.jpg | Mr. Ceo Laughing | CEO of X corporation, Mr CEO |
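
As a rough sketch, the image store table could be mirrored in memory like this, with a lookup by index name and a simple search over the alt text and description. The class ImageStore, its method names, and the case-insensitive matching of index names are assumptions for illustration; the original kept this data in a Domino database.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

class ImageStore {
    // One row of the image store table.
    static final class Entry {
        final String index, fileName, altText, description;

        Entry(String index, String fileName, String altText, String description) {
            this.index = index;
            this.fileName = fileName;
            this.altText = altText;
            this.description = description;
        }
    }

    private final Map<String, Entry> rows = new LinkedHashMap<>();

    void add(String index, String fileName, String altText, String description) {
        // Assumption: index names are matched case-insensitively.
        rows.put(index.toLowerCase(Locale.ROOT),
                new Entry(index, fileName, altText, description));
    }

    // Returns the row for an index name, or null if there is none.
    Entry lookup(String index) {
        return rows.get(index.toLowerCase(Locale.ROOT));
    }

    // Returns the index names whose alt text or description contains the term.
    List<String> search(String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<String> hits = new ArrayList<>();
        for (Entry e : rows.values()) {
            if (e.altText.toLowerCase(Locale.ROOT).contains(needle)
                    || e.description.toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(e.index);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        ImageStore store = new ImageStore();
        store.add("Fishcatch", "8.jpg", "Fish catch at hemingway",
                "Fishing for sailfish at hemingways in the bay of biscay");
        store.add("Ceo", "Ceo180x100.jpg", "Mr. Ceo Laughing",
                "CEO of X corporation, Mr CEO");
        System.out.println(store.lookup("CEO").fileName); // prints Ceo180x100.jpg
        System.out.println(store.search("fish"));         // prints [Fishcatch]
    }
}
```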
The Image Parser
Regarding the parser, I try to make it flexible for my needs. In reality, I use it not only to handle <g> tags, but also a variety of other custom tags. I have one for handling links and one for embedded tables. So it is a generic parser in Java which has generic routines from which other specialised parsers were extended.
How the parser works together with the Image store
- Get the whole content to be displayed in a string.
- The string has all the text content with the embedded custom tags.
- The parser then scans through the string, and when it locates a <g> tag, it extracts the tag attributes.
- The parser then uses the index specified in n="" to locate the image name in the image store.
- From the image store, it extracts all the associated info for the image, such as the "alt" attribute text.
- The information between the <g> and </g> (the "l" or "r") is used to set the align="right" or align="left" attributes for the <img> tag.
- With all this information, the <img> tag is finally built.
- Lastly, the parser replaces the <g></g> tags in the content string with the <img> tag.
- The content string is then further scanned for <g></g> tags and the same process happens again.
- Finally, the transformed content string is output to the browser.
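The steps above can be sketched end to end in a few lines. This is not the parser from the source listings: it swaps in java.util.regex for the scanning, and a plain Map stands in for the image store. The names ImageTagPipeline and transform, and the choice to drop tags whose index is unknown, are assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ImageTagPipeline {
    static final Pattern G_TAG =
            Pattern.compile("<g\\s+n=\"([^\"]*)\">\\s*([lr])\\s*</g>");

    // store maps an index name to { file name, alt text }.
    // Note: alt text is inserted as-is; a real version would escape it.
    static String transform(String content, Map<String, String[]> store, String path) {
        Matcher m = G_TAG.matcher(content);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String[] row = store.get(m.group(1));
            String img = ""; // assumption: an unknown index simply removes the tag
            if (row != null) {
                String align = m.group(2).equals("l") ? "left" : "right";
                img = "<img src=\"" + path + "/" + row[0] + "\" align=\"" + align
                        + "\" alt=\"" + row[1] + "\">";
            }
            // quoteReplacement keeps '$' and '\' in the tag from being interpreted
            m.appendReplacement(out, Matcher.quoteReplacement(img));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        Map<String, String[]> store = new HashMap<>();
        store.put("CEO", new String[] {"ceo.180x100.jpg", "CEOs at Lunch"});
        System.out.println(transform("he had 6 arms<g n=\"CEO\">l</g>.", store, "path"));
    }
}
```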
Parser structure
- There is a generic parser called the baseParser which has generic routines for scanning for a specified tag in a content string.
- There is a task-specific parser, in this case an image parser called imgParser, which uses the iterative routines of the baseParser to extract its tags from a string. The task-specific parser also has a function to output a custom tag in a particular way (in this case, as an <img> tag).
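One way to picture this split is a generic scanner with a single hook that each task-specific parser fills in. The sketch below uses inheritance with an abstract render method, whereas the listings that follow have imgParser hold a baseParser instance. The names TagParser, ImageTagParser and ExternalLinkParser, the <k> link tag, the "external" class name, and the way the image file name is derived from the index are all hypothetical.

```java
abstract class TagParser {
    private final String begin;   // e.g. "<g"
    private final String end;     // e.g. "</g>"
    private final String refAttr; // e.g. "n=\""

    TagParser(String begin, String end, String refAttr) {
        this.begin = begin;
        this.end = end;
        this.refAttr = refAttr;
    }

    // Task-specific parsers decide what replaces each custom tag.
    abstract String render(String ref, String prop);

    String parse(String content) {
        StringBuilder out = new StringBuilder();
        int pos = 0;
        while (true) {
            int start = content.indexOf(begin, pos);
            if (start == -1) break;
            int close = content.indexOf(end, start);
            if (close == -1) break; // unterminated tag: leave the rest untouched
            String unit = content.substring(start, close); // e.g. <g n="CEO">l
            int refStart = unit.indexOf(refAttr);
            int refEnd = refStart == -1 ? -1 : unit.indexOf('"', refStart + refAttr.length());
            int gt = refEnd == -1 ? -1 : unit.indexOf('>', refEnd);
            if (gt == -1) {
                // malformed tag: copy it through unchanged
                out.append(content, pos, close + end.length());
            } else {
                String ref = unit.substring(refStart + refAttr.length(), refEnd);
                String prop = unit.substring(gt + 1);
                out.append(content, pos, start).append(render(ref, prop));
            }
            pos = close + end.length();
        }
        return out.append(content.substring(pos)).toString();
    }
}

class ImageTagParser extends TagParser {
    ImageTagParser() { super("<g", "</g>", "n=\""); }

    @Override
    String render(String ref, String prop) {
        String align = "l".equals(prop.trim()) ? "left" : "right";
        // Stand-in for an image store lookup: the file name is faked from the index.
        return "<img src=\"/images/" + ref + ".jpg\" align=\"" + align + "\">";
    }
}

class ExternalLinkParser extends TagParser {
    ExternalLinkParser() { super("<k", "</k>", "n=\""); }

    @Override
    String render(String ref, String prop) {
        return "<a href=\"" + ref + "\" class=\"external\" target=\"_blank\">" + prop + "</a>";
    }
}
```

With this shape, adding a new custom tag means writing one small subclass; the scanning code is not touched.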
Source Code Listings
Source Code for baseParser
This is a stripped-down version of the baseParser. I removed a lot of application-specific stuff to reduce the size of the parser as much as possible. There are descriptive comments along with the source code.

    public class baseParser {
        private String strBeginp;
        private String strEndp;
        private String strRefp;
        private String str;
        private int lnkBeginLength;
        private int lnkRefLength;
        private int lnkEndLength;

        // these 2 variables are used to keep track of tag position in the
        // content string during an iterative scan through the content string
        private int m_nLastIndex = 0;
        private int m_nPrevIndex = 0;

        public baseParser() {
            this("", "", "", "");
        }

        // toBeParsed - string with content + custom tags which requires parsing
        // beginP - beginning tag e.g. <g
        // endP   - ending tag e.g. </g>
        // refP   - ref. attribute tag e.g. n="
        public baseParser(String toBeParsed, String beginP, String endP, String refP) {
            str = toBeParsed;
            strBeginp = beginP;
            strEndp = endP;
            strRefp = refP;
            lnkBeginLength = strBeginp.length();
            lnkRefLength = strRefp.length();
            lnkEndLength = strEndp.length();
        }

        // iterator function which scans for tags sequentially
        public String parseUnit(int nBeginIndex) {
            // look for beginning
            int nPrevIndex = str.indexOf(strBeginp, nBeginIndex);
            // beginning tag not found, so end it
            if (nPrevIndex == -1)
                return "";
            // look for ending
            int nLastIndex = str.indexOf(strEndp, nPrevIndex);
            // ending tag not found: report "no tag" instead of letting
            // substring() throw
            if (nLastIndex == -1)
                return "";
            m_nPrevIndex = nPrevIndex;
            m_nLastIndex = nLastIndex + lnkEndLength;
            return str.substring(nPrevIndex, m_nLastIndex);
        }

        // helper function for iterator
        public String parseUnit() {
            return parseUnit(lastUnit());
        }

        // parses out the reference attribute of the tag
        // in <tag ref="refattrib">prop</tag>
        // this parses out refattrib...
        public String parseRef(String strCur) {
            int nRefBegin = strCur.indexOf(strRefp);
            if (nRefBegin == -1)
                return "";
            int nRefEnd = strCur.indexOf('"', nRefBegin + lnkRefLength);
            if (nRefEnd == -1)
                return "";
            return strCur.substring(nRefBegin + lnkRefLength, nRefEnd);
        }

        // parses out the property of the tag
        // in <tag ref="refattrib">prop</tag>
        // this parses out prop...
        public String parseProp(String strCur) {
            int nLnkEnd = strCur.lastIndexOf(strEndp);
            if (nLnkEnd == -1)
                return "";
            // walk back from the ending tag to the '>' that closes the
            // opening tag
            int n = strCur.lastIndexOf('>', nLnkEnd);
            if (n == -1)
                return "";
            return strCur.substring(n + 1, nLnkEnd);
        }

        // helper function for iterator
        public int prevUnit() {
            return m_nPrevIndex;
        }

        // helper function for iterator
        public int lastUnit() {
            if (str.indexOf(strBeginp, m_nLastIndex) == -1)
                return -1;
            return m_nLastIndex;
        }
    }
The image parser
The base parser gets used by the imgParser, as shown below. Again, I removed a lot of application-specific error checking and caching code to reduce size.

    import java.util.Vector;

    public class imgParser {
        private static final int IMG_FILE_NAME_COL = 0;
        private static final int IMG_ALT_NAME_COL = 1;

        private baseParser m_spa;
        private String m_strMain = "";

        public imgParser() {
            this("");
        }

        // in reality i was also passing the db connection from the page
        // script to the parser...
        // this allowed me to reuse the connection i made to the db for
        // displaying the page instead of creating new connections for
        // each instance of the parser
        public imgParser(String toBeParsed) {
            m_strMain = toBeParsed;
            // seed the base parser with our required tags
            m_spa = new baseParser(toBeParsed, "<g", "</g>", "n=\"");
        }

        private Vector<String> lookupImageDb(String sImageNo) {
            // this was a lotus domino routine which looked up the
            // index name in the database and returned a row of
            // information as a java Vector
            // for illustrative purposes...i am just returning a dummy vector
            // but your routine to query the db would come over here...
            // this returned a vector with 2 columns
            // 1st col - image filename
            // 2nd col - image alt text
            Vector<String> v = new Vector<String>(2);
            v.addElement("image.jpg");
            v.addElement("just a dummy image");
            return v;
        }

        private String getPath() {
            // i didnt hard code any paths...
            // this function simply calculated the path to the image inside the
            // database...since i was storing the image in the database
            // you could use it to calculate paths...
            // for illustrative purposes, i just return a dummy path...
            return "/images";
        }

        private String ParseImageTag(String sImageNo, String sAlign) {
            String sTemp = "";
            try {
                // lookup image info from image store
                Vector<String> v = lookupImageDb(sImageNo);
                // get the alt text and the image file name (or handle in database)
                String strAltTag = v.elementAt(IMG_ALT_NAME_COL);
                String strFileName = v.elementAt(IMG_FILE_NAME_COL);
                // build the image tag
                sTemp += "<img src=\"" + getPath() + "/" + strFileName + "\"";
                // set alignment depending on l or r properties
                if (sAlign.equals("l"))
                    sTemp += " align=\"left\"";
                else
                    sTemp += " align=\"right\"";
                // apply alt text only if it exists
                if (strAltTag.length() != 0)
                    sTemp += " alt=\"" + strAltTag + "\"";
                // other attributes
                sTemp += " hspace=\"8\" vspace=\"8\" ";
                sTemp += "border=\"0\" />";
            } catch (Exception e) {
                e.printStackTrace();
            }
            // on failure this returns whatever was built so far
            return sTemp;
        }

        // the only function called externally for this parser
        public String ParsedString() {
            StringBuilder strTemp = new StringBuilder();
            // position in m_strMain up to which content has been copied
            int nCopied = 0;
            // call the base parser routine to iteratively scan for <g> tags
            String sRet = m_spa.parseUnit(0);
            while (!sRet.equals("")) {
                // <g> tag found: parse out attributes and properties
                String sRef = m_spa.parseRef(sRet);
                String sTxt = m_spa.parseProp(sRet);
                int nPrev = m_spa.prevUnit();
                // copy the text before the tag, then the <img> tag built from it
                strTemp.append(m_strMain, nCopied, nPrev);
                strTemp.append(ParseImageTag(sRef, sTxt));
                // skip to the position at the end of the current <g></g>
                nCopied = nPrev + sRet.length();
                // check if end
                int nLast = m_spa.lastUnit();
                if (nLast == -1)
                    break;
                // parse the next <g> tag set
                sRet = m_spa.parseUnit(nLast);
            }
            // no (more) image tags, so append the remaining string as is
            strTemp.append(m_strMain.substring(nCopied));
            return strTemp.toString();
        }
    }
Using the parser in script
I use the imgParser in my code in this manner:

    //..other code on page
    String strContent;
    String strParsedContent;
    strContent = get_Content_As_String_From_Content_Table_Field();
    //in reality i pass my database connection handle of the page script
    //to the parser, so i can reuse it
    //create our image parser object and pass the content string to it
    imgParser ipObj = new imgParser(strContent);
    //strParsedContent contains the transformed string now...
    strParsedContent = ipObj.ParsedString();
    //...output parsed content to page
    printwriterObj.print(strParsedContent);
Finally…
Just a few thoughts before I finish this…
I use quite a few parsers built around the base parser. I use a link parser, which uses a custom tag for links and appends a particular class name and a target=_blank to links external to the site. I use another tag to embed a small table dynamically within the text with a specific alignment.
There might be performance implications on heavily trafficked sites, since the parser runs every time someone hits the page. What I do in such cases is pre-parse the content periodically on the server and cache it in the database, so that when users access the page they see the cached version instead of triggering a parse every time.
The way the parser handles only a single attribute (the "n=" in <g n="">) was by design. I wanted to make the tags as simple as possible for the user entering content. In almost all the cases where I used the parser, I never had to use more than a single attribute. Also, I never check for incorrect tags in the parser, because I was checking them on the client side at the point of data entry.
Using this technique certainly saved time and heartburn for me. Other than dealing with clients, I also had to deal with a copywriter who liked getting acquainted with Gilbey's dry gin more than with "complicated" HTML tags. But he actually enjoyed entering these custom tags along with his copy, since for the first time he had complete control over where to position images in the copy.
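The pre-parse-and-cache idea mentioned above could look roughly like the sketch below. An in-memory map stands in for the database cache described in the text, and the names ParsedContentCache, refresh and get are assumptions. The parser is passed in as a function, so any of the custom tag parsers could be plugged in.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.UnaryOperator;

class ParsedContentCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();
    private final UnaryOperator<String> parser;

    ParsedContentCache(UnaryOperator<String> parser) {
        this.parser = parser;
    }

    // Called by a periodic job (or when content is saved): parse once and store.
    void refresh(String pageId, String rawContent) {
        cache.put(pageId, parser.apply(rawContent));
    }

    // Called on every page hit: no parsing here, only a lookup.
    // Returns null if the page has never been parsed.
    String get(String pageId) {
        return cache.get(pageId);
    }

    public static void main(String[] args) {
        // Stand-in parser for the sketch; a real one would transform custom tags.
        ParsedContentCache pages = new ParsedContentCache(s -> s.replace("<g>", "<img>"));
        pages.refresh("home", "text <g> more text");
        System.out.println(pages.get("home")); // prints text <img> more text
    }
}
```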