OmniMark CGI programs are no more complex than many of the sample programs seen in the previous chapters in this book - they employ exactly the same principles and syntax.
However, be warned that there are some rather complex setting up issues with CGI programs. You have to have access to a host machine on the web, you have to have OmniMark installed on that machine, and you have to have permission from the web server administrator to install and execute your programs on the host machine. Debugging CGI programs is also a little more tedious than with standalone programs.
I will cover all these complexities in later topics. The topics first cover the theory of the Common Gateway Interface in some detail, then the setting up issues and finally the programming principles and syntax itself.
Although you can write and run many of the programs in this chapter without using a real CGI environment, you won't actually see the results as they are intended - this is because the input and output for CGI is assumed to be managed by a web server and the user interface is assumed to be a web browser (like Netscape or MS Internet Explorer) which communicates with the web server.
CGI is an acronym for 'Common Gateway Interface' which is a set of basic rules for how web browsers can talk to executable programs on a host machine via web server software.
To understand this, first think about what happens when a person is browsing on the web and wants to link to another web page. The picture below shows an open web page. The person looking at it (called a 'client') has pointed their mouse onto a link in the page but has not yet clicked on it.
Notice the status bar along the bottom of the window frame. It contains the URL of the web page which will be loaded when the link is clicked. In this case, the URL is simply the name of an HTML file called 'index.html' which is located in the directory 'omnimark' on a host machine called 'clio.mit.csu.edu.au'.
When the client actually clicks the link, their browser will send a message to the web server on the host machine requesting that the indicated page be sent to the browser. This basic interaction between browser and web server is shown in the following diagram.
Following this request, the web server reads the file and transmits it back to the client's machine and the client's browser displays it. The web server on the host machine is just a computer program whose job it is to serve pages to clients.
The Common Gateway Interface allows clients to send similar messages to web servers, but instead of requesting a web page (an HTML file), the client asks the web server to execute a program on the host machine. Programs which can be executed this way must conform to the rules of the Common Gateway Interface and are usually just called 'CGI programs'.
The picture below shows a client's browser just before they click on a link to a CGI program - note the URL in the status bar.
We can see the location of the CGI program in the status bar. It is a program called 'formtest' which is located in a directory called 'cgi-bin' on the host machine 'clio.mit.csu.edu.au'.
When the client actually clicks the link, their browser will send a message to the web server on the host machine requesting that the indicated program be executed. This interaction between browser and web server is slightly more complicated than the page serving interaction because the browser may also send data to the server (such as the data typed into a form on the original page). When the server executes the CGI program, it passes this data to it. This is shown in the following diagram.
The CGI program may do any number of processing tasks and ultimately it will send data back to the web server which will pass this on to the client's browser for display. In typical CGI programs, the output is just HTML markup so when this arrives at the client's browser it appears like any standard web page. The following diagram shows the flow of output data from the CGI program through the web server and back to the client's browser.
An important point to know is that, unlike software which you typically run on your own local machine, CGI programs live no longer than the time it takes for them to accept input, process it and deliver output. Once a CGI program has terminated, it loses all knowledge of the interaction. This means that a CGI program does not retain any information about previous calls to it - it does not maintain any state between executions. There are several techniques for helping CGI programs to maintain state between calls, such as writing and reading temporary files or setting cookies, but these are not dealt with in this book.
The above discussion indicates that a CGI program gets incoming data from the web server. Any data which is output by the CGI program is sent to the web server. As authors of CGI programs, we need to know what kind of data our program will get, how to interpret it, and how to output the correct data in response.
The Common Gateway Interface defines that all incoming data that a program receives comes in through its standard input stream. As you will see later in this chapter, OmniMark can process this input stream very easily. In addition to the standard input stream, the web server also sets several environment variables which the CGI program can access - these environment variables will be discussed in a later topic. OmniMark can easily deal with these environment values by using functions in its CGI library.
The most common data sent to a CGI program is that which has been captured by a web form on the client's machine. When the form is submitted (usually via a SUBMIT button), the browser collects the data in the form fields and encodes it in a special way (called 'URL encoding') before submitting it.
When the CGI program gets this data it must unencode it. CGI programs written with OmniMark can do this easily by, once again, using functions from the CGI library.
To send output data back to a web server, a CGI program simply writes the data onto its standard output stream. OmniMark does this with the simple 'output' actions seen often in the previous five chapters of this book.
Since the output from the program is destined eventually for display by a client's browser, the format of the output is usually just HTML. The program first outputs a special header indicating that the output is HTML, then just outputs the HTML markup.
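The round trip described above can be sketched outside OmniMark. The following Python fragment is only an illustration under stated assumptions: it plays the part of the web server, and the small stand-in CGI program inside it is hypothetical, not one of this book's programs. 'REQUEST_METHOD' and 'CONTENT_LENGTH' are standard CGI environment variables. The server sets the environment, feeds the form data to the program's standard input and collects whatever the program writes to standard output.

```python
import os
import subprocess
import sys

# Hypothetical stand-in CGI program: it reads the form data from standard
# input and writes a header, a blank line, then HTML to standard output.
CGI_PROGRAM = r'''
import os, sys
length = int(os.environ.get("CONTENT_LENGTH", "0"))
data = sys.stdin.read(length)
sys.stdout.write("Content-type: text/html\n\n")
sys.stdout.write("<HTML><BODY><P>Received: " + data + "</P></BODY></HTML>\n")
'''


def run_cgi(form_data: bytes) -> bytes:
    """Play the web server's role for a single request."""
    env = dict(os.environ)
    # The server describes the request through environment variables.
    env["REQUEST_METHOD"] = "POST"
    env["CONTENT_LENGTH"] = str(len(form_data))
    result = subprocess.run(
        [sys.executable, "-c", CGI_PROGRAM],
        input=form_data,         # becomes the program's standard input
        env=env,
        stdout=subprocess.PIPE,  # the server collects standard output
        check=True,
    )
    # Normalise line endings so the result looks the same on every platform.
    return result.stdout.replace(b"\r\n", b"\n")


if __name__ == "__main__":
    print(run_cgi(b"name=Hugo+First&age=31").decode("ascii"))
```

Note that the stand-in program exits as soon as it has written its output, so nothing is remembered from one call to the next.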
The following discussion assumes that the OmniMark CGI programs will be run on a host machine running the UNIX operating system. However, the principles apply equally to other systems such as MS Windows - albeit with some slight variation of directory naming etc.
CGI programming is treated as a more sophisticated operation than just publishing web pages in HTML. Before you can install your CGI programs on a host machine, you must gain permission to do so from the web server administrator and/or the system administrator of the host. When you contact the administrator to seek permission, explain that you want to be able to save your programs in one of the special directories which are configured for CGI executables. On a typical UNIX machine, running a typical web server (such as 'Apache'), the most common directory is one called 'cgi-bin' which usually lives under the installation directory of the web server itself. A location of
/local/apache/cgi-bin
is quite common.
The administrator must give you permission to save your programs in the 'cgi-bin' directory and also give you execute permission in that directory (for debugging).
The OmniMark C/VM must be installed on the host machine along with its associated function libraries and include files - this is the standard installation. Once installed, you should make sure you know the absolute pathname to the OmniMark executable 'omnimark'. In a typical installation on a typical UNIX host, this might be:
/local/bin/omnimark
You must also know the absolute pathnames of the external function libraries and the include file directories. On my machine the function libraries are in
/local/omnimark53/lib
and the include files are in
/local/omnimark53/xin
This topic presents the two files you need to create to build a first (and very simple) OmniMark CGI program. You can use these as a general guide for all the OmniMark CGI programs provided in later topics.
Using any text editor, create a text file in the 'cgi-bin' directory of your web host machine. Name this file 'cgiTest'. This file will not contain the OmniMark program itself, it will contain the command-line arguments which set up the locations of the OmniMark libraries and specify the name of the OmniMark program. The name of this arguments file (ie 'cgiTest') will be used by clients when they call your program.
This file is actually an OmniMark 'arguments file' as discussed in Chapter 5, Topic 4. Using the example installation paths discussed above, this argument file will contain:
001 #!/local/bin/omnimark -f
002 -sb cgiTest.xom
003 -x /local/omnimark53/lib/=L.so
004 -i /local/omnimark53/xin/
Line 1 in the above file is crucial. It must start at the beginning of the first line, as shown, with the symbols '#!'. These symbols are called 'hash-bang', or sometimes just 'shebang', and when they are read by the host's operating system they are interpreted as 'call the following command'.
The command which is called is '/local/bin/omnimark -f' which calls OmniMark and instructs it to interpret the subsequent lines as command-line arguments. Note that '/local/bin/omnimark' is the absolute pathname of the executable OmniMark C/VM on the host machine.
Line 2 contains the call to our OmniMark CGI program called 'cgiTest.xom'. Note the typical '-sb' options which specify that 'cgiTest.xom' is our source file and it is to be run in 'brief' mode.
Line 3 contains the '-x' option (which specifies the location of OmniMark libraries) followed by the absolute path of the libraries. The notation '=L.so' appended to this path indicates that on this system (UNIX), the function libraries have a suffix of 'so'. If you are writing CGI programs on a Windows system, the notation will have to be '=L.dll' since 'dll' is the Windows file suffix for dynamic linked libraries.
Finally, on line 4, the option '-i' specifies the location of the OmniMark include files.
This arguments file 'cgiTest' must be set as 'executable' by the operating system. On UNIX, executable mode can be specified for the file with the command
chmod 755 cgiTest
and can be confirmed with the UNIX command
ls -l cgiTest
which, on my local machine displays
-rwxr-xr-x 1 echoppin academic 99 Nov 7 15:45 cgiTest
The three 'x' characters in the first part of this listing show that the file 'cgiTest' is executable by its owner, by its group and by everyone else.
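If the link between the octal digits given to 'chmod' and the letters shown by 'ls -l' is unfamiliar, the following small Python sketch (an illustration only, not part of the OmniMark setup; the function name 'describe_mode' is made up for this example) converts one into the other using the standard 'stat' module.

```python
import stat


def describe_mode(octal_digits: str) -> str:
    """Return the 'ls -l' style permission string for a regular file."""
    # S_IFREG marks a regular file, which gives the leading '-' in the listing.
    mode = stat.S_IFREG | int(octal_digits, 8)
    return stat.filemode(mode)


if __name__ == "__main__":
    print(describe_mode("755"))  # prints -rwxr-xr-x
```

Each octal digit covers one group of three letters: 7 is read, write and execute for the owner, and each 5 is read and execute for the group and for everyone else.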
Now to write the OmniMark program - at last!
Using any text editor, create the source file 'cgiTest.xom' in the same directory as the arguments file above. Our source will be a minimal CGI program which ignores any incoming data and simply outputs a header and the smallest possible HTML markup for display on the client's browser. Here is the program:
[Code Sample: C06T05a.xom]
001 ; a minimal OmniMark CGI program
002
003 process
004 output "Content-type: text/html%n%n"
005 output "<HTML>%n"
006 output "<HEAD>%n"
007 output "<TITLE>OmniMark CGI says Hi</TITLE>%n"
008 output "</HEAD>%n"
009 output "<BODY>%n"
010 output "<H2>Hi there from my OmniMark CGI program</H2>%n"
011 output "</BODY>%n"
012 output "</HTML>%n"
Line 4 of the program outputs (to standard output, naturally) the special header which tells the browser what kind of data is to follow. In this case the header is
Content-type: text/html
followed by two newlines. The content type 'text/html' specified is a mime-type which informs the browser that the data to follow is plain ASCII text and is marked up as HTML. The two newlines are essential because the blank line they create after the header tells the recipient that the header is finished and that the real data follows.
The eight lines from line 5 to line 12 output simple HTML markup which the browser will display.
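To see why the blank line matters, here is a minimal Python sketch of how a recipient might separate the header from the data. It is an assumption-laden illustration, not how any particular web server or browser is implemented, and the function name 'split_cgi_response' is invented for this example.

```python
def split_cgi_response(raw: str):
    """Split CGI output into a dictionary of headers and the body text."""
    # Everything before the first blank line is header, the rest is data.
    head, separator, body = raw.partition("\n\n")
    if not separator:
        raise ValueError("no blank line after the header")
    headers = {}
    for line in head.split("\n"):
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return headers, body


if __name__ == "__main__":
    headers, body = split_cgi_response(
        "Content-type: text/html\n\n<HTML>\n</HTML>\n"
    )
    print(headers["content-type"])
    print(body)
```

With only one newline after the header, the sketch cannot tell where the header stops, which is the same difficulty a real recipient would face.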
Before calling the program from a browser, it is wise to test it from the command line. This can be done (in UNIX), just by entering the name of the executable arguments file as a command...
cgiTest
When this is done, the operating system sees the hash-bang and so invokes the OmniMark C/VM on the file. OmniMark sees the option '-f' and realises that this is an arguments file, then sees the '-sb cgiTest.xom' and calls our program. The program's output appears on our console screen (our standard output device) and should appear as:
Content-type: text/html

<HTML>
<HEAD>
<TITLE>OmniMark CGI says Hi</TITLE>
</HEAD>
<BODY>
<H2>Hi there from my OmniMark CGI program</H2>
</BODY>
</HTML>
Of course, if there are OmniMark compile-time or run-time errors generated, they should be fixed and the program tested again from the command line. If it appears to work correctly it's time to test it live on the web.
The following picture shows my browser window after a call to 'cgiTest' has been made from within the browser. Note the location URL - it shows where I placed my program. Note also the title on the browser titlebar - it shows that the program's output has been interpreted correctly.
Although this sample program works correctly, it is a good idea in general OmniMark CGI programming to include a couple of extra settings. With these added, our sample program can be used as a template for all our following and more complex programs.
To make our programs more efficient in capturing incoming data we can specify that our standard input stream is not buffered. This means that the data is not accumulated in memory before being fed to our program. Unbuffering standard input can be specified with the instruction
declare #main-input has unbuffered
at the top of our '.xom' source code file.
To ensure that our output is sent without modification and as efficiently as possible we can specify that our standard output stream is sent in binary rather than ASCII mode. This setting is specified by writing
declare #main-output has binary-mode
at the top of our source code.
The external function libraries supplied with our OmniMark installation can be accessed simply by including them near the top of our source code. The following three lines bring the utility library, the CGI library and the date library respectively into our source code.
include "omutil.xin"
include "omcgi.xin"
include "omdate.xin"
Note that the libraries are named with the '.xin' suffix because they are OmniMark include files. OmniMark knows where to find these on our system by following the '-i' path specified in the arguments file. We do not have to import the external function library files themselves - these are accessed by OmniMark automatically within the '.xin' files. We do have to tell OmniMark where on our system these libraries are and that was done with the '-x' option in the arguments file.
Be careful to always include 'omutil.xin' before including 'omcgi.xin' - this is because some of the functions in the CGI library require the use of some of those in the utility library.
With the include files available, we can easily and directly call any of the support functions we need.
I present here, for completeness, another sample CGI program which includes the IO settings, and the libraries. It outputs HTML markup showing the exact date and time the program was called.
The executable arguments file (called 'showtime') which is called by a client is as follows
#!/local/bin/omnimark -f
-sb showtime.xom
-x /local/omnimark53/lib/=L.so
-i /local/omnimark53/xin/
The full listing of the source code (in a file called 'showtime.xom') is:
[Code Sample: C06T06a.xom]
001 ; An OmniMark CGI program which outputs the current
002 ; date and time within HTML markup
003
004 ; set up the IO streams
005 declare #main-input has unbuffered
006 declare #main-output has binary-mode
007
008 ; include three libraries
009 include "omutil.xin"
010 include "omcgi.xin"
011 include "omdate.xin"
012
013 ; declare a variable to hold the time value
014 global stream theTime
015
016 process
017 set theTime to now-as-ymdhms
018 output "Content-type: text/html%n%n"
019 output "<HTML>%n"
020 output "<HEAD>%n"
021 output "<TITLE>OmniMark showtime CGI</TITLE>%n"
022 output "</HEAD>%n"
023 output "<BODY>%n"
024 output "<CENTER>%n"
025 output "<P>This program was executed at</P>%n"
026 output "<PRE>%n"
027 output "%g(theTime)"
028 output "</PRE>%n"
029 output "</CENTER>%n"
030 output "</BODY>%n"
031 output "</HTML>%n"
The current date and time is captured into the stream variable 'theTime' on line 17 and output on line 27. The function 'now-as-ymdhms' is available to the program from the 'omdate.xin' library which was included on line 11.
When I call the program from my browser, the display I get is shown in the following picture. You can obviously see when I captured this picture - year 2000, month 11, day 08, hour 10, minute 34, second 22. The +1100 on the end of the time value indicates that here in Bathurst, Australia we are 10 hours ahead of GMT and we currently have 1 hour of daylight saving time operating.
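For readers who want to pick such a time value apart in another tool, the Python sketch below parses it. The exact string '20001108103422+1100' is an assumption reconstructed from the description above (year, month, day, hour, minute, second, then the offset), and the function name 'parse_ymdhms' is invented for this example.

```python
from datetime import datetime, timedelta


def parse_ymdhms(stamp: str) -> datetime:
    """Parse a 'YYYYMMDDhhmmss+ZZZZ' time value into a datetime."""
    return datetime.strptime(stamp, "%Y%m%d%H%M%S%z")


if __name__ == "__main__":
    # Assumed layout of the value shown in the picture.
    when = parse_ymdhms("20001108103422+1100")
    print(when.isoformat())
    # The offset combines 10 hours for the time zone with 1 hour of
    # daylight saving time.
    print(when.utcoffset() == timedelta(hours=11))
```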
Perhaps the most useful purpose of CGI programs is capturing and processing data which comes from web forms.
The picture below shows a simple web form into which our friend Hugo has entered some data:
Here is the HTML used to create it.
001 <HTML>
002 <HEAD>
003 <TITLE>A simple web form</TITLE>
004 </HEAD>
005 <BODY>
006 <FORM METHOD=POST
007 ACTION="http://clio.mit.csu.edu.au:88/cgi-bin/ombook/form1">
008
009 What is your name? <INPUT NAME="name" TYPE=TEXT SIZE=30><BR>
010 What is your age? <INPUT NAME="age" TYPE=TEXT SIZE=5><BR>
011 What is your email address? <INPUT NAME="email" TYPE=TEXT SIZE=20><BR>
012 Submit this form: <INPUT TYPE=SUBMIT VALUE="Hit Me">
013
014 </FORM>
015 </BODY>
016 </HTML>
The FORM element starts on line 6 where the transmission method is defined as 'POST'. On line 7 the ACTION attribute contains the URL of the CGI program which will accept the form data when it is submitted. Note that this form contains only three fields into which data can be entered, one called 'name' on line 9, one called 'age' on line 10, and one called 'email' on line 11.
When the form is submitted, the browser encodes the names of the fields and their contents like this, and this data is sent to the CGI program.
name=Hugo+First&age=31&email=hugo%40myplace.com
The encoding pattern is a sequence of name/value pairs separated by an ampersand symbol '&'. Each pair contains a field name, an equals symbol and a field value. You might notice that spaces have been encoded as '+' symbols and some non-alphanumeric characters (like the '@' symbol) have been converted into the hexadecimal equivalent of their ASCII code, preceded by a percent symbol.
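The decoding rules just described are not specific to OmniMark. As a neutral illustration, the Python sketch below decodes the same raw string with the standard 'urllib.parse' module. The function name 'decode_form' is invented for this example, and the result is an ordinary dictionary rather than an OmniMark shelf.

```python
from urllib.parse import parse_qsl


def decode_form(raw: str) -> dict:
    """Decode URL-encoded form data into a dictionary of field values."""
    # parse_qsl splits on '&' and '=', turns '+' into a space and
    # converts each %XX sequence back into the character it stands for.
    return dict(parse_qsl(raw, keep_blank_values=True))


if __name__ == "__main__":
    fields = decode_form("name=Hugo+First&age=31&email=hugo%40myplace.com")
    for name, value in fields.items():
        print(name, "=", value)
```

For the string above this gives 'Hugo First' for 'name', '31' for 'age' and 'hugo@myplace.com' for 'email'.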
We could write an OmniMark pattern matching program (see Chapter 3) to decode this raw data but why bother? OmniMark has already provided a function which does it for us. The function is called 'cgiGetQuery' (within the 'omcgi.xin' library) and it neatly decodes the data into an OmniMark shelf. All our form processing CGI programs will make use of this function, as shown in the following example.
The following OmniMark CGI program responds when the form above is submitted. The output from the program is HTML markup which includes the form data so the client can verify the data they submitted. Although most form processing programs are often longer and more complex, this example demonstrates the main principles.
The executable arguments file (called 'form1') is
#!/local/bin/omnimark -f
-sb form1.xom
-x /local/omnimark53/lib/=L.so
-i /local/omnimark53/xin/
The CGI program (called 'form1.xom') is
[Code Sample: C06T07a.xom]
001 ; An OmniMark CGI program which processes a form
002 ; The program simply sends the form data back to the
003 ; client.
004
005 ; set up the IO stream settings
006 declare #main-input has unbuffered
007 declare #main-output has binary-mode
008
009 ; include three libraries
010 include "omutil.xin"
011 include "omcgi.xin"
012 include "omdate.xin"
013
014 ; a shelf to hold the field names and values
015 global stream formData variable initial-size 0
016
017 ; a variable to hold the number of fields
018 global counter numFields
019
020 process
021 cgiGetQuery into formData ;; decode and capture incoming data
022
023 ; output to client
024 output "Content-type: text/html%n%n"
025 output "<HTML>%n"
026 output "<HEAD>%n"
027 output "<TITLE>Form 1 CGI</TITLE>%n"
028 output "</HEAD>%n"
029 output "<BODY>%n"
030 output "<P>Thanks for submitting the form.</P>%n"
031
032 set numFields to number of formData
033 do when numFields = 0
034 output "<P>No form fields were received.</P>%n"
035 else
036 output "<P>%d(numFields) form fields were received:</P>%n"
037 repeat over formData
038 output "The field '" || key of formData || "' "
039 output "contains <KBD>" || formData || "</KBD><BR>%n"
040 again
041 done
042
043 output "</BODY>%n"
044 output "</HTML>%n"
Line 15 contains a declaration of an OmniMark stream shelf to hold the data.
Line 21:
cgiGetQuery into formData
calls the 'cgiGetQuery' function, which decodes all the raw data and copies it into the 'formData' shelf. The names of the fields will be the keys of the shelf items and the field values will be the values of the shelf items.
The ten lines from 32 through 41, do all the processing of the form data. We first check if any fields have been received (lines 32 and 33), and if so we loop over all the elements of our shelf (lines 37 to 40) and output the shelf keys (line 38) and values (line 39).
When I submit the form to the above program via my browser, the resulting display is:
The above program processes any web form, with any number of fields and without knowing what the names of the fields are. In many cases we want to process form data in a more specific way. The following example involves processing a registration type web form. Clients fill in and submit the form to register themselves for a conference. The CGI program captures the data, writes it onto the end of a log file and then outputs HTML markup to thank the client for their interest.
The web form, as seen by a client, is shown in the following picture. The fields have been filled in with sample data:
The HTML which produces the form is:
001 <HTML> 002 <HEAD> 003 <TITLE>Registration Form</TITLE> 004 </HEAD> 005 <BODY> 006 <H2>Registration Form</H2> 007 <P>Please complete and submit this form to register 008 for the conference.</P> 009 010 <FORM METHOD=POST 011 ACTION="http://clio.mit.csu.edu.au:88/cgi-bin/ombook/regoform"> 012 013 Your Name: <INPUT NAME="name" TYPE=TEXT SIZE=30><BR> 014 Your Email: <INPUT NAME="email" TYPE=TEXT SIZE=30><P> 015 016 Please choose one of the following:<BR> 017 <OL type="a"> 018 <LI><INPUT NAME="regotype" TYPE=RADIO VALUE="full" CHECKED>Full Registration 019 <LI><INPUT NAME="regotype" TYPE=RADIO VALUE="early">EarlyBird Registration 020 <LI><INPUT NAME="regotype" TYPE=RADIO VALUE="student">Student Registration 021 </OL> 022 023 Submit this form: <INPUT TYPE=SUBMIT VALUE="Register"> 024 025 </FORM> 026 </BODY> 027 </HTML>
To process this data correctly, a CGI program must know the names of the expected fields. We can see there are two ordinary text fields called 'name' on line 13, and 'email' on line 14. The last field, defined on lines 18, 19 and 20, needs special attention. It is a radio button type field. Note that there are three radio buttons each defined with the same field name: 'regotype' so there is only one field called 'regotype'. Having multiple buttons with the same field name makes them mutually exclusive - the client can choose only one of them.
It is important that, when mutually exclusive buttons are used, one of them is automatically selected as the default - this is done with the 'CHECKED' attribute value on line 18. If the form is submitted with no radio button selected, the CGI program will receive neither the field name 'regotype' nor a value for it.
The full listing of the CGI program to process the registration form data is given below, followed by an explanation of its features, process by process.
[Code Sample: C06T08a.xom]
001 ; An OmniMark CGI program which processes a form
002 ; The program writes the data to a log file and
003 ; responds to the client.
004
005 declare #main-input has unbuffered
006 declare #main-output has binary-mode
007
008 include "omutil.xin"
009 include "omcgi.xin"
010 include "omdate.xin"
011
012 global stream formData variable initial-size 0
013 global stream logFileName initial {"regolog.xml"}
014
015 ;; Error Message Function
016 define function showError as
017 output "<HTML>%n"
018 output "<HEAD>%n"
019 output "<TITLE>Error 1</TITLE>%n"
020 output "</HEAD>%n"
021 output "<BODY>%n"
022 output "<H2>Error in form processing</H2>%n"
023 output "Can't process registration - form data is invalid.%n"
024 output "</BODY>%n"
025 output "</HTML>%n"
026
027 process-start
028 cgiGetQuery into formData
029 output "Content-type: text/html%n%n"
030
031
032 ;; deal with bad forms
033 process
034 do unless number of formData = 3
035 showError
036 halt
037 done
038
039 do unless ( formData has key "name"
040 AND formData has key "email"
041 AND formData has key "regotype" )
042 showError
043 halt
044 done
045
046 ;; write data to a log file
047 process
048 local stream fileStream
049 reopen fileStream as file logFileName
050 put fileStream "<REGO>%n"
051 put fileStream "<NAME>" || formData key "name" || "</NAME>%n"
052 put fileStream "<EMAIL>" || formData key "email" || "</EMAIL>%n"
053 put fileStream "<TYPE>" || formData key "regotype" || "</TYPE>%n"
054 put fileStream "</REGO>%n%n"
055 close fileStream
056
057 ;; respond to client
058 process
059 output "<HTML>%n"
060 output "<HEAD>%n"
061 output "<TITLE>Registration Feedback</TITLE>%n"
062 output "</HEAD>%n"
063 output "<BODY>%n"
064 output "<H2>Registration Feedback</H2>%n"
065 output "<P>Thanks for registering, a full conference kit will"
066 output " be sent to you by email shortly.</P>%n"
067 output "</BODY>%n"
068 output "</HTML>%n"
Starting, obviously, at the 'process-start' rule on line 27, the program captures all the form data in the shelf 'formData' and replies to the client's browser with the content-type header.
The process rule at line 33 checks that the expected form data has arrived and that all the fields are present. Line 34 tests that there are three fields and lines 39, 40 and 41 check that all the field names are correct. If either of these tests fails, we cannot process the form correctly and so we deliver a small error message (formatted in HTML) to the user. The function 'showError' defined on line 16 delivers the error message. Note that after we show the error message we halt the entire program with the 'halt' action on either line 36 or line 43.
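The two checks can be stated compactly in any language. The Python sketch below is a language-neutral restatement of the same logic, not a replacement for the OmniMark rule. The names 'REQUIRED_FIELDS' and 'form_is_valid' are invented for this example, and the form data is assumed to have been decoded into a dictionary already.

```python
# Field names the registration form is expected to send.
REQUIRED_FIELDS = ("name", "email", "regotype")


def form_is_valid(form_data: dict) -> bool:
    """Accept the form only if it holds exactly the three expected fields."""
    # First test: the right number of fields arrived.
    if len(form_data) != len(REQUIRED_FIELDS):
        return False
    # Second test: every expected field name is present.
    return all(field in form_data for field in REQUIRED_FIELDS)


if __name__ == "__main__":
    good = {"name": "Rhoda Dendron",
            "email": "flower@someplace.org.au",
            "regotype": "early"}
    print(form_is_valid(good))
    # A form submitted with no radio button selected has no 'regotype' field.
    print(form_is_valid({"name": "Rhoda Dendron",
                         "email": "flower@someplace.org.au"}))
```

Taken together, the two tests mean that a form with a missing, extra or misnamed field is rejected.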
The process starting on line 47 writes the form data onto a file called 'regolog.xml'. The log file is reopened (line 49) so each registration's information is appended to the file. A CGI program is only allowed to write to a file on the host machine if the file already exists and is writable by the web server. So, before the program is run for the first time, the file 'regolog.xml' must be created and have its write permission enabled for the user account under which the web server runs.
In this program, the format of the data written to the log file is XML. The XML starttags and endtags are written so they surround the form data on lines 50 through 54.
The final process (starting on line 58) outputs a response to the client in simple HTML.
When the form is submitted, with Rhoda's information, the file 'regolog.xml' contains
<REGO>
<NAME>Rhoda Dendron</NAME>
<EMAIL>flower@someplace.org.au</EMAIL>
<TYPE>early</TYPE>
</REGO>
and the response the user sees on their browser is
Each time a web server calls a CGI program, it creates several environment variables and sets them with appropriate values. These variables and their values can easily be read by an OmniMark CGI program by using the function 'cgiGetEnv' which comes from the 'omcgi.xin' library.
The function places all the environment variable names and values into a shelf - each key in the shelf is the name of a variable and the shelf element holds the associated value. Capturing this data into a shelf is done in the same way as we captured form data in the previous topic.
Different web servers, running on different machines and acting on behalf of different clients may set more or fewer environment variables.
Below is a program which outputs all the names and values of all the environment variables which are set by my web server when it calls a CGI program.
[Code Sample: C06T09a.xom]
001 ; An OmniMark CGI program which outputs
002 ; the names and values of all environment variables
003 ; set by the web server which calls it.
004
005 declare #main-input has unbuffered
006 declare #main-output has binary-mode
007
008 include "omutil.xin"
009 include "omcgi.xin"
010 include "omdate.xin"
011
012 global stream envData variable initial-size 0
013
014 process-start
015 cgiGetEnv into envData
016 output "Content-type: text/html%n%n"
017
018 process
019 output "<HTML>%n"
020 output "<HEAD>%n"
021 output "<TITLE>Environment variables</TITLE>%n"
022 output "</HEAD>%n"
023 output "<BODY>%n"
024 output "<H2>CGI Environment variables and their values.</H2>%n"
025 output "<PRE>%n"
026
027 repeat over envData
028 output key of envData || " = "
029 output envData || "%n"
030 again
031
032 output "</PRE>%n"
033 output "</BODY>%n"
034 output "</HTML>%n"
The stream shelf 'envData' is declared on line 12 ready to hold the environment variables and values. On line 15, we call the function 'cgiGetEnv' and capture the result in the shelf. The lines 27 through 30 output the keys and values of the shelf - these appear in a 'PRE' (preformatted) HTML element so the client sees them in a monospaced font with one variable name and value per line.
The results which appear on my browser contain 19 variables and values which are:
DOCUMENT_ROOT = /local/WWW
GATEWAY_INTERFACE = CGI/1.1
HTTP_ACCEPT = image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*
HTTP_ACCEPT_CHARSET = iso-8859-1,*,utf-8
HTTP_ACCEPT_LANGUAGE = en
HTTP_CONNECTION = Keep-Alive
HTTP_HOST = clio.mit.csu.edu.au:88
HTTP_USER_AGENT = Mozilla/4.61 [en] (WinNT; I)
REMOTE_ADDR = 137.166.17.134
REMOTE_PORT = 1095
REQUEST_METHOD = GET
REQUEST_URI = /cgi-bin/ombook/cgiEnv
SCRIPT_FILENAME = /local/apache/cgi-bin/ombook/cgiEnv
SCRIPT_NAME = /cgi-bin/ombook/cgiEnv
SERVER_ADMIN = echopping@csu.edu.au
SERVER_NAME = clio.mit.csu.edu.au
SERVER_PORT = 88
SERVER_PROTOCOL = HTTP/1.0
SERVER_SOFTWARE = Apache/1.3.6 (Unix)
When you run the program on your host, with your browser, you will get similar but not identical results.
Be careful when accessing any individual environment variable. Remember that the keys in OmniMark shelves are case sensitive and that you can't assume a particular variable will be set by a server.
Suppose you wanted to include the email address of the web server administrator in your response to a client. If you boldly write an action like this:
output "Contact the administrator at " || envData key "server_admin"
then an OmniMark error would occur - there is no item in the 'envData' shelf with a key of 'server_admin' because the lowercase key name is not valid.
If you, again boldly, write
output "Contact the administrator at " || envData key "SERVER_ADMIN"
and this variable has not been set by the server, you will get the same error from OmniMark. The safest way to deal with this is to guard against accessing a non-existent key...
do when envData has key "SERVER_ADMIN"
output "Contact the administrator at " || envData key "SERVER_ADMIN"
else
output "Have a nice day."
done
Since OmniMark has strong built-in support for SGML and XML processing, it is easy to incorporate this into our CGI programs. I provided an example earlier (in topic 8) which wrote some form data in XML format onto a log file. The data represented registrations for a conference. Here I present a program which reads the conference registration log file and delivers an HTML version of it to the client. The first version of the program delivers the entire list of registrations, the second version allows the client to search for a particular person by specifying part of their name as a command-line argument when the program is called.
Below is a copy of the contents of a sample log file which contains the registration details of several (fictitious) people. It is this file which will be processed by the CGI program. The file's name is 'regolog.xml'.
<!-- Registration Log File -->
<REGO>
  <NAME>Rhoda Dendron</NAME>
  <EMAIL>flower@someplace.org.au</EMAIL>
  <TYPE>early</TYPE>
</REGO>
<REGO>
  <NAME>Sean Lamb</NAME>
  <EMAIL>sheepy@woolmark.com.au</EMAIL>
  <TYPE>full</TYPE>
</REGO>
<REGO>
  <NAME>Hugh Jass</NAME>
  <EMAIL>jassy@chairs.com</EMAIL>
  <TYPE>student</TYPE>
</REGO>
<REGO>
  <NAME>Lorrie Driver</NAME>
  <EMAIL>trucker@mobile.edu.au</EMAIL>
  <TYPE>early</TYPE>
</REGO>
<REGO>
  <NAME>Wayne Maker</NAME>
  <EMAIL>maker_w@weather.net.au</EMAIL>
  <TYPE>student</TYPE>
</REGO>
A header file (called 'regolog.header') containing a document type declaration, a DTD and a root element is also made available in the same directory. It contains:
<!DOCTYPE ENTRIES[
<!ELEMENT ENTRIES - o (REGO)+>
<!ELEMENT REGO - - (NAME,EMAIL,TYPE)>
<!ELEMENT (NAME,EMAIL,TYPE) - - (#PCDATA)>
]>
<ENTRIES>
The first version of the CGI program scans these files and uses element rules (see Chapter 4) to translate the information into HTML. It looks a little long but has a simple structure, discussed below. It is called 'showRego.xom'.
[Code Sample: C06T10a.xom]
001 ; An OmniMark CGI program to translate SGML data
002 ; to HTML
003
004 declare #main-input has unbuffered
005 declare #main-output has binary-mode
006
007 include "omutil.xin"
008 include "omcgi.xin"
009 include "omdate.xin"
010
011 global stream logFileHeader initial {"regolog.header"}
012 global stream logFileName initial {"regolog.xml"}
013 global counter numPeople initial {0}
014
015 process-start
016 output "Content-type: text/html%n%n"
017
018 process
019 output "<HTML>%n"
020 output "<HEAD>%n"
021 output "<TITLE>Registration Data</TITLE>%n"
022 output "</HEAD>%n"
023 output "<BODY>%n"
024 output "<H2>Registration Data</H2>%n"
025 output "<P>The following people have submitted registrations"
026 output " for the conference</P>%n"
027
028 do sgml-parse document
029 scan file logFileHeader || file logFileName
030 using group showPeople
031 output "%c"
032 done
033
034 output "</BODY>%n"
035 output "</HTML>%n"
036
037
038 group showPeople
039 element entries
040 output "<HR>%n"
041 output "<DL>%n"
042 output "%c"
043 output "</DL>%n"
044 output "<HR>%n"
045 output "There are %d(numPeople) registrations listed.%n"
046 output "<HR>%n"
047
048 element rego
049 increment numPeople
050 output "<DT>%d(numPeople): "
051 output "%c"
052
053 element name
054 output "<STRONG>%c</STRONG>%n"
055
056 element email
057 local stream address
058 set address to "%c"
059 output "<DD><EM>Email:</EM> "
060 output "<A HREF=%"mailto:%g(address)%">%g(address)</A>%n"
061
062 element type
063 local stream theContent
064 set theContent to "%c"
065 do when theContent matches "full"
066 output "<DD>Full Registration%n"
067 else when theContent matches "early"
068 output "<DD>EarlyBird Registration%n"
069 else
070 output "<DD>Student Registration%n"
071 done
072
073
074 element #implied
075 suppress
The code which initiates the processing of the registration data file is shown above in lines 28 through 32. To make a legal SGML document, the header file and the log file are concatenated and scanned on line 29.
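The reason for the concatenation is that the log file on its own is a run of sibling REGO elements with no single root, so a parser cannot accept it as a document. The Python sketch below (not OmniMark) shows the same idea with the standard library's XML parser. The names 'SAMPLE_LOG' and 'parse_registrations' are hypothetical, and the sample holds only two of the registrations. Note one difference: the SGML header allows the ENTRIES end tag to be omitted ('- o'), whereas XML has no such shortcut, so the sketch closes the root element explicitly.

```python
import xml.etree.ElementTree as ET

# Two entries copied from the sample log, for illustration only.
SAMPLE_LOG = """<!-- Registration Log File -->
<REGO><NAME>Rhoda Dendron</NAME><EMAIL>flower@someplace.org.au</EMAIL><TYPE>early</TYPE></REGO>
<REGO><NAME>Sean Lamb</NAME><EMAIL>sheepy@woolmark.com.au</EMAIL><TYPE>full</TYPE></REGO>
"""

def parse_registrations(log_text):
    """Wrap the log in a root element, parse it, and return one dict per REGO."""
    # Supplying <ENTRIES>...</ENTRIES> plays the part of the header file.
    root = ET.fromstring("<ENTRIES>" + log_text + "</ENTRIES>")
    return [
        {field.tag.lower(): (field.text or "") for field in rego}
        for rego in root.findall("REGO")
    ]

def log_is_document_by_itself(log_text):
    """Report whether the bare log parses without the wrapping root."""
    try:
        ET.fromstring(log_text)
        return True
    except ET.ParseError:
        return False
```

With the sample above, 'parse_registrations' yields two dictionaries with the keys 'name', 'email' and 'type', while 'log_is_document_by_itself' reports that the bare log is rejected.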
Although there is only one set of element rules in the program, I have chosen to place them in an element group and specify the group name explicitly, on line 30. This makes it easy to extend the program to include other groups for other types of processing when necessary.
The group of element rules starting on line 38 simply outputs the registration data wrapped in HTML tags for display by the client's browser. When I call the program from my browser I get the following display:
The following is a modification of the above program. It accepts a single command-line argument from the client. It then searches the registration log and displays the details of people who have a name containing the command line argument. The program's features are discussed after the code.
[Code Sample: C06T10b.xom]
001 ; An OmniMark CGI program which accepts a single
002 ; command line argument and searches for people
003 ; whose name contains it.
004
005 declare #main-input has unbuffered
006 declare #main-output has binary-mode
007
008 include "omutil.xin"
009 include "omcgi.xin"
010 include "omdate.xin"
011
012 ;; Error Message Function
013 define function showError as
014 output "<HTML>%n"
015 output "<HEAD>%n"
016 output "<TITLE>Error</TITLE>%n"
017 output "</HEAD>%n"
018 output "<BODY>%n"
019 output "<H2>Error in arguments</H2>%n"
020 output "<P>Can't search for people. This program requires"
021 output " a single command line argument.</P>%n"
022 output "</BODY>%n"
023 output "</HTML>%n"
024
025
026 global stream logFileHeader initial {"regolog.header"}
027 global stream logFileName initial {"regolog.xml"}
028 global counter numPeople initial {0}
029 global stream searchData
030 global switch foundPerson
031
032 process-start
033 output "Content-type: text/html%n%n"
034
035 ; deal with error in arguments
036 process
037 do unless number of #command-line-names = 1
038 showError
039 halt
040 else
041 set searchData to #command-line-names item 1
042 done
043
044
045 process
046 output "<HTML>%n"
047 output "<HEAD>%n"
048 output "<TITLE>Registration Search</TITLE>%n"
049 output "</HEAD>%n"
050 output "<BODY>%n"
051 output "<H2>Registration Search</H2>%n"
052
053 do sgml-parse document
054 scan file logFileHeader || file logFileName
055 using group findPeople
056 output "%c"
057 done
058
059 output "</BODY>%n"
060 output "</HTML>%n"
061
062 group findPeople
063 element entries
064 output "<HR>%n"
065 output "<DL>%n"
066 output "%c"
067 output "</DL>%n"
068 output "<HR>%n"
069 output "There are %d(numPeople) registrations found.%n"
070 output "<HR>%n"
071
072 element rego
073 output "%c"
074
075 element name
076 local stream theContent
077 set theContent to "%c"
078 deactivate foundPerson
079 repeat scan theContent
080 match ul"%g(searchData)"
081 activate foundPerson
082 increment numPeople
083 output "<DT><STRONG>%g(theContent)</STRONG>%n"
084 exit
085 match any
086 again
087
088 element email
089 local stream address
090 set address to "%c"
091 do when foundPerson
092 output "<DD><EM>Email:</EM> "
093 output "<A HREF=%"mailto:%g(address)%">%g(address)</A>%n"
094 done
095
096 element type
097 local stream theContent
098 set theContent to "%c"
099 do when foundPerson
100 do when theContent matches "full"
101 output "<DD>Full Registration%n"
102 else when theContent matches "early"
103 output "<DD>EarlyBird Registration%n"
104 else
105 output "<DD>Student Registration%n"
106 done
107 done
108
109 element #implied
110 suppress
I've called this program 'searchRego.xom'. In lines 36 through 42, we check the built-in shelf '#command-line-names' to see how many items are in it. If there is not exactly one item, we output an error page and halt. When there is exactly one command line argument, we store it into the global variable 'searchData' and use it later when scanning the content of the NAME element.
On lines 79 through 86, when in the NAME element rule, we scan all the characters in the element's content - that is, we scan through every person's name. If the 'searchData' pattern is matched (line 80), we set a boolean variable 'foundPerson' to true, increment the number of people found and exit the scan.
In each of the EMAIL and TYPE element rules we only output the content if the boolean variable 'foundPerson' is true.
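Stripped of the element rules, the search amounts to a case-insensitive 'contains' test on each name (the 'ul' modifier on line 80 makes the match ignore case). A minimal Python sketch (not OmniMark) of that test follows. The function name 'find_people' and the list-of-dictionaries layout are assumptions made for the illustration; the names are taken from the sample log.

```python
def find_people(registrations, search_data):
    """Return the registrations whose name contains search_data, ignoring case."""
    needle = search_data.lower()
    return [
        person for person in registrations
        if needle in person["name"].lower()
    ]

# Names from the sample log file; the other fields are left out for brevity.
SAMPLE_PEOPLE = [
    {"name": "Rhoda Dendron"},
    {"name": "Sean Lamb"},
    {"name": "Hugh Jass"},
    {"name": "Lorrie Driver"},
    {"name": "Wayne Maker"},
]
```

Searching the sample for 'amb' or 'AMB' selects only Sean Lamb, while 'er' selects both Lorrie Driver and Wayne Maker.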
When running OmniMark as a normal console command, like this
omnimark -sb myProgram.xom one two three
any words typed in the command which do not have an option symbol ('-') are considered by OmniMark to be command-line arguments. This is the case with the words 'one', 'two' and 'three' above.
When calling a CGI program from a browser, we must append the first command-line argument to the location of the CGI program directly after a '?' (question mark) symbol. Any subsequent arguments are appended with plus signs ('+'). For example, a call to the above CGI program with no arguments looks like this:
A syntactically legal call with two arguments, as shown below, results in the same error message:
Finally, a legal call, with a single argument of 'amb', searches for all registered people whose name contains the argument...
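The way a web server turns the part of the address after the '?' into command-line arguments can be sketched as follows. This is a Python illustration (not OmniMark) of the usual CGI convention: a query that contains no '=' sign is split at each '+' and the pieces are passed to the program as arguments. The function name 'cgi_command_line' is hypothetical, and the addresses used with it are assumptions pieced together from the host and directory in the environment listing earlier, not the actual calls made above.

```python
from urllib.parse import urlsplit, unquote

def cgi_command_line(url):
    """Return the command-line arguments a server would derive from url."""
    query = urlsplit(url).query
    # No query, or a form-style query containing '=', gives no arguments.
    if not query or "=" in query:
        return []
    # Arguments are separated by '+'; each one is then URL-decoded.
    return [unquote(word) for word in query.split("+")]

# Hypothetical base address, assumed from HTTP_HOST and SCRIPT_NAME above.
BASE = "http://clio.mit.csu.edu.au:88/cgi-bin/ombook/searchRego"
```

For example, 'BASE' on its own produces no arguments, 'BASE' followed by '?one+two' produces two (so the program reports an error in both cases), and 'BASE' followed by '?amb' produces the single argument the program expects.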
Even though CGI programs are designed to be executed by a web server, it is still possible to run them from the console where their output is just displayed on your screen. Unfortunately it is quite difficult to simulate the delivery of form data when doing this. Sometimes you have to actually call the programs from your browser, and sometimes they don't run correctly and you get an error message from the server. In these cases, any error messages output by OmniMark go onto the standard error stream where they are piped by the web server onto the end of the server's error log file.
At times, a CGI programmer must check the server's error log file to see what the error messages say. This is not difficult, but you may have to ask your web server administrator exactly where the server's error log file is on your system. On my system, the command:
tail -20 /local/apache/logs/error_log
writes the last 20 lines of the server's error log file onto my screen from which I can usually figure out why my CGI program went off into the weeds.
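If the 'tail' command is not available on your system, the same check can be done with a few lines of script. Here is a minimal Python sketch; the function name 'last_lines' is hypothetical, and the path you pass to it will depend on where your server keeps its error log.

```python
from collections import deque

def last_lines(path, count=20):
    """Return the last 'count' lines of the file at 'path' as a list."""
    # A deque with maxlen keeps only the most recent lines as the file is read.
    with open(path, encoding="utf-8", errors="replace") as log:
        return list(deque(log, maxlen=count))
```

On my system the call would be last_lines("/local/apache/logs/error_log"), after which each returned line can be printed in turn.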
So, CGI programming can be pretty frustrating at times but when your programs work you start to see some of the real power of 'programming the web'. Much of the impressive work done by modern web sites is made possible by CGI programs or technologies which are allied to it.
Because CGI programs are more demanding, harder to debug and require a larger infrastructure, there are only two tasks below. However, they cover many of the principles discussed in this chapter.
This task is quite small but is followed up by the next task. You might like to read both tasks before starting this one.
Write an OmniMark CGI program called 'getForm' which accepts no input but delivers a web form as output. The idea is that when this program is called, the client gets a form to fill in on their browser. The program's structure can be similar to that shown in the example program in topic 6.5.2. Dealing with the form itself is covered in the next task.
The HTML data which specifies the web form should contain the actual address of another CGI program in the ACTION attribute. At this stage it is recommended that you 'hand-code' the address into your output - using the same directory as the 'getForm' program but specifying a program name of 'processForm' - which I ask you to write in the next task.
It does not really matter to me what data your form collects, but my sample solution will output a form which asks the client for their choice of Pizza, and allows them a choice of some optional extra toppings.
Write a program called 'processForm' which accepts the data posted by the client in response to the form given in the above task.
The output of the program should be HTML which simply provides a 'thank you' message for the client and confirms the type of Pizza they have ordered and any extra toppings they have specified. Note that this program will not get information about toppings which have not been checked on the form by the client; so, do not assume that your form data shelf will contain keys for toppings.
The program should contain tests on the form data to ensure that it has arrived and contains the correct field names.
The following program outputs a form which allows a client to order a Pizza. Note that the ACTION attribute's value points to another program called 'processForm' (on line 26). Note also that on line 26, I have used the escape symbol '%' to get literal quotation marks around the value.
[Code Sample: C06S01.xom]
001 ; An OmniMark CGI program which outputs
002 ; an HTML form. The form requests a choice
003 ; of Pizza.
004
005 declare #main-input has unbuffered
006 declare #main-output has binary-mode
007
008 include "omutil.xin"
009 include "omcgi.xin"
010 include "omdate.xin"
011
012 process-start
013 output "Content-type: text/html%n%n"
014
015 ; output form
016 process
017 output "<HTML>%n"
018 output "<HEAD>%n"
019 output "<TITLE>Pizza Time!</TITLE>%n"
020 output "</HEAD>%n"
021 output "<BODY>%n"
022 output "<H2>The OmniMark Pizza Gallery</H2>%n"
023 output "Choose your Pizza.<BR>%n"
024
025 output "<FORM METHOD=POST%n"
026 output "ACTION=%"/cgi-bin/ombook/processForm%">%n"
027
028 output "<SELECT NAME=pizzatype>%n"
029 output "<OPTION VALUE=aussie>Aussie Pizza (Bacon, Eggs and Kangaroo)%n"
030 output "<OPTION VALUE=supreme>Supreme (the lot)%n"
031 output "<OPTION VALUE=italian>Italian (Onion, Olives)%n"
032 output "<OPTION VALUE=irish>Irish (Potato, Guinness)%n"
033 output "</SELECT><P>%n"
034
035 output "You can select extra toppings if you wish.<BR>%n"
036 output "<INPUT NAME=anch TYPE=checkbox>Extra Anchovies<BR>%n"
037 output "<INPUT NAME=pina TYPE=checkbox>Extra Pineapple<BR>%n"
038 output "<INPUT NAME=garl TYPE=checkbox>Garlic Sauce<P>%n"
039
040 output "<INPUT TYPE=SUBMIT VALUE=%"Place Order%">%n"
041
042 output "</FORM>%n"
043 output "</BODY>%n"
044 output "</HTML>%n"
Here is a picture of my browser when I call the above program:
This program captures the data from the above form and outputs a confirmation message. The form data shelf is checked before processing is attempted and error messages are output if the form data is not as expected. Note how access to the form fields 'anch', 'pina' and 'garl' is protected with selections on lines 60 through 70.
[Code Sample: C06S02.xom]
001 ; An OmniMark CGI program which accepts
002 ; data from the Pizza form and outputs
003 ; a confirmation message.
004
005 declare #main-input has unbuffered
006 declare #main-output has binary-mode
007
008 include "omutil.xin"
009 include "omcgi.xin"
010 include "omdate.xin"
011
012 ;; Error Message Function
013 define function showError( value counter enum ) as
014 output "<HTML>%n"
015 output "<HEAD>%n"
016 output "<TITLE>Error %d(enum)</TITLE>%n"
017 output "</HEAD>%n"
018 output "<BODY>%n"
019 output "<H2>Error number %d(enum)</H2>%n"
020 do when enum = 1
021 output "Incorrect number of form fields%n"
022 else when enum = 2
023 output "Incorrect field names received%n"
024 else
025 output "Some weird error%n"
026 done
027 output "</BODY>%n"
028 output "</HTML>%n"
029
030 global stream formData variable initial-size 0
031
032 process-start
033 cgiGetQuery into formData
034 output "Content-type: text/html%n%n"
035
036 ; deal with form errors
037 process
038 do unless number of formData >= 1 ; pizza type plus toppings
039 showError( 1 )
040 halt
041 done
042
043 do unless formData has key "pizzatype"
044 showError( 2 )
045 halt
046 done
047
048 ; output confirmation
049 process
050 output "<HTML>%n"
051 output "<HEAD>%n"
052 output "<TITLE>Order Confirmation!</TITLE>%n"
053 output "</HEAD>%n"
054 output "<BODY>%n"
055 output "<H2>Thanks for your order.</H2>%n"
056 output "<P>We are now cooking your "
057 output "<STRONG>" || formData key "pizzatype" || "</STRONG>%n"
058 output " Pizza</P>%n"
059
060 do when formData has key "anch"
061 output "Lots of anchovies are being added!<BR>%n"
062 done
063
064 do when formData has key "pina"
065 output "Ripe pineapple slices are going on.<BR>%n"
066 done
067
068 do when formData has key "garl"
069 output "We are smothering your pizza in garlic sauce.<BR>%n"
070 done
071
072 output "</BODY>%n"
073 output "</HTML>%n"
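To see why the topping fields must be guarded, it helps to look at what a browser actually posts. The Python sketch below (not OmniMark) decodes a posted form body with the standard library. The function names 'decode_form' and 'toppings_ordered' are hypothetical, and the sketch makes no claim about how 'omcgi.xin' fills the 'formData' shelf internally. It relies on two facts about HTML forms: a checkbox with no VALUE attribute is sent with the value 'on', and an unchecked checkbox is not sent at all.

```python
from urllib.parse import parse_qs

def decode_form(body):
    """Decode a URL-encoded form body into a dict of field name to value."""
    # parse_qs returns a list of values per field; the Pizza form sends one each.
    return {name: values[0] for name, values in parse_qs(body).items()}

def toppings_ordered(form):
    """List the extra toppings whose checkbox fields are present in the form."""
    labels = {"anch": "anchovies", "pina": "pineapple", "garl": "garlic sauce"}
    # Testing for the key first mirrors the 'has key' guards in the OmniMark code.
    return [label for field, label in labels.items() if field in form]
```

A client who chooses the Aussie pizza and ticks only the anchovies box posts the body 'pizzatype=aussie&anch=on', so the decoded form has no 'pina' or 'garl' key at all.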