Introduction to CGI/Perl
Steven E. Brenner and Edwin Aoki
[Browser screenshot: a page titled "Hello world with Perl" containing the text "Howdy, world!"]
Figure 1.4 Despite the fact that this page was constructed by a script, it looks identical to the earlier static page.
Even if you've never written a Perl script before, the examples in this chapter could very well be enough to get you started. However, they are no substitute for a book solely about Perl itself, because we focus more on Perl features useful for CGI scripting than on the language as a whole. We also introduce cgi-lib.pl, a handy library that makes the task of creating CGI scripts easier. Those more experienced with the language may wish to breeze through this chapter.
Chapter 2: CGI Introductions with Perl
Perl Basics

Perl is a funny language, rife with apparent contradictions. It will appear vaguely familiar to C programmers and shell scripters alike, but will seem at the same time relatively bizarre. Normally, people consider Perl to be an interpreted language because program execution basically starts at the top and continues line-by-line. But when a Perl program is run, it is actually first parsed and compiled, and only then is it executed.[1] This approach provides some of the efficiency of compilation while permitting the convenience and flexibility of interpreted languages.

Perl's power derives from a combination of the best properties of many different languages. For example, as with most interpreted scripting languages, commands in a Perl program need not be included within a function. Each line in a script is run from top to bottom; the first line of the script ordinarily will be the first to be executed. Contrast this with a C program, in which all commands must reside within functions, and program execution always begins with the main function. Like C, however, Perl is a free-form language. You can generally put as many statements as you like on a single line and put line breaks wherever you want. To tell where each statement ends and another begins, each statement must be terminated with a semicolon.[2]

This flexibility carries over to Perl functions and variables as well. Perl variables come in many different flavors, but all of them are case sensitive, don't need to be declared in advance of use, and are global by default. That is, unless you explicitly indicate otherwise, each variable will be shared across all the functions in a script. And speaking of functions, you don't need to declare them in advance either. Furthermore, many functions in Perl do not require you to enclose parameters in parentheses, a necessity for many programming languages.[3] Both functions and variables are covered in greater detail in the text and sidebars that follow.
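A minimal sketch of these properties, assuming a present-day perl interpreter. The names $greeting, $target, $message, and show_greeting are hypothetical and chosen only for illustration. The script deliberately omits `use strict`, which many modern scripts enable and which would require the variables to be declared.

```perl
#!/usr/bin/perl
# Execution simply starts at the first statement; no main function is needed.
$greeting = "Howdy"; $target = "world";   # two statements on one line

# print works with or without parentheses around its arguments.
print "Starting at the top\n";
print("Still going\n");

# show_greeting is defined further down, yet it can be called here, and it
# sees $greeting and $target because variables are global by default.
$message = &show_greeting;
print "$message\n";

sub show_greeting {
    return "$greeting, $target!";
}
```

The subroutine is found even though its definition comes after the call, because the whole file is compiled before any statement runs.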
[1] Nonetheless, it is possible to dynamically create a segment of Perl code inside a program and then direct Perl to execute that as well.
[2] Of course, rules were meant to be broken, so these guidelines aren't always true. For example, there are some special instances (such as formats) where the line breaks are significant.
[3] Sometimes, leaving off the parentheses can aid readability (as is typically the case with the print function), but it's usually better to include them.
Though easy to learn, the Perl language is very capable, as we'll soon see. Before diving in to look at any more Perl code, however, we should make a comment about comments. As in most scripting languages, Perl's comments are line-based, beginning with a hash sign[4] (#) and continuing to the end of the line. There is no way of making a true multi-line comment other than by putting a # on each line. Now that we have bragged a bit about Perl's features, we can take a look at how we can use Perl to create Web content.
Easier Introductions: Hello World with Functions

You may recall that our first CGI Perl script, howdy.cgi (Listing 1.2), simply printed, line-by-line, a static HTML document. Unfortunately, because the program ended up being more hassle (and keystrokes) than the HTML text it replaced, it probably failed to convince you of the virtues of Perl. Why would anyone write a program such as howdy.cgi? Nobody would.[5]

However, with the addition of functions, scripts can be useful even for generating pages whose content does not change. Functions can eliminate much of the drudgery of producing syntactically-correct HTML text and automate much of the page-creation process, thus saving typing, improving consistency, and reducing the possibility of error. Listing 2.1, which shows the text of hey.cgi, illustrates what we mean. Though hey.cgi bears little resemblance to our earlier script howdy.cgi, it produces similar results, shown in Figure 2.1.

[4] Like so many of these special characters, the # symbol is called a number of different names. The most common of these include "hash," "pound," "tic-tac-toe," "sharp," and, of course, "number."
[5] Except, perhaps, the authors of a book about CGI and Perl.
Listing 2.1 The script hey.cgi uses functions to aid in the creation of a simple HTML page.

#!/usr/local/bin/perl
require "cgi-lib.pl";            # pull in the library's subroutines

MAIN: {
  print &PrintHeader;            # Content-type header plus blank line
  print &HtmlTop("Hello world!");  # opening tags, title, and heading
  print "Hey there, I'm functional!\n";
  print &HtmlBot;                # closing tags
}
[Browser screenshot: a page titled "Hello world!" containing the heading "Hello world!" and the text "Hey there, I'm functional!"]
Figure 2.1 Output of hey.cgi. Despite the obvious differences in the source code, hey.cgi and howdy.cgi (Figure 1.4) produce similar results for users.
In order to see how hey.cgi comes about its results, we'll step through the source code (Listing 2.1) line by line. As with howdy.cgi, the Perl interpreter runs through the script from top to bottom. However, the first executed line,

require "cgi-lib.pl";

does more than meets the eye. It essentially tells the Perl interpreter to treat the contents of the file cgi-lib.pl as if they were included in our script at this point. This treatment is analogous to the way the #include directive works for the C language preprocessor. By including libraries such as cgi-lib.pl, we can concentrate on the specifics of what our script needs to do without getting bogged down in the mundane details; we delegate those to the library. The complete source code of cgi-lib.pl is listed in Appendix C, and we'll discuss it in greater detail soon. For now, it's enough to know that it is one of the most popular libraries to assist in the writing of CGI scripts, and as we'll see, it defines a number of convenient functions.
Returning to hey.cgi, the next thing we notice is that instead of a simple list of commands, program statements are placed within a program block. Blocks are regions of code enclosed in curly braces, optionally identified by a label. In this case, we've chosen the nonfunctional label MAIN to indicate that the code here is the core of the program. Unlike the main function in C, there is nothing special about the name MAIN; as far as the interpreter is concerned, it is just like any other block and not necessarily the starting point for the script. In fact, the block in this example has virtually no effect on the program's execution and exists primarily for convenience and readability. When we introduce subroutines, however, we'll see that blocks can be used to alter the flow of execution and provide enclosures for local variables.
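As a small sketch of how a labeled block can alter the flow of execution and enclose a variable, consider the following. It assumes Perl 5, where `my` gives block-scoped variables; the @log array and the $inside variable are hypothetical names used only here.

```perl
use strict;
use warnings;

my @log;

MAIN: {
    push @log, "start";
    my $inside = "visible only inside the block";   # scoped to MAIN
    push @log, "working";
    last MAIN if @log == 2;      # leave the labeled block early
    push @log, "never reached";
}

# $inside no longer exists here; @log was declared outside the block.
push @log, "after block";
print join(", ", @log), "\n";
```

A bare block behaves like a loop that runs once, which is why `last` followed by the label can be used to exit it.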
Now we come to the actual program code itself. As we saw in Chapter 1, the first thing that a script must output is a header indicating the Content-type, followed by a blank line. Our howdy.cgi program just printed this out directly:

print "Content-type: text/html\n\n";

This doesn't seem to be onerous, but in practice, remembering the appropriate syntax of the Content-type header and typing it properly has proven famously problematic. Therefore hey.cgi generates the header using the slightly more mnemonic statement:

print &PrintHeader;
This line accomplishes the same thing as manually printing the Content-type header, but in a more convenient manner. The ampersand symbol (&) that precedes its name indicates that PrintHeader is a subroutine function; the actual code of the function is in cgi-lib.pl. When called, PrintHeader returns the appropriate header line as well as the following blank line but does not actually print them, despite its name. That task is performed by the print statement, which sends the text to the standard output (called STDOUT in Perl). This output is then received by the Web server in the manner described in Chapter 1.

Once the header has been produced, the next requirement of an HTML-generating script is to output the text that begins the page. The standard opening tags are needed at the top of each HTML page and hardly ever change from document to document, so they're perfect candidates for a function. In hey.cgi, a single call to HtmlTop replaces the majority of the print statements needed to output these standard tags, that is, all of the work done by the first couple of lines of howdy.cgi. HtmlTop takes a single parameter, in our case the string "Hello world!", which is used as the page's title and level-one header element.

The last couple of lines of our script simply print the "Hey there" paragraph and call the HtmlBot function to output the standard tags needed to terminate the page. Though the HtmlBot function defined in cgi-lib.pl doesn't do much, we could replace it with another, more sophisticated function that creates more elaborate page endings. It could, for example, display a menu or command bar of hyperlinks, give a contact address, or even show the current date and time. By calling our new function at the bottom of each page, we would be guaranteed that all of our pages had a consistent look. Furthermore, by putting our function in a library, if we ever wanted to change the design, we would need to make the change only once in the library, and all of our pages would be updated with the new information.
Perl Variables Part I: Scalars

Most programming languages have various data types, and Perl is no exception, but like almost everything else in Perl, there's a twist. Perl's simplest and most common data type, the scalar, replaces many of the common data types found in other languages. A scalar is simply a single item: integer, floating point number, string, or Boolean value; the precise type need not be specified in advance. A nifty feature of scalars is that they automatically convert between the different types as needed:

$number = 4;            # $number is 4, as you would expect
$string = "Hello";      # a nice, friendly string
$bond = "007";
print $bond - 2;        # prints "5" -- automagic string/number conversion
$scalar = "2" . "1";    # . (dot) is string concatenate; $scalar is "21"
$scalar -= 15;          # $scalar is now 6

These last couple examples may seem odd, hearkening back to the childhood riddle, "What do you get when you put 2 and 2 together?" to which the answer was "22." Perhaps the riddle was just preparation for our eventually becoming Perl programmers. But we digress.

In case you haven't noticed, all scalar variables begin with a dollar sign ($). Though this may seem annoying (and ugly) at first, it turns out to be phenomenally useful because it prevents variable names from being confused with Perl keywords. More interestingly, it also allows the variables to be directly substituted, or interpolated, into strings:

print "The value of my scalar is $scalar.";

yields The value of my scalar is 6.

Even though words and numbers are represented using a single type of variable, there are some differences in how they can be used. For example, the symbols ==, !=, <, and > (and others) are used to test numerical relationships (e.g., 1 + 1 == 2) while the corresponding operators eq, ne, lt, and gt play the analogous role for strings ("1 + 1" ne "2").
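The conversions and comparisons described in this sidebar can be tried out with a short script. This is only a sketch, assuming Perl 5 with `use strict`; the variable names beyond $bond and $scalar are hypothetical.

```perl
use strict;
use warnings;

my $bond   = "007";
my $diff   = $bond - 2;        # string used as a number: 7 - 2
my $scalar = "2" . "1";        # concatenation gives the string "21"
$scalar -= 15;                 # "21" used as a number: 21 - 15

my $sentence = "The value of my scalar is $scalar.";
print "$diff\n$sentence\n";

# == compares numbers, eq compares strings.
my $numeric_equal = (1 + 1 == 2)     ? "yes" : "no";
my $string_differ = ("1 + 1" ne "2") ? "yes" : "no";
my $same_number   = ("1.0" == 1)     ? "yes" : "no";
my $same_string   = ("1.0" eq "1")   ? "yes" : "no";
print "$numeric_equal $string_differ $same_number $same_string\n";
```

The last two tests show why the choice of operator matters: "1.0" and 1 are the same number but not the same string.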
Listing 2.2 Compare the output of hey.cgi, shown here, to that of howdy.cgi, shown in Listing 1.3. Though they produce virtually identical HTML text, the use of functions makes hey.cgi a much neater and more compact script.

Content-type: text/html

Hello world!
Hello world!
Hey there, I'm functional!
How the Magic Works

Making use of unseen library functions in cgi-lib.pl, or any other library for that matter, may seem a bit mystical. Indeed, like magic (and like many aspects of Perl that we'll explore in this chapter), their use can range from simple sleight of hand to complex routines with many subtleties.

Simple Subroutines

Let's start our discussion of subroutines with PrintHeader, which is about as simple as a subroutine can be while still being useful.
Listing 2.3 The PrintHeader subroutine from cgi-lib.pl, though simple, is a useful way to make sure that the correct header is always generated.

sub PrintHeader {
  return "Content-type: text/html\n\n";
}
A subroutine function is just a block, preceded by the keyword sub and a name. Functions can be placed almost anywhere in a Perl program, and the sub indicates that code should not be executed when the interpreter gets to it. Instead, it will be simply tucked away for use when needed.

Unlike some languages that have both functions (which perform some action and return a value to their caller) and procedures (which perform some action but do not return anything), Perl has only the former. By default, the value returned is simply the result of the last expression executed in the subroutine. Suppose the last line in a particular function were:

$four = 2 + 2;

In this case, the function would return 4, the value of $four.[6] If this is not the desired behavior, the return statement can be used to return a specified value. Often (as in PrintHeader), the return is not strictly necessary but is used simply to make the return value explicit.[7] Additionally, the return statement can be used to cause a function to exit before reaching its last line.

[6] The value of an assignment operation in Perl is simply the result of the assignment.
[7] If we wanted to create needlessly obfuscated code, we could leave out the return keyword. Since an expression like "Content-type: text/html\n\n" evaluates to the text string itself, if this were the last statement in PrintHeader, it would operate in the same way. The return keyword, though, makes this much clearer, so we think it is good form to use it. Perl can be obscure enough (see the sections in this chapter) without introducing extraneous complexity.
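The three behaviors of return values described above can be compared side by side. The subroutine names implicit_four, explicit_header, and describe are hypothetical, and the sketch assumes Perl 5.

```perl
use strict;
use warnings;

# No return statement: the value of the last expression is returned.
sub implicit_four {
    my $four = 2 + 2;
}

# Explicit return, in the style of PrintHeader.
sub explicit_header {
    return "Content-type: text/html\n\n";
}

# return can also leave a subroutine before its last line.
sub describe {
    my ($n) = @_;
    return "negative" if $n < 0;
    return "non-negative";
}

print implicit_four(), "\n";
print describe(-3), " ", describe(3), "\n";
```

Relying on the implicit form works, but the explicit return makes the intent easier to see.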
Perl Variables Part II: Arrays

Perl can group a number of scalars together in an array; the entire array can then be referenced as a single variable. In Perl, arrays are denoted by the "at" character (@) and perhaps bear a stronger resemblance to lists in LISP than to arrays in C. Each array can contain any number of elements, which are simply scalars. For convenience, arrays can be assigned both to and from lists (denoted by parentheses):

@array = ("1", "two", 3);
($first, $second, $third) = @array;

Like scalars, arrays can be printed and interpolated into strings. Note that as in our first example above, an array need not contain scalars of the same type. This is an especially useful property when interpolating one array into another, an operation which simply inserts each of the elements of an array into another array:

@newarray = (0, @array, 4);    # @newarray contains (0, "1", "two", 3, 4)

Individual elements of an array can be accessed by their indices, which as in C, normally start at zero (although unlike C, the starting index can be altered). Also like C and many other programming languages, square brackets are used to specify the index:

$first = $array[0];            # $first is the first item in the array: "1"

A potentially confusing aspect of array elements is that since they are themselves scalar, the character that precedes the variable name and signifies its type is $, not @. This anomaly sets up the rather confusing situation in which one can have a scalar variable $array which has no relationship to the value of $array[0], a scalar that represents the first element of the array @array.

The highest index (the one which specifies the last element) of an array named @array is given by $#array, while the size of the array (generally one larger) is the scalar value of the array. These also work backwards; assigning a number to the highest index changes its size:

$last = $array[$#array];       # $#array is 2; $last is 3
$scalar = @array;              # $scalar is 3 (number of elements in @array)
$#array = 1;                   # @array is now ("1", "two")

Perl provides enormous built-in support for arrays, making them very handy data types. We've only begun to scratch the surface of all of the ways in which Perl arrays can be used; for example, the language provides a number of special functions such as shift, unshift, push, pop, and splice to manipulate array contents conveniently and efficiently. More information about these can be found in the Perl reference manual (which comes with the language) or in books exclusively about Perl.
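As a brief sketch of the array functions just mentioned, the following script applies push, pop, unshift, shift, and splice to the sidebar's example array. It assumes Perl 5; the replacement strings "deux" and "zwei" and the helper variable names are made up for the illustration.

```perl
use strict;
use warnings;

my @array = ("1", "two", 3);

push @array, "four";              # add to the end
my $last  = pop @array;           # remove from the end: "four"
unshift @array, 0;                # add to the front
my $first = shift @array;         # remove from the front: 0

# splice(ARRAY, OFFSET, LENGTH, LIST) replaces LENGTH elements at OFFSET.
my @removed = splice(@array, 1, 1, "deux", "zwei");

my $count   = @array;             # array in scalar context: element count
my $highest = $#array;            # index of the last element
print "@array ($count elements, highest index $highest)\n";
```

Here splice removes the single element "two" and puts two new elements in its place, so the array grows from three elements to four.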
One of Perl's interesting features (and a further testament to its versatility) is that return values aren't limited to being scalars such as 13 or "Look ma, no hands!". Functions may also return an array, such as:

return ("fee", "fi", "fo", "fum");

Some functions take this a step further and can return either a scalar or an array, depending on the caller's need. The value of the (appropriately named) wantarray function can be used to determine which response to give.
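A short sketch of a subroutine that uses wantarray to answer in whichever form its caller expects. The name giants_words is hypothetical, and the choice to join the words with spaces in scalar context is ours, not something the library prescribes.

```perl
use strict;
use warnings;

# giants_words is a hypothetical name used only for this illustration.
sub giants_words {
    my @words = ("fee", "fi", "fo", "fum");
    # wantarray is true when the caller is asking for a list.
    return wantarray ? @words : join(" ", @words);
}

my @list   = giants_words();   # list context: four elements
my $string = giants_words();   # scalar context: one string
print scalar(@list), " words; as a string: $string\n";
```

The same call thus yields four separate elements when assigned to an array and a single string when assigned to a scalar.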
Parameter Passing

Usually one wants to do more with a subroutine than simply produce some fixed output; a function which returned the value 4 all the time would be of limited use. Typically, functions also take some input values in the form of parameters and use them to generate the desired results. The cgi-lib.pl routine HtmlTop, shown in Listing 2.4, demonstrates this approach.
Listing 2.4 Functions HtmlTop and HtmlBot from cgi-lib.pl

sub HtmlTop {
  local ($title) = @_;
return [insert name here]>
;
.
How are you?"
Enter name:
Choose language:
English
French
Klingon
Client: A request for /trans.html is sent to www.mycompany.com.
HTTP Server: Server receives request. Based on the .html extension, server retrieves the file and returns its contents, preceded by the Content-type header and status code.
Client: Client receives and displays the data. The form tags indicate that the browser should render a form with the appropriate user interface elements.
User Interface: User fills out the form and presses the submit button to send the data to the server.
Client: The client looks at the form's action attribute for the destination address. GET, given in the form's method, tells the client to append the form data to the resource address: /trans.cgi?who=Juan&lang=spanish&params=1
HTTP Server: Server receives request. The file extension .cgi indicates it should run a CGI script. The server starts the script and sends information via environment variables.
Gateway Program: The script processes the form results and returns the resulting text. The header preceding the text contains the line
Content-type: text/html C). Therefore any explanatory capabilities of HTML.
The on and
may use
off states of a radio button are often also referred to as selected
parallelism with checkboxes, ^
text
The exceptions
' string.
"How
are you,"
and therefore
Chapter 3: Form and Function
After ReadParse has been called, we can treat %in like any other associative array, because that's exactly what it is. We can perform a lookup using an element's name in order to determine the user's response for that particular item. So, in order to retrieve the user's name from the text field named who in Listing 3.1, we could simply look at $in{'who'}. Similarly, $in{'lang'} contains the value associated with the chosen language. In a rather clever twist, we use the value returned by $in{'lang'} as the key into the %translateHi and %translateHow associative arrays to obtain the proper translations and transmit them to the server (Figure 3.3).
%in
key      value
lang     spanish
params   1
who      Juan

%translateHi
key      value
english  Hello
french   Bonjour
klingon  nuqneH
spanish  Hola
japan    Konnichiwa

Figure 3.3 The trans.cgi script uses the value from one associative array as the key for another.
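The chained lookup in Figure 3.3 can be sketched in a few lines. This version does not call ReadParse; instead it fills %in by hand with the values from the figure, and it reproduces only part of the %translateHi table. It assumes Perl 5, and $greeting and $line are hypothetical names.

```perl
use strict;
use warnings;

# Stand-in for what ReadParse would place in %in for the request
# /trans.cgi?who=Juan&lang=spanish&params=1
my %in = (who => "Juan", lang => "spanish", params => 1);

my %translateHi = (
    english => "Hello",
    french  => "Bonjour",
    klingon => "nuqneH",
    spanish => "Hola",
);

# The value of one lookup is used directly as the key of the next.
my $greeting = $translateHi{ $in{'lang'} };
my $line     = "$greeting, $in{'who'}.";
print "$line\n";
```

A language that is missing from the table would leave $greeting undefined, so a real script would want to check for that case before printing.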
Occasionally, especially when creating or debugging a new script, it is useful to see all the input that the script receives from the form. The PrintVariables function from cgi-lib.pl provides this information, iterating through the %in associative array and displaying each key and value. In trans.html, a checkbox controls whether these variables get displayed. The value of the checkbox is reported to the script by the existence of the params element, so the script determines whether to display the variables using the line:

if ($in{'params'})

If the checkbox were checked, then $in{'params'} would have the value 1. This would be evaluated as true, so the program would call PrintVariables to display the form parameters.[*] If the checkbox were not checked, neither the element's name nor its value would have been sent to the script, and no entry would have been created in the associative array. Since a lookup using a non-existent key is false, the program would skip over the block containing the PrintVariables call and simply go on to print the bottom-of-page tags.
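A sketch of this test with the checkbox checked and unchecked. The two hashes stand in for what %in would contain in each case, and show_params is a hypothetical helper that merely formats the pairs; it is not the library's PrintVariables. Perl 5 is assumed.

```perl
use strict;
use warnings;

my %checked   = (who => "Juan", lang => "spanish", params => 1);
my %unchecked = (who => "Juan", lang => "spanish");

# show_params is a hypothetical helper, not part of cgi-lib.pl.
sub show_params {
    my (%in) = @_;
    if ($in{'params'}) {
        # Checkbox was checked: list every key and value.
        return join(", ", map { "$_=$in{$_}" } sort keys %in);
    }
    return "";   # a missing key yields undef, which is false
}

print "checked:   ", show_params(%checked), "\n";
print "unchecked: ", show_params(%unchecked), "\n";
```

Note that a checkbox whose value was "0" or an empty string would also test as false, so this idiom depends on the form sending a true value such as 1.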
[Browser screenshot: a page headed "Greetings of the world". Your translation reads: "Hola, Juan. Que tal?" The CGI form parameters: lang spanish, params 1, who Juan]
Figure 3.4 The output from trans.cgi shows the results of the interactive translator.
[*] It is important to note that since the parameter to PrintVariables is the associative array to display, ReadParse must be called first to fill in %in.
Once the script has completed its work and the output winds its way back to the browser, we see something similar to the page shown in Figure 3.4. Thanks in large part to the ReadParse function, we can write a script to get at the form data and use it to create this page without really having to know much about CGI at all. The only information we need is the name of each element that we choose to look up in the associative array %in. The program could also have made use of the CGI environment variables, as we demonstrated with world.cgi in Chapter 2. These variables, stored in the %ENV array, are unaffected by the ReadParse call.

A Letter

By now, the advantages of using the cgi-lib.pl library to create CGI scripts are hopefully becoming obvious. The functions in the library do almost all of the work, allowing script writers to concentrate on what the form needs to do, rather than on the "housekeeping" tasks of parsing and interpreting the CGI input. But while the translator script does an adequate job at translating, it could use a number of improvements. First, the form could certainly be more user friendly. It does no input validation: for example, no error is produced if the user submits the form with a blank input field. Additionally, it would be nice if the "Submit Query" button were more descriptive ("Translate," for instance).

More importantly, the two-part nature of the separate form and script opens up the possibility of a number of potential errors. Because the HTML form is linked to the script by a filename explicitly given in the action attribute, trans.cgi, any time we change the name or location of the script, we must remember to update the form. Similarly, if we copy the form to another server, we need to make sure we also copy the script to avoid orphaning the form. Finally, consider what would happen if a user were to go to the address of the script directly:

http://www.mycompany.com/trans.cgi

A script executed in this way would lack any form information and would become hopelessly confused.

The solution to these problems is to more tightly integrate the form and the script which processes it. We can do this by using a comboform. The premise behind a comboform is simple: we have a single script which is called first to display the form and then again in order to process the form data. Listing 3.3 shows a comboform that generates encoded form letters.
Listing 3.3 The Super Encoder, code.cgi, is a good example of a comboform, but a poor example of a useful one.

#!/usr/local/bin/perl
require "cgi-lib.pl";

MAIN: {
  if (&ReadParse(*input)) {
    &ProcessForm;      # form data arrived: process it
  } else {
    &ShowForm;         # no form data: display the form
  }
}

sub ShowForm {
print &PrintHeader; print &HtmlTop( "Super Encoder"); print t element, but they share little in common with each other or with the other elements we've explored in this chapter. "board, standard, the useful. ^^
such as images. Conceptually, the file element is quite simple. It accepts a name attribute like all other elements, and displays an interface from which the user can choose a single file to send.[10] Unfortunately, the complexities involved in actually sending the data require an entirely new encoding scheme to handle files. Accordingly, in order to use this type of element, the form element in which it is contained must be submitted using method=POST and enctype=multipart/form-data. This latter attribute indicates that the information is sent to the script as a multipart Media Type, which is somewhat more complicated than the application/x-www-form-urlencoded type in widespread practice today. However, for cgi-lib.pl users, the change is less important, since ReadParse will take care of parsing the data stream and filling the %in associative array correctly, regardless of the encoding scheme in use.

[10] A pointer to the full specification for file uploads (RFC 1867) is available online: see Appendix D, Online Resources.
SYNTAX NOTE: element, type = file
Creates an element that allows a file to be attached to a form.

Attributes:
type = file. If not specified, defaults to text (see type = text).
name = identifies this element when it is sent to the script.
accept = a list of Media Types that are acceptable to upload. If not specified, a file of any type may be selected.

Example:
Hidden Elements

At first glance, a form element that doesn't display anything to the user might seem rather useless. But it turns out that there are some very good reasons to have an input element which accepts no user input. Hidden elements circumvent the fact that HTTP is a stateless protocol, and they allow state information (data that is remembered from a previous interaction) to be tucked away as part of a form, unseen by the user.
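As a rough sketch of how a script might tuck state into a form, the helper below builds the markup for one hidden element. The name hidden_field, the field names step and who, and the escaping rules are our own assumptions for illustration; none of this comes from cgi-lib.pl. Perl 5 is assumed.

```perl
use strict;
use warnings;

# hidden_field is a hypothetical helper, not part of cgi-lib.pl.
sub hidden_field {
    my ($name, $value) = @_;
    # Escape the characters that would break an HTML attribute value.
    for ($name, $value) {
        s/&/&amp;/g;
        s/"/&quot;/g;
        s/</&lt;/g;
        s/>/&gt;/g;
    }
    return qq(<input type="hidden" name="$name" value="$value">\n);
}

print hidden_field("step", 2);
print hidden_field("who", 'Juan "the translator"');
```

When the form is submitted, the script receives these names and values along with the visible fields, so it can recover what it knew on the previous request. Because the user can view and alter hidden values, they should not be trusted blindly.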