Embperl and Databases

by Reuven M. Lerner

Those who have read more than a few of my “At the Forge” columns know that I am a great fan of HTML/Perl templates, which allow us to mix the two in a single document. In October, I introduced Embperl, a templating system that can function as a stand-alone CGI program, but can also be integrated into the mod_perl module for Apache. This month we will take a closer look at Embperl, exploring ways in which it can allow us to edit records in a database.

There are a number of good reasons to use templates. First of all, by putting code and design in the same document, designers and programmers can each modify the elements for which they are responsible. No longer is the programmer the bottleneck when a site decides to change its design, as is the case when dynamic output is produced by CGI programs.

Even when you are unlikely to change the look of a dynamically generated HTML page, Embperl (and similar in-line templating mechanisms that allow you to mix code and HTML) enables you to stick it all together, making the logic easier to follow. I have written many CGI programs in which the dynamic output was dwarfed by the static output—but because even one portion of the resulting HTML page had to change over time, the entire thing had to be within the province of the program.

Since the time I wrote October's introduction to Embperl, the package has been improved significantly. Perhaps the most significant change is that recent releases of Apache 1.3.1 and mod_perl 1.15 free you from having to recompile everything when installing a new version of Embperl. Now, Embperl can be installed and upgraded separately from Apache and mod_perl, just as you install and upgrade other Perl packages from CPAN. Please see the “Resources” sidebar to learn where to obtain the latest information, including installation instructions, on Apache, mod_perl and Embperl.

Why Databases?

Databases are an increasingly important part of the Web. Using them, we can create customized and personalized sites, bringing people the specific information they want, rather than simply handing them all the information we have.

In addition, databases are designed to store and retrieve information easily. If text files and DBM files are too insecure or unstructured for your needs, consider using a relational database. Relational databases store their information in tables, where each table has columns (describing the various fields) and rows (with one record stored per row). Using multiple tables is where the “relational” part comes in, and it can be an extremely powerful tool. You could probably program this functionality on your own, but doing so would be quite complicated—and besides, someone has already done the work for you.

Relational databases are manipulated using SQL, the Structured Query Language developed by IBM in the 1970s. You don't write programs in SQL; instead, you write “queries” that manipulate one or more tables. Using SQL, you can create tables, modify their contents and request combinations of columns and rows containing particular types and pieces of data.

SQL is not a programming language, so it must be created and submitted to a database server by a programming language. In the past, each database product required its own version of Perl in order to allow access; this led to versions known as Oraperl, Sybperl, et al. Recently, the generic DBI (database interface) has produced a stable and portable database engine that allows access to any relational database with the same interface. The database-specific parts are kept in DBDs (database drivers) loaded dynamically by DBI. Assuming you stick to standard SQL rather than database vendors' proprietary extensions, you should be able to switch database brands by modifying a single Perl statement.

The relational database I use in these examples is MySQL, described by its author as a “mostly free” database. I have been using MySQL for quite some time now, and while it does not have all the optimization and locking features of its larger competitors, it performs admirably—and more features are on the way. For more information on MySQL, see the “Resources” sidebar.

Once you have installed Embperl, you need to tell Apache which documents should be interpreted with Embperl rather than as a straight HTML document. On my computer (running a modified version of Red Hat Linux 5.1), I put the following in the srm.conf configuration file:

Alias /embperl/ /usr/local/apache/share/embperl/

In addition, I put the following in the access.conf configuration file:

<Location /embperl>
SetHandler perl-script
PerlHandler HTML::Embperl
Options ExecCGI
</Location>
In other words, I told Apache that any URL beginning with /embperl refers to files actually in /usr/local/apache/share/embperl, and that any files in /embperl should be interpreted by the HTML::Embperl content handler. After restarting Apache, Embperl was up and ready to run.
Creating our Table

This month, we will create a database consisting of a single table, a list of clients for a consulting practice. One of the central tables in this system is the Clients table, which contains basic information about each client.

Here is the SQL necessary to create this table:

CREATE TABLE Clients (
        id MEDIUMINT UNSIGNED NOT NULL
           AUTO_INCREMENT PRIMARY KEY,
        name VARCHAR(40) NOT NULL,
        address1 VARCHAR(40) NOT NULL,
        address2 VARCHAR(40) NULL,
        city VARCHAR(40) NOT NULL,
        state VARCHAR(40) NULL,
        country VARCHAR(40) NOT NULL,
        zip VARCHAR(40) NULL,
        contact_name VARCHAR(40) NOT NULL,
        contact_phone1 VARCHAR(40) NOT NULL,
        contact_phone2 VARCHAR(40) NULL,
        contact_fax VARCHAR(40) NULL,
        initial_contact_date DATE NULL,
        dollars_per_hour TINYINT NOT NULL,
        UNIQUE (name)
 );

Again, we cannot enter this SQL directly into a relational database server; we must use a program that has been compiled with the correct client libraries. MySQL comes with a program (mysql) that allows interactive communication with the database; alternatively, we can use DBI to send the above SQL.

Each column in Clients is defined as a VARCHAR, that is, a variable-length text field. The length of the field is determined by the number in parentheses, which I set to 40 mostly to make other elements of the programming easier. (Over time, I expect to make most of these fields quite a bit shorter.)

The id field is special, not only because we define it as an unsigned integer (giving us the option of including up to 16 million different clients), but because it is set to be the “primary key”. As far as the database is concerned, each row can be identified uniquely with the primary key alone. We set id to AUTO_INCREMENT, meaning that MySQL will give the first client an ID of 1 in the archive file, the second an ID of 2 and so forth. Each client will receive an automatically generated, unique ID number.

We also declare the name column to be unique, since having more than one client with a given name could be confusing for the people involved. The database would accept several identically named columns, as long as the ID numbers were different. However, we will avoid the possibility of having two clients named “IBM” by checking for this in the database.

You may wonder why we didn't use name as the primary key, since it is guaranteed to be unique. We could have done so, and everything would work fine (perhaps a bit slower, since text strings are larger than integers). But consider what will happen if a client changes its name—we would have to update all of the references to that client, since old ones will no longer point to the right place. By making our primary key independent of any information the client changes, we can continue to keep track of the client regardless of what information changes.

Inserting Records into the Table

Now that we have defined our table, we will create an Embperl document that will let us insert new records. (Right now, our table is empty.) An Embperl document is largely the same as an HTML document, so you can use the <H1>, <P> and <Blink> tags as well as regular text, and it will work just fine.

However, you can insert Perl code within the Embperl document by putting it within special square brackets. Here are the four types of square brackets that Embperl understands:

  • [- CODE -]: Evaluate CODE.

  • [+ CODE +]: Evaluate CODE, inserting the final value into the HTML document.

  • [! CODE !]: Evaluate CODE as [- CODE -], but only once.

  • [$ Meta-code $]: Evaluate Embperl meta-commands.

Thus, we can include this statement:

[- $foo = 5; -]
and $foo will be set to 5—a value that persists over multiple invocations, since mod_perl and Embperl cache such values. If instead we include:
[+ $foo = 5; +]
then a “5” will appear in the document where the brackets were. If you are unfamiliar with the idea of a “final value from an expression”, you might want to end every Embperl block with the name of a variable. Variables return their values, so if you type:
[+ @reverse_list = reverse @list; $foo +]
then a “5” will be inserted into the HTML document at that point.

Listing 1, add-client.html, is a simple Embperl document that adds a client to the database. It does not check the data we hand it—since MySQL will do much of that for us—although it will show the user any database errors that might occur.

Creating the Form

If you are new to templates, it might take a while to understand the idea of a single file containing both an HTML form and the program necessary to process it. Consider that this is no different from a CGI program producing the form from which it can get input.

Listing 1 contains two parts: form processing and form creation. While Embperl looks at the former before the latter, we will look at creation first, since it is generally easier to handle, especially when working with templates for the first time.

We will have one HTML form element for every column in our table except for id, since MySQL generates the ID for us automatically. Later, we'll expand this program to handle editing and deleting of rows in our table, which means we will need to handle one form element for each column and row in our database, in addition to one for the “new” record we will be submitting.

My solution is to give each form element the name of the column to which it is attached, followed by a hyphen and the ID number. The “city” column for the row with id = 5 will be an element named “city-5”, and the name of the client with id = 30 will be an element named “name-30”. Since MySQL starts auto-incrementing ID with 1, we can use “name-0”, “address-0” and so forth for our new entry.

Early on in our program, we will define the @colnames array, which will contain the names of the columns in our database:

@colnames = (id name address1 address2 city
        state country zip
         contact_name contact_phone1 contact_phone2
        contact_fax
         initial_contact_date dollars_per_hour);

Now that we have defined @colnames, we can create the HTML form with Embperl's meta-commands. We want to create an entry for each element (except for id, since modifying that would create serious problems), so we will iterate through each element of @colnames, adding the necessary HTML and remembering to skip id. This part of my implementation looks like this:

[$ foreach $column @colnames $]
[$ if $column ne "id" $]
<TR> <TD>
        [+ $column +]
 </TD> <TD>
        <input type="text" name="[+ $column +]-0"
         size="40" maxlength="40" >
 </TD> </TR>
 [$ endif $]
 [$ endforeach $]
The above code looks a lot like Perl, with good reason. It uses a foreach loop, which iterates over the elements of an array (@colnames), putting each successive element of the array in a scalar ($column). We can then use that scalar value by putting it in square-plus brackets at the appropriate points in our HTML.

You are probably not used to seeing the endif and endforeach meta-commands in the square-dollar brackets. These tell Embperl where the if and foreach meta-commands end their scope, just as closing curly braces would do in a standard Perl program.

We set the maximum length of each field to “40”, just as the fields in our table are all defined to be VARCHAR(40). If we were to modify the table definition such that each column were set to a more reasonable size (e.g., name should probably be closer to 60, and contact_phone closer to 15), we would also want to modify the size of each field in the HTML form. Otherwise, users will blindly enter too many characters, and their input will be silently truncated by the database server. The MySQL DBD (DBD::mysql) has a length attribute that can be used for such purposes, if you wish.

Processing the Form

Now that we have created the form, let's think about how we can process it once we receive it. The Embperl document will receive the form's name-value pairs exactly as if they were being submitted to a CGI program, although we will have to extract them somewhat differently. The pairs are sent in the %fdat hash, in which the hash's keys are names of the submitted HTML form elements, and the hash's values are those values. We can grab the name of the new client with $fdat{"name-0"}, the main telephone number with $fdat{"contact_phone1-0"} and so forth.

Inserting a record into a table follows the pattern:

INSERT (column1, column2, column3) "
 VALUES ("value1", "value2", "value3")

We will want to do something like this:

INSERT (@columns)
 VALUES (%fdat)
Of course, life isn't quite that easy; we must first create a new array, @insert_colnames, with the names of the columns we wish to insert—in other words, everything except id:
[- @insert_colnames = grep !/^id$/, @colnames; -]
Then we turn that into a comma-separated list, which is what we will need for the first part of the INSERT:
[- $insert_colnames = join ', ', @insert_colnames; -]
With that accomplished, we will use Perl's built-in map function to turn @insert_colnames from an array of column names into an array of column values. We then convert the resulting array into a scalar, in which each value is separated by a comma and surrounded by double quotation marks:
[- $values = join '", "', map {$fdat{$_ .
"-0"}}
        @insert_colnames -]
If @insert_colnames were to consist of
(column1, column2, column3)
the above use of map would turn it into:
($fdat{"column1-0"}, $fdat{"column2-0"},
$fdat{"column3-0"})
which join would then turn into:
$fdat{"column1-0"}",
"$fdat{"column2-0"}",
"$fdat{"column3-0"})
There aren't any quotes at the beginning or the end, but we can add them when we finally construct the query:
[+ $sql = "INSERT INTO Clients ($insert_colnames)
        VALUES (\"$values\")"; +]
We use square-plus brackets here in order to see (and debug, if necessary) the query we send to the database. Don't forget that if we are using double quotes to take advantage of variable interpolation, we must escape the double quotes we wish to send in our query with backslashes.

We finally send that query with the statements:

[- $sth = $dbh->prepare($sql); -]
[- $sth->execute; -]

If there are any errors, print them for the user:

<P><B>[+ $sth->errstr +]</B></P>
Our new record is now inserted in the database.

This entire form-processing section is unnecessary if the user has not submitted any form elements. In Listing 1, you can see how we used the Embperl if meta-command to exclude evaluation of this entire block of code if the user has already done something.

The first time you run this, don't be surprised if everything seems to work and you get your original form back. As they say, that's not a bug—it's a feature! If Embperl finds fields in an HTML form that match the name-value pairs in %fdat, it fills them in automatically. You can turn this option off by modifying the EMBPERL_OPTIONS bitmask field, described in the Embperl documentation.

Creating an All-Purpose Editor

Now that we have seen how to enter new records using Embperl, let's expand the template such that it will allow us to modify and delete existing records, as well as add new ones. You can see the complete listing for such a template in Listing 2 in the archive file, client-editor.html.

The first task is to retrieve existing elements from the database and turn them into a list of form elements the user can grab. As we saw earlier, it will be easiest if we give each form element the name of the column with which it is associated, along with a number indicating its record ID number.

The first order of business is to retrieve rows from the current database. We do that with a SELECT statement, whose syntax looks like this:

SELECT column1, column2, column3 FROM Tablename;

We set up our query as follows:

[- $sql = "SELECT $colnames FROM Clients"; -]
Now we prepare and execute the query using the standard DBI syntax:
[- $sth = $dbh->prepare ($sql) -]
[- $sth->execute -]
The result from a SELECT is a table, which we can retrieve in a number of different ways. Perhaps the easiest method is to grab it as an array reference, then turn that array reference into an array containing the name-value pairs, continuing to fetch array references until we run out. If we use Embperl's while meta-command, we can do that fairly easily:
[$ while ($record = $sth->fetchrow_arrayref) $]
We then grab the id column:
[- $recordid = $record->[0]; -]
We can turn that array reference into an array, using Embperl's foreach meta-command to iterate over each element, printing each one except id in a table row. If we store the current record (row) number in $recordid and the current field number in $fieldcounter, we can create this by iterating over the following code:
<TR>
<TD>[+ $colnames[$fieldcounter] +]</TD>
<TD>
 <input type="text"
 name="[+ $colnames[$fieldcounter] .
        '-' . $recordid +]" size="50"
maxlength="100"
        value="[+ $field +]" >
</TD>
</TR>
We will also add a set of three radio buttons to indicate whether the user wishes to delete this record, modify it or do nothing. We will set “nothing” as the default, since we don't want users to inadvertently delete any elements. We create the radio buttons, using the modify- stem just as we would in normal HTML. However, we will add the current ID number to that stem:
<P><input type="radio" value="nothing"
 name="modify-[+$recordid +]" checked> Do nothing
 <input type="radio" value="modify"
 name="modify-[+$recordid +]"> Modify this client
 <input type="radio" value="delete"
 name="modify-[+$recordid +]"> Delete this client </P>
As you can see in Listing 2, we also added a check box to the initial “new client” form to indicate whether a user is interested in adding a new client. This check box can be hardcoded in HTML, since we are allowing users to add new elements from only that one form, with the pseudo-ID of 0:
<P><input type="checkbox"
name="modify-0">
 Add this new client <P>
Inserting, Updating and Deleting

Just as add-client.html (Listing 1) was divided into a processing section (the first part) and the form-generation section (the second part), so too is our full client-editor.html (Listing 2). The above section describes how we will use SELECT to create the HTML form, so all that remains is describing the processing section, which comes at the top of the template.

With add-client.html, we could assume that the user wanted to add a new client. There are now four possibilities: adding a new client, updating an existing client, deleting an existing client and doing nothing at all. While add can be true only for modify-0 (the new record), we have to check every set of HTML form elements that comes to us. The simplest case, of course, is when the modify- radio button is set to “nothing”.

If the user wants to add a new record, the element modify-0 will be checked. We can use an Embperl if meta-command to check for its existence:

[$ if $fdat{"modify-0"} ne "" $]

In other words, if the user checked modify-0, we will add a new record, just as we did in add-client.html.

Finding out if the user checked modify for one of the records is a bit trickier. We take the names of all submitted form elements (sort keys %fdat), and use grep to grab all of those with the modify- stem:

[$ foreach $clientid
 (grep {($_ =~ /^modify-\d+$/) && ($fdat{$_} eq
        "modify@bb:1.      )}
 (sort keys %fdat)) $]

If the above looks a bit intimidating, remember that $_ contains the value of the scalar currently being handled by grep. We tell grep to return only those array elements that match modify-\d+ (that is, modify- followed by one or more digits), and whose value is modify. We then take the array returned by grep and iterate over it using Embperl's foreach meta-command.

Once inside the foreach loop, how do we create the SQL query? We first have to grab the ID of the element in question, so that we will update only the appropriate record. We do that by giving:

$clientid =~ m/(\d+)$/;

This puts the ID value in the temporary variable $1. We then use a combination of grep, map and join to create the list of name-value pairs necessary to complete an UPDATE statement syntax with:

UPDATE Clients SET
name1="value1",name2="value2"
 WHERE id = $1
We use grep to grab all column names except for id (once again, we don't want to change that value). We then filter that result through map, turning the list of column names into a list of name="value" pairs. Finally, we join that list together with commas, resulting in the scalar $pairs:
$pairs = join ', ',
 map {"$_ = '" . $fdat{$_ . "-$1"} . "'"}
 grep (!/^id$/, @colnames);
We can then set up the SQL query as follows:
$sql = "UPDATE Clients SET $pairs WHERE id = $1";
Deleting elements is easier than updating, since we don't need the name-value pairs. We can use the statement:
$sql = "DELETE FROM Clients WHERE id = $1";
where $1 matched the number of the current element.
Conclusion

Believe it or not, we are done. This client editor obviously needs some help with its user interface, since it is still possible for someone to enter an illegal value (e.g., a bad DATE element for initial_contact_date, or a fraction for the TINYINT column dollars_per_hour). If you have more than three or four clients, this interface quickly becomes tedious. The lack of truly descriptive names for each column gives a hard-to-use look to a program that is far easier and less error-prone than entering straight SQL would be.

However, improving the interface is fairly straightforward once you understand how to perform the four basic database operations: INSERT, SELECT, UPDATE and DELETE. Indeed, we have seen that doing all of these in Embperl can be quite simple. Creating alternative interfaces should not be hard to do, given the examples we have already seen.

More importantly, this Embperl template is useful for much more than just the Clients table. By modifying the value of @columns and the name of the table, you could use this same template to modify nearly any record in any table.

I hope you have enjoyed this romp through the world of Embperl and templates. A number of templating systems are now available for doing similar things; even if you are unaccustomed to using such templates to communicate with databases, you should consider getting one of the available packages and trying it. Their power may convince you of their utility, too.

Resources

Reuven M. Lerner is an Internet and Web consultant living in Haifa, Israel, who has been using the Web since early 1993. In his spare time, he cooks, reads and volunteers with educational projects in his community. You can reach him at reuven@netvision.net.il.
Load Disqus comments