At the Forge - Column 100
Welcome to the 100th installment of At the Forge! Yes, that's right, this is the 100th column that I have written for Linux Journal and before it, SSC's Websmith, starting in the spring of 1996. For many years now, I have enjoyed having the monthly opportunity to explore Web- and server-side technologies.
This month, I want to look back at some of the history of server-side and Web/database programming, so we can gain some appreciation for where things currently stand. We then explore the Web as it stands today and consider where things will go in the coming years.
Today it's easy to take the Web and Internet for granted. I keep track of my bank accounts on the Web; I buy books from on-line bookstores; I read Weblogs using a Web-based RSS reader; I access newspapers more current than their printed counterparts; I chat with friends and relatives by using instant messenger programs, and I even receive payments by way of PayPal. It often has been said that residents of Manhattan never need to leave their homes, because everything can be delivered. For better or worse, the Internet is making that a reality for a growing number of people all over the world.
The Internet's maturation for business and pleasure has been a result of a dramatic transformation. Originally, Web servers were mechanisms for sharing stored plain-text and HTML-formatted text documents. But soon after it became popular to explore the relatively limited number of documents on the Web, someone realized that HTTP's inherent client-server nature made it possible to create documents dynamically in response to a request. An HTTP client requesting a document from a server had no way of knowing if the document had been sitting on the server's filesystem for several months or if it was created on the spot in response to this request. This insight transformed the Web forever, turning it into a platform for real-time document generation and application development, rather than a simple, shared repository for static documents.
The beginnings of this dynamic revolution were fairly primitive. The first dynamically generated content was little more than a wrapper around traditional UNIX command-line programs such as mail and finger. One of the first programs that my friends and I wrote, for example, was a simple program that made it possible to search through the content of our newspaper's on-line archives. Of course, my friends and I could have created specialized HTTP servers with this functionality. Luckily for us and for all Web developers, the designers of NCSA httpd, the forerunner of Apache, made it possible for any program on the server to communicate by using HTTP through its common gateway interface, otherwise known as CGI. CGI meant that any program on our server could be accessible on the Web, merely by wrapping it inside of a CGI program.
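To make the CGI idea concrete, here is a minimal sketch in Python of what such a wrapper might look like. It is not the program we wrote back then: the in-memory ARCHIVE, the q parameter and the function names are all assumptions made for illustration. What it does show is the CGI contract itself: the server passes the request in environment variables such as QUERY_STRING, and the program prints headers, a blank line and then the document.

```python
import os
import sys
from urllib.parse import parse_qs

# Hypothetical stand-in for a newspaper archive; a real one would live on disk.
ARCHIVE = {
    "1995-03-01": "City council approves new budget",
    "1995-03-02": "Local team wins championship",
}

def search_archive(term):
    """Return (date, headline) pairs whose headline contains term, ignoring case."""
    term = term.lower()
    return [(d, h) for d, h in sorted(ARCHIVE.items()) if term in h.lower()]

def cgi_response(environ):
    """Build a complete CGI response: headers, a blank line, then the body."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    term = query.get("q", [""])[0]
    matches = search_archive(term) if term else []
    body = "\n".join(f"{d}: {h}" for d, h in matches) or "No matches."
    return "Content-Type: text/plain\r\n\r\n" + body + "\n"

if __name__ == "__main__":
    # The Web server sets QUERY_STRING and sends whatever we print to the client.
    sys.stdout.write(cgi_response(os.environ))
```

Installed in a server's CGI directory and requested with ?q=budget appended to its URL, a script along these lines would return the matching headlines as plain text. The client cannot tell that the document was assembled on the spot.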
Things still were rough in those early years. We all assumed that the Web was inherently stateless and were pleasantly surprised when Netscape announced the creation of cookies, making it possible for servers to keep track of user-specific information. No programs yet existed to report on Web traffic, let alone libraries that took care of the low-level details associated with Web programming. Debugging consisted of watching the Web server's error log. And using anything more complicated than a simple text file was considered a sophisticated data-storage technique.
Today, of course, Web development is a far cry from what it was back then. Downloading and installing the latest version of Apache is a trivial act; within several minutes of visiting www.apache.org, you can have a state-of-the-art Web server running on your favorite computer. Relational databases are an unstated requirement for nearly any sophisticated Web application that you might want to create. But much of the time, you don't even have to create your own programs—the number of libraries, applications and frameworks now available for creating Web/database applications has become overwhelming. It used to be that you needed to search high and low for an open-source application that would suit your needs. Nowadays, it still takes time to find the right application, but that's because you need to sort through so many bad or inappropriate ones before finding the one that is right for you.
Moreover, the community of developers has matured tremendously over the past few years. There never was a lack of goodwill or help for newcomers to the server-side programming world, but there often was a lack of experience, because so little had been tried. In some ways, the early days of Web programming resembled a network of research labs, each of which would share its experiences with the rest of the community. Today, there is a great deal of experience, both in the Open Source community and behind corporate doors. A young programmer interested in creating new applications has an almost endless supply of books, magazines, Web sites and source code to look and learn from.
It's also true that the most popular programming languages used to create Web/database applications—Perl, Python, PHP and Java—have matured significantly over the past few years. But improvements to these languages and their libraries have impressed me less than the trend toward high-level languages in the computer industry.
Back when the Web was coming into its own, most people developed software in C and C++. People who programmed in high-level languages, such as Perl and Python, were seen as glorified tinkerers or people who were somehow less serious than their compiled-language counterparts. The Web has changed all of this; it now is possible to be seen as a serious application developer even if you're only working in PHP. Of course, compiled C code still executes faster than the equivalent high-level code. But the corresponding difference in development and debugging time generally is so great that almost no one writes Web applications in C.
Increasingly, we see that mainstream companies are moving toward high-level languages in general and toward many open-source programs in particular. Many companies, from Amazon to eBay, have discovered that their programmers are more productive when using high-level languages. The fact that Java and C# are the lowest-level Web development languages in mainstream use says a lot about where the industry is going. Languages that make it possible for programmers to concentrate on high-level ideas rather than get their hands dirty with individual bits and bytes have become mainstream. Java largely has failed as a desktop application language, but C# seems to be gaining some speed as a result of Microsoft's .NET initiative—which means that within the next few years, most desktop applications might be running in languages that lack pointers and include garbage collection.
Obviously, there are many reasons, both technical and financial, why programmers are moving toward such languages. I have no doubt, though, that the Web has helped to push this issue to the forefront. High-level languages such as Perl are suited perfectly to the Web, with its ambiguous data types, its need for database connectivity and the need for easy-to-use, powerful text strings and string-manipulation libraries. The Web is nothing more than a bunch of text strings being hurled over the network, and no one can hurl text faster or farther than a high-level open-source language.
Dramatic growth also has occurred in the number of frameworks available for the creation of server-side applications. Even if you have an easy-to-use programming language, you still need to implement your own systems for managing users, groups, permissions, content and messages. By using an existing framework, you can avoid that work and take advantage of someone else's experience. Frameworks have moved in two different general directions—content management systems, which perform just-in-time assembly of newspapers and magazines, and application servers, which provide developers with a toolkit for the creation of applications.
On the surface, you might think that application frameworks such as HTML::Mason, Zope, OpenACS and Java servlets/JSPs have little in common. But anyone who works with more than one of these systems quickly discovers that although each framework has its own approach, they share many commonalities. Moving from one framework to another still can be difficult, but once you have enough experience with several application frameworks, trying others becomes relatively easy.
Yes, being a Web developer in 2005 is quite pleasant compared with what we had to endure ten years ago. The software is increasingly mature, the community is large and helpful, we are no longer re-inventing the wheel every other week and the number of organizations moving sites to the Web means that there is some demand for our work in the marketplace.
Given such a rosy description of the present day, where are we going in the future? What trends will pick up speed as we pass through 2005? To begin with, it is clear that the Web, by which I generally have meant the combination of HTTP, HTML and URLs, is slowly breaking apart into separate constituent parts. I always thought that the Web was unusually powerful because it combined three simple, powerful technologies—HTTP, HTML and URLs—that worked well together. But I now see that each is useful in its own right and is branching out into other uses.
Particularly interesting are Web services, which represent a new, rich and open communications protocol for programs other than Web browsers. When they were first revealed, I thought that Web services were some simple ideas piggybacking on the Web's success and name recognition. Although this might be true regarding the poor name choice and although they might be simple in theory, Web services are quite powerful indeed. The idea that one application can connect to another without regard for operating system or programming language is nothing short of amazing. And although truly good uses for Web services remain relatively rare, Amazon, Google and Bloglines are demonstrating that it is possible to expose your internal API to customers and other outsiders without giving up the store.
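The language independence comes from the wire format: a call is serialized as plain XML that any language can produce or parse. The sketch below uses XML-RPC, one of several Web services protocols and my choice for illustration only, by way of Python's standard xmlrpc.client module. The method name archive.search and its arguments are hypothetical, and no network connection is made; the code only encodes and decodes the messages.

```python
import xmlrpc.client

# Encode a call to a hypothetical remote procedure, "archive.search",
# taking a search term and a maximum number of results.
request_xml = xmlrpc.client.dumps(("linux", 10), methodname="archive.search")

# What a server written in any language would do on receipt:
# decode the XML back into native values and a method name.
params, method = xmlrpc.client.loads(request_xml)

# The reply travels back as XML too.
response_xml = xmlrpc.client.dumps(
    (["Column 99", "Column 100"],), methodresponse=True
)
(result,), _ = xmlrpc.client.loads(response_xml)

print(request_xml)
print(method, params, result)
```

Printing request_xml shows an ordinary XML document with a methodName element and a list of typed parameters, which is all the two sides need to agree on.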
A similar trend is the use of the Web browser as an integral component in desktop application development. Help systems now are built with HTML and miniature Web browsers, and there are some full-fledged applications, such as ActiveState's Komodo, that are based on the underlying Mozilla engine. I often have said that Mozilla is the new Emacs. Although Mozilla development is significantly harder than Emacs customization ever was, the fact that Mozilla provides a cross-platform, programmable environment for rich desktop applications is impressive and is likely to improve further.
One promising application is Sunbird, the Mozilla calendar program, which I have been using for several months on my own desktop. Sunbird still has a number of problems and bugs, but one of my favorite features is its use of the iCalendar standard to retrieve various calendars from the Internet using HTTP. Yes, that's right—I'm running a desktop application based on Mozilla that retrieves URLs by way of HTTP, but it's not a Web browser!
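An iCalendar file is plain text, which is part of why it is so easy to move around with HTTP. The following sketch is a deliberately minimal reader and not how Sunbird does it: it handles only line unfolding and simple NAME:value properties inside VEVENT blocks, and the sample calendar is invented for illustration. In practice, the text would come from a URL, for example by way of urllib.request.urlopen.

```python
def unfold(text):
    """Join folded lines: a line starting with a space or tab continues the previous one."""
    lines = []
    for raw in text.splitlines():
        if raw[:1] in (" ", "\t") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines

def parse_events(text):
    """Return one dict of property name -> value for each VEVENT block."""
    events, current = [], None
    for line in unfold(text):
        if line == "BEGIN:VEVENT":
            current = {}
        elif line == "END:VEVENT" and current is not None:
            events.append(current)
            current = None
        elif current is not None and ":" in line:
            name, value = line.split(":", 1)
            # Drop parameters such as ";TZID=..." from the property name.
            current[name.split(";", 1)[0].upper()] = value
    return events

# Invented sample; note the SUMMARY line folded in the middle of a word.
SAMPLE = "\r\n".join([
    "BEGIN:VCALENDAR",
    "VERSION:2.0",
    "BEGIN:VEVENT",
    "DTSTART:20050301T090000Z",
    "SUMMARY:At the Forge col",
    " umn 100",
    "END:VEVENT",
    "END:VCALENDAR",
    "",
])

for event in parse_events(SAMPLE):
    print(event["DTSTART"], event["SUMMARY"])
```

A real client would need much more, such as time zones, recurrence rules and escaped characters, so a dedicated iCalendar library is the better choice for anything beyond experimentation.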
On the server side, collaboration is an increasingly important watchword. Although it might not meet the rigorous standards of a commercial encyclopedia, Wikipedia is where I first turn when I'm curious about a topic. And thanks to thousands of contributors, it is more than good enough for my day-to-day use. Managing that sort of collaboration is no mean feat, and the Wikimedia Foundation's MediaWiki software, based on PHP and MySQL, quietly is turning into a top-notch package for collective writing and editing.
Finally, there always is a need for better debugging and testing frameworks. The growing trend on this front is more testing and even test-based programming. Unit tests are never going to provide a complete measure of whether software works correctly—but wouldn't you rather know that all of your procedures are working correctly before you start trying to integrate them? Test-driven development has been identified as one of the key methodological changes of the last few years, and I believe that it will continue to grow in popularity as software becomes increasingly complex.
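As a small illustration of checking one procedure in isolation, here is a sketch using Python's standard unittest module. The slugify function is a hypothetical helper invented for this example; in test-driven style, the three tests would be written first and the function filled in until they pass.

```python
import re
import unittest

def slugify(title):
    """Turn a headline into a URL-friendly slug (hypothetical helper)."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)

class SlugifyTests(unittest.TestCase):
    def test_lowercases_and_joins_words(self):
        self.assertEqual(slugify("At the Forge"), "at-the-forge")

    def test_drops_punctuation(self):
        self.assertEqual(
            slugify("Web/database: Column 100!"), "web-database-column-100"
        )

    def test_empty_title_gives_empty_slug(self):
        self.assertEqual(slugify(""), "")
```

Saved to a file, the tests can be run with python -m unittest followed by the filename. Passing tests say nothing about how slugify behaves once it is wired into a larger application, but they do rule out one class of surprises before integration begins.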
It has been my pleasure to write 100 installments of At the Forge so far. But as you can tell from my enthusiasm above, many new challenges await Web/database developers, which means it'll take at least 100 more columns to cover them all. Over the coming months, we are going to look at a number of the ideas mentioned in this column, including iCalendar, Wiki software, Web services and test-driven development.
It might be more than ten years old, but the Web continues to be a fun, exciting and intriguing medium in which to work. Drop me a line at reuven@lerner.co.il telling me where you think the Web is headed—and what projects, technologies and trends you would like to see me cover in the coming months and years.
Reuven M. Lerner, a longtime Web/database consultant and developer, now is a graduate student in the Learning Sciences program at Northwestern University. His Weblog is at altneuland.lerner.co.il, and you can reach him at reuven@lerner.co.il.