Learning Perl
by
William "Will" Bontrager

Permission is granted to reprint this article in its entirety, provided no reprints are sent in conjunction with unsolicited bulk email, provided no fee or other value is exchanged, provided no changes are made to the article, and provided the author's name, signature lines, and copyright line are printed with the article; except you may change the article's title.

A lot of webmasters want to learn Perl. I'm frequently asked how to go about it.

Perl is the language of choice for most CGI programs. CGI is a proven standard and available on essentially all internet servers. Thousands of CGI programs are available at script download sites, many free.

Learning a bit of Perl makes sense. It would allow you to tweak existing scripts with some confidence. And you may even want to write your own custom programs.

Besides, Perl is fun.

HTML Code Stripper

Consider this five-line script. It retrieves a web page from the internet, removes all HTML tags, and displays the rest in your browser as if it were a plain text file. Some web pages look really funny this way, with long lines and short, interspersed with blank space in odd patterns. Once the HTML outerwear is removed, all you have are the bare essentials.

#!/usr/bin/perl
use LWP::Simple;
$Page = get 'http://anydomain.com/anypage.html';
$Page =~ s/\<.+?\>//sg;
print "Content-Type: text/plain\n\n$Page";

Change the URL on line 3 to any page you want to view, upload the script, and type the script's URL into your browser. You'll have fun with this one.

Here are line by line comments.

Line 1 specifies the location of Perl on your server. Verify its correctness.

Line 2 loads the Perl module LWP::Simple, which contains methods for retrieving pages from the Internet. (If you don't know whether or not your hosting company has LWP::Simple installed on your server, use the script at http://willmaster.com/master/pit/ to find out.)

Line 3 retrieves the page at that URL and stores its contents in the variable named $Page

Line 4 strips all HTML codes from the page. This line uses what is known as "regular expressions" to search for and delete everything in angle brackets.

Line 5 prints the page to your browser. It prints the header specifying plain text, two line breaks ("\n\n"), and then the variable $Page
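If you'd like to see what line 4 does without uploading anything, here is a minimal sketch that runs the same substitution on a string held in memory. The sub name strip_tags and the sample HTML are my own inventions for illustration; they are not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Remove everything between angle brackets, the way line 4 of the script does.
# The /s flag lets "." match newlines, so tags that span lines are removed too.
# The /g flag repeats the substitution for every tag in the string.
sub strip_tags {
    my ($html) = @_;
    $html =~ s/<.+?>//sg;
    return $html;
}

# Made-up sample page, standing in for what "get" would return.
my $sample = "<html><body><h1>Hello</h1>\n"
           . "<p class=\"x\">Plain <b>text</b></p></body></html>";

print strip_tags($sample), "\n";
```

The question mark in ".+?" is what keeps the match short: it stops at the first closing bracket instead of running on to the last one on the page. In the real script, get returns undef when the page can't be fetched, so checking that $Page is defined before the substitution is a sensible addition.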

If you want to retain any HTML comments on the page, change line 4 to

$Page =~ s/\<[^!].*?\>//sg;

and the program will remove only HTML tags that do not have an exclamation mark following the left-angle bracket.
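Here is a small sketch of the comment-keeping idea, again on a made-up string rather than a live page. The sub name and sample text are assumptions of mine. The pattern uses ".*?" after the "[^!]" so that one-letter tags such as <p> and <b> are matched on their own.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Strip tags, but leave anything that starts with "<!" (such as HTML
# comments) in place. "[^!]" means "any one character except !".
sub strip_tags_keep_comments {
    my ($html) = @_;
    $html =~ s/<[^!].*?>//sg;
    return $html;
}

# Made-up sample containing ordinary tags and one comment.
my $sample = "<p>Visible</p><!-- a note --><b>bold</b>";

print strip_tags_keep_comments($sample), "\n";
```

Note that this keeps everything beginning with "<!", which includes a DOCTYPE line as well as comments.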

This article is not intended to teach you Perl. The above was intended to whet your appetite and possibly start the creative juices flowing. Let me present just a couple more fun things you can do (which might even be useful), then I'll recommend a book or two and helpful links to tutorials and such.

Snooper

If you want to make a snooper to see the source code at a URL (say it's a frameset, a redirecting page, an external JavaScript file, or anything else your browser won't show you the source for), replace the last two lines of the above program with these lines:

$Page =~ s/\</&lt;/sg;
$Page =~ s/\>/&gt;/sg;
print "Content-Type: text/html\n\n";
print "<html><body><pre>$Page</pre></body></html>";

This code replaces the left- and right-angle brackets with &lt; and &gt; codes. It then prints a header telling the browser to expect an HTML formatted page. Last, it sends the page to the browser with $Page between PRE tags.
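The bracket conversion can be tried on its own with a sketch like the one below. The sub name escape_html and the sample link are hypothetical. I have also added one step the snooper lines above don't have: converting ampersands first, so that a page whose source already contains a code like &amp;lt; is shown as written instead of being turned into a bracket by your browser.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Convert the characters that HTML treats specially into entity codes,
# so a browser displays the source text instead of interpreting it.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;   # my addition; must run first or it would re-escape the codes below
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

# Made-up sample source line.
my $source = '<a href="page.html?a=1&b=2">Link</a>';

print "<pre>", escape_html($source), "</pre>\n";
```

If you use the ampersand step in the snooper, put it ahead of the two bracket lines for the same reason.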

Personalized Web Pages

The last example, a bit more complicated, is a bare-bones method of delivering personalized web pages.

First, place some tags into a web page. (If you put the tags between angle brackets, the page can be displayed without personalizing, too.) Here is an example page to get you started.

<html>
<body>
<p>Hello <MyTag:name> — this is your personal 
page. If we don't have your correct email address 
<MyTag:email> then please send it to us. Thank 
You <MyTag:name> !!!</p>
</body>
</html>

With the tags placed as they are, the page can be displayed without personalization -- the tags just don't show up.

To display that page with personalization, link to this script instead of directly to the above page:

#!/usr/bin/perl
use LWP::Simple;
$Name = 'William';
$Addy = 'possibilities@willmaster.com';
$Page = get 'http://url/to/above/page.html';
$Page =~ s/\<MyTag\:name\>/$Name/sig;
$Page =~ s/\<MyTag\:email\>/$Addy/sig;
print "Content-Type: text/html\n\n$Page";

In a working system, the name and email address would be retrieved from a database or cookie. In this example, lines 3 and 4 simply store them in the variables named $Name and $Addy

Line 5 must contain the URL of the template page you created above.

Lines 6 and 7 replace the tags with your custom information. And line 8 sends the web page to the browser.
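Once you have more than two tags, one substitution line per tag gets tedious. Here is one possible sketch, not the only way to do it, that keeps the values in a hash and fills every MyTag in a single pass. The sub name personalize, the hash layout, and the sample template are assumptions for illustration; the template is an in-memory string so the sketch runs without fetching anything.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical values; a real script might read these from a database or cookie.
my %values = (
    name  => 'William',
    email => 'possibilities@willmaster.com',
);

# Replace each <MyTag:key> with the matching hash value.
# Tags with no matching key are replaced with nothing.
# /e evaluates the replacement as Perl code; /i ignores case; /g repeats.
sub personalize {
    my ($page, $values) = @_;
    $page =~ s/<MyTag:(\w+)>/exists $values->{lc $1} ? $values->{lc $1} : ''/gie;
    return $page;
}

# Made-up template text using the same tag style as the example page.
my $template = "<p>Hello <MyTag:name>, is <MyTag:email> still your address?</p>";

print personalize($template, \%values), "\n";
```

Adding a new tag then means adding one entry to the hash, with no new substitution line.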

A Way of Thinking

Programming requires a certain way of thinking, just as gardening and writing a novel each require their own ways of thinking. The description may be inadequate, but writing a program line requires focusing on a tiny detail while imagining consequences several steps removed and visualizing the project as a whole. It helps to remember you're writing something for a machine that "thinks" only with zeros and ones, that everything is either entirely true or entirely false.

The way to learn is to practice programming. The more you program, the more you become familiar with the thought process.

Getting Started

Obtain a "learning perl" book, to start, and never mind all the other resources until later. There's a lot of stuff out there, and it could be a bit confusing without a basic grounding in the language.

I recommend "Perl for Dummies (with CD ROM)" by Paul Hoffman. The CD contains a Perl interpreter to install on your computer. Mari learned Perl with this book. Her previous programming experience was a class in BASIC or FORTRAN in high school. (A similar title by the same author, "Perl 5 for Dummies," is also on bookshelves. I've never read that one, so I can't make a recommendation.)

Do each exercise, from front to back, and you'll be well on your way to Perl guru-ness.

Some people learn well with "Learning Perl" by Randal L. Schwartz & Tom Christiansen. If you have a programming background, you may want to use this one instead.

If you need to pick up a copy of Perl for your computer, http://cpan.org/ports/ contains links where you can find Perl for dozens of operating systems, including Windows and Macintosh. Windows users have several choices; I recommend ActiveState (use the MSI link) at http://aspn.activestate.com/ASPN/Downloads/ActivePerl/

See "CGI Developer's Tools" linked from the WillMaster Possibilities article archives index at http://willmaster.com/possibilities/archives/ for tools you may wish to obtain for yourself.

Other Resources

After you become comfortable with the language, you'll want to become familiar with other Perl resources.

You'll find tutorials and helpful documentation at these three links:

  1. http://cgi.resourceindex.com/Documentation/
  2. http://www.hotscripts.com/Perl/Tips_and_Tutorials/
  3. http://www.perl.com/pub/q/documentation

A page of bookmarks to Perl-related resources is at http://bookmarks.cpan.org/

http://learn.perl.org/ has Perl books listed for various programming skill levels, and they have some lists you might consider joining.

http://perl.com/ is a technical book publisher's site with a focus on Perl. It has numerous resource links.

Mailing lists, more than you'll ever need, can be found at http://lists.perl.org/

A technical FAQ can be found at http://www.cpan.org/misc/cpan-faq.html

http://www.perl.com/pub/q/Article_Archive contains links to many articles, including programming how-tos.

What do I use? My favorite reference book is "Programming Perl" by Larry Wall, Tom Christiansen & Randal L. Schwartz. It's dog-eared and the covers are curling. I use it every day.

Copyright 2001 William Bontrager
Programmer/Publisher, "WillMaster Possibilities" ezine
http://willmaster.com/possibilities/
subscribe-possibilities@willmaster.com
Business Home Page: http://willmaster.com/