Perls of Wisdom (not)
Friday, December 31, 1999 at 5:01PM
I used to program for fun, but I don't have the time (or enough concentration) to write anything substantial any more. It's still fun to hack out one-liners from time to time, though.
The other day I decided to keep track of my Amazon.com sales rank on the front page of Leoville. To do this I'd have to write a program in perl that the web server could call using CGI, the common gateway interface. The program would return the rank which the web server would embed into my page.
The first iteration of the program was pretty simple, thanks to a perl library called LWP. The library provides built-in routines to access web pages. Using the LWP routine "get" I can fetch the contents of the Amazon.com page, then use Perl's built-in text search features to extract the ranking.
I wrote the program in a few minutes:
use LWP::Simple;
my $webpagetext;
# access Amazon web page
$webpagetext=get("http://www.amazon.com/exec/obidos/ASIN/0789726912/qid=1007181368/sr=1-6/ref=sr_1_74_6/104-8979567-7976756");
# find sales rank
$webpagetext =~ /(Sales Rank: )(\d+)/;
# output sales rank
print "Content-type: text/html\n\n"; # this header and the blank line after it are required for CGI output
print $2;
If you're not familiar with perl a few things might need explanation. All the real work is done in the line...
$webpagetext =~ /(Sales Rank: )(\d+)/;
In English this would read something like: search the contents of $webpagetext for the text "Sales Rank: " followed by one or more digits.
The parentheses in the phrase (Sales Rank: )(\d+) tell Perl to group the results. Perl assigns the text matched by the first group to the variable $1, the second group to $2, etc. I'll use $2 later to output the rank.
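To see the grouping in action without hitting Amazon, here is a minimal sketch that runs the same pattern against a made-up string. The sample text and the rank 742 are assumptions for illustration, not real data. Note that $1 and $2 are only trustworthy after a successful match, so the sketch tests the match first.

```perl
use strict;
use warnings;

# Hypothetical sample text standing in for the fetched page.
my $sample = "Amazon.com Sales Rank: 742 in Books";

if ($sample =~ /(Sales Rank: )(\d+)/) {
    # $1 and $2 are only meaningful after a successful match.
    print "group 1: '$1'\n";   # the literal label
    print "group 2: '$2'\n";   # the digits
} else {
    print "no rank found\n";
}
```

Since only the digits are needed, the first pair of parentheses could be dropped, in which case the rank would land in $1 instead of $2.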
Finally I print the result to standard output. The web server captures that output through CGI and inserts it into the web page that called the program.
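One thing the quick version doesn't handle is failure: LWP::Simple's get returns undef when the fetch fails, and the pattern may not match if the page layout changes. The sketch below shows one way to guard against both. The helper name rank_response and the "n/a" fallback text are assumptions made for illustration; they are not part of the original program.

```perl
use strict;
use warnings;

# rank_response is a hypothetical helper; $page is whatever get() returned.
sub rank_response {
    my ($page) = @_;
    my $body = "n/a";   # fallback text, an arbitrary choice
    if (defined $page && $page =~ /Sales Rank: (\d+)/) {
        $body = $1;
    }
    # CGI header, blank line, then the body
    return "Content-type: text/html\n\n" . $body;
}

print rank_response("Sales Rank: 742"), "\n";
print rank_response(undef), "\n";
```

Building the response as a string keeps the fetching and the formatting separate, which makes the formatting easy to try from the command line.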
I use Apache's server-side includes (SSI) to call the perl program and embed the results of the program. On my system that means putting the line:
<!--#include virtual="/cgi-bin/ranking.pl"-->
into the web page. When the web server sees it, it calls ranking.pl and sticks the result into the page at that point.
So far, so good. I could run the program locally and it worked fine, but it wouldn't work on my server. Turns out the LWP module was never installed there. I wasn't sure how to get around that until I installed Movable Type. This blog software uses several modules that aren't part of my web host's perl installation, and from it I learned that I could put the needed modules in a directory on the web host and tell my program to look for them there. Thus adding the line:
use lib "/cgi-bin/mt/extlib/";
at the beginning of the program, and storing the LWP::Simple module in the extlib directory, fixed the problem, and version 1.0 of my program was up and running. Worked great, too, until my book's rank slipped past 999. Amazon displays ranks of 1,000 and up with commas, and my program didn't account for that: \d+ stops at the first comma, so a rank like 1,204 (a made-up example) came out as 1. I changed the search to accept commas by substituting the character class [0-9,] for \d:
$webpagetext =~ /(Sales Rank: )([0-9,]+)/;
and it was working again.
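If the rank is ever going to be compared or stored as a number, the commas have to come back out. Here is a minimal sketch of that, using a hypothetical helper named parse_rank and made-up sample ranks; neither comes from the original program.

```perl
use strict;
use warnings;

# parse_rank is a hypothetical helper, not part of the original program.
sub parse_rank {
    my ($text) = @_;
    return unless defined $text;
    if ($text =~ /Sales Rank: ([0-9,]+)/) {
        my $rank = $1;
        $rank =~ tr/,//d;      # delete commas: "1,204" becomes "1204"
        return $rank;
    }
    return;                    # no rank found
}

print parse_rank("Sales Rank: 1,204"), "\n";
print parse_rank("Sales Rank: 87"), "\n";
```

The tr operator with the d modifier simply deletes every comma, so the same code handles ranks with or without them.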
Incidentally, I work in perl on both Windows and Macintosh. On Windows, I use an excellent shareware editor from DZSoft. On Mac OS X I use BBEdit from Bare Bones Software. Both really speed up the development cycle: you can run the program from within the editor, and each has built-in FTP uploading and a perl reference.
No program is ever done, and neither was this one. Next time, how I extended it to keep track of the peak scores. (And maybe one of you perl experts can help me with a bug that's really been buggin me.)