Guess I missed this post?
Size does matter (no matter what the girls all say). How much it matters depends on a ton of other things, with probably the biggest determinant being the RAM available in the server. When a script is executed, the web server has to read all of the code (including your require files) and then compile it into pseudo-binary code. Obviously, this overhead can't be ignored, and the size of your code base is a significant factor. But it's not the only factor. If the server can, it will cache both the script source and the pcode, and that's where the amount of RAM comes into play. If you have a lot of different scripts, as we do in the forums, it's unlikely they can all be cached. And if you have relatively few programs being executed because visitors only hit them once an hour, anything cached will be cleared before it's used a second time. In your case, you're sharing that RAM with everyone else on the server, too, so …
Still, the compiler in Perl is highly optimized, and compilation is usually a very, very small fraction of the time used by a CGI process. In the "old days," I would keep two versions of every program: my working copy and my executable copy. The latter was stripped down to the bare essentials, with all comments and white space removed. As CPUs got faster, however, this became less and less necessary and more and more of a pain in the, uh, neck. I wouldn't even dream of doing this today. The cost of my time greatly exceeds the cost of a faster CPU.
Your nested if/then statements will also take no more than a heartbeat of execution time. When optimizing a program for speed, the first two places to look are loops and disk IO. Code embedded in a loop may run thousands of times and will obviously have a big impact; optimize that code first and you may not have to optimize anything else. Disk IO is usually the bottleneck in any program. In the past ten years, we've gone from 486-66 CPUs up to gigahertz Pentiums, several orders of magnitude of difference, but the speed of our disk drives has only marginally increased. In many applications, the CPU is twiddling its thumbs most of the time, waiting for the disk to do its job. (This, too, is conditional. With standard IDE drives, the CPU has to issue a disk command and wait. With SCSI drives, the CPU can issue multiple disk commands and then go do something else while the SCSI controller does its thing. For single-user machines, IDE is cheap and fast enough, because the CPU is rarely multitasking and can afford to wait. On multi-user machines, like web servers, SCSI drives allow the CPU to keep busy and avoid most waits.)
Corollary: Find any loop that is performing disk operations and optimize the heck out of it.
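To make that corollary concrete, here's a minimal sketch (in Python rather than Perl, purely for illustration; the function and file names are made up): the slow version opens and writes the file on every pass through the loop, while the fast version builds the output in memory and touches the disk once.

```python
# Slow: one open/write/close per iteration -- the disk sets the pace.
def log_slow(records, path):
    for r in records:
        with open(path, "a") as f:
            f.write(r + "\n")

# Fast: do the work in memory, then hit the disk a single time.
def log_fast(records, path):
    with open(path, "a") as f:
        f.write("".join(r + "\n" for r in records))
```

Same output either way; the only difference is how many times the loop goes out to the disk.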
The new application I've been working on for the past two months comes in at about 6,000 lines of code, or about 200K. Everyone is different, but I prefer to work on one large program during development; it eliminates the need to remember which files hold which routines, and it makes searches and search-and-replace operations easier, too. At some point, however, I'll start breaking that monolithic source file into multiple programs that Perl can compile more quickly. I'll have one required file that holds routines common to all functions, then one program for each of the major functions. When someone is, say, editing a graphical image, the web server shouldn't need to load the 2,000 lines of code that manage directories.
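That layout can be sketched as a small dispatcher (Python here just for illustration, with `importlib.import_module` playing the role of Perl's `require`; all module and function names are hypothetical): each major function lives in its own file, and only the one that's actually requested gets compiled.

```python
import importlib

def dispatch(function_name):
    """Load and run only the program for the requested function.

    Shared routines would live in one common module that every
    function file pulls in; the point is that handling an
    image-editing request never compiles the directory code.
    """
    module = importlib.import_module(function_name)  # roughly Perl's require
    return module.run()
```

With hypothetical `edit_image.py` and `manage_dirs.py` files on the path, `dispatch("edit_image")` compiles only `edit_image.py` and leaves `manage_dirs.py` untouched.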