navwin » Tech Talk » Beyond the Basics » Database & Arrays?
Beyond the Basics
Post A Reply Post New Topic Database & Arrays? Go to Previous / Newer Topic Back to Topic List Go to Next / Older Topic
Christopher
Moderator
Member Rara Avis
since 1999-08-02
Posts 8296
Purgatorial Incarceration

0 posted 2002-10-22 04:54 AM


I'm back!

My site is in desperate need of updating. However, I really, really, really don't want to continue updating everything by hand as I've done in the past.

With that in mind, I'm working (slowly) on writing a script that will help me with it.

My immediate goals for this script follow along the lines of how these forums are set up. I want someone to enter their submissions, have that information saved to a database, have a "title list" created where I can click on the link, have that information pulled from the database and passed through a template page, presenting me with formatted output.

Uhm - I have the first part, the submission part covered. I have the template part covered, though I don't know how to pull it from a database. What I'd like to go over now, if we can, is how to arrange my arrays so that I can pull out the information.

ie:

@info = ();
$info[0] = $email;
$info[1] = $nameF;
$info[2] = $nameL;
$info[3] = $bio;
$info[4] = $category;


Would be my array ( ? ) for the information going to my database. Only - I have no idea how to "key" this... it seems all inclusive, disallowing me to have the person's email address as the defining key which links to the rest of their information.

Any help would be appreciated. (Because once I get this, I want to go back to writing files to disk)

[This message has been edited by Christopher (10-22-2002 04:54 AM).]

© Copyright 2002 C.G. Ward - All Rights Reserved
Ron
Administrator
Member Rara Avis
since 1999-05-19
Posts 8669
Michigan, US
1 posted 2002-10-22 07:05 AM


quote:
(Because once I get this, I want to go back to writing files to disk)

It's the other way around, Chris. Once you write it to disk, the arrays happen all by themselves.

$ID = "00000689"; # just happens to be your forum ID
open(DB,">$path/$ID");
# your data "might" have carriage returns so we strip each
# one and then write it to the file we just opened
chomp($email);
print DB "$email\n";  # we add a line feed
chomp($nameF);
print DB "$nameF\n");
.
.
.
close (DB);

Later, you'll read it back into an array, like this:

open (DB,"<$path/$ID");
@info = <DB>;
close (DB);

Unless you tell it differently, Perl's <> operator assumes you want an array based on the current record separator (the default line separator is the line feed). It's up to YOU to make sure the elements of the array always match up with the correct data after your program has read the disk file.

Make sense?

This is just one of many ways to store records in a flat-text file, and happens to be a poor way if you anticipate a LOT of records. For example, with over 5,000 Members in the forums, the directory that hold those over 5,000 data files has become a bottleneck in the system. No operating system, whether Unix or Windows, handles excessively large directories very well. When the program opens a file in that dir, the OS has to sequentially read through the directory list to find it. SLOW! But if you're going to have less than, oh say, 2000 records, it's a very easy way to program.

Christopher
Moderator
Member Rara Avis
since 1999-08-02
Posts 8296
Purgatorial Incarceration
2 posted 2002-10-26 02:08 AM


It does make sense (mostly, I'll get to that in a second).

I assuming the next step would be SQL? Would that be really difficult to changeover to in the future, if one starts with the default DB? Just curious. I think I'll stick with this method for the time being, as it should take quite some time before reaching those kinds of numbers.

Two things:

Would you know what [the email address] is when reading the DB because it'll be the (in this case) first entry? Is that how you would determine what you were reading / pulling out of an existing file?

Everything in what you wrote makes sense except part of the following -

open(DB,">$path/$ID");

opening DB, which is the name of the file, you want to add to the file ">", located in the "path/sub-directory". Right? Ok - where I'm not clear is on how to set up "$path", or if I need to if the subfolder leads from the same directory as the program.

Also - will this create a directory? If so, should I put it in my "www" portion instead of my CGI bin (I'm thinking so).

Thanks Ron, big help as always.

Christopher
Moderator
Member Rara Avis
since 1999-08-02
Posts 8296
Purgatorial Incarceration
3 posted 2002-10-26 02:10 AM


or - just thought of this - would you know what $email was, because it would be listed in DB as

$email = "christopher@countlesshorizons.com\n"

?

Christopher
Moderator
Member Rara Avis
since 1999-08-02
Posts 8296
Purgatorial Incarceration
4 posted 2002-10-26 02:10 AM


that doesn't seem right... I don't know, I should quit supposing until you get the time to get to this, lol.
Ron
Administrator
Member Rara Avis
since 1999-05-19
Posts 8669
Michigan, US
5 posted 2002-10-26 07:57 AM


The next step would indeed be SQL, but it would NOT be a natural progression. More like apples and oranges, and a complete rewrite. Probably a different language, too. (Perl does SQL, but PHP does it much more easily.) I have, so far, managed to avoid using a "real" database on the web, partly because I download most of the data to a local database application, but mostly because I don't like the way SQL is implemented on the Internet (through connections, rather than talking directly to the DB program). But, the technology hasn't improved in three years, and I doubt I'll be able to avoid it much longer.

Yes, you would "know" the email address was the first field simply because you wrote it as the first field. It's not uncommon to use variables (that don't really vary) to make it more intuitive. At the top of the program, for example, you may define $dbPoetEmail = 1; and then any time you reference the array, it would be as @info[$dbPoetEmail]. Perl hashes are another way to do the same thing, but a bit more advanced.

The $path variable is obviously set up before you open the disk file and can be anything you need it to be, either an absolute path or a relative path. Examples of an absolute path might be "/cgi-bin/database" (start at root / and then go to cgi-bin and then go to database) or "/www/database", while a relative path might be as simple as "database" (start where the program is running, in cgi-bin, and find the directory called database immediately under it).

You probably don't want to put data files in your document root (the www portion of your site), because then you have to take special precautions to protect them. Remember, I can point my browser to any directory in your www folder. In truth, those precautions have probably already been taken for you, but you never know about the next server you use. On the other hand, the cgi-bin is treated differently by the web server. You can't even view images from the cgi-bin on most systems, but can ONLY run programs. That usually prevents people from seeing the contents of data files.

You also have to have the directory for the database set up with the proper permissions. Remember, you're on a shared server and they usually don't want programs just writing files any old place. The CGI programs run as very low privileged users, usually, and can't write to the same folders you can access with FTP. That's a good thing. On Unix, I manage my own directories with telnet and the chmod command, but on NT you'll probably have to have the server admin do it for you. That's a bad thing.

Ron
Administrator
Member Rara Avis
since 1999-05-19
Posts 8669
Michigan, US
6 posted 2002-10-26 08:25 AM


Oops, missed something important.

When you said, "opening DB, which is the name of the file …" you left some room for confusion. DB is not the "name" of the file, but is the file handle. It's nothing more than a special variable we can use to refer to the opened file later. The name of the file is $ID, and that's what you would see in the directory if you looked at it outside of Perl.

Your idea of keying off the email address is a common one. Unfortunately, it doesn't work well because an email address can contain characters that will choke a file name. The key to your file in a flat-text system like this one is the file name, and I typically just use a sequential number. The email address becomes just another field in the file and, if you want to find a data record with a specific email address, you have to jump through some programming hoops to do it. If you tell me your Member number is 689, I can directly read your data record in a heartbeat. If you tell me your email address, though, I have to find you. The usual way to do that is with indexes, which are separately maintained disk files. For example, the forums use an index called members.idx that looks like this:

rcarnell@voyager.net|00000003
christopher@countlesshorizons.com|00000689
etc …

The program can read this file in one fell swoop, looking for an email address and then using the corresponding file name.

open (MEMBERS, "members.idx");
@index = <MEMBERS>;
close (MEMBERS);

foreach $line (@index) {
  ($email, $key) = split($line, /\|/);
  if ($email eq $WhatWeNeed) {
    last; #break the loop with $key set
  }
}

Not incidentally, I don't encourage the use of aliases in the forums, and I typically go through every few months deleting unused usernames, because of exactly this reason. Every username in the index adds work for the server, and the index files get used a LOT.

Christopher
Moderator
Member Rara Avis
since 1999-08-02
Posts 8296
Purgatorial Incarceration
7 posted 2002-11-19 07:08 AM


Having an issue here and am really lost trying to figure it out. I've walked it through and still can't see what I might be missing.

#Compare email addy for possible match
open (MEMBERS, "members.idx") || die "Can't open member file: $!";
@index = <MEMBERS>;
close (MEMBERS);

--No problem up to this point, just for reference.

  foreach $line (@index) {
  ($mEmail, $key) = split($line, /\|/);
       if ($mEmail eq $email) {
   last;
   }
}

--This is where I feel I'm having my problem, but am not sure why. As I understand it, if $mEmail eq $email then that would set the value of the variable $ID as the number corresponding to the found email. Is this correct?


if ($key > 0) {$ID = $key;} #If match, retreive ID#
       else {&GetLock;....

--Blah, blah, lol. The "else" part works fine. Problem is, it works that way everytime. I now have several ID's in the .idx file with the same email address. Which tells me that $key is not being returned somehow. And that, is what I don't understand. Can you help me here... just a nudge please?

Ron
Administrator
Member Rara Avis
since 1999-05-19
Posts 8669
Michigan, US
8 posted 2002-11-19 10:18 AM


Your biggest problem is my fault. A bit of a typo and a frequent one for me. I transposed the arguments in the split call. It should be split(/\|/, $line). I seem to do that often, perhaps because logically Perl's way makes less sense to me.

However, you also have two other problems you need to address.

When Perl reads an array from the diamond operator <MEMBERS> the array elements holds exactly what was in the file. That is going to include the new line character, too, so you need to strip that off.

Second, your test if ($key > 0) will ALWAYS succeed once you fix the split statement. Even if $mEmail never equals $email, $key will still be set to the last line in the array when the loop concluded.

Try this on for size:

$ID = 0;  # same as boolean False
foreach $line (@index) {
   ($mEmail, $key) = split(/\|/, $line);
   if ($mEmail eq $email) {
      chomp($key);  # remove new line
      $ID = $key;   # greater than zero is boolean True
      last;
   }
}

if (! $ID) {
   GetTheLock("id");
   etc …
}

Christopher
Moderator
Member Rara Avis
since 1999-08-02
Posts 8296
Purgatorial Incarceration
9 posted 2002-11-19 10:46 PM


You rock, Ron.

I have one question and a comment to the above:

What is the difference in the way it is listed? I'm trying to see how the transposition caused the problem, but am not following the logic?

($mEmail, $key) = split(/\|/, $line);

Or, as I think I may have just guessed, are you, in the split command, telling it to place what is to the right of the pipe symbol as the $key value? That might make sense, am I getting close?


Second:

if (! $ID) {
   GetTheLock("id");
   etc …
}

You're slick, man. Thanks for that little hint there. I can see how utilizing operators like "!" can reduce the amount of code you need to write. Slick!

Ron
Administrator
Member Rara Avis
since 1999-05-19
Posts 8669
Michigan, US
10 posted 2002-11-19 11:23 PM


The split() function is about the neatest thing since peanut butter. I like it so much that, since discovering it in Perl, I've duplicated the functionality in both Visual Basic and C++.

You pass split() two parameters; a database record and the field separater. In this instance, there's only two fields in our "record" and they're separated with a pipe symbol. The split() function returns a list of the fields, with "list" being another word for array. We could just as easily have said @fields = split(/\|/, $line) and had our variables assigned to $field[0] and $field[1]. And we can have a lot more fields than just two (and commonly do). For example, part of the database record for THIS thread might look like this:

000000|Christopher|10-22-2002|04:54 AM|I'm back!<br><br>My site is in desperate need...
000001|Ron|10-22-2002|07:05 AM|<BLOCKQUOTE>(Because once I get this,...
000002|Christopher|10-26-2002|02:08 AM|It does make sense...
000003|Christopher|10-26-2002|02:10 AM|or - just thought of this - would...

Each line in the file represents a post within the thread, and each delimited field is obviously part of the post. Perl makes it very easy to read in the file, then use split() to tear apart each record in the file.

foreach $line (@thread) {
   @post = split(/\|/, $line);
   … print the post
}

If that was the only thing split() did, it would be useful. But every time you turn around, you'll find another use for it. Look at the dates as they're formatted in our threads? Gee, that's a dash delimited record!

($day, $month, $year) = split(/-/, $date);

Or look at any sentence in this post, and it becomes a space deliminted record.

@words = split(/ /, $sentence);

Yep. Definitely the neatest thing since peanut butter.

Christopher
Moderator
Member Rara Avis
since 1999-08-02
Posts 8296
Purgatorial Incarceration
11 posted 2002-11-19 11:49 PM


ROFL! Now I'm wanting a sandwich!

COOL! And that provides the next hint for me... slick man, slick.

Post A Reply Post New Topic ⇧ top of page ⇧ Go to Previous / Newer Topic Back to Topic List Go to Next / Older Topic
All times are ET (US). All dates are in Year-Month-Day format.
navwin » Tech Talk » Beyond the Basics » Database & Arrays?

Passions in Poetry | pipTalk Home Page | Main Poetry Forums | 100 Best Poems

How to Join | Member's Area / Help | Private Library | Search | Contact Us | Login
Discussion | Tech Talk | Archives | Sanctuary