Dan Knight
- 2002.04.25
Last time we created our database with four fields: id, link, model,
and text. Now we need to fill the database.
Again, there are two ways to do things. The hard way is to use
Telnet and manually enter all of your data. The easy way, assuming you
have the data in a file, is to import your information.
Filling the Mac of the Day Database
Call me lazy, but I simply don't want to spend a lot of time
rekeying this information or doing the cut-and-paste thing. Three
fields times nearly 100 Macs would take a long time.
Besides, I already had everything in HTML format. Here's the line
for the Mac Plus as an example:
- <A HREF="/1986/mac-plus/">Mac Plus</A>
(1/86-10/90). First Mac with SCSI, memory expansion. Longest model life
- over 4 years.
I wanted to break this down to three fields: link, model, and text,
as follows:
- link: compact/plus.shtml
- model: Mac Plus
- text: (1/86-10/90). First Mac with SCSI, memory expansion.
Longest model life - over 4 years.
I'd add the id field manually, since it didn't exist in my HTML
file.
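For readers who would rather script this split than do it by hand, here is a minimal Python sketch of the same idea. It is not the method used in this article; the names ENTRY and split_entry are made up for illustration, and the pattern assumes every entry looks like the Mac Plus line above. The link field simply comes out as whatever path the HREF holds, minus the leading slash.

```python
import re

# Hypothetical pattern for one entry: <A HREF="/link">model</A> text
ENTRY = re.compile(r'<A HREF="/?([^"]*)">(.*?)</A>\s*(.*)',
                   re.IGNORECASE | re.DOTALL)

def split_entry(line):
    """Return (link, model, text) for one HTML entry, or None if it does not match."""
    match = ENTRY.search(line)
    if match is None:
        return None
    link, model, text = match.groups()
    # Rejoin a description that was broken across lines and squeeze extra spaces.
    return link, model, " ".join(text.split())

sample = ('<A HREF="/1986/mac-plus/">Mac Plus</A>\n'
          '(1/86-10/90). First Mac with SCSI, memory expansion. Longest model life\n'
          '- over 4 years.')
print(split_entry(sample))
```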
The first step was to take the raw HTML from my file and copy it
into TextSoap, a powerful program for cleaning up text with manual
line breaks, extra spaces, and other problems. Running the scrub
command reconnected the HTML lines that Home Page had nicely broken
into pieces.
Then we copied that code to BBEdit Lite,
where our next step was to eliminate the extra space between lines of
code. Search for \r\r (two returns) and replace with
\r. Then save the file to the desktop.
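The same cleanup can be sketched in a few lines of Python, assuming the file uses classic Mac line endings (a single \r per line, as in BBEdit's search above). The function name collapse_returns is hypothetical. Unlike a single pass of search and replace, this version also handles runs of three or more returns.

```python
import re

def collapse_returns(text):
    """Replace every run of two or more carriage returns with a single one."""
    return re.sub(r"\r{2,}", "\r", text)

cleaned = collapse_returns("first line\r\rsecond line\r\r\rthird line\r")
print(repr(cleaned))
```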
As Brian (http://brkn.net/) explained to me, before you go any further
you need to "escape" things like quote marks. So we did a global search
for the quote mark and replaced it with \" - the backslash is used to
indicate that the next character is literal instead of punctuation. We
did the same with parentheses.
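As a rough illustration, the escaping step could be scripted like this in Python. The function name escape_for_mysql is made up for this sketch. Escaping existing backslashes first is an addition that the steps above did not need, and escaping the parentheses simply mirrors what we did; MySQL reads \( inside a quoted string as a plain parenthesis, so that part is harmless rather than required.

```python
def escape_for_mysql(value):
    """Backslash-escape quote marks and parentheses, as described above."""
    # Escape backslashes first so the ones added below are not doubled.
    value = value.replace("\\", "\\\\")
    for character in '"()':
        value = value.replace(character, "\\" + character)
    return value

print(escape_for_mysql('(1/86-10/90). First Mac with SCSI, memory expansion.'))
```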
Our Mac Plus description now looked like this:
- <A HREF="/1986/mac-plus/">Mac Plus</A>
\(1/86-10/90\). First Mac with SCSI, memory expansion. Longest model
life - over 4 years.
We also stripped out the unnecessary HTML code. The opening <A
HREF="/ was simply deleted. Because MySQL likes to work with
comma-delimited files, we replaced "> with ", " and also replaced
</A> with ", " so that each field ended up separated by a comma.
Now it looked like this:
- compact/plus.shtml", "Mac Plus", "\(1/86-10/90\). First Mac
with SCSI, memory expansion. Longest model life - over 4
years.
That had us most of the way there, but we still needed to add quote
marks at the start and end of each line, so we had BBEdit search for
\r and replace it with "\r", which closes the quote on one line and
opens it on the next. Three fields down; one to go.
- "compact/plus.shtml", "Mac Plus", "\(1/86-10/90\). First Mac
with SCSI, memory expansion. Longest model life - over 4
years."
Next we manually numbered every line in our database. The Mac Plus
is the 27th in our list, so the entry now looked like this:
- "27", "compact/plus.shtml", "Mac Plus", "\(1/86-10/90\). First
Mac with SCSI, memory expansion. Longest model life - over 4
years."
Brian and I had created 89 records that were just about ready to be
imported into the MySQL database we'd created. Just a couple more
steps. We needed to preface each line with INSERT INTO mod
(id,link,model,text) VALUES( and append ); to the end of
the line. Again, we did a search for the return (\r in BBEdit)
and replaced it with );\rINSERT INTO mod (id,link,model,text)
VALUES( to get the following:
- INSERT INTO mod (id,link,model,text) VALUES("27",
"compact/plus.shtml", "Mac Plus", "\(1/86-10/90\). First Mac with SCSI,
memory expansion. Longest model life - over 4 years.");
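The numbering, quoting, and INSERT wrapping can also be generated in one pass. The Python sketch below is only an illustration of that idea, not what Brian and I did. The helper names quote and build_inserts are hypothetical, the table and column names are the ones from this article, and only backslashes and quote marks are escaped, so the parentheses come out without backslashes.

```python
def quote(value):
    """Wrap a value in double quotes, escaping backslashes and quote marks."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'

def build_inserts(records, first_id=1):
    """Yield one INSERT statement per (link, model, text) record, numbered from first_id."""
    for record_id, (link, model, text) in enumerate(records, start=first_id):
        values = ", ".join(quote(v) for v in (record_id, link, model, text))
        yield "INSERT INTO mod (id,link,model,text) VALUES(" + values + ");"

records = [
    ("compact/plus.shtml", "Mac Plus",
     "(1/86-10/90). First Mac with SCSI, memory expansion. "
     "Longest model life - over 4 years."),
]
# Start at 27 only to match the Mac Plus example; a full list would start at 1.
statements = list(build_inserts(records, first_id=27))
print(statements[0])
```

Because enumerate hands out the numbers, two records cannot end up with the same id.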
It probably takes longer to explain the steps than it took to do
them. Then it was time to import the whole text file into MySQL using
phpMyAdmin. It failed the first two or three times because I'd
accidentally given two lines the same number.
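A quick check before importing would have caught that mistake. Here is a small Python sketch, assuming each statement follows the VALUES("27", ... layout shown above; ID_PATTERN and duplicate_ids are names invented for this example.

```python
import re
from collections import Counter

# Matches the first quoted number after VALUES( in each statement.
ID_PATTERN = re.compile(r'VALUES\(\s*"(\d+)"')

def duplicate_ids(sql_text):
    """Return the ids that appear in more than one INSERT statement."""
    counts = Counter(ID_PATTERN.findall(sql_text))
    return sorted((i for i, n in counts.items() if n > 1), key=int)

sample_sql = "\n".join([
    'INSERT INTO mod (id,link,model,text) VALUES("26", "a", "b", "c");',
    'INSERT INTO mod (id,link,model,text) VALUES("27", "d", "e", "f");',
    'INSERT INTO mod (id,link,model,text) VALUES("27", "g", "h", "i");',
])
print(duplicate_ids(sample_sql))
```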
Once the import was successful, we were done. The database had been
created. Now all we needed to do was use the data to create Mac of the
Day entries.