The archive contains all webpages.
This is mainly for developers or people who like to play around with SQL. It's also useful for people who like to clean up the freedb entries.
Also have a look at the excellent jmbase.
FreeDBToMySQL is a perl script which imports files from the freedb archive to a mysql database. Freedb is a free alternative to the CDDB-service, a service which lets you download information about your audio-CDs (like artist and title) directly into your CD-Player on your computer and saves you the time to type in the information by yourself. You should support freedb if you can, because the originators of CDDB did a rather nasty thing: The information about the CDs is supplied by normal users. Whenever a new CD comes out, someone types in the information and sends it to the CDDB-server. After enough information was supplied to CDDB, it went commercial and started bugging developers to only include support for CDDB in their player-applications. This is why freedb was made. The information supplied to freedb is and stays free.
I wrote this because this is the first step for the development of an SQL-based cddb-server which could be used locally without the internet. You can already use cddbd from the freedb site if you want to run a server by yourself. However, since freedb stores the information in a one-file-per-album-basis, your file system will get a bit cluttered because you will have over 500 000 files on your partition. (Moreover, on Mac OS X you are probably using HFS+, which really does not perform that well with that many files.) The logical step is to import all data into a SQL-database and modify cddbd to fetch the entries from there. I found that I can save about 5/6 of the space needed by the files alone, not even speaking of the overhead the file system needs for the management of the files. Instead of 500 000+ files, all data will be contained in 24 files (not counting in the files needed for mysql). And I haven't even used myisampack yet, which will compress the tables even further (at the cost of the tables being read-only from there on).
The following table should give you an idea of the task. For every directory, a new image file was made in order to unclutter the HFS+ B*-Tree. All tables were empty before each import. All imports were made on an iBook/600 with a 20 GB drive, the freedb archives consisted of freedb-complete-20020327.tar.bz2, freedb-update-20020327-20020501.tar.bz2 and freedb-update-20020501-20020601.tar.bz2. Note that the FreeDBToMySQL.pl used did not import the EXT-fields.
Directory name |
# of files |
Space consumed in HFS+ in MB (1) |
Space consumed in mySQL in MB (2) |
Space consumed in mySQL on incremental import in MB (3) |
Time to unpack the files in seconds (4) |
Time to import in seconds (4) |
Time to import on incremental import in seconds (5) |
jazz |
41895 |
186,2 |
28,3 |
28,3 |
886 |
2430 |
2430 |
data |
3465 |
15,7 |
3,1 |
30,6 |
533 |
153 |
147 |
blues |
24898 |
110,5 |
19,7 |
49,6 |
720 |
1145 |
1695 |
reggae |
7212 |
32,3 |
5,9 |
55,0 |
488 |
335 |
513 |
soundtrack |
19091 |
84,8 |
18,0 |
72,4 |
665 |
1078 |
1574 |
folk |
33782 |
149,8 |
27,3 |
98,8 |
717 |
1820 |
2510 |
country |
16074 |
71,6 |
13,7 |
111,7 |
575 |
759 |
1077 |
misc |
182879 |
814,5 |
143,8 |
254,1 |
2802 |
14823 |
17675 |
classical |
55371 |
245,6 |
50,0 |
302,3 |
1020 |
2890 |
6596 |
rock |
~170000 |
|
|
|
|
|
|
newage |
~21000 |
|
|
|
|
|
|
Before you can use this, you need a few things. Also, you should be able to use the terminal fairly well. You will need to download about 130-140 MB of files, and the database needs (after everything is imported) about 500 MB of hard disk space. You will temporarily need about 1 GB of additional scratch space. Also, importing all the files can literally take days. Be patient!
After installation, you will have mysql and the CD database on your harddisk. At the moment, the only thing you can do with this is train your SQL skills and make queries from the command line. You can't use this for automatic CD lookup.
What follows are detailed installation instructions for Mac OS X. Users of other operating systems may skip steps as needed. Feel free to mail me installation instructions for other systems.
fink install mysql
fink install dbd-mysql-pm
fink install bzip2
shell> mysql -u root mysql
mysql> SET PASSWORD FOR root=PASSWORD('new_password');
mysql> FLUSH PRIVILEGES;
shell> mysql -u root mysql
mysql> create database freedb;
mysql> grant select,insert,update,delete,create,drop,alter on freedb.* to lars@localhost;
mysql> FLUSH PRIVILEGES;
chmod 755 FreeDBToMySQL.pl
sudo /sw/share/mysql/mysql.serverstart
Now we will create a disk image for temporarily storing the files from freedb. The files are distributed in a few subdirecories in the archive. We will unpack and import one subdirectory at a time. (If you have a lot of space, you can of course unpack everything in one go.) I'm assuming there's a subdirectory called "freedbstuff" in your home directory containing all relevant files.
mysql < ~/freedbstuff/inittdb.mysql
bzcat ~/freedbstuff/freedb-complete-20020327.tar.bz2 | tar xvf - data/\*
bzcat ~/freedbstuff/freedb-update-20020327-20020501.tar.bz2 | tar xvf - data/\*
~/freedbstuff/FreeDBToMySQL.pl
rm -rf data
That's it. You can try to do some selecting stuff. Go to the shell, start the client with
mysql freedb
, and do a
SELECT id,title FROM artist WHERE name = "Dire Straits";
All this should work now. I hope someone starts to modify cddbd so that it will query the sql-server instead. Volunteers?
Don't worry if SELECT
s are slow if you play around while importing. Only the indexes which are really needed during importing are updated. For example, for table song, only the primary index is updated, so SELECT
s on the artist_id in this table is really slow. Indexes are updated at the end of the import.
MySQL is slow for importing stuff (at least the way it's done with this script), so don't get frustrated about the time it takes to import everything. Lookup is very fast.
In a lot of ways, the script is incomplete:
Feel free to help. :)
Table album:
Field name |
Type |
Comments, size, etc. |
Example (partly fictional) |
id |
mediumint(8) unsigned |
0-16 777 215, primary key, auto-increment |
59021 |
artist_id |
mediumint(8) unsigned |
0-16 777 215, foreign key to artist |
9076 |
title |
blob |
length <= 2^16, case sensitive |
Brothers In Arms |
filedir_id |
tinyint(3) unsigned |
0-255 |
7 |
filename |
char(8) |
length 8, case insensitive |
640c8a09 |
discids |
varchar(255) |
length <= 255, case insensitive, comma separated |
640c8a09,8d0cf209 |
length |
smallint(5) unsigned |
0-65535, in seconds |
3212 |
trackoffsets |
text |
length <= 2^16, case insensitive, comma separated |
182,23052,59222,77422, 105872,143630,174417, 195240,210842 |
year |
smallint(5) unsigned |
0-65535 |
1995 |
genre_id |
mediumint(8) unsigned |
0-16 777 215, foreign key to genre |
26 |
revision |
smallint(5) unsigned |
0-65535 |
2 |
firstline_id |
mediumint(8) unsigned |
0-16 777 215, foreign key to firstline |
2 |
processed_id |
mediumint(8) unsigned |
0-16 777 215, foreign key to processedline |
4 |
submitted_id |
mediumint(8) unsigned |
0-16 777 215, foreign key to submittedline |
164 |
Table artist:
Field name |
Type |
Comments, size, etc. |
Example |
id |
mediumint(8), unsigned |
0-16 777 215, primary key, auto-increment |
9076 |
name |
blob |
length <= 2^16, case sensitive |
Dire Straits |
Table dirname:
Field name |
Type |
Comments, size, etc. |
Example |
id |
mediumint(8), unsigned |
0-16 777 215, primary key, auto-increment |
7 |
name |
char(10) |
length = 10 characters |
misc |
Table firstline:
Field name |
Type |
Comments, size, etc. |
Example |
id |
mediumint(8), unsigned |
0-16 777 215, primary key, auto-increment |
2 |
firstlinetext |
varchar(255) binary |
length <= 255, case sensitive |
# xmcd |
Table genre:
Field name |
Type |
Comments, size, etc. |
Example |
id |
mediumint(8), unsigned |
0-16 777 215, primary key, auto-increment |
26 |
genretext |
varchar(255) binary |
length <= 255, case sensitive |
Rock |
Table processedbyline:
Field name |
Type |
Comments, size, etc. |
Example |
id |
mediumint(8), unsigned |
0-16 777 215, primary key, auto-increment |
4 |
processedbylinetext |
varchar(255) binary |
length <= 255, case sensitive |
# Processed by: cddbd v1.4b41PL3 Copyright(c) Steve Scherf et al. |
Table song:
Field name |
Type |
Comments, size, etc. |
Example |
id |
int(10) unsigned |
0-4 294 967 295, PK, auto-inc. |
730507 |
artist_id |
mediumint(8), unsigned |
0-16 777 215, foreign key to artist |
9076 |
album_id |
mediumint(8), unsigned |
0-16 777 215, foreign key to album |
59021 |
title |
blob |
length <= 2^16, case sensitive |
Money For Nothing |
song_num |
tinyint(3) unsigned |
0-255, starting from 0 |
1 |
Table submittedvialine:
Field name |
Type |
Comments, size, etc. |
Example |
id |
mediumint(8), unsigned |
0-16 777 215, primary key, auto-increment |
164 |
submittedvialinetext |
varchar(255) binary |
length <= 255, case sensitive |
# Submitted via: AltoMP3 Maker 3.0 |
Album.title and Artist.name are BLOBs because some were longer than 255. BLOB instead of TEXT because of case sensitivity.
Some varchars are binary to get case sensitiveness. Same goes for blobs.
Album.year: Since mysql does all kinds of conversions for a year field, it was decided to leave the field as a smallint(5).
Table dirname: The index for the name is done on the first two characters. Consider this if ever another directory is added.
Last modified: 18.07.2002
© 2002 Lars Bohn