A long time ago on a system far, far away...
Trying to migrate a database from MySQL to PostgreSQL. All the documentation I have read covers, in great detail, how to migrate the structure. I have found very little documentation on migrating the data. The schema has 13 tables (which have been migrated successfully) and 9 GB of data.
MySQL version: 5.1.x
PostgreSQL version: 8.4.x
I want to use the R programming language to analyze the data using SQL select statements; PostgreSQL has PL/R, but MySQL has nothing (as far as I can tell).
A New Hope
Create the database location (/var has insufficient space; also dislike having the PostgreSQL version number everywhere -- upgrading would break scripts!):
sudo mkdir -p /home/postgres/mainsudo cp -Rp /var/lib/postgresql/8.4/main /home/postgressudo chown -R postgres.postgres /home/postgressudo chmod -R 700 /home/postgressudo usermod -d /home/postgres/ postgres
All good to here. Next, restart the server and configure the database using these installation instructions:
sudo apt-get install postgresql pgadmin3sudo /etc/init.d/postgresql-8.4 stopsudo vi /etc/postgresql/8.4/main/postgresql.conf- Change
data_directoryto/home/postgres/main sudo /etc/init.d/postgresql-8.4 startsudo -u postgres psql postgres\password postgressudo -u postgres createdb climatepgadmin3
Use pgadmin3 to configure the database and create a schema.
The episode continues in a remote shell known as bash, with both databases running, and the installation of a set of tools with a rather unusual logo: SQL Fairy.
perl Makefile.PLsudo make installsudo apt-get install perl-doc(strangely, it is not calledperldoc)perldoc SQL::Translator::Manual
Extract a PostgreSQL-friendly DDL and all the MySQL data:
sqlt -f DBI --dsn dbi:mysql:climate --db-user user --db-password password -t PostgreSQL > climate-pg-ddl.sql- Edit
climate-pg-ddl.sqland convert the identifiers to lowercase, and insert the schema reference (using VIM)::%s/"\([A-Z_]*\)"/\L\1/g:%s/ TABLE / TABLE climate./g:%s/ on / on climate./g
mysqldump --skip-add-locks --complete-insert --no-create-db --no-create-info --quick --result-file="climate-my.sql" --databases climate --skip-comments -u root -p
It might be worthwhile to simply rename the tables and columns in MySQL to lowercase:
select concat( 'RENAME TABLE climate.', TABLE_NAME, ' to climate.', lower(TABLE_NAME), ';' ) from INFORMATION_SCHEMA.TABLES where TABLE_SCHEMA='climate';- Execute the commands from the previous step.
- There is probably a way to do the same for columns; I changed them manually because it was faster than figuring out how to write the query.
The Database Strikes Back
Recreate the structure in PostgreSQL as follows:
pgadmin3(switch to it)- Click the Execute arbitrary SQL queries icon
- Open
climate-pg-ddl.sql - Search for
TABLE "replace withTABLE climate."(insert the schema nameclimate) - Search for
on "replace withon climate."(insert the schema nameclimate) - Press
F5to execute
This results in:
Query returned successfully with no result in 122 ms.
Replies of the Jedi
At this point I am stumped.
- Where do I go from here (what are the steps) to convert
climate-my.sqltoclimate-pg.sqlso that they can be executed against PostgreSQL? - How to I make sure the indexes are copied over correctly (to maintain referential integrity; I don't have constraints at the moment to ease the transition)?
- How do I ensure that adding new rows in PostgreSQL will start enumerating from the index of the last row inserted (and not conflict with an existing primary key from the sequence)?
- How do you ensure the schema name comes through when transforming the data from MySQL to PostgreSQL inserts?
Resources
A fair bit of information was needed to get this far:
- https://help.ubuntu.com/community/PostgreSQL
- http://articles.sitepoint.com/article/site-mysql-postgresql-1
- http://wiki.postgresql.org/wiki/Converting_from_other_Databases_to_PostgreSQL#MySQL
- http://pgfoundry.org/frs/shownotes.php?release_id=810
- http://sqlfairy.sourceforge.net/
Thank you!