8

I have a database in PostgreSQL 8.3.1 that I'd like to migrate to MS SQL Server 2005 (or maybe 2008), including both the table schema and the data. The database is about 50 GB in size with about 400,000,000 rows, so I think simple INSERT statements are out of the question. Could anyone recommend the best tool for performing this migration? It needs to be reliable, so that the data in the target DB ends up exactly the same as in the source, and it needs to be able to copy this volume of data within a reasonable time.

warren

4 Answers

6

If you have the appropriate Postgres support drivers installed on your SQL 2005 box (or are willing to use Postgres via ODBC, or to dump the data from Postgres to a file and import from that), you can use the Import/Export Wizard in SQL Server to copy the data. The wizard asks a series of questions and then runs the import as a SQL Server Integration Services (SSIS) package, using appropriate batch insert operations.

However, if that wizard is not an option, it's worth considering that although you have a large number of rows, the rows average less than 135 bytes each, so given sufficient transaction log space to allow a 50 GB transaction to occur, 'simple insert' statements are not themselves out of the question.
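To make that concrete, here is a minimal sketch of a batched-insert approach, assuming the psycopg2 and pyodbc Python drivers (neither is mentioned above, so treat the connection strings, table and column names as placeholders). It streams rows out of Postgres through a server-side cursor and commits in batches, so no single transaction has to cover the whole 50 GB:

    # Rough sketch only, not a drop-in solution: stream rows out of Postgres
    # and insert them into SQL Server in batches. All names are placeholders.
    import psycopg2
    import pyodbc

    BATCH_SIZE = 10000

    pg_conn = psycopg2.connect("dbname=source_db user=postgres")
    ms_conn = pyodbc.connect(
        "DRIVER={SQL Server};SERVER=target_host;DATABASE=target_db;UID=some_user;PWD=secret"
    )

    # Named (server-side) cursor so psycopg2 doesn't pull all 400M rows into memory.
    pg_cur = pg_conn.cursor(name="migration_cursor")
    pg_cur.execute("SELECT id, col_a, col_b FROM source_table")

    ms_cur = ms_conn.cursor()
    while True:
        rows = pg_cur.fetchmany(BATCH_SIZE)
        if not rows:
            break
        ms_cur.executemany(
            "INSERT INTO dbo.target_table (id, col_a, col_b) VALUES (?, ?, ?)",
            rows,
        )
        ms_conn.commit()  # commit per batch so the transaction log stays manageable

    pg_cur.close()
    pg_conn.close()
    ms_cur.close()
    ms_conn.close()

Committing per batch keeps the transaction log from having to hold the entire load at once, at the cost of the copy not being atomic across the whole table.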

Steve Gray
  • Look at using BCP (a utility that comes with SQL Server) if you end up exporting the data and then importing it into SQL Server. Using SSIS is a good idea if you want to pull it directly from the PG server, but it may give you transaction log trouble. – ColtonCat Sep 15 '09 at 07:44
  • The SSIS package sounded very promising and I tried it, but unfortunately it runs out of memory and fails. :( ERROR [HY000] Out of memory while reading tuples.; Error while executing the query (PSQLODBC35W.DLL) – EMP Sep 16 '09 at 06:04
6

I ended up not using any third-party tool for the data, as none of the ones I tried worked for the large tables. Even SSIS failed. I did use a commercial tool for the schema, though. So my conversion process was as follows (example commands for steps 2 and 4 are sketched after the list):

  1. Full Convert Enterprise to copy the schema (no data).
  2. pg_dump to export the data from Postgres in "plain text" format, which is basically a tab-separated values (TSV) file.
  3. Python scripts to transform the exported files into a format bcp would understand.
  4. bcp to import the data into MSSQL.
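
Roughly, the export (step 2) and import (step 4) commands might look like the following; the database, table, file and server names are placeholders, and the bcp -t and -r flags (field and row separators) have to match whatever the transformation step in between produces:

    pg_dump --data-only --format=plain --table=some_table source_db > some_table.dump
    bcp target_db.dbo.some_table in some_table.bcp -c -t "|" -r "\n" -S sqlserver_host -U some_user -P some_password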

The transformation step took care of some differences between the format pg_dump produces and the format bcp expects (a sketch of such a script follows this list), such as:

  • pg_dump puts some Postgres-specific statements at the start of the file and terminates the data with "\.", while bcp expects the entire file to contain data
  • pg_dump writes NULL values as "\N", while bcp expects nothing at all in place of a NULL (i.e. no data between column separators)
  • pg_dump escapes tabs as "\t" and newlines as "\n", while bcp treats those sequences literally
  • pg_dump always uses tabs and newlines as separators, while bcp lets the user specify separators. Choosing different separators becomes necessary if the data contains any tabs or newlines, since bcp does not decode them.
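
For illustration, here is a minimal sketch of what such a transformation script might look like; the input/output file names and the control-character separators are placeholder assumptions, and the same separators would be passed to bcp via -t and -r:

    # Sketch of step 3: convert pg_dump "plain" data output into a file bcp can load.
    # File names and separators below are placeholders.
    FIELD_SEP = "\x01"   # must match bcp's -t argument
    ROW_SEP = "\x02\n"   # must match bcp's -r argument

    PG_ESCAPES = {"t": "\t", "n": "\n", "r": "\r", "b": "\b", "f": "\f", "v": "\v", "\\": "\\"}

    def unescape(field):
        """Undo pg_dump's backslash escaping; map \\N (NULL) to an empty field."""
        if field == "\\N":
            return ""
        out, i = [], 0
        while i < len(field):
            if field[i] == "\\" and i + 1 < len(field):
                out.append(PG_ESCAPES.get(field[i + 1], field[i + 1]))
                i += 2
            else:
                out.append(field[i])
                i += 1
        return "".join(out)

    with open("table.dump") as src, open("table.bcp", "w", newline="") as dst:
        in_data = False
        for line in src:
            if not in_data:
                # Skip pg_dump's preamble (SET statements etc.) until the COPY line.
                in_data = line.startswith("COPY ") and line.rstrip().endswith("FROM stdin;")
                continue
            if line.startswith("\\."):   # pg_dump's end-of-data marker
                break
            fields = line.rstrip("\n").split("\t")
            dst.write(FIELD_SEP.join(unescape(f) for f in fields))
            dst.write(ROW_SEP)

With separators like these, bcp would then be invoked with matching -t and -r arguments, as in the command sketch above.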

I also found that some unique constraints that were fine in Postgres were violated in MSSQL, so I had to drop them. This is because MSSQL considers two NULLs equal for the purposes of a unique constraint (so a unique index on a nullable column allows at most one NULL row), whereas Postgres treats NULLs as distinct and allows any number of them.

EMP
0

http://www.easyfrom.net/

There you go :) Unfortunately, it is a little expensive.

David Rickman
0

Almost 10 years on, and this is still not a straightforward issue. I ended up with a hybrid solution: I rolled my own schema mapper by exporting the schema and table/column comments using the following command:

pg_dump --schema-only --no-owner --no-privileges your_db_name > schema_create_script.sql

I then wrote a PHP script that translated the schema to T-SQL. Subsequently, I used the following third-party software to do the actual import of rows (no affiliation):

http://www.convert-in.com/pgs2mss.htm

It was a little slow, but so far so good. Our database was smaller than yours, only 15 GB, but that tool seemed to handle it well. It was also the cheapest one I could find, at about $50. So far it's been a worthwhile investment.

dearsina