
Need to replace SAN, best method with least downtime? (8.4.4)



I have a beefy server with two SANs, one "fast" (A) and one "slow" (B), and 1.3TB worth of 8.4.4 databases on A. A needs to be replaced/wiped completely with as little downtime as possible. It's flash-based and the modules need to be replaced, so "swapping the SAN and keeping the disks" is not an option. The databases are relatively busy, generating 8-50 16MB WAL segments per minute.
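To put the WAL figures in perspective, here is a quick back-of-the-envelope calculation. It is only a sketch: it uses the 16MB segment size and the 8-50 segments per minute quoted above, and assumes 1 GB = 1024 MB.

```python
SEGMENT_MB = 16  # WAL segment size quoted in the post


def wal_rate(segments_per_minute):
    """Return (MB per minute, GB per hour) for a given WAL segment rate."""
    mb_per_minute = segments_per_minute * SEGMENT_MB
    gb_per_hour = mb_per_minute * 60 / 1024  # assumes 1 GB = 1024 MB
    return mb_per_minute, gb_per_hour


# Low and high end of the range quoted in the post.
for rate in (8, 50):
    mb_per_minute, gb_per_hour = wal_rate(rate)
    print(f"{rate} segments/min: {mb_per_minute} MB/min, {gb_per_hour:.1f} GB/hour")
```

Whatever holds the WAL archive has to absorb this rate for as long as a standby or a base backup is in play.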

Several methods spring to mind:

a) pg_dumpall, wipe, restore (alternatively pg_dump global objects and all databases in parallel)

This will probably be 100% safe but will take a long time (pg_dumpall currently takes ~440 minutes), so it's only useful if the other methods all turn out to be too risky. Access to the databases would have to be blocked during the dump to avoid losing data written after it starts.
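The "global objects plus all databases in parallel" variant could be scripted roughly like this. It is a sketch, not a tested procedure: the database names and the output directory are made up, and it assumes pg_dumpall and pg_dump are on the PATH. Dumps taken this way are only consistent with each other if writes are blocked for the duration.

```python
import subprocess


def dump_commands(databases, out_dir):
    """Build one pg_dumpall call for global objects plus one pg_dump per database."""
    commands = [["pg_dumpall", "--globals-only", "-f", f"{out_dir}/globals.sql"]]
    for db in databases:
        # -Fc writes pg_dump's custom archive format, restorable with pg_restore.
        commands.append(["pg_dump", "-Fc", "-f", f"{out_dir}/{db}.dump", db])
    return commands


def run_parallel(commands):
    """Start every command at once and return the exit codes in order."""
    processes = [subprocess.Popen(command) for command in commands]
    return [process.wait() for process in processes]


# Hypothetical database names and target directory, printed rather than executed.
for command in dump_commands(["app_db", "stats_db"], "/mnt/san_b/dumps"):
    print(" ".join(command))
```

Calling run_parallel(dump_commands(...)) would actually start the dumps; how much parallelism helps depends on whether the disks or the CPUs are the bottleneck.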

b) set up a PITR slave (warm standby) on the same box, fail over to it, replace SAN A, then set up a PITR slave on A and fail over to it eventually

This would probably reduce my downtime to nearly nothing (except waiting for slave to read in archived WAL before restarting it as master, if there is some backlog). I cannot judge how risky it is in terms of data integrity. Also, it means running at reduced performance for a long time (1.3TB "hot backup" needs to be performed for fail over back to SAN A).
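For reference, the settings involved in b) could be generated along these lines. This is a sketch under assumptions: all paths are hypothetical, it uses the contrib pg_standby program as the restore_command, and it uses a plain cp as the archive_command. The extra check refuses an archive directory that overlaps the data directory, since mixing the two up is an easy way to damage one of them.

```python
import os


def render_standby_config(data_dir, archive_dir, trigger_file):
    """Return (postgresql.conf lines for the master, recovery.conf for the slave)."""
    data_dir = os.path.abspath(data_dir)
    archive_dir = os.path.abspath(archive_dir)
    # Refuse overlapping directories: one must not contain (or equal) the other.
    if os.path.commonpath([data_dir, archive_dir]) in (data_dir, archive_dir):
        raise ValueError("WAL archive dir and data dir must not overlap")
    master = (
        "archive_mode = on\n"
        # Never overwrite a segment that is already in the archive.
        f"archive_command = 'test ! -f {archive_dir}/%f && cp %p {archive_dir}/%f'\n"
    )
    # pg_standby waits for each segment until the trigger file appears.
    recovery = (
        f"restore_command = 'pg_standby -t {trigger_file} {archive_dir} %f %p %r'\n"
    )
    return master, recovery


# Hypothetical paths: data dir on SAN A, WAL archive on SAN B.
master_conf, recovery_conf = render_standby_config(
    "/mnt/san_a/pgdata", "/mnt/san_b/wal_archive", "/tmp/pgsql.trigger"
)
print(master_conf)
print(recovery_conf)
```

Creating the trigger file is what ends recovery and brings the slave up as the new master.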

c) set up a tablespace on B and move as many tables/databases over to it as possible without severe service degradation. Then shut down Postgres, perform a filesystem-level backup of the remaining data on A, replace A, restore, then move things back to the default tablespace.

Moving big tables/databases will cause service degradation or interruption, but only a few objects are really big and those aren't critical. I am hoping to end up with <=150GB of data to back up/restore, which should take 20-30 minutes (possibly less with rsync).
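The statements needed for c) could be generated with something like the following. It is a sketch: the tablespace name, its location, and the table and index names are all hypothetical. Indexes do not follow their table automatically, so they get their own ALTER INDEX, and each statement holds an exclusive lock on the object while its files are copied, which is where the service interruption comes from.

```python
def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def qualified(schema, name):
    """Return a quoted schema-qualified name."""
    return f"{quote_ident(schema)}.{quote_ident(name)}"


def move_statements(tablespace, tables=(), indexes=()):
    """Build ALTER statements moving (schema, name) pairs to the tablespace."""
    target = quote_ident(tablespace)
    statements = [
        f"ALTER TABLE {qualified(s, n)} SET TABLESPACE {target};" for s, n in tables
    ]
    statements += [
        f"ALTER INDEX {qualified(s, n)} SET TABLESPACE {target};" for s, n in indexes
    ]
    return statements


# Hypothetical tablespace on SAN B and hypothetical object names.
print("CREATE TABLESPACE \"san_b\" LOCATION '/mnt/san_b/pg_tblspc';")
for statement in move_statements(
    "san_b",
    tables=[("public", "big_log")],
    indexes=[("public", "big_log_pkey")],
):
    print(statement)
```

Moving things back after the SAN swap would be the same statements with pg_default as the target.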

What would you do, and why? I am considering c) at the moment because I am unsure about b): I cannot quickly check the integrity of the slave's datadir before I wipe the SAN (or can I?), and I don't know how well the slow SAN will hold up if all busy tables are moved to it. It also has to be done very carefully, with no mistakes in recovery.conf etc., or I might trash my datadir or WAL archive dir.

Is there anything unsafe about c) that I am missing here? Looking at a few hundred tables and indexes to classify and eventually move them is a lot of work, but it seems worth it to me.
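The classification itself could be partly automated. The sketch below assumes the sizes have already been collected (for example with pg_total_relation_size) and that the list of critical relations is maintained by hand; every name, size, and threshold here is invented for illustration.

```python
GB = 1024 ** 3


def classify(relations, critical, min_move_bytes):
    """Split (name, size_bytes) pairs into relations to move and relations to keep.

    A relation is moved if it is not critical and is at least min_move_bytes.
    Returns (to_move, to_keep, bytes_remaining_on_a), largest relations first.
    """
    to_move, to_keep, remaining = [], [], 0
    for name, size in sorted(relations, key=lambda r: r[1], reverse=True):
        if name not in critical and size >= min_move_bytes:
            to_move.append(name)
        else:
            to_keep.append(name)
            remaining += size
    return to_move, to_keep, remaining


# Invented sample data; real sizes would come from the catalog.
sample = [
    ("events_archive", 600 * GB),
    ("orders", 40 * GB),
    ("audit_log", 300 * GB),
    ("sessions", 5 * GB),
]
to_move, to_keep, remaining = classify(sample, critical={"orders"}, min_move_bytes=10 * GB)
print("move:", to_move)
print("keep:", to_keep)
print("left on A:", remaining // GB, "GB (target is <=150 GB)")
```

Lowering the threshold until the remaining size fits the 150GB target gives a first cut that can then be reviewed by hand.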


Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)