Re: Backing up postgresql database

On Tue, Mar 3, 2009 at 4:49 PM, Jakov Sosic <jakov.sosic@xxxxxxx> wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Kevin Grittner wrote:
>
>> What you are copying is all the information required to restore the
>> database to the state it was in after the commit of any one of these
>> transactions.  Out of curiosity, how much space would you have thought
>> that would take?
>
> Well I surely didn't think it would take 4.8 TB per day on a 150GB
> database! I thought it would take less than the DB itself. Wouldn't it be
> better to simply store SQL commands? :)

That might not be much smaller if you're doing a lot of updates /
inserts / deletes.  Also, SQL isn't the same as what was committed: if
you insert now(), replaying it two days later won't give the same value.

The key here is that you seem to have a very high churn rate, and if
your backup solution involves saving all those changes and applying
them later, it's going to incur rather large storage use.

I.e. PITR, as you have it implemented, might not be your best solution.

I do believe there's a warm standby method where you are constantly in
recovery on the target database, but I'm not sure it would work right
here.

You might be better off with something like Slony, Bucardo, or a
number of other live replication setups that don't have the same
issues you're seeing here.

> Please, don't get angry because of my attitude, but I'm new to backup
> strategies and that's why I'm puzzled a lot by the volume of
> information... I mean, we do have LTO4 tapes, but still :)

I don't think anyone's getting angry.  :)

>> It records each change made to every row in the database.  Do fewer
>> updates or keep the WAL files for less time?
>
> How do you mean, do fewer updates? DB transactions can't be influenced -
> the cause is at the application level, not the DB level.

I think you missed his point.  If you can't do fewer updates, then
your PITR implementation isn't going to work unless you've got 40TB to
dedicate to the job.

> And what do you mean by keeping WAL's for less time? Daily volume is
> still the same... I can define that I'll keep only a two-week backup, and

Reduce the time for keeping old log files to one or two days and do
nightly rsyncs?
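If you go that route, pruning the archive is just a cron job.  A sketch,
assuming gzipped segments and a made-up archive path:

```shell
#!/bin/sh
# Prune gzipped WAL segments more than two days old from the archive.
# /wal_archive is an illustrative path -- point this at your real archive.
mkdir -p /wal_archive
find /wal_archive -name '*.gz' -mtime +2 -delete
```

Just make sure you never prune segments newer than your last good base
backup, or the chain won't replay.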

> not 90 days as is policy in my company, but as I've mentioned, that's
> still lots and lots of data :)

It's probably better to keep pg_dump type backups for those kinds of
retention requirements.  PITR is a poor challenger to the easily
compressed output of pg_dump for offsite backups, especially over a wire.
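i.e. something along these lines from cron, with made-up database, host,
and path names:

```shell
#!/bin/sh
# Nightly compressed logical backup, shipped offsite.
# -Fc is pg_dump's compressed custom-format archive; -Z9 is max compression.
f=/backup/mydb_$(date +%Y%m%d).dump
pg_dump -Fc -Z9 mydb > "$f"
rsync -az /backup/ backuphost:/offsite/pgsql/
```

The custom format also lets you restore individual tables later with
pg_restore, which the raw PITR archive can't do.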

> I thought that WAL's would be a lot smaller
> than the whole DB because only a small part of DB information is inserted on

If I update a single row in a database 1,000,000,000 times, how many
WAL files will you need?
If you update every row in a 1,000,000,000 row database, how many WAL
files will you need?
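To put numbers on it: at the ~4.8 TB/day you reported, with the standard
16 MB WAL segments, the arithmetic comes out to:

```shell
# Rough arithmetic on the reported WAL volume (4.8 TB/day is from the
# original post; 16 MB is PostgreSQL's default WAL segment size).
daily_wal_gb=4800
wal_segment_mb=16
segments=$(( daily_wal_gb * 1024 / wal_segment_mb ))
echo "$segments WAL segments per day"   # prints: 307200 WAL segments per day
```

That's one segment archived roughly every 0.3 seconds, around the clock.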

> I've read about gzipping WAL's, and I will do it of course, but that only
> makes the problem a little smaller, it doesn't solve it :)
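FWIW, you can do the gzip right in archive_command rather than as a
separate pass.  A quick demo of the operation on a fake segment (the
archive path is illustrative, and a zero-filled file compresses far
better than real WAL full of row images will):

```shell
# The usual pattern is to compress in archive_command itself, e.g. in
# postgresql.conf:
#   archive_command = 'gzip < %p > /wal_archive/%f.gz'
# Same operation demonstrated on a fake 16 MB segment:
dd if=/dev/zero of=/tmp/demo_wal bs=1M count=16 2>/dev/null
gzip -c /tmp/demo_wal > /tmp/demo_wal.gz
wc -c /tmp/demo_wal /tmp/demo_wal.gz
```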

I'm wondering what your objectives are in your backup strategy; it
might be that you're coming at it one way when we'd all approach it
from another way.  Or not.

-- 
Sent via pgsql-admin mailing list (pgsql-admin@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-admin

