Search Postgresql Archives

Streaming replication and WAL archives

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 




TLDR: We want to be able to use streaming replication, WAL archiving, and have the ability to restore from a backup made before a failover using the WAL archive.

Setup:
We use postgresql 9.1 with streaming replication between a pair of nodes. The nodes are identical, and can fail over and fail back between master/slave pairs. We have separate reliable storage where we store periodic base backups and WAL archives. On each node, archive_command and restore_command write and read from this shared location.

Currently, WAL archives are written from the current master to a shared location. On failover, the new master will write WAL files with the same name as already exist in the archive. Currently we allow these to overwrite the old files. (See more below)

When a failed node comes back, if it is able to resume streaming from the new master, it does. Otherwise it does an rsync backup from the new master and eventually gets back into sync by replaying archived WAL files.

WAL archiving by new master:
When the new master comes up, it wants to archive some of the WAL files it has. These WAL files have the same name as files that have already been archived by the old master, but different contents. Newer WAL files in the new timeline cannot be archived until the old ones are archived. For this reason we allow the old files to be overwritten. This works consistently for *failover*, and is effectively the same thing as having each node write archives to a separate location, and have each only restore from the others' archives.

Note that some of the contents of the WAL files on the old slave were never archived, and do need to be archived. The slave has transactions that were part of the open WAL file on the master that the master had not archived, yet, but that the slave received via streaming replication. So the end of the old timeline can only come from the old slave/new master.

Restore:
We also want to be able to restore from the base backups. The problem we are encountering is that a server restored from a base backup fails to continue to restore once it hits a WAL file that was overwritten. I.E. that was written by the new master after a failover.

Related discussions:
http://www.postgresql.org/message-id/CADKfymHjBa9=edv1z8qh8G9o44iA2WNcRMNuowqj90djL+Y9Pg@xxxxxxxxxxxxxx
http://www.postgresql.org/message-id/CAHGQGwHVYqbX=A+zo+AvFbVHLGoypO9G_QDKbabeXgXBVGd05g@xxxxxxxxxxxxxx

Questions:
- Is there a single series of WAL files that will be the full timeline for the old timeline? I.E. if we had every WAL file from each node, could we put them together in a way that would allow us to replay them from a backup made before the failover?
- Is there a better way to handle any of this?
- Is there something we're missing?

Thanks,
-Alan Bryant


--
Sent via pgsql-general mailing list (pgsql-general@xxxxxxxxxxxxxx)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general




[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Books]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]
  Powered by Linux