Comments on: How PostgreSQL protects against partial page writes and data corruption
http://www.xaprb.com/blog/2010/02/08/how-postgresql-protects-against-partial-page-writes-and-data-corruption/
Fri, 10 May 2013 18:25:19 +0000

By: Enzo
Mon, 15 Aug 2011 20:48:04 +0000
http://www.xaprb.com/blog/2010/02/08/how-postgresql-protects-against-partial-page-writes-and-data-corruption/#comment-19577

“Postgres doesn’t do that” is the dumbest thing I’ve heard when it comes to corruption. Explain this error then:

SQLERRM: missing chunk number 0 for toast value 9785285
SQLCODE: XX000

If you Google this, you will see that it is very common due to hardware failures. In my case, dropping the toast index, rebuilding it and vacuuming the database have done absolutely nothing to correct the problem. I’m selecting * from the table to identify the ids and deleting them, but this is a 6 million row table and it is taking 5 min. to identify each bad record. I have no idea how many are bad. I could set the database parameter zero_damaged_pages to true, but then I won’t have any idea of how many rows are corrupted.

Postgres is pretty lame when it comes to identifying database corruption and repairing it. The only way to detect it is by dumping the database or running select * against every table in the database.
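
One way to shorten the row-by-row hunt described above is to bisect the id range instead of probing each record in turn. The following is a minimal sketch in Python, not a tested repair procedure. It assumes the table has an integer key, and it relies on a hypothetical callback `range_is_readable(lo, hi)` that would run something like a `select *` restricted to that id range and return False when the query fails with an error such as the one quoted above. The function and callback names are illustrative and are not part of PostgreSQL.

```python
from typing import Callable, List


def find_bad_ids(lo: int, hi: int,
                 range_is_readable: Callable[[int, int], bool]) -> List[int]:
    """Return the ids in [lo, hi] whose rows cannot be read.

    The range is split in half recursively; any half that reads
    cleanly is skipped without looking at its individual rows.
    """
    if lo > hi or range_is_readable(lo, hi):
        return []
    if lo == hi:
        return [lo]  # a single-row range that fails is a bad row
    mid = (lo + hi) // 2
    return (find_bad_ids(lo, mid, range_is_readable)
            + find_bad_ids(mid + 1, hi, range_is_readable))


if __name__ == "__main__":
    # Simulated probe: pretend rows 3, 4 and 17 are corrupted.
    # A real probe would query the database and catch the error.
    simulated_bad = {3, 4, 17}

    def simulated_probe(lo: int, hi: int) -> bool:
        return not any(lo <= b <= hi for b in simulated_bad)

    print(find_bad_ids(1, 20, simulated_probe))  # prints [3, 4, 17]
```

With k bad rows among n, the number of probes grows roughly with k log n rather than n, although each probe over a large range still has to read that whole range.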

By: Xaprb
Fri, 19 Feb 2010 02:25:26 +0000
http://www.xaprb.com/blog/2010/02/08/how-postgresql-protects-against-partial-page-writes-and-data-corruption/#comment-17837

I won’t be able to be at the BWPUG meeting this time, but maybe another time.

As far as I can understand, hint bits don’t belong on disk or in the WAL anyway. They say nothing about the data; they only reflect the current state of transactions in a way that’s not meaningful for recovery. But I appreciate why they’re in the pages and not elsewhere. Changing the page format looks like a good reason to delay page CRCs until someone’s ready with a way to help with in-place upgrades (read old version, write new version).

By: Greg Smith
Sun, 14 Feb 2010 08:35:53 +0000
http://www.xaprb.com/blog/2010/02/08/how-postgresql-protects-against-partial-page-writes-and-data-corruption/#comment-17813

Let me summarize the giant pgsql-hackers thread… there’s a whole stack of reasons there are no CRC checks on the data blocks in PostgreSQL yet, but the major three are:

1) The database uses these things called hint bits (http://wiki.postgresql.org/wiki/Hint_Bits) to help reduce the work involved in checking visibility information on the page. Changes to the hint bit info are not considered important: if they get screwed up, it can only be in a way that requires doing more work; you can’t have any corruption issues from it. Accordingly, hint bit changes aren’t logged in the WAL, and the database is pretty loose about letting people change them. Tightening that up, so that all hint bit changes go through the same WAL logging as other data on the page, is going to introduce a performance hit that everybody pays, whether or not they have CRCs turned on.

2) Torn pages, where only part of a page is written out, are handled pretty well by the current recovery design. This gets much more complicated with CRCs.

3) Adding this feature requires expanding the header of data pages, which is going to add a new type of task for an in-place upgrade that introduces this feature. It’s hard to justify the work of resolving that when the value of this feature seems so small. Most don’t care about it, and those who do already have options like ZFS (which avoids torn page issues at a lower level, in a way the database can’t really match).
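
To make the torn-page and checksum discussion above concrete, here is a small Python sketch of how a per-page CRC would flag a partial write. It is an illustration under stated assumptions, not PostgreSQL's page format: the 4-byte checksum field at the start of the page, the 512-byte sector as the unit the disk writes atomically, and the helper names `seal`, `is_intact` and `torn_write` are all hypothetical. Only the 8192-byte page size matches PostgreSQL's default block size.

```python
import zlib

PAGE_SIZE = 8192    # PostgreSQL's default block size
SECTOR_SIZE = 512   # assumed atomic write unit of the disk
CRC_BYTES = 4       # hypothetical checksum field at the start of the page


def seal(payload: bytes) -> bytes:
    """Build a full page: CRC-32 of the payload followed by the payload."""
    assert len(payload) == PAGE_SIZE - CRC_BYTES
    return zlib.crc32(payload).to_bytes(CRC_BYTES, "big") + payload


def is_intact(page: bytes) -> bool:
    """Recompute the CRC-32 and compare it with the stored value."""
    stored = int.from_bytes(page[:CRC_BYTES], "big")
    return stored == zlib.crc32(page[CRC_BYTES:])


def torn_write(old: bytes, new: bytes, sectors_written: int) -> bytes:
    """Simulate a crash after only the first sectors of `new` hit the disk."""
    cut = sectors_written * SECTOR_SIZE
    return new[:cut] + old[cut:]


if __name__ == "__main__":
    old_page = seal(b"a" * (PAGE_SIZE - CRC_BYTES))
    new_page = seal(b"b" * (PAGE_SIZE - CRC_BYTES))

    print(is_intact(new_page))                            # True
    print(is_intact(torn_write(old_page, new_page, 3)))   # False: mixed old/new sectors
    print(is_intact(torn_write(old_page, new_page, 16)))  # True: all 16 sectors written
```

The same check also fails if any bit of the payload changes without the checksum being recomputed, which is one way to see why loosely handled hint bit updates and a page CRC are hard to combine.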

By: Greg Smith
Sun, 14 Feb 2010 07:55:59 +0000
http://www.xaprb.com/blog/2010/02/08/how-postgresql-protects-against-partial-page-writes-and-data-corruption/#comment-17812

BWPUG meetings are typically the 2nd Wednesday of the month, which makes next month’s expected on March 9. The pugs page is quite stale; the list archives at http://archives.postgresql.org/bwpug/ are more useful.

Given most people who attend the BWPUG and Baron will all be at PGEast, it may not be worth his trouble to hit the PUG when we’ll all see each other a few weeks later anyway.

By: Xaprb
Sat, 13 Feb 2010 00:38:15 +0000
http://www.xaprb.com/blog/2010/02/08/how-postgresql-protects-against-partial-page-writes-and-data-corruption/#comment-17809

(Lest I seem lazy: I did look at http://pugs.postgresql.org/bwpug, but it’s quite stale.)
