PostgreSQL 9.0.11 commit log

Stamp 9.0.11.

commit   : 6fa51d4ba72570d0f1776f03eaddd8f73b3c6eda    
  
author   : Tom Lane <[email protected]>    
date     : Mon, 3 Dec 2012 15:22:30 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Mon, 3 Dec 2012 15:22:30 -0500    

M configure
M configure.in
M doc/bug.template
M src/include/pg_config.h.win32
M src/interfaces/libpq/libpq.rc.in
M src/port/win32ver.rc

Update release notes for 9.2.2, 9.1.7, 9.0.11, 8.4.15, 8.3.22.

commit   : bf274683b70a0642d359da886d72f0685e01d752    
  
author   : Tom Lane <[email protected]>    
date     : Mon, 3 Dec 2012 15:10:17 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Mon, 3 Dec 2012 15:10:17 -0500    

M doc/src/sgml/release-8.3.sgml
M doc/src/sgml/release-8.4.sgml
M doc/src/sgml/release-9.0.sgml

Revert "Add mode where contrib installcheck runs each module in a separately named database."

commit   : 655592022865f0e3b9d851ca00aa4011f9e445d0    
  
author   : Andrew Dunstan <[email protected]>    
date     : Mon, 3 Dec 2012 15:03:50 -0500    
  
committer: Andrew Dunstan <[email protected]>    
date     : Mon, 3 Dec 2012 15:03:50 -0500    

This reverts commit c8f666abde2af3060af41afe4b03ced2f62d94a9.  

M contrib/dblink/Makefile
M src/Makefile.global.in
M src/makefiles/pgxs.mk

Avoid holding vmbuffer pin after VACUUM. During VACUUM, if we pause to perform a cycle of index cleanup, we drop the vmbuffer pin, so we should do the same thing when the heap scan completes. This avoids holding the vmbuffer pin across the main index cleanup in VACUUM, which could be minutes or hours longer than necessary for correctness.

commit   : c52e0e2afc0c8ca2de4e03fc8478d89b4d5e7fce    
  
author   : Simon Riggs <[email protected]>    
date     : Mon, 3 Dec 2012 18:56:41 +0000    
  
committer: Simon Riggs <[email protected]>    
date     : Mon, 3 Dec 2012 18:56:41 +0000    

Bug report and suggested fix from Pavan Deolasee  

M src/backend/commands/vacuumlazy.c

Fix documentation of path(polygon) function.

commit   : 17164ccdf870318bd157dff0835781fe6b40735c    
  
author   : Tom Lane <[email protected]>    
date     : Mon, 3 Dec 2012 11:09:04 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Mon, 3 Dec 2012 11:09:04 -0500    

Obviously, this returns type "path", but somebody made a copy-and-pasteo  
long ago.  
  
Dagfinn Ilmari Mannsåker  

M doc/src/sgml/func.sgml

Translation updates

commit   : f14bd22a525e0d636fdbba09d86d233b52994cac    
  
author   : Peter Eisentraut <[email protected]>    
date     : Mon, 3 Dec 2012 07:52:39 -0500    
  
committer: Peter Eisentraut <[email protected]>    
date     : Mon, 3 Dec 2012 07:52:39 -0500    

M src/backend/po/de.po
M src/backend/po/fr.po
M src/backend/po/ru.po
M src/bin/pg_controldata/po/cs.po
M src/bin/pg_dump/po/cs.po
M src/bin/pg_dump/po/de.po
M src/bin/pg_dump/po/fr.po
M src/bin/pg_dump/po/ru.po
M src/bin/pg_resetxlog/po/cs.po
M src/bin/psql/po/cs.po
M src/bin/psql/po/ru.po
M src/bin/scripts/po/cs.po
M src/interfaces/ecpg/preproc/po/cs.po
M src/interfaces/libpq/po/cs.po
M src/interfaces/libpq/po/de.po
M src/interfaces/libpq/po/fr.po
M src/pl/plperl/po/cs.po
M src/pl/plpgsql/src/po/cs.po
M src/pl/plpython/po/cs.po

Add mode where contrib installcheck runs each module in a separately named database.

commit   : c8f666abde2af3060af41afe4b03ced2f62d94a9    
  
author   : Andrew Dunstan <[email protected]>    
date     : Sun, 2 Dec 2012 17:30:18 -0500    
  
committer: Andrew Dunstan <[email protected]>    
date     : Sun, 2 Dec 2012 17:30:18 -0500    

Normally each module is tested in a database named contrib_regression,  
which is dropped and recreated at the beginning of each pg_regress run.  
This mode, enabled by adding USE_MODULE_DB=1 to the make command line,  
runs most modules in a database with the module name embedded in it.  
  
This will make testing pg_upgrade on clusters with the contrib modules  
a lot easier.  
  
Still to be done: adapt to the MSVC build system.  
  
Backpatch to 9.0, which is the earliest version it is reasonably  
possible to test upgrading from.  

M contrib/dblink/Makefile
M src/Makefile.global.in
M src/makefiles/pgxs.mk

Update time zone data files to tzdata release 2012j.

commit   : 194bb37ebac43c5b5782dd32bcf45c64434a172d    
  
author   : Tom Lane <[email protected]>    
date     : Sun, 2 Dec 2012 16:35:23 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Sun, 2 Dec 2012 16:35:23 -0500    

DST law changes in Cuba, Israel, Jordan, Libya, Palestine, Western Samoa,  
and portions of Brazil.  

M src/timezone/data/africa
M src/timezone/data/asia
M src/timezone/data/australasia
M src/timezone/data/europe
M src/timezone/data/northamerica
M src/timezone/data/southamerica

Don't advance checkPoint.nextXid near the end of a checkpoint sequence.

commit   : 135f4f605517deb97fed60811ebca32ff94a8488    
  
author   : Tom Lane <[email protected]>    
date     : Sun, 2 Dec 2012 15:20:15 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Sun, 2 Dec 2012 15:20:15 -0500    

This reverts commit c11130690d6dca64267201a169cfb38c1adec5ef in favor of  
actually fixing the problem: namely, that we should never have been  
modifying the checkpoint record's nextXid at this point to begin with.  
The nextXid should match the state as of the checkpoint's logical WAL  
position (ie the redo point), not the state as of its physical position.  
It's especially bogus to advance it in some wal_levels and not others.  
In any case there is no need for the checkpoint record to carry the  
same nextXid shown in the XLOG_RUNNING_XACTS record just emitted by  
LogStandbySnapshot, as any replay operation will already have adopted  
that value as current.  
  
This fixes bug #7710 from Tarvi Pillessaar, and probably also explains bug  
#6291 from Daniel Farina, in that if a checkpoint were in progress at the  
instant of XID wraparound, the epoch bump would be lost as reported.  
(And, of course, these days there's at least a 50-50 chance of a checkpoint  
being in progress at any given instant.)  
  
Diagnosed by me and independently by Andres Freund.  Back-patch to all  
branches supporting hot standby.  

M src/backend/access/transam/xlog.c
M src/backend/storage/ipc/standby.c
M src/include/storage/standby.h

XidEpoch++ if wraparound during checkpoint. If wal_level = hot_standby we update the checkpoint nextxid, though in the case where a wraparound occurred half-way through a checkpoint we would neglect updating the epoch also. Updating the nextxid is arguably the wrong thing to do, but changing that may introduce subtle bugs into hot standby startup, while updating the value doesn't cause any known bugs yet. Minimal fix now to HEAD and backbranches, wider fix later in HEAD.

commit   : 069aa395c0a2d4f73944f13d1bbc15d6106e2277    
  
author   : Simon Riggs <[email protected]>    
date     : Sun, 2 Dec 2012 15:02:28 +0000    
  
committer: Simon Riggs <[email protected]>    
date     : Sun, 2 Dec 2012 15:02:28 +0000    

Bug reported in #6291 by Daniel Farina and slightly differently in  
  
Cause analysis and recommended fixes from Tom Lane and Andres Freund.  
  
Applied patch is minimal version of Andres Freund's work.  

M src/backend/access/transam/xlog.c

Fix psql crash while parsing an SQL file whose encoding is different from the client encoding when the client encoding is not a *safe* one. For example, the file encoding is UTF-8 and the client encoding is SJIS. Patch contributed by Jiang Guiqing.

commit   : ceee108acdf4c1f962574b3f9f9a6999891771dd    
  
author   : Tatsuo Ishii <[email protected]>    
date     : Sun, 2 Dec 2012 21:11:15 +0900    
  
committer: Tatsuo Ishii <[email protected]>    
date     : Sun, 2 Dec 2012 21:11:15 +0900    

M src/bin/psql/psqlscan.l

Prevent passing gmake's environment variables down through pg_regress.

commit   : dbf2e2639067d95f507b727989454a74981fbc31    
  
author   : Tom Lane <[email protected]>    
date     : Sat, 1 Dec 2012 17:24:05 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Sat, 1 Dec 2012 17:24:05 -0500    

When we do "make install" to create a temp installation, we don't want  
that instance of make to try to communicate with any instance of make  
that might be calling us.  This is known to cause problems if the  
upper make has a -jN flag, and in principle could cause problems even  
without that.  Unset the relevant environment variables to prevent such  
issues.  
  
Andres Freund  
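
The idea of scrubbing make's variables before spawning a nested make can be sketched as follows. This is a minimal illustration in Python rather than the C code in pg_regress.c; the variable list (MAKEFLAGS, MAKELEVEL) and the helper name scrubbed_env are assumptions made for the example.

```python
import os

# Assumption: these are the variables GNU make uses to talk to child makes.
MAKE_VARS = ("MAKEFLAGS", "MAKELEVEL")


def scrubbed_env(env=None):
    """Return a copy of env with make's inter-process variables removed."""
    source = os.environ if env is None else env
    return {k: v for k, v in source.items() if k not in MAKE_VARS}


# Hypothetical usage:
#   subprocess.run(["make", "install"], env=scrubbed_env())
```

Building a fresh dictionary leaves the caller's own environment untouched, so only the child make loses sight of the outer one.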

M src/test/regress/pg_regress.c

commit   : 4ef41598171b93be06e2f46d4301ae173da62f34    
  
author   : Peter Eisentraut <[email protected]>    
date     : Sat, 1 Dec 2012 01:52:23 -0500    
  
committer: Peter Eisentraut <[email protected]>    
date     : Sat, 1 Dec 2012 01:52:23 -0500    

M doc/src/sgml/docguide.sgml

Take buffer lock while inspecting btree index pages in contrib/pageinspect.

commit   : f6a0170777a2c43307474e19785e3f11ae64b5a6    
  
author   : Tom Lane <[email protected]>    
date     : Fri, 30 Nov 2012 17:02:44 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Fri, 30 Nov 2012 17:02:44 -0500    

It's not safe to examine a shared buffer without any lock.  

M contrib/pageinspect/btreefuncs.c

Add missing buffer lock acquisition in GetTupleForTrigger().

commit   : f369815080881cd766a2da36b08bc2c415c17335    
  
author   : Tom Lane <[email protected]>    
date     : Fri, 30 Nov 2012 13:56:11 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Fri, 30 Nov 2012 13:56:11 -0500    

If we had not been holding buffer pin continuously since the tuple was  
initially fetched by the UPDATE or DELETE query, it would be possible for  
VACUUM or a page-prune operation to move the tuple while we're trying to  
copy it.  This would result in a garbage "old" tuple value being passed to  
an AFTER ROW UPDATE or AFTER ROW DELETE trigger.  The preconditions for  
this are somewhat improbable, and the timing constraints are very tight;  
so it's not so surprising that this hasn't been reported from the field,  
even though the bug has been there a long time.  
  
Problem found by Andres Freund.  Back-patch to all active branches.  

M src/backend/commands/trigger.c

Produce a more useful error message for over-length Unix socket paths.

commit   : 31c341ae13eb8adbc0007baa303efd84c2735fbe    
  
author   : Tom Lane <[email protected]>    
date     : Thu, 29 Nov 2012 19:57:24 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Thu, 29 Nov 2012 19:57:24 -0500    

The length of a socket path name is constrained by the size of struct  
sockaddr_un, and there's not a lot we can do about it since that is a  
kernel API.  However, it would be a good thing if we produced an  
intelligible error message when the user specifies a socket path that's too  
long --- and getaddrinfo's standard API is too impoverished to do this in  
the natural way.  So insert explicit tests at the places where we construct  
a socket path name.  Now you'll get an error that makes sense and even  
tells you what the limit is, rather than something generic like  
"Non-recoverable failure in name resolution".  
  
Per trouble report from Jeremy Drake and a fix idea from Andrew Dunstan.  
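
An explicit length test of this kind might look like the sketch below. It is written in Python for brevity; the buffer size of 108 bytes is an assumption (a common size of sockaddr_un.sun_path on Linux, while other platforms differ), and the function name and message wording are illustrative.

```python
# Assumption: 108 bytes is a common size of sockaddr_un.sun_path on Linux;
# other platforms use different sizes (for example 104 on some BSDs).
UNIXSOCK_PATH_BUFLEN = 108


def check_socket_path(path, buflen=UNIXSOCK_PATH_BUFLEN):
    """Raise ValueError naming the limit if path cannot fit in sun_path."""
    encoded = path.encode()
    max_len = buflen - 1  # leave room for the terminating NUL byte
    if len(encoded) > max_len:
        raise ValueError(
            'Unix-domain socket path "%s" is too long (maximum %d bytes)'
            % (path, max_len)
        )
    return encoded
```

Checking at the point where the path is built lets the error name both the offending path and the limit, which a generic resolver failure cannot do.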

M src/backend/libpq/pqcomm.c
M src/include/libpq/pqcomm.h
M src/interfaces/libpq/fe-connect.c

Correctly init/deinit recovery xact environment. Previously we performed VirtualXactLockTableInsert but didn't set MyProc->lxid for Startup process. pg_locks now correctly shows "1/1" for vxid of Startup process during Hot Standby. At end of Hot Standby the Virtual Transaction was not deleted, leading to problems after promoting to normal running for some commands, such as CREATE INDEX CONCURRENTLY.

commit   : f4a3e679306ebfbd150d8af3cdd481bea1619c52    
  
author   : Simon Riggs <[email protected]>    
date     : Thu, 29 Nov 2012 23:46:54 +0000    
  
committer: Simon Riggs <[email protected]>    
date     : Thu, 29 Nov 2012 23:46:54 +0000    

M src/backend/storage/ipc/standby.c
M src/backend/storage/lmgr/lmgr.c
M src/include/storage/lmgr.h

Fix assorted bugs in CREATE INDEX CONCURRENTLY.

commit   : 1dbd02dc37efb74b770aa834b2cb65dae2446640    
  
author   : Tom Lane <[email protected]>    
date     : Thu, 29 Nov 2012 14:50:39 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Thu, 29 Nov 2012 14:50:39 -0500    

This patch changes CREATE INDEX CONCURRENTLY so that the pg_index  
flag changes it makes without exclusive lock on the index are made via  
heap_inplace_update() rather than a normal transactional update.  The  
latter is not very safe because moving the pg_index tuple could result in  
concurrent SnapshotNow scans finding it twice or not at all, thus possibly  
resulting in index corruption.  
  
In addition, fix various places in the code that ought to check to make  
sure that the indexes they are manipulating are valid and/or ready as  
appropriate.  These represent bugs that have existed since 8.2, since  
a failed CREATE INDEX CONCURRENTLY could leave a corrupt or invalid  
index behind, and we ought not try to do anything that might fail with  
such an index.  
  
Also fix RelationReloadIndexInfo to ensure it copies all the pg_index  
columns that are allowed to change after initial creation.  Previously we  
could have been left with stale values of some fields in an index relcache  
entry.  It's not clear whether this actually had any user-visible  
consequences, but it's at least a bug waiting to happen.  
  
This is a subset of a patch already applied in 9.2 and HEAD.  Back-patch  
into all earlier supported branches.  
  
Tom Lane and Andres Freund  

M src/backend/access/heap/README.HOT
M src/backend/catalog/index.c
M src/backend/commands/cluster.c
M src/backend/commands/indexcmds.c
M src/backend/commands/tablecmds.c
M src/backend/commands/vacuum.c
M src/backend/executor/execUtils.c
M src/backend/optimizer/util/plancat.c
M src/backend/utils/cache/relcache.c
M src/include/catalog/index.h
M src/include/catalog/pg_index.h

When processing nested structure pointer variables, ecpg always expected an array datatype, which of course is wrong.

commit   : 3dfdf28152eb2df7cb4d4e43f461903fcc09e1d2    
  
author   : Michael Meskes <[email protected]>    
date     : Thu, 29 Nov 2012 17:12:00 +0100    
  
committer: Michael Meskes <[email protected]>    
date     : Thu, 29 Nov 2012 17:12:00 +0100    

Applied patch by Muhammad Usama <[email protected]> to fix this.  

M src/interfaces/ecpg/preproc/variable.c

Fix pg_resetxlog to use correct path to postmaster.pid.

commit   : 614ba4844d3f771e56999dcdecc53b78d884d5d5    
  
author   : Tom Lane <[email protected]>    
date     : Thu, 22 Nov 2012 11:23:38 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Thu, 22 Nov 2012 11:23:38 -0500    

Since we've already chdir'd into the data directory, the file should  
be referenced as just "postmaster.pid", without prefixing the directory  
path.  This is harmless in the normal case where an absolute PGDATA path  
is used, but quite dangerous if a relative path is specified, since the  
program might then fail to notice an active postmaster.  
  
Reported by Hari Babu.  This got broken in my commit  
eb5949d190e80360386113fde0f05854f0c9824d, so patch all active versions.  
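
The failure mode can be reproduced with a small Python sketch. The directory layout and the function names are hypothetical; the point is only that a relative directory prefix gets applied twice once the process has already changed into that directory.

```python
import os
import tempfile


def postmaster_running(lock_name="postmaster.pid"):
    """After chdir into the data directory, look the lock file up by bare name."""
    return os.path.exists(lock_name)


def demo():
    """Return (buggy, fixed) lock-file lookups for a relative data directory."""
    start = os.getcwd()
    with tempfile.TemporaryDirectory() as parent:
        os.mkdir(os.path.join(parent, "data"))
        open(os.path.join(parent, "data", "postmaster.pid"), "w").close()
        try:
            os.chdir(parent)
            datadir = "data"   # relative PGDATA, as a user might pass it
            os.chdir(datadir)  # the tool changes into the data directory
            # Buggy lookup: the relative prefix is applied a second time,
            # so this checks data/data/postmaster.pid, which does not exist.
            buggy = os.path.exists(os.path.join(datadir, "postmaster.pid"))
            fixed = postmaster_running()
        finally:
            os.chdir(start)
    return buggy, fixed
```

With an absolute datadir both lookups would agree, which is why the problem stays hidden in the normal case.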

M src/bin/pg_resetxlog/pg_resetxlog.c

Avoid bogus "out-of-sequence timeline ID" errors in standby-mode.

commit   : 875d3f3039f09ba74f442ceb95411e3a75f18048    
  
author   : Heikki Linnakangas <[email protected]>    
date     : Thu, 22 Nov 2012 11:23:46 +0200    
  
committer: Heikki Linnakangas <[email protected]>    
date     : Thu, 22 Nov 2012 11:23:46 +0200    

When the startup process opens a WAL segment after replaying part of it, it  
validates the first page of the WAL segment, even though the page it's  
really interested in is later in the file. As part of the validation, it checks  
that the TLI in the page header is >= the TLI it saw on the last page it  
read. If the segment contains a timeline switch that we have already  
replayed, and we then re-open the WAL segment (because streaming  
replication got disconnected and reconnected, for example), the TLI check  
will fail when the first page is validated. Fix that by relaxing the TLI  
check when re-opening a WAL segment.  
  
Backpatch to 9.0. Earlier versions had the same code, but before standby  
mode was introduced in 9.0, recovery never tried to re-read a segment after  
partially replaying it.  
  
Reported by Amit Kapila, while testing a new feature.  

M src/backend/access/transam/xlog.c

Don't launch new child processes after we've been told to shut down.

commit   : 2a18b3ed36ffe5cd365121fb9b00b3d64748f527    
  
author   : Tom Lane <[email protected]>    
date     : Wed, 21 Nov 2012 15:18:52 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Wed, 21 Nov 2012 15:18:52 -0500    

Once we've received a shutdown signal (SIGINT or SIGTERM), we should not  
launch any more child processes, even if we get signals requesting such.  
The normal code path for spawning backends has always understood that,  
but the postmaster's infrastructure for hot standby and autovacuum didn't  
get the memo.  As reported by Hari Babu in bug #7643, this could lead to  
failure to shut down at all in some cases, such as when SIGINT is received  
just before the startup process sends PMSIGNAL_RECOVERY_STARTED: we'd  
launch a bgwriter and checkpointer, and then those processes would have no  
idea that they ought to quit.  Similarly, launching a new autovacuum worker  
would result in waiting till it finished before shutting down.  
  
Also, switch the order of the code blocks in reaper() that detect startup  
process crash versus shutdown termination.  Once we've sent it a signal,  
we should not consider that exit(1) is surprising.  This is just a cosmetic  
fix since shutdown occurs correctly anyway, but better not to log a phony  
complaint about startup process crash.  
  
Back-patch to 9.0.  Some parts of this might be applicable before that,  
but given the lack of prior complaints I'm not going to worry too much  
about older branches.  

M src/backend/postmaster/postmaster.c

commit   : eb865dbb1b0dd2f78c5f516ce2b003d183207d2c    
  
author   : Tom Lane <[email protected]>    
date     : Mon, 19 Nov 2012 21:21:48 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Mon, 19 Nov 2012 21:21:48 -0500    

Some platforms throw an exception for this division, rather than returning  
a necessarily-overflowed result.  Since we were testing for overflow after  
the fact, an exception isn't nice.  We can avoid the problem by treating  
division by -1 as negation.  
  
Add some regression tests so that we'll find out if any compilers try to  
optimize away the overflow check conditions.  
  
Back-patch of commit 1f7cb5c30983752ff8de833de30afcaee63536d0.  
  
Per discussion with Xi Wang, though this is different from the patch he  
submitted.  
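
Treating division by -1 as negation can be illustrated with a Python emulation of 32-bit signed arithmetic. This is a sketch of the technique, not the code in int.c; the function name mirrors int4div only for readability, and the error types are Python stand-ins.

```python
INT32_MIN = -2**31
INT32_MAX = 2**31 - 1


def int4div(arg1, arg2):
    """Emulate 32-bit signed division with C-style truncation toward zero."""
    if arg2 == 0:
        raise ZeroDivisionError("division by zero")
    if arg2 == -1:
        # Negate instead of dividing; only INT32_MIN has no valid negation,
        # so the overflow is detected before any division is attempted.
        if arg1 == INT32_MIN:
            raise OverflowError("integer out of range")
        return -arg1
    quotient = abs(arg1) // abs(arg2)
    return -quotient if (arg1 < 0) != (arg2 < 0) else quotient
```

Because the -1 case never reaches the division, a platform that traps on the overflowing quotient has nothing to trap on.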

M src/backend/utils/adt/int.c
M src/backend/utils/adt/int8.c
M src/test/regress/expected/int2.out
M src/test/regress/expected/int4.out
M src/test/regress/expected/int8-exp-three-digits.out
M src/test/regress/expected/int8.out
M src/test/regress/sql/int2.sql
M src/test/regress/sql/int4.sql
M src/test/regress/sql/int8.sql

Limit values of archive_timeout, post_auth_delay, auth_delay.milliseconds.

commit   : 15939a7a2903604596c05e633c893aba43a68ad0    
  
author   : Tom Lane <[email protected]>    
date     : Sun, 18 Nov 2012 17:15:22 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Sun, 18 Nov 2012 17:15:22 -0500    

The previous definitions of these GUC variables allowed them to range  
up to INT_MAX, but in point of fact the underlying code would suffer  
overflows or other errors with large values.  Reduce the maximum values  
to something that won't misbehave.  There's no apparent value in working  
harder than this, since very large delays aren't sensible for any of  
these.  (Note: the risk with archive_timeout is that if we're late  
checking the state, the timestamp difference it's being compared to  
might overflow.  So we need some amount of slop; the choice of INT_MAX/2  
is arbitrary.)  
  
Per followup investigation of bug #7670.  Although this isn't a very  
significant fix, might as well back-patch.  

M src/backend/utils/misc/guc.c

Fix the int8 and int2 cases of (minimum possible integer) % (-1).

commit   : 2e06d52529ae52b7c56f18e451e896cb8680aac9    
  
author   : Tom Lane <[email protected]>    
date     : Wed, 14 Nov 2012 17:30:10 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Wed, 14 Nov 2012 17:30:10 -0500    

The correct answer for this (or any other case with arg2 = -1) is zero,  
but some machines throw a floating-point exception instead of behaving  
sanely.  Commit f9ac414c35ea084ff70c564ab2c32adb06d5296f dealt with this  
in int4mod, but overlooked the fact that it also happens in int8mod  
(at least on my Linux x86_64 machine).  Protect int2mod as well; it's  
not clear whether any machines fail there (mine does not) but since the  
test is so cheap it seems better safe than sorry.  While at it, simplify  
the original guard in int4mod: we need only check for arg2 == -1, we  
don't need to check arg1 explicitly.  
  
Xi Wang, with some editing by me.  
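
The guard described here amounts to short-circuiting the arg2 == -1 case. The Python sketch below emulates C-style remainder (sign follows the dividend) to show the shape of the check; int_mod is a hypothetical name, not one of the backend functions.

```python
def int_mod(arg1, arg2):
    """Emulate C-style remainder (sign follows the dividend) with a -1 guard."""
    if arg2 == 0:
        raise ZeroDivisionError("division by zero")
    if arg2 == -1:
        # Every integer is divisible by -1, so skip the hardware operation
        # entirely; no check on arg1 is needed.
        return 0
    remainder = abs(arg1) % abs(arg2)
    return -remainder if arg1 < 0 else remainder
```

The guard tests only the divisor, which matches the simplification noted for int4mod.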

M src/backend/utils/adt/int.c
M src/backend/utils/adt/int8.c

Fix memory leaks in record_out() and record_send().

commit   : 797ea5219ca1ddb54c9f9a5e908a1b2935a628f6    
  
author   : Tom Lane <[email protected]>    
date     : Tue, 13 Nov 2012 14:44:46 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Tue, 13 Nov 2012 14:44:46 -0500    

record_out() leaks memory: it fails to free the strings returned by the  
per-column output functions, and also is careless about detoasted values.  
This results in a query-lifespan memory leakage when returning composite  
values to the client, because printtup() runs the output functions in the  
query-lifespan memory context.  Fix it to handle these issues the same way  
printtup() does.  Also fix a similar leakage in record_send().  
  
(At some point we might want to try to run output functions in  
shorter-lived memory contexts, so that we don't need a zero-leakage policy  
for them.  But that would be a significantly more invasive patch, which  
doesn't seem like material for back-patching.)  
  
In passing, use appendStringInfoCharMacro instead of appendStringInfoChar  
in the innermost data-copying loop of record_out, to try to shave a few  
cycles from this function's runtime.  
  
Per trouble report from Carlos Henrique Reimer.  Back-patch to all  
supported versions.  

M src/backend/utils/adt/rowtypes.c

Clarify docs on hot standby lock release

commit   : 759340c8f52416cfcd934a25f52894359a5ebed7    
  
author   : Simon Riggs <[email protected]>    
date     : Tue, 13 Nov 2012 15:58:35 -0300    
  
committer: Simon Riggs <[email protected]>    
date     : Tue, 13 Nov 2012 15:58:35 -0300    

Andres Freund and Simon Riggs  

M src/backend/storage/ipc/procarray.c
M src/backend/storage/ipc/standby.c

Fix multiple problems in WAL replay.

commit   : bb745dc46250e14fe9f70a4d016c6af71fee4de4    
  
author   : Tom Lane <[email protected]>    
date     : Mon, 12 Nov 2012 22:05:27 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Mon, 12 Nov 2012 22:05:27 -0500    

Most of the replay functions for WAL record types that modify more than  
one page failed to ensure that those pages were locked correctly to ensure  
that concurrent queries could not see inconsistent page states.  This is  
a hangover from coding decisions made long before Hot Standby was added,  
when it was hardly necessary to acquire buffer locks during WAL replay  
at all, let alone hold them for carefully-chosen periods.  
  
The key problem was that RestoreBkpBlocks was written to hold lock on each  
page restored from a full-page image for only as long as it took to update  
that page.  This was guaranteed to break any WAL replay function in which  
there was any update-ordering constraint between pages, because even if the  
nominal order of the pages is the right one, any mixture of full-page and  
non-full-page updates in the same record would result in out-of-order  
updates.  Moreover, it wouldn't work for situations where there's a  
requirement to maintain lock on one page while updating another.  Failure  
to honor an update ordering constraint in this way is thought to be the  
cause of bug #7648 from Daniel Farina: what seems to have happened there  
is that a btree page being split was rewritten from a full-page image  
before the new right sibling page was written, and because lock on the  
original page was not maintained it was possible for hot standby queries to  
try to traverse the page's right-link to the not-yet-existing sibling page.  
  
To fix, get rid of RestoreBkpBlocks as such, and instead create a new  
function RestoreBackupBlock that restores just one full-page image at a  
time.  This function can be invoked by WAL replay functions at the points  
where they would otherwise perform non-full-page updates; in this way, the  
physical order of page updates remains the same no matter which pages are  
replaced by full-page images.  We can then further adjust the logic in  
individual replay functions if it is necessary to hold buffer locks  
for overlapping periods.  A side benefit is that we can simplify the  
handling of concurrency conflict resolution by moving that code into the  
record-type-specific functions; there's no more need to contort the code  
layout to keep conflict resolution in front of the RestoreBkpBlocks call.  
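
The ordering problem can be seen in a toy model, sketched here in Python. It is not the backend code: a record is reduced to a hypothetical list of (page, has_full_page_image) steps in the order the replay function needs them applied, and buffer locking is not modeled at all.

```python
def replay_old(steps):
    """Old scheme: restore every full-page image up front, then do the rest."""
    images = [(page, "image") for page, has_fpi in steps if has_fpi]
    deltas = [(page, "delta") for page, has_fpi in steps if not has_fpi]
    return images + deltas


def replay_new(steps):
    """New scheme: handle each page at its own step, image or not."""
    return [(page, "image" if has_fpi else "delta") for page, has_fpi in steps]


# A split that must write the new right sibling before the original page,
# where only the original page happens to carry a full-page image.
split = [("right_sibling", False), ("original_page", True)]
```

In this model replay_old(split) touches original_page first, while replay_new(split) keeps the order the replay function asked for.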
  
In connection with that, standardize on zero-based numbering rather than  
one-based numbering for referencing the full-page images.  In HEAD, I  
removed the macros XLR_BKP_BLOCK_1 through XLR_BKP_BLOCK_4.  They are  
still there in the header files in previous branches, but are no longer  
used by the code.  
  
In addition, fix some other bugs identified in the course of making these  
changes:  
  
spgRedoAddNode could fail to update the parent downlink at all, if the  
parent tuple is in the same page as either the old or new split tuple and  
we're not doing a full-page image: it would get fooled by the LSN having  
been advanced already.  This would result in permanent index corruption,  
not just transient failure of concurrent queries.  
  
Also, ginHeapTupleFastInsert's "merge lists" case failed to mark the old  
tail page as a candidate for a full-page image; in the worst case this  
could result in torn-page corruption.  
  
heap_xlog_freeze() was inconsistent about using a cleanup lock or plain  
exclusive lock: it did the former in the normal path but the latter for a  
full-page image.  A plain exclusive lock seems sufficient, so change to  
that.  
  
Also, remove gistRedoPageDeleteRecord(), which has been dead code since  
VACUUM FULL was rewritten.  
  
Back-patch to 9.0, where hot standby was introduced.  Note however that 9.0  
had a significantly different WAL-logging scheme for GIST index updates,  
and it doesn't appear possible to make that scheme safe for concurrent hot  
standby queries, because it can leave inconsistent states in the index even  
between WAL records.  Given the lack of complaints from the field, we won't  
work too hard on fixing that branch.  

M src/backend/access/gin/ginfast.c
M src/backend/access/gin/ginxlog.c
M src/backend/access/gist/gistxlog.c
M src/backend/access/heap/heapam.c
M src/backend/access/nbtree/nbtxlog.c
M src/backend/access/transam/README
M src/backend/access/transam/xlog.c
M src/include/access/xlog.h

Check for stack overflow in transformSetOperationTree().

commit   : fae09422fdb97ef2bfbfffcd2abc30060a9c1f51    
  
author   : Tom Lane <[email protected]>    
date     : Sun, 11 Nov 2012 19:56:27 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Sun, 11 Nov 2012 19:56:27 -0500    

Since transformSetOperationTree() recurses, it can be driven to stack  
overflow with enough UNION/INTERSECT/EXCEPT clauses in a query.  Add a  
check to ensure it fails cleanly instead of crashing.  Per report from  
Matthew Gerber (though it's not clear whether this is the only thing  
going wrong for him).  
  
Historical note: I think the reasoning behind not putting a check here in  
the beginning was that the check in transformExpr() ought to be sufficient  
to guard the whole parser.  However, because transformSetOperationTree()  
recurses all the way to the bottom of the set-operation tree before doing  
any analysis of the statement's expressions, that check doesn't save it.  
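
A depth check inside the recursive routine itself can be sketched as below. This Python model is an illustration only: the tree shape, the MAX_DEPTH value, and the exception name are assumptions, and the backend measures actual stack usage rather than counting levels.

```python
MAX_DEPTH = 200  # assumed limit for the sketch; the backend uses stack size


class StackDepthExceeded(Exception):
    """Raised instead of letting the recursion run off the end of the stack."""


def transform_set_op(node, depth=0):
    """Count leaf queries in a nested (op, left, right) set-operation tree."""
    if depth > MAX_DEPTH:
        # Fail cleanly before any further recursion happens.
        raise StackDepthExceeded("stack depth limit exceeded")
    if isinstance(node, str):  # a leaf SELECT
        return 1
    _op, left, right = node
    return (transform_set_op(left, depth + 1)
            + transform_set_op(right, depth + 1))
```

The check sits at the top of the recursive function, so it fires on the way down the tree, before any per-expression analysis would have had a chance to run.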

M src/backend/parser/analyze.c

XSLT stylesheet: Add slash to directory name

commit   : dd1ca20dfee9f05d0aa9df5b1f4d5215c181162e    
  
author   : Peter Eisentraut <[email protected]>    
date     : Thu, 8 Nov 2012 23:55:36 -0500    
  
committer: Peter Eisentraut <[email protected]>    
date     : Thu, 8 Nov 2012 23:55:36 -0500    

Some versions of the XSLT stylesheets don't handle the missing slash  
correctly (they concatenate directory and file name without the slash).  
This might never have worked correctly.  

M doc/src/sgml/stylesheet.xsl

Fix handling of inherited check constraints in ALTER COLUMN TYPE.

commit   : 8e344eacdccc3c069974cba61af42be60e50c764    
  
author   : Tom Lane <[email protected]>    
date     : Mon, 5 Nov 2012 13:36:31 -0500    
  
committer: Tom Lane <[email protected]>    
date     : Mon, 5 Nov 2012 13:36:31 -0500    

This case got broken in 8.4 by the addition of an error check that  
complains if ALTER TABLE ONLY is used on a table that has children.  
We do use ONLY for this situation, but it's okay because the necessary  
recursion occurs at a higher level.  So we need to have a separate  
flag to suppress recursion without making the error check.  
  
Reported and patched by Pavan Deolasee, with some editorial adjustments by  
me.  Back-patch to 8.4, since this is a regression of functionality that  
worked in earlier branches.  

M src/backend/commands/tablecmds.c
M src/include/nodes/parsenodes.h
M src/test/regress/expected/alter_table.out
M src/test/regress/sql/alter_table.sql

Document that TCP keepalive settings read as 0 on Unix-socket connections.

commit   : d997bd2c50c15b8e33b02004e580f68e29a610a6    
  
author   : Tom Lane <[email protected]>    
date     : Wed, 31 Oct 2012 14:26:20 -0400    
  
committer: Tom Lane <[email protected]>    
date     : Wed, 31 Oct 2012 14:26:20 -0400    

Per bug #7631 from Rob Johnson.  The code is operating as designed, but the  
docs didn't explain it.  

M doc/src/sgml/config.sgml

Prefer actual constants to pseudo-constants in equivalence class machinery.

commit   : b1f7ee9218f91c755c97aefaf4494029dbf73714    
  
author   : Tom Lane <[email protected]>    
date     : Fri, 26 Oct 2012 14:19:55 -0400    
  
committer: Tom Lane <[email protected]>    
date     : Fri, 26 Oct 2012 14:19:55 -0400    

generate_base_implied_equalities_const() should prefer plain Consts over  
other em_is_const eclass members when choosing the "pivot" value that  
all the other members will be equated to.  This makes it more likely that  
the generated equalities will be useful in constraint-exclusion proofs.  
Per report from Rushabh Lathia.  
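
The pivot preference described above can be sketched roughly as follows. This is an illustration only, not the code in equivclass.c: the toy_member struct, its two flags, and choose_pivot() are hypothetical stand-ins for the planner's EquivalenceMember and its em_is_const field.

```c
#include <stddef.h>

/* Hypothetical stand-in for an equivalence class member. */
typedef struct
{
    int is_const;        /* plays the role of em_is_const */
    int is_plain_const;  /* member is a literal constant, not a pseudo-constant */
} toy_member;

/*
 * Return the index of the member to use as the pivot, or -1 if the class
 * has no constant member at all.  A plain constant wins as soon as one is
 * seen; otherwise the first pseudo-constant member is used.
 */
static int
choose_pivot(const toy_member *members, size_t n)
{
    int fallback = -1;

    for (size_t i = 0; i < n; i++)
    {
        if (!members[i].is_const)
            continue;           /* ordinary variable, never a pivot */
        if (members[i].is_plain_const)
            return (int) i;     /* preferred: an actual constant */
        if (fallback < 0)
            fallback = (int) i; /* remember the first pseudo-constant */
    }
    return fallback;
}
```

With members ordered as (variable, pseudo-constant, plain constant), this sketch picks the third one, so every generated equality has a literal on one side.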

M src/backend/optimizer/path/equivclass.c

Prevent parser from believing that views have system columns.

commit   : 9619fdca106149d9e7bae5db3977435f8ce5f0c2    
  
author   : Tom Lane <[email protected]>    
date     : Wed, 24 Oct 2012 14:54:07 -0400    
  
committer: Tom Lane <[email protected]>    
date     : Wed, 24 Oct 2012 14:54:07 -0400    

Click here for diff

Views should not have any pg_attribute entries for system columns.  
However, we forgot to remove such entries when converting a table to a  
view.  This could lead to crashes later on, if someone attempted to  
reference such a column, as reported by Kohei KaiGai.  
  
This problem is corrected properly in HEAD (by removing the pg_attribute  
entries during conversion), but in the back branches we need to defend  
against existing mis-converted views.  This fix costs us an extra syscache  
lookup per system column reference, which is annoying but probably not  
really measurable in the big scheme of things.  

M src/backend/parser/parse_relation.c
M src/test/regress/expected/rules.out
M src/test/regress/sql/rules.sql

Fix hash_search to avoid corruption of the hash table on out-of-memory.

commit   : 586250cc700ac2ee99ff719e1e593275feed4d8a    
  
author   : Tom Lane <[email protected]>    
date     : Fri, 19 Oct 2012 15:24:21 -0400    
  
committer: Tom Lane <[email protected]>    
date     : Fri, 19 Oct 2012 15:24:21 -0400    

Click here for diff

An out-of-memory error during expand_table() on a palloc-based hash table  
would leave a partially-initialized entry in the table.  This would not be  
harmful for transient hash tables, since they'd get thrown away anyway at  
transaction abort.  But for long-lived hash tables, such as the relcache  
hash, this would effectively corrupt the table, leading to crash or other  
misbehavior later.  
  
To fix, rearrange the order of operations so that table enlargement is  
attempted before we insert a new entry, rather than after adding it  
to the hash table.  
  
Problem discovered by Hitoshi Harada, though this is a bit different  
from his proposed patch.  
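
The reordering can be illustrated with a toy chained hash table. This is a minimal sketch, not dynahash.c: all toy_* names are hypothetical, allocation failure is reported through a return code rather than an error, and the growth rule (double when entries would exceed buckets) is an assumption made for the example.

```c
#include <stdlib.h>

/* Hypothetical toy chained hash table; not the dynahash.c structures. */
typedef struct toy_entry
{
    struct toy_entry *next;
    unsigned key;
} toy_entry;

typedef struct
{
    toy_entry **buckets;
    size_t nbuckets;
    size_t nentries;
} toy_table;

/* Set up an empty table; returns 0 on success, -1 on out-of-memory. */
static int
toy_init(toy_table *t, size_t nbuckets)
{
    t->buckets = calloc(nbuckets, sizeof(toy_entry *));
    if (t->buckets == NULL)
        return -1;
    t->nbuckets = nbuckets;
    t->nentries = 0;
    return 0;
}

/* Double the bucket array.  On failure the table is left untouched. */
static int
toy_grow(toy_table *t)
{
    size_t newn = t->nbuckets * 2;
    toy_entry **nb = calloc(newn, sizeof(toy_entry *));

    if (nb == NULL)
        return -1;
    for (size_t i = 0; i < t->nbuckets; i++)
    {
        toy_entry *e = t->buckets[i];

        while (e != NULL)
        {
            toy_entry *next = e->next;
            size_t b = e->key % newn;

            e->next = nb[b];
            nb[b] = e;
            e = next;
        }
    }
    free(t->buckets);
    t->buckets = nb;
    t->nbuckets = newn;
    return 0;
}

/* Insert a key (duplicates are not checked in this sketch). */
static int
toy_insert(toy_table *t, unsigned key)
{
    toy_entry *e;
    size_t b;

    /* Step 1: enlarge first, while no new entry is linked in yet. */
    if (t->nentries + 1 > t->nbuckets && toy_grow(t) != 0)
        return -1;

    /* Step 2: allocate; a failure here still leaves the table consistent. */
    e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;

    /* Step 3: only now does the fully initialized entry become visible. */
    e->key = key;
    b = key % t->nbuckets;
    e->next = t->buckets[b];
    t->buckets[b] = e;
    t->nentries++;
    return 0;
}

/* Return 1 if the key is present, 0 otherwise. */
static int
toy_contains(const toy_table *t, unsigned key)
{
    for (toy_entry *e = t->buckets[key % t->nbuckets]; e != NULL; e = e->next)
        if (e->key == key)
            return 1;
    return 0;
}
```

Because every step that can fail runs before the entry is linked, a failed toy_insert() never leaves a half-built entry reachable from the table.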

M src/backend/utils/hash/dynahash.c

Fix ruleutils to print "INSERT INTO foo DEFAULT VALUES" correctly.

commit   : c47b40894d0a309a83a1f390ce0b6ae8959195db    
  
author   : Tom Lane <[email protected]>    
date     : Fri, 19 Oct 2012 13:40:14 -0400    
  
committer: Tom Lane <[email protected]>    
date     : Fri, 19 Oct 2012 13:40:14 -0400    

Click here for diff

Per bug #7615 from Marko Tiikkaja.  Apparently nobody ever tried this  
case before ...  

M src/backend/utils/adt/ruleutils.c

Further tweaking of the readfile() function in pg_ctl.

commit   : 95ff5b3599c3e88a70a45f3bb1e36f6c8762d450    
  
author   : Heikki Linnakangas <[email protected]>    
date     : Thu, 18 Oct 2012 22:26:26 +0300    
  
committer: Heikki Linnakangas <[email protected]>    
date     : Thu, 18 Oct 2012 22:26:26 +0300    

Click here for diff

Don't leak a file descriptor if the file is empty or we can't read its size.  
  
Expect there to be a newline at the end of the last line, too. If there  
isn't, ignore anything after the last newline. This makes it a tiny bit  
more robust in case the file is appended to concurrently, so that we don't  
return the last line if it hasn't been fully written yet. And this makes  
the code a bit less obscure, anyway. Per Tom Lane's suggestion.  
  
Backpatch to all supported branches.  
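
The "ignore anything after the last newline" rule might look roughly like this. It is a sketch under stated assumptions, not the readfile() code in pg_ctl.c: split_complete_lines() is a hypothetical helper that works on a buffer already read into memory and uses plain malloc().

```c
#include <stdlib.h>
#include <string.h>

/*
 * Split buf[0..len) into newline-terminated lines.  Bytes after the last
 * '\n' are treated as a line that has not been fully written yet and are
 * dropped.  Returns a NULL-terminated array of malloc'd strings (each one
 * keeps its trailing '\n'), or NULL on allocation failure.
 */
static char **
split_complete_lines(const char *buf, size_t len)
{
    size_t nlines = 0;
    size_t n = 0;
    const char *start = buf;
    char **result;

    /* Pass 1: only complete lines count, so count newlines. */
    for (size_t i = 0; i < len; i++)
        if (buf[i] == '\n')
            nlines++;

    result = malloc((nlines + 1) * sizeof(char *));
    if (result == NULL)
        return NULL;

    /* Pass 2: copy each complete line into its own string. */
    for (size_t i = 0; i < len; i++)
    {
        if (buf[i] == '\n')
        {
            size_t linelen = (size_t) (&buf[i] - start) + 1;
            char *line = malloc(linelen + 1);

            if (line == NULL)
            {
                while (n > 0)
                    free(result[--n]);
                free(result);
                return NULL;
            }
            memcpy(line, start, linelen);
            line[linelen] = '\0';
            result[n++] = line;
            start = &buf[i + 1];
        }
    }
    result[n] = NULL;
    return result;
}
```

Given the bytes "1234\n/data\npartial", the sketch returns two lines and silently drops "partial", the behavior wanted when the file may still be being appended to.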

M src/bin/pg_ctl/pg_ctl.c

Fix planning of non-strict equivalence clauses above outer joins.

commit   : afdc7515fdadf51ea459aab568b5b7687b552e75    
  
author   : Tom Lane <[email protected]>    
date     : Thu, 18 Oct 2012 12:29:06 -0400    
  
committer: Tom Lane <[email protected]>    
date     : Thu, 18 Oct 2012 12:29:06 -0400    

Click here for diff

If a potential equivalence clause references a variable from the nullable  
side of an outer join, the planner needs to take care that derived clauses  
are not pushed to below the outer join; else they may use the wrong value  
for the variable.  (The problem arises only with non-strict clauses, since  
if an upper clause can be proven strict then the outer join will get  
simplified to a plain join.)  The planner attempted to prevent this type  
of error by checking that potential equivalence clauses aren't  
outerjoin-delayed as a whole, but actually we have to check each side  
separately, since the two sides of the clause will get moved around  
separately if it's treated as an equivalence.  Bugs of this type can be  
demonstrated as far back as 7.4, even though releases before 8.3 had only  
a very ad-hoc notion of equivalence clauses.  
  
In addition, we neglected to account for the possibility that such clauses  
might have nonempty nullable_relids even when not outerjoin-delayed; so the  
equivalence-class machinery lacked logic to compute correct nullable_relids  
values for clauses it constructs.  This oversight was harmless before 9.2  
because we were only using RestrictInfo.nullable_relids for OR clauses;  
but as of 9.2 it could result in pushing constructed equivalence clauses  
to incorrect places.  (This accounts for bug #7604 from Bill MacArthur.)  
  
Fix the first problem by adding a new test check_equivalence_delay() in  
distribute_qual_to_rels, and fix the second one by adding code in  
equivclass.c and called functions to set correct nullable_relids for  
generated clauses.  Although I believe the second part of this is not  
currently necessary before 9.2, I chose to back-patch it anyway, partly to  
keep the logic similar across branches and partly because it seems possible  
we might find other reasons why we need valid values of nullable_relids in  
the older branches.  
  
Add regression tests illustrating these problems.  In 9.0 and up, also  
add test cases checking that we can push constants through outer joins,  
since we've broken that optimization before and I nearly broke it again  
with an overly simplistic patch for this problem.  

M src/backend/nodes/outfuncs.c
M src/backend/optimizer/path/equivclass.c
M src/backend/optimizer/plan/initsplan.c
M src/include/nodes/relation.h
M src/include/optimizer/planmain.h
M src/test/regress/expected/join.out
M src/test/regress/sql/join.sql

Fix typo in previous commit

commit   : df35b7e33ef6354ef8530ba9eabcdb2010b0ca39    
  
author   : Simon Riggs <[email protected]>    
date     : Wed, 17 Oct 2012 09:21:29 +0100    
  
committer: Simon Riggs <[email protected]>    
date     : Wed, 17 Oct 2012 09:21:29 +0100    

Click here for diff

M doc/src/sgml/indices.sgml
M doc/src/sgml/ref/create_index.sgml

Clarify hash index caution and copy to CREATE INDEX docs

commit   : 98188441972151995d485b2e2c18a8cf36a3a227    
  
author   : Simon Riggs <[email protected]>    
date     : Wed, 17 Oct 2012 08:33:38 +0100    
  
committer: Simon Riggs <[email protected]>    
date     : Wed, 17 Oct 2012 08:33:38 +0100    

Click here for diff

M doc/src/sgml/indices.sgml
M doc/src/sgml/ref/create_index.sgml

Fix race condition in pg_ctl reading postmaster.pid.

commit   : 1c95b5eea6bec742d5d6ee097c55690712dd3be6    
  
author   : Heikki Linnakangas <[email protected]>    
date     : Sat, 13 Oct 2012 12:48:14 +0300    
  
committer: Heikki Linnakangas <[email protected]>    
date     : Sat, 13 Oct 2012 12:48:14 +0300    

Click here for diff

If postmaster changed postmaster.pid while pg_ctl was reading it, pg_ctl  
could overrun the buffer it allocated for the file. Fix by reading the  
whole file to memory with one read() call.  
  
initdb contains an identical copy of the readfile() function, but the files  
that initdb reads are static, not modified concurrently. Nevertheless, add  
a simple bounds-check there, if only to silence static analysis tools.  
  
Per report from Dave Vitek. Backpatch to all supported branches.  
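
A minimal sketch of the single-read approach, assuming a POSIX system. slurp_once() is a hypothetical name and this is not the pg_ctl source. The point is that the buffer is sized from fstat() and everything afterwards trusts only the byte count read() returned, so a file that grows or shrinks in between cannot cause an overrun.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Read a file into a freshly malloc'd, NUL-terminated buffer using one
 * read() call.  Returns NULL on any failure; on success *len_out is the
 * number of bytes actually read.  The descriptor is closed on every path.
 */
static char *
slurp_once(const char *path, size_t *len_out)
{
    struct stat st;
    char *buf;
    size_t cap;
    ssize_t nread;
    int fd = open(path, O_RDONLY);

    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) < 0 || st.st_size < 0)
    {
        close(fd);
        return NULL;
    }

    cap = (size_t) st.st_size;
    buf = malloc(cap + 1);      /* +1 for the terminator */
    if (buf == NULL)
    {
        close(fd);
        return NULL;
    }

    /* One read, bounded by the buffer size, never by a later size check. */
    nread = read(fd, buf, cap);
    close(fd);
    if (nread < 0)
    {
        free(buf);
        return NULL;
    }

    buf[nread] = '\0';          /* nread <= cap, so this is in bounds */
    *len_out = (size_t) nread;
    return buf;
}
```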

M src/bin/initdb/initdb.c
M src/bin/pg_ctl/pg_ctl.c

Fix cross-type case in partial row matching for hashed subplans.

commit   : 5c07de4bdfadd2ea73c66dcad3b0e4d5ca4ceef7    
  
author   : Tom Lane <[email protected]>    
date     : Thu, 11 Oct 2012 12:21:18 -0400    
  
committer: Tom Lane <[email protected]>    
date     : Thu, 11 Oct 2012 12:21:18 -0400    

Click here for diff

When hashing a subplan like "WHERE (a, b) NOT IN (SELECT x, y FROM ...)",  
findPartialMatch() attempted to match rows using the hashtable's internal  
equality operators, which of course are for x and y's datatypes.  What we  
need to use are the potentially cross-type operators for a=x, b=y, etc.  
Failure to do that leads to wrong answers or even crashes.  The scope for  
problems is limited to cases where we have different types with compatible  
hash functions (else we'd not be using a hashed subplan), but for example  
int4 vs int8 can cause the problem.  
  
Per bug #7597 from Bo Jensen.  This has been wrong since the hashed-subplan  
code was written, so patch all the way back.  

M src/backend/executor/nodeSubplan.c
M src/test/regress/expected/subselect.out
M src/test/regress/sql/subselect.sql

Fix PGXS support for building loadable modules on AIX.

commit   : 6975dcddb23219bf842da7f622ea38d42e5adebf    
  
author   : Tom Lane <[email protected]>    
date     : Tue, 9 Oct 2012 21:04:20 -0400    
  
committer: Tom Lane <[email protected]>    
date     : Tue, 9 Oct 2012 21:04:20 -0400    

Click here for diff

Building a shlib on AIX requires use of the mkldexport.sh script, but we  
failed to install that, preventing its use from non-source-tree contexts.  
Also, Makefile.aix had the wrong idea about where to find the installed  
copy of the postgres.imp symbol file used by AIX.  
  
Per report from John Pierce.  Patch all the way back, since this has been  
broken since the beginning of PGXS.  

M src/backend/Makefile
M src/makefiles/Makefile.aix

Fix lo_import and lo_export to return useful error messages more often.

commit   : 875406af6ff961ba169db11c74828f02e96f5753    
  
author   : Tom Lane <[email protected]>    
date     : Mon, 8 Oct 2012 21:52:53 -0400    
  
committer: Tom Lane <[email protected]>    
date     : Mon, 8 Oct 2012 21:52:53 -0400    

Click here for diff

I found that these functions tend to return -1 while leaving an empty error  
message string in the PGconn, if they suffer some kind of I/O error on the  
file.  The reason is that lo_close, which thinks it's executed a perfectly  
fine SQL command, clears the errorMessage.  The minimum-change workaround  
is to reorder operations here so that we don't fill the errorMessage until  
after lo_close.  
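
The ordering problem can be reproduced with a toy connection object. This is a sketch only: toy_conn, toy_close() and toy_export() are hypothetical and merely mimic the behavior described above, in which a successful close clears the stored message. The 1 / -1 return convention is assumed for the example.

```c
#include <stdio.h>

/* Hypothetical stand-in for the error buffer kept in a connection. */
typedef struct
{
    char errorMessage[128];
} toy_conn;

/* Like a command that succeeds: it resets the error message. */
static int
toy_close(toy_conn *conn)
{
    conn->errorMessage[0] = '\0';
    return 0;
}

/*
 * Export with a simulated file I/O outcome.  Returns 1 on success and
 * -1 on failure.
 */
static int
toy_export(toy_conn *conn, int io_failed)
{
    int result = io_failed ? -1 : 1;

    /* Close first: this wipes whatever message was stored before. */
    (void) toy_close(conn);

    /* Fill in the message only now, so it survives until the caller reads it. */
    if (result < 0)
        snprintf(conn->errorMessage, sizeof(conn->errorMessage),
                 "could not write to file");
    return result;
}
```

Had the message been stored before toy_close(), the caller would have seen -1 together with an empty string, the symptom described in this commit.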

M src/interfaces/libpq/fe-lobj.c

Fix lo_export usage in example programs.

commit   : 1e9d79856dacf38570e9d9080081c5ba60841f54    
  
author   : Tom Lane <[email protected]>    
date     : Mon, 8 Oct 2012 21:19:01 -0400    
  
committer: Tom Lane <[email protected]>    
date     : Mon, 8 Oct 2012 21:19:01 -0400    

Click here for diff

lo_export returns -1, not zero, on failure.  

M src/test/examples/testlo.c

Say ANALYZE, not VACUUM, in error message on analyze in hot standby.

commit   : d09affdc480786dd38deae5fcdbf472fbe324117    
  
author   : Heikki Linnakangas <[email protected]>    
date     : Mon, 8 Oct 2012 14:17:27 +0300    
  
committer: Heikki Linnakangas <[email protected]>    
date     : Mon, 8 Oct 2012 14:17:27 +0300    

Click here for diff

Tomonaru Katsumata  

M src/backend/tcop/utility.c

Removed sentence about not being able to retrieve more than one row at a time, because it is not correct.

commit   : bea34106a6423eb2121f74b892c9020aa184919b    
  
author   : Michael Meskes <[email protected]>    
date     : Fri, 5 Oct 2012 16:54:49 +0200    
  
committer: Michael Meskes <[email protected]>    
date     : Fri, 5 Oct 2012 16:54:49 +0200    

Click here for diff

M doc/src/sgml/ecpg.sgml

Fixed test for array boundary.

commit   : 6b0d71bf71f748c80dc84d5a52f4f60a098bb192    
  
author   : Michael Meskes <[email protected]>    
date     : Fri, 5 Oct 2012 16:37:45 +0200    
  
committer: Michael Meskes <[email protected]>    
date     : Fri, 5 Oct 2012 16:37:45 +0200    

Click here for diff

get_data() is supposed to continue when the next character is not an array  
boundary.  Instead it continued only on finding a boundary, so it was not  
able to read any element after the first.  
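
The intended loop condition can be sketched with a small parser for text such as "{1,2,3}". This is not ecpglib's get_data(): parse_int_array() is a hypothetical helper and the brace-and-comma syntax is assumed for the example.

```c
#include <stdlib.h>

/*
 * Parse a brace-delimited, comma-separated list of integers into out[].
 * Returns the number of elements stored (at most max).
 */
static size_t
parse_int_array(const char *s, long *out, size_t max)
{
    size_t n = 0;

    if (*s != '{')
        return 0;
    s++;

    /* Keep going while the next character is NOT the closing boundary. */
    while (*s != '\0' && *s != '}' && n < max)
    {
        char *end;
        long value = strtol(s, &end, 10);

        if (end == s)
            break;              /* no digits here: stop parsing */
        out[n++] = value;
        s = end;
        if (*s == ',')
            s++;                /* element delimiter, move to the next one */
    }
    return n;
}
```

Inverting the test to loop only while a boundary is seen would end the loop after the first element, the symptom this commit addresses.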

M src/interfaces/ecpg/ecpglib/data.c

Fix permissions explanations in CREATE DATABASE and CREATE SCHEMA docs.

commit   : 0f6d0119f2b3e0957c67e6cb5ec84e669718bd27    
  
author   : Tom Lane <[email protected]>    
date     : Thu, 4 Oct 2012 13:41:12 -0400    
  
committer: Tom Lane <[email protected]>    
date     : Thu, 4 Oct 2012 13:41:12 -0400    

Click here for diff

These reference pages still claimed that you have to be superuser to create  
a database or schema owned by a different role.  That was true before 8.1,  
but it was changed in commits aa1110624c08298393dfce996f7b21809d98d3fd and  
f91370cd2faf1fd35a1ac74d84652a85ed841919 to allow assignment of ownership  
to any role you are a member of.  However, at the time we were thinking of  
that primarily as a change to the ALTER OWNER rules, so the need to touch  
these two CREATE ref pages got missed.  

M doc/src/sgml/ref/create_database.sgml
M doc/src/sgml/ref/create_schema.sgml

REASSIGN OWNED: consider grants on tablespaces, too

commit   : bec6e6cdfaa8987e82c328bf72cdc30d517e09f1    
  
author   : Alvaro Herrera <[email protected]>    
date     : Wed, 3 Oct 2012 12:22:41 -0300    
  
committer: Alvaro Herrera <[email protected]>    
date     : Wed, 3 Oct 2012 12:22:41 -0300    

Click here for diff

Apparently this was considered in the original code (see commit  
cec3b0a9) but I failed to notice that such entries would always be  
skipped by the database check at the start of the loop.  
  
Per bugs #7578 by Nikolay, #6116 by [email protected].  

M src/backend/catalog/pg_shdepend.c

Fix access past end of string in date parsing.

commit   : 3e291cab0829f48215061a67cc07371cca3cf2e1    
  
author   : Heikki Linnakangas <[email protected]>    
date     : Tue, 2 Oct 2012 10:43:48 +0300    
  
committer: Heikki Linnakangas <[email protected]>    
date     : Tue, 2 Oct 2012 10:43:48 +0300    

Click here for diff

This affects date_in(), and a couple of other functions that use DecodeDate().  
  
Hitoshi Harada  

M src/backend/utils/adt/datetime.c

Fix bugs in "restore.sql" script emitted in pg_dump tar output.

commit   : 54152d0faa2bf31930d6eede80bac12bafa21232    
  
author   : Tom Lane <[email protected]>    
date     : Sat, 29 Sep 2012 17:56:54 -0400    
  
committer: Tom Lane <[email protected]>    
date     : Sat, 29 Sep 2012 17:56:54 -0400    

Click here for diff

The tar output module did some very ugly and ultimately incorrect hacking  
on COPY commands to try to get them to work in the context of restoring a  
deconstructed tar archive.  In particular, it would fail altogether for  
table names containing any upper-case characters, since it smashed the  
command string to lower-case before modifying it (and, just to add insult  
to injury, did that in a way that would fail in multibyte encodings).  
I don't see any particular value in being flexible about the case of the  
command keywords, since the string will just have been created by  
dumpTableData, so let's get rid of the whole case-folding thing.  
  
Also, it doesn't seem to meet the POLA for the script to restore data only  
in COPY mode, so add \i commands to make it have comparable behavior in  
--inserts mode.  
  
Noted while looking at the tar-output code in connection with Brian  
Weaver's patch.  

M src/bin/pg_dump/pg_backup_tar.c

Fix pg_restore to accept POSIX-conformant tar files.

commit   : 158cbedfba84790480a4474f7b49d4e6e9e8b03c    
  
author   : Tom Lane <[email protected]>    
date     : Fri, 28 Sep 2012 15:42:22 -0400    
  
committer: Tom Lane <[email protected]>    
date     : Fri, 28 Sep 2012 15:42:22 -0400    

Click here for diff

Back-patch portions of commit 05b555d12bc2ad0d581f48a12b45174db41dc10d.  
We need to patch pg_restore to accept either version of the magic string,  
in hopes of avoiding compatibility problems when 9.3 comes out.  I also  
fixed pg_dump to write the correct 2-block EOF marker, since that won't  
create a compatibility problem with pg_restore and it could help with some  
versions of tar.  
  
Brian Weaver and Tom Lane  
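
A rough sketch of "accept either version of the magic string" and of the 2-block EOF marker, not the pg_backup_tar.c code. The POSIX ustar layout used here (magic "ustar" plus NUL at offset 257, version "00" at offset 263, archive end marked by two zero-filled 512-byte blocks) comes from the tar format itself. The function names are hypothetical, and the legacy byte pattern is supplied by the caller rather than asserted here.

```c
#include <string.h>

#define TAR_BLOCK_SIZE      512
#define TAR_MAGIC_OFFSET    257     /* ustar "magic" field, 6 bytes */
#define TAR_VERSION_OFFSET  263     /* ustar "version" field, 2 bytes */

/* POSIX ustar: magic is "ustar" plus NUL, version is "00" with no NUL. */
static int
is_posix_ustar(const unsigned char *header)
{
    return memcmp(header + TAR_MAGIC_OFFSET, "ustar\0", 6) == 0 &&
           memcmp(header + TAR_VERSION_OFFSET, "00", 2) == 0;
}

/*
 * Accept a header that is either POSIX-conformant or carries the given
 * legacy 8-byte magic+version pattern (may be NULL to accept POSIX only).
 */
static int
header_magic_ok(const unsigned char *header, const char *legacy8)
{
    if (is_posix_ustar(header))
        return 1;
    return legacy8 != NULL &&
           memcmp(header + TAR_MAGIC_OFFSET, legacy8, 8) == 0;
}

/* End of archive: two consecutive blocks consisting only of zero bytes. */
static int
is_eof_marker(const unsigned char *p, size_t len)
{
    if (len < 2 * TAR_BLOCK_SIZE)
        return 0;
    for (size_t i = 0; i < 2 * TAR_BLOCK_SIZE; i++)
        if (p[i] != 0)
            return 0;
    return 1;
}
```

A reader built this way keeps working on archives written with the older pattern while also accepting conformant ones. For example, header_magic_ok(h, "ustar00") would accept a header whose eight bytes at offset 257 are "ustar00" plus NUL; that particular legacy value is an assumption made for illustration.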

M src/bin/pg_dump/pg_backup_tar.c

Fix examples of how to use "su" while starting the server.

commit   : cf78eb4a2d80a614d4ee1a161b71798d7250e57b    
  
author   : Tom Lane <[email protected]>    
date     : Tue, 25 Sep 2012 13:53:05 -0400    
  
committer: Tom Lane <[email protected]>    
date     : Tue, 25 Sep 2012 13:53:05 -0400    

Click here for diff

The syntax "su -c 'command' username" is not accepted by all versions of  
su, for example not OpenBSD's.  More portable is "su username -c  
'command'".  So change runtime.sgml to recommend that syntax.  Also,  
add a -D switch to the OpenBSD example script, for consistency with other  
examples.  Per Denis Lapshin and Gábor Hidvégi.  

M doc/src/sgml/runtime.sgml