commit : 757f567ec8d6d4767e74cf987a5cc63e63f1c9dc author : Tom Lane <firstname.lastname@example.org> date : Mon, 8 Aug 2016 16:31:43 -0400 committer: Tom Lane <email@example.com> date : Mon, 8 Aug 2016 16:31:43 -0400
Last-minute updates for release notes.
commit : 43957e873d22ce5d279a91d3acda6fc9e848fda7 author : Tom Lane <firstname.lastname@example.org> date : Mon, 8 Aug 2016 11:56:10 -0400 committer: Tom Lane <email@example.com> date : Mon, 8 Aug 2016 11:56:10 -0400
Security: CVE-2016-5423, CVE-2016-5424
Fix several one-byte buffer over-reads in to_number
commit : 43d7a0af5340b32161bbc89cdb886f4211fa1496 author : Peter Eisentraut <firstname.lastname@example.org> date : Mon, 8 Aug 2016 11:12:59 -0400 committer: Peter Eisentraut <email@example.com> date : Mon, 8 Aug 2016 11:12:59 -0400
Several places in NUM_numpart_from_char(), which is called from the SQL function to_number(text, text), could accidentally read one byte past the end of the input buffer (which comes from the input text datum and is not null-terminated).

1. One leading space character would be skipped, but there was no check that the input was at least one byte long. This does not happen in practice, but for defensiveness, add a check anyway.

2. Commit 4a3a1e2cf apparently accidentally doubled that code that skips one space character (so that two spaces might be skipped), but there was no overflow check before skipping the second byte. Fix by removing that duplicate code.

3. A logic error would allow a one-byte over-read when looking for a trailing sign (S) placeholder.

In each case, the extra byte cannot be read out directly, but looking at it might cause a crash. The third item was discovered by Piotr Stefaniak, the first two were found and analyzed by Tom Lane and Peter Eisentraut.
commit : a35c2d902cc28967e424cdbf832ecec3f904843b author : Peter Eisentraut <firstname.lastname@example.org> date : Mon, 8 Aug 2016 10:53:45 -0400 committer: Peter Eisentraut <email@example.com> date : Mon, 8 Aug 2016 10:53:45 -0400
Source-Git-URL: git://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 27a0ac4549d67fb3e07f19d15cdad9f8695b7e7c
Fix two errors with nested CASE/WHEN constructs.
commit : 6c954a6a5f52cf04c6a99134331b42fa94cee606 author : Tom Lane <firstname.lastname@example.org> date : Mon, 8 Aug 2016 10:33:47 -0400 committer: Tom Lane <email@example.com> date : Mon, 8 Aug 2016 10:33:47 -0400
ExecEvalCase() tried to save a cycle or two by passing &econtext->caseValue_isNull as the isNull argument to its sub-evaluation of the CASE value expression. If that subexpression itself contained a CASE, then *isNull was an alias for econtext->caseValue_isNull within the recursive call of ExecEvalCase(), leading to confusion about whether the inner call's caseValue was null or not. In the worst case this could lead to a core dump due to dereferencing a null pointer. Fix by not assigning to the global variable until control comes back from the subexpression. Also, avoid using the passed-in isNull pointer transiently for evaluation of WHEN expressions. (Either one of these changes would have been sufficient to fix the known misbehavior, but it's clear now that each of these choices was in itself dangerous coding practice and best avoided. There do not seem to be any similar hazards elsewhere in execQual.c.) Also, it was possible for inlining of a SQL function that implements the equality operator used for a CASE comparison to result in one CASE expression's CaseTestExpr node being inserted inside another CASE expression. This would certainly result in wrong answers since the improperly nested CaseTestExpr would be caused to return the inner CASE's comparison value not the outer's. If the CASE values were of different data types, a crash might result; moreover such situations could be abused to allow disclosure of portions of server memory. To fix, teach inline_function to check for "bare" CaseTestExpr nodes in the arguments of a function to be inlined, and avoid inlining if there are any. Heikki Linnakangas, Michael Paquier, Tom Lane Report: https://github.com/greenplum-db/gpdb/pull/327 Report: <4DDCEEB8.firstname.lastname@example.org> Security: CVE-2016-5423
Obstruct shell, SQL, and conninfo injection via database and role names.
commit : 95a6855c5508fed7e327fe28e6d3ffb614a406bf author : Noah Misch <email@example.com> date : Mon, 8 Aug 2016 10:07:46 -0400 committer: Noah Misch <firstname.lastname@example.org> date : Mon, 8 Aug 2016 10:07:46 -0400
Due to simplistic quoting and confusion of database names with conninfo strings, roles with the CREATEDB or CREATEROLE option could escalate to superuser privileges when a superuser next ran certain maintenance commands. The new coding rule for PQconnectdbParams() calls, documented at conninfo_array_parse(), is to pass expand_dbname=true and wrap literal database names in a trivial connection string. Escape zero-length values in appendConnStrVal(). Back-patch to 9.1 (all supported versions). Nathan Bossart, Michael Paquier, and Noah Misch. Reviewed by Peter Eisentraut. Reported by Nathan Bossart. Security: CVE-2016-5424
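The rule of wrapping a literal database name in a trivial connection string can be sketched as follows. This is a minimal Python illustration, not the C code from the commit: the names conn_str_val and dbname_conninfo are hypothetical, and the set of characters treated as safe to leave unquoted (ASCII letters, digits, underscore, period) is an assumption made for the example.

```python
import string

# Assumption for this sketch: only these characters may appear unquoted.
SAFE_CHARS = set(string.ascii_letters + string.digits + "_.")


def conn_str_val(value: str) -> str:
    """Quote a value for use inside a conninfo string (hypothetical helper).

    Empty values and values containing any character outside SAFE_CHARS
    are wrapped in single quotes, with backslash and single quote escaped
    by a backslash.
    """
    if value and all(ch in SAFE_CHARS for ch in value):
        return value
    escaped = value.replace("\\", "\\\\").replace("'", "\\'")
    return "'" + escaped + "'"


def dbname_conninfo(dbname: str) -> str:
    """Wrap a literal database name in a trivial connection string."""
    return "dbname=" + conn_str_val(dbname)
```

With this shape, a database whose name looks like a connection string, for example "host=evil dbname=x", is passed along as a single quoted dbname value rather than being parsed as separate parameters.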
Promote pg_dumpall shell/connstr quoting functions to src/fe_utils.
commit : c1b048f498fe026da997d0e15e4eeee38ee0c592 author : Noah Misch <email@example.com> date : Mon, 8 Aug 2016 10:07:46 -0400 committer: Noah Misch <firstname.lastname@example.org> date : Mon, 8 Aug 2016 10:07:46 -0400
Rename these newly-extern functions with terms more typical of their new neighbors. No functional changes; a subsequent commit will use them in more places. Back-patch to 9.1 (all supported versions). Back branches lack src/fe_utils, so instead rename the functions in place; the subsequent commit will copy them into the other programs using them. Security: CVE-2016-5424
Fix Windows shell argument quoting.
commit : 395d565ac76b6fe5a9a97fb5e87e0d0842ba9824 author : Noah Misch <email@example.com> date : Mon, 8 Aug 2016 10:07:46 -0400 committer: Noah Misch <firstname.lastname@example.org> date : Mon, 8 Aug 2016 10:07:46 -0400
The incorrect quoting may have permitted arbitrary command execution. At a minimum, it gave broader control over the command line to actors supposed to have control over a single argument. Back-patch to 9.1 (all supported versions). Security: CVE-2016-5424
Reject, in pg_dumpall, names containing CR or LF.
commit : 0f679d2c1cb0ef5fc43133ebebf489b82b929214 author : Noah Misch <email@example.com> date : Mon, 8 Aug 2016 10:07:46 -0400 committer: Noah Misch <firstname.lastname@example.org> date : Mon, 8 Aug 2016 10:07:46 -0400
These characters prematurely terminate Windows shell command processing, causing the shell to execute a prefix of the intended command. The chief alternative to rejecting these characters was to bypass the Windows shell with CreateProcess(), but the ability to use such names has little value. Back-patch to 9.1 (all supported versions). This change formally revokes support for these characters in database names and role names. Don't document this; the error message is self-explanatory, and too few users would benefit. A future major release may forbid creation of databases and roles so named. For now, check only at known weak points in pg_dumpall. Future commits will, without notice, reject affected names from other frontend programs. Also extend the restriction to pg_dumpall --dbname=CONNSTR arguments and --file arguments. Unlike the effects on role name arguments and database names, this does not reflect a broad policy change. A migration to CreateProcess() could lift these two restrictions. Reviewed by Peter Eisentraut. Security: CVE-2016-5424
Field conninfo strings throughout src/bin/scripts.
commit : 05abd3bcfe4e8742208b5c766be60feef73bb0ef author : Noah Misch <email@example.com> date : Mon, 8 Aug 2016 10:07:46 -0400 committer: Noah Misch <firstname.lastname@example.org> date : Mon, 8 Aug 2016 10:07:46 -0400
These programs nominally accepted conninfo strings, but they would proceed to use the original dbname parameter as though it were an unadorned database name. This caused "reindexdb dbname=foo" to issue an SQL command that always failed, and other programs printed a conninfo string in error messages that purported to print a database name. Fix both problems by using PQdb() to retrieve actual database names. Continue to print the full conninfo string when reporting a connection failure. It is informative there, and if the database name is the sole problem, the server-side error message will include the name. Beyond those user-visible fixes, this allows a subsequent commit to synthesize and use conninfo strings without that implementation detail leaking into messages. As a side effect, the "vacuuming database" message now appears after, not before, the connection attempt. Back-patch to 9.1 (all supported versions). Reviewed by Michael Paquier and Peter Eisentraut. Security: CVE-2016-5424
Introduce a psql "\connect -reuse-previous=on|off" option.
commit : dfb2d8039eb714d6b582a8cb8a7993c98b88a224 author : Noah Misch <email@example.com> date : Mon, 8 Aug 2016 10:07:46 -0400 committer: Noah Misch <firstname.lastname@example.org> date : Mon, 8 Aug 2016 10:07:46 -0400
The decision to reuse values of parameters from a previous connection has been based on whether the new target is a conninfo string. Add this means of overriding that default. This feature arose as one component of a fix for security vulnerabilities in pg_dump, pg_dumpall, and pg_upgrade, so back-patch to 9.1 (all supported versions). In 9.3 and later, comment paragraphs that required update had already-incorrect claims about behavior when no connection is open; fix those problems. Security: CVE-2016-5424
Sort out paired double quotes in \connect, \password and \crosstabview.
commit : a44d713512222a519701b7f5ad2634dc3a8fc24b author : Noah Misch <email@example.com> date : Mon, 8 Aug 2016 10:07:46 -0400 committer: Noah Misch <firstname.lastname@example.org> date : Mon, 8 Aug 2016 10:07:46 -0400
In arguments, these meta-commands wrongly treated each pair as closing the double quoted string. Make the behavior match the documentation. This is a compatibility break, but I more expect to find software with untested reliance on the documented behavior than software reliant on today's behavior. Back-patch to 9.1 (all supported versions). Reviewed by Tom Lane and Peter Eisentraut. Security: CVE-2016-5424
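The intended treatment of paired double quotes can be sketched as follows. This is a simplified Python illustration that assumes the documented rule is "inside a double-quoted section, two consecutive double quotes stand for one literal double quote"; the function name dequote is hypothetical, and other aspects of psql argument parsing are deliberately left out.

```python
def dequote(arg: str) -> str:
    """Strip double quotes from an argument (hypothetical helper).

    Inside a quoted section, a pair of double quotes yields one literal
    double quote instead of closing the quoted string.
    """
    out = []
    in_quotes = False
    i = 0
    while i < len(arg):
        ch = arg[i]
        if ch == '"':
            if in_quotes and i + 1 < len(arg) and arg[i + 1] == '"':
                # Paired quote inside a quoted section: keep one quote.
                out.append('"')
                i += 2
                continue
            # Unpaired quote: opens or closes a quoted section.
            in_quotes = not in_quotes
        else:
            out.append(ch)
        i += 1
    return "".join(out)
```

Under the old behavior described above, each pair would instead have closed and immediately reopened the quoted string, so the embedded quote character was lost.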
Release notes for 9.5.4, 9.4.9, 9.3.14, 9.2.18, 9.1.23.
commit : 56e410c86d510eb64aa160f5c39e1a543be5c7f7 author : Tom Lane <email@example.com> date : Sun, 7 Aug 2016 21:31:02 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Sun, 7 Aug 2016 21:31:02 -0400
Fix misestimation of n_distinct for a nearly-unique column with many nulls.
commit : 20a85950434f750e3790176500b89d38ff123129 author : Tom Lane <email@example.com> date : Sun, 7 Aug 2016 18:52:02 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Sun, 7 Aug 2016 18:52:02 -0400
If ANALYZE found no repeated non-null entries in its sample, it set the column's stadistinct value to -1.0, intending to indicate that the entries are all distinct. But what this value actually means is that the number of distinct values is 100% of the table's rowcount, and thus it was overestimating the number of distinct values by however many nulls there are. This could lead to very poor selectivity estimates, as for example in a recent report from Andreas Joseph Krogh. We should discount the stadistinct value by whatever we've estimated the nulls fraction to be. (That is what will happen if we choose to use a negative stadistinct for a column that does have repeated entries, so this code path was just inconsistent.) In addition to fixing the stadistinct entries stored by several different ANALYZE code paths, adjust the logic where get_variable_numdistinct() forces an "all distinct" estimate on the basis of finding a relevant unique index. Unique indexes don't reject nulls, so there's no reason to assume that the null fraction doesn't apply. Back-patch to all supported branches. Back-patching is a bit of a judgment call, but this problem seems to affect only a few users (else we'd have identified it long ago), and it's bad enough when it does happen that destabilizing plan choices in a worse direction seems unlikely. Patch by me, with documentation wording suggested by Dean Rasheed Report: <VisenaEmail.26.df42f82acae38a58.156463942b8@tc7-visena> Discussion: <email@example.com>
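The arithmetic behind the misestimate can be sketched as follows. This is an illustrative Python model only: the function names are hypothetical, and it assumes the convention described above, where a negative stadistinct is read as a fraction of the table's row count.

```python
def estimated_ndistinct(stadistinct: float, reltuples: float) -> float:
    """Interpret stadistinct: negative values are a fraction of the row count."""
    if stadistinct < 0:
        return -stadistinct * reltuples
    return stadistinct


def all_distinct_stadistinct(null_frac: float) -> float:
    """Value to store when every non-null sampled entry is distinct.

    The "all distinct" fraction is discounted by the estimated null
    fraction, instead of being stored as a flat -1.0.
    """
    return -1.0 * (1.0 - null_frac)
```

For a 1000-row table where half the entries are null, a stored value of -1.0 implies 1000 distinct values, while the discounted value of -0.5 implies 500, which matches the number of non-null rows.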
Teach libpq to decode server version correctly from future servers.
commit : c3107f18a76fded735dbffe176a668d877e4f793 author : Tom Lane <firstname.lastname@example.org> date : Fri, 5 Aug 2016 18:58:12 -0400 committer: Tom Lane <email@example.com> date : Fri, 5 Aug 2016 18:58:12 -0400
Beginning with the next development cycle, PG servers will report two-part not three-part version numbers. Fix libpq so that it will compute the correct numeric representation of such server versions for reporting by PQserverVersion(). It's desirable to get this into the field and back-patched ASAP, so that older clients are more likely to understand the new server version numbering by the time any such servers are in the wild. (The results with an old client would probably not be catastrophic anyway for a released server; for example "10.1" would be interpreted as 100100 which would be wrong in detail but would not likely cause an old client to misbehave badly. But "10devel" or "10beta1" would result in sversion==0 which at best would result in disabling all use of modern features.) Extracted from a patch by Peter Eisentraut; comments added by me Patch: <firstname.lastname@example.org>
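The difference between the old and new decoding can be sketched as follows. This is a Python model, not libpq code: the function names are hypothetical, the regular expression only approximates how a leading "major.minor.revision" prefix is scanned, and the numeric encoding chosen for two-part versions (major times 10000 plus minor) is an assumption of this sketch. The old-style results for "10.1" and "10devel" match the figures quoted in the commit message.

```python
import re

_VERSION_RE = re.compile(r"(\d+)(?:\.(\d+))?(?:\.(\d+))?")


def _leading_numbers(version: str) -> list:
    """Return the leading dotted integers of a version string (up to three)."""
    m = _VERSION_RE.match(version)
    if not m:
        return []
    return [int(g) for g in m.groups() if g is not None]


def old_decode(version: str) -> int:
    """Old behavior: at least two dotted parts are required."""
    parts = _leading_numbers(version)
    if len(parts) < 2:
        return 0
    vrev = parts[2] if len(parts) == 3 else 0
    return (100 * parts[0] + parts[1]) * 100 + vrev


def new_decode(version: str) -> int:
    """New behavior (sketch): also understand two-part and one-part versions."""
    parts = _leading_numbers(version)
    if len(parts) == 3:
        return (100 * parts[0] + parts[1]) * 100 + parts[2]
    if len(parts) == 2:
        if parts[0] >= 10:
            # Assumed encoding for two-part versions: major * 10000 + minor.
            return 10000 * parts[0] + parts[1]
        return (100 * parts[0] + parts[1]) * 100
    if len(parts) == 1:
        # Covers strings such as "10devel" or "10beta1".
        return 10000 * parts[0]
    return 0
```

With these definitions, old_decode("10.1") gives 100100 and old_decode("10devel") gives 0, while new_decode returns 100001 and 100000 respectively.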
Update time zone data files to tzdata release 2016f.
commit : 5630bd2eca9caf2af8ca9bffe4577ce5ca460d75 author : Tom Lane <email@example.com> date : Fri, 5 Aug 2016 12:58:17 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Fri, 5 Aug 2016 12:58:17 -0400
DST law changes in Kemerovo and Novosibirsk. Historical corrections for Azerbaijan, Belarus, and Morocco. Asia/Novokuznetsk and Asia/Novosibirsk now use numeric time zone abbreviations instead of invented ones. Zones for Antarctic bases and other locations that have been uninhabited for portions of the time span known to the tzdata database now report "-00" rather than "zzz" as the zone abbreviation for those time spans. Also, I decided to remove some of the timezone/data/ files that we don't use. At one time that subdirectory was a complete copy of what IANA distributes in the tzdata tarballs, but that hasn't been true for a long time. There seems no good reason to keep shipping those specific files but not others; they're just bloating our tarballs.
doc: Remove documentation of nonexistent information schema columns
commit : eaf3ba7b4964c5cf5ace0dbca2eef1a77949b71e author : Peter Eisentraut <email@example.com> date : Wed, 3 Aug 2016 13:45:55 -0400 committer: Peter Eisentraut <firstname.lastname@example.org> date : Wed, 3 Aug 2016 13:45:55 -0400
These were probably copied in by accident. From: Clément Prévost <email@example.com>
doc: OS collation changes can break indexes
commit : a6839bf0c61ccc8465ef438034312aff0a1d8a95 author : Bruce Momjian <firstname.lastname@example.org> date : Tue, 2 Aug 2016 17:13:10 -0400 committer: Bruce Momjian <email@example.com> date : Tue, 2 Aug 2016 17:13:10 -0400
Discussion: 20160702155517.GD18610@momjian.us Reviewed-by: Christoph Berg Backpatch-through: 9.1
Fix pg_dump's handling of public schema with both -c and -C options.
commit : 6693c9d7bffbd8481b13ccad2d164d3f69398fa6 author : Tom Lane <firstname.lastname@example.org> date : Tue, 2 Aug 2016 12:48:51 -0400 committer: Tom Lane <email@example.com> date : Tue, 2 Aug 2016 12:48:51 -0400
Since -c plus -C requests dropping and recreating the target database as a whole, not dropping individual objects in it, we should assume that the public schema already exists and need not be created. The previous coding considered only the state of the -c option, so it would emit "CREATE SCHEMA public" anyway, leading to an unexpected error in restore. Back-patch to 9.2. Older versions did not accept -c with -C so the issue doesn't arise there. (The logic being patched here dates to 8.0, cf commit 2193121fa, so it's not really wrong that it didn't consider the case at the time.) Note that versions before 9.6 will still attempt to emit REVOKE/GRANT on the public schema; but that happens without -c/-C too, and doesn't seem to be the focus of this complaint. I considered extending this stanza to also skip the public schema's ACL, but that would be a misfeature, as it'd break cases where users intentionally changed that ACL. The real fix for this aspect is Stephen Frost's work to not dump built-in ACLs, and that's not going to get back-ported. Per bugs #13804 and #14271. Solution found by David Johnston and later rediscovered by me. Report: <firstname.lastname@example.org> Report: <email@example.com>
Fixed array checking code for "unsigned long long" datatypes in libecpg.
commit : 3ca359426c09d4a928f4603f531bed2f46648bcb author : Michael Meskes <firstname.lastname@example.org> date : Mon, 1 Aug 2016 06:36:27 +0200 committer: Michael Meskes <email@example.com> date : Mon, 1 Aug 2016 06:36:27 +0200
Fix pg_basebackup so that it accepts 0 as a valid compression level.
commit : 013f423729a8e7b75bbb80db5212fe320a146343 author : Fujii Masao <firstname.lastname@example.org> date : Mon, 1 Aug 2016 17:36:14 +0900 committer: Fujii Masao <email@example.com> date : Mon, 1 Aug 2016 17:36:14 +0900
The help message for pg_basebackup specifies that the numbers 0 through 9 are accepted as valid values of -Z option. But, previously -Z 0 was rejected as an invalid compression level. Per discussion, it's better to make pg_basebackup treat 0 as valid compression level meaning no compression, like pg_dump. Back-patch to all supported versions. Reported-By: Jeff Janes Reviewed-By: Amit Kapila Discussion: CAMkU=1x+GwjSayc57v6w87ij6iRGFWt=hVfM0B64b1_bPVKRqg@mail.gmail.com
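The corrected range check can be sketched as follows. This is an illustrative Python version with a hypothetical function name; the real option parsing is in C and its error reporting differs.

```python
def parse_compress_level(arg: str) -> int:
    """Parse a -Z argument; 0 (no compression) through 9 are accepted."""
    level = int(arg)  # raises ValueError for non-numeric input
    if not 0 <= level <= 9:
        raise ValueError("invalid compression level: " + arg)
    return level
```

The earlier behavior corresponds to a lower bound of 1 in the range check, which is why "-Z 0" was rejected despite the help text.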
Doc: remove claim that hash index creation depends on effective_cache_size.
commit : 9c6e5942ac1a23de201706c1cb12c5f35e94edea author : Tom Lane <firstname.lastname@example.org> date : Sun, 31 Jul 2016 18:32:34 -0400 committer: Tom Lane <email@example.com> date : Sun, 31 Jul 2016 18:32:34 -0400
This text was added by commit ff213239c, and not long thereafter obsoleted by commit 4adc2f72a (which made the test depend on NBuffers instead); but nobody noticed the need for an update. Commit 9563d5b5e adds some further dependency on maintenance_work_mem, but the existing verbiage seems to cover that with about as much precision as we really want here. Let's just take it all out rather than leaving ourselves open to more errors of omission in future. (That solution makes this change back-patchable, too.) Noted by Peter Geoghegan. Discussion: <CAM3SWZRVANbj9GA9j40fAwheQCZQtSwqTN1GBTVwRrRbmSf7cg@mail.gmail.com>
doc: apply hyphen fix that was not backpatched
commit : 67defa882b9c97bee3e097d90057403821df2a84 author : Bruce Momjian <firstname.lastname@example.org> date : Sat, 30 Jul 2016 14:52:17 -0400 committer: Bruce Momjian <email@example.com> date : Sat, 30 Jul 2016 14:52:17 -0400
Head patch was 42ec6c2da699e8e0b1774988fa97297a2cdf716c. Reported-by: Alexander Law Discussion: 5785FBE7.firstname.lastname@example.org Backpatch-through: 9.1
Guard against empty buffer in gets_fromFile()'s check for a newline.
commit : ddec1269478199d70fb8102cf764a46ba142764f author : Tom Lane <email@example.com> date : Thu, 28 Jul 2016 18:57:24 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Thu, 28 Jul 2016 18:57:24 -0400
Per the fgets() specification, it cannot return without reading some data unless it reports EOF or error. So the code here assumed that the data buffer would necessarily be nonempty when we go to check for a newline having been read. However, Agostino Sarubbo noticed that this could fail to be true if the first byte of the data is a NUL (\0). The fgets() API doesn't really work for embedded NULs, which is something I don't feel any great need for us to worry about since we generally don't allow NULs in SQL strings anyway. But we should not access off the end of our own buffer if the case occurs. Normally this would just be a harmless read, but if you were unlucky the byte before the buffer would contain '\n' and we'd overwrite it with '\0', and if you were really unlucky that might be valuable data and psql would crash. Agostino reported this to pgsql-security, but after discussion we concluded that it isn't worth treating as a security bug; if you can control the input to psql you can do far more interesting things than just maybe-crash it. Nonetheless, it is a bug, so back-patch to all supported versions.
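The guard can be sketched as follows. This Python model emulates a C string in a bytearray, where the logical length is the offset of the first NUL byte; the function name is hypothetical and the real code operates on psql's own buffer type.

```python
def strip_trailing_newline(buf: bytearray) -> bool:
    """Replace a trailing newline of the C-style string in buf with NUL.

    Returns True if a newline was found and removed. The length check
    keeps the code from inspecting the byte before the buffer when the
    data begins with a NUL.
    """
    n = buf.find(b"\0")  # emulate strlen()
    if n < 0:
        n = len(buf)
    if n > 0 and buf[n - 1] == ord("\n"):
        buf[n - 1] = 0
        return True
    return False
```

When the first byte is NUL, the logical length is zero and the function returns without touching anything, which is the case the original code did not anticipate.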
Fix assorted fallout from IS [NOT] NULL patch.
commit : 06971438762cf020ff9452adc11bf0f57d783209 author : Tom Lane <email@example.com> date : Thu, 28 Jul 2016 16:09:15 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Thu, 28 Jul 2016 16:09:15 -0400
Commits 4452000f3 et al established semantics for NullTest.argisrow that are a bit different from its initial conception: rather than being merely a cache of whether we've determined the input to have composite type, the flag now has the further meaning that we should apply field-by-field testing as per the standard's definition of IS [NOT] NULL. If argisrow is false and yet the input has composite type, the construct instead has the semantics of IS [NOT] DISTINCT FROM NULL. Update the comments in primnodes.h to clarify this, and fix ruleutils.c and deparse.c to print such cases correctly. In the case of ruleutils.c, this merely results in cosmetic changes in EXPLAIN output, since the case can't currently arise in stored rules. However, it represents a live bug for deparse.c, which would formerly have sent a remote query that had semantics different from the local behavior. (From the user's standpoint, this means that testing a remote nested-composite column for null-ness could have had unexpected recursive behavior much like that fixed in 4452000f3.) In a related but somewhat independent fix, make plancat.c set argisrow to false in all NullTest expressions constructed to represent "attnotnull" constructs. Since attnotnull is actually enforced as a simple null-value check, this is a more accurate representation of the semantics; we were previously overpromising what it meant for composite columns, which might possibly lead to incorrect planner optimizations. (It seems that what the SQL spec expects a NOT NULL constraint to mean is an IS NOT NULL test, so arguably we are violating the spec and should fix attnotnull to do the other thing. If we ever do, this part should get reverted.) Back-patch, same as the previous commit. Discussion: <email@example.com>
Improve documentation about CREATE TABLE ... LIKE.
commit : 52205629acbcff5bc25b46a5bec95f50f45d5cc4 author : Tom Lane <firstname.lastname@example.org> date : Thu, 28 Jul 2016 13:26:59 -0400 committer: Tom Lane <email@example.com> date : Thu, 28 Jul 2016 13:26:59 -0400
The docs failed to explain that LIKE INCLUDING INDEXES would not preserve the names of indexes and associated constraints. Also, it wasn't mentioned that EXCLUDE constraints would be copied by this option. The latter oversight seems enough of a documentation bug to justify back-patching. In passing, do some minor copy-editing in the same area, and add an entry for LIKE under "Compatibility", since it's not exactly a faithful implementation of the standard's feature. Discussion: <20160728151154.AABE64016B@smtp.hushmail.com>
Register atexit hook only once in pg_upgrade.
commit : 1be038795694565430126b9edcb64149d718775b author : Tom Lane <firstname.lastname@example.org> date : Thu, 28 Jul 2016 11:39:11 -0400 committer: Tom Lane <email@example.com> date : Thu, 28 Jul 2016 11:39:11 -0400
start_postmaster() registered stop_postmaster_atexit as an atexit(3) callback each time through, although the obvious intention was to do so only once per program run. The extra registrations were harmless, so long as we didn't exceed ATEXIT_MAX, but still it's a bug. Artur Zakirov, with bikeshedding by Kyotaro Horiguchi and me Discussion: <firstname.lastname@example.org>
Fix incorrect description of udt_privileges view in documentation.
commit : b54ba3bc5be5437b15c5cb657a097a2b3cb91d6b author : Fujii Masao <email@example.com> date : Thu, 28 Jul 2016 22:34:42 +0900 committer: Fujii Masao <firstname.lastname@example.org> date : Thu, 28 Jul 2016 22:34:42 +0900
The description of udt_privileges view contained an incorrect copy-pasted word. Back-patch to 9.2 where udt_privileges view was added. Author: Alexander Law
Fix constant-folding of ROW(...) IS [NOT] NULL with composite fields.
commit : c235d510ead48a33a4e9f4976d048424bfe33298 author : Tom Lane <email@example.com> date : Tue, 26 Jul 2016 15:25:02 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Tue, 26 Jul 2016 15:25:02 -0400
The SQL standard appears to specify that IS [NOT] NULL's tests of field nullness are non-recursive, ie, we shouldn't consider that a composite field with value ROW(NULL,NULL) is null for this purpose. ExecEvalNullTest got this right, but eval_const_expressions did not, leading to weird inconsistencies depending on whether the expression was such that the planner could apply constant folding. Also, adjust the docs to mention that IS [NOT] DISTINCT FROM NULL can be used as a substitute test if a simple null check is wanted for a rowtype argument. That motivated reordering things so that IS [NOT] DISTINCT FROM is described before IS [NOT] NULL. In HEAD, I went a bit further and added a table showing all the comparison-related predicates. Per bug #14235. Back-patch to all supported branches, since it's certainly undesirable that constant-folding should change the semantics. Report and patch by Andrew Gierth; assorted wordsmithing and revised regression test cases by me. Report: <email@example.com>
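The non-recursive semantics can be sketched as follows. This is a Python model in which a row is a tuple, SQL NULL is None, and a nested tuple stands for a composite-typed field; the function names are hypothetical.

```python
def row_is_null(fields: tuple) -> bool:
    """ROW(...) IS NULL: true when every field is itself NULL.

    The test is not recursive: a composite field such as (None, None)
    counts as a non-null field.
    """
    return all(f is None for f in fields)


def row_is_not_null(fields: tuple) -> bool:
    """ROW(...) IS NOT NULL: true when no field is NULL (non-recursive)."""
    return all(f is not None for f in fields)
```

Note that the two tests are not complements of each other: for a row such as (1, None) both return False.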
Make the AIX case of Makefile.shlib safe for parallel make.
commit : 98b7a3cf2b62f06230e0b056dc95c71b815f2814 author : Noah Misch <firstname.lastname@example.org> date : Sat, 23 Jul 2016 20:30:03 -0400 committer: Noah Misch <email@example.com> date : Sat, 23 Jul 2016 20:30:03 -0400
Use our typical approach, from src/backend/parser. Back-patch to 9.1 (all supported versions).
Make contrib regression tests safe for Danish locale.
commit : e15e7886e685c6284d69bbe7f3ce61f16ed62b2d author : Tom Lane <firstname.lastname@example.org> date : Thu, 21 Jul 2016 16:52:36 -0400 committer: Tom Lane <email@example.com> date : Thu, 21 Jul 2016 16:52:36 -0400
In btree_gin and citext, avoid some not-particularly-interesting dependencies on the sorting of 'aa'. In tsearch2, use COLLATE "C" to remove an uninteresting dependency on locale sort order (and thereby allow removal of a variant expected-file). Also, in citext, avoid assuming that lower('I') = 'i'. This isn't relevant to Danish but it does fail in Turkish.
Make pltcl regression tests safe for Danish locale.
commit : 0060638c877cf45803c16cc260cf72c9a94c82a2 author : Tom Lane <firstname.lastname@example.org> date : Thu, 21 Jul 2016 14:24:07 -0400 committer: Tom Lane <email@example.com> date : Thu, 21 Jul 2016 14:24:07 -0400
Another peculiarity of Danish locale is that it has an unusual idea of how to sort upper vs. lower case. One of the pltcl test cases has an issue with that. Now that COLLATE works in all supported branches, we can just change the test to be locale-independent, and get rid of the variant expected file that used to support non-C locales.
Remove very-obsolete estimates of shmem usage from postgresql.conf.sample.
commit : 5e7fc462904c7deccead7c8d6ba0147998218d9d author : Tom Lane <firstname.lastname@example.org> date : Tue, 19 Jul 2016 18:41:30 -0400 committer: Tom Lane <email@example.com> date : Tue, 19 Jul 2016 18:41:30 -0400
runtime.sgml used to contain a table of estimated shared memory consumption rates for max_connections and some other GUCs. Commit 390bfc643 removed that on the well-founded grounds that (a) we weren't maintaining the entries well and (b) it no longer mattered so much once we got out from under SysV shmem limits. But it missed that there were even-more-obsolete versions of some of those numbers in comments in postgresql.conf.sample. Remove those too. Back-patch to 9.3 where the aforesaid commit went in.
Fix MSVC build for changes in zic.
commit : f102bd8684605f53c41316d7f3b1e8a8f67fd640 author : Tom Lane <firstname.lastname@example.org> date : Tue, 19 Jul 2016 17:53:31 -0400 committer: Tom Lane <email@example.com> date : Tue, 19 Jul 2016 17:53:31 -0400
Ooops, I missed back-patching commit f5f15ea6a along with the other stuff.
Sync back-branch copies of the timezone code with IANA release tzcode2016c.
commit : 3928132eadd28f31aad6eb16607ccdcd0aa12c1b author : Tom Lane <firstname.lastname@example.org> date : Tue, 19 Jul 2016 15:59:36 -0400 committer: Tom Lane <email@example.com> date : Tue, 19 Jul 2016 15:59:36 -0400
Back-patch commit 1c1a7cbd6a1600c9, along with subsequent portability fixes, into all active branches. Also, back-patch commits 696027727 and 596857043 (addition of zic -P option) into 9.1 and 9.2, just to reduce differences between the branches. src/timezone/ is now largely identical in all active branches, except that in 9.1, pgtz.c retains the initial-timezone-selection code that was moved over to initdb in 9.2. Ordinarily we wouldn't risk this much code churn in back branches, but it seems necessary in this case, because among the changes are two feature additions in the "zic" zone data file compiler (a larger limit on the number of allowed DST transitions, and addition of a "%z" escape in zone abbreviations). IANA have not yet started to use those features in their tzdata files, but presumably they will before too long. If we don't update then we'll be unable to adopt new timezone data. Also, installations built with --with-system-tzdata (which includes most distro-supplied builds, I believe) might fail even if we don't update our copies of the data files. There are assorted bug fixes too, mostly affecting obscure timezones or post-2037 dates. Discussion: <firstname.lastname@example.org>
Use correct symbol for minimum int64 value
commit : 805f2bb53f954a859fa0e7f27e13ba87c9e03595 author : Peter Eisentraut <email@example.com> date : Sun, 17 Jul 2016 09:37:33 -0400 committer: Peter Eisentraut <firstname.lastname@example.org> date : Sun, 17 Jul 2016 09:37:33 -0400
The old code used SEQ_MINVALUE to get the smallest int64 value. This was done as a convenience to avoid having to deal with INT64_IS_BUSTED, but that is obsolete now. Also, it is incorrect because the smallest int64 value is actually SEQ_MINVALUE-1. Fix by writing out the constant the long way, as it is done elsewhere in the code.
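The off-by-one can be checked with a few lines of arithmetic. This Python sketch assumes, consistently with the statement above, that SEQ_MINVALUE is defined as the negation of the largest int64 value; the constant definitions here are a model, not copies of the C macros.

```python
INT64_MAX = 0x7FFFFFFFFFFFFFFF

# Assumption for this sketch: sequence limits are symmetric around zero.
SEQ_MAXVALUE = INT64_MAX
SEQ_MINVALUE = -SEQ_MAXVALUE

# The smallest int64, written out "the long way".
INT64_MIN = -0x7FFFFFFFFFFFFFFF - 1
```

Here INT64_MIN equals -(2**63), which is one less than SEQ_MINVALUE.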
Fix crash in close_ps() for NaN input coordinates.
commit : 16e28fcec2fe235abddd501c17536c3b15a4dcec author : Tom Lane <email@example.com> date : Sat, 16 Jul 2016 14:42:37 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Sat, 16 Jul 2016 14:42:37 -0400
The Assert() here seems unreasonably optimistic. Andreas Seltenreich found that it could fail with NaNs in the input geometries, and it seems likely to me that it might fail in corner cases due to roundoff error, even for ordinary input values. As a band-aid, make the function return SQL NULL instead of crashing. Report: <email@example.com>
Fix torn-page, unlogged xid and further risks from heap_update().
commit : 2e51ae1f62c8273664ec489349b155f889658ec4 author : Andres Freund <firstname.lastname@example.org> date : Fri, 15 Jul 2016 17:49:48 -0700 committer: Andres Freund <email@example.com> date : Fri, 15 Jul 2016 17:49:48 -0700
When heap_update needs to look for a page for the new tuple version, because the current one doesn't have sufficient free space, or when columns have to be processed by the tuple toaster, it has to release the lock on the old page during that. Otherwise there'd be lock ordering and lock nesting issues. To prevent concurrent sessions from trying to update / delete / lock the tuple while the page's content lock is released, the tuple's xmax is set to the current session's xid. That unfortunately was done without any WAL logging, thereby violating the rule that no XIDs may appear on disk without a corresponding WAL record. If the database were to crash / fail over when the page level lock is released, and some activity led to the page being written out to disk, the xid could end up being reused; potentially leading to the row becoming invisible. There might be additional risks by not having t_ctid point at the tuple itself, without having set the appropriate lock infomask fields. To fix, compute the appropriate xmax/infomask combination for locking the tuple, and perform WAL logging using the existing XLOG_HEAP_LOCK record. That allows the fix to be backpatched. This issue has existed for a long time. There appear to have been partial attempts at preventing dangers, but these were never fully implemented, and were removed a long time ago, in 11919160 (cf. HEAP_XMAX_UNLOGGED). In master / 9.6, there's an additional issue, namely that the visibilitymap's freeze bit isn't reset at that point yet. Since that's a new issue, introduced only in a892234f830, that'll be fixed in a separate commit. Author: Masahiko Sawada and Andres Freund Reported-By: Different aspects by Thomas Munro, Noah Misch, and others Discussion: CAEepm=3fWAbWryVW9swHyLTY4sXVf0xbLvXqOwUoDiNCx9mBjQ@mail.gmail.com Backpatch: 9.1/all supported versions
Make HEAP_LOCK/HEAP2_LOCK_UPDATED replay reset HEAP_XMAX_INVALID.
commit : 46acbeb2f09c862bae5056d34473223764ff99b0 author : Andres Freund <firstname.lastname@example.org> date : Fri, 15 Jul 2016 14:37:06 -0700 committer: Andres Freund <email@example.com> date : Fri, 15 Jul 2016 14:37:06 -0700
0ac5ad5 started to compress infomask bits in WAL records. Unfortunately the replay routines for XLOG_HEAP_LOCK/XLOG_HEAP2_LOCK_UPDATED forgot to reset the HEAP_XMAX_INVALID (and some other) hint bits. Luckily that's not problematic in the majority of cases, because after a crash/on a standby row locks aren't meaningful. Unfortunately that does not hold true in the presence of prepared transactions. This means that after a crash, or after promotion, row level locks held by a prepared, but not yet committed, transaction might not be enforced. Discussion: firstname.lastname@example.org Backpatch: 9.3, the oldest branch on which 0ac5ad5 is present.
Avoid serializability errors when locking a tuple with a committed update
commit : 6c243f90ab6904f27fa990f1f3261e1d09a11853 author : Alvaro Herrera <email@example.com> date : Fri, 15 Jul 2016 14:17:20 -0400 committer: Alvaro Herrera <firstname.lastname@example.org> date : Fri, 15 Jul 2016 14:17:20 -0400
When key-share locking a tuple that has been not-key-updated, and the update is a committed transaction, in some cases we raised serializability errors: ERROR: could not serialize access due to concurrent update Because the key-share doesn't conflict with the update, the error is unnecessary and inconsistent with the case that the update hasn't committed yet. This causes problems for some usage patterns, even if it can be claimed that it's sufficient to retry the aborted transaction: given a steady stream of updating transactions and a long locking transaction, the long transaction can be starved indefinitely despite multiple retries. To fix, we recognize that HeapTupleSatisfiesUpdate can return HeapTupleUpdated when an updating transaction has committed, and that we need to deal with that case exactly as if it were a non-committed update: verify whether the two operations conflict, and if not, carry on normally. If they do conflict, however, there is a difference: in the HeapTupleBeingUpdated case we can just sleep until the concurrent transaction is gone, while in the HeapTupleUpdated case this is not possible and we must raise an error instead. Per trouble report from Olivier Dony. In addition to a couple of test cases that verify the changed behavior, I added a test case to verify the behavior that remains unchanged, namely that errors are raised when an update that modifies the key is used. That must still generate serializability errors. One pre-existing test case changes behavior; per discussion, the new behavior is actually the desired one. Discussion: https://www.postgresql.org/message-id/560AA479.email@example.com https://firstname.lastname@example.org Backpatch to 9.3, where the problem appeared.
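The decision rule described in this commit message can be summarized in a short sketch. The names `TupleStatus` and `lock_action` are hypothetical and only model the prose above; this is not the heapam.c code.

```python
from enum import Enum


class TupleStatus(Enum):
    """Hypothetical stand-ins for the two HeapTupleSatisfiesUpdate results discussed."""
    BEING_UPDATED = "HeapTupleBeingUpdated"  # the updating transaction is still running
    UPDATED = "HeapTupleUpdated"             # the updating transaction has committed


def lock_action(status: TupleStatus, conflicts: bool) -> str:
    """Return what the locker does: 'proceed', 'wait', or 'error'."""
    if not conflicts:
        # e.g. a key-share lock against an update that did not touch the key
        return "proceed"
    if status is TupleStatus.BEING_UPDATED:
        # the updater may still go away, so sleeping on it is possible
        return "wait"
    # the updater already committed; waiting cannot help
    return "error"
```

Under this model, the change in the commit is the `proceed` outcome for a committed, non-conflicting update, which previously ended in a serialization error.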
doc: Fix typos
commit : 1466ed3e5ece94d811dac6538e961117e99f4436 author : Peter Eisentraut <email@example.com> date : Thu, 14 Jul 2016 22:28:20 -0400 committer: Peter Eisentraut <firstname.lastname@example.org> date : Thu, 14 Jul 2016 22:28:20 -0400
From: Alexander Law <email@example.com>
Fix GiST index build for NaN values in geometric types.
commit : 57dba87a72278468bc4bf5d025090a713a96fb1a author : Tom Lane <firstname.lastname@example.org> date : Thu, 14 Jul 2016 18:46:00 -0400 committer: Tom Lane <email@example.com> date : Thu, 14 Jul 2016 18:46:00 -0400
GiST index build could go into an infinite loop when presented with boxes (or points, circles or polygons) containing NaN component values. This happened essentially because the code assumed that x == x is true for any "double" value x; but it's not true for NaNs. The looping behavior was not the only problem though: we also attempted to sort the items using simple double comparisons. Since NaNs violate the trichotomy law, qsort could (in principle at least) get arbitrarily confused and mess up the sorting of ordinary values as well as NaNs. And we based splitting choices on box size calculations that could produce NaNs, again resulting in undesirable behavior. To fix, replace all comparisons of doubles in this logic with float8_cmp_internal, which is NaN-aware and is careful to sort NaNs consistently, higher than any non-NaN. Also rearrange the box size calculation to not produce NaNs; instead it should produce an infinity for a box with NaN on one side and not-NaN on the other. I don't by any means claim that this solves all problems with NaNs in geometric values, but it should at least make GiST index insertion work reliably with such data. It's likely that the index search side of things still needs some work, and probably regular geometric operations too. But with this patch we're laying down a convention for how such cases ought to behave. Per bug #14238 from Guang-Dih Lei. Back-patch to 9.2; the code used before commit 7f3bd86843e5aad8 is quite different and doesn't lock up on my simple test case, nor on the submitter's dataset. Report: <firstname.lastname@example.org> Discussion: <email@example.com>
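The NaN behavior behind this bug is easy to reproduce outside the server. The following is a minimal Python sketch, not the PostgreSQL code: `nan_aware_cmp` is a hypothetical helper that imitates the ordering the commit attributes to float8_cmp_internal (NaNs compare equal to each other and higher than any non-NaN).

```python
import math
from functools import cmp_to_key


def nan_aware_cmp(a: float, b: float) -> int:
    """Total order on floats: NaN equals NaN and sorts above everything else."""
    a_nan, b_nan = math.isnan(a), math.isnan(b)
    if a_nan and b_nan:
        return 0
    if a_nan:
        return 1
    if b_nan:
        return -1
    return (a > b) - (a < b)


nan = float("nan")

# Plain comparisons break trichotomy: none of <, ==, > holds for a NaN.
plain_results = (nan < 1.0, nan == nan, nan > 1.0)

# With the NaN-aware comparator the sort is well defined.
values = [3.0, nan, 1.0, float("inf"), 2.0]
ordered = sorted(values, key=cmp_to_key(nan_aware_cmp))
```

Because `nan == nan` is false, any loop that terminates on an `x == x` style test can spin forever, and a sort driven by the plain operators has no consistent answer for where NaN belongs.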
doc: Update URL for PL/PHP
commit : 7d70bf97b2b3d946c6f2eec7896a2f794921cb72 author : Peter Eisentraut <firstname.lastname@example.org> date : Mon, 11 Jul 2016 12:13:29 -0400 committer: Peter Eisentraut <email@example.com> date : Mon, 11 Jul 2016 12:13:29 -0400
Fix TAP tests and MSVC scripts for pathnames with spaces.
commit : 57e9ea2ddeb09e2e41a8b315e4d6693e8dd749a1 author : Tom Lane <firstname.lastname@example.org> date : Mon, 11 Jul 2016 11:24:04 -0400 committer: Tom Lane <email@example.com> date : Mon, 11 Jul 2016 11:24:04 -0400
Back-patch relevant parts of commit 30b2731bd into 9.1-9.3. Michael Paquier, Kyotaro Horiguchi Discussion: <firstname.lastname@example.org>
Add missing newline in error message
commit : 280a558ed6031a790bb9b6748965f4f6bce0930f author : Magnus Hagander <email@example.com> date : Mon, 11 Jul 2016 13:53:17 +0200 committer: Magnus Hagander <firstname.lastname@example.org> date : Mon, 11 Jul 2016 13:53:17 +0200
doc: mention dependency on collation libraries
commit : 539b3dcb2622b135e1791151f4676c49823b6ee7 author : Bruce Momjian <email@example.com> date : Sat, 2 Jul 2016 11:22:35 -0400 committer: Bruce Momjian <firstname.lastname@example.org> date : Sat, 2 Jul 2016 11:22:35 -0400
Document that index storage is dependent on the operating system's collation library ordering, and any change in that ordering can create invalid indexes. Discussion: 20160617154311.GB19359@momjian.us Backpatch-through: 9.1
Be more paranoid in ruleutils.c's get_variable().
commit : b0f20c2ea5f429a4a7bea01e2f721364bf0bd043 author : Tom Lane <email@example.com> date : Fri, 1 Jul 2016 11:40:22 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Fri, 1 Jul 2016 11:40:22 -0400
We were merely Assert'ing that the Var matched the RTE it's supposedly from. But if the user passes incorrect information to pg_get_expr(), the RTE might in fact not match; this led either to Assert failures or core dumps, as reported by Chris Hanks in bug #14220. To fix, just convert the Asserts to test-and-elog. Adjust an existing test-and-elog elsewhere in the same function to be consistent in wording. (If we really felt these were user-facing errors, we might promote them to ereport's; but I can't convince myself that they're worth translating.) Back-patch to 9.3; the problematic code doesn't exist before that, and a quick check says that 9.2 doesn't crash on such cases. Michael Paquier and Thomas Munro Report: <email@example.com>
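As a loose analogy in Python (not the ruleutils.c code), the difference between an assertion and a test-and-report check looks like this. `column_name` and its arguments are hypothetical names chosen for illustration; the point is only that a check which guards against user-supplied input should not be compiled or optimized away.

```python
def column_name(columns: list, attnum: int) -> str:
    """Look up a 1-based attribute number, rejecting mismatched input explicitly."""
    # An `assert` here would vanish under `python -O`, much as Assert() vanishes
    # in non-assert builds, so the check is written as ordinary code instead.
    if not 1 <= attnum <= len(columns):
        raise LookupError(
            f"invalid attnum {attnum} for relation with {len(columns)} columns"
        )
    return columns[attnum - 1]
```

A caller passing an attribute number that does not belong to the relation then gets an error report rather than an out-of-range access.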
Fix CREATE MATVIEW/CREATE TABLE AS ... WITH NO DATA to not plan the query.
commit : 17bfef80ee24f2204e66eb28277fddecdad64318 author : Tom Lane <firstname.lastname@example.org> date : Mon, 27 Jun 2016 15:57:21 -0400 committer: Tom Lane <email@example.com> date : Mon, 27 Jun 2016 15:57:21 -0400
Previously, these commands always planned the given query and went through executor startup before deciding not to actually run the query if WITH NO DATA is specified. This behavior is problematic for pg_dump because it may cause errors to be raised that we would rather not see before a REFRESH MATERIALIZED VIEW command is issued. See for example bug #13907 from Marian Krucina. This change is not sufficient to fix that particular bug, because we also need to tweak pg_dump to issue the REFRESH later, but it's a necessary step on the way. A user-visible side effect of doing things this way is that the returned command tag for WITH NO DATA cases will now be "CREATE MATERIALIZED VIEW" or "CREATE TABLE AS", not "SELECT 0". We could preserve the old behavior but it would take more code, and arguably that was just an implementation artifact not intended behavior anyhow. In 9.5 and HEAD, also get rid of the static variable CreateAsReladdr, which was trouble waiting to happen; there is not any prohibition on nested CREATE commands. Back-patch to 9.3 where CREATE MATERIALIZED VIEW was introduced. Michael Paquier and Tom Lane Report: <firstname.lastname@example.org>
Fix handling of multixacts predating pg_upgrade
commit : 28f294afdb522110d78f55905d031fbf1341911a author : Alvaro Herrera <email@example.com> date : Fri, 24 Jun 2016 18:29:28 -0400 committer: Alvaro Herrera <firstname.lastname@example.org> date : Fri, 24 Jun 2016 18:29:28 -0400
After pg_upgrade, it is possible that some tuples' Xmax have multixacts corresponding to the old installation; such multixacts cannot have running members anymore. In many code sites we already know not to read them and clobber them silently, but at least when VACUUM tries to freeze a multixact or determine whether one needs freezing, there's an attempt to resolve it to its member transactions by calling GetMultiXactIdMembers, and if the multixact value is "in the future" with regard to the current valid multixact range, an error like this is raised: ERROR: MultiXactId 123 has not been created yet -- apparent wraparound and vacuuming fails. Per discussion with Andrew Gierth, it is completely bogus to try to resolve multixacts coming from before a pg_upgrade, regardless of where they stand with regard to the current valid multixact range. It's possible to get from under this problem by doing SELECT FOR UPDATE of the problem tuples, but if tables are large, this is slow and tedious, so a more thorough solution is desirable. To fix, we realize that multixacts in xmax created in 9.2 and previous have a specific bit pattern that is never used in 9.3 and later (we already knew this, per comments and infomask tests sprinkled in various places, but we weren't leveraging this knowledge appropriately). Whenever the infomask of the tuple matches that bit pattern, we just ignore the multixact completely as if Xmax wasn't set; or, in the case of tuple freezing, we act as if an unwanted value is set and clobber it without decoding. This guarantees that no errors will be raised, and that the values will be progressively removed until all tables are clean. Most callers of GetMultiXactIdMembers are patched to recognize directly that the value is a removable "empty" multixact and avoid calling GetMultiXactIdMembers altogether.
To avoid changing the signature of GetMultiXactIdMembers() in back branches, we keep the "allow_old" boolean flag but rename it to "from_pgupgrade"; if the flag is true, we always return an empty set instead of looking up the multixact. (I suppose we could remove the argument in the master branch, but I chose not to do so in this commit). This was broken all along, but the error-facing message appeared first because of commit 8e9a16ab8f7f and was partially fixed in a25c2b7c4db3. This fix, backpatched all the way back to 9.3, goes approximately in the same direction as a25c2b7c4db3 but should cover all cases. Bug analysis by Andrew Gierth and Álvaro Herrera. A number of public reports match this bug: https://www.postgresql.org/message-id/20140330040029.GY4582@tamriel.snowman.net https://www.postgresql.org/message-id/538F3D70.email@example.com https://www.postgresql.org/message-id/556439CF.firstname.lastname@example.org https://www.postgresql.org/message-id/SG2PR06MB0760098A111C88E31BD4D96FB3540@SG2PR06MB0760.apcprd06.prod.outlook.com https://email@example.com
Make "postgres -C guc" print "" not "(null)" for null-valued GUCs.
commit : dafdcbb6c116bc72a1915146af6c7d96868549b4 author : Tom Lane <firstname.lastname@example.org> date : Wed, 22 Jun 2016 11:55:18 -0400 committer: Tom Lane <email@example.com> date : Wed, 22 Jun 2016 11:55:18 -0400
Commit 0b0baf262 et al made this case print "(null)" on the grounds that that's what happened on platforms that didn't crash. But neither behavior was actually intentional. What we should print is just an empty string, for compatibility with the behavior of SHOW and other ways of examining string GUCs. Those code paths don't distinguish NULL from empty strings, so we should not here either. Per gripe from Alain Radix. Like the previous patch, back-patch to 9.2 where -C option was introduced. Discussion: <CA+YdpwxPUADrmxSD7+Td=uOshMB1KkDN7G7cf+FGmNjjxMhjbw@mail.gmail.com>
Document that dependency tracking doesn't consider function bodies.
commit : 8df3c7ba74e6800c01a0d2fcd09adf23c298e790 author : Tom Lane <firstname.lastname@example.org> date : Tue, 21 Jun 2016 20:07:58 -0400 committer: Tom Lane <email@example.com> date : Tue, 21 Jun 2016 20:07:58 -0400
If there's anyplace in our SGML docs that explains this behavior, I can't find it right at the moment. Add an explanation in "Dependency Tracking" which seems like the authoritative place for such a discussion. Per gripe from Michelle Schwan. While at it, update this section's example of a dependency-related error message: they last looked like that in 8.3. And remove the explanation of dependency updates from pre-7.3 installations, which is probably no longer worth anybody's brain cells to read. The bogus error message example seems like an actual documentation bug, so back-patch to all supported branches. Discussion: <firstname.lastname@example.org>
Docs: improve description of psql's %R prompt escape sequence.
commit : c7aea92c1e287e5900ce6001fc25bc5f46d4d55b author : Tom Lane <email@example.com> date : Sun, 19 Jun 2016 13:11:40 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Sun, 19 Jun 2016 13:11:40 -0400
Dilian Palauzov pointed out in bug #14201 that the docs failed to mention the possibility of %R producing '(' due to an unmatched parenthesis. He proposed just adding that in the same style as the other options were listed; but it seemed to me that the sentence was already nearly unintelligible, so I rewrote it a bit more extensively. Report: <email@example.com>
Finish up XLOG_HINT renaming
commit : d300b8cf7bad3e57b41e7049406b89868d18f7d1 author : Alvaro Herrera <firstname.lastname@example.org> date : Fri, 17 Jun 2016 18:05:55 -0400 committer: Alvaro Herrera <email@example.com> date : Fri, 17 Jun 2016 18:05:55 -0400
Commit b8fd1a09f3 renamed XLOG_HINT to XLOG_FPI, but neglected two places. Backpatch to 9.3, like that commit.
Fix validation of overly-long IPv6 addresses.
commit : 519445ba26848b7c88572b2b614b104dccb3153a author : Tom Lane <firstname.lastname@example.org> date : Thu, 16 Jun 2016 17:16:32 -0400 committer: Tom Lane <email@example.com> date : Thu, 16 Jun 2016 17:16:32 -0400
The inet/cidr types sometimes failed to reject IPv6 inputs with too many colon-separated fields, instead translating them to '::/0'. This is the result of a thinko in the original ISC code that seems to be as yet unreported elsewhere. Per bug #14198 from Stefan Kaltenbrunner. Report: <firstname.lastname@example.org>
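For comparison, this is the behavior one would expect from a strict parser, shown with Python's standard `ipaddress` module rather than PostgreSQL's inet code. `is_valid_ipv6` is a hypothetical wrapper added for illustration.

```python
import ipaddress


def is_valid_ipv6(text: str) -> bool:
    """Return True only if `text` parses as an IPv6 address."""
    try:
        ipaddress.IPv6Address(text)
    except ValueError:  # AddressValueError is a subclass of ValueError
        return False
    return True


eight_fields = is_valid_ipv6("1:2:3:4:5:6:7:8")    # a full, legal address
nine_fields = is_valid_ipv6("1:2:3:4:5:6:7:8:9")   # one field too many
```

An input with nine colon-separated fields is rejected outright instead of being mapped to some other address.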
Avoid crash in "postgres -C guc" for a GUC with a null string value.
commit : 29987b2e1f8cfb9921ffd33c0d0e1a5f53a090b4 author : Tom Lane <email@example.com> date : Thu, 16 Jun 2016 12:17:03 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Thu, 16 Jun 2016 12:17:03 -0400
Emit "(null)" instead, which was the behavior all along on platforms that don't crash, eg OS X. Per report from Jehan-Guillaume de Rorthais. Back-patch to 9.2 where -C option was introduced. Michael Paquier Report: <20160615204036.2d35d86a@firost>
Widen buffer for headers in psql's \watch command.
commit : 832c3f9328949b31f9548a60c85a73f79fbe2d3f author : Tom Lane <email@example.com> date : Wed, 15 Jun 2016 19:35:39 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Wed, 15 Jun 2016 19:35:39 -0400
This is to make sure there's enough room for translated versions of the message. HEAD's already addressed this issue, but back-patch a simple increase in the array size. Discussion: <20160612145532.GA22965@postgresql.kr>
Fix multiple minor infelicities in aclchk.c error reports.
commit : f475fe3538858650aaf55fcc20c3e71a74e0fd56 author : Tom Lane <email@example.com> date : Mon, 13 Jun 2016 13:53:10 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Mon, 13 Jun 2016 13:53:10 -0400
pg_type_aclmask reported the wrong type's OID when complaining that it could not find a type's typelem. It also failed to provide a suitable errcode when the initially given OID doesn't exist (which is a user-facing error, since that OID can be user-specified). pg_foreign_data_wrapper_aclmask and pg_foreign_server_aclmask likewise lacked errcode specifications. Trivial cosmetic adjustments too. The wrong-type-OID problem was reported by Petru-Florin Mihancea in bug #14186; the other issues noted by me while reading the code. These errors all seem to be aboriginal in the respective routines, so back-patch as necessary. Report: <email@example.com>
Clarify documentation of ceil/ceiling/floor functions.
commit : bca0f60de4b3fd34b69e83c543ec7d9012e89792 author : Tom Lane <firstname.lastname@example.org> date : Thu, 9 Jun 2016 11:58:00 -0400 committer: Tom Lane <email@example.com> date : Thu, 9 Jun 2016 11:58:00 -0400
Document these as "nearest integer >= argument" and "nearest integer <= argument", which will hopefully be less confusing than the old formulation. New wording is from Matlab via Dean Rasheed. I changed the pg_description entries as well as the SGML docs. In the back branches, this will only affect installations initdb'd in the future, but it should be harmless otherwise. Discussion: <CAEZATCW3yzJo-NMSiQs5jXNFbTsCEftZS-Og8=FvFdiU+kYuSA@mail.gmail.com>
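The new wording is easiest to check against a negative input. A quick illustration using Python's `math` module, which follows the same definitions; the sample values are arbitrary:

```python
import math

# ceil: nearest integer greater than or equal to the argument
# floor: nearest integer less than or equal to the argument
samples = [-42.8, 42.2, 7.0]
results = [(x, math.ceil(x), math.floor(x)) for x in samples]

for x, up, down in results:
    print(f"x={x}  ceil={up}  floor={down}")
```

For -42.8 the ceiling is -42 and the floor is -43, which is the case the older phrasing tended to obscure.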
nls-global.mk: search build dir for source files, too
commit : b5dd25b77aa0acd9faa45abd0915c62a767316e7 author : Alvaro Herrera <firstname.lastname@example.org> date : Tue, 7 Jun 2016 18:55:18 -0400 committer: Alvaro Herrera <email@example.com> date : Tue, 7 Jun 2016 18:55:18 -0400
In VPATH builds, the build directory was not being searched for files in GETTEXT_FILES, leading to failure to construct the .pot files. This has bitten me all along, but never hard enough to get it fixed; I suppose not a lot of people use VPATH and NLS-enabled builds, and those that do don't do "make update-po" often. This is a longstanding problem, so backpatch all the way back.
Don't reset changes_since_analyze after a selective-columns ANALYZE.
commit : 5f3e0e84b274dbbe605cd2d4c2717c14e2cfe787 author : Tom Lane <firstname.lastname@example.org> date : Mon, 6 Jun 2016 17:44:17 -0400 committer: Tom Lane <email@example.com> date : Mon, 6 Jun 2016 17:44:17 -0400
If we ANALYZE only selected columns of a table, we should not postpone auto-analyze because of that; other columns may well still need stats updates. As committed, the counter is left alone if a column list is given, whether or not it includes all analyzable columns of the table. Per complaint from Tomasz Ostrowski. It's been like this a long time, so back-patch to all supported branches. Report: <firstname.lastname@example.org>
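The rule as committed fits in a few lines. This is a sketch with hypothetical names (`TableStats`, `report_analyze`), not the stats collector code:

```python
class TableStats:
    """Hypothetical per-table counters kept for auto-analyze decisions."""

    def __init__(self) -> None:
        self.changes_since_analyze = 0


def report_analyze(stats: TableStats, column_list=None) -> None:
    """Record a finished ANALYZE; reset the counter only for a whole-table run."""
    # Any explicit column list leaves the counter alone, even if the list
    # happens to name every analyzable column of the table.
    if column_list is None:
        stats.changes_since_analyze = 0
```

So a run given a column list leaves the change counter as it was and auto-analyze still triggers on schedule, while an ANALYZE without a column list resets it as before.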
Suppress -Wunused-result warnings about write(), again.
commit : 4a21c6fd78052d394ba6ba31753d241c3c12a920 author : Tom Lane <email@example.com> date : Fri, 3 Jun 2016 11:29:20 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Fri, 3 Jun 2016 11:29:20 -0400
Adopt the same solution as in commit aa90e148ca70a235, but this time let's put the ugliness inside the write_stderr() macro, instead of expecting each call site to deal with it. Back-port that decision into psql/common.c where I got the macro from in the first place. Per gripe from Peter Eisentraut.
Redesign handling of SIGTERM/control-C in parallel pg_dump/pg_restore.
commit : 5c9724305eb578a14667ef89412077309cd04125 author : Tom Lane <email@example.com> date : Thu, 2 Jun 2016 13:27:53 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Thu, 2 Jun 2016 13:27:53 -0400
Formerly, Unix builds of pg_dump/pg_restore would trap SIGINT and similar signals and set a flag that was tested in various data-transfer loops. This was prone to errors of omission (cf commit 3c8aa6654); and even if the client-side response was prompt, we did nothing that would cause long-running SQL commands (e.g. CREATE INDEX) to terminate early. Also, the master process would effectively do nothing at all upon receipt of SIGINT; the only reason it seemed to work was that in typical scenarios the signal would also be delivered to the child processes. We should support termination when a signal is delivered only to the master process, though. Windows builds had no console interrupt handler, so they would just fall over immediately at control-C, again leaving long-running SQL commands to finish unmolested. To fix, remove the flag-checking approach altogether. Instead, allow the Unix signal handler to send a cancel request directly and then exit(1). In the master process, also have it forward the signal to the children. On Windows, add a console interrupt handler that behaves approximately the same. The main difference is that a single execution of the Windows handler can send all the cancel requests since all the info is available in one process, whereas on Unix each process sends a cancel only for its own database connection. In passing, fix an old problem that DisconnectDatabase tends to send a cancel request before exiting a parallel worker, even if nothing went wrong. This is at least a waste of cycles, and could lead to unexpected log messages, or maybe even data loss if it happened in pg_restore (though in the current code the problem seems to affect only pg_dump). The cause was that after a COPY step, pg_dump was leaving libpq in PGASYNC_BUSY state, causing PQtransactionStatus() to report PQTRANS_ACTIVE. 
That's normally harmless because the next PQexec() will silently clear the PGASYNC_BUSY state; but in a parallel worker we might exit without any additional SQL commands after a COPY step. So add an extra PQgetResult() call after a COPY to allow libpq to return to PGASYNC_IDLE state. This is a bug fix, IMO, so back-patch to 9.3 where parallel dump/restore were introduced. Thanks to Kyotaro Horiguchi for Windows testing and code suggestions. Original-Patch: <email@example.com> Discussion: <firstname.lastname@example.org>
Clean up some minor inefficiencies in parallel dump/restore.
commit : 0a1485f1c0e50796b3fec97229756035a6cf91c1 author : Tom Lane <email@example.com> date : Wed, 1 Jun 2016 16:14:21 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Wed, 1 Jun 2016 16:14:21 -0400
Parallel dump did a totally pointless query to find out the name of each table to be dumped, which it already knows. Parallel restore runs issued lots of redundant SET commands because _doSetFixedOutputState() was invoked once per TOC item rather than just once at connection start. While the extra queries are insignificant if you're dumping or restoring large tables, it still seems worth getting rid of them. Also, give the responsibility for selecting the right client_encoding for a parallel dump worker to setup_connection() where it naturally belongs, instead of having ad-hoc code for that in CloneArchive(). And fix some minor bugs like use of strdup() where pg_strdup() would be safer. Back-patch to 9.3, mostly to keep the branches in sync in an area that we're still finding bugs in. Discussion: <email@example.com>
Avoid useless closely-spaced writes of statistics files.
commit : a84cad2247f648a588ce7f2c883b06095a49e397 author : Tom Lane <firstname.lastname@example.org> date : Tue, 31 May 2016 15:54:47 -0400 committer: Tom Lane <email@example.com> date : Tue, 31 May 2016 15:54:47 -0400
The original intent in the stats collector was that we should not write out stats data oftener than every PGSTAT_STAT_INTERVAL msec. Backends will not make requests at all if they see the existing data is newer than that, and the stats collector is supposed to disregard requests having a cutoff_time older than its most recently written data, so that close-together requests don't result in multiple writes. But the latter part of that got broken in commit 187492b6c2e8cafc, so that if two backends concurrently decide the existing stats are too old, the collector would write the data twice. (In principle the collector's logic would still merge requests as long as the second one arrives before we've actually written data ... but since the message collection loop would write data immediately after processing a single inquiry message, that never happened in practice, and in any case the window in which it might work would be much shorter than PGSTAT_STAT_INTERVAL.) To fix, improve pgstat_recv_inquiry so that it checks whether the cutoff time is too old, and doesn't add a request to the queue if so. This means that we do not need DBWriteRequest.request_time, because the decision is taken before making a queue entry. And that means that we don't really need the DBWriteRequest data structure at all; an OID list of database OIDs will serve and allow removal of some rather verbose and crufty code. In passing, improve the comments in this area, which have been rather neglected. Also change backend_read_statsfile so that it's not silently relying on MyDatabaseId to have some particular value in the autovacuum launcher process. It accidentally worked as desired because MyDatabaseId is zero in that process; but that does not seem like a dependency we want, especially with no documentation about it. 
Although this patch is mine, it turns out I'd rediscovered a known bug, for which Tomas Vondra had already submitted a patch that's functionally equivalent to the non-cosmetic aspects of this patch. Thanks to Tomas for reviewing this version. Back-patch to 9.3 where the bug was introduced. Prior-Discussion: <firstname.lastname@example.org> Patch: <email@example.com>
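The request filtering described above can be modeled in a few lines. This is a simplified sketch: `StatsCollector`, its fields, a single global write timestamp, and the exact `<=` comparison are all assumptions made for illustration, not the pgstat.c implementation.

```python
class StatsCollector:
    """Toy model of the inquiry handling described in the commit message."""

    def __init__(self) -> None:
        self.last_write_time = 0.0   # timestamp of the most recently written data
        self.pending_db_oids = []    # plain list of database OIDs awaiting a write
        self.writes = 0

    def recv_inquiry(self, db_oid: int, cutoff_time: float) -> None:
        # Data written at or after the cutoff already satisfies the request,
        # so nothing is queued (assumed comparison).
        if cutoff_time <= self.last_write_time:
            return
        if db_oid not in self.pending_db_oids:
            self.pending_db_oids.append(db_oid)

    def write_pending(self, now: float) -> None:
        if not self.pending_db_oids:
            return
        self.pending_db_oids.clear()
        self.last_write_time = now
        self.writes += 1


collector = StatsCollector()
# Two backends decide at about the same moment that the stats are too old.
collector.recv_inquiry(db_oid=16384, cutoff_time=10.0)
collector.write_pending(now=11.0)   # written right after the first inquiry
collector.recv_inquiry(db_oid=16384, cutoff_time=10.0)
collector.write_pending(now=11.5)   # nothing queued, so no second write
```

Because the decision is made before anything is queued, the queue entries need no request time of their own.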
Fix missing abort checks in pg_backup_directory.c.
commit : 3033e7359fb850d71ffe9deb88d013cf83a98ad2 author : Tom Lane <firstname.lastname@example.org> date : Sun, 29 May 2016 13:18:49 -0400 committer: Tom Lane <email@example.com> date : Sun, 29 May 2016 13:18:49 -0400
Parallel restore from directory format failed to respond to control-C in a timely manner, because there were no checkAborting() calls in the code path that reads data from a file and sends it to the backend. If any worker was in the midst of restoring data for a large table, you'd just have to wait. This fix doesn't do anything for the problem of aborting a long-running server-side command, but at least it fixes things for data transfers. Back-patch to 9.3 where parallel restore was introduced.
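The general shape of the fix, an abort check inside the data-transfer loop, looks roughly like this. The function and exception names are hypothetical, and a `threading.Event` stands in for whatever state checkAborting() consults.

```python
import io
import threading


class Aborted(Exception):
    """Raised when a transfer is cancelled part-way through."""


def copy_with_abort_checks(src, dst, abort_event, chunk_size=4096):
    """Copy src to dst in chunks, checking for an abort request before each one."""
    total = 0
    while True:
        if abort_event.is_set():
            raise Aborted(f"transfer cancelled after {total} bytes")
        chunk = src.read(chunk_size)
        if not chunk:
            return total
        dst.write(chunk)
        total += len(chunk)


abort = threading.Event()
source = io.BytesIO(b"x" * 10000)
sink = io.BytesIO()
copied = copy_with_abort_checks(source, sink, abort)
```

Without the check at the top of the loop, a cancel request would go unnoticed until the whole file had been sent.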
Remove pg_dump/parallel.c's useless "aborting" flag.
commit : 99e3298181546bb31895b6f7123ece222427c8dc author : Tom Lane <firstname.lastname@example.org> date : Sun, 29 May 2016 13:00:09 -0400 committer: Tom Lane <email@example.com> date : Sun, 29 May 2016 13:00:09 -0400
This was effectively dead code, since the places that tested it could not be reached after we entered the on-exit-cleanup routine that would set it. It seems to have been a leftover from a design in which error abort would try to send fresh commands to the workers --- a design which could never have worked reliably, of course. Since the flag is not cross-platform, it complicates reasoning about the code's behavior, which we could do without. Although this is effectively just cosmetic, back-patch anyway, because there are some actual bugs in the vicinity of this behavior. Discussion: <firstname.lastname@example.org>
Lots of comment-fixing, and minor cosmetic cleanup, in pg_dump/parallel.c.
commit : 24c1f64a66d6f7a6448d0c9aedecf3e47a2a0c5e author : Tom Lane <email@example.com> date : Sat, 28 May 2016 14:02:11 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Sat, 28 May 2016 14:02:11 -0400
The commentary in this file was in extremely sad shape. The author(s) had clearly never heard of the project convention that a function header comment should provide an API spec of some sort for that function. Much of it was flat out wrong, too --- maybe it was accurate when written, but if so it had not been updated to track subsequent code revisions. Rewrite and rearrange to try to bring it up to speed, and annotate some of the places where more work is needed. (I've refrained from actually fixing anything of substance ... yet.) Also, rename a couple of functions for more clarity as to what they do, do some very minor code rearrangement, remove some pointless Asserts, fix an incorrect Assert in readMessageFromPipe, and add a missing socket close in one error exit from pgpipe(). The last would be a bug if we tried to continue after pgpipe() failure, but since we don't, it's just cosmetic at present. Although this is only cosmetic, back-patch to 9.3 where parallel.c was added. It's sufficiently invasive that it'll pose a hazard for future back-patching if we don't. Discussion: <email@example.com>
Clean up thread management in parallel pg_dump for Windows.
commit : 8b97208ecb44f41d159237822f4fa7214075ea75 author : Tom Lane <firstname.lastname@example.org> date : Fri, 27 May 2016 12:02:09 -0400 committer: Tom Lane <email@example.com> date : Fri, 27 May 2016 12:02:09 -0400
Since we start the worker threads with _beginthreadex(), we should use _endthreadex() to terminate them. We got this right in the normal-exit code path, but not so much during an error exit from a worker. In addition, be sure to apply CloseHandle to the thread handle after each thread exits. It's not clear that these oversights cause any user-visible problems, since the pg_dump run is about to terminate anyway. Still, it's clearly better to follow Microsoft's API specifications than ignore them. Also a few cosmetic cleanups in WaitForTerminatingWorkers(), including being a bit less random about where to cast between uintptr_t and HANDLE, and being sure to clear the worker identity field for each dead worker (not that false matches should be possible later, but let's be careful). Original observation and patch by Armin Schöffmann, cosmetic improvements by Michael Paquier and me. (Armin's patch also included closing sockets in ShutdownWorkersHard(), but that's been dealt with already in commit df8d2d8c4.) Back-patch to 9.3 where parallel pg_dump was introduced. Discussion: <firstname.lastname@example.org>
Be more predictable about reporting "lock timeout" vs "statement timeout".
commit : 1f1e70a87f311cf075cc2980609d25f93b290978 author : Tom Lane <email@example.com> date : Fri, 27 May 2016 10:40:20 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Fri, 27 May 2016 10:40:20 -0400
If both timeout indicators are set when we arrive at ProcessInterrupts, we've historically just reported "lock timeout". However, some buildfarm members have been observed to fail isolationtester's timeouts test by reporting "lock timeout" when the statement timeout was expected to fire first. The cause seems to be that the process is allowed to sleep longer than expected (probably due to heavy machine load) so that the lock timeout happens before we reach the point of reporting the error, and then this arbitrary tiebreak rule does the wrong thing. We can improve matters by comparing the scheduled timeout times to decide which error to report. I had originally proposed greatly reducing the 1-second window between the two timeouts in the test cases. On reflection that is a bad idea, at least for the case where the lock timeout is expected to fire first, because that would assume that it takes negligible time to get from statement start to the beginning of the lock wait. Thus, this patch doesn't completely remove the risk of test failures on slow machines. Empirically, however, the case this handles is the one we are seeing in the buildfarm. The explanation may be that the other case requires the scheduler to take the CPU away from a busy process, whereas the case fixed here only requires the scheduler to not give the CPU back right away to a process that has been woken from a multi-second sleep (and, perhaps, has been swapped out meanwhile). Back-patch to 9.3 where the isolationtester timeouts test was added. Discussion: <email@example.com>
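The tiebreak described above amounts to comparing the two scheduled finish times. A minimal sketch, with a hypothetical function name and with the handling of an exact tie (falling back to "lock timeout") as an assumption:

```python
def timeout_error_to_report(lock_deadline: float, statement_deadline: float) -> str:
    """Both indicators are set; report the timeout that was scheduled to fire first."""
    if statement_deadline < lock_deadline:
        return "statement timeout"
    # Ties keep the historical choice (assumed).
    return "lock timeout"
```

For example, if the statement timeout was due at t=5.0 and the lock timeout at t=6.0, a process that oversleeps past both still reports "statement timeout".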
Make pg_dump behave more sanely when built without HAVE_LIBZ.
commit : 99565a1ef88447bd8844321159f3d848721f04c4 author : Tom Lane <firstname.lastname@example.org> date : Thu, 26 May 2016 11:51:04 -0400 committer: Tom Lane <email@example.com> date : Thu, 26 May 2016 11:51:04 -0400
For some reason the code to emit a warning and switch to uncompressed output was placed down in the guts of pg_backup_archiver.c. This is definitely too late in the case of parallel operation (and I rather wonder if it wasn't too late for other purposes as well). Put it in pg_dump.c's option-processing logic, which seems a much saner place.
Also, the default behavior with custom or directory output format was to emit the warning telling you the output would be uncompressed. This seems unhelpful, so silence that case.
Back-patch to 9.3 where parallel dump was introduced.
Kyotaro Horiguchi, adjusted a bit by me
Report: <firstname.lastname@example.org>
In Windows pg_dump, ensure idle workers will shut down during error exit.
commit : b9784e1f769497dcfaa1f866913f3d86a2adf176 author : Tom Lane <email@example.com> date : Thu, 26 May 2016 10:50:30 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Thu, 26 May 2016 10:50:30 -0400
The Windows coding of ShutdownWorkersHard() thought that setting termEvent was sufficient to make workers exit after an error. But that only helps if a worker is busy and passes through checkAborting(). An idle worker will just sit, resulting in pg_dump failing to exit until the user gives up and hits control-C. We should close the write end of the command pipe so that idle workers will see socket EOF and exit, as the Unix coding was already doing.
Back-patch to 9.3 where parallel pg_dump was introduced.
Kyotaro Horiguchi
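The EOF mechanism this fix relies on can be shown with a small POSIX sketch. It uses a plain `pipe()` as a stand-in for pg_dump's command channel (the Windows code uses sockets), and the function names `worker_sees_eof()` and `demo_shutdown_idle_worker()` are hypothetical.

```c
#include <unistd.h>

/*
 * Returns 1 if reading the command channel reports EOF, 0 if data arrived,
 * and -1 on error.  An idle worker blocked in read() wakes up with EOF once
 * every write end of the channel has been closed.
 */
int
worker_sees_eof(int read_fd)
{
    char    buf[64];
    ssize_t n = read(read_fd, buf, sizeof(buf));

    if (n < 0)
        return -1;
    return n == 0;
}

/* Simulate the master closing its write end during an error exit. */
int
demo_shutdown_idle_worker(void)
{
    int fds[2];
    int result;

    if (pipe(fds) != 0)
        return -1;

    close(fds[1]);                      /* master closes the write end */
    result = worker_sees_eof(fds[0]);   /* worker's read() returns 0 */
    close(fds[0]);
    return result;
}
```

Setting a flag such as termEvent only helps a worker that is running and polls it. Closing the write end also reaches a worker that is blocked waiting for its next command.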
Avoid hot standby cancels from VAC FREEZE
commit : 6537a48c5521cb2fc401e18fe678633a688965b5 author : Alvaro Herrera <email@example.com> date : Wed, 25 May 2016 19:39:49 -0400 committer: Alvaro Herrera <firstname.lastname@example.org> date : Wed, 25 May 2016 19:39:49 -0400
VACUUM FREEZE generated false cancelations of standby queries on an otherwise idle master. Caused by an off-by-one error on cutoff_xid which goes back to the original commit.
Analysis and report by Marco Nenciarini
Bug fix by Simon Riggs
This is a correct backpatch of commit 66fbcb0d2e to branches 9.1 through 9.4. That commit was backpatched to 9.0 originally, but it was immediately reverted in 9.0-9.4 because it didn't compile.
Ensure that backends see up-to-date statistics for shared catalogs.
commit : 463207630b1142c61ec2fa528ce3fc1f53c79037 author : Tom Lane <email@example.com> date : Wed, 25 May 2016 17:48:15 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Wed, 25 May 2016 17:48:15 -0400
Ever since we split the statistics collector's reports into per-database files (commit 187492b6c2e8cafc), backends have been seeing stale statistics for shared catalogs. This is because the inquiry message only prompts the collector to write the per-database file for the requesting backend's own database. Stats for shared catalogs are in a separate file for "DB 0", which didn't get updated.
In normal operation this was partially masked by the fact that the autovacuum launcher would send an inquiry message at least once per autovacuum_naptime that asked for "DB 0"; so the shared-catalog stats would never be more than a minute out of date. However the problem becomes very obvious with autovacuum disabled, as reported by Peter Eisentraut.
To fix, redefine the semantics of inquiry messages so that both the specified DB and DB 0 will be dumped. (This might seem a bit inefficient, but we have no good way to know whether a backend's transaction will look at shared-catalog stats, so we have to read both groups of stats whenever we request stats. Sending two inquiry messages would definitely not be better.)
Back-patch to 9.3 where the bug was introduced.
Report: <56AD41AC.email@example.com>
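The redefined inquiry semantics can be sketched as a predicate deciding which per-database stats files to write. This is a simplified illustration under assumptions: the pending requests are modeled as a plain array, and `db_file_requested()` and `SHARED_DB_OID` are hypothetical names, not the collector's actual code.

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;

/* "DB 0" holds the statistics for shared catalogs. */
#define SHARED_DB_OID ((Oid) 0)

/*
 * Decide whether the stats file for dbid should be written, given the
 * database OIDs named in outstanding inquiry messages.
 */
bool
db_file_requested(Oid dbid, const Oid *pending, size_t npending)
{
    size_t i;

    /* Any outstanding inquiry also implies writing the shared-catalog file. */
    if (dbid == SHARED_DB_OID && npending > 0)
        return true;

    for (i = 0; i < npending; i++)
    {
        if (pending[i] == dbid)
            return true;
    }
    return false;
}
```

A backend asking about its own database therefore refreshes the shared-catalog stats as a side effect, without a second inquiry message.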
Fix broken error handling in parallel pg_dump/pg_restore.
commit : 1c8205159840afdca3ebc9f838f19ed11446c9d4 author : Tom Lane <firstname.lastname@example.org> date : Wed, 25 May 2016 12:39:57 -0400 committer: Tom Lane <email@example.com> date : Wed, 25 May 2016 12:39:57 -0400
In the original design for parallel dump, worker processes reported errors by sending them up to the master process, which would print the messages. This is unworkably fragile for a couple of reasons: it risks deadlock if a worker sends an error at an unexpected time, and if the master has already died for some reason, the user will never get to see the error at all. Revert that idea and go back to just always printing messages to stderr. This approach means that if all the workers fail for similar reasons (eg, bad password or server shutdown), the user will see N copies of that message, not only one as before. While that's slightly annoying, it's certainly better than not seeing any message; not to mention that we shouldn't assume that only the first failure is interesting.
An additional problem in the same area was that the master failed to disable SIGPIPE (at least until much too late), which meant that sending a command to an already-dead worker would cause the master to crash silently. That was bad enough in itself but was made worse by the total reliance on the master to print errors: even if the worker had reported an error, you would probably not see it, depending on timing. Instead disable SIGPIPE right after we've forked the workers, before attempting to send them anything.
Additionally, the master relies on seeing socket EOF to realize that a worker has exited prematurely --- but on Windows, there would be no EOF since the socket is attached to the process that includes both the master and worker threads, so it remains open. Make archive_close_connection() close the worker end of the sockets so that this acts more like the Unix case. It's not perfect, because if a worker thread exits without going through exit_nicely() the closures won't happen; but that's not really supposed to happen.
This has been wrong all along, so back-patch to 9.3 where parallel dump was introduced.
Report: <firstname.lastname@example.org>
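The SIGPIPE part of this fix can be illustrated with a short POSIX sketch. It is not pg_dump's code: a `pipe()` stands in for the master-to-worker channel, and `write_to_dead_worker()` is a hypothetical helper. The point is that once SIGPIPE is ignored, writing to a channel with no reader fails with `EPIPE` instead of silently killing the process.

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/*
 * Write to a channel whose reader has already gone away.
 * Returns the errno of the failed write, or 0 if the write succeeded,
 * or -1 if the channel could not be set up.
 */
int
write_to_dead_worker(void)
{
    int     fds[2];
    int     saved_errno;
    ssize_t n;

    /* Ignore SIGPIPE before sending anything to the workers. */
    signal(SIGPIPE, SIG_IGN);

    if (pipe(fds) != 0)
        return -1;

    close(fds[0]);                  /* the "worker" has exited: no reader */

    n = write(fds[1], "x", 1);      /* would raise SIGPIPE if not ignored */
    saved_errno = errno;
    close(fds[1]);

    return (n < 0) ? saved_errno : 0;
}
```

With the default SIGPIPE disposition the same write would terminate the process before it could print anything, which matches the silent crash described above.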
Fetch XIDs atomically during vac_truncate_clog().
commit : ff98ae908bbfd950e98099c653380b9cd0ac2739 author : Tom Lane <email@example.com> date : Tue, 24 May 2016 15:47:51 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Tue, 24 May 2016 15:47:51 -0400
Because vac_update_datfrozenxid() updates datfrozenxid and datminmxid in-place, it's unsafe to assume that successive reads of those values will give consistent results. Fetch each one just once to ensure sane behavior in the minimum calculation. Noted while reviewing Alexander Korotkov's patch in the same area. Discussion: <email@example.com>
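The "fetch each one just once" pattern looks roughly like the sketch below. The `DbEntry` struct, `xid_precedes()`, and `oldest_frozen_xid()` are hypothetical simplifications: a real catalog row carries many more fields, and the real XID comparison also special-cases permanent XIDs.

```c
#include <stdint.h>

typedef uint32_t TransactionId;

/* Hypothetical stand-in for a pg_database row that may be updated in place. */
typedef struct
{
    volatile TransactionId datfrozenxid;
} DbEntry;

/* Simplified circular comparison: does a logically precede b? */
int
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

TransactionId
oldest_frozen_xid(const DbEntry *dbs, int ndbs, TransactionId start)
{
    TransactionId oldest = start;
    int           i;

    for (i = 0; i < ndbs; i++)
    {
        /*
         * Read the shared field exactly once into a local.  Testing
         * dbs[i].datfrozenxid and then assigning from it again could see
         * two different values if another process updates it in between.
         */
        TransactionId xid = dbs[i].datfrozenxid;

        if (xid_precedes(xid, oldest))
            oldest = xid;
    }
    return oldest;
}
```

The local copy guarantees that the value tested and the value stored are the same one, so the minimum is always a value that was actually observed.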
Avoid consuming an XID during vac_truncate_clog().
commit : 9f3e4c8131886ffb1f382d4ef8e76caa965983f4 author : Tom Lane <firstname.lastname@example.org> date : Tue, 24 May 2016 15:20:12 -0400 committer: Tom Lane <email@example.com> date : Tue, 24 May 2016 15:20:12 -0400
vac_truncate_clog() uses its own transaction ID as the comparison point in a sanity check that no database's datfrozenxid has already wrapped around "into the future". That was probably fine when written, but in a lazy vacuum we won't have assigned an XID, so calling GetCurrentTransactionId() causes an XID to be assigned when otherwise one would not be. Most of the time that's not a big problem ... but if we are hard up against the wraparound limit, consuming XIDs during antiwraparound vacuums is a very bad thing.
Instead, use ReadNewTransactionId(), which not only avoids this problem but is in itself a better comparison point to test whether wraparound has already occurred.
Report and patch by Alexander Korotkov. Back-patch to all versions.
Report: <CAPpHfdspOkmiQsxh-UZw2chM6dRMwXAJGEmmbmqYR=yvM7-s6A@mail.gmail.com>
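The "wrapped into the future" check can be sketched with modulo-2^32 XID arithmetic. This is a simplified illustration: `xid_precedes()` and `frozenxid_in_future()` are hypothetical names, special (permanent) XIDs are ignored, and `next_xid` is assumed to be a value read from the next-XID counter without assigning an XID to the caller.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;

/*
 * Simplified circular comparison: a precedes b if the signed 32-bit
 * difference is negative, so each XID has about two billion XIDs "behind"
 * it and about two billion "ahead" of it.
 */
bool
xid_precedes(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) < 0;
}

/*
 * Sanity check: a database's datfrozenxid must not be later than the next
 * XID to be assigned.  If it is, it has apparently wrapped around.
 */
bool
frozenxid_in_future(TransactionId datfrozenxid, TransactionId next_xid)
{
    return xid_precedes(next_xid, datfrozenxid);
}
```

Because the comparison is circular, a datfrozenxid of 4294967000 is still "in the past" relative to a next XID of 5, since the counter has merely wrapped past zero. The check takes `next_xid` as an argument, so the caller never needs an XID of its own.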
Fix latent crash in do_text_output_multiline().
commit : de521989a22172759c349b8b15121dbcb5ace3c1 author : Tom Lane <firstname.lastname@example.org> date : Mon, 23 May 2016 14:16:41 -0400 committer: Tom Lane <email@example.com> date : Mon, 23 May 2016 14:16:41 -0400
do_text_output_multiline() would fail (typically with a null pointer dereference crash) if its input string did not end with a newline. Such cases do not arise in our current sources; but it certainly could happen in future, or in extension code's usage of the function, so we should fix it. To fix, replace "eol += len" with "eol = text + len".
While at it, make two cosmetic improvements: mark the input string const, and rename the argument from "text" to "txt" to dodge pgindent strangeness (since "text" is a typedef name).
Even though this problem is only latent at present, it seems like a good idea to back-patch the fix, since it's a very simple/safe patch and it's not out of the realm of possibility that we might in future back-patch something that expects sane behavior from do_text_output_multiline().
Per report from Hao Lee.
Report: <CAGoxFiFPAGyPAJLcFxTB5cGhTW2yOVBDYeqDugYwV4dEd1L_Ag@mail.gmail.com>
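The bug and its fix can be seen in a stripped-down line splitter. This is a sketch of the loop shape only, not the actual function: `for_each_line()`, the `line_fn` callback type, and `count_line()` are hypothetical, and the real function emits a tuple per line rather than calling a callback.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical callback standing in for the per-line output step. */
typedef void (*line_fn) (const char *line, size_t len, void *arg);

void
for_each_line(const char *txt, line_fn emit, void *arg)
{
    while (*txt)
    {
        const char *eol = strchr(txt, '\n');
        size_t      len;

        if (eol)
        {
            len = (size_t) (eol - txt);
            eol++;              /* continue just past the newline */
        }
        else
        {
            /*
             * Last line has no trailing newline.  Here eol is NULL, so
             * "eol += len" would offset a null pointer; compute the end
             * of the string from txt instead.
             */
            len = strlen(txt);
            eol = txt + len;
        }

        emit(txt, len, arg);
        txt = eol;
    }
}

/* Example callback: count the lines seen. */
void
count_line(const char *line, size_t len, void *arg)
{
    (void) line;
    (void) len;
    (*(int *) arg)++;
}
```

With the corrected assignment, an input such as "a\nb" yields two lines, and the loop ends cleanly at the terminating NUL.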
Further improve documentation about --quote-all-identifiers switch.
commit : 7ac03429418c63eb0b9e8b40c0a20b014ef96f13 author : Tom Lane <firstname.lastname@example.org> date : Fri, 20 May 2016 15:51:57 -0400 committer: Tom Lane <email@example.com> date : Fri, 20 May 2016 15:51:57 -0400
Mention it in the Notes section too, per suggestion from David Johnston. Discussion: <firstname.lastname@example.org>
Improve documentation about pg_dump's --quote-all-identifiers switch.
commit : f2727f542fca3970d995877cdbca7c6cf040a492 author : Tom Lane <email@example.com> date : Fri, 20 May 2016 14:59:48 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Fri, 20 May 2016 14:59:48 -0400
Per bug #14152 from Alejandro Martínez. Back-patch to all supported branches. Discussion: <email@example.com>
doc: Fix typo
commit : 405b9baf1b19086094495c827fcf668ccd4cc492 author : Peter Eisentraut <firstname.lastname@example.org> date : Fri, 13 May 2016 21:24:13 -0400 committer: Peter Eisentraut <email@example.com> date : Fri, 13 May 2016 21:24:13 -0400
From: Alexander Law <firstname.lastname@example.org>
Ensure plan stability in contrib/btree_gist regression test.
commit : a2c1bc36daecf94d390215849ba49f115f4328bd author : Tom Lane <email@example.com> date : Thu, 12 May 2016 20:04:12 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Thu, 12 May 2016 20:04:12 -0400
Buildfarm member skink failed with symptoms suggesting that an auto-analyze had happened and changed the plan displayed for a test query. Although this is evidently of low probability, regression tests that sometimes fail are no fun, so add commands to force a bitmap scan to be chosen.
Fix obsolete comment
commit : 6e6e4f1659eadb4e50458a357e5c134e0be46e99 author : Alvaro Herrera <email@example.com> date : Thu, 12 May 2016 15:36:51 -0300 committer: Alvaro Herrera <firstname.lastname@example.org> date : Thu, 12 May 2016 15:36:51 -0300
Fix autovacuum for shared relations
commit : 92ebe509e381002f62faeeeb9007723409725323 author : Alvaro Herrera <email@example.com> date : Tue, 10 May 2016 16:23:54 -0300 committer: Alvaro Herrera <firstname.lastname@example.org> date : Tue, 10 May 2016 16:23:54 -0300
The table-skipping logic in autovacuum would fail to consider that multiple workers could be processing the same shared catalog in different databases. This normally wouldn't be a problem: firstly because autovacuum workers not for wraparound would simply ignore tables on which they cannot acquire a lock, and secondly because most of the time these tables are small enough that even if multiple for-wraparound workers are stuck in the same catalog, they would be over pretty quickly. But in cases where the catalogs are severely bloated it could become a problem.
Backpatch all the way back, because the problem has been there since the beginning.
Reported by Ondřej Světlík
Discussion: https://www.postgresql.org/message-id/572B63B1.3030603%40flexibee.eu
https://www.postgresql.org/message-id/572A1072.5080308%40flexibee.eu