commit : 30b1e1ac4cea9ab32f2912e503589cda09e535aa author : Tom Lane <email@example.com> date : Mon, 6 May 2019 16:54:48 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Mon, 6 May 2019 16:54:48 -0400
Last-minute updates for release notes.
commit : ffe69b4afdae8284ad3f0d4c57640f077799cfa9 author : Tom Lane <email@example.com> date : Mon, 6 May 2019 12:45:59 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Mon, 6 May 2019 12:45:59 -0400
Security: CVE-2019-10129, CVE-2019-10130
commit : 54ff9fa588fb85f3e192bb7fe365d9db2eeed8a9 author : Peter Eisentraut <email@example.com> date : Mon, 6 May 2019 14:58:53 +0200 committer: Peter Eisentraut <firstname.lastname@example.org> date : Mon, 6 May 2019 14:58:53 +0200
Source-Git-URL: https://git.postgresql.org/git/pgtranslation/messages.git Source-Git-Hash: 07a92cdb0f05a078521d8464c99cd654409cd0af
Use checkAsUser for selectivity estimator checks, if it's set.
commit : 3c09999098519da6b601df878e326eed701201ba author : Dean Rasheed <email@example.com> date : Mon, 6 May 2019 12:05:05 +0100 committer: Dean Rasheed <firstname.lastname@example.org> date : Mon, 6 May 2019 12:05:05 +0100
In examine_variable() and examine_simple_variable(), when checking the user's table and column privileges to determine whether to grant access to the pg_statistic data, use checkAsUser for the privilege checks, if it's set. This will be the case if we're accessing the table via a view, to indicate that we should perform privilege checks as the view owner rather than the current user. This change makes this planner check consistent with the check in the executor, so the planner will be able to make use of statistics if the table is accessible via the view. This fixes a performance regression introduced by commit e2d4ef8de8, which affects queries against non-security barrier views in the case where the user doesn't have privileges on the underlying table, but the view owner does. Note that it continues to provide the same safeguards controlling access to pg_statistic for direct table access (in which case checkAsUser won't be set) and for security barrier views, because of the nearby checks on rte->security_barrier and rte->securityQuals. Back-patch to all supported branches because e2d4ef8de8 was. Dean Rasheed, reviewed by Jonathan Katz and Stephen Frost.
Remove reindex_catalog test from test schedules.
commit : 70200d0d74027fffa53d750546781ed067b44f98 author : Andres Freund <email@example.com> date : Sun, 5 May 2019 23:31:58 -0700 committer: Andres Freund <firstname.lastname@example.org> date : Sun, 5 May 2019 23:31:58 -0700
As the test currently causes occasional deadlocks (due to the schema cleanup from previous sessions potentially still running), and the patch from f912d7dec2 has gotten a fair bit of buildfarm coverage, remove the test from the test schedules. There's a set of minor releases coming up. Leave the tests in place, so it can manually be run using EXTRA_TESTS. For now also leave it in master, as there's no imminent release, and there's plenty (re-)index related work in 12. But we'll have to disable it before long there too, unless somebody comes up with simple enough fixes for the deadlock (I'm about to post a vague idea to the list). Discussion: https://email@example.com Backpatch: 9.4-11 (no master!)
Release notes for 11.3, 10.8, 9.6.13, 9.5.17, 9.4.22.
commit : 21de9ae15935c7bd5a40aeb1b93242cc1da446cb author : Tom Lane <firstname.lastname@example.org> date : Sun, 5 May 2019 14:57:17 -0400 committer: Tom Lane <email@example.com> date : Sun, 5 May 2019 14:57:17 -0400
Fix reindexing of pg_class indexes some more.
commit : 5f8e84ff47ff342e9169b20e3643819372a2f685 author : Tom Lane <firstname.lastname@example.org> date : Thu, 2 May 2019 19:11:29 -0400 committer: Tom Lane <email@example.com> date : Thu, 2 May 2019 19:11:29 -0400
Commits 3dbb317d3 et al failed under CLOBBER_CACHE_ALWAYS testing. Investigation showed that to reindex pg_class_oid_index, we must suppress accesses to the index (via SetReindexProcessing) before we call RelationSetNewRelfilenode, or at least before we do CommandCounterIncrement therein; otherwise, relcache reloads happening within the CCI may try to fetch pg_class rows using the index's new relfilenode value, which is as yet an empty file. Of course, the point of 3dbb317d3 was that that ordering didn't work either, because then RelationSetNewRelfilenode's own update of the index's pg_class row cannot access the index, should it need to. There are various ways we might have got around that, but Andres Freund came up with a brilliant solution: for a mapped index, we can really just skip the pg_class update altogether. The only fields it was actually changing were relpages etc, but it was just setting them to zeroes which is useless make-work. (Correct new values will be installed at the end of index build.) All pg_class indexes are mapped and probably always will be, so this eliminates the problem by removing work rather than adding it, always a pleasant outcome. Having taught RelationSetNewRelfilenode to do it that way, we can revert the code reordering in reindex_index. (But I left the moved setup code where it was; there seems no reason why it has to run without use of the old index. If you're trying to fix a busted pg_class index, you'll have had to disable system index use altogether to get this far.) Moreover, this means we don't need RelationSetIndexList at all, because reindex_relation's hacking to make "REINDEX TABLE pg_class" work is likewise now unnecessary. We'll leave that code in place in the back branches, but a follow-on patch will remove it in HEAD. In passing, do some minor cleanup for commit 5c1560606 (in HEAD only), notably removing a duplicate newrnode assignment. Patch by me, using a core idea due to Andres Freund. Back-patch to all supported branches, as 3dbb317d3 was. Discussion: https://firstname.lastname@example.org
Run catalog reindexing test from 3dbb317d32 serially, to avoid deadlocks.
commit : 856bc0c612bea2df5f5748a51f2cda3f3c33928d author : Andres Freund <email@example.com> date : Tue, 30 Apr 2019 17:45:32 -0700 committer: Andres Freund <firstname.lastname@example.org> date : Tue, 30 Apr 2019 17:45:32 -0700
The tests turn out to cause deadlocks in some circumstances. Fairly reproducibly so with -DRELCACHE_FORCE_RELEASE -DCATCACHE_FORCE_RELEASE. Some of the deadlocks may be hard to fix without disproportionate measures, but others probably should be fixed - but not in 12. We discussed removing the new tests until we can fix the issues underlying the deadlocks, but results from buildfarm animal markhor (which runs with CLOBBER_CACHE_ALWAYS) indicates that there might be a more severe, as of yet undiagnosed, issue (including on stable branches) with reindexing catalogs. The failure is: ERROR: could not read block 0 in file "base/16384/28025": read only 0 of 8192 bytes Therefore it seems advisable to keep the tests. It's not certain that running the tests in isolation removes the risk of deadlocks. It's possible that additional locks are needed to protect against a concurrent auto-analyze or such. Per discussion with Tom Lane. Discussion: https://email@example.com Backpatch: 9.4-, like 3dbb317d3
Fix unused variable compiler warning in !debug builds.
commit : 40230f0e2e3bed70eaada93b73a440221017d4d7 author : Andres Freund <firstname.lastname@example.org> date : Tue, 30 Apr 2019 17:45:32 -0700 committer: Andres Freund <email@example.com> date : Tue, 30 Apr 2019 17:45:32 -0700
Introduced in 3dbb317d3. Fix by using the new local variable in more places. Reported-By: Bruce Momjian (off-list) Backpatch: 9.4-, like 3dbb317d3
Fix potential assertion failure when reindexing a pg_class index.
commit : e7418f89f17ad351d8cbd513a104efda6d25a2bd author : Andres Freund <firstname.lastname@example.org> date : Mon, 29 Apr 2019 19:39:36 -0700 committer: Andres Freund <email@example.com> date : Mon, 29 Apr 2019 19:39:36 -0700
When reindexing individual indexes on pg_class it was possible to either trigger an assertion failure: TRAP: FailedAssertion("!(!ReindexIsProcessingIndex(((index)->rd_id))) That's because reindex_index() called SetReindexProcessing() - which enables an asserts ensuring no index insertions happen into the index - before calling RelationSetNewRelfilenode(). That not correct for indexes on pg_class, because RelationSetNewRelfilenode() updates the relevant pg_class row, which needs to update the indexes. The are two reasons this wasn't noticed earlier. Firstly the bug doesn't trigger when reindexing all of pg_class, as reindex_relation has code "hiding" all yet-to-be-reindexed indexes. Secondly, the bug only triggers when the the update to pg_class doesn't turn out to be a HOT update - otherwise there's no index insertion to trigger the bug. Most of the time there's enough space, making this bug hard to trigger. To fix, move RelationSetNewRelfilenode() to before the SetReindexProcessing() (and, together with some other code, to outside of the PG_TRY()). To make sure the error checking intended by SetReindexProcessing() is more robust, modify CatalogIndexInsert() to check ReindexIsProcessingIndex() even when the update is a HOT update. Also add a few regression tests for REINDEXing of system catalogs. The last two improvements would have prevented some of the issues fixed in 5c1560606dc4c from being introduced in the first place. Reported-By: Michael Paquier Diagnosed-By: Tom Lane and Andres Freund Author: Andres Freund Reviewed-By: Tom Lane Discussion: https://postgr.es/m/20190418011430.GA19133@paquier.xyz Backpatch: 9.4-, the bug is present in all branches
Correct the URL pointing to PL/R
commit : 17947af968b82ffba7aa2bde047ddd862f7fc2f9 author : Joe Conway <firstname.lastname@example.org> date : Sat, 27 Apr 2019 09:28:09 -0400 committer: Joe Conway <email@example.com> date : Sat, 27 Apr 2019 09:28:09 -0400
As pointed out by documentation comment, the URL for PL/R needs to be updated to the correct current repository. Back-patch to all supported branches.
Portability fix for zic.c.
commit : 1c61ec7bb14ee0c6064f54a2ddaaedc3537db581 author : Tom Lane <firstname.lastname@example.org> date : Fri, 26 Apr 2019 21:20:11 -0400 committer: Tom Lane <email@example.com> date : Fri, 26 Apr 2019 21:20:11 -0400
Missed an inttypes.h dependency in previous patch. Per buildfarm.
Sync our copy of the timezone library with IANA release tzcode2019a.
commit : 7f36286c2e454ded91b0bcbb738b9d7648c67390 author : Tom Lane <firstname.lastname@example.org> date : Fri, 26 Apr 2019 19:46:26 -0400 committer: Tom Lane <email@example.com> date : Fri, 26 Apr 2019 19:46:26 -0400
This corrects a small bug in zic that caused it to output an incorrect year-2440 transition in the Africa/Casablanca zone. More interestingly, zic has grown a "-r" option that limits the range of zone transitions that it will put into the output files. That might be useful to people who don't like the weird GMT offsets that tzdb likes to use for very old dates. It appears that for dates before the cutoff time specified with -r, zic will use the zone's standard-time offset as of the cutoff time. So for example one might do make install ZIC_OPTIONS='-r @-1893456000' to cause all dates before 1910-01-01 to be treated as though 1910 standard time prevailed indefinitely far back. (Don't blame me for the unfriendly way of specifying the cutoff time --- it's seconds since or before the Unix epoch. You can use extract(epoch ...) to calculate it.) As usual, back-patch to all supported branches.
Update time zone data files to tzdata release 2019a.
commit : baeb12010d2532955ed815ea31682c3d8d088cbc author : Tom Lane <firstname.lastname@example.org> date : Fri, 26 Apr 2019 17:56:26 -0400 committer: Tom Lane <email@example.com> date : Fri, 26 Apr 2019 17:56:26 -0400
DST law changes in Palestine and Metlakatla. Historical corrections for Israel. Etc/UCT is now a backward-compatibility link to Etc/UTC, instead of being a separate zone that generates the abbreviation "UCT", which nowadays is typically a typo. Postgres will still accept "UCT" as an input zone name, but it won't output it.
Fix some minor postmaster-state-machine issues.
commit : f0c82454dcb4ef37dd360605c49a0ddcdc8b4eb8 author : Tom Lane <firstname.lastname@example.org> date : Wed, 24 Apr 2019 14:15:45 -0400 committer: Tom Lane <email@example.com> date : Wed, 24 Apr 2019 14:15:45 -0400
In sigusr1_handler, don't ignore PMSIGNAL_ADVANCE_STATE_MACHINE based on pmState. The restriction is unnecessary (PostmasterStateMachine should work in any state), not future-proof (since it makes too many assumptions about why the signal might be sent), and broken even today because a race condition can make it necessary to respond to the signal in PM_WAIT_READONLY state. The race condition seems unlikely, but if it did happen, a hot-standby postmaster could fail to shut down after receiving a smart-shutdown request. In MaybeStartWalReceiver, don't clear the WalReceiverRequested flag if the fork attempt fails. Leaving it set allows us to try again in future iterations of the postmaster idle loop. (The startup process would eventually send a fresh request signal, but this change may allow us to retry the fork sooner.) Remove an obsolete comment and unnecessary test in PostmasterStateMachine's handling of PM_SHUTDOWN_2 state. It's not possible to have a live walreceiver in that state, and AFAICT has not been possible since commit 5e85315ea. This isn't a live bug, but the false comment is quite confusing to readers. In passing, rearrange sigusr1_handler's CheckPromoteSignal tests so that we don't uselessly perform stat() calls that we're going to ignore the results of. Add some comments clarifying the behavior of MaybeStartWalReceiver; I very nearly rearranged it in a way that'd reintroduce the race condition fixed in e5d494d78. Mea culpa for not commenting that properly at the time. Back-patch to all supported branches. The PMSIGNAL_ADVANCE_STATE_MACHINE change is the only one of even minor significance, but we might as well keep this code in sync across branches. Discussion: https://firstname.lastname@example.org
Fix detection of passwords hashed with MD5
commit : a82c06f4001d28c17f67c61b492f774d5e507b31 author : Michael Paquier <email@example.com> date : Wed, 24 Apr 2019 09:05:37 +0900 committer: Michael Paquier <firstname.lastname@example.org> date : Wed, 24 Apr 2019 09:05:37 +0900
This commit fixes an issue related to the way password verifiers hashed with MD5 are detected, leading to possibly detect that plain passwords are legit MD5 hashes. A MD5-hashed entry was checked based on if its header uses "md5" and if the string length matches what is expected. Unfortunately the code never checked if the hash only used hexadecimal characters after the three-character prefix. Fix 9.6 down to 9.4, where this code is present. This area of the code has changed in 10 and upwards with the introduction of SCRAM, which led to a different fix committed as of ccae190. Reported-by: Tom Lane Author: Michael Paquier Reviewed-by: Jonathan Katz Discussion: https://email@example.com Backpatch-through: 9.4
Repair assorted issues in locale data extraction.
commit : c0bafc5e58c5043038c98bfd1591e5257202e6af author : Tom Lane <firstname.lastname@example.org> date : Tue, 23 Apr 2019 18:51:31 -0400 committer: Tom Lane <email@example.com> date : Tue, 23 Apr 2019 18:51:31 -0400
cache_locale_time (extraction of LC_TIME-related info) had never been taught the lessons we previously learned about extraction of info related to LC_MONETARY and LC_NUMERIC. Specifically, commit 95a777c61 taught PGLC_localeconv() that data coming out of localeconv() was in an encoding determined by the relevant locale, but we didn't realize that there's a similar issue with strftime(). And commit a4930e7ca hardened PGLC_localeconv() against errors occurring partway through, but failed to do likewise for cache_locale_time(). So, rearrange the latter function to perform encoding conversion and not risk failure while it's got the locales set to temporary values. This time around I also changed PGLC_localeconv() to treat it as FATAL if it can't restore the previous settings of the locale values. There is no reason (except possibly OOM) for that to fail, and proceeding with the wrong locale values seems like a seriously bad idea --- especially on Windows where we have to also temporarily change LC_CTYPE. Also, protect against the possibility that we can't identify the codeset reported for LC_MONETARY or LC_NUMERIC; rather than just failing, try to validate the data without conversion. The user-visible symptom this fixes is that if LC_TIME is set to a locale name that implies an encoding different from the database encoding, non-ASCII localized day and month names would be retrieved in the wrong encoding, leading to either unexpected encoding-conversion error reports or wrong output from to_char(). The other possible failure modes are unlikely enough that we've not seen reports of them, AFAIK. The encoding conversion problems do not manifest on Windows, since we'd already created special-case code to handle that issue there. Per report from Juan José Santamaría Flecha. Back-patch to all supported versions. Juan José Santamaría Flecha and Tom Lane Discussion: https://postgr.es/m/CAC+AXB22So5aZm2vZe+MChYXec7gWfr-n-SK-iO091R0P_1Tew@mail.gmail.com
postgresql.conf.sample: add proper defaults for include actions
commit : b71f78c24ff689c92d0edbfc929e0fe798f86622 author : Bruce Momjian <firstname.lastname@example.org> date : Wed, 17 Apr 2019 18:12:10 -0400 committer: Bruce Momjian <email@example.com> date : Wed, 17 Apr 2019 18:12:10 -0400
Previously, include actions include_dir, include_if_exists, and include listed commented-out values which were not the defaults, which is inconsistent with other entries. Instead, replace them with '', which is the default value. Reported-by: Emanuel Araújo Discussion: https://postgr.es/m/CAMuTAkYMx6Q27wpELDR3_v9aG443y7ZjeXu15_+1nGUjhMWOJA@mail.gmail.com Backpatch-through: 9.4
docs: clarify pg_upgrade's recovery behavior
commit : 39a93fd294a0442292de70f2e42e9b1da1c4e3b6 author : Bruce Momjian <firstname.lastname@example.org> date : Wed, 17 Apr 2019 18:01:01 -0400 committer: Bruce Momjian <email@example.com> date : Wed, 17 Apr 2019 18:01:01 -0400
The previous paragraph trying to explain --check, --link, and no --link modes and the various points of failure was too complex. Instead, use bullet lists and sublists. Reported-by: Daniel Gustafsson Discussion: https://postgr.es/m/qtqiv7hI87s_Xvz5ZXHCaH-1-_AZGpIDJowzlRjF3-AbCr3RhSNydM_JCuJ8DE4WZozrtxhIWmyYTbv0syKyfGB6cYMQitp9yN-NZMm-oAofirstname.lastname@example.org Backpatch-through: 9.4
Consistently test for in-use shared memory.
commit : 3ef5e16c618177160bc5f95216dccab3588defdc author : Noah Misch <email@example.com> date : Fri, 12 Apr 2019 22:36:38 -0700 committer: Noah Misch <firstname.lastname@example.org> date : Fri, 12 Apr 2019 22:36:38 -0700
postmaster startup scrutinizes any shared memory segment recorded in postmaster.pid, exiting if that segment matches the current data directory and has an attached process. When the postmaster.pid file was missing, a starting postmaster used weaker checks. Change to use the same checks in both scenarios. This increases the chance of a startup failure, in lieu of data corruption, if the DBA does "kill -9 `head -n1 postmaster.pid` && rm postmaster.pid && pg_ctl -w start". A postmaster will no longer stop if shmat() of an old segment fails with EACCES. A postmaster will no longer recycle segments pertaining to other data directories. That's good for production, but it's bad for integration tests that crash a postmaster and immediately delete its data directory. Such a test now leaks a segment indefinitely. No "make check-world" test does that. win32_shmem.c already avoided all these problems. In 9.6 and later, enhance PostgresNode to facilitate testing. Back-patch to 9.4 (all supported versions). Reviewed (in earlier versions) by Daniel Gustafsson and Kyotaro HORIGUCHI. Discussion: https://postgr.es/m/20190408064141.GA2016666@rfd.leadboat.com
Fix off-by-one check that can lead to a memory overflow in ecpg.
commit : 12c42a543ac80f44e02173c2d6d51ffe18de2703 author : Michael Meskes <email@example.com> date : Thu, 11 Apr 2019 20:56:17 +0200 committer: Michael Meskes <firstname.lastname@example.org> date : Thu, 11 Apr 2019 20:56:17 +0200
Patch by Liu Huailing <email@example.com>
doc: adjust libpq wording to be neither/nor
commit : f4daf89582196596b947b9be96394a1f63a986dc author : Bruce Momjian <firstname.lastname@example.org> date : Thu, 11 Apr 2019 13:25:33 -0400 committer: Bruce Momjian <email@example.com> date : Thu, 11 Apr 2019 13:25:33 -0400
Reported-by: firstname.lastname@example.org Discussion: https://email@example.com Backpatch-through: 9.4
Define WIN32_STACK_RLIMIT throughout win32 and cygwin builds.
commit : 3105844e2d9561913387db5c9e00b5161843a6a2 author : Noah Misch <firstname.lastname@example.org> date : Tue, 9 Apr 2019 08:25:39 -0700 committer: Noah Misch <email@example.com> date : Tue, 9 Apr 2019 08:25:39 -0700
The MSVC build system already did this, and commit 617dc6d299c957e2784320382b3277ede01d9c63 used it in a second file. Back-patch to 9.4, like that commit. Discussion: https://postgr.es/m/CAA8=A7_1SWc3+3Z=-utQrQFOtrj_DeohRVt7diA2tZozxsyUOQ@mail.gmail.com
Avoid "could not reattach" by providing space for concurrent allocation.
commit : 203886d3ae22495241a95207e7b62bf859c389c5 author : Noah Misch <firstname.lastname@example.org> date : Mon, 8 Apr 2019 21:39:00 -0700 committer: Noah Misch <email@example.com> date : Mon, 8 Apr 2019 21:39:00 -0700
We've long had reports of intermittent "could not reattach to shared memory" errors on Windows. Buildfarm member dory fails that way when PGSharedMemoryReAttach() execution overlaps with creation of a thread for the process's "default thread pool". Fix that by providing a second region to receive asynchronous allocations that would otherwise intrude into UsedShmemSegAddr. In pgwin32_ReserveSharedMemoryRegion(), stop trying to free reservations landing at incorrect addresses; the caller's next step has been to terminate the affected process. Back-patch to 9.4 (all supported versions). Reviewed by Tom Lane. He also did much of the prerequisite research; see commit bcbf2346d69f6006f126044864dd9383d50d87b4. Discussion: https://postgr.es/m/20190402135442.GA1173872@rfd.leadboat.com
Fix improper interaction of FULL JOINs with lateral references.
commit : c6df3a28ff40ca7124ea6c28adf2e3ff9910faee author : Tom Lane <firstname.lastname@example.org> date : Mon, 8 Apr 2019 16:09:07 -0400 committer: Tom Lane <email@example.com> date : Mon, 8 Apr 2019 16:09:07 -0400
join_is_legal() needs to reject forming certain outer joins in cases where that would lead the planner down a blind alley. However, it mistakenly supposed that the way to handle full joins was to treat them as applying the same constraints as for left joins, only to both sides. That doesn't work, as shown in bug #15741 from Anthony Skorski: given a lateral reference out of a join that's fully enclosed by a full join, the code would fail to believe that any join ordering is legal, resulting in errors like "failed to build any N-way joins". However, we don't really need to consider full joins at all for this purpose, because we effectively force them to be evaluated in syntactic order, and that order is always legal for lateral references. Hence, get rid of this broken logic for full joins and just ignore them instead. This seems to have been an oversight in commit 7e19db0c0. Back-patch to all supported branches, as that was. Discussion: https://firstname.lastname@example.org
Revert "Consistently test for in-use shared memory."
commit : 0777696fe346dbd3b20814acce08dd93b512afcb author : Noah Misch <email@example.com> date : Fri, 5 Apr 2019 00:00:52 -0700 committer: Noah Misch <firstname.lastname@example.org> date : Fri, 5 Apr 2019 00:00:52 -0700
This reverts commits 2f932f71d9f2963bbd201129d7b971c8f5f077fd, 16ee6eaf80a40007a138b60bb5661660058d0422 and 6f0e190056fe441f7cf788ff19b62b13c94f68f3. The buildfarm has revealed several bugs. Back-patch like the original commits. Discussion: https://postgr.es/m/20190404145319.GA1720877@rfd.leadboat.com
Silence -Wimplicit-fallthrough in sysv_shmem.c.
commit : 41547838b97b709e9ee10eecc656703b357ec86a author : Noah Misch <email@example.com> date : Wed, 3 Apr 2019 23:23:35 -0700 committer: Noah Misch <firstname.lastname@example.org> date : Wed, 3 Apr 2019 23:23:35 -0700
Commit 2f932f71d9f2963bbd201129d7b971c8f5f077fd added code that elicits a warning on buildfarm member flaviventris. Back-patch to 9.4, like that commit. Reported by Andres Freund. Discussion: https://email@example.com
Handle USE_MODULE_DB for all tests able to use an installed postmaster.
commit : 624edab582886a40a2a07e4949be82b3d6fe5f1f author : Noah Misch <firstname.lastname@example.org> date : Wed, 3 Apr 2019 17:06:01 -0700 committer: Noah Misch <email@example.com> date : Wed, 3 Apr 2019 17:06:01 -0700
When $(MODULES) and $(MODULE_big) are empty, derive the database name from the first element of $(REGRESS) instead of using a constant string. When deriving the database name from $(MODULES), use its first element instead of the entire list; the earlier approach would fail if any multi-module directory had $(REGRESS) tests. Treat isolation suites and src/pl correspondingly. Under USE_MODULE_DB=1, installcheck-world and check-world no longer reuse any database name in a given postmaster. Buildfarm members axolotl, mandrill and frogfish saw spurious "is being accessed by other users" failures that would not have happened without database name reuse. (The CountOtherDBBackends() 5s deadline expired during DROP DATABASE; a backend for an earlier test suite had used the same database name and had not yet exited.) Back-patch to 9.4 (all supported versions), except bits pertaining to isolation suites. Concept reviewed by Andrew Dunstan, Andres Freund and Tom Lane. Discussion: https://postgr.es/m/20190401135213.GE891537@rfd.leadboat.com
Consistently test for in-use shared memory.
commit : 3a70b66e625707bf3f8856be2a3d0ea74d6170d7 author : Noah Misch <firstname.lastname@example.org> date : Wed, 3 Apr 2019 17:03:46 -0700 committer: Noah Misch <email@example.com> date : Wed, 3 Apr 2019 17:03:46 -0700
postmaster startup scrutinizes any shared memory segment recorded in postmaster.pid, exiting if that segment matches the current data directory and has an attached process. When the postmaster.pid file was missing, a starting postmaster used weaker checks. Change to use the same checks in both scenarios. This increases the chance of a startup failure, in lieu of data corruption, if the DBA does "kill -9 `head -n1 postmaster.pid` && rm postmaster.pid && pg_ctl -w start". A postmaster will no longer recycle segments pertaining to other data directories. That's good for production, but it's bad for integration tests that crash a postmaster and immediately delete its data directory. Such a test now leaks a segment indefinitely. No "make check-world" test does that. win32_shmem.c already avoided all these problems. In 9.6 and later, enhance PostgresNode to facilitate testing. Back-patch to 9.4 (all supported versions). Reviewed by Daniel Gustafsson and Kyotaro HORIGUCHI. Discussion: https://postgr.es/m/20130911033341.GD225735@tornado.leadboat.com
Update HINT for pre-existing shared memory block.
commit : f3461cb8f5b505878f33d6ded15eab6f0e1993dd author : Noah Misch <firstname.lastname@example.org> date : Sun, 31 Mar 2019 19:32:48 -0700 committer: Noah Misch <email@example.com> date : Sun, 31 Mar 2019 19:32:48 -0700
One should almost always terminate an old process, not use a manual removal tool like ipcrm. Removal of the ipcclean script eleven years ago (39627b1ae680cba44f6e56ca5facec4fdbfe9495) and its non-replacement corroborate that manual shm removal is now a niche goal. Back-patch to 9.4 (all supported versions). Reviewed by Daniel Gustafsson and Kyotaro HORIGUCHI. Discussion: https://postgr.es/m/20180812064815.GB2301738@rfd.leadboat.com
Track unowned relations in doubly-linked list
commit : 31737eb43ffd77de9269247dfbd11a2665e6ee57 author : Tomas Vondra <firstname.lastname@example.org> date : Wed, 27 Mar 2019 02:39:39 +0100 committer: Tomas Vondra <email@example.com> date : Wed, 27 Mar 2019 02:39:39 +0100
Relations dropped in a single transaction are tracked in a list of unowned relations. With large number of dropped relations this resulted in poor performance at the end of a transaction, when the relations are removed from the singly linked list one by one. Commit b4166911 attempted to address this issue (particularly when it happens during recovery) by removing the relations in a reverse order, resulting in O(1) lookups in the list of unowned relations. This did not work reliably, though, and it was possible to trigger the O(N^2) behavior in various ways. Instead of trying to remove the relations in a specific order with respect to the linked list, which seems rather fragile, switch to a regular doubly linked. That allows us to remove relations cheaply no matter where in the list they are. As b4166911 was a bugfix, backpatched to all supported versions, do the same thing here. Reviewed-by: Alvaro Herrera Discussion: https://www.postgresql.org/message-id/flat/80c27103-99e4-1d0c-642c-d9f3b94aaa0a%402ndquadrant.com Backpatch-through: 9.4
Doc: clarify that REASSIGN OWNED doesn't handle default privileges.
commit : 35eaab71d1f776ecec7fd66d060b7e04987b06af author : Tom Lane <firstname.lastname@example.org> date : Mon, 25 Mar 2019 17:18:06 -0400 committer: Tom Lane <email@example.com> date : Mon, 25 Mar 2019 17:18:06 -0400
It doesn't touch regular privileges either, but only the latter was explicitly stated. Discussion: https://firstname.lastname@example.org
Avoid double-free in vacuumlo error path.
commit : 99e414cacd063550d576cb9b1acac503157a3bce author : Tom Lane <email@example.com> date : Sun, 24 Mar 2019 15:13:21 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Sun, 24 Mar 2019 15:13:21 -0400
The code would do "PQclear(res)" twice if lo_unlink failed, evidently due to careless thinking about how far out a "break" would break. Remove the extra PQclear and adjust the loop logic so that we'll fall out of both levels of loop after an error, as was clearly the intent. Spotted by Coverity. I have no idea why it took this long to notice, since the bug has been there since commit 67ccbb080. Accordingly, back-patch to all supported branches.
Fix WAL format incompatibility introduced by backpatching of 52ac6cd2d0
commit : d1166af192ed0bb40ed9dc53d4e799117403233b author : Alexander Korotkov <email@example.com> date : Sun, 24 Mar 2019 15:26:45 +0300 committer: Alexander Korotkov <firstname.lastname@example.org> date : Sun, 24 Mar 2019 15:26:45 +0300
52ac6cd2d0 added new field to ginxlogDeletePage and was backpatched to 9.4. That led to problems when patched postgres instance applies WAL records generated by non-patched one. WAL records generated by non-patched instance don't contain new field, which patched one is expecting to see. Thankfully, we can distinguish patched and non-patched WAL records by their data size. If we see that WAL record is generated by non-patched instance, we skip processing of new field. This commit comes with some assertions. In particular, if it appears that on some platform struct data size didn't change then static assertion will trigger. Reported-by: Simon Riggs Discussion: https://postgr.es/m/CANP8%2Bj%2BK4whxf7ET7%2BgO%2BG-baC3-WxqqH%3DnV4X2CgfEPA3Yu3g%40mail.gmail.com Author: Alexander Korotkov Reviewed-by: Simon Riggs, Alvaro Herrera Backpatch-through: 9.4
Remove inadequate check for duplicate "xml" PI.
commit : 432356a9def4e03cb96babd396e17bbf5a59b2b1 author : Tom Lane <email@example.com> date : Sat, 23 Mar 2019 17:40:19 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Sat, 23 Mar 2019 17:40:19 -0400
I failed to think about PIs starting with "xml". We don't really need this check at all, so just take it out. Oversight in commit 8d1dadb25 et al.
Revert strlen -> strnlen optimization pre-v11.
commit : d5e9e2330e3a4d06b6686586fde6ea332c95be61 author : Tom Lane <email@example.com> date : Sat, 23 Mar 2019 17:35:05 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Sat, 23 Mar 2019 17:35:05 -0400
We don't have a src/port substitute for that function in older branches, so it fails on platforms lacking the function natively. Per buildfarm.
Ensure xmloption = content while restoring pg_dump output.
commit : 8ba485422933e0ae48e8c4e1686ea59cfab75111 author : Tom Lane <email@example.com> date : Sat, 23 Mar 2019 16:51:26 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Sat, 23 Mar 2019 16:51:26 -0400
In combination with the previous commit, this ensures that valid XML data can always be dumped and reloaded, whether it is "document" or "content". Discussion: https://postgr.es/m/CAN-V+g-6JqUQEQZ55Q3toXEN6d5Ez5uvzL4VR+8KtvJKj31taw@mail.gmail.com
Accept XML documents when xmloption = content, as required by SQL:2006+.
commit : 78f84fe0f478142c490dcffd5eb46704b0131b9e author : Tom Lane <email@example.com> date : Sat, 23 Mar 2019 16:24:30 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Sat, 23 Mar 2019 16:24:30 -0400
Previously we were using the SQL:2003 definition, which doesn't allow this, but that creates a serious dump/restore gotcha: there is no setting of xmloption that will allow all valid XML data. Hence, switch to the 2006 definition. Since libxml doesn't accept <!DOCTYPE> directives in the mode we use for CONTENT parsing, the implementation is to detect <!DOCTYPE> in the input and switch to DOCUMENT parsing mode. This should not cost much, because <!DOCTYPE> should be close to the front of the input if it's there at all. It's possible that this causes the error messages for malformed input to be slightly different than they were before, if said input includes <!DOCTYPE>; but that does not seem like a big problem. In passing, buy back a few cycles in parsing of large XML documents by not doing strlen() of the whole input in parse_xml_decl(). Back-patch because dump/restore failures are not nice. This change shouldn't break any cases that worked before, so it seems safe to back-patch. Chapman Flack (revised a bit by me) Discussion: https://postgr.es/m/CAN-V+g-6JqUQEQZ55Q3toXEN6d5Ez5uvzL4VR+8KtvJKj31taw@mail.gmail.com
Make checkpoint requests more robust.
commit : a1695035b3612518ee2c5dd2578b769aa6afab93 author : Tom Lane <email@example.com> date : Tue, 19 Mar 2019 12:49:27 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Tue, 19 Mar 2019 12:49:27 -0400
Commit 6f6a6d8b1 introduced a delay of up to 2 seconds if we're trying to request a checkpoint but the checkpointer hasn't started yet (or, much less likely, our kill() call fails). However buildfarm experience shows that that's not quite enough for slow or heavily-loaded machines. There's no good reason to assume that the checkpointer won't start eventually, so we may as well make the timeout much longer, say 60 sec. However, if the caller didn't say CHECKPOINT_WAIT, it seems like a bad idea to be waiting at all, much less for as long as 60 sec. We can remove the need for that, and make this whole thing more robust, by adjusting the code so that the existence of a pending checkpoint request is clear from the contents of shared memory, and making sure that the checkpointer process will notice it at startup even if it did not get a signal. In this way there's no need for a non-CHECKPOINT_WAIT call to wait at all; if it can't send the signal, it can nonetheless assume that the checkpointer will eventually service the request. A potential downside of this change is that "kill -INT" on the checkpointer process is no longer enough to trigger a checkpoint, should anyone be relying on something so hacky. But there's no obvious reason to do it like that rather than issuing a plain old CHECKPOINT command, so we'll assume that nobody is. There doesn't seem to be a way to preserve this undocumented quasi-feature without introducing race conditions. Since a principal reason for messing with this is to prevent intermittent buildfarm failures, back-patch to all supported branches. Discussion: https://email@example.com
Ensure dummy paths have correct required_outer if rel is parameterized.
commit : 98f8ffa18ebe5d9cda40314002432635990c5b1d author : Tom Lane <firstname.lastname@example.org> date : Thu, 14 Mar 2019 12:16:10 -0400 committer: Tom Lane <email@example.com> date : Thu, 14 Mar 2019 12:16:10 -0400
The assertions added by commits 34ea1ab7f et al found another problem: set_dummy_rel_pathlist and mark_dummy_rel were failing to label the dummy paths they create with the correct outer_relids, in case the relation is necessarily parameterized due to having lateral references in its tlist. It's likely that this has no user-visible consequences in production builds, at the moment; but still an assertion failure is a bad thing, so back-patch the fix. Per bug #15694 from Roman Zharkov (via Alexander Lakhin) and an independent report by Tushar Ahuja. Discussion: https://firstname.lastname@example.org Discussion: https://email@example.com
Fix potential memory access violation in ecpg if filename of include file is shorter than 2 characters.
commit : e7e78f1598281a7543be563fb19c79b99826c943 author : Michael Meskes <firstname.lastname@example.org> date : Mon, 11 Mar 2019 16:11:16 +0100 committer: Michael Meskes <email@example.com> date : Mon, 11 Mar 2019 16:11:16 +0100
Patch by: "Wu, Fei" <firstname.lastname@example.org>
Disallow NaN as a value for floating-point GUCs.
commit : e04bb261633dd76d6dc2e250c92f33330ec2891e author : Tom Lane <email@example.com> date : Sun, 10 Mar 2019 12:58:52 -0400 committer: Tom Lane <firstname.lastname@example.org> date : Sun, 10 Mar 2019 12:58:52 -0400
None of the code that uses GUC values is really prepared for them to hold NaN, but parse_real() didn't have any defense against accepting such a value. Treat it the same as a syntax error. I haven't attempted to analyze the exact consequences of setting any of the float GUCs to NaN, but since they're quite unlikely to be good, this seems like a back-patchable bug fix. Note: we don't need an explicit test for +-Infinity because those will be rejected by existing range checks. I added a regression test for that in HEAD, but not older branches because the spelling of the value in the error message will be platform-dependent in branches where we don't always use port/snprintf.c. Discussion: https://email@example.com
Simplify release-note links to back branches.
commit : bb2da63c7e9636fea04fdd635e537737da75ad80 author : Tom Lane <firstname.lastname@example.org> date : Sat, 9 Mar 2019 18:42:19 -0500 committer: Tom Lane <email@example.com> date : Sat, 9 Mar 2019 18:42:19 -0500
Now that https://www.postgresql.org/docs/release/ is populated, replace the stopgap text we had under "Prior Releases" with a pointer to that archive. Discussion: https://firstname.lastname@example.org
Fix error handling of readdir() port implementation on first file lookup
commit : 81f5b32839ff9ab9b055538a4c55cf07d12ec450 author : Michael Paquier <email@example.com> date : Mon, 4 Mar 2019 09:50:24 +0900 committer: Michael Paquier <firstname.lastname@example.org> date : Mon, 4 Mar 2019 09:50:24 +0900
The implementation of readdir() in src/port/ which gets used by MSVC has been added in 399a36a, and since the beginning it considers all errors on the first file lookup as ENOENT, setting errno accordingly and letting the routine caller think that the directory is empty. While this is normally enough for the case of the backend, this can confuse callers of this routine on Windows as all errors would map to the same behavior. So, for example, even permission errors would be thought as having an empty directory, while there could be contents in it. This commit changes the error handling so as readdir() gets a behavior similar to native implementations: force errno=0 when seeing ERROR_FILE_NOT_FOUND as error and consider other errors as plain failures. While looking at the patch, I noticed that MinGW does not enforce errno=0 when looking at the first file, but it gets enforced on the next file lookups. A comment related to that was incorrect in the code. Reported-by: Yuri Kurenkov Diagnosed-by: Yuri Kurenkov, Grigory Smolkin Author: Konstantin Knizhnik Reviewed-by: Andrew Dunstan, Michael Paquier Discussion: https://email@example.com Backpatch-through: 9.4
Further fixing for multi-row VALUES lists for updatable views.
commit : 431471e1f2a2e324b0d701c4d4fddd44f96b397a author : Dean Rasheed <firstname.lastname@example.org> date : Sun, 3 Mar 2019 10:58:45 +0000 committer: Dean Rasheed <email@example.com> date : Sun, 3 Mar 2019 10:58:45 +0000
Previously, rewriteTargetListIU() generated a list of attribute numbers from the targetlist, which were passed to rewriteValuesRTE(), which expected them to contain the same number of entries as there are columns in the VALUES RTE, and to be in the same order. That was fine when the target relation was a table, but for an updatable view it could be broken in at least three different ways --- rewriteTargetListIU() could insert additional targetlist entries for view columns with defaults, the view columns could be in a different order from the columns of the underlying base relation, and targetlist entries could be merged together when assigning to elements of an array or composite type. As a result, when recursing to the base relation, the list of attribute numbers generated from the rewritten targetlist could no longer be relied upon to match the columns of the VALUES RTE. We got away with that prior to 41531e42d3 because it used to always be the case that rewriteValuesRTE() did nothing for the underlying base relation, since all DEFAULTS had already been replaced when it was initially invoked for the view, but that was incorrect because it failed to apply defaults from the base relation. Fix this by examining the targetlist entries more carefully and picking out just those that are simple Vars referencing the VALUES RTE. That's sufficient for the purposes of rewriteValuesRTE(), which is only responsible for dealing with DEFAULT items in the VALUES RTE. Any DEFAULT item in the VALUES RTE that doesn't have a matching simple-Var-assignment in the targetlist is an error which we complain about, but in theory that ought to be impossible. Additionally, move this code into rewriteValuesRTE() to give a clearer separation of concerns between the 2 functions. There is no need for rewriteTargetListIU() to know about the details of the VALUES RTE. While at it, fix the comment for rewriteValuesRTE() which claimed that it doesn't support array element and field assignments --- that hasn't been true since a3c7a993d5 (9.6 and later). Back-patch to all supported versions, with minor differences for the pre-9.6 branches, which don't support array element and field assignments to the same column in multi-row VALUES lists. Reviewed by Amit Langote. Discussion: https://firstname.lastname@example.org
Improve documentation of data_sync_retry
commit : 00382f373239feff9dacc99dd2e5dace50da60de author : Michael Paquier <email@example.com> date : Thu, 28 Feb 2019 11:02:40 +0900 committer: Michael Paquier <firstname.lastname@example.org> date : Thu, 28 Feb 2019 11:02:40 +0900
Reflecting an updated parameter value requires a server restart, which was not mentioned in the documentation and in postgresql.conf.sample. Reported-by: Thomas Poty Discussion: https://email@example.com
Fix ecpg bugs caused by missing semicolons in the backend grammar.
commit : 53c2bb78db9f9bb47de14d61fc8b2a71cf1052ca author : Tom Lane <firstname.lastname@example.org> date : Sun, 24 Feb 2019 12:51:51 -0500 committer: Tom Lane <email@example.com> date : Sun, 24 Feb 2019 12:51:51 -0500
The Bison documentation clearly states that a semicolon is required after every grammar rule, and our scripts that generate ecpg's grammar from the backend's implicitly assumed this is true. But it turns out that only ancient versions of Bison actually enforce that. There have been a couple of rules without trailing semicolons in gram.y for some time, and as a consequence, ecpg's grammar was faulty and produced wrong output for the affected statements. To fix, add the missing semis, and add some cross-checks to ecpg's scripts so that they'll bleat if we mess this up again. The cases that were broken were: * "SET variable = DEFAULT" (but not "SET variable TO DEFAULT"), as well as allied syntaxes such as ALTER SYSTEM SET ... DEFAULT. These produced syntactically invalid output that the server would reject. * Multiple type names in DROP TYPE/DOMAIN commands. Only the first type name would be listed in the emitted command. Per report from Daisuke Higuchi. Back-patch to all supported versions. Discussion: https://postgr.es/m/1803D792815FC24D871C00D17AE95905DB51CE@g01jpexmbkw24
Tolerate EINVAL when calling fsync() on a directory.
commit : ede6b19624d1836635d8bd2c8cdc6a694d8ff678 author : Thomas Munro <firstname.lastname@example.org> date : Sun, 24 Feb 2019 23:59:26 +1300 committer: Thomas Munro <email@example.com> date : Sun, 24 Feb 2019 23:59:26 +1300
Previously, we tolerated EBADF as a way for the operating system to indicate that it doesn't support fsync() on a directory. Tolerate EINVAL too, for older versions of Linux CIFS. Bug #15636. Back-patch all the way. Reported-by: John Klann Discussion: https://firstname.lastname@example.org
Fix plan created for inherited UPDATE/DELETE with all tables excluded.
commit : 99a1554a2680d25f0b54d4209ccfc76c443df6e3 author : Tom Lane <email@example.com> date : Fri, 22 Feb 2019 12:23:00 -0500 committer: Tom Lane <firstname.lastname@example.org> date : Fri, 22 Feb 2019 12:23:00 -0500
In the case where inheritance_planner() finds that every table has been excluded by constraints, it thought it could get away with making a plan consisting of just a dummy Result node. While certainly there's no updating or deleting to be done, this had two user-visible problems: the plan did not report the correct set of output columns when a RETURNING clause was present, and if there were any statement-level triggers that should be fired, it didn't fire them. Hence, rather than only generating the dummy Result, we need to stick a valid ModifyTable node on top, which requires a tad more effort here. It's been broken this way for as long as inheritance_planner() has known about deleting excluded subplans at all (cf commit 635d42e9c), so back-patch to all supported branches. Amit Langote and Tom Lane, per a report from Petr Fedorov. Discussion: https://email@example.com
Fix DEFAULT-handling in multi-row VALUES lists for updatable views.
commit : 5a73edf050ed6740fc085cad05a2e7af9fb1f06e author : Dean Rasheed <firstname.lastname@example.org> date : Wed, 20 Feb 2019 08:19:55 +0000 committer: Dean Rasheed <email@example.com> date : Wed, 20 Feb 2019 08:19:55 +0000
INSERT ... VALUES for a single VALUES row is implemented differently from a multi-row VALUES list, which causes inconsistent behaviour in the way that DEFAULT items are handled. In particular, when inserting into an auto-updatable view on top of a table with a column default, a DEFAULT item in a single VALUES row gets correctly replaced with the table column's default, but for a multi-row VALUES list it is replaced with NULL. Fix this by allowing rewriteValuesRTE() to leave DEFAULT items in the VALUES list untouched if the target relation is an auto-updatable view and has no column default, deferring DEFAULT-expansion until the query against the base relation is rewritten. For all other types of target relation, including tables and trigger- and rule-updatable views, we must continue to replace DEFAULT items with NULL in the absence of a column default. This is somewhat complicated by the fact that if an auto-updatable view has DO ALSO rules attached, the VALUES lists for the product queries need to be handled differently from the original query, since the product queries need to act like rule-updatable views whereas the original query has auto-updatable view semantics. Back-patch to all supported versions. Reported by Roger Curley (bug #15623). Patch by Amit Langote and me. Discussion: https://firstname.lastname@example.org
Mark correctly initial slot snapshots with MVCC type when built
commit : 2ad57e9e9faf5e8a6fdec048636581df7210478f author : Michael Paquier <email@example.com> date : Wed, 20 Feb 2019 12:32:23 +0900 committer: Michael Paquier <firstname.lastname@example.org> date : Wed, 20 Feb 2019 12:32:23 +0900
When building an initial slot snapshot, snapshots are marked with historic MVCC snapshots as type with the marker field being set in SnapBuildBuildSnapshot() but not overriden in SnapBuildExportSnapshot(). Existing callers of SnapBuildBuildSnapshot() do not care about the type of snapshot used, but extensions calling it actually may, as reported. Author: Antonin Houska Reviewed-by: Álvaro Herrera, Michael Paquier Discussion: https://postgr.es/m/23215.1527665193@localhost Backpatch-through: 9.4
Fix documentation for dblink_error_message() return value
commit : 64f6a65a00b0b6d768fc3fc49047524f4f99a54d author : Joe Conway <email@example.com> date : Sun, 17 Feb 2019 13:14:29 -0500 committer: Joe Conway <firstname.lastname@example.org> date : Sun, 17 Feb 2019 13:14:29 -0500
The dblink documentation claims that an empty string is returned if there has been no error, however OK is actually returned in that case. Also, clarify that an async error may not be seen unless dblink_is_busy() or dblink_get_result() have been called first. Backpatch to all supported branches. Reported-by: realyota Backpatch-through: 9.4 Discussion: https://email@example.com
Fix CREATE VIEW to allow zero-column views.
commit : 9fdc49d08fe405f0503f93849844a73517bd5cbd author : Tom Lane <firstname.lastname@example.org> date : Sun, 17 Feb 2019 12:37:32 -0500 committer: Tom Lane <email@example.com> date : Sun, 17 Feb 2019 12:37:32 -0500
We should logically have allowed this case when we allowed zero-column tables, but it was overlooked. Although this might be thought a feature addition, it's really a bug fix, because it was possible to create a zero-column view via the convert-table-to-view code path, and then you'd have a situation where dump/reload would fail. Hence, back-patch to all supported branches. Arrange the added test cases to provide coverage of the related pg_dump code paths (since these views will be dumped and reloaded during the pg_upgrade regression test). I also made them test the case where pg_dump has to postpone the view rule into post-data, which disturbingly had no regression coverage before. Report and patch by Ashutosh Sharma (test case by me) Discussion: https://postgr.es/m/CAE9k0PkmHdeSaeZt2ujnb_cKucmK3sDDceDzw7+d5UZoNJPYOg@mail.gmail.com
Fix race in dsm_attach() when handles are reused.
commit : bd195071fd2be49392bf1993397f638d96315ee0 author : Thomas Munro <firstname.lastname@example.org> date : Fri, 15 Feb 2019 10:19:11 +1300 committer: Thomas Munro <email@example.com> date : Fri, 15 Feb 2019 10:19:11 +1300
DSM handle values can be reused as soon as the underlying shared memory object has been destroyed. That means that for a brief moment we might have two DSM slots with the same handle. While trying to attach, if we encounter a slot with refcnt == 1, meaning that it is currently being destroyed, we should continue our search in case the same handle exists in another slot. The race manifested as a rare "dsa_area could not attach to segment" error, and was more likely in 10 and 11 due to the lack of distinct seed for random() in parallel workers. It was made very unlikely in in master by commit 197e4af9, and older releases don't usually create new DSM segments in background workers so it was also unlikely there. This fixes the root cause of bug report #15585, in which the error could also sometimes result in a self-deadlock in the error path. It's not yet clear if further changes are needed to avoid that failure mode. Back-patch to 9.4, where dsm.c arrived. Author: Thomas Munro Reported-by: Justin Pryzby, Sergei Kornilov Discussion: https://postgr.es/m/20190207014719.GJ29720@telsasoft.com Discussion: https://firstname.lastname@example.org
Relax overly strict assertion
commit : 350cdcd5e6dd505454435af8d7640a4b5042fcfd author : Alvaro Herrera <email@example.com> date : Tue, 12 Feb 2019 18:42:37 -0300 committer: Alvaro Herrera <firstname.lastname@example.org> date : Tue, 12 Feb 2019 18:42:37 -0300
Ever since its birth, ReorderBufferBuildTupleCidHash() has contained an assertion that a catalog tuple cannot change Cmax after acquiring one. But that's wrong: if a subtransaction executes DDL that affects that catalog tuple, and later aborts and another DDL affects the same tuple, it will change Cmax. Relax the assertion to merely verify that the Cmax remains valid and monotonically increasing, instead. Add a test that tickles the relevant code. Diagnosed by, and initial patch submitted by: Arseny Sher Co-authored-by: Arseny Sher Discussion: https://postgr.es/m/874l9p8hyw.fsf@ars-thinkpad
Fix erroneous error reports in snapbuild.c.
commit : 42baa60f6bc70cfffcba6a89a4b7a8b0a929b54d author : Tom Lane <email@example.com> date : Tue, 12 Feb 2019 01:12:52 -0500 committer: Tom Lane <firstname.lastname@example.org> date : Tue, 12 Feb 2019 01:12:52 -0500
It's pretty unhelpful to report the wrong file name in a complaint about syscall failure, but SnapBuildSerialize managed to do that twice in a span of 50 lines. Also fix half a dozen missing or poorly-chosen errcode assignments; that's mostly cosmetic, but still wrong. Noted while studying recent failures on buildfarm member nightjar. I'm not sure whether those reports are actually giving the wrong filename, because there are two places here with identically spelled error messages. The other one is specifically coded not to report ENOENT, but if it's this one, how could we be getting ENOENT from open() with O_CREAT? Need to sit back and await results. However, these ereports are clearly broken from birth, so back-patch.