popen: fix a race between setpgrp() and killpg()
In brief: `vfork()` on Mac OS 12 and newer doesn't suspend the parent process, so we should wait for `setpgrp()` to use `killpg()`. See more detailed description of the problem in a comment of the `popen_wait_group_leadership()` function. The solution is to spin in a loop and check child's process group. It looks as the most simple and direct solution. Other possible solutions requires to estimate cons and pros of using extra file descriptor or assigning a signal number for the child -> parent communication. There are the following alternatives and variations: * Create a pipe and notify the parent from the child about the `setpgrp()` call. It costs extra file descriptor, so I decided to don't do that. However if we'll need some channel to deliver information from the child to the parent for another task, it'll worth to reimplement this function too. One possible place, where we may need such channel is delivery of child's errors to the parent. Now the child writes them directly to logger's fd and it requires some tricky code to keep and close the descriptor at right points. Also it doesn't allow to catch those errors in the parent, but we may need it for #4925. * Notify the parent about `setpgrp()` using a signal. It seems too greedly to assign a specific signal for such local problem. It is also unclear how to guarantee that it'll not break any user's code: a user can load a dynamic library, which uses some signals on its own. However we can consider using this approach here if we'll design some common interprocess notification system. * We can use the fiber cond or the `popen_wait_timeout()` function from PR #7648 to react to the child termination instantly. It would complicate the code and anyway wouldn't allow to react instantly on `setpgrp()` in the child. Also it assumes yielding during the wait (see below). * Wait until `setpgrp()` in `popen_send_signal()` instead of `popen_new()`. It would add yielding/waiting inside `popen_send_signal()` and likely will extend a set of its possible exit situations. It is undesirable: this function should have simple and predictable behavior. * Finally, we considered yielding in `popen_wait_group_leadership()` instead of sleeping the whole tx thread. `<popen handle>:new()` doesn't yield at the moment and a user's code may lean on this fact. Yielding would allow to achieve better throughtput (amount of parallel requests per second), but we don't take much care to performance on Mac OS. The primary goal for this platform is to offer the same behavior as on Linux to allow development of applications. I didn't replace `vfork()` with `fork()` on Mac OS, because `vfork()` works and I don't know consequences of calling `pthread_atfork()` handlers in a child created by popen. See the comment in `popen_new()` near to `vfork()` call: it warns about possible mutex double locks. This topic will be investigated further in #6674. Fixes #7658 NO_DOC=fixes incorrect behavior, no need to document the bug NO_TEST=already tested by app-tap/popen.test.lua (cherry picked from commit e2207fdc)
Showing
- changelogs/unreleased/popen-mac-os-race.md 3 additions, 0 deletionschangelogs/unreleased/popen-mac-os-race.md
- src/lib/core/popen.c 103 additions, 2 deletionssrc/lib/core/popen.c
- src/lua/popen.c 2 additions, 2 deletionssrc/lua/popen.c
- test/app-tap/popen.skipcond 0 additions, 4 deletionstest/app-tap/popen.skipcond
Loading
Please register or sign in to comment