popen: add popen Lua module
Fixes #4031 Co-developed-by:Cyrill Gorcunov <gorcunov@gmail.com> Co-developed-by:
Igor Munkin <imun@tarantool.org> Acked-by:
Cyrill Gorcunov <gorcunov@gmail.com> Reviewed-by:
Igor Munkin <imun@tarantool.org> @TarantoolBot document Title: popen module ``` Overview ======== Tarantool supports execution of external programs similarly to well known Python's `subprocess` or Ruby's `Open3`. Note though the `popen` module does not match one to one to the helpers these languages provide and provides only basic functions. The popen object creation is implemented via `vfork()` system call which means the caller thread is blocked until execution of a child process begins. Module functions ================ The `popen` module provides two functions to create that named popen object: `popen.shell` which is the similar to libc `popen` syscall and `popen.new` to create popen object with more specific options. `popen.shell(command[, mode]) -> handle, err` --------------------------------------------- Execute a shell command. @param command a command to run, mandatory @param mode communication mode, optional 'w' to use ph:write() 'r' to use ph:read() 'R' to use ph:read({stderr = true}) nil inherit parent's std* file descriptors Several mode characters can be set together: 'rw', 'rRw', etc. This function is just shortcut for popen.new({command}, opts) with opts.{shell,setsid,group_signal} set to `true` and and opts.{stdin,stdout,stderr} set based on `mode` parameter. All std* streams are inherited from parent by default if it is not changed using mode: 'r' for stdout, 'R' for stderr, 'w' for stdin. Raise an error on incorrect parameters: - IllegalParams: incorrect type or value of a parameter. Return a popen handle on success. Return `nil, err` on a failure. @see popen.new() for possible reasons. Example: | local popen = require('popen') | | -- Run the program and save its handle. | local ph = popen.shell('date', 'r') | | -- Read program's output, strip trailing newline. | local date = ph:read():rstrip() | | -- Free resources. The process is killed (but 'date' | -- exits itself anyway). | ph:close() | | print(date) Execute 'sh -c date' command, read the output and close the popen object. Unix defines a text file as a sequence of lines, each ends with the newline symbol. The same convention is usually applied for a text output of a command (so when it is redirected to a file, the file will be correct). However internally an application usually operates on strings, which are NOT newline terminated (e.g. literals for error messages). The newline is usually added right before a string is written to the outside world (stdout, console or log). :rstrip() in the example above is shown for this sake. `popen.new(argv[, opts]) -> handle, err` ---------------------------------------- Execute a child program in a new process. @param argv an array of a program to run with command line options, mandatory; absolute path to the program is required when @a opts.shell is false (default) @param opts table of options @param opts.stdin action on STDIN_FILENO @param opts.stdout action on STDOUT_FILENO @param opts.stderr action on STDERR_FILENO File descriptor actions: popen.opts.INHERIT (== 'inherit') [default] inherit the fd from the parent popen.opts.DEVNULL (== 'devnull') open /dev/null on the fd popen.opts.CLOSE (== 'close') close the fd popen.opts.PIPE (== 'pipe') feed data from/to the fd to parent using a pipe @param opts.env a table of environment variables to be used inside a process; key is a variable name, value is a variable value. - when is not set then the current environment is inherited; - if set to an empty table then the environment will be dropped - if set then the environment will be replaced @param opts.shell (boolean, default: false) true run a child process via 'sh -c "${opts.argv}"' false call the executable directly @param opts.setsid (boolean, default: false) true run the program in a new session false run the program in the tarantool instance's session and process group @param opts.close_fds (boolean, default: true) true close all inherited fds from a parent false don't do that @param opts.restore_signals (boolean, default: true) true reset all signal actions modified in parent's process false inherit changed actions @param opts.group_signal (boolean, default: false) true send signal to a child process group (only when opts.setsid is enabled) false send signal to a child process only @param opts.keep_child (boolean, default: false) true don't send SIGKILL to a child process at freeing (by :close() or Lua GC) false send SIGKILL to a child process (or a process group if opts.group_signal is enabled) at :close() or collecting of the handle by Lua GC The returned handle provides :close() method to explicitly release all occupied resources (including the child process itself if @a opts.keep_child is not set). However if the method is not called for a handle during its lifetime, the same freeing actions will be triggered by Lua GC. It is recommended to use opts.setsid + opts.group_signal if a child process may spawn its own childs and they all should be killed together. Note: A signal will not be sent if the child process is already dead: otherwise we might kill another process that occupies the same PID later. This means that if the child process dies before its own childs, the function will not send a signal to the process group even when opts.setsid and opts.group_signal are set. Use os.environ() to pass copy of current environment with several replacements (see example 2 below). Raise an error on incorrect parameters: - IllegalParams: incorrect type or value of a parameter. - IllegalParams: group signal is set, while setsid is not. Return a popen handle on success. Return `nil, err` on a failure. Possible reasons: - SystemError: dup(), fcntl(), pipe(), vfork() or close() fails in the parent process. - SystemError: (temporary restriction) the parent process has closed stdin, stdout or stderr. - OutOfMemory: unable to allocate the handle or a temporary buffer. Example 1: | local popen = require('popen') | | local ph = popen.new({'/bin/date'}, { | stdout = popen.opts.PIPE, | }) | local date = ph:read():rstrip() | ph:close() | print(date) -- Thu 16 Apr 2020 01:40:56 AM MSK Execute 'date' command, read the result and close the popen object. Example 2: | local popen = require('popen') | | local env = os.environ() | env['FOO'] = 'bar' | | local ph = popen.new({'echo "${FOO}"'}, { | stdout = popen.opts.PIPE, | shell = true, | env = env, | }) | local res = ph:read():rstrip() | ph:close() | print(res) -- bar It is quite similar to the previous one, but sets the environment variable and uses shell builtin 'echo' to show it. Example 3: | local popen = require('popen') | | local ph = popen.new({'echo hello >&2'}, { -- !! | stderr = popen.opts.PIPE, -- !! | shell = true, | }) | local res = ph:read({stderr = true}):rstrip() | ph:close() | print(res) -- hello This example demonstrates how to capture child's stderr. Example 4: | local function call_jq(input, filter) | -- Start jq process, connect to stdin, stdout and stderr. | local jq_argv = {'/usr/bin/jq', '-M', '--unbuffered', filter} | local ph, err = popen.new(jq_argv, { | stdin = popen.opts.PIPE, | stdout = popen.opts.PIPE, | stderr = popen.opts.PIPE, | }) | if ph == nil then return nil, err end | | -- Write input data to child's stdin and send EOF. | local ok, err = ph:write(input) | if not ok then return nil, err end | ph:shutdown({stdin = true}) | | -- Read everything until EOF. | local chunks = {} | while true do | local chunk, err = ph:read() | if chunk == nil then | ph:close() | return nil, err | end | if chunk == '' then break end -- EOF | table.insert(chunks, chunk) | end | | -- Read diagnostics from stderr if any. | local err = ph:read({stderr = true}) | if err ~= '' then | ph:close() | return nil, err | end | | -- Glue all chunks, strip trailing newline. | return table.concat(chunks):rstrip() | end Demonstrates how to run a stream program (like `grep`, `sed` and so), write to its stdin and read from its stdout. The example assumes that input data are small enough to fit a pipe buffer (typically 64 KiB, but depends on a platform and its configuration). It will stuck in :write() for large data. How to handle this case: call :read() in a loop in another fiber (start it before a first :write()). If a process writes large text to stderr, it may fill out stderr pipe buffer and stuck in write(2, ...). So we need to read stderr in a separate fiber to handle this case. Handle methods ============== `popen_handle:read([opts]) -> str, err` --------------------------------------- Read data from a child peer. @param handle handle of a child process @param opts an options table @param opts.stdout whether to read from stdout, boolean (default: true) @param opts.stderr whether to read from stderr, boolean (default: false) @param opts.timeout time quota in seconds (default: 100 years) Read data from stdout or stderr streams with @a timeout. By default it reads from stdout. Set @a opts.stderr to `true` to read from stderr. It is not possible to read from stdout and stderr both in one call. Set either @a opts.stdout or @a opts.stderr. Raise an error on incorrect parameters or when the fiber is cancelled: - IllegalParams: incorrect type or value of a parameter. - IllegalParams: called on a closed handle. - IllegalParams: opts.stdout and opts.stderr are set both - IllegalParams: a requested IO operation is not supported by the handle (stdout / stderr is not piped). - IllegalParams: attempt to operate on a closed file descriptor. - FiberIsCancelled: cancelled by an outside code. Return a string on success, an empty string at EOF. Return `nil, err` on a failure. Possible reasons: - SocketError: an IO error occurs at read(). - TimedOut: @a timeout quota is exceeded. - OutOfMemory: no memory space for a buffer to read into. - LuajitError: ("not enough memory"): no memory space for the Lua string. `popen_handle:write(str[, opts]) -> str, err` --------------------------------------------- Write data to a child peer. @param handle a handle of a child process @param str a string to write @param opts table of options @param opts.timeout time quota in seconds (default: 100 years) Write string @a str to stdin stream of a child process. The function may yield forever if a child process does not read data from stdin and a pipe buffer becomes full. Size of this buffer depends on a platform. Use @a opts.timeout when unsure. When @a opts.timeout is not set, the function blocks (yields the fiber) until all data is written or an error happened. Raise an error on incorrect parameters or when the fiber is cancelled: - IllegalParams: incorrect type or value of a parameter. - IllegalParams: called on a closed handle. - IllegalParams: string length is greater then SSIZE_MAX. - IllegalParams: a requested IO operation is not supported by the handle (stdin is not piped). - IllegalParams: attempt to operate on a closed file descriptor. - FiberIsCancelled: cancelled by an outside code. Return `true` on success. Return `nil, err` on a failure. Possible reasons: - SocketError: an IO error occurs at write(). - TimedOut: @a timeout quota is exceeded. `popen_handle:shutdown(opts) -> true` ------------------------------------------ Close parent's ends of std* fds. @param handle handle of a child process @param opts an options table @param opts.stdin close parent's end of stdin, boolean @param opts.stdout close parent's end of stdout, boolean @param opts.stderr close parent's end of stderr, boolean The main reason to use this function is to send EOF to child's stdin. However parent's end of stdout / stderr may be closed too. The function does not fail on already closed fds (idempotence). However it fails on attempt to close the end of a pipe that was never exist. In other words, only those std* options that were set to popen.opts.PIPE at a handle creation may be used here (for popen.shell: 'r' corresponds to stdout, 'R' to stderr and 'w' to stdin). The function does not close any fds on a failure: either all requested fds are closed or neither of them. Example: | local popen = require('popen') | | local ph = popen.shell('sed s/foo/bar/', 'rw') | ph:write('lorem foo ipsum') | ph:shutdown({stdin = true}) | local res = ph:read() | ph:close() | print(res) -- lorem bar ipsum Raise an error on incorrect parameters: - IllegalParams: an incorrect handle parameter. - IllegalParams: called on a closed handle. - IllegalParams: neither stdin, stdout nor stderr is choosen. - IllegalParams: a requested IO operation is not supported by the handle (one of std* is not piped). Return `true` on success. `popen_handle:terminate() -> ok, err` ------------------------------------- Send SIGTERM signal to a child process. @param handle a handle carries child process to terminate The function only sends SIGTERM signal and does NOT free any resources (popen handle memory and file descriptors). @see popen_handle:signal() for errors and return values. `popen_handle:kill() -> ok, err` -------------------------------- Send SIGKILL signal to a child process. @param handle a handle carries child process to kill The function only sends SIGKILL signal and does NOT free any resources (popen handle memory and file descriptors). @see popen_handle:signal() for errors and return values. `popen_handle:signal(signo) -> ok, err` --------------------------------------- Send signal to a child process. @param handle a handle carries child process to be signaled @param signo signal number to send When opts.setsid and opts.group_signal are set on the handle the signal is sent to the process group rather than to the process. @see popen.new() for details about group signaling. Note: The module offers popen.signal.SIG* constants, because some signals have different numbers on different platforms. Raise an error on incorrect parameters: - IllegalParams: an incorrect handle parameter. - IllegalParams: called on a closed handle. Return `true` if signal is sent. Return `nil, err` on a failure. Possible reasons: - SystemError: a process does not exists anymore Aside of a non-exist process it is also returned for a zombie process or when all processes in a group are zombies (but see note re Mac OS below). - SystemError: invalid signal number - SystemError: no permission to send a signal to a process or a process group It is returned on Mac OS when a signal is sent to a process group, where a group leader is zombie (or when all processes in it are zombies, don't sure). Whether it may appear due to other reasons is unclear. `popen_handle:info() -> res` ---------------------------- Return information about popen handle. @param handle a handle of a child process Raise an error on incorrect parameters: - IllegalParams: an incorrect handle parameter. - IllegalParams: called on a closed handle. Return information about the handle in the following format: { pid = <number> or <nil>, command = <string>, opts = <table>, status = <table>, stdin = one-of( popen.stream.OPEN (== 'open'), popen.stream.CLOSED (== 'closed'), nil, ), stdout = one-of( popen.stream.OPEN (== 'open'), popen.stream.CLOSED (== 'closed'), nil, ), stderr = one-of( popen.stream.OPEN (== 'open'), popen.stream.CLOSED (== 'closed'), nil, ), } `pid` is a process id of the process when it is alive, otherwise `pid` is nil. `command` is a concatenation of space separated arguments that were passed to execve(). Multiword arguments are quoted. Quotes inside arguments are not escaped. `opts` is a table of handle options in the format of popen.new() `opts` parameter. `opts.env` is not shown here, because the environment variables map is not stored in a handle. `status` is a table that represents a process status in the following format: { state = one-of( popen.state.ALIVE (== 'alive'), popen.state.EXITED (== 'exited'), popen.state.SIGNALED (== 'signaled'), ) -- Present when `state` is 'exited'. exit_code = <number>, -- Present when `state` is 'signaled'. signo = <number>, signame = <string>, } `stdin`, `stdout`, `stderr` reflect status of parent's end of a piped stream. When a stream is not piped the field is not present (`nil`). When it is piped, the status may be one of the following: - popen.stream.OPEN (== 'open') - popen.stream.CLOSED (== 'closed') The status may be changed from 'open' to 'closed' by :shutdown({std... = true}) call. Example 1 (tarantool console): | tarantool> require('popen').new({'/usr/bin/touch', '/tmp/foo'}) | --- | - command: /usr/bin/touch /tmp/foo | status: | state: alive | opts: | stdout: inherit | stdin: inherit | group_signal: false | keep_child: false | close_fds: true | restore_signals: true | shell: false | setsid: false | stderr: inherit | pid: 9499 | ... Example 2 (tarantool console): | tarantool> require('popen').shell('grep foo', 'wrR') | --- | - stdout: open | command: sh -c 'grep foo' | stderr: open | status: | state: alive | stdin: open | opts: | stdout: pipe | stdin: pipe | group_signal: true | keep_child: false | close_fds: true | restore_signals: true | shell: true | setsid: true | stderr: pipe | pid: 10497 | ... `popen_handle:wait() -> res` ---------------------------- Wait until a child process get exited or signaled. @param handle a handle of process to wait Raise an error on incorrect parameters or when the fiber is cancelled: - IllegalParams: an incorrect handle parameter. - IllegalParams: called on a closed handle. - FiberIsCancelled: cancelled by an outside code. Return a process status table (the same as ph.status and ph.info().status). @see popen_handle:info() for the format of the table. `popen_handle:close() -> ok, err` --------------------------------- Close a popen handle. @param handle a handle to close Basically it kills a process using SIGKILL and releases all resources assosiated with the popen handle. Details about signaling: - The signal is sent only when opts.keep_child is not set. - The signal is sent only when a process is alive according to the information available on current even loop iteration. (There is a gap here: a zombie may be signaled; it is harmless.) - The signal is sent to a process or a grocess group depending of opts.group_signal. (@see lbox_popen_new() for details of group signaling). Resources are released disregarding of whether a signal sending succeeds: fds are closed, memory is released, the handle is marked as closed. No operation is possible on a closed handle except :close(), which always successful on closed handle (idempotence). Raise an error on incorrect parameters: - IllegalParams: an incorrect handle parameter. The function may return `true` or `nil, err`, but it always frees the handle resources. So any return value usually means success for a caller. The return values are purely informational: it is for logging or same kind of reporting. Possible diagnostics (don't consider them as errors): - SystemError: no permission to send a signal to a process or a process group This diagnostics may appear due to Mac OS behaviour on zombies when opts.group_signal is set, @see lbox_popen_signal(). Whether it may appear due to other reasons is unclear. Always return `true` when a process is known as dead (say, after ph:wait()): no signal will be send, so no 'failure' may appear. Handle fields ============= - popen_handle.pid - popen_handle.command - popen_handle.opts - popen_handle.status - popen_handle.stdin - popen_handle.stdout - popen_handle.stderr See popen_handle:info() for description of those fields. Module constants ================ - popen.opts - INHERIT (== 'inherit') - DEVNULL (== 'devnull') - CLOSE (== 'close') - PIPE (== 'pipe') - popen.signal - SIGTERM (== 9) - SIGKILL (== 15) - ... - popen.state - ALIVE (== 'alive') - EXITED (== 'exited') - SIGNALED (== 'signaled') - popen.stream - OPEN (== 'open') - CLOSED (== 'closed') ``` (cherry picked from commit 34c2789b48eadd36da742f7f761a998889e70544)
Showing
- src/CMakeLists.txt 1 addition, 0 deletionssrc/CMakeLists.txt
- src/lua/init.c 2 additions, 0 deletionssrc/lua/init.c
- src/lua/popen.c 2542 additions, 0 deletionssrc/lua/popen.c
- src/lua/popen.h 44 additions, 0 deletionssrc/lua/popen.h
- test/app-tap/popen.test.lua 605 additions, 0 deletionstest/app-tap/popen.test.lua
Loading
Please register or sign in to comment