Commit a33f3cc7 authored by Vladimir Davydov, committed by Vladimir Davydov

net.box: support graceful shutdown protocol

Closes #5924

@TarantoolBot document
Title: Document graceful shutdown of net.box connections

If a Tarantool server supports the IPROTO graceful shutdown protocol
(`connection.peer_protocol_features` contains `graceful_shutdown` or
`peer_protocol_version` is >= 4), then when the server is asked to exit
(`os.exit()` is called on the server or a `SIGTERM` signal is received),
it won't terminate net.box connections immediately. Instead, the
following happens:

 1. The server stops accepting new connections and sends a special
    'shutdown' packet to all connections that support the graceful
    shutdown protocol.
 2. Upon receiving a 'shutdown' packet, a net.box connection executes
    shutdown triggers. The triggers are installed by the `on_shutdown()`
    method of a net.box connection. The method follows the same protocol
    as `on_connect()`, `on_disconnect()`, and `on_schema_reload()`.
    Triggers are executed asynchronously in a new fiber. The connection
    remains active while triggers are running, so a trigger callback may
    send new requests over the net.box connection.
 3. After shutdown triggers have returned, the connection is switched to
    `graceful_shutdown` state, in which all new requests fail with an
    error. The connection remains in this state until all in-progress
    requests have completed.
 4. Once all in-progress requests have completed, the connection is
    closed and switched to `error` or `error_reconnect` state, depending
    on whether the `reconnect_after` option is set.
 5. Once all connections that support the graceful shutdown protocol are
    closed, the server exits.
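
The client-side part of steps 2 to 4 can be sketched as a small model.
This is not net.box code: `ConnectionModel`, `handle_shutdown_packet()`,
`complete_request()`, and `supports_graceful_shutdown()` are hypothetical
names used for illustration only, and triggers run synchronously here,
whereas the text says net.box runs them asynchronously in a new fiber.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set

# From the text: the protocol is supported if the feature is advertised
# or the peer protocol version is >= 4.
GRACEFUL_SHUTDOWN_MIN_VERSION = 4


def supports_graceful_shutdown(features: Set[str], version: int) -> bool:
    """Hypothetical helper mirroring the support check described above."""
    return ("graceful_shutdown" in features
            or version >= GRACEFUL_SHUTDOWN_MIN_VERSION)


@dataclass
class ConnectionModel:
    """Toy model of a connection; not the real net.box implementation."""
    reconnect_after: Optional[float] = None
    state: str = "active"
    in_progress: int = 0
    triggers: List[Callable[["ConnectionModel"], None]] = field(
        default_factory=list)

    def on_shutdown(self, trigger: Callable[["ConnectionModel"], None]) -> None:
        # Install a shutdown trigger (step 2).
        self.triggers.append(trigger)

    def request(self) -> None:
        # New requests are accepted only while the connection is active.
        if self.state != "active":
            raise RuntimeError("connection is in state " + self.state)
        self.in_progress += 1

    def complete_request(self) -> None:
        self.in_progress -= 1
        # Step 4: the last in-progress request closes the connection.
        if self.state == "graceful_shutdown" and self.in_progress == 0:
            self._close()

    def handle_shutdown_packet(self) -> None:
        # Step 2: triggers run while the connection is still active,
        # so a trigger may send new requests.
        for trigger in self.triggers:
            trigger(self)
        # Step 3: from now on new requests fail.
        self.state = "graceful_shutdown"
        if self.in_progress == 0:
            self._close()

    def _close(self) -> None:
        # Step 4: the final state depends on reconnect_after.
        if self.reconnect_after is not None:
            self.state = "error_reconnect"
        else:
            self.state = "error"
```

In this model, a connection with one request in flight and a trigger that
sends one more request stays in `graceful_shutdown` until both requests
complete, and only then moves to `error` (or `error_reconnect` if
`reconnect_after` is set).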

Note that the graceful shutdown protocol is best-effort: there's no
guarantee that the server doesn't exit before all active connections
are gracefully closed; the server may still exit on timeout or just be
killed. The timeout is configured by `box.ctl.set_on_shutdown_timeout()`
on the server.

Please also update the net.box state machine diagram:

```
initial -> auth -> fetch_schema <-> active

fetch_schema, active -> graceful_shutdown

(any state, on error) -> error_reconnect -> auth -> ...
                                         \
                                          -> error
(any state, but 'error') -> closed
```
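
One possible reading of the diagram above, written as a transition table.
This is a sketch under assumptions, not the net.box implementation: the
names `NORMAL`, `TERMINAL`, and `next_states()` are hypothetical, `error`
and `closed` are assumed to be terminal, and an error in any non-terminal
state is assumed to lead to either `error_reconnect` or `error`.

```python
# Non-error transitions taken directly from the diagram.
NORMAL = {
    "initial": {"auth"},
    "auth": {"fetch_schema"},
    "fetch_schema": {"active", "graceful_shutdown"},
    "active": {"fetch_schema", "graceful_shutdown"},
    "error_reconnect": {"auth"},
}

# Assumption: no transitions leave these states.
TERMINAL = {"error", "closed"}

STATES = set(NORMAL) | {"graceful_shutdown"} | TERMINAL


def next_states(state):
    """Return the set of states reachable in one step from `state`."""
    if state not in STATES:
        raise ValueError("unknown state: " + state)
    if state in TERMINAL:
        return set()
    result = set(NORMAL.get(state, set()))
    # (any state, on error) -> error_reconnect or error
    result |= {"error_reconnect", "error"}
    # (any state, but 'error') -> closed
    result.add("closed")
    # Drop the self-loop so the result lists only real state changes.
    result.discard(state)
    return result
```

Under this reading, `graceful_shutdown` is reachable only from
`fetch_schema` and `active`, and it can be left only on close or error.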
parent 6f29f9d7