applier: add timeout to greeting read
A Tarantool server is supposed to send a greeting message right after accepting a new client, so the first thing an applier does after connecting to the master is read the greeting. It does this without a timeout. The problem is that if we mistakenly connect to a wrong instance that doesn't send anything to clients, the applier hangs forever (until the remote closes the socket) without logging any errors. This may happen even with a valid Tarantool instance: if SSL encryption is enabled on the master but not on the client, neither side sends anything, because the SSL protocol assumes that the client initiates a connection by writing to the socket first (before the server).

Let's add a timeout to the operation reading the greeting. The timeout is set to replication_disconnect_timeout(), the same interval after which a connection is broken if the master doesn't send heartbeats.

Note, we don't add a timeout to other read/write operations issued to initiate a replication connection, because if we received a greeting and it's valid, then the master is likely to be fine.

Closes #7204

NO_DOC=bug
Changed files:
- changelogs/unreleased/gh-7204-replication-greeting-timeout.md: 4 additions, 0 deletions
- src/box/applier.cc: 9 additions, 1 deletion
- test/replication-luatest/gh_7204_greeting_timeout_test.lua: 37 additions, 0 deletions