- May 17, 2019
-
-
Alexander Turenko authored
Fixes #4194.
-
Georgy Kirichenko authored
Since we enforce applier row order, we don't need to reacquire the schema latch after a DDL statement. Follow-up for: 056deb2c
-
- May 16, 2019
-
-
Alexander Turenko authored
Support more than 60 parallel jobs (#82, PR #171).
-
- May 15, 2019
-
-
Vladislav Shpilevoy authored
crypto.lua is a public module that uses OpenSSL directly. Now lib/crypto encapsulates OpenSSL with additional checks and a similar but more conforming API, which allows replacing the OpenSSL cipher usage in crypto.lua with lib/crypto methods.
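For context, a minimal sketch of the public crypto.lua API whose backend this patch swaps to lib/crypto (the key/IV values are illustrative only):

```lua
local crypto = require('crypto')

-- aes128 in CBC mode needs a 16-byte key and a 16-byte IV.
local key = '1234567890123456'
local iv = 'abcdefghijklmnop'
local ciphertext = crypto.cipher.aes128.cbc.encrypt('secret', key, iv)
assert(crypto.cipher.aes128.cbc.decrypt(ciphertext, key, iv) == 'secret')
```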
-
Vladislav Shpilevoy authored
The OpenSSL API is quite complex, hard to follow, and rather unstable. Encoding/decoding via OpenSSL usually consists of multiple calls to a lot of functions. This patch wraps the OpenSSL API with an easier-to-use one, conforming to the Tarantool code style, in scope of the crypto library. The traditional OpenSSL API is wrapped as well, in the form of a crypto_stream object, so the OpenSSL API is not cut off. Besides struct crypto_stream, the library provides struct crypto_codec, which encapsulates all the steps of the encryption logic in two short functions: crypto_codec_encrypt/decrypt(iv, in, in_size, out, out_size). A caller can create the needed codec via crypto_codec_new(), which now supports all the same algorithms as the crypto.lua module. Needed for #3234
-
Vladislav Shpilevoy authored
Tarantool has a strict rule for naming library methods: use the library name as a prefix. For crypto lib methods it should be 'crypto_', not 'tnt_'.
-
Vladislav Shpilevoy authored
Crypto in the Tarantool core was implemented and used very poorly until now. It was just one tiny file with one-line wrappers around the OpenSSL API. Despite being small and simple, it provided a powerful interface to the Lua land, used by the public and documented Lua 'crypto' module. Now the time has come when OpenSSL crypto features are wanted on a lower level and with a richer API, in the core SWIM library written in C. This patch moves the crypto wrappers into a separate library in src/lib and drops some methods from the header file, because they are never used from C and are needed for exporting only. Needed for #3234
-
Vladislav Shpilevoy authored
-
Vladislav Shpilevoy authored
msgpack.decode() internally uses a 'const char *' variable to decode msgpack, but for some reason expects only 'char *' as input. This commit allows passing 'const char *' as well.
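A small sketch of what this enables, assuming a cdata pointer typed as 'const char *':

```lua
local ffi = require('ffi')
local msgpack = require('msgpack')

local data = msgpack.encode({1, 2, 3})
-- Before the patch only 'char *' was accepted here; now a
-- 'const char *' pointer decodes fine as well.
local ptr = ffi.cast('const char *', data)
local obj, new_ptr = msgpack.decode(ptr, #data)
-- obj is {1, 2, 3}; new_ptr points past the decoded msgpack data.
```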
-
Vladislav Shpilevoy authored
Before the patch, the msgpack Lua module provided an encode() method able to take a custom buffer to encode into. But the buffer had to be of type 'struct ibuf', which made it impossible to use buffer.IBUF_SHARED as a buffer, because its type is 'struct ibuf *'. Strangely, FFI can't convert these types automatically. This commit allows using 'struct ibuf *' as well, and moves this functionality into a function in utils.h. Now both the msgpack and merger modules can use an ibuf directly and by pointer.
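A short sketch of the now-possible call with a 'struct ibuf *' value:

```lua
local buffer = require('buffer')
local msgpack = require('msgpack')

-- buffer.IBUF_SHARED is of type 'struct ibuf *', which encode()
-- used to reject; now it is accepted alongside plain 'struct ibuf'.
local buf = buffer.IBUF_SHARED
buf:reset()
msgpack.encode({1, 2, 3}, buf)
-- The encoded msgpack now lives in [buf.rpos, buf.wpos).
```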
-
Vladislav Shpilevoy authored
swim_quit() notifies all the members that this instance has left the cluster. Strangely, all except self. It is not a real bug, but showing the 'left' status in the self struct swim_member would obviously be more correct than 'alive'. It is possible that the self struct swim_member was referenced by a user; this is how 'self' can stay available after SWIM instance deletion. Part of #3234
-
Vladislav Shpilevoy authored
SWIM internally tries to avoid unnecessary close+socket+bind calls on reconfiguration if a new URI is the same as the old one. The SWIM transport compares <IP, port> pairs and, if they are equal, does nothing. But if a port is 0, it is not a real port, but a sign to the kernel to find any free port on the IP address. In such a case the SWIM transport retrieves and saves the real port after bind(). When the same URI is specified again, the transport compares two addresses: the old <IP, auto-found port> and the new <IP, 0>, sees they are 'different', and rebinds. This is obviously unnecessary, because the new URI covers the old one. This commit avoids the rebind when the new IP == the old IP and the new port is 0. Part of #3234
-
Vladislav Shpilevoy authored
uint16 was used in the public SWIM C API as the type of the payload size to emphasize its small value. But it is not useful in Lua, because the Lua API would have to explicitly check whether a number overflows the uint16 maximum and return the same error as in the case when it is < uint16_max but > payload_size_max. So the main motivation of the patch is to avoid unnecessary checks in Lua and error message duplication. Internally the payload size is still uint16.
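A hypothetical sketch of the simplification on the Lua side (the swim Lua module names here are assumptions, since the Lua API is only forthcoming at this point):

```lua
local swim = require('swim')
local s = swim.new({uri = 0, heartbeat_rate = 0.1})
-- With the size exposed as a plain number, a single bound check
-- against payload_size_max suffices; no separate uint16 overflow
-- branch with a duplicated error message.
local ok, err = s:set_payload(string.rep('x', 2000))
-- err reports the payload is too big, via one error path.
```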
-
Vladislav Shpilevoy authored
swim_info() was a function to dump SWIM instance info to a Lua table without explicit usage of Lua. But now all that info can be taken from 1) the self member and the member API, 2) cfg options cached as a Lua table in a forthcoming Lua API - this is how box.cfg.<index> works.
-
Alexander Turenko authored
- Fix killing of servers at crash (PR #167).
- Show logs for a non-default server failed at start (#159, PR #168).
- Fix TAP13 hung test reporting (#155, PR #169).
- Fix false positive internal error detection (PR #170).
-
- May 14, 2019
-
-
Ilya Konyukhov authored
Right now there is only one configurable option for the http client: CURLMOPT_MAXCONNECTS. It can be set up like this:

> httpc = require('http.client').new({max_connections = 16})

Basically, this option tells curl to maintain this many connections in the cache during the client instance's lifetime. Caching connections is very useful when the user mostly requests the same hosts. When the connection cache is full, all connections are waiting for a response, and a new request comes in, curl creates a new connection, starts the request, and then drops the first available connection to keep the connection cache size right.

There is one side effect: when a tcp connection is closed, the system actually moves it to the TIME_WAIT state, and for some time the resources for this socket can't be reused (usually 60 seconds). When the user wants to do lots of requests simultaneously (to the same host), curl ends up creating and dropping lots of connections, which is not very efficient. When this load is high enough, sockets won't be able to recover from TIME_WAIT in time and the system may run out of available sockets, which reduces performance. And the user currently cannot control or limit this behaviour.

The solution is to add a new binding for the CURLMOPT_MAX_TOTAL_CONNECTIONS option. This option tells curl to hold a new connection until one is available (a request is finished). Only after that will curl either drop and create a new connection or reuse an old one. This patch passes this option through to the curl instance. It defaults to -1, which means that there is no limit. To create a client with this option set up, the user needs to set the max_total_connections option like this:

> httpc = require('http.client').new({max_connections = 8, max_total_connections = 8})

In general this option is useful when doing requests mostly to the same hosts; otherwise the defaults should be enough. CURLMOPT_MAX_TOTAL_CONNECTIONS was added in curl 7.30.0, so if the curl version is under 7.30.0, this option is simply ignored. https://curl.haxx.se/changes.html#7_30_0

Also, this patch adjusts the default for the CURLMOPT_MAXCONNECTS option to 0, which means that for every new easy handle curl will enlarge its max cache size by 4. See the option docs for more: https://curl.haxx.se/libcurl/c/CURLMOPT_MAXCONNECTS.html

Fixes #3945
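A short sketch of the resulting configuration, using the option names described above:

```lua
local httpc = require('http.client').new({
    -- CURLMOPT_MAXCONNECTS: how many connections to keep cached.
    max_connections = 8,
    -- CURLMOPT_MAX_TOTAL_CONNECTIONS: hard cap on open connections.
    max_total_connections = 8,
})
-- With the cap in place, an extra simultaneous request to the same
-- host waits for a free connection instead of opening (and soon
-- dropping) a new one.
local resp = httpc:get('https://example.com/')
```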
-
- May 13, 2019
-
-
Vladislav Shpilevoy authored
See details in the small repository commit. In summary, it looks like a GCC bug. Fixed with a workaround.
-
Vladislav Shpilevoy authored
The SIO library provides a wrapper for getnameinfo() able to stringify Unix socket addresses. But it does not care about the limited Tarantool stack and allocates buffers for getnameinfo() right on it: ~1Kb. Besides, after a successful getnameinfo() the result is copied into another static buffer. This patch optimizes sio_strfaddr() for the most common case, AF_INET, where 32 bytes is more than enough for any IP:port pair, and writes the result into the target buffer directly. The main motivation behind this commit is that SWIM makes active use of sio_strfaddr() for logging: for each received/sent message it writes a couple of addresses into the log. It does that in verbose mode, but the say() function arguments are still calculated even when the active logging level is lower than verbose.
-
Vladislav Shpilevoy authored
This patch harnesses the freshly introduced static memory allocator to eliminate wasteful usage of BSS memory. This commit frees 11Kb per thread.
-
Vladislav Shpilevoy authored
Before the patch, Tarantool had a thread- and C-file-local array of 4 static buffers, each 1028 bytes. It provided an API, tt_static_buf(), returning them one by one in a cycle. Firstly, it consumed a total of 200Kb of BSS memory over all C files using these buffers. Obviously, this was a bug and was not intentional: the buffers were supposed to be one process-global array. Secondly, even if the bug above had been fixed somehow, sometimes a slightly bigger buffer is needed, for example to store a UDP packet: ~1.5Kb. This commit replaces these 4 buffers with a small static allocator which does basically the same, but in a more granular and manoeuvrable way. This commit frees ~188Kb of the BSS section. The main motivation for this commit is the wish to use a single global out-of-stack buffer to read UDP packets into in the SWIM library, while not padding out the BSS section with a new SWIM-specific static buffer. Now SWIM uses the stack for this, and the incoming cryptography SWIM component will need more.
-
Vladimir Davydov authored
Currently, we set multikey_idx to multikey_frame->idx for the field corresponding to the multikey_frame itself. This is wrong, because this field doesn't have any indirection in the field map: we simply store the offset to the multikey array there. It works by a happy coincidence: the frame has index -1 and we treat -1 as the no-multikey case, see MULTIKEY_NONE. Should we change MULTIKEY_NONE to e.g. -2 or INT_MAX, we would get a crash because of it. So let's move the code setting multikey_idx before the initialization of multikey_frame in tuple_format_iterator_next().
-
Vladimir Davydov authored
Solely to improve code readability. No functional changes. Suggested by @kostja.
-
Vladimir Davydov authored
In case of multikey indexes, we use vy_entry.hint to store the multikey array entry index instead of a comparison hint. So all we need to do is patch all places where a statement is inserted so that, in case the key definition is multikey, we iterate over all multikey indexes and insert an entry for each of them. The rest is done automatically, as vinyl stores and compares vy_entry objects, which have hints built in, while comparators and other generic functions have already been patched to treat hints as multikey indexes. There are just a few places we need to patch:
- vy_tx_set, which inserts a statement into a transaction write set.
- vy_build_insert_stmt, which is used to fill the new index on index creation and DDL recovery.
- vy_build_on_replace, which forwards modifications done to the space during index creation to the new index.
- vy_check_is_unique_secondary, which checks a secondary index for conflicts on insertion of a new statement.
- vy_tx_handle_deferred_delete, which generates deferred DELETE statements if the old tuple is found in memory or in cache.
- vy_deferred_delete_on_replace, which applies deferred DELETEs on compaction.
Plus, we need to teach vy_get_by_secondary_tuple to match a full multikey tuple to a partial multikey tuple or a key, which implies iterating over all multikey indexes of the full tuple and comparing them to the corresponding entries of the partial tuple. We already have tests that check this functionality for memtx. Enable and tweak them a little so that they can be used for vinyl as well.
-
Vladimir Davydov authored
Currently, we completely ignore vy_entry.hint while writing a run file, because it only contains auxiliary information for tuple comparison. However, soon we will use hints to store multikey offsets, which are mandatory for extracting keys and hence for writing secondary run files. So this patch propagates vy_entry.hint as a multikey offset to tuple_bloom and tuple_extract_key in the vy_run implementation.
-
Vladimir Davydov authored
Currently, we construct a field map for a vinyl surrogate DELETE statement by hand, which works fine as long as field maps don't have extents. Once multikey indexes are introduced, there will be extents, hence we must switch to field_map_builder.
-
Vladimir Davydov authored
Just like in the case of tuple_extract_key, simply pass multikey_idx to tuple_bloom_builder_add and tuple_bloom_maybe_has. For now, we always pass -1, but the following patches will pass the offset in the multikey array if the key definition is multikey.
-
Vladimir Davydov authored
Add a multikey_idx argument to tuple_extract_key and forward it to tuple_field_by_part in the method implementation. For unikey indexes pass -1. We need this to support multikey indexes in Vinyl. We could of course introduce a separate set of methods for multikey indexes (something like tuple_extract_key_multikey), but that would look cumbersome and hardly result in any performance benefits, because passing -1 to a relatively cold function, such as a key extractor, isn't a big deal. Besides, passing multikey_idx unconditionally is consistent with tuple_compare.
-
Vladimir Davydov authored
Always require multikey_idx to be passed to tuple_field_by_part. If the key definition isn't multikey, pass -1. Rationale: having separate functions for multikey and unikey indexes doesn't have any performance benefits (as all those functions are inline), but makes the code inconsistent with other functions (e.g. tuple_compare) which don't have a separate multikey variant. After all, passing -1 when the key definition is known to be unikey doesn't bloat the code. While we are at it, let's also add a few assertions ensuring that the key definition isn't multikey to functions that don't support multikey yet.
-
Vladimir Davydov authored
Follow-up 14c529df ("Make tuple comparison hints mandatory").
-
Vladimir Davydov authored
There isn't much point in having separate versions of tuple comparators that don't take tuple comparison hints anymore, because all hot paths have been patched to use the hinted versions. Besides, un-hinted versions don't make sense for multikey indexes, which use hints to store offsets in multikey arrays. Let's strip the _hinted suffix from all hinted comparators and zap the un-hinted versions. In the few remaining places in the code that still use un-hinted versions, let's simply pass HINT_NONE.
-
Alexander Turenko authored
Fixes #3276.

@TarantoolBot document
Title: Merger for tuple streams

The main concept of the merger is a source: an object that provides a stream of tuples. There are four types of sources: a tuple source, a table source, a buffer source, and a merger itself.

A tuple source returns one tuple at a time. However, this source (as well as the table and buffer ones) supports fetching the next data chunk, so the API allows creating it from a Lua iterator: `merger.new_tuple_source(gen, param, state)`. A `gen` function should return `state, tuple` on each call and then return `nil` when no more tuples are available. Consider the example:

```lua
box.cfg({})
box.schema.space.create('s')
box.space.s:create_index('pk')
box.space.s:insert({1})
box.space.s:insert({2})
box.space.s:insert({3})
s = merger.new_tuple_source(box.space.s:pairs())
s:select()
---
- - [1]
  - [2]
  - [3]
...
s = merger.new_tuple_source(box.space.s:pairs())
s:pairs():totable()
---
- - [1]
  - [2]
  - [3]
...
```

As we can see, a source (this is common for all sources) has `:select()` and `:pairs()` methods. The first one has two options, `buffer` and `limit`, with the same meaning as in net.box `:select()`. The `:pairs()` method (or its `:ipairs()` alias) returns a luafun iterator (it is a Lua iterator, but also provides a set of handy methods to operate in a functional style).

The same API exists to create a table and a buffer source: `merger.new_table_source(gen, param, state)` and `merger.new_buffer_source(gen, param, state)`. A `gen` function should return a table or a buffer on each call. There are also helpers that are useful when all data are available at once: `merger.new_source_fromtable(tbl)` and `merger.new_source_frombuffer(buf)`.

A merger is a special kind of source, created from a key_def object and a set of sources. It performs a kind of merge sort: it chooses the source with the minimal / maximal tuple on each step, consumes a tuple from that source, and repeats. The API to create a merger is the following:

```lua
local key_def_lib = require('key_def')
local merger = require('merger')
local key_def = key_def_lib.new(<...>)
local sources = {<...>}
local merger_inst = merger.new(key_def, sources, {
    -- Ascending (false) or descending (true) order.
    -- Default is ascending.
    reverse = <boolean> or <nil>,
})
```

An instance of a merger has the same `:select()` and `:pairs()` methods as any other source.

The `key_def_lib.new()` function takes a table of key parts as an argument, in the same format as box.space.<...>.index.<...>.parts or conn.space.<...>.index.<...>.parts (where conn is a net.box connection):

```
local key_parts = {
    {
        fieldno = <number>,
        type = <string>,
        [ is_nullable = <boolean>, ]
        [ collation_id = <number>, ]
        [ collation = <string>, ]
    },
    ...
}
local key_def = key_def_lib.new(key_parts)
```

A key_def can be cached across requests with the same ordering rules (typically requests to the same space).
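A minimal end-to-end sketch, assuming two presorted tables of tuples:

```lua
local key_def_lib = require('key_def')
local merger = require('merger')

local key_def = key_def_lib.new({{fieldno = 1, type = 'unsigned'}})
local sources = {
    merger.new_source_fromtable({{1}, {3}, {5}}),
    merger.new_source_fromtable({{2}, {4}, {6}}),
}
local merger_inst = merger.new(key_def, sources)
merger_inst:select() -- [1], [2], [3], [4], [5], [6]
```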
-
Alexander Turenko authored
Needed for #3276.
-
Alexander Turenko authored
Needed for #3276.

@TarantoolBot document
Title: net.box: skip_header option

This option instructs net.box to skip the {[IPROTO_DATA_KEY] = ...} wrapper in a buffer. This may be needed to pass the buffer to a C function that expects some specific msgpack input. Usage example:

```lua
local net_box = require('net.box')
local buffer = require('buffer')
local ffi = require('ffi')
local msgpack = require('msgpack')
local yaml = require('yaml')

box.cfg{listen = 3301}
box.once('load_data', function()
    box.schema.user.grant('guest', 'read,write,execute', 'universe')
    box.schema.space.create('s')
    box.space.s:create_index('pk')
    box.space.s:insert({1})
    box.space.s:insert({2})
    box.space.s:insert({3})
    box.space.s:insert({4})
end)

local function foo()
    return box.space.s:select()
end
_G.foo = foo

local conn = net_box.connect('localhost:3301')

local buf = buffer.ibuf()
conn.space.s:select(nil, {buffer = buf})
local buf_str = ffi.string(buf.rpos, buf.wpos - buf.rpos)
local buf_lua = msgpack.decode(buf_str)
print('select:\n' .. yaml.encode(buf_lua)) -- {48: [[1], [2], [3], [4]]}

local buf = buffer.ibuf()
conn.space.s:select(nil, {buffer = buf, skip_header = true})
local buf_str = ffi.string(buf.rpos, buf.wpos - buf.rpos)
local buf_lua = msgpack.decode(buf_str)
print('select:\n' .. yaml.encode(buf_lua)) -- [[1], [2], [3], [4]]

local buf = buffer.ibuf()
conn:call('foo', nil, {buffer = buf})
local buf_str = ffi.string(buf.rpos, buf.wpos - buf.rpos)
local buf_lua = msgpack.decode(buf_str)
print('call:\n' .. yaml.encode(buf_lua)) -- {48: [[[1], [2], [3], [4]]]}

local buf = buffer.ibuf()
conn:call('foo', nil, {buffer = buf, skip_header = true})
local buf_str = ffi.string(buf.rpos, buf.wpos - buf.rpos)
local buf_lua = msgpack.decode(buf_str)
print('call:\n' .. yaml.encode(buf_lua)) -- [[[1], [2], [3], [4]]]

os.exit()
```
-
Alexander Turenko authored
Needed for #3276.

@TarantoolBot document
Title: Non-recursive msgpack decoding functions

Contracts:

```
msgpack.decode_array_header(buf.rpos, buf:size()) -> arr_len, new_rpos
msgpack.decode_map_header(buf.rpos, buf:size()) -> map_len, new_rpos
```

These functions are intended to be used with a msgpack buffer received from net.box. A user may want to skip the {[IPROTO_DATA_KEY] = ...} wrapper and an array header before passing the buffer to decode in some C function. See https://github.com/tarantool/tarantool/issues/2195 for more information regarding this net.box API.
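A hedged sketch of the intended usage, assuming a running instance at localhost:3301 with a space 's' (as in the skip_header example above):

```lua
local net_box = require('net.box')
local buffer = require('buffer')
local msgpack = require('msgpack')

local conn = net_box.connect('localhost:3301')
local buf = buffer.ibuf()
conn.space.s:select(nil, {buffer = buf})

-- The buffer holds {[IPROTO_DATA_KEY] = <array of tuples>}. Skip the
-- map header, the key and the array header without decoding tuples.
local rpos = buf.rpos
local map_len, key, arr_len
map_len, rpos = msgpack.decode_map_header(rpos, buf.wpos - rpos)
key, rpos = msgpack.decode(rpos, buf.wpos - rpos)
arr_len, rpos = msgpack.decode_array_header(rpos, buf.wpos - rpos)
-- rpos now points at the first tuple and can be handed to C code.
```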
-
- May 10, 2019
-
-
avtikhon authored
It is needed to wait for the upstream/downstream status, otherwise an error occurs:

[025] --- replication/show_error_on_disconnect.result	Fri Apr 12 14:49:26 2019
[025] +++ replication/show_error_on_disconnect.reject	Tue Apr 16 07:35:41 2019
[025] @@ -77,11 +77,12 @@
[025]  ...
[025]  box.info.replication[other_id].upstream.status
[025]  ---
[025] -- stopped
[025] +- sync
[025]  ...
[025]  box.info.replication[other_id].upstream.message:match("Missing")
[025]  ---
[025] -- Missing
[025] +- error: '[string "return box.info.replication[other_id].upstrea..."]:1: attempt to
[025] +  index field ''message'' (a nil value)'
[025]  ...
[025]  test_run:cmd("switch master_quorum2")
[025]  ---
[025]

Close #4161
-
Alexander Turenko authored
Added the test_run:wait_upstream() and test_run:wait_downstream() functions to wait for certain box.info.replication values (#158).
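A sketch of how a test might use these helpers (the option names here are assumptions based on #158):

```lua
-- Wait until the replica's upstream is reported as stopped with the
-- expected error, instead of sampling box.info.replication once.
test_run:wait_upstream(other_id, {status = 'stopped', message_re = 'Missing'})
test_run:wait_downstream(other_id, {status = 'follow'})
```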
-
- May 09, 2019
-
-
Stanislav Zudin authored
Adds more tests for collations. Marks unstable collation tests. Removes a duplicate test.

Closes #4007

@TarantoolBot document
Title: New collations

The recent commit includes a wide variety of collations. The naming of the new collations follows the principle: unicode_<locale>_<strength>. Three strengths are used: primary - "s1", secondary - "s2", and tertiary - "s3". The following list contains the so-called "stable" collations, the ones whose sort order doesn't depend on the ICU version:
unicode_am_s3
unicode_fi_s3
unicode_de__phonebook_s3
unicode_haw_s3
unicode_he_s3
unicode_hi_s3
unicode_is_s3
unicode_ja_s3
unicode_ko_s3
unicode_lt_s3
unicode_pl_s3
unicode_si_s3
unicode_es_s3
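A minimal sketch of using one of the new collations in an index:

```lua
local s = box.schema.space.create('names')
s:create_index('pk', {parts = {{1, 'string', collation = 'unicode_fi_s3'}}})
s:insert({'a'})
s:insert({'ä'})
-- select() returns the rows in the Finnish tertiary-strength order.
s:select()
```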
-
- May 08, 2019
-
-
Vladislav Shpilevoy authored
It appeared that the standard getifaddrs() function can return addresses having the IFF_BROADCAST flag set, but at the same time having a NULL struct sockaddr *ifa_broadaddr pointer. It led to a crash. The patch adds a check for whether the address is NULL.
-
- May 07, 2019
-
-
Vladimir Davydov authored
Turns out we don't really need it, as we can use data_offset + bsize (i.e. the value returned by the tuple_size() helper function) to get the size of a tuple to free. We only need to take into account the offset of the base tuple struct in the derived struct (memtx_tuple). There's a catch though:
- We use sizeof(struct memtx_tuple) + field_map_size + bsize for the allocation size.
- We set data_offset to sizeof(struct tuple) + field_map_size.
- struct tuple is packed, which makes its size 10 bytes.
- memtx_tuple embeds struct tuple (base) at a 4-byte offset, but since it is not packed, its size is 16 bytes, NOT 4 + 10 = 14 bytes as one might expect!
- This means data_offset + bsize + offsetof(struct memtx_tuple, base) doesn't equal the allocation size.
To fix that, let's mark memtx_tuple packed. The only side effect is that we save 2 bytes per memtx tuple. It won't affect the tuple data layout at all, because struct memtx_tuple already has a packed layout, so 'packed' will only affect its size, which is only used for computing the allocation size. My bad, I overlooked it during review. Follow-up f1d9f257 ("box: introduce multikey indexes in memtx").
-
Kirill Shcherbatov authored
- In the case of a multikey index an ambiguity arises: which key should be used in a comparison. The previously introduced comparison hints act as a non-negative numeric index of the key to use.
- The memtx B+ tree replace and build_next methods have been patched to insert the same tuple multiple times, under the different logical indexes of the key in the array.
- Field maps have been extended with service areas ("extents") that contain the offsets of multikey index keys, addressed by the additional logical index.

Part of #1257

@TarantoolBot document
Title: introduce multikey indexes in memtx

Any JSON index in which at least one part contains "[*]", the array index placeholder sign, is called "multikey". Such indexes allow you to automatically index a set of documents having the same document structure.

The multikey index design has a number of restrictions that must be taken into account:
- it cannot be primary, because of the ambiguity arising from its definition (a primary index requires one unique key that identifies a tuple),
- if some node in the JSON tree of all defined indexes contains an array index placeholder [*], no other JSON path can use an explicit JSON index on its nested field,
- it supports "unique" semantics, but its uniqueness is a little different from conventional indexes: you may insert a tuple in which the same key occurs multiple times into a unique multikey index, but you cannot insert a tuple when any of its keys is in some other tuple stored in the space,
- the unique multikey index "duplicate" conflict occurs when the sets of extracted keys have a non-empty intersection,
- to identify the different keys by which a given data tuple is indexed, each key is assigned a logical sequence number in the array defined with the array index placeholder [*] in the index (such an array is called the multikey index root),
- no index part can contain more than one array index placeholder sign [*] in its JSON path,
- all parts containing JSON paths with the array index placeholder [*] must have the same (in terms of JSON tokens) prefix before this placeholder sign.

Example 1:
s = box.schema.space.create('clients')
s:format({{name='name', type='string'}, {name='phone', type='array'}})
name_idx = s:create_index('name_idx', {parts = {{'name', 'string'}}})
phone_idx = s:create_index('phone_idx', {parts = {{'phone[*]', 'string'}}})
s:insert({"Jorge", {"911", "89457609234"}})
s:insert({"Bob", {"81239876543"}})
phone_idx:get("911")
---
- ['Jorge', ['911', '89457609234']]
...

Example 2:
s = box.schema.space.create('withdata')
pk = s:create_index('pk')
parts = {
    {2, 'str', path = 'data[*].name'},
    {2, 'str', path = 'data[*].extra.phone'}
}
idx = s:create_index('idx', {parts = parts})
s:insert({1, {data = {{name="A", extra={phone="111"}},
               {name="B", extra={phone="111"}}},
          garbage = 1}})
idx:get({'A', '111'})
-