  1. May 17, 2019
  2. May 16, 2019
  3. May 15, 2019
    • crypto: use crypto library in crypto.lua · 544d648c
      Vladislav Shpilevoy authored
      crypto.lua is a public module that uses OpenSSL directly. But now
      lib/crypto encapsulates OpenSSL with additional checks and a
      similar but more conforming API. This allows replacing the
      OpenSSL cipher in crypto.lua with the lib/crypto methods.
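      The public crypto.lua API stays the same after the swap; a minimal
      usage sketch of the module (aes128.cbc requires a 16-byte key and
      a 16-byte IV):
      
      ```lua
      local crypto = require('crypto')
      
      local key = '0123456789abcdef'  -- 16 bytes, as aes128 requires
      local iv  = 'fedcba9876543210'  -- 16-byte initialization vector
      local ciphertext = crypto.cipher.aes128.cbc.encrypt('secret', key, iv)
      local plaintext  = crypto.cipher.aes128.cbc.decrypt(ciphertext, key, iv)
      assert(plaintext == 'secret')
      ```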
    • crypto: implement crypto library · 4467fdac
      Vladislav Shpilevoy authored
      The OpenSSL API is quite complex and hard to follow; additionally,
      it is very unstable. Encoding/decoding via OpenSSL usually
      consists of multiple calls to a lot of functions. This patch
      wraps the OpenSSL API, in the scope of the crypto library, with
      one that is easier to use and conforms to the Tarantool code
      style.
      
      The traditional OpenSSL API is wrapped as well, in the form of a
      crypto_stream object, so the OpenSSL API is not cut off.
      
      Besides struct crypto_stream, the library provides struct
      crypto_codec, which encapsulates all the steps of the encryption
      logic in two short functions:
      
          crypto_codec_encrypt/decrypt(iv, in, in_size, out, out_size)
      
      A caller can create the needed codec via crypto_codec_new, which
      now supports all the same algorithms as the crypto.lua module.
      
      Needed for #3234
    • crypto: make exported methods conform code style · 0c718345
      Vladislav Shpilevoy authored
      Tarantool has a strict rule for naming library methods - use the
      library name as a prefix. For the crypto lib methods the prefix
      should be 'crypto_', not 'tnt_'.
    • crypto: move crypto business into a separate library · 1b9d3c6b
      Vladislav Shpilevoy authored
      Crypto in the Tarantool core was implemented and used very poorly
      until now. It was just one tiny file with one-line wrappers
      around the OpenSSL API. Despite being small and simple, it
      provided a powerful interface to the Lua land, used by the public
      and documented Lua 'crypto' module.
      
      Now the time has come when the OpenSSL crypto features are wanted
      at a lower level and with a richer API - in the SWIM core library
      written in C. This patch moves the crypto wrappers into a
      separate library in src/lib and drops some methods from the
      header file, because they are never used from C and are needed
      only for exporting.
      
      Needed for #3234
    • Drop an unused function and class · 0891b159
      Vladislav Shpilevoy authored
    • msgpack: allow to pass 'const char *' into decode() · 453fff2b
      Vladislav Shpilevoy authored
      msgpack.decode() internally uses a 'const char *' variable to
      decode msgpack, but for some reason accepted only 'char *' as
      input. This commit allows passing 'const char *' as well.
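      A minimal sketch of what becomes possible (the cast to a const
      pointer is the new part):
      
      ```lua
      local ffi = require('ffi')
      local msgpack = require('msgpack')
      
      local data = msgpack.encode({1, 2, 3})
      -- Before this commit only a 'char *' cast was accepted here.
      local ptr = ffi.cast('const char *', data)
      local obj, next_ptr = msgpack.decode(ptr, #data)
      assert(obj[2] == 2)
      ```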
    • msgpack: allow to pass 'struct ibuf *' into encode() · 7d98a3cf
      Vladislav Shpilevoy authored
      Before the patch the msgpack Lua module provided an encode()
      method able to take a custom buffer to encode into. But the
      buffer had to be of type 'struct ibuf', which made it impossible
      to use buffer.IBUF_SHARED as a buffer, because its type is
      'struct ibuf *'. Strangely, FFI can't convert these types
      automatically.
      
      This commit allows using 'struct ibuf *' as well, and moves this
      functionality into a function in utils.h. Now both the msgpack
      and merger modules can use an ibuf directly and by pointer.
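      A minimal sketch, assuming the shared buffer is free for reuse at
      this point:
      
      ```lua
      local buffer = require('buffer')
      local msgpack = require('msgpack')
      
      -- buffer.IBUF_SHARED is a 'struct ibuf *'; encode() now accepts
      -- it alongside a plain buffer.ibuf() object.
      local buf = buffer.IBUF_SHARED
      buf:reset()
      local size = msgpack.encode({1, 2, 3}, buf)
      assert(buf:size() == size)
      ```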
    • swim: set 'left' status in self on swim_quit() · 8cb0206b
      Vladislav Shpilevoy authored
      swim_quit() notifies all the members that this instance has left
      the cluster - strangely, all except self. It is not a real bug,
      but showing the 'left' status in the self struct swim_member
      would obviously be more correct than 'alive'.
      
      It is possible that the self struct swim_member is referenced by
      a user - this is how 'self' can still be available after the SWIM
      instance deletion.
      
      Part of #3234
    • swim: do not rebind when new 'port' is 0 · 2149496b
      Vladislav Shpilevoy authored
      SWIM internally tries to avoid unnecessary close+socket+bind
      calls on reconfiguration if a new URI is the same as the old one.
      The SWIM transport compares <IP, port> pairs and, if they are
      equal, does nothing.
      
      But if a port is 0, it is not a real port, but a sign for the
      kernel to find any free port on the IP address. In such a case
      the SWIM transport retrieves and saves the real port after
      bind().
      
      When the same URI is specified again, the transport compares the
      two addresses - old <IP, auto found port> and new <IP, 0> - sees
      that they are 'different', and rebinds. This is obviously
      unnecessary, because the new URI covers the old one.
      
      This commit avoids the rebind when the new IP == the old IP and
      the new port is 0.
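      A sketch of the resulting decision logic (hypothetical Lua
      pseudo-code; the real check lives in the C transport):
      
      ```lua
      -- Decide whether reconfiguration needs close+socket+bind.
      local function need_rebind(old_addr, new_addr)
          if new_addr.ip ~= old_addr.ip then
              return true
          end
          -- Port 0 means 'any free port': the previously auto-found
          -- port already satisfies the new URI, so keep the socket.
          if new_addr.port == 0 then
              return false
          end
          return new_addr.port ~= old_addr.port
      end
      ```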
      
      Part of #3234
    • swim: encapsulate 'uint16' payload size inside swim.c · 513c196a
      Vladislav Shpilevoy authored
      uint16 was used in the public SWIM C API as the payload size type
      to emphasize its small value. But it is not useful in Lua,
      because the Lua API would have to explicitly check whether a
      number overflows the uint16 maximal value, and return the same
      error as in the case when it is < uint16_max but >
      payload_size_max.
      
      So the main motivation of the patch is to avoid unnecessary
      checks in Lua and error message duplication. Internally the
      payload size is still uint16.
    • swim: drop swim_info() function · 3fc40001
      Vladislav Shpilevoy authored
      swim_info() was a function to dump SWIM instance info to a Lua
      table without explicit usage of Lua. But now all the info can be
      taken from 1) the self member and the member API, 2) the cfg
      options cached as a Lua table in a forthcoming Lua API - this is
      how box.cfg.<index> works.
    • test: update test-run · 42b7f6d8
      Alexander Turenko authored
      - Fix killing of servers at crash (PR #167).
      - Show logs for a non-default server failed at start (#159, PR #168).
      - Fix TAP13 hung test reporting (#155, PR #169).
      - Fix false positive internal error detection (PR #170).
  4. May 14, 2019
    • httpc: add MAX_TOTAL_CONNECTIONS option binding · d11b552e
      Ilya Konyukhov authored
      Right now there is only one configurable option for the http
      client: CURLMOPT_MAXCONNECTS. It can be set up like this:
      
      > httpc = require('http.client').new({max_connections = 16})
      
      Basically, this option tells curl to maintain this many
      connections in the cache during the client instance lifetime.
      Caching connections is very useful when the user mostly requests
      the same hosts.
      
      When the connection cache is full, all connections are waiting
      for a response, and a new request comes in, curl creates a new
      connection, starts the request, and then drops the first
      available connection to keep the cache size right.
      
      There is one side effect: when a TCP connection is closed, the
      system actually moves it into the TIME_WAIT state. Then for some
      time (usually 60 seconds) the resources of this socket can't be
      reused.
      
      When the user wants to do lots of requests simultaneously (to the
      same host), curl ends up creating and dropping lots of
      connections, which is not very efficient. When the load is high
      enough, sockets can't leave TIME_WAIT in time, and the system may
      run out of available sockets, which reduces performance. And the
      user currently cannot control or limit this behaviour.
      
      The solution is to add a new binding for the
      CURLMOPT_MAX_TOTAL_CONNECTIONS option. This option tells curl to
      hold a new connection until another one becomes available (its
      request is finished). Only after that will curl either drop an
      old connection and create a new one, or reuse an old one.
      
      This patch passes this option through to the curl instance. It
      defaults to -1, which means there is no limit. To create a client
      with this option set, the user needs to set the
      max_total_connections option like this:
      
      > httpc = require('http.client').new({max_connections = 8,
                                            max_total_connections = 8})
      
      In general this option is useful when doing requests mostly to
      the same hosts. Otherwise, the defaults should be enough.
      
      The CURLMOPT_MAX_TOTAL_CONNECTIONS option was added in curl
      7.30.0, so if the curl version is below 7.30.0, the option is
      simply ignored.
      https://curl.haxx.se/changes.html#7_30_0
      
      Also, this patch adjusts the default for the CURLMOPT_MAXCONNECTS
      option to 0, which means that for every new easy handle curl will
      enlarge its max cache size by 4. See the option docs for more:
      https://curl.haxx.se/libcurl/c/CURLMOPT_MAXCONNECTS.html
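      A short end-to-end sketch combining both options (the URL is just
      an example):
      
      ```lua
      local httpc = require('http.client').new({
          max_connections = 8,       -- CURLMOPT_MAXCONNECTS: cache size
          max_total_connections = 8, -- hard limit on open connections
      })
      local resp = httpc:get('http://localhost:8080/')
      print(resp.status)
      ```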
      
      Fixes #3945
  5. May 13, 2019
    • small: fix build of static allocator · 14e9b43d
      Vladislav Shpilevoy authored
      See the details in the small repository commit. In summary, it
      looks like a GCC bug. Fixed with a workaround.
    • sio: optimize sio_strfaddr() for the most common case · 6f20fb7b
      Vladislav Shpilevoy authored
      The SIO library provides a wrapper for getnameinfo() able to
      stringify Unix socket addresses. But it does not care about the
      limited Tarantool stack and allocates the buffers for
      getnameinfo() right on it - ~1Kb. Besides, after a successful
      getnameinfo() the result is copied into yet another static
      buffer.
      
      This patch optimizes sio_strfaddr() for the most common case -
      AF_INET, when 32 bytes is more than enough for any IP:port pair -
      and writes the result into the target buffer directly.
      
      The main motivation behind this commit is that SWIM makes active
      use of sio_strfaddr() for logging - for each received/sent
      message it writes a couple of addresses into the log. It does
      this in verbose mode, but the say() function arguments are still
      evaluated even when the log level is lower than verbose.
    • Use static_alloc() instead of 'static char[]' where possible · b6455e6a
      Vladislav Shpilevoy authored
      This patch harnesses the freshly introduced static memory
      allocator to eliminate wasteful usage of BSS memory. This commit
      frees 11Kb per thread.
    • small: introduce small/static · 890068fc
      Vladislav Shpilevoy authored
      Before the patch Tarantool had a thread- and C-file-local array
      of 4 static buffers, 1028 bytes each. It provided an API,
      tt_static_buf(), returning them one by one in a cycle.
      
      Firstly, it consumed 200Kb of BSS memory in total over all the
      C files using these buffers. Obviously, this was a bug and was
      not intentional. The buffers were supposed to be one
      process-global array.
      
      Secondly, even if the bug above had been fixed somehow, a
      slightly bigger buffer is sometimes needed - for example, to
      store a UDP packet, ~1.5Kb.
      
      This commit replaces these 4 buffers with the small/static
      allocator, which does basically the same, but in a more granular
      and flexible way. This commit frees ~188Kb of the BSS section.
      
      The main motivation for this commit is the wish to use a single
      global out-of-stack buffer to read UDP packets into in the SWIM
      library, while not padding out the BSS section with a new
      SWIM-specific static buffer. Now SWIM uses the stack for this,
      and the upcoming SWIM cryptography component will need more.
    • tuple_format_iterator: don't set multikey_idx for multikey array itself · e639ec4b
      Vladimir Davydov authored
      Currently, we set multikey_idx to multikey_frame->idx for the field
      corresponding to the multikey_frame itself. This is wrong, because
      this field doesn't have any indirection in the field map - we simply
      store the offset of the multikey array there. It works by a happy
      coincidence - the frame has index -1 and we treat -1 as no-multikey
      case, see MULTIKEY_NONE. Should we change MULTIKEY_NONE to e.g. -2
      or INT_MAX, we would get a crash because of it. So let's move the
      code setting multikey_idx before initializing multikey_frame in
      tuple_format_iterator_next().
    • Use MULTIKEY_NONE instead of -1 · 361e218b
      Vladimir Davydov authored
      Solely to improve code readability. No functional changes.
      Suggested by @kostja.
    • vinyl: implement multikey index support · 64704bb3
      Vladimir Davydov authored
      In case of multikey indexes, we use vy_entry.hint to store multikey
      array entry index instead of a comparison hint. So all we need to do is
      patch all places where a statement is inserted so that in case the key
      definition is multikey we iterate over all multikey indexes and insert
      an entry for each of them. The rest will be done automatically as vinyl
      stores and compares vy_entry objects, which have hints built-in, while
      comparators and other generic functions have already been patched to
      treat hints as multikey indexes.
      
      There are just a few places we need to patch:
      
       - vy_tx_set, which inserts a statement into a transaction write set.
       - vy_build_insert_stmt, which is used to fill the new index on index
         creation and DDL recovery.
       - vy_build_on_replace, which forwards modifications done to the space
         during index creation to the new index.
       - vy_check_is_unique_secondary, which checks a secondary index for
         conflicts on insertion of a new statement.
       - vy_tx_handle_deferred_delete, which generates deferred DELETE
         statements if the old tuple is found in memory or in cache.
       - vy_deferred_delete_on_replace, which applies deferred DELETEs on
         compaction.
      
      Plus, we need to teach vy_get_by_secondary_tuple to match a full
      multikey tuple to a partial multikey tuple or a key, which implies
      iterating over all multikey indexes of the full tuple and comparing
      them to the corresponding entries of the partial tuple.
      
      We already have a test that checks this functionality for memtx.
      Enable and tweak it a little so that it can be used for vinyl as
      well.
    • vinyl: use multikey hints while writing runs · e40a7958
      Vladimir Davydov authored
      Currently, we completely ignore vy_entry.hint while writing a run
      file, because hints only contain auxiliary information for tuple
      comparison.
      However, soon we will use hints to store multikey offsets, which is
      mandatory for extracting keys and hence writing secondary run files.
      So this patch propagates vy_entry.hint as multikey offset to tuple_bloom
      and tuple_extract_key in vy_run implementation.
    • vinyl: use field_map_builder for constructing stmt field map · 35be5c1c
      Vladimir Davydov authored
      Currently, we construct a field map for a vinyl surrogate DELETE
      statement by hand, which works fine as long as field maps don't have
      extents. Once multikey indexes are introduced, there will be
      extents, hence we must switch to field_map_builder.
    • Make tuple_bloom support multikey indexes · ca0f750c
      Vladimir Davydov authored
      Just like in the case of tuple_extract_key, simply pass
      multikey_idx to tuple_bloom_builder_add and tuple_bloom_maybe_has.
      For now, we always pass -1, but the following patches will pass
      the offset in the multikey array if the key definition is
      multikey.
    • Make tuple_extract_key support multikey indexes · 9118d408
      Vladimir Davydov authored
      Add multikey_idx argument to tuple_extract_key and forward it to
      tuple_field_by_part in the method implementation. For unikey indexes
      pass -1. We need this to support multikey indexes in Vinyl.
      
      We could of course introduce a separate set of methods for multikey
      indexes (something like tuple_extract_key_multikey), but that would
      look cumbersome and hardly result in any performance benefits, because
      passing -1 to a relatively cold function, such as key extractor, isn't
      a big deal. Besides, passing multikey_idx unconditionally is consistent
      with tuple_compare.
    • Get rid of tuple_field_by_part_multikey · 6c44e11e
      Vladimir Davydov authored
      Always require multikey_idx to be passed to tuple_field_by_part.
      If the key definition isn't multikey, pass -1. Rationale: having
      separate functions for multikey and unikey indexes doesn't have
      any performance benefits (as all those functions are inline), but
      makes the code inconsistent with other functions (e.g. tuple_compare)
      which don't have a separate multikey variant. After all, passing -1
      when the key definition is known to be unikey doesn't bloat the code.
      While we are at it, let's also add a few assertions ensuring that
      the key definition isn't multikey to functions that don't support
      multikey yet.
    • Fix compilation · f732ec1b
      Vladimir Davydov authored
      Follow-up 14c529df ("Make tuple comparison hints mandatory").
    • Make tuple comparison hints mandatory · 14c529df
      Vladimir Davydov authored
      There isn't much point in having separate versions of tuple comparators
      that don't take tuple comparison hints anymore, because all hot paths
      have been patched to use the hinted versions. Besides, un-hinted versions
      don't make sense for multikey indexes, which use hints to store offsets
      in multikey arrays. Let's strip the _hinted suffix from all hinted
      comparators and zap un-hinted versions. In a few remaining places in the
      code that still use un-hinted versions, let's simply pass HINT_NONE.
    • Add merger for tuple streams (Lua part) · c7915ecc
      Alexander Turenko authored
      Fixes #3276.
      
      @TarantoolBot document
      Title: Merger for tuple streams
      
      The main concept of the merger is a source. It is an object that
      provides a stream of tuples. There are four types of sources: a tuple
      source, a table source, a buffer source and a merger itself.
      
      A tuple source just returns one tuple. However, this source (as
      well as the table and buffer ones) supports fetching the next data
      chunk, so the API allows creating it from a Lua iterator:
      `merger.new_tuple_source(gen, param, state)`. A `gen` function
      should return `state, tuple` on each call and then return `nil`
      when no more tuples are available. Consider the example:
      
      ```lua
      box.cfg({})
      box.schema.space.create('s')
      box.space.s:create_index('pk')
      box.space.s:insert({1})
      box.space.s:insert({2})
      box.space.s:insert({3})
      
      s = merger.new_tuple_source(box.space.s:pairs())
      s:select()
      ---
      - - [1]
        - [2]
        - [3]
      ...
      
      s = merger.new_tuple_source(box.space.s:pairs())
      s:pairs():totable()
      ---
      - - [1]
        - [2]
        - [3]
      ...
      ```
      
      As we can see, a source (this is common to all sources) has
      `:select()` and `:pairs()` methods. The first one has two options,
      `buffer` and `limit`, with the same meaning as in net.box
      `:select()`. The `:pairs()` method (or its `:ipairs()` alias)
      returns a luafun iterator (it is a Lua iterator, but also provides
      a set of handy methods to operate in a functional style).
      
      The same API exists to create a table and a buffer source:
      `merger.new_table_source(gen, param, state)` and
      `merger.new_buffer_source(gen, param, state)`. A `gen` function should
      return a table or a buffer on each call.
      
      There are also helpers that are useful when all data are available at
      once: `merger.new_source_fromtable(tbl)` and
      `merger.new_source_frombuffer(buf)`.
      
      A merger is a special kind of source, which is created from a
      key_def object and a set of sources. It performs a kind of merge
      sort: it chooses the source with the minimal / maximal tuple on
      each step, consumes a tuple from this source, and repeats. The API
      to create a merger is the following:
      
      ```lua
      local key_def_lib = require('key_def')
      local merger = require('merger')
      
      local key_def = key_def_lib.new(<...>)
      local sources = {<...>}
      local merger_inst = merger.new(key_def, sources, {
          -- Ascending (false) or descending (true) order.
          -- Default is ascending.
          reverse = <boolean> or <nil>,
      })
      ```
      
      An instance of a merger has the same `:select()` and `:pairs()` methods
      as any other source.
      
      The `key_def_lib.new()` function takes a table of key parts as an
      argument in the same format as box.space.<...>.index.<...>.parts or
      conn.space.<...>.index.<...>.parts (where conn is a net.box connection):
      
      ```
      local key_parts = {
          {
              fieldno = <number>,
              type = <string>,
              [ is_nullable = <boolean>, ]
              [ collation_id = <number>, ]
              [ collation = <string>, ]
          },
          ...
      }
      local key_def = key_def_lib.new(key_parts)
      ```
      
      A key_def can be cached across requests with the same ordering rules
      (typically requests to the same space).
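      A concrete sketch gluing the pieces together - two table sources
      merged by the first field:
      
      ```lua
      local key_def_lib = require('key_def')
      local merger = require('merger')
      
      local key_def = key_def_lib.new({{fieldno = 1, type = 'unsigned'}})
      local sources = {
          merger.new_source_fromtable({{1}, {3}, {5}}),
          merger.new_source_fromtable({{2}, {4}, {6}}),
      }
      local merger_inst = merger.new(key_def, sources)
      merger_inst:select()
      -- [1], [2], [3], [4], [5], [6]
      ```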
    • Add merger for tuple streams (C part) · ec12395f
      Alexander Turenko authored
      Needed for #3276.
    • net.box: add skip_header option to use with buffer · 1aaf6378
      Alexander Turenko authored
      Needed for #3276.
      
      @TarantoolBot document
      Title: net.box: skip_header option
      
      This option instructs net.box to strip the
      {[IPROTO_DATA_KEY] = ...} wrapper from a buffer. This may be
      needed to pass the buffer to some C function that expects specific
      msgpack input.
      
      Usage example:
      
      ```lua
      local net_box = require('net.box')
      local buffer = require('buffer')
      local ffi = require('ffi')
      local msgpack = require('msgpack')
      local yaml = require('yaml')
      
      box.cfg{listen = 3301}
      box.once('load_data', function()
          box.schema.user.grant('guest', 'read,write,execute', 'universe')
          box.schema.space.create('s')
          box.space.s:create_index('pk')
          box.space.s:insert({1})
          box.space.s:insert({2})
          box.space.s:insert({3})
          box.space.s:insert({4})
      end)
      
      local function foo()
          return box.space.s:select()
      end
      _G.foo = foo
      
      local conn = net_box.connect('localhost:3301')
      
      local buf = buffer.ibuf()
      conn.space.s:select(nil, {buffer = buf})
      local buf_str = ffi.string(buf.rpos, buf.wpos - buf.rpos)
      local buf_lua = msgpack.decode(buf_str)
      print('select:\n' .. yaml.encode(buf_lua))
      -- {48: [[1], [2], [3], [4]]}
      
      local buf = buffer.ibuf()
      conn.space.s:select(nil, {buffer = buf, skip_header = true})
      local buf_str = ffi.string(buf.rpos, buf.wpos - buf.rpos)
      local buf_lua = msgpack.decode(buf_str)
      print('select:\n' .. yaml.encode(buf_lua))
      -- [[1], [2], [3], [4]]
      
      local buf = buffer.ibuf()
      conn:call('foo', nil, {buffer = buf})
      local buf_str = ffi.string(buf.rpos, buf.wpos - buf.rpos)
      local buf_lua = msgpack.decode(buf_str)
      print('call:\n' .. yaml.encode(buf_lua))
      -- {48: [[[1], [2], [3], [4]]]}
      
      local buf = buffer.ibuf()
      conn:call('foo', nil, {buffer = buf, skip_header = true})
      local buf_str = ffi.string(buf.rpos, buf.wpos - buf.rpos)
      local buf_lua = msgpack.decode(buf_str)
      print('call:\n' .. yaml.encode(buf_lua))
      -- [[[1], [2], [3], [4]]]
      
      os.exit()
      ```
    • lua: add non-recursive msgpack decoding functions · b1b2e7b0
      Alexander Turenko authored
      Needed for #3276.
      
      @TarantoolBot document
      Title: Non-recursive msgpack decoding functions
      
      Contracts:
      
      ```
      msgpack.decode_array_header(buf.rpos, buf:size()) -> arr_len, new_rpos
      msgpack.decode_map_header(buf.rpos, buf:size()) -> map_len, new_rpos
      ```
      
      These functions are intended to be used with a msgpack buffer
      received from net.box. A user may want to skip the
      {[IPROTO_DATA_KEY] = ...} wrapper and an array header before
      passing the buffer to be decoded in some C function.
      
      See https://github.com/tarantool/tarantool/issues/2195 for more
      information on this net.box API.
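      A minimal usage sketch (filling the buffer locally instead of via
      net.box):
      
      ```lua
      local buffer = require('buffer')
      local msgpack = require('msgpack')
      
      local buf = buffer.ibuf()
      msgpack.encode({10, 20, 30}, buf)
      
      -- Skip the array header; rpos then points at the first element.
      local arr_len, new_rpos = msgpack.decode_array_header(buf.rpos,
                                                            buf:size())
      buf.rpos = new_rpos
      assert(arr_len == 3)
      ```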
  6. May 10, 2019
    • test: fix flaky show_error_on_disconnect failures · 13238539
      avtikhon authored
      The test needs to wait for the upstream/downstream status,
      otherwise an error occurs:
      
          [025] --- replication/show_error_on_disconnect.result Fri Apr 12 14:49:26 2019
          [025] +++ replication/show_error_on_disconnect.reject Tue Apr 16 07:35:41 2019
          [025] @@ -77,11 +77,12 @@
          [025] ...
          [025] box.info.replication[other_id].upstream.status
          [025] ---
          [025] -- stopped
          [025] +- sync
          [025] ...
          [025] box.info.replication[other_id].upstream.message:match("Missing")
          [025] ---
          [025] -- Missing
          [025] +- error: '[string "return box.info.replication[other_id].upstrea..."]:1: attempt to
          [025] + index field ''message'' (a nil value)'
          [025] ...
          [025] test_run:cmd("switch master_quorum2")
          [025] ---
          [025]
      
      Close #4161
    • test: update test-run · 31ed449f
      Alexander Turenko authored
      Added test_run:wait_upstream() and test_run:wait_downstream() functions
      to wait for certain box.info.replication values (#158).
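      A hypothetical usage sketch in a test (the id and status values
      are examples):
      
      ```lua
      test_run:wait_upstream(other_id, {status = 'stopped'})
      test_run:wait_downstream(other_id, {status = 'follow'})
      ```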
  7. May 09, 2019
    • Feature request for a new collation · 132bb3b1
      Stanislav Zudin authored
      Adds more tests for collations, marks unstable collation tests,
      and removes a duplicate test.
      
      Closes #4007
      
      @TarantoolBot document
      Title: New collations
      
      This commit introduces a wide variety of collations. The new
      collations are named according to the following pattern:
      unicode_<locale>_<strength>
      Three strengths are used:
      primary - "s1",
      secondary - "s2", and
      tertiary - "s3".
      
      The following list contains the so-called "stable" collations -
      the ones whose sort order doesn't depend on the ICU version:
      
      unicode_am_s3
      unicode_fi_s3
      unicode_de__phonebook_s3
      unicode_haw_s3
      unicode_he_s3
      unicode_hi_s3
      unicode_is_s3
      unicode_ja_s3
      unicode_ko_s3
      unicode_lt_s3
      unicode_pl_s3
      unicode_si_s3
      unicode_es_s3
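      A usage sketch - any of the collations above can be referenced by
      name in an index part (the space and field are examples):
      
      ```lua
      local s = box.schema.space.create('names')
      s:create_index('pk', {
          parts = {{1, 'string', collation = 'unicode_fi_s3'}},
      })
      ```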
  8. May 08, 2019
    • swim: check broadcast interfaces more rigorously · e66ab61f
      Vladislav Shpilevoy authored
      It turned out that the standard getifaddrs() function can return
      addresses that have the IFF_BROADCAST flag set, but at the same
      time have a NULL struct sockaddr *ifa_broadaddr pointer.
      
      This led to a crash. The patch adds a check that the address is
      not NULL.
  9. May 07, 2019
    • box: zap field_map_get_size · 55e1a140
      Vladimir Davydov authored
      Turns out we don't really need it, as we can use data_offset +
      bsize (i.e. the value returned by the tuple_size() helper
      function) to get the size of a tuple to free. We only need to
      take into account the offset of the base tuple struct in the
      derived struct (memtx_tuple).
      
      There's a catch though:
      
       - We use sizeof(struct memtx_tuple) + field_map_size + bsize for
         allocation size.
       - We set data_offset to sizeof(struct tuple) + field_map_size.
       - struct tuple is packed, which makes its size 10 bytes.
       - memtx_tuple embeds struct tuple (base) at 4 byte offset, but since
         it is not packed, its size is 16 bytes, NOT 4 + 10 = 14 bytes as
         one might expect!
       - This means data_offset + bsize + offsetof(struct memtx_tuple, base)
         doesn't equal allocation size.
      
      To fix that, let's mark memtx_tuple packed. The only side effect it has
      is that we save 2 bytes per each memtx tuple. It won't affect tuple data
      layout at all, because struct memtx_tuple already has a packed layout
      and so 'packed' will only affect its size, which is only used for
      computing allocation size.
      
      My bad, I overlooked it during review.
      
      Follow-up f1d9f257 ("box: introduce multikey indexes in memtx").
    • box: introduce multikey indexes in memtx · f1d9f257
      Kirill Shcherbatov authored
      - In the case of a multikey index an ambiguity arises: which key
        should be used in a comparison. The previously introduced
        comparison hints act as a non-negative numeric index of the
        key to use,
      - The memtx B+ tree replace and build_next methods have been
        patched to insert the same tuple multiple times, by the
        different logical indexes of the key in the array,
      - Tuple field maps have been extended with "extent" service
        areas that contain the offsets of the multikey keys, addressed
        by the additional logical index.
      
      Part of #1257
      
      @TarantoolBot document
      Title: introduce multikey indexes in memtx
      Any JSON index in which at least one part contains the "[*]"
      array index placeholder sign is called "multikey". Such indexes
      allow you to automatically index a set of documents having the
      same document structure.
      
      The multikey index design has a number of restrictions that must
      be taken into account:
       - it cannot be primary, because of the ambiguity arising from
         its definition (a primary index requires the one unique key
         that identifies a tuple),
       - if some node in the JSON tree of all defined indexes contains
         an array index placeholder [*], no other JSON path can use an
         explicit JSON index on its nested field,
       - it supports "unique" semantics, but its uniqueness is a little
         different from conventional indexes: you may insert a tuple in
         which the same key occurs multiple times into a unique
         multikey index, but you cannot insert a tuple when any of its
         keys is in some other tuple stored in the space,
       - a unique multikey index "duplicate" conflict occurs when the
         sets of extracted keys have a non-empty intersection,
       - to identify the different keys by which a given data tuple is
         indexed, each key is assigned a logical sequence number in the
         array defined with the array index placeholder [*] in the
         index (such an array is called the multikey index root),
       - no index part can contain more than one array index
         placeholder sign [*] in its JSON path,
       - all parts containing JSON paths with the array index
         placeholder [*] must have the same (in terms of JSON tokens)
         prefix before this placeholder sign.
      
      Example 1:
      s = box.schema.space.create('clients')
      s:format({{name='name', type='string'}, {name='phone', type='array'}})
      name_idx = s:create_index('name_idx', {parts = {{'name', 'string'}}})
      phone_idx = s:create_index('phone_idx',
                                 {parts = {{'phone[*]', 'string'}}})
      s:insert({"Jorge", {"911", "89457609234"}})
      s:insert({"Bob", {"81239876543"}})
      
      phone_idx:get("911")
      ---
      - ['Jorge', ['911', '89457609234']]
      ...
      
      Example 2:
      s = box.schema.space.create('withdata')
      pk = s:create_index('pk')
      parts = {
          {2, 'str', path = 'data[*].name'},
          {2, 'str', path = 'data[*].extra.phone'}
      }
      idx = s:create_index('idx', {parts = parts})
      s:insert({1, {data = {{name="A", extra={phone="111"}},
                            {name="B", extra={phone="111"}}},
                    garbage = 1}})
      idx:get({'A', '111'})