diff --git a/doc/user/configuration-reference.xml b/doc/user/configuration-reference.xml index 519221212befe1323a2b02e7d8302bae83d3d5d0..8d397a97fb92d7d36660680bd899a8c6d4064a56 100644 --- a/doc/user/configuration-reference.xml +++ b/doc/user/configuration-reference.xml @@ -82,6 +82,46 @@ Target: Linux-x86_64-Debug </section> + +<section xml:id="URI" xreflabel="URI"> + <title>URI</title> + <para> +Some configuration parameters and some functions +depend on a URI, or "Uniform Resource Identifier". +The URI string format is similar to the +<link +xlink:href="http://en.wikipedia.org/wiki/URI_scheme#Generic_syntax">generic syntax for a URI scheme</link>. +So it may contain (in order) a user name for login, a password, +a host name or host IP address, and a port number. +Only the port number is always mandatory. The password +is mandatory if the user name is specified, +unless the user name is 'guest'. +So, formally, the URI syntax is +<code>[host:]port</code> +or <code>[username:password@]host:port</code> +or if username='guest' it may be +<code>[username@]host:port</code>. +If host is omitted, then 'localhost' is assumed. +If username:password is omitted, then 'guest' is assumed. +Some examples: + <informaltable> + <tgroup cols="2" align="left" colsep="1" rowsep="0"> + <thead> + <row><entry>URI fragment</entry><entry>Example</entry></row> + </thead> + <tbody> + <row><entry>port</entry><entry> 3301</entry></row> + <row><entry>host:port</entry><entry> 127.0.0.1:3301</entry></row> + <row><entry>guest@host:port</entry><entry> guest@mail.ru:3301</entry></row> + <row><entry>username:password@host:port</entry><entry> guest:sesame@mail.ru:3301</entry></row> + </tbody> + </tgroup> + </informaltable> +In certain circumstances a Unix socket may be used where a URI is required. 
+</para> +</section> + + <section xml:id="initialization-file" xreflabel="initialization file"> <title>Initialization file</title> <para> @@ -113,7 +153,6 @@ Then the screen might look like this:<programlisting> ... main/101/script.lua I> recovering from `./00000000000000000000.snap' ... main/101/script.lua I> primary: bound to 0.0.0.0:3301 ... main/102/leave_local_hot_standby I> ready to accept requests -... main/103/snapshot_daemon I> started Starting ARG ... main C> entering the event loop</programlisting> </para> @@ -217,7 +256,7 @@ Starting ARG <entry>integer or string</entry> <entry>null</entry> <entry>no</entry> - <entry>The read/write data port number or URI (Universal Resource Identifier) string. + <entry>The read/write data port number or <link linkend="URI">URI</link> (Universal Resource Identifier) string. Has no default value, so <emphasis role="strong">must be specified</emphasis> if connections will occur from remote clients @@ -342,7 +381,7 @@ tarantool: primary pri: 3301 adm: 3313</programlisting> </table> - <table frame='all' pgwide='1'> + <table xml:id="snapshot-daemon" frame='all' pgwide='1'> <title>Snapshot daemon</title> <tgroup cols='5' colsep='1' rowsep='1'> <colspec colnum="1" colname="col1" colwidth="2*"/> @@ -517,13 +556,8 @@ tarantool: primary pri: 3301 adm: 3313</programlisting> <entry>If replication_source is not an empty string, the server is considered to be a Tarantool replica. The replica server will try to connect to the master - which replication_source specifies with a URI. - The string format is similar to the - <link - xlink:href="http://en.wikipedia.org/wiki/URI_scheme#Generic_syntax">generic syntax for a URI schema</link> - So it may be a user name and password and ip:port address such as 'guest:password@127.0.0.1:3301', - or just ip:port address such as '127.0.0.1:3301', - or just port such as '3301'. 
+ which replication_source specifies with a <link linkend="URI">URI</link> (Uniform Resource Identifier), + for example 'konstantin:secret_password@tarantool.org:3301'. The default user name is 'guest'. A replica server does not accept data-change requests on the <olink targetptr="primary_port">listen</olink> port. diff --git a/doc/user/connectors.xml b/doc/user/connectors.xml index ba43c2fb2be6e225c981da54990705ecd55491ee..69cf85f6b94aebe6637e846a978714aa42f4324b 100644 --- a/doc/user/connectors.xml +++ b/doc/user/connectors.xml @@ -118,6 +118,31 @@ parameters. Something like <code>response=tarantool_routine("insert",0,"A","B"); And that is why APIs exist for drivers for Perl, Python, PHP, and so on. </para> </section> + + <section xml:id="connector-server"> + <title>Setting up the server for connector examples</title> + <para> + This chapter has examples that show how to connect to the Tarantool + server via the Perl, PHP, and Python connectors. + The examples contain hard-coded values that will work if and only if + the server (tarantool) is running on localhost (127.0.0.1) and is listening on port 3301 + (<code>box.cfg.listen='3301'</code>) and space 'examples' has id = 999 + (<code>box.space.examples.id = 999</code>), and + space 'examples' has a primary-key index for a numeric field + (<code>box.space[999].index[0].parts[1].type = "NUM"</code>) + and user 'guest' has privileges for reading and writing. 
+ </para> + <para> + It is easy to meet all the conditions by starting the + server and executing this script:<programlisting> +box.cfg{listen=3301} +box.schema.space.create('examples',{id=999}) +box.space.examples:create_index('primary', {type = 'hash', parts = {1, 'NUM'}}) +box.schema.user.grant('guest','read,write','space','examples') +box.schema.user.grant('guest','read','space','_space') +</programlisting> + </para> + </section> <section xml:id="connector-java"> <title>Java</title> @@ -133,29 +158,26 @@ And that is why APIs exist for drivers for Perl, Python, PHP, and so on. It is not supplied as part of the Tarantool repository; it must be installed separately. The most common way to install it is with <link xlink:href='https://en.wikipedia.org/wiki/Cpan'>CPAN, the Comprehensive Perl Archive Network</link>. DR::Tarantool requires other modules which should be installed first. - For example, on Ubuntu, the installation could look like this: - <programlisting> + For example, on Ubuntu, the installation could look like this:<programlisting> sudo cpan install AnyEvent sudo cpan install Devel::GlobalDestruction sudo cpan install Coro sudo cpan install Test::Pod sudo cpan install Test::Spelling sudo cpan install PAR::Dist -sudo cpan install DR::Tarantool - </programlisting> - </para> +sudo cpan install List::MoreUtils +sudo cpan install DR::Tarantool</programlisting> + </para> <para> - Here is a complete Perl program that inserts [99999,'BB'] into space[0] via the Perl API. - Before trying to run, check that the server - (tarantool) is running on localhost (127.0.0.1) and its listen address is the default - (local host, port 3301) and - space[0]'s primary key type is numeric (box.space[0].index[0].parts[1].type = "NUM"). + Here is a complete Perl program that inserts [99999,'BB'] into space[999] via the Perl API. 
+ Before trying to run, check that the server is listening and that <code>examples</code> exists, + as described <link linkend="connector-server">earlier</link>. To run, paste the code into a file named example.pl and say <code>perl example.pl</code>. The program will connect using an application-specific definition of the space. The program will open a socket connection with the tarantool server at localhost:3301, then send an INSERT request, then — if all is well — end without displaying any messages. - If tarantool is not running on localhost with listen address = port 3301, the program will print + If tarantool is not running on localhost with listen address = 3301, the program will print <quote>Connection refused</quote>. </para> <para> @@ -163,22 +185,24 @@ sudo cpan install DR::Tarantool #!/usr/bin/perl use DR::Tarantool ':constant', 'tarantool'; use DR::Tarantool ':all'; +use DR::Tarantool::MsgPack::SyncClient; + +my $tnt = DR::Tarantool::MsgPack::SyncClient->connect( + host => '127.0.0.1', # look for tarantool on localhost + port => 3301, # assume tarantool listen address = default + user => 'guest', # username. one could also say 'password=>...' -my $tnt = tarantool - host => '127.0.0.1', # look for tarantool on localhost - port => 3301, # assume tarantool listen address = default - spaces => { - 0 => { # definition of space[0] ... - name => 't0', # space[0] name = 't0' - default_type => 'STR', # space[0] field type is 'STR' if undefined - fields => [ { # definition of space[0].fields ... - name => 'k0', type => 'NUM' } ], # space[0].field[1] name='k0',type='NUM' - indexes => { # definition of space[0] indexes ... - 0 => { - name => 'k0', fields => 'k0' } } } }; + spaces => { + 999 => { # definition of space[999] ... + name => 'examples', # space[999] name = 'examples' + default_type => 'STR', # space[999] field type is 'STR' if undefined + fields => [ { # definition of space[999].fields ... 
+ name => 'field1', type => 'NUM' } ], # space[999].field[1] name='field1',type='NUM' + indexes => { # definition of space[999] indexes ... + 0 => { + name => 'primary', fields => [ 'field1' ] } } } } ); -$tnt->insert('t0' => [ 99999, 'BB' ]); # INSERT INTO t0 VALUES (99999,'BB') - </programlisting> +$tnt->insert('examples' => [ 99999, 'BB' ]);</programlisting> </para> <para> The example program only shows one command and does not show all that's necessary for @@ -206,9 +230,8 @@ cd tarantool-php phpize ./configure make -#make install is optional - </programlisting> - </para> +#make install is optional</programlisting> + </para> <para> At this point there is a file named <filename>~/tarantool-php/modules/tarantool.so</filename>. PHP will only find it if the PHP initialization file <filename>php.ini</filename> contains a line like @@ -218,15 +241,12 @@ make <programlisting> cd ~ cp ./tarantool-php/modules/tarantool.so . -export PHP_INI_SCAN_DIR=~/tarantool-php/tests/shared - </programlisting> +export PHP_INI_SCAN_DIR=~/tarantool-php/tests/shared</programlisting> </para> <para> - Here is a complete PHP program that inserts [99999,'BB'] into a space named 'tester' via the PHP API. - Before trying to run, check that the server - (tarantool) is running on localhost (127.0.0.1) and its listen address is the default - (local host, port 3301) and - tester's primary key type is numeric (box.space.tester.index[0].parts[1].type = "NUM"). + Here is a complete PHP program that inserts [99999,'BB'] into a space named 'examples' via the PHP API. + Before trying to run, check that the server is listening and that <code>examples</code> exists, + as described <link linkend="connector-server">earlier</link>. To run, paste the code into a file named example.php and say <code>php example.php</code>. 
The program will open a socket connection with the tarantool server at localhost:3301, then send an INSERT request, @@ -238,14 +258,17 @@ export PHP_INI_SCAN_DIR=~/tarantool-php/tests/shared <?php $tarantool = new Tarantool("localhost", 3301); try { - $tarantool->insert("tester", array(99999, "BB")); + $tarantool->insert("examples", array(99999, "BB")); print "Insert succeeded\n"; } catch (Exception $e) { echo "Exception: ", $e->getMessage(), "\n"; } -?> - </programlisting> +?></programlisting> + </para> + <para> + After running the example, it is good practice to delete the file ./tarantool.so, + since it is only compatible with PHP and its existence could confuse non-PHP applications. </para> <para> The example program only shows one command and does not show all that's necessary for @@ -258,14 +281,14 @@ catch (Exception $e) { <section xml:id="connector-python"> <title>Python</title> <para> - Here is a complete Python program that inserts ['First Tuple','Value','Value'] into space99 via the high-level Python API. + Here is a complete Python program that inserts [99999,'Value','Value'] into space <code>examples</code> via the high-level Python API. </para> <programlisting language="python"> #!/usr/bin/python from tarantool import Connection c = Connection("127.0.0.1", 3301) -result = c.insert("space99",('First Tuple','Value', 'Value')) +result = c.insert("examples",(99999,'Value', 'Value')) print result </programlisting> <para> @@ -273,20 +296,12 @@ print result <userinput><code>pip install tarantool\>0.4</code></userinput> to install in <filename>/usr</filename> (requires root privilege) or <userinput><code>pip install tarantool\>0.4 --user</code></userinput> to install in <filename>~</filename> i.e. user's default directory. 
- The program is assuming that the server (tarantool) is running on localhost (127.0.0.1) and its listen address is - the default (local host, port 3301) and space99's primary key type is string (box.space.space99.index[0].parts[1].type = "STR") - and user 'guest' has permission to read and write on space99. An administrator could fulfill all those conditions by - starting the tarantool server and executing these requests:<programlisting> -box.cfg{listen = 3301} -box.schema.create_space('space99') -box.space.space99:create_index('primary',{parts = {1,'STR'}}) -box.schema.user.grant('guest', 'read', 'space', '_space') -box.schema.user.grant('guest', 'read,write', 'space', 'space99')</programlisting> + Before trying to run, check that the server is listening and that <code>examples</code> exists, + as described <link linkend="connector-server">earlier</link>. To run the program, say <code>python example.py</code>. The program will connect to the server, will send the request, and will not throw an exception if all went well. If the tuple already exists, the program will throw DatabaseException(“Duplicate key exists in unique index”). </para> - <para> The example program only shows one request and does not show all that's necessary for good practice. For that, see diff --git a/doc/user/databases.xml b/doc/user/databases.xml index bf55c132d3aefb2b080ad370fef13ce77847bda3..2849948c68297348fbb28a8c760d5b4343a40bea 100644 --- a/doc/user/databases.xml +++ b/doc/user/databases.xml @@ -149,13 +149,13 @@ <variablelist xml:id="box.schema" xreflabel="box.schema"> <para> The <code>box.schema</code> package has one data-definition - function: create_space(). + function: space.create(). 
</para> <varlistentry> <term> - <emphasis role="lua" xml:id="box.create_space"> - box.schema.create_space(<replaceable>space-name</replaceable> [, {<replaceable>options</replaceable>} ]) + <emphasis role="lua" xml:id="box.space.create"> + box.schema.space.create(<replaceable>space-name</replaceable> [, {<replaceable>options</replaceable>} ]) </emphasis> </term> <listitem> @@ -168,7 +168,7 @@ </para> <para> <table> - <title>Options for box.schema.create_space</title> + <title>Options for box.schema.space.create</title> <tgroup cols="4" align="left" colsep="1" rowsep="1"> <tbody> <row> @@ -206,14 +206,14 @@ Possible errors: If a space with the same name already exists. <bridgehead renderas="sect4">Example</bridgehead> <programlisting> -tarantool> <userinput>s = box.schema.create_space('space55')</userinput> +tarantool> <userinput>s = box.schema.space.create('space55')</userinput> --- ... -tarantool> <userinput>s = box.schema.create_space('space55', {id = 555, temporary = false})</userinput> +tarantool> <userinput>s = box.schema.space.create('space55', {id = 555, temporary = false})</userinput> --- - error: Space 'space55' already exists ... -tarantool> <userinput>s = box.schema.create_space('space55', {if_not_exists = true})</userinput> +tarantool> <userinput>s = box.schema.space.create('space55', {if_not_exists = true})</userinput> --- ... </programlisting> @@ -294,6 +294,9 @@ tarantool> <userinput>s = box.schema.create_space('space55', {if_not_exists = tr <row> <entry>parts</entry><entry>field-numbers + types</entry><entry>{field_no, 'NUM'|STR'}</entry><entry>{1, 'NUM'}</entry> </row> + <row> + <entry>if_not_exists</entry><entry>no error if duplicate name</entry><entry>true|false</entry><entry>false</entry> + </row> </tbody> </tgroup> </table> @@ -369,7 +372,7 @@ tarantool> <userinput>s:create_index('primary', {unique = true, parts = {1, 'NUM Possible Errors: No such space; wrong type. 
<bridgehead renderas="sect4">Example</bridgehead> <programlisting> -tarantool> <userinput>s = box.schema.create_space('tmp', {temporary=true})</userinput> +tarantool> <userinput>s = box.schema.space.create('tmp', {temporary=true})</userinput> --- ... tarantool> <userinput> s:create_index('primary',{parts = {1,'NUM', 2, 'STR'}})</userinput> @@ -649,7 +652,7 @@ tarantool> <userinput>box.space.space55.index.primary:rename('secondary')</useri The <code>update</code> function supports operations on fields — assignment, arithmetic (if the field is unsigned numeric), cutting and pasting fragments of a field, - deletng or inserting a field. Multiple + deleting or inserting a field. Multiple operations can be combined in a single update request, and in this case they are performed atomically and sequentially. Each operation requires specification of a field number. When multiple operations @@ -823,12 +826,12 @@ tarantool> <userinput>box.space.tester:delete('a')</userinput> <varlistentry> <term> - <emphasis role="lua">box.space.<replaceable>space-name</replaceable>.field_count</emphasis> + <emphasis role="lua" xml:id="box.space.field_count">box.space.<replaceable>space-name</replaceable>.field_count</emphasis> </term> <listitem> <para> (type = number) The required field count for all tuples in this space. - The field_count can be set initially with <code>box.schema.create_space<replaceable>... field_count = new-field-count-value ...</replaceable></code>. + The field_count can be set initially with <code>box.schema.space.create<replaceable>... field_count = new-field-count-value ...</replaceable></code>. The default value is 0, which means there is no required field count. </para> </listitem> @@ -920,7 +923,7 @@ tarantool> <userinput>box.space.tester:len()</userinput> Complexity Factors: Index size, Index type, WAL settings. 
</para> <bridgehead renderas="sect4">Example</bridgehead> -<programlisting>tarantool> <userinput>s = box.schema.create_space('forty_second_space')</userinput> +<programlisting>tarantool> <userinput>s = box.schema.space.create('forty_second_space')</userinput> --- ... tarantool> <userinput>s:create_index('primary', {unique = true, parts = {1, 'NUM', 2, 'STR'}})</userinput> @@ -958,7 +961,7 @@ tarantool> <userinput>box.space.forty_second_space:inc{1,'a'}</userinput> Complexity Factors: Index size, Index type, WAL settings. </para> <bridgehead renderas="sect4">Example</bridgehead> -<programlisting>tarantool> <userinput>s = box.schema.create_space('space19')</userinput> +<programlisting>tarantool> <userinput>s = box.schema.space.create('space19')</userinput> --- ... tarantool> <userinput>s:create_index('primary', {unique = true, parts = {1, 'NUM', 2, 'STR'}})</userinput> @@ -992,7 +995,7 @@ tarantool> <userinput>box.space.space19:dec{1,'a'}</userinput> Within the loop, a value (type = tuple) is returned for each iteration. </para> <bridgehead renderas="sect4">Example</bridgehead> -<programlisting><prompt>tarantool></prompt> <userinput>s = box.schema.create_space('space33')</userinput> +<programlisting><prompt>tarantool></prompt> <userinput>s = box.schema.space.create('space33')</userinput> --- ... <prompt>tarantool></prompt> <userinput>s:create_index('X', {}) -- index 'X' has default parts {1,'NUM'}</userinput> @@ -1028,7 +1031,7 @@ tarantool> <userinput>box.space.space19:dec{1,'a'}</userinput> create a function which increments a counter, create a trigger, do two inserts, drop the space, and display the counter value -- which is 2, because the function is executed once after each insert. 
-<programlisting>s = box.schema.create_space('space53') +<programlisting>s = box.schema.space.create('space53') s:create_index('primary', {parts = {1, 'NUM'}}) function replace_trigger() replace_counter = replace_counter + 1 end s:on_replace(replace_trigger) @@ -1176,7 +1179,7 @@ console.delimiter('')!</programlisting> </term> <listitem> <para> - _user is a new system tuple set for support of the authorization feature. + _user is a new system tuple set for support of the <link linkend="authentication">authorization feature</link>. </para> </listitem> </varlistentry> @@ -1198,7 +1201,7 @@ console.delimiter('')!</programlisting> </term> <listitem> <para> - _cluster is a new system tuple set for support of the replication feature. + _cluster is a new system tuple set for support of the <olink targetptr="replication">replication feature</olink>. </para> </listitem> </varlistentry> @@ -1316,8 +1319,9 @@ console.delimiter('')! type: NUM fieldno: 1 id: 0 + space_id: 513 + name: primary type: TREE - idx: ' index 0' etc. ...</programlisting></listitem> </varlistentry> @@ -1381,7 +1385,7 @@ console.delimiter('')! <para> <bridgehead renderas="sect4">Example</bridgehead> <programlisting> -<prompt>tarantool></prompt> <userinput> s = box.schema.create_space('space17')</userinput> +<prompt>tarantool></prompt> <userinput> s = box.schema.space.create('space17')</userinput> --- ... <prompt>tarantool></prompt> <userinput> s:create_index('primary', {parts = {1, 'STR', 2, 'STR'}})</userinput> @@ -1461,7 +1465,7 @@ console.delimiter('')! # Create a non-unique index 'secondary' with an index on the second field. # Insert three tuples, values in field[2] equal to 'X', 'Y', and 'Z'. # Select all tuples where the secondary index keys are greater than 'X'. 
-box.schema.create_space('tester') +box.schema.space.create('tester') box.space.tester:create_index('primary', {parts = {1, 'NUM' }}) box.space.tester:create_index('secondary', {type = 'tree', unique = false, parts = {2, 'STR'}}) box.space.tester:insert{1,'X','Row with field[2]=X'} @@ -1495,7 +1499,7 @@ The result will be a table of tuples and will look like this: <varlistentry> <term> - <emphasis role="lua">box.space.<replaceable>space-name</replaceable>.index.<replaceable>index-name</replaceable>:min([<replaceable>key</replaceable>])</emphasis> + <emphasis role="lua">box.space.<replaceable>space-name</replaceable>.index.<replaceable>index-name</replaceable>:min([<replaceable>key-value</replaceable>])</emphasis> </term> <listitem> <para> @@ -1503,6 +1507,7 @@ The result will be a table of tuples and will look like this: </para> <para> Returns: (type = tuple) the tuple for the first key in the index. + If optional <code>key-value</code> is supplied, returns the first key which is greater than or equal to key-value. </para> <para> Complexity Factors: Index size, Index type. @@ -1523,7 +1528,7 @@ The result will be a table of tuples and will look like this: <varlistentry> <term> - <emphasis role="lua">box.space.<replaceable>space-name</replaceable>.index.<replaceable>index-name</replaceable>:max([<replaceable>key</replaceable>])</emphasis> + <emphasis role="lua">box.space.<replaceable>space-name</replaceable>.index.<replaceable>index-name</replaceable>:max([<replaceable>key-value</replaceable>])</emphasis> </term> <listitem> <para> @@ -1531,6 +1536,7 @@ The result will be a table of tuples and will look like this: </para> <para> Returns: (type = tuple) the tuple for the last key in the index. + If optional <code>key-value</code> is supplied, returns the last key which is less than or equal to key-value. </para> <para> Complexity Factors: Index size, Index type. @@ -2012,7 +2018,8 @@ tarantool> <userinput>t:transform(2,2,'x')</userinput> then the result is returned. 
</para> <para> - Parameters: <code>start-field-number</code> = base 1, may be negative, <code>end-field-number</code> = optional, base 1, negative treated as positive. + Parameters: <code>start-field-number</code> = base 1, may be negative, <code>end-field-number</code> = optional, base 1, may be negative. + Negative values are counted from the end, for example -2 means the second-last field. </para> <para> Returns: (type = scalar) one or more field values. @@ -2419,6 +2426,9 @@ tarantool> <userinput>box.stat() -- the full contents of the table</userinput> INSERT: total: 48207694 rps: 139 + AUTH: + total: 0 + rps: 0 CALL: total: 8 rps: 0 @@ -2987,7 +2997,7 @@ When a client connects to a Tarantool server, the server sends a random which the client must mix with the hashed-password before sending to the server. Thus the original value 'x' is never stored anywhere except in the -user's head, and the hashed value is never passed passed down a +user's head, and the hashed value is never passed down a network wire except when mixed with a random salt. This system prevents malicious onlookers from finding passwords by snooping in the log files or snooping on the wire. @@ -3007,14 +3017,14 @@ To see more about the details of the algorithm for the purpose of writing a new <para> <bridgehead renderas="sect4">Users and the _user space</bridgehead> The fields in the _user space are: -a numeric id, a number, the user name, the type, and the optional password. +the numeric id of the tuple, the numeric id of the tuple's creator, the user name, the type, and the optional password. </para> <para> There are three special users: 'guest', 'admin', and 'public'. </para> - <para> + <table> <title>The system users</title> <tgroup cols="4" align="left" colsep="1" rowsep="1"> @@ -3026,7 +3036,7 @@ There are three special users: 'guest', 'admin', and 'public'. <entry>guest</entry><entry>0</entry><entry>user</entry><entry>Default when connecting remotely. 
Usually an untrusted user with few privileges.</entry> </row> <row> - <entry>admin</entry><entry>1</entry><entry>user</entry><entry>Default when using sys/tarantool as a console. Usually an administrative user with all privileges.</entry> + <entry>admin</entry><entry>1</entry><entry>user</entry><entry>Default when using <code>tarantool</code> as a console. Usually an administrative user with all privileges.</entry> </row> <row> <entry>public</entry><entry>2</entry><entry>role</entry><entry>Not a user in the usual sense. A role is a container for privileges which can be granted to regular users. </entry> @@ -3034,15 +3044,15 @@ There are three special users: 'guest', 'admin', and 'public'. </tbody> </tgroup> </table> - </para> + <para> To select a row from the _user space, use <code>box.select</code>. For example, here is what happens with a select for user id = 0, -which is the 'guest' user, without a password: +which is the 'guest' user, which by default has no password: <programlisting><prompt>tarantool></prompt> <userinput>box.space._user:select{0}</userinput> --- -- - [0, 1, 'guest'] +- - [0, 1, 'guest', 'user'] ...</programlisting></para> <para> @@ -3056,6 +3066,9 @@ To create a new user, say <code>box.schema.user.create(<replaceable>user-name</replaceable>)</code> or <code>box.schema.user.create(<replaceable>user-name</replaceable>, {password=<replaceable>password</replaceable>})</code>. +The form +<code>box.schema.user.create(<replaceable>user-name</replaceable>, {password=<replaceable>password</replaceable>})</code> +is better because in a <link linkend="URI">URI</link> (Uniform Resource Identifier) it is usually illegal to include a user-name without a password. </para> <para> @@ -3072,14 +3085,16 @@ To drop a user, say For example, here is a session which creates a new user with a strong password, selects a tuple in the _user space, and then drops the user. 
-<programlisting><prompt>tarantool></prompt> <userinput>box.schema.user.create('ElizabethBrowning', {password = 'Iwtso65$SDS?'})</userinput> +<programlisting><prompt>tarantool></prompt> <userinput>box.schema.user.create('JeanMartin', {password = 'Iwtso_6_os$$'})</userinput> --- ... -<prompt>tarantool></prompt> <userinput>box.space._user:select{4}</userinput> + +<prompt>tarantool></prompt> <userinput>box.space._user.index.name:select{'JeanMartin'}</userinput> --- -- - [4, 1, 'ElizabethBrowning', {'chap-sha1': 'zyy3yArGOQ4T40PnsL6yPGlgYrU='}] +- - [17, 1, 'JeanMartin', 'user', {'chap-sha1': 't3xjUpQdrt857O+YRvGbMY5py8Q='}] ... -<prompt>tarantool></prompt> <userinput>box.schema.user.drop('ElizabethBrowning')</userinput> + +<prompt>tarantool></prompt> <userinput>box.schema.user.drop('JeanMartin')</userinput> --- ...</programlisting></para> @@ -3209,32 +3224,31 @@ box.space.payroll:select{'Jones'}</programlisting> <title>Limitations</title> <variablelist> - + <varlistentry> <term xml:id="limitations-index-field-count" xreflabel="limitations-index-field-count">Number of fields in an index</term> <listitem><para>For BITSET indexes, the maximum is 1. - For TREE indexes, the theoretical maximum is about 4 billion (BOX_FIELD_MAX) - but the practical maximum is the number of fields in a tuple. + For TREE or HASH indexes, the maximum is 255 (box.schema.INDEX_PART_MAX). </para></listitem> </varlistentry> - + <varlistentry> <term xml:id="limitations-index-count" xreflabel="limitations-index-count">Number of indexes in a space</term> - <listitem><para>10 (BOX_INDEX_MAX). + <listitem><para>10 (box.schema.INDEX_MAX). </para></listitem> </varlistentry> <varlistentry> <term xml:id="limitations-tuple-field-count" xreflabel="limitations-tuple-field-count">Number of fields in a tuple</term> - <listitem><para>There is no theoretical maximum. 
- The practical maximum is whatever is specified by the space's <code>field_count</code> member, + <listitem><para>The theoretical maximum is 2147483647 (box.schema.FIELD_MAX). + The practical maximum is whatever is specified by the space's <link linkend="box.space.field_count">field_count</link> member, or the maximum tuple length. </para></listitem> </varlistentry> - + <varlistentry> <term xml:id="limitations-space-count" xreflabel="limitations-space-count">Number of spaces</term> - <listitem><para>65535. + <listitem><para>The theoretical maximum is 2147483647 (box.schema.SPACE_MAX). </para></listitem> </varlistentry> @@ -3263,6 +3277,12 @@ box.space.payroll:select{'Jones'}</programlisting> <listitem><para>32. </para></listitem> </varlistentry> + + <varlistentry> + <term xml:id="limitations-name-length" xreflabel="limitations-name-length">Length of an index name or space name or user name</term> + <listitem><para>32 (box.schema.NAME_MAX). + </para></listitem> + </varlistentry> </variablelist> </section> diff --git a/doc/user/lua-tutorial.xml b/doc/user/lua-tutorial.xml index db4e85752eef0182164e606e201ef9d7975d53c7..0be320b948ddaad1369884d9573de8865b710dcc 100644 --- a/doc/user/lua-tutorial.xml +++ b/doc/user/lua-tutorial.xml @@ -599,7 +599,7 @@ targetptr="getting-started-start-stop"><quote>Starting Tarantool and making your <programlisting> box.space.tester:drop() -- if tester is left over from some previous test, destroy it -box.schema.create_space('tester') +box.schema.space.create('tester') box.space.tester:create_index('primary', {parts = {1, 'NUM'}}) </programlisting> then add some tuples where the first field is a number and the second field is a string. diff --git a/doc/user/plugins.xml b/doc/user/plugins.xml index f3a3c572c771e4a345a35ca66d3ac8c953cdc8dd..029edf6a38aa0fa88007981aa022b632bff083b8 100644 --- a/doc/user/plugins.xml +++ b/doc/user/plugins.xml @@ -63,7 +63,8 @@ on the local host 127.0.0.1. 
</para> <programlisting> -# Check that the include subdirectory exists by looking for ~/include/mysql.h. +# Check that the include subdirectory exists by looking for .../include/mysql.h. +# (If this fails, there's a chance that it's in .../include/mysql/mysql.h instead.) <prompt>$ </prompt><userinput>[ -f ~/mysql-5.5/include/mysql.h ] && echo "OK" || echo "Error"</userinput> OK @@ -119,7 +120,8 @@ Linking CXX shared library libmysql.so # The MySQL module should now be in ./src/module/mysql/mysql.so. # If a "make install" had been done, then mysql.so would be in a -# different place, for example /usr/local/lib/tarantool/1.5/box/net/mysql.so. +# different place, for example +# /usr/local/lib/x86_64-linux-gnu/tarantool/box/net/mysql.so. # In that case there should be additional cmake options such as # -DCMAKE_INSTALL_LIBDIR and -DCMAKE_INSTALL_PREFIX. # For this example we assume that "make install" is not done. diff --git a/doc/user/preface.xml b/doc/user/preface.xml index 0cc1065ea5cfd1f3aee3cd5413523e3c1215963a..c5975df9f7218706604b198551eea930bb013cca 100644 --- a/doc/user/preface.xml +++ b/doc/user/preface.xml @@ -214,9 +214,9 @@ <para> Please report bugs in Tarantool at <link xlink:href="http://github.com/tarantool/tarantool/issues"/>. You can - contact developers directly on + contact developers directly on the <link xlink:href="irc://irc.freenode.net#tarantool">#tarantool</link> - IRC channel or via a mailing list, + IRC channel on freenode, or via a mailing list, <link xlink:href="https://googlegroups.com/group/tarantool">Tarantool Google group</link>. 
</para> </section> diff --git a/doc/user/proctitle.xml b/doc/user/proctitle.xml index b4ea7969973a7f8839cadf9ddd7e49d2cfd75f8e..a3e5cd19864d714ccefc2cb5dc2dc6c1ac223817 100644 --- a/doc/user/proctitle.xml +++ b/doc/user/proctitle.xml @@ -45,7 +45,7 @@ <emphasis role="strong">spawner</emphasis> -- controls other processes, </para></listitem> <listitem><para> - <emphasis role="strong">replica + uri/status</emphasis> -- replication node accepting connections on <olink targetptr="replication_port"/>, + <emphasis role="strong">replica + URI/status</emphasis> -- replication node accepting connections on <olink targetptr="replication_port"/>, </para></listitem> <listitem><para> <emphasis role="strong">relay + sockaddr</emphasis> -- serves a single replication connection, diff --git a/doc/user/replication.xml b/doc/user/replication.xml index 025c103ba7783a2870ce5a95906f87dc3dce5338..2d6a9273d28558ecf2e552a7ccc9a7b32b1ac3e2 100644 --- a/doc/user/replication.xml +++ b/doc/user/replication.xml @@ -68,7 +68,7 @@ identifier which is unique within the cluster, known as the To prepare the master for connections from the replica, it's only necessary to include "listen" in the initial <code>box.cfg</code> request, for example <code>box.cfg{listen=3301}</code>. - A master with enabled "listen" URI can accept connections + A master with enabled "listen" <link linkend="URI">URI</link> can accept connections from as many replicas as necessary on that URI. Each replica has its own replication state. </para> @@ -106,6 +106,13 @@ identifier which is unique within the cluster, known as the replica to become a master and vice versa with the help of the <olink targetptr="box.cfg">box.cfg</olink> statement. </simpara></note> + <note><simpara> + The replica does not inherit the master's configuration parameters, + such as the ones that cause the <link linkend="snapshot-daemon">snapshot daemon</link> + to run on the master. 
To get the same behavior, + one would have to set the relevant parameters explicitly + so that they are the same on both master and replica. + </simpara></note> </section> <section xml:id="recovering-from-a-degraded-state"> @@ -126,9 +133,11 @@ identifier which is unique within the cluster, known as the propagated before the old master went down, they would have to be re-applied manually. </para> - + </section> + <section> + <title>Instructions for quick startup of a new two-server simple cluster</title> <para> - <bridgehead renderas="sect4">Instructions for quick startup of a new two-server simple cluster</bridgehead>Step 1. Start the first server thus:<programlisting><userinput>box.cfg{listen=<replaceable>uri#1</replaceable>}</userinput> +Step 1. Start the first server thus:<programlisting><userinput>box.cfg{listen=<replaceable>uri#1</replaceable>}</userinput> <userinput>box.schema.user.grant('guest','read,write,execute','universe') -- replace with more restrictive request</userinput> <userinput>box.snapshot()</userinput></programlisting>... Now a new cluster exists. </para> @@ -158,8 +167,11 @@ down then the replica can take over), or LOAD BALANCING (because clients can connect to either the master or the replica for select requests). </para> +</section> + +<section> +<title>Master-Master Replication</title> <para> - <bridgehead renderas="sect4">master-master</bridgehead> In the simple master-replica configuration, the master's changes are seen by the replica, but not vice versa, because the master was specified as the sole replication source. @@ -183,9 +195,10 @@ replica for select requests). there is a possibility that servers will end up with different contents. </para> +</section> +<section> + <title>All the "What If?" Questions</title> <para> - - <bridgehead renderas="sect4">All the "What If?" Questions</bridgehead> <emphasis>What if there are more than two servers with master-master?</emphasis> ... 
On each server, specify the replication_source for all the others. For example, server #3 would have a request: @@ -231,11 +244,19 @@ replica for select requests). ... Stop the server, destroy all the database files (the ones with extension "snap" or "xlog" or ".inprogress"), restart the server, and catch up with the master by contacting it again - (just sqy <code>box.cfg{...replication_source=...}</code>). + (just say <code>box.cfg{...replication_source=...}</code>). </para> - <para> - <bridgehead renderas="sect4">Hands-On (Tutorial)</bridgehead> + <emphasis>What if replication causes security concerns?</emphasis> + ... Prevent unauthorized replication sources by associating a password + with every user that has access privileges for the relevant spaces. + That way, the <link linkend="URI">URI</link> for the replication_source parameter + will always have to have the long form <code>replication_source='username:password@host:port'</code>. + </para> + </section> + <section> + <title>Hands-On Replication Tutorial</title> + <para> After following the steps here, an administrator will have experience creating a cluster and adding a replica. @@ -245,7 +266,7 @@ replica for select requests). <informaltable> <tgroup cols="2" align="left" colsep="1" rowsep="0"> <thead> - <row><entry>__________TERMINAL #1__________</entry><entry>__________TERMINAL #2__________</entry></row> + <row><entry>______________TERMINAL #1______________</entry><entry>______________TERMINAL #2______________</entry></row> </thead> <tbody> <row><entry><programlisting><prompt>$</prompt></programlisting></entry> @@ -261,8 +282,9 @@ replica for select requests). 
<userinput>cd ~/tarantool_test_node_1</userinput> <userinput>rm -R ~/tarantool_test_node_1/*</userinput> <userinput>~/tarantool-master/src/tarantool</userinput> -<userinput>box.cfg{listen=3301, logger='filename.log'}</userinput> -<userinput>box.schema.user.grant('guest','read,write,execute','universe')</userinput> +<userinput>box.cfg{listen=3301}</userinput> +<userinput>box.schema.user.create('replication', {password = 'password'})</userinput> +<userinput>box.schema.user.grant('replication','read,write','universe')</userinput> <userinput>box.space._cluster:select({0},{iterator='GE'})</userinput> </programlisting> </para> @@ -285,16 +307,19 @@ $ <userinput>cd ~/tarantool_test_node_1</userinput> type 'help' for interactive help tarantool> <userinput>box.cfg{listen=3301}</userinput> ... ... -tarantool> <userinput>box.schema.user.grant('guest','read,write,execute','universe')</userinput> -2014-08-14 13:39:57.712 [24956] wal I> creating `./00000000000000000000.xlog.inprogress' +tarantool> <userinput>box.schema.user.create('replication', {password = 'password'})</userinput> +2014-10-13 11:12:56.052 [25018] wal I> creating `./00000000000000000000.xlog.inprogress' +--- +... +tarantool> <userinput>box.schema.user.grant('replication','read,write','universe')</userinput> --- ... tarantool> <userinput>box.space._cluster:select({0},{iterator='GE'})</userinput> --- -- - [1, 'd3de1435-5e26-4122-95e5-3e2d40e6e1df'] +- - [1, '6190d919-1133-4452-b123-beca0b178b32'] ... 
</programlisting></entry> - <entry><programlisting>$ + <entry><programlisting>$ @@ -321,7 +346,7 @@ execute these commands:<programlisting> <userinput>cd ~/tarantool_test_node_2</userinput> <userinput>rm -R ~/tarantool_test_node_2/*</userinput> <userinput>~/tarantool-master/src/tarantool</userinput> -<userinput>box.cfg{listen=3302, replication_source=3301}</userinput> +<userinput>box.cfg{listen=3302, replication_source='replication:password@localhost:3301'}</userinput> <userinput>box.space._cluster:select({0},{iterator='GE'})</userinput></programlisting> The result is that a replica is set up. Messages appear on Terminal #1 confirming that the @@ -343,40 +368,41 @@ servers are in the same cluster. <row><entry><programlisting>... ... tarantool> box.space._cluster:select({0},{iterator='GE'}) --- -- - [1, 'd3de1435-5e26-4122-95e5-3e2d40e6e1df'] +- - [1, '6190d919-1133-4452-b123-beca0b178b32'] ... -tarantool> 2014-08-14 13:41:31.097 [24958] main/101/spawner I> created a replication relay: pid = 25148 -2014-08-14 13:41:31.098 [25148] main/101/relay/127.0.0.1:42758 I> recovery start -2014-08-14 13:41:31.098 [25148] main/101/relay/127.0.0.1:42758 I> recovering from `./00000000000000000000.snap' -2014-08-14 13:41:31.098 [25148] main/101/relay/127.0.0.1:42758 I> snapshot sent -2014-08-14 13:41:31.190 [24958] main/101/spawner I> created a replication relay: pid = 25150 -2014-08-14 13:41:31.291 [25150] main/101/relay/127.0.0.1:42759 I> recover from `./00000000000000000000.xlog'</programlisting></entry> +tarantool> 2014-10-13 11:20:08.691 [25020] main/101/spawner I> created a replication relay: pid = 25583 +2014-10-13 11:20:08.691 [25583] main/101/relay/127.0.0.1:50883 I> recovery start +2014-10-13 11:20:08.691 [25583] main/101/relay/127.0.0.1:50883 I> recovering from `./00000000000000000000.snap' +2014-10-13 11:20:08.692 [25583] main/101/relay/127.0.0.1:50883 I> snapshot sent +2014-10-13 11:20:08.789 [25020] main/101/spawner I> created a replication relay: pid = 25585 
+2014-10-13 11:20:08.890 [25585] main/101/relay/127.0.0.1:50884 I> recover from `./00000000000000000000.xlog' +</programlisting></entry> <entry><programlisting><prompt>$</prompt> <userinput># Terminal 2</userinput> ~/tarantool_test_node_2$ <userinput>mkdir -p ~/tarantool_test_node_2</userinput> ~/tarantool_test_node_2$ <userinput>cd ~/tarantool_test_node_2</userinput> ~/tarantool_test_node_2$ <userinput>rm -R ~/tarantool_test_node_2/*</userinput> ~/tarantool_test_node_2$ <userinput>~/tarantool-master/src/tarantool</userinput> -/home/pgulutzan/tarantool-master/src/tarantool: version 1.6.3-1724-g033ed69 +/home/username/tarantool-master/src/tarantool: version 1.6.3-1724-g033ed69 type 'help' for interactive help -tarantool> <userinput>box.cfg{listen=3302, replication_source=3301}</userinput> +tarantool> <userinput>box.cfg{listen=3302, replication_source='replication:password@localhost:3301'}</userinput> ... ... --- ... tarantool> <userinput>box.space._cluster:select({0},{iterator='GE'})</userinput> -2014-08-14 13:41:31.189 [25139] main/102/replica/0.0.0.0:3301 C> connected to master -2014-08-14 13:41:31.291 [25139] wal I> creating `./00000000000000000000.xlog.inprogress' +2014-10-13 11:20:08.789 [25579] main/103/replica/localhost:3301 C> connected to 127.0.0.1:3301 +2014-10-13 11:20:08.789 [25579] main/103/replica/localhost:3301 I> authenticated +2014-10-13 11:20:08.901 [25579] wal I> creating `./00000000000000000000.xlog.inprogress' --- -- - [1, 'd3de1435-5e26-4122-95e5-3e2d40e6e1df'] - - [2, 'ea7d17d7-6690-4334-b09c-f38ffa305d36'] -... 
-</programlisting></entry></row> +- - [1, '6190d919-1133-4452-b123-beca0b178b32'] + - [2, '236230b8-af3e-406b-b709-15a60b44c20c'] +...</programlisting></entry></row> </tbody> </tgroup> </informaltable> On Terminal #1, execute these requests: -<programlisting><userinput>s = box.schema.create_space('tester')</userinput> -<userinput>s:create_index('primary', {})</userinput> +<programlisting><userinput>s = box.schema.space.create('tester')</userinput> +<userinput>i = s:create_index('primary', {})</userinput> <userinput>s:insert{1,'Tuple inserted on Terminal #1'}</userinput></programlisting> Now the screen looks like this: <informaltable> @@ -386,16 +412,18 @@ Now the screen looks like this: </thead> <tbody> <row><entry><programlisting>... ... -tarantool> 2014-08-14 13:41:31.097 [24958] main/101/spawner I> created a replication relay: pid = 25148 -2014-08-14 13:41:31.098 [25148] main/101/relay/127.0.0.1:42758 I> recovery start -2014-08-14 13:41:31.098 [25148] main/101/relay/127.0.0.1:42758 I> recovering from `./00000000000000000000.snap' -2014-08-14 13:41:31.098 [25148] main/101/relay/127.0.0.1:42758 I> snapshot sent -2014-08-14 13:41:31.190 [24958] main/101/spawner I> created a replication relay: pid = 25150 -2014-08-14 13:41:31.291 [25150] main/101/relay/127.0.0.1:42759 I> recover from `./00000000000000000000.xlog' -<userinput>s = box.schema.create_space('tester')</userinput> +tarantool> 2014-10-13 11:20:08.691 [25020] main/101/spawner I> created a replication relay: pid = 25583 +2014-10-13 11:20:08.691 [25583] main/101/relay/127.0.0.1:50883 I> recovery start +2014-10-13 11:20:08.691 [25583] main/101/relay/127.0.0.1:50883 I> recovering from `./00000000000000000000.snap' +2014-10-13 11:20:08.692 [25583] main/101/relay/127.0.0.1:50883 I> snapshot sent +2014-10-13 11:20:08.789 [25020] main/101/spawner I> created a replication relay: pid = 25585 +2014-10-13 11:20:08.890 [25585] main/101/relay/127.0.0.1:50884 I> recover from `./00000000000000000000.xlog' +--- +... 
+tarantool> <userinput>s = box.schema.space.create('tester')</userinput> --- ... -tarantool> <userinput>s:create_index('primary', {})</userinput> +tarantool> <userinput>i = s:create_index('primary', {})</userinput> --- ... tarantool> <userinput>s:insert{1,'Tuple inserted on Terminal #1'}</userinput> @@ -408,18 +436,20 @@ tarantool> <userinput>s:insert{1,'Tuple inserted on Terminal #1'}</userinput> ~/tarantool_test_node_2$ cd ~/tarantool_test_node_2 ~/tarantool_test_node_2$ rm -R ~/tarantool_test_node_2/* ~/tarantool_test_node_2$ ~/tarantool-master/src/tarantool -/home/pgulutzan/tarantool-master/src/tarantool: version 1.6.3-1724-g033ed69 +/home/username/tarantool-master/src/tarantool: version 1.6.3-1724-g033ed69 type 'help' for interactive help -tarantool> box.cfg{listen=3302, replication_source=3301} +tarantool> box.cfg{listen=3302, replication_source='replication:password@localhost:3301'} ... ... --- ... tarantool> box.space._cluster:select({0},{iterator='GE'}) -2014-08-14 13:41:31.189 [25139] main/102/replica/0.0.0.0:3301 C> connected to master -2014-08-14 13:41:31.291 [25139] wal I> creating `./00000000000000000000.xlog.inprogress' +2014-10-13 11:20:08.789 [25579] main/103/replica/localhost:3301 C> connected to 127.0.0.1:3301 +2014-10-13 11:20:08.789 [25579] main/103/replica/localhost:3301 I> authenticated +2014-10-13 11:20:08.901 [25579] wal I> creating `./00000000000000000000.xlog.inprogress' + --- -- - [1, 'd3de1435-5e26-4122-95e5-3e2d40e6e1df'] - - [2, 'ea7d17d7-6690-4334-b09c-f38ffa305d36'] +- - [1, '6190d919-1133-4452-b123-beca0b178b32'] + - [2, '236230b8-af3e-406b-b709-15a60b44c20c'] ...</prompt></programlisting></entry></row> </tbody> </tgroup> @@ -441,17 +471,19 @@ Now the screen looks like this: <row><entry align="center">TERMINAL #1</entry><entry align="center">TERMINAL #2</entry></row> </thead> <tbody> - <row><entry><programlisting><prompt>... ... 
-tarantool> 2014-08-14 13:41:31.097 [24958] main/101/spawner I> created a replication relay: pid = 25148 -2014-08-14 13:41:31.098 [25148] main/101/relay/127.0.0.1:42758 I> recovery start -2014-08-14 13:41:31.098 [25148] main/101/relay/127.0.0.1:42758 I> recovering from `./00000000000000000000.snap' -2014-08-14 13:41:31.098 [25148] main/101/relay/127.0.0.1:42758 I> snapshot sent -2014-08-14 13:41:31.190 [24958] main/101/spawner I> created a replication relay: pid = 25150 -2014-08-14 13:41:31.291 [25150] main/101/relay/127.0.0.1:42759 I> recover from `./00000000000000000000.xlog' -s = box.schema.create_space('tester') + <row><entry><programlisting><prompt>... +tarantool> 2014-10-13 11:20:08.691 [25020] main/101/spawner I> created a replication relay: pid = 25583 +2014-10-13 11:20:08.691 [25583] main/101/relay/127.0.0.1:50883 I> recovery start +2014-10-13 11:20:08.691 [25583] main/101/relay/127.0.0.1:50883 I> recovering from `./00000000000000000000.snap' +2014-10-13 11:20:08.692 [25583] main/101/relay/127.0.0.1:50883 I> snapshot sent +2014-10-13 11:20:08.789 [25020] main/101/spawner I> created a replication relay: pid = 25585 +2014-10-13 11:20:08.890 [25585] main/101/relay/127.0.0.1:50884 I> recover from `./00000000000000000000.xlog' --- ... -tarantool> s:create_index('primary', {}) +tarantool> s = box.schema.space.create('tester') +--- +... +tarantool> i = s:create_index('primary', {}) --- ... tarantool> s:insert{1,'Tuple inserted on Terminal #1'} @@ -460,11 +492,12 @@ tarantool> s:insert{1,'Tuple inserted on Terminal #1'} ...</prompt></programlisting></entry> <entry><programlisting>... ... 
tarantool> box.space._cluster:select({0},{iterator='GE'}) -2014-08-14 13:41:31.189 [25139] main/102/replica/0.0.0.0:3301 C> connected to master -2014-08-14 13:41:31.291 [25139] wal I> creating `./00000000000000000000.xlog.inprogress' +2014-10-13 11:20:08.789 [25579] main/103/replica/localhost:3301 C> connected to 127.0.0.1:3301 +2014-10-13 11:20:08.789 [25579] main/103/replica/localhost:3301 I> authenticated +2014-10-13 11:20:08.901 [25579] wal I> creating `./00000000000000000000.xlog.inprogress' --- -- - [1, 'd3de1435-5e26-4122-95e5-3e2d40e6e1df'] - - [2, 'ea7d17d7-6690-4334-b09c-f38ffa305d36'] +- - [1, '6190d919-1133-4452-b123-beca0b178b32'] + - [2, '236230b8-af3e-406b-b709-15a60b44c20c'] ... tarantool> <userinput>s = box.space.tester</userinput> --- @@ -508,20 +541,19 @@ tarantool> s:insert{1,'Tuple inserted on Terminal #1'} - [1, 'Tuple inserted on Terminal #1'] ... tarantool> <userinput>os.exit()</userinput> -2014-08-14 15:08:40.376 [25150] main/101/relay/127.0.0.1:42759 I> done `./00000000000000000000.xlog' -2014-08-14 15:08:40.414 [24958] main/101/spawner I> Exiting: master shutdown -2014-08-14 15:08:40.414 [24958] main/101/spawner I> sending signal 15 to 1 children -2014-08-14 15:08:40.414 [24958] main/101/spawner I> waiting for children for up to 5 seconds +2014-10-13 11:45:20.455 [25585] main/101/relay/127.0.0.1:50884 I> done `./00000000000000000000.xlog' +2014-10-13 11:45:20.531 [25020] main/101/spawner I> Exiting: master shutdown +2014-10-13 11:45:20.531 [25020] main/101/spawner I> sending signal 15 to 1 children +2014-10-13 11:45:20.531 [25020] main/101/spawner I> waiting for children for up to 5 seconds ~/tarantool_test_node_1$ <userinput>ls -l ~/tarantool_test_node_1</userinput> -total 12 --rw-rw-r-- 1 1781 Aug 14 13:39 00000000000000000000.snap --rw-rw-r-- 1 416 Aug 14 15:08 00000000000000000000.xlog -drwxr-x--- 2 4096 Aug 14 13:39 sophia -~/tarantool_test_node_1$ <userinput>ls -l ~/tarantool_test_node_2</userinput> -total 12 --rw-rw-r-- 1 1781 Aug 
14 13:41 00000000000000000000.snap --rw-rw-r-- 1 486 Aug 14 14:52 00000000000000000000.xlog -drwxr-x--- 2 4096 Aug 14 13:41 sophia +total 8 +-rw-rw-r-- 1 1781 Oct 13 11:12 00000000000000000000.snap +-rw-rw-r-- 1 518 Oct 13 11:45 00000000000000000000.xlog +~/tarantool_test_node_1$ <userinput>ls -l ~/tarantool_test_node_2/</userinput> +total 8 +-rw-rw-r-- 1 1781 Oct 13 11:20 00000000000000000000.snap +-rw-rw-r-- 1 588 Oct 13 11:38 00000000000000000000.xlog +~/tarantool_test_node_1$ </programlisting></entry> <entry><programlisting><prompt>... ... tarantool> s:select({1},{iterator='GE'}) @@ -532,20 +564,20 @@ tarantool> s:insert{2,'Tuple inserted on Terminal #2'} --- - [2, 'Tuple inserted on Terminal #2'] ... -tarantool> 2014-08-14 15:08:40.417 [25139] main/102/replica/0.0.0.0:3301 I> can't read row -2014-08-14 15:08:40.417 [25139] main/102/replica/0.0.0.0:3301 !> SystemError +tarantool> 2014-10-13 11:45:20.532 [25579] main/103/replica/localhost:3301 I> can't read row +2014-10-13 11:45:20.532 [25579] main/103/replica/localhost:3301 !> SystemError unexpected EOF when reading from socket, -called on fd 11, aka 127.0.0.1:42759, peer of 127.0.0.1:3301: Broken pipe -2014-08-14 15:08:40.417 [25139] main/102/replica/0.0.0.0:3301 I> will retry every 1 second +called on fd 10, aka 127.0.0.1:50884, peer of 127.0.0.1:3301: Broken pipe +2014-10-13 11:45:20.532 [25579] main/103/replica/localhost:3301 I> will retry every 1 second </prompt></programlisting></entry></row> </tbody> </tgroup> </informaltable> -On Terminal #2, execute these requests:<programlisting> +On Terminal #2, ignore the repeated messages saying "failed to connect", and execute these requests:<programlisting> <userinput>box.space.tester:select({0},{iterator='GE'})</userinput> <userinput>box.space.tester:insert{3,'Another'}</userinput></programlisting> -Now the screen looks like this: +Now the screen looks like this (ignoring the repeated messages saying "failed to connect"): <informaltable> <tgroup cols="2" 
align="left" colsep="1" rowsep="0"> <thead> @@ -558,37 +590,35 @@ tarantool> s:insert{1,'Tuple inserted on Terminal #1'} - [1, 'Tuple inserted on Terminal #1'] ... tarantool> os.exit() -2014-08-14 15:08:40.376 [25150] main/101/relay/127.0.0.1:42759 I> done `./00000000000000000000.xlog' -2014-08-14 15:08:40.414 [24958] main/101/spawner I> Exiting: master shutdown -2014-08-14 15:08:40.414 [24958] main/101/spawner I> sending signal 15 to 1 children -2014-08-14 15:08:40.414 [24958] main/101/spawner I> waiting for children for up to 5 seconds +2014-10-13 11:45:20.455 [25585] main/101/relay/127.0.0.1:50884 I> done `./00000000000000000000.xlog' +2014-10-13 11:45:20.531 [25020] main/101/spawner I> Exiting: master shutdown +2014-10-13 11:45:20.531 [25020] main/101/spawner I> sending signal 15 to 1 children +2014-10-13 11:45:20.531 [25020] main/101/spawner I> waiting for children for up to 5 seconds ~/tarantool_test_node_1$ ls -l ~/tarantool_test_node_1 -total 12 --rw-rw-r-- 1 1781 Aug 14 13:39 00000000000000000000.snap --rw-rw-r-- 1 416 Aug 14 15:08 00000000000000000000.xlog -drwxr-x--- 2 4096 Aug 14 13:39 sophia -~/tarantool_test_node_1$ ls -l ~/tarantool_test_node_2 -total 12 --rw-rw-r-- 1 1781 Aug 14 13:41 00000000000000000000.snap --rw-rw-r-- 1 486 Aug 14 14:52 00000000000000000000.xlog -drwxr-x--- 2 4096 Aug 14 13:41 sophia +total 8 +-rw-rw-r-- 1 1781 Oct 13 11:12 00000000000000000000.snap +-rw-rw-r-- 1 518 Oct 13 11:45 00000000000000000000.xlog +~/tarantool_test_node_1$ ls -l ~/tarantool_test_node_2/ +total 8 +-rw-rw-r-- 1 1781 Oct 13 11:20 00000000000000000000.snap +-rw-rw-r-- 1 588 Oct 13 11:38 00000000000000000000.xlog +~/tarantool_test_node_1$ </prompt></programlisting></entry> <entry><programlisting>... ... tarantool> s:insert{2,'Tuple inserted on Terminal #2'} --- - [2, 'Tuple inserted on Terminal #2'] ... 
-tarantool> 2014-08-14 15:08:40.417 [25139] main/102/replica/0.0.0.0:3301 I> can't read row -2014-08-14 15:08:40.417 [25139] main/102/replica/0.0.0.0:3301 !> SystemError +tarantool> 2014-10-13 11:45:20.532 [25579] main/103/replica/localhost:3301 I> can't read row +2014-10-13 11:45:20.532 [25579] main/103/replica/localhost:3301 !> SystemError unexpected EOF when reading from socket, -called on fd 11, aka 127.0.0.1:42759, peer of 127.0.0.1:3301: Broken pipe -2014-08-14 15:08:40.417 [25139] main/102/replica/0.0.0.0:3301 I> will retry every 1 second +called on fd 10, aka 127.0.0.1:50884, peer of 127.0.0.1:3301: Broken pipe +2014-10-13 11:45:20.532 [25579] main/103/replica/localhost:3301 I> will retry every 1 second tarantool> <userinput>box.space.tester:select({0},{iterator='GE'})</userinput> --- - - [1, 'Tuple inserted on Terminal #1'] - [2, 'Tuple inserted on Terminal #2'] ... - tarantool> <userinput>box.space.tester:insert{3,'Another'}</userinput> --- - [3, 'Another'] @@ -606,7 +636,7 @@ On Terminal #1 execute these commands:<programlisting> <userinput>~/tarantool-master/src/tarantool</userinput> <userinput>box.cfg{listen=3301}</userinput> <userinput>box.space.tester:select({0},{iterator='GE'})</userinput></programlisting> -Now the screen looks like this: +Now the screen looks like this (ignoring the repeated messages on terminal #2 saying "failed to connect"): <informaltable> <tgroup cols="2" align="left" colsep="1" rowsep="0"> <thead> @@ -619,31 +649,29 @@ tarantool> s:insert{1,'Tuple inserted on Terminal #1'} - [1, 'Tuple inserted on Terminal #1'] ... 
tarantool> os.exit() -2014-08-14 15:08:40.376 [25150] main/101/relay/127.0.0.1:42759 I> done `./00000000000000000000.xlog' -2014-08-14 15:08:40.414 [24958] main/101/spawner I> Exiting: master shutdown -2014-08-14 15:08:40.414 [24958] main/101/spawner I> sending signal 15 to 1 children -2014-08-14 15:08:40.414 [24958] main/101/spawner I> waiting for children for up to 5 seconds +2014-10-13 11:45:20.455 [25585] main/101/relay/127.0.0.1:50884 I> done `./00000000000000000000.xlog' +2014-10-13 11:45:20.531 [25020] main/101/spawner I> Exiting: master shutdown +2014-10-13 11:45:20.531 [25020] main/101/spawner I> sending signal 15 to 1 children +2014-10-13 11:45:20.531 [25020] main/101/spawner I> waiting for children for up to 5 seconds ~/tarantool_test_node_1$ ls -l ~/tarantool_test_node_1 -total 12 --rw-rw-r-- 1 1781 Aug 14 13:39 00000000000000000000.snap --rw-rw-r-- 1 416 Aug 14 15:08 00000000000000000000.xlog -drwxr-x--- 2 4096 Aug 14 13:39 sophia -~/tarantool_test_node_1$ ls -l ~/tarantool_test_node_2 -total 12 --rw-rw-r-- 1 1781 Aug 14 13:41 00000000000000000000.snap --rw-rw-r-- 1 486 Aug 14 14:52 00000000000000000000.xlog -drwxr-x--- 2 4096 Aug 14 13:41 sophia +total 8 +-rw-rw-r-- 1 1781 Oct 13 11:12 00000000000000000000.snap +-rw-rw-r-- 1 518 Oct 13 11:45 00000000000000000000.xlog +~/tarantool_test_node_1$ ls -l ~/tarantool_test_node_2/ +total 8 +-rw-rw-r-- 1 1781 Oct 13 11:20 00000000000000000000.snap +-rw-rw-r-- 1 588 Oct 13 11:38 00000000000000000000.xlog ~/tarantool_test_node_1$ <userinput>~/tarantool-master/src/tarantool</userinput> -~/tarantool: version 1.6.3-1724-g033ed69 +/home/username/tarantool-master/src/tarantool: version 1.6.3-515-g0a06cce type 'help' for interactive help tarantool> <userinput>box.cfg{listen=3301}</userinput> ... ... --- ... 
tarantool> <userinput>box.space.tester:select({0},{iterator='GE'})</userinput> -2014-08-14 15:22:22.883 [14305] main/101/spawner I> created a replication relay: pid = 14313 -2014-08-14 15:22:22.983 [14313] main/101/relay/127.0.0.1:43646 I> recover from `./00000000000000000000.xlog' -2014-08-14 15:22:22.984 [14313] main/101/relay/127.0.0.1:43646 I> done `./00000000000000000000.xlog' +2014-10-13 12:01:55.615 [28989] main/101/spawner I> created a replication relay: pid = 28992 +2014-10-13 12:01:55.716 [28992] main/101/relay/127.0.0.1:51892 I> recover from `./00000000000000000000.xlog' +2014-10-13 12:01:55.716 [28992] main/101/relay/127.0.0.1:51892 I> done `./00000000000000000000.xlog' --- - - [1, 'Tuple inserted on Terminal #1'] ... @@ -653,23 +681,23 @@ tarantool> s:insert{2,'Tuple inserted on Terminal #2'} --- - [2, 'Tuple inserted on Terminal #2'] ... -tarantool> 2014-08-14 15:08:40.417 [25139] main/102/replica/0.0.0.0:3301 I> can't read row -2014-08-14 15:08:40.417 [25139] main/102/replica/0.0.0.0:3301 !> SystemError +tarantool> 2014-10-13 11:45:20.532 [25579] main/103/replica/localhost:3301 I> can't read row +2014-10-13 11:45:20.532 [25579] main/103/replica/localhost:3301 !> SystemError unexpected EOF when reading from socket, -called on fd 11, aka 127.0.0.1:42759, peer of 127.0.0.1:3301: Broken pipe -2014-08-14 15:08:40.417 [25139] main/102/replica/0.0.0.0:3301 I> will retry every 1 second +called on fd 10, aka 127.0.0.1:50884, peer of 127.0.0.1:3301: Broken pipe +2014-10-13 11:45:20.532 [25579] main/103/replica/localhost:3301 I> will retry every 1 second tarantool> box.space.tester:select({0},{iterator='GE'}) --- - - [1, 'Tuple inserted on Terminal #1'] - [2, 'Tuple inserted on Terminal #2'] ... - tarantool> box.space.tester:insert{3,'Another'} --- - [3, 'Another'] ... 
tarantool> -2014-08-14 15:22:22.881 [25139] main/102/replica/0.0.0.0:3301 C> connected to master +2014-10-13 12:01:55.614 [25579] main/103/replica/localhost:3301 C> connected to 127.0.0.1:3301 +2014-10-13 12:01:55.614 [25579] main/103/replica/localhost:3301 I> authenticated </prompt></programlisting></entry></row> </tbody> </tgroup> @@ -684,7 +712,7 @@ to act as a replication source. </para> <para> On Terminal #1, say:<programlisting> -<userinput>box.cfg{replication_source='3302'}</userinput> +<userinput>box.cfg{replication_source='replication:password@localhost:3302'}</userinput> <userinput>box.space.tester:select({0},{iterator='GE'})</userinput></programlisting> The screen now looks like this: <informaltable> @@ -702,45 +730,52 @@ tarantool> box.cfg{listen=3301} --- ... tarantool> box.space.tester:select({0},{iterator='GE'}) -2014-08-14 15:22:22.883 [14305] main/101/spawner I> created a replication relay: pid = 14313 -2014-08-14 15:22:22.983 [14313] main/101/relay/127.0.0.1:43646 I> recover from `./00000000000000000000.xlog' -2014-08-14 15:22:22.984 [14313] main/101/relay/127.0.0.1:43646 I> done `./00000000000000000000.xlog' +2014-10-13 12:01:55.615 [28989] main/101/spawner I> created a replication relay: pid = 28992 +2014-10-13 12:01:55.716 [28992] main/101/relay/127.0.0.1:51892 I> recover from `./00000000000000000000.xlog' +2014-10-13 12:01:55.716 [28992] main/101/relay/127.0.0.1:51892 I> done `./00000000000000000000.xlog' + --- - - [1, 'Tuple inserted on Terminal #1'] ... -tarantool> <userinput>box.cfg{replication_source='3302'}</userinput> -2014-08-14 15:35:47.567 [14303] main/101/interactive C> starting replication from 0.0.0.0:3302 +tarantool> <userinput>box.cfg{replication_source='replication:password@localhost:3302'}</userinput> +2014-10-13 12:10:21.485 [28987] main/101/interactive C> starting replication from localhost:3302 --- ... 
+2014-10-13 12:10:21.487 [28987] main/104/replica/localhost:3302 C> connected to 127.0.0.1:3302 +2014-10-13 12:10:21.487 [28987] main/104/replica/localhost:3302 I> authenticated tarantool> <userinput>box.space.tester:select({0},{iterator='GE'})</userinput> -2014-08-14 15:35:47.568 [14303] main/103/replica/0.0.0.0:3302 C> connected to master -2014-08-14 15:35:47.670 [14303] wal I> creating `./00000000000000000005.xlog.inprogress' -2014-08-14 15:35:47.684 [14313] main/101/relay/127.0.0.1:43646 I> recover from `./00000000000000000005.xlog' +2014-10-13 12:10:21.592 [28987] wal I> creating `./00000000000000000006.xlog.inprogress' +2014-10-13 12:10:21.617 [28992] main/101/relay/127.0.0.1:51892 I> recover from `./00000000000000000006.xlog' +--- +- - [1, 'Tuple inserted on Terminal #1'] + - [2, 'Tuple inserted on Terminal #2'] + - [3, 'Another'] +... </programlisting></entry> <entry><programlisting><prompt>... ... -tarantool> s:insert{2,'Tuple inserted on Terminal #2'} + tarantool> s:insert{2,'Tuple inserted on Terminal #2'} --- - [2, 'Tuple inserted on Terminal #2'] ... -tarantool> 2014-08-14 15:08:40.417 [25139] main/102/replica/0.0.0.0:3301 I> can't read row -2014-08-14 15:08:40.417 [25139] main/102/replica/0.0.0.0:3301 !> SystemError +tarantool> 2014-10-13 11:45:20.532 [25579] main/103/replica/localhost:3301 I> can't read row +2014-10-13 11:45:20.532 [25579] main/103/replica/localhost:3301 !> SystemError unexpected EOF when reading from socket, -called on fd 11, aka 127.0.0.1:42759, peer of 127.0.0.1:3301: Broken pipe -2014-08-14 15:08:40.417 [25139] main/102/replica/0.0.0.0:3301 I> will retry every 1 second +called on fd 10, aka 127.0.0.1:50884, peer of 127.0.0.1:3301: Broken pipe +2014-10-13 11:45:20.532 [25579] main/103/replica/localhost:3301 I> will retry every 1 second tarantool> box.space.tester:select({0},{iterator='GE'}) --- - - [1, 'Tuple inserted on Terminal #1'] - [2, 'Tuple inserted on Terminal #2'] ... 
- tarantool> box.space.tester:insert{3,'Another'} --- - [3, 'Another'] ... tarantool> -2014-08-14 15:22:22.881 [25139] main/102/replica/0.0.0.0:3301 C> connected to master -tarantool> 2014-08-14 15:35:47.569 [25141] main/101/spawner I> created a replication relay: pid = 15585 -2014-08-14 15:35:47.670 [15585] main/101/relay/127.0.0.1:51915 I> recover from `./00000000000000000000.xlog' +2014-10-13 12:01:55.614 [25579] main/103/replica/localhost:3301 C> connected to 127.0.0.1:3301 +2014-10-13 12:01:55.614 [25579] main/103/replica/localhost:3301 I> authenticated +2014-10-13 12:10:21.488 [25581] main/101/spawner I> created a replication relay: pid = 29632 +2014-10-13 12:10:21.592 [29632] main/101/relay/127.0.0.1:45908 I> recover from `./00000000000000000000.xlog' </prompt></programlisting></entry></row> </tbody> </tgroup> diff --git a/doc/user/server-administration.xml b/doc/user/server-administration.xml index 09d6cfa7b02aefab3ec65787555d57c0aa0ce204..ab651ffe294de1ea3d3b34cd7b64606b3ed244cf 100644 --- a/doc/user/server-administration.xml +++ b/doc/user/server-administration.xml @@ -163,7 +163,7 @@ Here is an example of an interactive-mode tarantool client session: [ tarantool will display an introductory message including version number here ] tarantool> <userinput>box.cfg{listen=3301}</userinput> [ tarantool will display configuration information here ] -tarantool> <userinput>s = box.schema.create_space('tester')</userinput> +tarantool> <userinput>s = box.schema.space.create('tester')</userinput> [ tarantool may display an in-progress message here ] --- ... @@ -216,7 +216,7 @@ program, allocating disk resources specifically for that program, via a standardized deployment method." If Tarantool was downloaded from source, then the script is in ~/extra/dist/tarantoolctl. 
If Tarantool was installed -with debian or Red Hat installation packages, the script +with Debian or Red Hat installation packages, the script is renamed <code>tarantoolctl</code> and is in /usr/bin/tarantoolctl. The script handles such things as: starting, stopping, rotating logs, logging in to the @@ -267,6 +267,7 @@ The script will add /<replaceable>instance-name</replaceable>.log" to the name. </para> <para> username = the user that runs the tarantool server. +This is the operating-system user name rather than the Tarantool-client user name. </para> <para> instance_dir = the directory where all applications for this host are stored. @@ -369,7 +370,7 @@ box.cfg{listen = 3301} box.schema.user.passwd('Gx5!') box.schema.user.grant('guest','read,write,execute','universe') fiber = require('fiber') -box.schema.create_space('tester') +box.schema.space.create('tester') box.space.tester:create_index('primary',{}) i = 0 while 0 == 0 do diff --git a/doc/user/stored-procedures.xml b/doc/user/stored-procedures.xml index 1aa26a9ebdf60c2186c151d5a3627e123b320453..e675abdeb09367d2da00e65ed6bd0e2a85317680 100644 --- a/doc/user/stored-procedures.xml +++ b/doc/user/stored-procedures.xml @@ -86,7 +86,7 @@ tarantool> <userinput>'hello' .. ' world' -- '..' means 'concatenate'</userinput or is the full <replaceable>library-name.package-name[object-numeric-id]</replaceable>. The following example shows all four forms of object-specifier: <programlisting> -tarantool> <userinput>s = box.schema.create_space('name_of_space', {id = 33})</userinput> +tarantool> <userinput>s = box.schema.space.create('name_of_space', {id = 33})</userinput> --- ... tarantool> <userinput>i = s:create_index('name_of_index', {type = 'tree', parts = {1, 'STR'}})</userinput> @@ -266,11 +266,41 @@ For example, to bring in the i18n package: install luarocks, say <code>luarocks start Tarantool, and say <code>require('i18n')</code>. 
</para> +<section xml:id="rocks"><title>Installing rocks from tarantool.org</title> +<para> +The Lua rocks that Tarantool supplies are available +on <link xlink:href="http://rocks.tarantool.org">rocks.tarantool.org</link> and can be installed using +the luarocks utilities. Here is an example. +</para> +<para> +Look at rocks.tarantool.org. Notice that one of the +available rocks is expirationd -- Expiration daemon for Tarantool. +</para> +<para> +Create a file named ~/.luarocks/config.lua containing these three lines:<programlisting> +rocks_servers = { + [[http://rocks.tarantool.org/]] +}</programlisting> +Install the expirationd rock with either <programlisting>luarocks --local install expirationd</programlisting> +or, as root user, <programlisting>luarocks install expirationd</programlisting> +Start the tarantool server and make the request:<programlisting>expirationd=require('expirationd')</programlisting> +If there is an error, display the Lua variable +<code>package.path</code> to make sure it is searching along +a path that includes the new expirationd.lua file. +</para> +<para> +If the result is success, which it will be if +nothing unusual has been done when installing +Tarantool or Luarocks, then the new rock is +available henceforward for use in the Tarantool +application server. +</para> + <para> The rest of this chapter is a reference that has what's needed for programming and administration with the built-in packages. 
</para> - +</section> <section xml:id="sp-digest"> <title>Package <code>digest</code></title> @@ -599,11 +629,11 @@ tarantool> <userinput>json.decode('{"hello": "world"}').hello</userinput> </para> <bridgehead renderas="sect4">Example</bridgehead> <programlisting> -<prompt>tarantool></prompt> <userinput>-- When nil is assigned to a Lua-table field, the field disappears</userinput> +<prompt>tarantool></prompt> <userinput>-- When nil is assigned to a Lua-table field, the field is null</userinput> <prompt>tarantool></prompt> <userinput>{nil, 'a', 'b'}</userinput> ---- -- 2: a - 3: b +- - null + - a + - b ... <prompt>tarantool></prompt> <userinput> -- When json.NULL is assigned to a Lua-table field, the field is json.NULL</userinput> <prompt>tarantool></prompt> <userinput>{json.NULL, 'a', 'b'}</userinput> @@ -1472,7 +1502,7 @@ end <para> The functions for setting up and connecting are <code>socket</code>, <code>sysconnect</code>, <code>tcp_connect</code>. The functions for sending data are <code>send</code>, <code>sendto</code>, <code>write</code>, <code>syswrite</code>. - The functions for receiving data are <code>recv</code>, <code>recvfrom</code>, <code>read</code>, <code>readline</code>. + The functions for receiving data are <code>recv</code>, <code>recvfrom</code>, <code>read</code>. The functions for waiting before sending/receiving data are <code>wait</code>, <code>readable</code>, <code>writable</code>. The functions for setting flags are <code>nonblock</code>, <code>setsockopt</code>. The functions for stopping and disconnecting are <code>shutdown</code>, <code>close</code>. 
@@ -1496,7 +1526,6 @@ end <row><entry><link linkend="socket-recv">recv</link></entry><entry>receiving</entry></row> <row><entry><link linkend="socket-recvfrom">recvfrom</link></entry><entry>receiving</entry></row> <row><entry><link linkend="socket-read">read</link></entry><entry>receiving</entry></row> - <row><entry><link linkend="socket-readline">readline</link></entry><entry>receiving</entry></row> <row><entry><link linkend="socket-nonblock">nonblock</link></entry><entry>flag setting </entry></row> <row><entry><link linkend="socket-setsockopt">setsockopt</link></entry><entry>flag setting </entry></row> <row><entry><link linkend="socket-linger">linger</link></entry><entry>flag setting</entry></row> @@ -1634,59 +1663,30 @@ end </varlistentry> <varlistentry> - <term xml:id="socket-readline" xreflabel="socket-readline"><emphasis role="lua"><replaceable>sock</replaceable>:readline(<replaceable>[limit] [, separator list]</replaceable>)</emphasis></term> + <term xml:id="socket-read" xreflabel="socket-read"> + <emphasis role="lua"><replaceable>sock</replaceable>:read(<replaceable>limit</replaceable> [, <replaceable>timeout</replaceable>])</emphasis> + or <emphasis role="lua"><replaceable>sock</replaceable>:read(<replaceable>delimiter</replaceable> [, <replaceable>timeout</replaceable>])</emphasis> + or <emphasis role="lua"><replaceable>sock</replaceable>:read({limit=<replaceable>limit</replaceable>} [, <replaceable>timeout</replaceable>])</emphasis> + or <emphasis role="lua"><replaceable>sock</replaceable>:read({delimiter=<replaceable>delimiter</replaceable>} [, <replaceable>timeout</replaceable>])</emphasis> + or <emphasis role="lua"><replaceable>sock</replaceable>:read({limit=<replaceable>limit</replaceable>, delimiter=<replaceable>delimiter</replaceable>} [, <replaceable>timeout</replaceable>])</emphasis> + </term> <listitem> <para> - Read a line from a connected socket. + Read from a connected socket until some condition is true, and return the bytes that were read. 
</para> <para> - <code>sock:readline()</code> with no arguments reads data from a socket - until '\n' (line feed) or eof (end of transmission). - </para> - <para> - Parameters: <code>limit</code> — maximum number of bytes to read. The function reads - until a separator is seen, or until (limit) bytes have been read. The default is "no limit". - <code>separator list</code> — a Lua table containing one or more separators. - The function reads until one of the separators is seen. The default is a Lua table containing '\n'. + Parameters: <code>limit</code> (type = integer) — maximum number of bytes to read for example 50 means "stop after 50 bytes", + <code>delimiter</code> (type = string) — separator or <link xlink:href="http://www.lua.org/pil/20.2.html">Lua pattern</link> for example '[0-9]' means "stop after a digit", + <code>timeout</code> (type = number) — maximum number of seconds to wait for example 50 means "stop after 50 seconds". + Reading goes on until <code>limit</code> bytes have been read, + or a delimiter has been read, or a timeout has expired. </para> <para> Returns: - (type = string) A Lua string with data if success, - an empty string if error. If multiple separators were passed in <code>separator list</code>, - the separator which matched is also shown, as the third part of the return. 
- <table> - <title><code>readline()</code> returns</title> - <tgroup cols="2" align="left" colsep="1" rowsep="1"> - <tbody> - <row> - <entry><code>data, nil, separator</code></entry><entry>success</entry> - </row> - <row> - <entry><code>"", "timeout", ETIMEDOUT, errstr</code></entry><entry>timeout</entry> - </row> - <row> - <entry><code>"", "error", errno, errstr</code></entry><entry>error</entry> - </row> - <row> - <entry><code>data, "limit"</code></entry><entry>limit</entry> - </row> - <row> - <entry><code>data, "eof"</code></entry><entry>eof</entry> - </row> - </tbody> - </tgroup> - </table> - </para> - </listitem> - </varlistentry> - - <varlistentry> - <term xml:id="socket-read" xreflabel="socket-read"><emphasis role="lua"><replaceable>sock</replaceable>:read(<replaceable>size</replaceable>)</emphasis></term> - <listitem> - <para> - Read data on a socket, until <code>size</code> bytes have been read, - or until there is nothing more to read, or until an error occurs. - Similar to sock:readline(), except for the fact that there is no list of separators. + (type = string) an empty string if there is nothing more to read, + or a nil value if error, + or a string up to <code>limit</code> bytes long, + which may include the bytes that matched the <code>delimiter</code> expression. </para> </listitem> </varlistentry> @@ -1923,7 +1923,7 @@ end This is not a useful way to communicate with this particular site, but shows that the system works. <programlisting> - <prompt>tarantool></prompt> <userinput>socket = require('socket')</userinput> +<prompt>tarantool></prompt> <userinput>socket = require('socket')</userinput> --- ... 
@@ -2250,7 +2250,7 @@ end </varlistentry> <varlistentry> - <term><emphasis role="lua">fio.open(<replaceable>path-name</replaceable>, <replaceable>flags</replaceable>)</emphasis></term> + <term><emphasis role="lua">fio.open(<replaceable>path-name</replaceable>[, <replaceable>flags</replaceable>])</emphasis></term> <listitem> <para> Open a file in preparation for reading or writing or seeking. @@ -2296,7 +2296,7 @@ end </varlistentry> <varlistentry> - <term><emphasis role="lua"><replaceable>file-handle</replaceable>:pread or <replaceable>file-handle></replaceable>:pwrite(<replaceable>count or new string</replaceable>, <replaceable>offset</replaceable>)</emphasis></term> + <term><emphasis role="lua"><replaceable>file-handle</replaceable>:pread or <replaceable>file-handle</replaceable>:pwrite(<replaceable>count or new string</replaceable>, <replaceable>offset</replaceable>)</emphasis></term> <listitem> <para> Perform read/write random-access operation on a file, without affecting the current seek position of the file. @@ -2322,7 +2322,7 @@ end </varlistentry> <varlistentry> - <term><emphasis role="lua"><replaceable>file-handle</replaceable>:read or <replaceable>file-handle></replaceable>:write(<replaceable>count or new string</replaceable>)</emphasis></term> + <term><emphasis role="lua"><replaceable>file-handle</replaceable>:read or <replaceable>file-handle</replaceable>:write(<replaceable>count or new string</replaceable>)</emphasis></term> <listitem> <para> Perform non-random-access read or write on a file. For details type "man 2 read" or "man 2 write". 
@@ -2470,10 +2470,10 @@ end <variablelist> <varlistentry> - <term><emphasis role="lua">console.connect(<replaceable>host</replaceable>,<replaceable>port</replaceable>[, <replaceable>options</replaceable>])</emphasis></term> + <term><emphasis role="lua">console.connect(<replaceable>URI</replaceable>[, <replaceable>options</replaceable>])</emphasis></term> <listitem> <para> - Connect to the server at host:port, change the prompt from + Connect to the server at <link linkend="URI">URI</link>, change the prompt from 'tarantool' to 'host:port', and act henceforth as a client until the user ends the session or types control-D. </para> @@ -2500,10 +2500,10 @@ end <code>box.schema.user.grant('guest','execute','universe')</code>. </para> <para> - Parameters: <code>host</code>, <code>port</code>, <code>options</code>. + Parameters: <code>URI</code>, <code>options</code>. The options may be necessary if the Tarantool server at host:port requires authentication. In such a case the connection might look something like: - <code>console.connect('127.0.0.1', port, { user = 'netbox', password = '123' })</code>. + <code>console.connect('netbox:123@127.0.0.1:3301')</code>. </para> <para> Returns: nothing. @@ -2517,7 +2517,7 @@ end <prompt>tarantool></prompt> <userinput>console = require('console')</userinput> --- ... -<prompt>tarantool></prompt> <userinput>console.connect('198.18.44.44', 3301)</userinput> +<prompt>tarantool></prompt> <userinput>console.connect('198.18.44.44:3301')</userinput> --- ... <prompt>198.18.44.44:3301></prompt> <userinput>-- prompt is telling us that server is remote</userinput></programlisting> @@ -2530,7 +2530,7 @@ end <listitem> <para> Listen on host:port. The primary way of listening for incoming - requests is via the host and port, or URI, specified in + requests is via the host and port, or <link linkend="URI">URI</link>, specified in <code>box.cfg{listen=...}</code>. 
The alternative way of listening is via the host and port, or URI, specified in <code>console.listen(...)</code>. This alternative way is @@ -2542,8 +2542,8 @@ end specified as host = 'unix/', port = 'path/to/something.sock'. </para> <para xml:id="admin_port" xreflabel="admin_port"> - The "admin" address is the port or URI to listen on for administrative - connections. It has no default value, so it must be specified + The "admin" address is the port or <link linkend="URI">URI</link> to listen on for administrative + connections. It has no default value, so it must be specified if connections will occur via telnet. It is not used unless assigned a value. The parameters may be expressed with URI = Universal Resource Identifier format, for example "unix://unix_domain_socket", @@ -2953,7 +2953,7 @@ get expirationd through the test. </para> <para> 1. Get expirationd.lua. - There are standard ways -- it is after all part of a standard rock -- + There are standard ways -- it is after all part of <link linkend="rocks">a standard rock</link> -- but for this purpose just copy the contents of <link xlink:href="https://github.com/tarantool/expirationd/blob/master/expirationd.lua">https://github.com/tarantool/expirationd/blob/master/expirationd.lua</link> to a default directory. @@ -2965,16 +2965,16 @@ get expirationd through the test. 3. Execute these requests: <programlisting><userinput> box.cfg{} - a = box.schema.create_space('origin') + a = box.schema.space.create('origin') a:create_index('first', {type = 'tree', parts = {1, 'NUM'}}) - b = box.schema.create_space('archive') + b = box.schema.space.create('archive') b:create_index('first', {type = 'tree', parts = {1, 'STR'}}) expd = require('expirationd') expd._debug = true expd.do_test('origin', 'archive') os.exit() </userinput></programlisting> -The database-specific requests (cfg, create_space, create_index) +The database-specific requests (cfg, space.create, create_index) should already be familiar. 
The key for getting the rock rolling is <code>expd = require('expirationd')</code>. The "require" function is what reads in the program; it will appear diff --git a/doc/user/tutorial.xml b/doc/user/tutorial.xml index 77e82044a9264d420fd0aaf4cdb11749303292b9..dfe81560af544fda47755eb1cca8497fee8e920f 100644 --- a/doc/user/tutorial.xml +++ b/doc/user/tutorial.xml @@ -97,8 +97,8 @@ release=`lsb_release -c -s` <para> There is always an up-to-date Ubuntu repository at <link xlink:href="http://tarantool.org/dist/master/ubuntu">http://tarantool.org/dist/master/ubuntu</link> -The repository contains builds for Ubuntu 12.04 "precise", 12.10 "quantal", -13.04 "raring", 13.10 "saucy", 14.04 "Trusty". +The repository contains builds for Ubuntu 12.04 "precise", +13.10 "saucy", and 14.04 "trusty". Add the tarantool.org repository to your apt sources list $release is an environment variable which will contain the Ubuntu version code e.g. "precise". If you want the version that comes with @@ -174,12 +174,12 @@ add the new section thus: <simplesect> <title>Fedora</title> <para> -These instructions are applicable for Fedora 20. +These instructions are applicable for Fedora 19 or Fedora 20. Pick the Fedora repository, for example <link xlink:href="http://tarantool.org/dist/master/fedora/20/x86_64">http://tarantool.org/dist/master/fedora/20/x86_64</link> for version 20, x86-64. Add the following section to your yum repository list (<filename>/etc/yum.repos.d/tarantool.repo</filename>) (in the following instructions, $releasever i.e. Fedora release -version must be 2 and $basearch i.e. base architecture must be x86_64): +version must be 19 or 20 and $basearch i.e. 
base architecture must be x86_64): <programlisting> <userinput> [tarantool] @@ -194,13 +194,12 @@ For example, if you have Fedora version 20, you can add the new section thus: <userinput> <command>echo</command> "[tarantool]" | \ <command>sudo tee</command> <filename>/etc/yum.repos.d/tarantool.repo</filename> -<command>echo</command> "name=Fedora - Tarantool"| <command>sudo tee</command> <option>-a</option> <filename>/etc/yum.repos.d/tarantool.repo</filename> -<command>echo</command> "baseurl="http://tarantool.org/dist/master/fedora/20/x86_64/" | \ +<command>echo</command> "name=Fedora-20 - Tarantool"| <command>sudo tee</command> <option>-a</option> <filename>/etc/yum.repos.d/tarantool.repo</filename> +<command>echo</command> "baseurl=http://tarantool.org/dist/master/fedora/20/x86_64/" | \ <command>sudo tee</command> <option>-a</option> <filename>/etc/yum.repos.d/tarantool.repo</filename> <command>echo</command> "enabled=1" | <command>sudo tee</command> <option>-a</option> <filename>/etc/yum.repos.d/tarantool.repo</filename> -<command>echo</command> "gpgcheck=0" | <command>sudo tee</command> <option>-a</option> <filename>/etc/yum.repos.d/tarantool.repo</filename> -</userinput> -</programlisting> +<command>echo</command> "gpgcheck=0" | <command>sudo tee</command> <option>-a</option> <filename>/etc/yum.repos.d/tarantool.repo</filename></userinput></programlisting> +Then install with <code>sudo yum install tarantool</code>. </para> </simplesect> @@ -237,7 +236,7 @@ so it's not a true binary download, some source code is involved. First upgrade Clang (the C compiler) to version 3.2 or later using Command Line Tools for Xcode disk image version 4.6+ from Apple Developer web-site. Then download the recipe file from -<link xlink:href="tarantool.org/dist/tarantool.rb">tarantool.org/dist/tarantool.rb</link>. +<link xlink:href="http://tarantool.org/dist/master/tarantool.rb">tarantool.org/dist/master/tarantool.rb</link>. 
Make the file executable, execute it, and the script in the file should handle the necessary steps with cmake, make, and make install. </para> @@ -530,8 +529,7 @@ tarantool> box.cfg{listen=3301} 2014-08-07 09:41:41.095 ... saving snapshot `./00000000000000000000.snap.inprogress' 2014-08-07 09:41:41.127 ... done 2014-08-07 09:41:41.128 ... primary: bound to 0.0.0.0:3301 -2014-08-07 09:41:41.128 ... ready to accept requests -2014-08-07 09:41:41.128 ... started</computeroutput></programlisting> +2014-08-07 09:41:41.128 ... ready to accept requests</computeroutput></programlisting> </para> <para> @@ -549,7 +547,7 @@ Tarantool is waiting for the user to type instructions. <para> To create the first space and the first <link linkend="an-index">index</link>, try this:<programlisting> -<prompt>tarantool> </prompt><userinput>s = box.schema.create_space('tester')</userinput> +<prompt>tarantool> </prompt><userinput>s = box.schema.space.create('tester')</userinput> <prompt>tarantool> </prompt><userinput>i = s:create_index('primary', {type = 'hash', parts = {1, 'NUM'}})</userinput></programlisting> </para> @@ -564,7 +562,7 @@ To select a tuple from the first space of the database, using the first defined key, try this:<programlisting><prompt>tarantool> </prompt><userinput>s:select{3}</userinput></programlisting> Your terminal screen should now look like this:<programlisting><computeroutput> -tarantool> s = box.schema.create_space('tester') +tarantool> s = box.schema.space.create('tester') 2014-06-10 12:04:18.158 ... creating `./00000000000000000002.xlog.inprogress' --- ... @@ -623,9 +621,8 @@ The server name is <computeroutput><filename>tarantool</filename></computeroutpu <command>~/tarantool/src/tarantool</command> </userinput></programlisting> </para> <para> -3. Try these requests:<programlisting><userinput>box.cfg{} -console = require('console') -console.connect('localhost', 3301) +3. 
Try these requests:<programlisting><userinput>console = require('console') +console.connect('localhost:3301') box.space.tester:select{2}</userinput></programlisting> </para> <para> @@ -640,7 +637,7 @@ tarantool> console = require('console') --- ... -tarantool> console.connect('localhost', 3301) +tarantool> console.connect('localhost:3301') 2014-08-31 12:46:54.650 [32628] main/101/interactive I> connected to localhost:3301 --- ... @@ -726,7 +723,7 @@ inserted and selected tuples. </para> <para>DATA DEFINITION WITH RUNTIME REQUESTS RATHER THAN A CONFIGURATION FILE. <computeroutput>box.schema</computeroutput> is a new package for space configuration. - Spaces are added/dropped with box.schema.create_space / box.space.<replaceable>space_name</replaceable>.drop. + Spaces are added/dropped with box.schema.space.create / box.space.<replaceable>space_name</replaceable>.drop. Indexes are added/dropped with box.space.<replaceable>space-name</replaceable>.create_index / box.space.<replaceable>.space-name</replaceable>.index.<replaceable>index_name</replaceable>.drop. "space.estimated_rows" no longer exists. diff --git a/doc/www/content/doc/box-protocol.rst b/doc/www/content/doc/box-protocol.rst index dc3db9027cf94f40a14876b64aa6fa6762621286..a6fa768d2f731c4cd2c8de31407d07931f798b4f 100644 --- a/doc/www/content/doc/box-protocol.rst +++ b/doc/www/content/doc/box-protocol.rst @@ -4,47 +4,93 @@ :url: doc/box-protocol.html :template: documentation +-------------------------------------------------------------------------------- + Notation in diagrams +-------------------------------------------------------------------------------- + +.. 
code-block:: bash + + 0 X + +----+ + | | - X bytes + +----+ + TYPE - type of MsgPack value (if it is MsgPack object) + + +====+ + | | - Variable size MsgPack object + +====+ + TYPE - type of MsgPack value + + +~~~~+ + | | - Variable size MsgPack Array/Map + +~~~~+ + TYPE - type of MsgPack value + + +MsgPack data types: + +* **MP_INT** - Unsigned Integer +* **MP_MAP** - Map +* **MP_ARR** - Array +* **MP_STRING** - String +* **MP_FIXSTR** - Fixed size string +* **MP_OBJECT** - Any MsgPack object + + -------------------------------------------------------------------------------- Overview -------------------------------------------------------------------------------- -IPROTO is a binary request/response protocol. The server begins the dialogue by -sending a fixed-size (128 bytes) text greeting to the client. The first 64 bytes -of the greeting contain server version. The second 64 bytes contain a -base64-encoded random string, to use in authentification packet. +IPROTO is a binary request/response protocol. + +-------------------------------------------------------------------------------- + Greeting Package +-------------------------------------------------------------------------------- + +.. code-block:: bash + + TARANTOOL'S GREETING: + + 0 63 + +--------------------------------------+ + | | + | Tarantool Greeting (server version) | + | 64 bytes | + +---------------------+----------------+ + | | | + | BASE64 encoded SALT | NULL | + | 44 bytes | | + +---------------------+----------------+ + 64 107 127 + +The server begins the dialogue by sending a fixed-size (128 bytes) text greeting +to the client. The first 64 bytes of the greeting contain the server version. The +next 44 bytes contain a base64-encoded random string, to use in the authentication +packet. The greeting ends with 20 bytes of spaces. 
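As a hedged illustration of the greeting layout described above, the following Python sketch splits a 128-byte greeting into the version text and the decoded salt. The function name ``parse_greeting``, the size constants, and the sample version text are assumptions made for this example; they are not taken from the Tarantool sources.

```python
import base64

GREETING_SIZE = 128   # total size of the greeting, per the text above
VERSION_SIZE = 64     # first block: server version
SALT_SIZE = 44        # base64-encoded salt that follows the version block


def parse_greeting(greeting: bytes):
    """Split a 128-byte greeting into (version text, decoded salt bytes)."""
    if len(greeting) != GREETING_SIZE:
        raise ValueError("greeting must be exactly 128 bytes")
    # Padding after the version text is stripped; spaces, NULs and newlines
    # are all tolerated because the padding byte is an assumption here.
    version = greeting[:VERSION_SIZE].decode("ascii").rstrip("\x00 \n")
    encoded_salt = greeting[VERSION_SIZE:VERSION_SIZE + SALT_SIZE]
    salt = base64.b64decode(encoded_salt)
    return version, salt


# A made-up greeting, built only to exercise the parser.
fake_salt = base64.b64encode(bytes(range(32)))   # 32 raw bytes -> 44 characters
fake = b"Tarantool 1.6 (hypothetical)".ljust(VERSION_SIZE) + fake_salt + b" " * 20
version, salt = parse_greeting(fake)
```

A real client would read exactly 128 bytes from the socket before calling such a function, and would keep the salt for the authentication request.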
+ +-------------------------------------------------------------------------------- + Unified package structure +-------------------------------------------------------------------------------- Once a greeting is read, the protocol becomes pure request/response and features a complete access to Tarantool functionality, including: -- request multiplexing, e.g. ability to asynchronously issue multiple requests\ +- request multiplexing, e.g. ability to asynchronously issue multiple requests via the same connection - response format that supports zero-copy writes For data structuring and encoding, the protocol uses msgpack data format, see http://msgpack.org -Since msgpack uses a variable representation for compound data structures, such -as arrays and maps, the exact byte sequence mandated by msgpack format is omitted -in this spec. This spec therefore only defines the expected **schema** of msgpack -streams. - -To specify that a msgpack map is expected in the stream, the contents of the map is -put into "{}" (curly braces). - -To specify that a msgpack array is expected in the stream, the contents of the array -is put into "[]" (square brackets). - -A single key-value pair in a map is separated by a ":" (semicolon), values of a map -or array are separated by "," (comma). - -Tarantool protocol mandates use of a few integer constants serving as keys in maps -used in the protocol. These constants are defined in -https://github.com/tarantool/tarantool/blob/master/src/iproto_constants.h +Tarantool protocol mandates use of a few integer constants serving as keys in +maps used in the protocol. These constants are defined in `src/box/iproto_constants.h +<https://github.com/tarantool/tarantool/blob/master/src/iproto_constants.h>`_ Let's list them here too: -.. code-block:: lua +.. 
code-block:: bash + -- user keys <code> ::= 0x00 <sync> ::= 0x01 <space_id> ::= 0x10 @@ -55,255 +101,403 @@ Let's list them here too: <key> ::= 0x20 <tuple> ::= 0x21 <function_name> ::= 0x22 + <username> ::= 0x23 <data> ::= 0x30 <error> ::= 0x31 +.. code-block:: bash -The value of the constant defines the type of value of the map. For example, -for :code:`<error> key (0x31)`, the expected value is a msgpack string with -error message. All requests and responses utilize the same basic structure + -- -- Value for <code> key in request can be: + -- User command codes + <select> ::= 0x01 + <insert> ::= 0x02 + <replace> ::= 0x03 + <update> ::= 0x04 + <delete> ::= 0x05 + <call> ::= 0x06 + <auth> ::= 0x07 + -- Admin command codes + <ping> ::= 0x40 -.. code-block:: lua - - <packet> ::= <request> | <response> - <request> ::= <len><header><body> - <response> ::= <len><header><body> - <len> is the length of the packet, in msgpack format. + -- -- Value for <code> key in response can be: + <OK> ::= 0x00 + <ERROR> ::= 0x8XXX -Implementor note: for simplicity of the implementation, the server never -"compresses" the packet length, i.e. it is always passed as msgpack 32-bit -unsigned int, :code:`0xce b4 b3 b2 b1 (5 bytes)` - -.. code-block:: lua - - <len> ::= msgpack Int (unsigned) Both :code:`<header>` and :code:`<body>` are msgpack maps: -.. code-block:: lua +.. code-block:: bash - <header> ::= { (<key> : <value>)+ } - <body> ::= { (<key> : <value>)+ } + Request/Response: -They only differ in the allowed set of keys and values, the key defines the -type of value that follows. If a key is missing, and expects an integer value, -the missing value is always assumed to be 0. If the missing key assumes a string -value, the string is assumed to be empty. If a body has no keys, entire msgpack -map for the body may be missing. Such is the case, for example, in <ping> request. 
+ 0 5 + +------+ +============+ +===================================+ + |BODY +| | | | | + |HEADER| | HEADER | | BODY | + | SIZE | | | | | + +------+ +============+ +===================================+ + MP_INT MP_MAP MP_MAP -.. code-block:: lua +.. code-block:: bash - <key> ::= <header_key> | <body_key> - <header_key> ::= <code> | <sync> + UNIFIED HEADER: -:code:`<code>` is request code or response code + +================+================+ + | | | + | 0x00: CODE | 0x01: SYNC | + | MP_INT: MP_INT | MP_INT: MP_INT | + | | | + +================+================+ + MP_MAP + +They only differ in the allowed set of keys and values, the key defines the +type of value that follows. If a body has no keys, entire msgpack map for +the body may be missing. Such is the case, for example, in <ping> request. -------------------------------------------------------------------------------- - Request packet structure + Authorization -------------------------------------------------------------------------------- -.. code-block:: lua +.. code-block:: bash - Value for <code> key in request can be: - 1 -- <select> - 2 -- <insert> - 3 -- <replace> - 4 -- <update> - 5 -- <delete> - 6 -- <call> - 7 -- <auth> - 64 -- <ping> - 66 -- <subscribe> + PREPARE SCRAMBLE: -:code:`<sync>` is a unique request identifier, preserved in the response, The -identifier is necessary to allow request multiplexing -- i.e. sending multiple -requests through the same connection before fetching a response to any of them. -The value of the identifier currently bears no meaning to the server. Consequently, -<sync> can be 0 or two requests can have an identical id. + LEN(ENCODED_SALT) = 44; + LEN(SCRAMBLE) = 20; -.. 
code-block:: lua - - <body_key> ::= <request_key> | <response_key> + prepare 'chap-sha1' scramble: -Different request types allow different keys in the body: + salt = base64_decode(encoded_salt); + step_1 = sha1(password); + step_2 = sha1(step_1); + step_3 = sha1(salt, step_2); + scramble = xor(step_1, step_3); + return scramble; -.. code-block:: lua - - <request_key> ::= <select> | <replace> | <delete> | <update> | <call> + AUTHORIZATION BODY: CODE = 0x07 -Find tuples matching the search pattern + +==================+====================================+ + | | +-------------+-----------+ | + | (KEY) | (TUPLE)| len == 9 | len == 20 | | + | 0x23:USERNAME | 0x21:| "chap-sha1" | SCRAMBLE | | + | MP_INT:MP_STRING | MP_INT:| MP_STRING | MP_STRING | | + | | +-------------+-----------+ | + | | MP_ARRAY | + +==================+====================================+ + MP_MAP + +:code:`<key>` holds the user name. :code:`<tuple>` must be an array of 2 fields: +authentication mechanism ("chap-sha1" is the only supported mechanism right now) +and password, encrypted according to the specified mechanism. Authentication in +Tarantool is optional; if no authentication is performed, the session user is 'guest'. +The server responds to an authentication packet with a standard response with 0 tuples. + +-------------------------------------------------------------------------------- + Requests +-------------------------------------------------------------------------------- -.. code-block:: lua - - <select> ::= <space_id> | <index_id> | <iterator> | <offset> | <limit> | <key> +* SELECT: CODE - 0x01 + Find tuples matching the search pattern + +.. 
code-block:: bash + + SELECT BODY: + + +==================+==================+==================+ + | | | | + | 0x10: SPACE_ID | 0x11: INDEX_ID | 0x12: LIMIT | + | MP_INT: MP_INT | MP_INT: MP_INT | MP_INT: MP_INT | + | | | | + +==================+==================+==================+ + | | | | + | 0x13: OFFSET | 0x14: ITERATOR | 0x20: KEY | + | MP_INT: MP_INT | MP_INT: MP_INT | MP_INT: MP_ARRAY | + | | | | + +==================+==================+==================+ + MP_MAP + +* INSERT: CODE - 0x02 + Insert a tuple into the space if no tuple with the same unique keys exists; otherwise return a *duplicate key* error. +* REPLACE: CODE - 0x03 + Insert a tuple into the space or replace an existing one. + +.. code-block:: bash + + + INSERT/REPLACE BODY: + + +==================+==================+ + | | | + | 0x10: SPACE_ID | 0x21: TUPLE | + | MP_INT: MP_INT | MP_INT: MP_ARRAY | + | | | + +==================+==================+ + MP_MAP + +* UPDATE: CODE - 0x04 + Update a tuple + +.. code-block:: bash + + UPDATE BODY: + + +==================+==================+==================+=======================+ + | | | | +~~~~~~~~~~+ | + | | | | | | | + | | | | (TUPLE) | OP | | + | 0x10: SPACE_ID | 0x11: INDEX_ID | 0x20: KEY | 0x21: | | | + | MP_INT: MP_INT | MP_INT: MP_INT | MP_INT: MP_ARRAY | MP_INT: +~~~~~~~~~~+ | + | | | | MP_ARRAY | + +==================+==================+==================+=======================+ + MP_MAP + +.. code-block:: bash + + OP: + Works only for integer fields: + * Addition OP = '+' . space[key][field_no] += argument + * Subtraction OP = '-' . space[key][field_no] -= argument + * Bitwise AND OP = '&' . space[key][field_no] &= argument + * Bitwise XOR OP = '^' . space[key][field_no] ^= argument + * Bitwise OR OP = '|' . 
space[key][field_no] |= argument + Works on any fields: + * Delete OP = '#' + delete <argument> fields starting from <field_no> in the space[<key>] + + 0 2 + +-----------+==========+==========+ + | | | | + | OP | FIELD_NO | ARGUMENT | + | MP_FIXSTR | MP_INT | MP_INT | + | | | | + +-----------+==========+==========+ + MP_ARRAY + +.. code-block:: bash + + * Insert OP = '!' + insert <argument> before <field_no> + * Assign OP = '=' + assign <argument> to field <field_no>. + will extend the tuple if <field_no> == <max_field_no> + 1 + + 0 2 + +-----------+==========+===========+ + | | | | + | OP | FIELD_NO | ARGUMENT | + | MP_FIXSTR | MP_INT | MP_OBJECT | + | | | | + +-----------+==========+===========+ + MP_ARRAY + + Works on string fields: + * Splice OP = ':' + take the string from space[key][field_no] and + substitute <offset> bytes from <position> with <argument> + +.. code-block:: bash + + 0 2 + +-----------+==========+==========+========+==========+ + | | | | | | + | ':' | FIELD_NO | POSITION | OFFSET | ARGUMENT | + | MP_FIXSTR | MP_INT | MP_INT | MP_INT | MP_STR | + | | | | | | + +-----------+==========+==========+========+==========+ + MP_ARRAY + + +It's an error to specify an argument of a type that differs from the expected type. + +* DELETE: CODE - 0x05 + Delete a tuple + +.. code-block:: bash + + DELETE BODY: + + +==================+==================+==================+ + | | | | + | 0x10: SPACE_ID | 0x11: INDEX_ID | 0x20: KEY | + | MP_INT: MP_INT | MP_INT: MP_INT | MP_INT: MP_ARRAY | + | | | | + +==================+==================+==================+ + MP_MAP + + +* CALL: CODE - 0x06 + Call a stored function + +.. code-block:: bash + + CALL BODY: + + +=======================+==================+ + | | | + | 0x22: FUNCTION_NAME | 0x21: TUPLE | + | MP_INT: MP_STRING | MP_INT: MP_ARRAY | + | | | + +=======================+==================+ + MP_MAP -Insert a tuple into the space or replace an existing one.
+-------------------------------------------------------------------------------- + Response packet structure +-------------------------------------------------------------------------------- -.. code-block:: lua - - <replace> ::= <space_id> | <tuple> +We'll show whole packets here: -Insert is similar to replace, but will return a duplicate key error if such tuple -already exists. +.. code-block:: bash -.. code-block:: lua - - <insert> ::= <space_id> | <tuple> -Delete a tuple + OK: LEN + HEADER + BODY -.. code-block:: lua - - <delete> ::= <space_id> | <index_id> | <key> + 0 5 OPTIONAL + +------++================+================++===================+ + | || | || | + | BODY || 0x00: 0x00 | 0x01: SYNC || 0x30: DATA | + |HEADER|| MP_INT: MP_INT | MP_INT: MP_INT || MP_INT: MP_OBJECT | + | SIZE || | || | + +------++================+================++===================+ + MP_INT MP_MAP MP_MAP -Update a tuple +Set of tuples in the response :code:`<data>` expects a msgpack array of tuples as value -.. code-block:: lua - - <udpate> ::= <space_id> | <index_id> | <key> | <tuple> +.. code-block:: bash -Call a stored function + ERROR: LEN + HEADER + BODY -.. code-block:: lua - - <call> ::= <function_name> | <tuple> + 0 5 + +------++================+================++===================+ + | || | || | + | BODY || 0x00: 0x8XXX | 0x01: SYNC || 0x31: ERROR | + |HEADER|| MP_INT: MP_INT | MP_INT: MP_INT || MP_INT: MP_STRING | + | SIZE || | || | + +------++================+================++===================+ + MP_INT MP_MAP MP_MAP -Authenticate a session -:code:`<key>` holds the user name. :code:`<tuple>` must be an array of 2 fields: -authentication mechanism ("chap-sha1" is the only supported mechanism right now) -and password, encrypted according to the specified mechanism -https://github.com/tarantool/tarantool/blob/master/src/scramble.h -for instructions how to prepare a hashed password for "chap-sha1" authentication -mechanism. 
Authentication in Tarantool is optional, if no authentication is -performed, session user is 'guest'. The server responds to authentication packet -with a standard response with 0 tuples. - -.. code-block:: lua - - <auth> ::= <key> | <tuple> - -As can be seen from the grammar some requests have common keys, whereas other -keys can be present only in a body of a single request type. - -:code:`<space_id>` space to use in the request To find the numeric space id by -space name, one must first query :code:`_space` system space. Id of :code:`_space` -system space is defined in :code:`box.schema.SPACE_ID` (global Lua variable set -in package :code:`box`) - -:code:`<index_id>` index id of the index to use in the request Similarly to -space, to find the numeric index id by index name, one must query the -:code:`_index` system space. Id of :code:`_index` system space is defined in -:code:`box.schema.INDEX_ID` (global Lua variable set in package :code:`box`). -:code:`<tuple>` defines the actual argument of the operation in :code:`<replace>` -it defines the tuple which will be inserted into the database. In :code:`<call>` -it defines call arguments. When request body allows :code:`<tuple>` as a key, -it must always be present, since otherwise the request is meaningless. - -:code:`<offset>` specifies offset in the result set, expects :code:`<uint32>` -value :code:`<limit>` specifies limit in the result set, expects a :code:`<uint32>` -value :code:`<iterator>` specifies the iterator type to use in search, an integer -constant from the range defined in -https://github.com/tarantool/tarantool/blob/master/src/box/index.h#L61 -:code:`<function_name>` is used to give call path for a Lua function :code:`<tuple>` -in :code:`<update>` must carry a list of update operations: - -.. 
code-block:: lua - - <op_list> ::= [ (<operation>)+ ] - <operation> ::= [ <op>, <field_no>, (<argument>)+ ] - <field_no> ::= <int32> - - -:code:`<op>` is a 1-byte ASCII string carrying operation code: - -- "=" - assign operation argument to field <field_no> .\ - will extend the tuple if <field_no> == <max_field_no> + 1 -- "#" - delete <argument> fields starting from <field_no> -- "!" - insert <argument> before <field_no> - -The following operation(s) are only defined for integer types (32 and 64 bit): - -- "+" - add argument to field <field_no>, both arguments \ - are treated as signed 32 or 64 -bit ints -- "-" - subtract argument from the field <field_no> -- "&" - bitwise AND of argument and field <field_no> -- "^" - bitwise XOR of argument and field <field_no> -- "|" - bitwise OR of argument and field <field_no> - -Finally there is an operation that expects offset, cut length and string paste -arguments - -- ":" - implementation of Perl 'splice' command - - -It's an error to specify an argument of a type that -differs from expected type. + Where 0xXXX is ERRCODE. + +Error message is present in the response only if there is an error :code:`<error>` +expects as value a msgpack string + +Convenience macros which define hexadecimal constants for return codes +can be found in `src/errcode.h +<https://github.com/tarantool/tarantool/blob/master/src/errcode.h>`_ -------------------------------------------------------------------------------- - Response packet structure + Replication packet structure -------------------------------------------------------------------------------- -Value of :code:`<code>` key in response is: +.. code-block:: bash + + -- replication keys + <server_id> ::= 0x02 + <lsn> ::= 0x03 + <timestamp> ::= 0x04 + <server_uuid> ::= 0x24 + <cluster_uuid> ::= 0x25 + <vclock> ::= 0x26 + +.. code-block:: bash + + -- replication codes + <join> ::= 0x41 + <subscribe> ::= 0x42 + + +.. 
code-block:: bash + + JOIN: + + In the beginning you must send JOIN + HEADER BODY + +================+================+===================++-------+ + | | | SERVER_UUID || | + | 0x00: 0x41 | 0x01: SYNC | 0x24: UUID || EMPTY | + | MP_INT: MP_INT | MP_INT: MP_INT | MP_INT: MP_STRING || | + | | | || | + +================+================+===================++-------+ + MP_MAP MP_MAP + + Then the server we connect to will send the last SNAP file by simply + creating a number of INSERTs (with additional LSN and ServerID) (don't reply). + Then it'll send a vclock's MP_MAP and close the socket. + + +================+================++============================+ + | | || +~~~~~~~~~~~~~~~~~+ | + | | || | | | + | 0x00: 0x00 | 0x01: SYNC || 0x26:| SRV_ID: SRV_LSN | | + | MP_INT: MP_INT | MP_INT: MP_INT || MP_INT:| MP_INT: MP_INT | | + | | || +~~~~~~~~~~~~~~~~~+ | + | | || MP_MAP | + +================+================++============================+ + MP_MAP MP_MAP + + SUBSCRIBE: + + Then you must send SUBSCRIBE: + + HEADER + +================+================+===================+===================+ + | | | SERVER_UUID | CLUSTER_UUID | + | 0x00: 0x42 | 0x01: SYNC | 0x24: UUID | 0x25: UUID | + | MP_INT: MP_INT | MP_INT: MP_INT | MP_INT: MP_STRING | MP_INT: MP_STRING | + | | | | | + +================+================+===================+===================+ + MP_MAP + BODY + +================+ + | | + | 0x26: VCLOCK | + | MP_INT: MP_MAP | + | | + +================+ + MP_MAP + + Then you must process every query that will come through other masters. + Every request between masters will have an additional LSN and SERVER_ID. -.. 
code-block:: lua - - 0 -- SUCCESS - !0 -- Tarantool error code +-------------------------------------------------------------------------------- + XLOG / SNAP +-------------------------------------------------------------------------------- -If response :code:`<code>` is 0 (success), response body contains zero or more -tuples, otherwise it carries an error message that corresponds to the return code. +XLOG and SNAP files now share one format. For example, they start with: -On success, the server always returns a tuple or tuples, when found. I.e. on success, -response :code:`<body>` contains :code:`<set>` key. For select/update/delete, it's -the tuple that matched the search criterion. For :code:`<replace>`, it's the inserted -tuple. For :code:`<call>`, it's whatever the called function returns. +.. code-block:: bash -.. code-block:: lua - - <response_key> = <data> | <error> + SNAP\n + 0.12\n + Server: e6eda543-eda7-4a82-8bf4-7ddd442a9275\n + VClock: {1: 0}\n + \n + ... -Set of tuples in the response :code:`<data>` expects a msgpack array of tuples as value +So, the **Header** of SNAP/XLOG consists of: -Error message is present in the response only if there is an error :code:`<error>` -expects as value a msgpack string +.. code-block:: bash -The error :code:`<code>` consists of the actual error code and request completion status -(completion status is complementary since it can be deduced from the error code) -There are only 3 completion status codes in use: - -.. code-block:: lua - - 0 - success; The only possible error code with this status is - 0, ER_OK - 1 - try again; An indicator of an intermittent error. - Usually is returned when two clients attempt to change - the same tuple simultaneously. - (<update> is not always done atomically) - 2 - error - -The error code holds the actual error. Existing error codes include: - -.. 
code-block:: lua - - Completion status 0 (success) - ------------------------------------------- - 0x00000000 -- ER_OK - - Completion status 1 (try again) - ------------------------------------------- - 0x00000201 -- ER_MEMORY_ISSUE - An error occurred when allocating memory - - Completion status 2 (error) - ------------------------------------------- - 0x00000102 -- ER_ILLEGAL_PARAMS - Malformed query - 0x00000302 -- ER_TUPLE_FOUND - Duplicate key exists in a unique index - -Convenience macros which define hexadecimal constants for :code:`<int32>` return -codes (completion status + code) can be found here: -https://github.com/tarantool/tarantool/blob/master/src/errcode.h + <format>\n + <format_version>\n + Server: <server_uuid>\n + VClock: <vclock_map>\n + \n --------------------------------------------------------------------------------- - Additional packets --------------------------------------------------------------------------------- -TODO +There are two markers: tuple beginning - **0xd5ba0bab** and EOF marker - **0xd510aded**. Between the **Header** and the EOF marker, the data has the following layout: + +.. 
code-block:: bash + + 0 3 4 17 + +-------------+========+============+===========+=========+ + | | | | | | + | 0xd5ba0bab | LENGTH | CRC32 PREV | CRC32 CUR | PADDING | + | | | | | | + +-------------+========+============+===========+=========+ + MP_FIXEXT2 MP_INT MP_INT MP_INT --- + + +============+ +===================================+ + | | | | + | HEADER | | BODY | + | | | | + +============+ +===================================+ + MP_MAP MP_MAP diff --git a/doc/www/theme/static/pygmentize.css b/doc/www/theme/static/pygmentize.css index b089f5bc652babc6b8473dde6f0c0b0279b10fc5..0f31924d1d32137ab54d75df16ea56fc4d5418cf 100644 --- a/doc/www/theme/static/pygmentize.css +++ b/doc/www/theme/static/pygmentize.css @@ -1,5 +1,5 @@ .highlight .hll { background-color: #49483e } -.highlight { background: #272822; color: #f8f8f2; } +.highlight { background: #272822; color: #f8f8f2; font-size: 75%} div .highlight { margin: 20px} .highlight pre { padding: 10px} .highlight .c { color: #75715e } /* Comment */ diff --git a/doc/www/theme/templates/script b/doc/www/theme/templates/script index 2526976f4d97a0e6af8910c62489d01b6a37b1f5..255753008cb53b7c5b2ae71c6e3aaa5b822c1bd3 100644 --- a/doc/www/theme/templates/script +++ b/doc/www/theme/templates/script @@ -1,18 +1,14 @@ <script type="text/javascript"> + (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){ + (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o), + m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m) + })(window,document,'script','//www.google-analytics.com/analytics.js','ga'); - var _gaq = _gaq || []; - _gaq.push(['_setAccount', 'UA-22120502-1']); - _gaq.push(['_trackPageview']); - - (function() { - var ga = document.createElement('script'); ga.type = 'text/javascript'; ga.async = true; - ga.src = ('https:' == document.location.protocol ? 
'https://ssl' : 'http://www') + '.google-analytics.com/ga.js'; - var s = document.getElementsByTagName('script')[0]; s.parentNode.insertBefore(ga, s); - })(); + ga('create', 'UA-22120502-1', 'auto'); + ga('send', 'pageview'); </script> - <script type="text/javascript">//<![CDATA[ (function(w,n,d,r,s){(new Image).src='http://dd.cd.b2.a2.top.mail.ru/counter?id=2284916;js=13'+ ((r=d.referrer)?';r='+escape(r):'')+((s=w.screen)?';s='+s.width+'*'+s.height:'')+';_='+Math.random();})(window,navigator,document);//]]> diff --git a/src/box/CMakeLists.txt b/src/box/CMakeLists.txt index bef3d928599d4b02f47441cda1c6e40f66c78c02..8cb6295982548b00add6647eaaf34ff606589f1e 100644 --- a/src/box/CMakeLists.txt +++ b/src/box/CMakeLists.txt @@ -41,7 +41,8 @@ add_library(box request.cc txn.cc box.cc - access.cc + user_def.cc + user_cache.cc authentication.cc vclock.c cluster.cc diff --git a/src/box/alter.cc b/src/box/alter.cc index 297b50b097927fa7ac2ea6981c4f51680d6a3718..00c78ee5720ef2ae15f00574edfb6a0fa5ddb28c 100644 --- a/src/box/alter.cc +++ b/src/box/alter.cc @@ -28,7 +28,8 @@ */ #include "alter.h" #include "schema.h" -#include "access.h" +#include "user_def.h" +#include "user_cache.h" #include "space.h" #include "txn.h" #include "tuple.h" @@ -39,6 +40,7 @@ #include <stdio.h> /* snprintf() */ #include <ctype.h> #include "cluster.h" /* for cluster_set_uuid() */ +#include "session.h" /* to fetch the current user. */ /** _space columns */ #define ID 0 @@ -70,14 +72,15 @@ void access_check_ddl(uint32_t owner_uid) { - struct user *user = user(); + struct current_user *user = current_user(); /* * Only the creator of the space or superuser can modify * the space, since we don't have ALTER privilege. 
*/ if (owner_uid != user->uid && user->uid != ADMIN) { + struct user_def *def = user_cache_find(user->uid); tnt_raise(ClientError, ER_ACCESS_DENIED, - "Create or drop", user->name); + "Create or drop", def->name); } } @@ -1155,7 +1158,7 @@ user_has_data(uint32_t uid) */ void -user_fill_auth_data(struct user *user, const char *auth_data) +user_fill_auth_data(struct user_def *user, const char *auth_data) { uint8_t type = mp_typeof(*auth_data); if (type == MP_ARRAY || type == MP_NIL) { @@ -1199,7 +1202,7 @@ user_fill_auth_data(struct user *user, const char *auth_data) } void -user_create_from_tuple(struct user *user, struct tuple *tuple) +user_create_from_tuple(struct user_def *user, struct tuple *tuple) { /* In case user password is empty, fill it with \0 */ memset(user, 0, sizeof(*user)); @@ -1252,7 +1255,7 @@ user_cache_alter_user(struct trigger * /* trigger */, void *event) { struct txn *txn = (struct txn *) event; struct txn_stmt *stmt = txn_stmt(txn); - struct user user; + struct user_def user; user_create_from_tuple(&user, stmt->new_tuple); user_cache_replace(&user); } @@ -1271,9 +1274,9 @@ on_replace_dd_user(struct trigger * /* trigger */, void *event) uint32_t uid = tuple_field_u32(old_tuple ? old_tuple : new_tuple, ID); - struct user *old_user = user_cache_find(uid); + struct user_def *old_user = user_by_id(uid); if (new_tuple != NULL && old_user == NULL) { /* INSERT */ - struct user user; + struct user_def user; user_create_from_tuple(&user, new_tuple); (void) user_cache_replace(&user); struct trigger *on_rollback = @@ -1305,7 +1308,7 @@ on_replace_dd_user(struct trigger * /* trigger */, void *event) * password) but first check that the change is * correct. 
*/ - struct user user; + struct user_def user; user_create_from_tuple(&user, new_tuple); struct trigger *on_commit = txn_alter_trigger_new(user_cache_alter_user, NULL); @@ -1319,8 +1322,19 @@ func_def_create_from_tuple(struct func_def *func, struct tuple *tuple) { func->fid = tuple_field_u32(tuple, ID); func->uid = tuple_field_u32(tuple, UID); - func->auth_token = BOX_USER_MAX; /* invalid value */ func->setuid = false; + /* + * Do not initialize the privilege cache right away since + * when loading up a function definition during recovery, + * user cache may not be filled up yet (space _user is + * recovered after space _func), so no user cache entry + * may exist yet for such user. The cache will be filled + * up on demand upon first access. + * + * Later on consistency of the cache is ensured by DDL + * checks (see user_has_data()). + */ + func->setuid_user.auth_token = BOX_USER_MAX; /* invalid value */ const char *name = tuple_field_cstr(tuple, NAME); uint32_t len = strlen(name); if (len >= sizeof(func->name)) { @@ -1434,12 +1448,9 @@ priv_def_create_from_tuple(struct priv_def *priv, struct tuple *tuple) static void priv_def_check(struct priv_def *priv) { - struct user *grantor = user_cache_find(priv->grantor_id); - struct user *grantee = user_cache_find(priv->grantee_id); - if (grantor == NULL) { - tnt_raise(ClientError, ER_NO_SUCH_USER, - int2str(priv->grantor_id)); - } + struct user_def *grantor = user_cache_find(priv->grantor_id); + /* May be a role */ + struct user_def *grantee = user_by_id(priv->grantee_id); if (grantee == NULL) { tnt_raise(ClientError, ER_NO_SUCH_USER, int2str(priv->grantee_id)); @@ -1482,14 +1493,20 @@ priv_def_check(struct priv_def *priv) static void grant_or_revoke(struct priv_def *priv) { - struct user *grantee = user_cache_find(priv->grantee_id); + struct user_def *grantee = user_by_id(priv->grantee_id); if (grantee == NULL) return; struct access *access = NULL; switch (priv->object_type) { case SC_UNIVERSE: + { access = 
&grantee->universal_access; + /** Update cache at least in the current session. */ + struct current_user *user = current_user(); + if (grantee->uid == user->uid) + user->universal_access = priv->access; break; + } case SC_SPACE: { struct space *space = space_by_id(priv->object_id); diff --git a/src/box/authentication.cc b/src/box/authentication.cc index 89f03d5d47bed7b54133e4d72982d7691ae81702..b35669ad642c769f7efb85b028c668abba3a12c4 100644 --- a/src/box/authentication.cc +++ b/src/box/authentication.cc @@ -26,20 +26,16 @@ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ -#include "access.h" +#include "user_cache.h" +#include "user_def.h" +#include "session.h" void authenticate(const char *user_name, uint32_t len, const char *tuple, const char * /* tuple_end */) { - struct user *user = user_by_name(user_name, len); - if (user == NULL) { - char name[BOX_NAME_MAX + 1]; - /* \0 - to correctly print user name the error message. */ - snprintf(name, sizeof(name), "%.*s", len, user_name); - tnt_raise(ClientError, ER_NO_SUCH_USER, name); - } - struct session *session = session(); + struct user_def *user = user_cache_find_by_name(user_name, len); + struct session *session = current_session(); uint32_t part_count = mp_decode_array(&tuple); if (part_count < 2) { /* Expected at least: authentication mechanism and data. */ @@ -58,6 +54,6 @@ authenticate(const char *user_name, uint32_t len, if (scramble_check(scramble, session->salt, user->hash2)) tnt_raise(ClientError, ER_PASSWORD_MISMATCH, user->name); - session_set_user(session, user->auth_token, user->uid); + current_user_init(&session->user, user); } diff --git a/src/box/box.cc b/src/box/box.cc index 5f079c7e811580fecec6fc870c593077dc56f2ba..d6c744896c54c33de53e306c392a76f437391f0d 100644 --- a/src/box/box.cc +++ b/src/box/box.cc @@ -27,15 +27,13 @@ * SUCH DAMAGE. 
*/ #include "box/box.h" -#include <arpa/inet.h> -#include <sys/wait.h> -#include <errcode.h> -#include "recovery.h" -#include "log_io.h" #include <say.h> #include "iproto.h" +#include "iproto_constants.h" +#include "recovery.h" #include "replication.h" +#include "replica.h" #include <stat.h> #include <tarantool.h> #include "tuple.h" @@ -50,7 +48,7 @@ #include "request.h" #include "txn.h" #include "fiber.h" -#include "access.h" +#include "user_cache.h" #include "cfg.h" #include "iobuf.h" @@ -330,7 +328,7 @@ box_on_cluster_join(const tt_uuid *server_uuid) } void -box_process_join(struct xrow_header *header) +box_process_join(int fd, struct xrow_header *header) { assert(header->type == IPROTO_JOIN); struct tt_uuid server_uuid = uuid_nil; @@ -338,15 +336,15 @@ box_process_join(struct xrow_header *header) box_on_cluster_join(&server_uuid); - /* process JOIN request via replication relay */ - replication_join(session()->fd, header); + /* Process JOIN request via replication relay */ + replication_join(fd, header); } void -box_process_subscribe(struct xrow_header *header) +box_process_subscribe(int fd, struct xrow_header *header) { /* process SUBSCRIBE request via replication relay */ - replication_subscribe(session()->fd, header); + replication_subscribe(fd, header); } /** Replace the current server id in _cluster */ @@ -383,6 +381,7 @@ box_free(void) { if (recovery == NULL) return; + session_free(); user_cache_free(); schema_free(); tuple_free(); @@ -390,7 +389,6 @@ box_free(void) recovery = NULL; engine_shutdown(); stat_free(); - session_free(); } static void @@ -410,7 +408,6 @@ box_init() box_check_config(); title("loading", NULL); - session_init(); replication_prefork(cfg_gets("snap_dir"), cfg_gets("wal_dir")); stat_init(); @@ -422,6 +419,12 @@ box_init() schema_init(); user_cache_init(); + /* + * The order is important: to initialize sessions, + * we need to access the admin user, which is used + * as a default session user when running triggers. 
+ */ + session_init(); /* recovery initialization */ recovery = recovery_new(cfg_gets("snap_dir"), cfg_gets("wal_dir"), diff --git a/src/box/box.h b/src/box/box.h index a2ebc2f28912092ab442fe9d7ac43e815ab01b23..23ad5451e5a23465100b16f4a620431efac6e240 100644 --- a/src/box/box.h +++ b/src/box/box.h @@ -101,10 +101,10 @@ void box_leave_local_standby_mode(void *data __attribute__((unused))); void -box_process_join(struct xrow_header *header); +box_process_join(int fd, struct xrow_header *header); void -box_process_subscribe(struct xrow_header *header); +box_process_subscribe(int fd, struct xrow_header *header); /** * Check Lua configuration before initialization or diff --git a/src/box/iproto.cc b/src/box/iproto.cc index 0c2470153e45f8f6d8e5cf9a6cac2d79ad3e5974..bce1016e6c2c27ffe470ed24f173ca4dcb6edf5a 100644 --- a/src/box/iproto.cc +++ b/src/box/iproto.cc @@ -47,18 +47,7 @@ #include "coio.h" #include "xrow.h" #include "iproto_constants.h" - -class IprotoConnectionShutdown: public Exception -{ -public: - IprotoConnectionShutdown(const char *file, int line) - :Exception(file, line) {} - virtual void log() const; -}; - -void -IprotoConnectionShutdown::log() const -{} +#include "user_def.h" /* {{{ iproto_request - declaration */ @@ -210,6 +199,7 @@ iproto_queue_handler(va_list ap) while ((request = iproto_queue_pop(i_queue))) { IprotoRequestGuard guard(request); fiber_set_session(fiber(), request->session); + fiber_set_user(fiber(), &request->session->user); request->process(request); } /** Put the current fiber into a queue fiber cache. */ @@ -346,8 +336,8 @@ iproto_connection_delete(struct iproto_connection *con) assert(iproto_connection_is_idle(con)); assert(!evio_is_active(&con->output)); if (con->session) { - fiber_set_session(fiber(), con->session); - session_run_on_disconnect_triggers(con->session); + if (! 
rlist_empty(&session_on_disconnect)) + session_run_on_disconnect_triggers(con->session); session_destroy(con->session); } iobuf_delete(con->iobuf[0]); @@ -556,8 +546,6 @@ iproto_connection_on_input(ev_loop *loop, struct ev_io *watcher, */ if (!ev_is_active(&con->input)) ev_feed_event(loop, &con->input, EV_READ); - } catch (IprotoConnectionShutdown *e) { - iproto_connection_shutdown(con); } catch (Exception *e) { e->log(); iproto_connection_close(con); @@ -709,13 +697,17 @@ iproto_process_admin(struct iproto_request *ireq) ireq->header.sync); break; case IPROTO_JOIN: - box_process_join(&ireq->header); - /* TODO: check requests in `con; queue */ + ev_io_stop(con->loop, &con->input); + ev_io_stop(con->loop, &con->output); + box_process_join(con->input.fd, &ireq->header); + /* TODO: check requests in `con' queue */ iproto_connection_shutdown(con); return; case IPROTO_SUBSCRIBE: - box_process_subscribe(&ireq->header); - /* TODO: check requests in `con; queue */ + ev_io_stop(con->loop, &con->input); + ev_io_stop(con->loop, &con->output); + box_process_subscribe(con->input.fd, &ireq->header); + /* TODO: check requests in `con' queue */ iproto_connection_shutdown(con); return; default: @@ -769,10 +761,8 @@ iproto_process_connect(struct iproto_request *request) con->session = session_create(fd, con->cookie); coio_write(&con->input, iproto_greeting(con->session->salt), IPROTO_GREETING_SIZE); - fiber_set_session(fiber(), con->session); - session_run_on_connect_triggers(con->session); - /* Set session user to guest, until it is authenticated. */ - session_set_user(con->session, GUEST, GUEST); + if (! 
rlist_empty(&session_on_connect)) + session_run_on_connect_triggers(con->session); } catch (ClientError *e) { iproto_reply_error(&iobuf->out, e, request->header.type); try { @@ -784,7 +774,6 @@ iproto_process_connect(struct iproto_request *request) return; } catch (Exception *e) { e->log(); - assert(con->session == NULL); iproto_connection_close(con); return; } diff --git a/src/box/key_def.h b/src/box/key_def.h index 63e7251357d554a31174d1ddac1f06980d7d7769..91982d07ebcab8b110a859185a36b8a590752189 100644 --- a/src/box/key_def.h +++ b/src/box/key_def.h @@ -291,11 +291,28 @@ struct access { uint8_t effective; }; +/** + * Effective session user. A cache of user data + * and access stored in session and fiber local storage. + * Differs from the authenticated user when executing + * setuid functions. + */ +struct current_user { + /** A look up key to quickly find session user. */ + uint8_t auth_token; + /** + * Cached global grants, to avoid an extra look up + * when checking global grants. + */ + uint8_t universal_access; + /** User id of the authenticated user. */ + uint32_t uid; +}; + /** * Definition of a function. Function body is not stored * or replicated (yet). */ - struct func_def { /** Function id. */ uint32_t fid; @@ -303,14 +320,14 @@ struct func_def { uint32_t uid; /** * True if the function requires change of user id before - * invocaction. + * invocation. */ bool setuid; /** * Authentication id of the owner of the function, * used for set-user-id functions. */ - uint8_t auth_token; + struct current_user setuid_user; /** Function name. 
*/ char name[BOX_NAME_MAX + 1]; /** diff --git a/src/box/lua/call.cc b/src/box/lua/call.cc index e1683d7557f90028329a2f7398f75237f8b2dd1a..0c233653e3858a1d89b95d6107619837f53226f8 100644 --- a/src/box/lua/call.cc +++ b/src/box/lua/call.cc @@ -48,8 +48,11 @@ #include "box/request.h" #include "box/engine.h" #include "box/txn.h" -#include "box/access.h" +#include "box/user_def.h" +#include "box/user_cache.h" #include "box/schema.h" +#include "box/session.h" +#include "box/iproto_constants.h" /* contents of box.lua, misc.lua, box.net.lua respectively */ extern char session_lua[], @@ -469,21 +472,17 @@ struct SetuidGuard { /** True if the function was set-user-id one. */ bool setuid; - /** Original authentication token, only set if setuid = true. */ - uint8_t orig_auth_token; - /** Original user id, only set if setuid = true. */ - uint32_t orig_uid; + struct current_user *orig_user; inline SetuidGuard(const char *name, uint32_t name_len, - struct user *user, uint8_t access); + uint8_t access); inline ~SetuidGuard(); }; SetuidGuard::SetuidGuard(const char *name, uint32_t name_len, - struct user *user, uint8_t access) + uint8_t access) :setuid(false) - ,orig_auth_token(GUEST) /* silence gnu warning */ - ,orig_uid(GUEST) + ,orig_user(current_user()) { /* @@ -491,9 +490,9 @@ SetuidGuard::SetuidGuard(const char *name, uint32_t name_len, * No special check for ADMIN user is necessary * since ADMIN has universal access. 
*/ - if (user->universal_access.effective & PRIV_ALL) + if (orig_user->universal_access & PRIV_ALL) return; - access &= ~user->universal_access.effective; + access &= ~orig_user->universal_access; /* * We need to look up the function by name even if * the user has access to it, since it could require @@ -509,35 +508,37 @@ SetuidGuard::SetuidGuard(const char *name, uint32_t name_len, */ return; } - if (func == NULL || (func->uid != user->uid && - access & ~func->access[user->auth_token].effective)) { + if (func == NULL || (func->uid != orig_user->uid && + access & ~func->access[orig_user->auth_token].effective)) { /* Access violation, report error. */ char name_buf[BOX_NAME_MAX + 1]; snprintf(name_buf, sizeof(name_buf), "%.*s", name_len, name); + struct user_def *def = user_cache_find(orig_user->uid); tnt_raise(ClientError, ER_FUNCTION_ACCESS_DENIED, - priv_name(access), user->name, name_buf); + priv_name(access), def->name, name_buf); } if (func->setuid) { /** Remember and change the current user id. */ - if (unlikely(func->auth_token >= BOX_USER_MAX)) { - /* Optimization: cache auth_token on first access */ - struct user *owner = user_cache_find(func->uid); - assert(owner != NULL); /* checked by user_has_data() */ - func->auth_token = owner->auth_token; - assert(owner->auth_token < BOX_USER_MAX); + if (unlikely(func->setuid_user.auth_token >= BOX_USER_MAX)) { + /* + * Fill the cache upon first access, since + * when func_def is created, no user may + * be around to fill it (recovery of + * system spaces from a snapshot). 
+ */ + struct user_def *owner = user_cache_find(func->uid); + current_user_init(&func->setuid_user, owner); } setuid = true; - orig_auth_token = user->auth_token; - orig_uid = user->uid; - session_set_user(session(), func->auth_token, func->uid); + fiber_set_user(fiber(), &func->setuid_user); } } SetuidGuard::~SetuidGuard() { if (setuid) - session_set_user(session(), orig_auth_token, orig_uid); + fiber_set_user(fiber(), orig_user); } /** @@ -547,7 +548,6 @@ SetuidGuard::~SetuidGuard() void box_lua_call(struct request *request, struct port *port) { - struct user *user = user(); lua_State *L = lua_newthread(tarantool_L); LuarefGuard coro_ref(tarantool_L); const char *name = request->key; @@ -562,7 +562,7 @@ box_lua_call(struct request *request, struct port *port) * https://github.com/tarantool/tarantool/issues/300 * - if a function does not exist, say it first. */ - SetuidGuard setuid(name, name_len, user, PRIV_X); + SetuidGuard setuid(name, name_len, PRIV_X); /* Push the rest of args (a tuple). 
*/ const char *args = request->tuple; uint32_t arg_count = mp_decode_array(&args); diff --git a/src/box/lua/index.cc b/src/box/lua/index.cc index 4f5424312baf917db8b2c5df48790ee727344315..9293cf8ae52fef4df1539bd28096a7d1b15cdb2b 100644 --- a/src/box/lua/index.cc +++ b/src/box/lua/index.cc @@ -31,7 +31,7 @@ #include "box/index.h" #include "box/space.h" #include "box/schema.h" -#include "box/access.h" +#include "box/user_def.h" #include "box/lua/tuple.h" #include "fiber.h" #include "tbuf.h" diff --git a/src/box/lua/session.cc b/src/box/lua/session.cc index f5781785022cf1adbb9c71bb06b1610be66715f8..c0ac116d54a6bd4861f59fa64721f127b307ebae 100644 --- a/src/box/lua/session.cc +++ b/src/box/lua/session.cc @@ -29,7 +29,8 @@ #include "session.h" #include "lua/utils.h" #include "lua/trigger.h" -#include "box/access.h" +#include "box/user_cache.h" +#include "box/user_def.h" extern "C" { #include <lua.h> @@ -55,23 +56,31 @@ static const char *sessionlib_name = "box.session"; static int lbox_session_id(struct lua_State *L) { - lua_pushnumber(L, session()->id); + lua_pushnumber(L, current_session()->id); return 1; } -/** Session user id. */ +/** + * Session user id. + * Note: effective user id (current_user()->uid) + * may be different in a setuid function. + */ static int lbox_session_uid(struct lua_State *L) { - lua_pushnumber(L, session()->uid); + lua_pushnumber(L, current_session()->user.uid); return 1; } -/** Session user id. */ +/** + * Session user name. + * Note: effective user name may be different in + * a setuid function. 
+ */ static int lbox_session_user(struct lua_State *L) { - struct user *user = user_cache_find(session()->uid); + struct user_def *user = user_by_id(current_session()->user.uid); if (user) lua_pushstring(L, user->name); else @@ -85,25 +94,18 @@ lbox_session_su(struct lua_State *L) { if (lua_gettop(L) != 1) luaL_error(L, "session.su(): bad arguments"); - struct session *session = session(); + struct session *session = current_session(); if (session == NULL) luaL_error(L, "session.su(): session does not exit"); - struct user *user; + struct user_def *user; if (lua_type(L, 1) == LUA_TSTRING) { size_t len; const char *name = lua_tolstring(L, 1, &len); - user = user_by_name(name, len); - if (user == NULL) - tnt_raise(ClientError, ER_NO_SUCH_USER, name); + user = user_cache_find_by_name(name, len); } else { - uint32_t uid = lua_tointeger(L, 1);; - user = user_cache_find(uid); - if (user == NULL) { - tnt_raise(ClientError, ER_NO_SUCH_USER, - int2str(uid)); - } + user = user_cache_find(lua_tointeger(L, 1)); } - session_set_user(session, user->auth_token, user->uid); + current_user_init(&session->user, user); return 0; } @@ -149,15 +151,14 @@ lbox_session_peer(struct lua_State *L) luaL_error(L, "session.peer(sid): bad arguments"); int fd; - if (lua_gettop(L) == 1) { - struct session *session = session_find(luaL_checkint(L, 1)); - if (session == NULL) - luaL_error(L, "session.peer(): session does not exit"); - fd = session->fd; - } else { - fd = session()->fd; - } - + struct session *session; + if (lua_gettop(L) == 1) + session = session_find(luaL_checkint(L, 1)); + else + session = current_session(); + if (session == NULL) + luaL_error(L, "session.peer(): session does not exit"); + fd = session->fd; if (fd < 0) { lua_pushnil(L); /* no associated peer */ return 1; diff --git a/src/box/lua/slab.cc b/src/box/lua/slab.cc index 400c53d95784a745b1ed4c9def4e432bb9e83a4c..6babee055a0bcce87c0e83f9c74cf21431fdaca2 100644 --- a/src/box/lua/slab.cc +++ b/src/box/lua/slab.cc @@ -37,6 
+37,7 @@ extern "C" { #include "box/tuple.h" #include "small/small.h" +#include "small/quota.h" #include "memory.h" /** A callback passed into salloc_stat() and invoked for every slab class. */ @@ -130,7 +131,7 @@ lbox_runtime_info(struct lua_State *L) lua_settable(L, -3); lua_pushstring(L, "maxalloc"); - luaL_pushnumber64(L, runtime.maxalloc); + luaL_pushnumber64(L, quota_get(runtime.quota)); lua_settable(L, -3); return 1; diff --git a/src/box/recovery.cc b/src/box/recovery.cc index e39be264d3d208425eacaa356b4b01ce4263a7cc..cbfa110c2fbdac76f2c99454d1e66819b4b29fbe 100644 --- a/src/box/recovery.cc +++ b/src/box/recovery.cc @@ -532,7 +532,6 @@ recovery_finalize(struct recovery_state *r) * locally or send to the replica. */ struct wal_watcher { - struct session *session; /** * Rescan the WAL directory in search for new WAL files * every wal_dir_rescan_delay seconds. @@ -574,10 +573,15 @@ recovery_rescan_dir(ev_loop * loop, ev_timer *w, int /* revents */) struct wal_watcher *watcher = r->watcher; struct log_io *save_current_wal = r->current_wal; - /** To process transactions, we need a working session. */ - fiber_set_session(fiber(), r->watcher->session); + /** + * local hot standby is running from an ev + * watcher, without fiber infrastructure (todo: fix), + * but to run queries we need at least a current + * user. 
+ */ + fiber_set_user(fiber(), &admin_user); int result = recover_remaining_wals(r); - fiber_set_session(fiber(), NULL); + fiber_set_user(fiber(), NULL); if (result < 0) panic("recover failed: %i", result); if (save_current_wal != r->current_wal) { @@ -593,9 +597,9 @@ recovery_rescan_file(ev_loop * loop, ev_stat *w, int /* revents */) { struct recovery_state *r = (struct recovery_state *) w->data; struct wal_watcher *watcher = r->watcher; - fiber_set_session(fiber(), r->watcher->session); + fiber_set_user(fiber(), &admin_user); int result = recover_wal(r, r->current_wal); - fiber_set_session(fiber(), NULL); + fiber_set_user(fiber(), NULL); if (result < 0) panic("recover failed"); if (result == LOG_EOF) { @@ -616,8 +620,6 @@ recovery_follow_local(struct recovery_state *r, ev_tstamp wal_dir_rescan_delay) struct wal_watcher *watcher = r->watcher= &wal_watcher; - r->watcher->session = session_create(-1, 0); - ev_timer_init(&watcher->dir_timer, recovery_rescan_dir, wal_dir_rescan_delay, wal_dir_rescan_delay); watcher->dir_timer.data = watcher->stat.data = r; @@ -638,8 +640,6 @@ recovery_stop_local(struct recovery_state *r) ev_timer_stop(loop(), &watcher->dir_timer); if (ev_is_active(&watcher->stat)) ev_stat_stop(loop(), &watcher->stat); - session_destroy(watcher->session); - watcher->session = NULL; r->watcher = NULL; } diff --git a/src/box/recovery.h b/src/box/recovery.h index e982d8291bcc8b1801cdac8e48a8cb4bb2a32f79..4f1628064a9fa1a17d08219f8f52b03ecbc82f49 100644 --- a/src/box/recovery.h +++ b/src/box/recovery.h @@ -29,14 +29,14 @@ * SUCH DAMAGE. */ #include <stdbool.h> +#include <netinet/in.h> #include "trivia/util.h" #include "third_party/tarantool_ev.h" #include "log_io.h" #include "vclock.h" #include "tt_uuid.h" -#include "replica.h" -#include "small/region.h" +#include "uri.h" #if defined(__cplusplus) extern "C" { @@ -62,16 +62,51 @@ enum wal_mode { WAL_NONE = 0, WAL_WRITE, WAL_FSYNC, WAL_MODE_MAX }; /** String constants for the supported modes. 
*/ extern const char *wal_mode_STRS[]; +/** State of a replication relay. */ +struct relay { + /** Replica connection */ + int sock; + /* Request type - SUBSCRIBE or JOIN */ + uint32_t type; + /* Request sync */ + uint64_t sync; + /* Only used in SUBSCRIBE request */ + uint32_t server_id; + struct vclock vclock; +}; + +enum { REMOTE_SOURCE_MAXLEN = 1024 }; /* enough to fit URI with passwords */ + +/** State of a replication connection to the master */ +struct remote { + struct fiber *reader; + ev_tstamp recovery_lag, recovery_last_update_tstamp; + bool warning_said; + char source[REMOTE_SOURCE_MAXLEN]; + struct uri uri; + union { + struct sockaddr addr; + struct sockaddr_storage addrstorage; + }; + socklen_t addr_len; +}; + struct recovery_state { struct vclock vclock; - /* The WAL we're currently reading/writing from/to. */ + /** The WAL we're currently reading/writing from/to. */ struct log_io *current_wal; struct log_dir snap_dir; struct log_dir wal_dir; - int64_t signature; /* used to find missing xlog files */ + /** Used to find missing xlog files */ + int64_t signature; struct wal_writer *writer; struct wal_watcher *watcher; - struct remote remote; + union { + /** slave->master state */ + struct remote remote; + /** master->slave state */ + struct relay relay; + }; /** * row_handler is a module callback invoked during initial * recovery and when reading rows from the master. It is diff --git a/src/box/replica.cc b/src/box/replica.cc index deff8eafbd89714840e344855d6e5b671eda392e..e26c04d357d63556449ac7c70ca7e09a633a7338 100644 --- a/src/box/replica.cc +++ b/src/box/replica.cc @@ -40,7 +40,6 @@ #include "recovery.h" #include "xrow.h" #include "msgpuck/msgpuck.h" -#include "session.h" #include "box/cluster.h" #include "iproto_constants.h" @@ -223,8 +222,6 @@ pull_from_remote(va_list ap) struct ev_io coio; struct iobuf *iobuf = NULL; ev_loop *loop = loop(); - /** This fiber executes transactions. 
*/ - SessionGuard session_guard(-1, 0); coio_init(&coio); diff --git a/src/box/replica.h b/src/box/replica.h index 08832263537206c0c8cf26bf80de1279a8cd7d16..2832e89c6faeedf6dc4eceefe0c5cd8dcba45d7b 100644 --- a/src/box/replica.h +++ b/src/box/replica.h @@ -28,26 +28,8 @@ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ -#include <netinet/in.h> -#include "tarantool_ev.h" -#include <uri.h> - -enum { REMOTE_SOURCE_MAXLEN = 1024 }; /* enough to fit URI with passwords */ - -/** Master connection */ -struct remote { - struct fiber *reader; - ev_tstamp recovery_lag, recovery_last_update_tstamp; - bool warning_said; - char source[REMOTE_SOURCE_MAXLEN]; - struct uri uri; - union { - struct sockaddr addr; - struct sockaddr_storage addrstorage; - }; - socklen_t addr_len; -}; +struct recovery_state; /** Connect to a master and request a snapshot. * Raises an exception on error. * diff --git a/src/box/replication.cc b/src/box/replication.cc index d37940c43ed7b9a44cbf5390bb7ce5dd0726a827..4638df1f358a7235ffa31fe0d3855a5292cef15e 100644 --- a/src/box/replication.cc +++ b/src/box/replication.cc @@ -40,6 +40,7 @@ #include <limits.h> #include <fcntl.h> +#include "tarantool.h" #include "fiber.h" #include "recovery.h" #include "log_io.h" @@ -48,6 +49,7 @@ #include "box/cluster.h" #include "box/schema.h" #include "box/vclock.h" +#include "scoped_guard.h" /** Replication topology * ---------------------- @@ -78,29 +80,15 @@ static int master_to_spawner_socket; static char cfg_wal_dir[PATH_MAX]; static char cfg_snap_dir[PATH_MAX]; - -/** - * State of a replica. We only need one global instance - * since we fork() for every replica. - */ -struct relay_data { - /** Replica connection */ - int sock; - /* Request type - SUBSCRIBE or JOIN */ - uint32_t type; - /* Request sync */ - uint64_t sync; - /* Only used in SUBSCRIBE request */ - uint32_t server_id; - struct vclock vclock; -} relay; - /** Send a file descriptor to replication relay spawner. 
* * Invoked when spawner's end of the socketpair becomes ready. */ static void replication_send_socket(ev_loop *loop, ev_io *watcher, int /* events */); +static void +replication_relay_send_row(struct recovery_state *r, void * /* param */, + struct xrow_header *packet); /** Replication spawner process */ static struct spawner { @@ -140,7 +128,7 @@ spawner_sigchld_handler(int signal __attribute__((unused))); * @return 0 on success, -1 on error */ static int -spawner_create_replication_relay(); +spawner_create_replication_relay(struct relay *relay); /** Shut down all relays when shutting down the spawner. */ static void @@ -148,7 +136,7 @@ spawner_shutdown_children(); /** Initialize replication relay process. */ static void -replication_relay_loop(); +replication_relay_loop(struct relay *relay); /* * ------------------------------------------------------------------------ @@ -207,27 +195,63 @@ replication_prefork(const char *snap_dir, const char *wal_dir) struct replication_request { struct ev_io io; int fd; - struct relay_data data; + struct relay data; }; /** Replication acceptor fiber handler. */ +static void * +replication_join_thread(void *arg) +{ + struct recovery_state *r = (struct recovery_state *) arg; + + /* Turn off the non-blocking mode, if any. 
*/ + int nonblock = sio_getfl(r->relay.sock) & O_NONBLOCK; + sio_setfl(r->relay.sock, O_NONBLOCK, 0); + auto socket_guard = make_scoped_guard([=]{ + /* Restore non-blocking mode */ + sio_setfl(r->relay.sock, O_NONBLOCK, nonblock); + }); + + /* Send snapshot */ + recover_snap(r); + + /* Send response to JOIN command = end of stream */ + struct xrow_header row; + xrow_encode_vclock(&row, &r->vclock); + row.sync = r->relay.sync; + struct iovec iov[XROW_IOVMAX]; + int iovcnt = xrow_to_iovec(&row, iov); + sio_writev_all(r->relay.sock, iov, iovcnt); + + say_info("snapshot sent"); + return NULL; +} + void replication_join(int fd, struct xrow_header *packet) { - struct replication_request *request = (struct replication_request *) - malloc(sizeof(*request)); - if (request == NULL) { - tnt_raise(ClientError, ER_MEMORY_ISSUE, sizeof(*request), - "iproto", "JOIN"); - } - request->fd = fd; - request->io.data = request; - request->data.type = packet->type; - request->data.sync = packet->sync; + struct recovery_state *r; + r = recovery_new(cfg_snap_dir, cfg_wal_dir, + replication_relay_send_row, + NULL, NULL, INT32_MAX); + auto recovery_guard = make_scoped_guard([&]{ + recovery_delete(r); + }); + r->relay.sock = fd; + r->relay.sync = packet->sync; + r->relay.server_id = packet->server_id; + r->relay.type = IPROTO_JOIN; + + char name[FIBER_NAME_MAX]; + struct sockaddr_storage peer; + socklen_t addrlen = sizeof(peer); + getpeername(r->relay.sock, ((struct sockaddr*)&peer), &addrlen); + snprintf(name, sizeof(name), "relay/%s", + sio_strfaddr((struct sockaddr *)&peer, addrlen)); - ev_io_init(&request->io, replication_send_socket, - master_to_spawner_socket, EV_WRITE); - ev_io_start(loop(), &request->io); + struct cord cord; + cord_start(&cord, name, replication_join_thread, r); + cord_cojoin(&cord); } /** Replication acceptor fiber handler. 
*/ @@ -435,6 +459,7 @@ spawner_main_loop() break; } + struct relay relay; int sock = spawner_unpack_cmsg(&msg); msglen = read(spawner.sock, &relay, len); relay.sock = sock; @@ -448,7 +473,8 @@ spawner_main_loop() /* continue, the error may be temporary */ break; } - spawner_create_replication_relay(); + assert(msglen == sizeof(relay)); + spawner_create_replication_relay(&relay); } spawner_shutdown(); } @@ -507,7 +533,7 @@ spawner_sigchld_handler(int signo __attribute__((unused))) /** Create replication client handler process. */ static int -spawner_create_replication_relay() +spawner_create_replication_relay(struct relay *relay) { /* flush buffers to avoid multiple output */ /* https://github.com/tarantool/tarantool/issues/366 */ @@ -524,10 +550,10 @@ spawner_create_replication_relay() ev_loop_fork(loop()); ev_run(loop(), EVRUN_NOWAIT); close(spawner.sock); - replication_relay_loop(); + replication_relay_loop(relay); } else { spawner.child_count++; - close(relay.sock); + close(relay->sock); say_info("created a replication relay: pid = %d", (int) pid); } @@ -635,10 +661,10 @@ replication_relay_send_row(struct recovery_state *r, void * /* param */, /* Don't duplicate data */ if (packet->server_id == 0 || packet->server_id != r->server_id) { - packet->sync = relay.sync; + packet->sync = r->relay.sync; struct iovec iov[XROW_IOVMAX]; int iovcnt = xrow_to_iovec(packet, iov); - sio_writev_all(relay.sock, iov, iovcnt); + sio_writev_all(r->relay.sock, iov, iovcnt); } /* @@ -649,33 +675,13 @@ replication_relay_send_row(struct recovery_state *r, void * /* param */, vclock_follow(&r->vclock, packet->server_id, packet->lsn); } -static void -replication_relay_join(struct recovery_state *r) -{ - FDGuard guard_replica(relay.sock); - - /* Send snapshot */ - recover_snap(r); - - /* Send response to JOIN command = end of stream */ - struct xrow_header row; - xrow_encode_vclock(&row, &r->vclock); - row.sync = relay.sync; - struct iovec iov[XROW_IOVMAX]; - int iovcnt = 
xrow_to_iovec(&row, iov); - sio_writev_all(relay.sock, iov, iovcnt); - - say_info("snapshot sent"); - /* relay.sock closed by guard */ -} - static void replication_relay_subscribe(struct recovery_state *r) { /* Set LSNs */ - vclock_copy(&r->vclock, &relay.vclock); + vclock_copy(&r->vclock, &r->relay.vclock); /* Set server_id */ - r->server_id = relay.server_id; + r->server_id = r->relay.server_id; recovery_follow_local(r, 0.1); ev_run(loop(), 0); @@ -685,7 +691,7 @@ replication_relay_subscribe(struct recovery_state *r) /** The main loop of replication client service process. */ static void -replication_relay_loop() +replication_relay_loop(struct relay *relay) { struct sigaction sa; @@ -695,7 +701,7 @@ replication_relay_loop() */ struct sockaddr_storage peer; socklen_t addrlen = sizeof(peer); - getpeername(relay.sock, ((struct sockaddr*)&peer), &addrlen); + getpeername(relay->sock, ((struct sockaddr*)&peer), &addrlen); title("relay", "%s", sio_strfaddr((struct sockaddr *)&peer, addrlen)); fiber_set_name(fiber(), status); @@ -733,29 +739,24 @@ replication_relay_loop() * relay. */ struct ev_io sock_read_ev; - sock_read_ev.data = (void *)(intptr_t) relay.sock; + sock_read_ev.data = (void *)(intptr_t) relay->sock; ev_io_init(&sock_read_ev, replication_relay_recv, - relay.sock, EV_READ); + relay->sock, EV_READ); ev_io_start(loop(), &sock_read_ev); /** Turn off the non-blocking mode,if any. 
*/ - sio_setfl(relay.sock, O_NONBLOCK, 0); + sio_setfl(relay->sock, O_NONBLOCK, 0); /* Initialize the recovery process */ - struct recovery_state *r = recovery_new(cfg_snap_dir, cfg_wal_dir, - replication_relay_send_row, - NULL, NULL, INT32_MAX); + struct recovery_state *r; + r = recovery_new(cfg_snap_dir, cfg_wal_dir, + replication_relay_send_row, + NULL, NULL, INT32_MAX); + r->relay = *relay; /* copy relay state to recovery */ + int rc = EXIT_SUCCESS; try { - switch (relay.type) { - case IPROTO_JOIN: - replication_relay_join(r); - break; - case IPROTO_SUBSCRIBE: - replication_relay_subscribe(r); - break; - default: - assert(false); - } + assert(r->relay.type == IPROTO_SUBSCRIBE); + replication_relay_subscribe(r); } catch (Exception *e) { say_error("relay error: %s", e->errmsg()); rc = EXIT_FAILURE; diff --git a/src/box/replication.h b/src/box/replication.h index c100196ddc43ca07e2b3b4aea6e632cca5e3eef0..3798b2a0335390b067293ab7ecc4c83a2e16c88b 100644 --- a/src/box/replication.h +++ b/src/box/replication.h @@ -28,8 +28,7 @@ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ -#include <tarantool.h> -#include "trivia/util.h" +struct xrow_header; /** * Pre-fork replication spawner process. diff --git a/src/box/request.cc b/src/box/request.cc index 008a848cc6c9cd6c00768ec63ea238ab406516b6..e3289d6d58cf7f49d22cf55c8b6723aa1fbb0be7 100644 --- a/src/box/request.cc +++ b/src/box/request.cc @@ -40,7 +40,8 @@ #include <scoped_guard.h> #include <third_party/base64.h> #include "authentication.h" -#include "access.h" +#include "user_def.h" +#include "iproto_constants.h" enum dup_replace_mode dup_replace_mode(uint32_t op) diff --git a/src/box/schema.cc b/src/box/schema.cc index fb9a9144de646400cf6e07d8b1d52b736f11b198..6ce440d4290dc1162a92448a0ed9dc300dd0c623 100644 --- a/src/box/schema.cc +++ b/src/box/schema.cc @@ -27,7 +27,7 @@ * SUCH DAMAGE. 
*/ #include "schema.h" -#include "access.h" +#include "user_def.h" #include "engine.h" #include "space.h" #include "tuple.h" diff --git a/src/box/schema.h b/src/box/schema.h index 69b63f66b9464b5f0b245af079b0cc3b4c20137b..cbd77926c2fe8d50f7602a04b1cc873b2d48c94d 100644 --- a/src/box/schema.h +++ b/src/box/schema.h @@ -81,10 +81,7 @@ space_cache_find(uint32_t id) struct space *space = space_by_id(id); if (space) return space; - - char name[12]; - snprintf(name, sizeof(name), "#%u", id); - tnt_raise(ClientError, ER_NO_SUCH_SPACE, name); + tnt_raise(ClientError, ER_NO_SUCH_SPACE, int2str(id)); } /** diff --git a/src/box/session.cc b/src/box/session.cc index c892a411f9d3c69686b5cfafd346b4c1a4602fdc..56f2324f8c1511312ac80f7ed49fb1a04a7ec378 100644 --- a/src/box/session.cc +++ b/src/box/session.cc @@ -35,6 +35,7 @@ #include "exception.h" #include "random.h" #include <sys/socket.h> +#include "user_cache.h" static struct mh_i32ptr_t *session_registry; @@ -54,17 +55,18 @@ sid_max() } static void -session_on_stop(struct trigger * trigger, void *event) +session_on_stop(struct trigger *trigger, void * /* event */) { - (void) event; - /* Remove on_stop trigger from fiber */ + /* + * Remove on_stop trigger from the fiber, otherwise the + * fiber will attempt to destroy the trigger eventually, + * after the trigger and its memory is long gone. 
+ */ trigger_clear(trigger); - struct session *session = fiber_get_session(fiber()); - if (session == NULL) - return; - /* Destroy session */ + struct session *session = (struct session *) + fiber_get_key(fiber(), FIBER_KEY_SESSION); + /* Destroy the session */ session_destroy(session); - fiber_set_session(fiber(), NULL); } struct session * @@ -75,11 +77,10 @@ session_create(int fd, uint64_t cookie) session->id = sid_max(); session->fd = fd; session->cookie = cookie; - session->fiber_on_stop = { - rlist_nil, session_on_stop, NULL, NULL - }; - session_set_user(session, ADMIN, ADMIN); - random_bytes(session->salt, SESSION_SEED_SIZE); + /* For on_connect triggers. */ + current_user_init(&session->user, user_by_token(GUEST)); + if (fd >= 0) + random_bytes(session->salt, SESSION_SEED_SIZE); struct mh_i32ptr_node_t node; node.key = session->id; node.val = session; @@ -94,11 +95,35 @@ session_create(int fd, uint64_t cookie) return session; } +struct session * +session_create_on_demand() +{ + /* Create session on demand */ + struct session *s = session_create(-1, 0); + s->fiber_on_stop = { + rlist_nil, session_on_stop, NULL, NULL + }; + /* Add a trigger to destroy session on fiber stop */ + trigger_add(&fiber()->on_stop, &s->fiber_on_stop); + current_user_init(&s->user, user_by_token(ADMIN)); + fiber_set_session(fiber(), s); + fiber_set_user(fiber(), &s->user); + return s; +} + +/** + * To quickly switch to admin user when executing + * on_connect/on_disconnect triggers in iproto. + */ +struct current_user admin_user; + void session_run_on_disconnect_triggers(struct session *session) { + struct fiber *fiber = fiber(); /* For triggers. 
*/ - session_set_user(session, ADMIN, ADMIN); + fiber_set_session(fiber, session); + fiber_set_user(fiber, &admin_user); try { trigger_run(&session_on_disconnect, NULL); } catch (Exception *e) { @@ -112,8 +137,11 @@ session_run_on_disconnect_triggers(struct session *session) void session_run_on_connect_triggers(struct session *session) { - (void) session; + struct fiber *fiber = fiber(); + fiber_set_session(fiber, session); + fiber_set_user(fiber, &admin_user); trigger_run(&session_on_connect, NULL); + /* Set session user to guest, until it is authenticated. */ } void @@ -141,6 +169,7 @@ session_init() if (session_registry == NULL) panic("out of memory"); mempool_create(&session_pool, &cord()->slabc, sizeof(struct session)); + current_user_init(&admin_user, user_by_token(ADMIN)); } void @@ -149,27 +178,3 @@ session_free() if (session_registry) mh_i32ptr_delete(session_registry); } - -SessionGuard::SessionGuard(int fd, uint64_t cookie) -{ - session = session_create(fd, cookie); - fiber_set_session(fiber(), session); -} - -SessionGuard::~SessionGuard() -{ - assert(session == fiber_get_session(fiber())); - session_destroy(session); - fiber_set_session(fiber(), NULL); -} - -SessionGuardWithTriggers::SessionGuardWithTriggers(int fd, uint64_t cookie) - :SessionGuard(fd, cookie) -{ - session_run_on_connect_triggers(session); -} - -SessionGuardWithTriggers::~SessionGuardWithTriggers() -{ - session_run_on_disconnect_triggers(session); -} diff --git a/src/box/session.h b/src/box/session.h index a1d2157ec482404cee0b4e15f1daec6c940fa053..ff0c1a16361f1edd8a6c959e994b2f6714f3c9ed 100644 --- a/src/box/session.h +++ b/src/box/session.h @@ -32,20 +32,19 @@ #include <stdbool.h> #include "trigger.h" #include "fiber.h" +#include "user_def.h" enum { SESSION_SEED_SIZE = 32, SESSION_DELIM_SIZE = 16 }; -/** Predefined user ids. 
*/ -enum { GUEST = 0, ADMIN = 1, PUBLIC = 2 /* role */ }; /** * Abstraction of a single user session: * for now, only provides accounting of established * sessions and on-connect/on-disconnect event - * handling, in future: user credentials, protocol, etc. + * handling, user credentials. In future: the + * client/server protocol, etc. * Session identifiers grow monotonically. * 0 sid is reserved to mean 'no session'. */ - struct session { /** Session id. */ uint32_t id; @@ -55,10 +54,8 @@ struct session { uint64_t cookie; /** Authentication salt. */ char salt[SESSION_SEED_SIZE]; - /** A look up key to quickly find session user. */ - uint8_t auth_token; - /** User id of the authenticated user. */ - uint32_t uid; + /** Cached user id and global grants */ + struct current_user user; /** Trigger for fiber on_stop to cleanup created on-demand session */ struct trigger fiber_on_stop; }; @@ -95,14 +92,6 @@ session_destroy(struct session *); struct session * session_find(uint32_t sid); -/** Set session auth token and user id. */ -static inline void -session_set_user(struct session *session, uint8_t auth_token, uint32_t uid) -{ - session->auth_token = auth_token; - session->uid = uid; -} - /** Global on-connect triggers. 
*/ extern struct rlist session_on_connect; @@ -126,10 +115,10 @@ session_free(); void session_storage_cleanup(int sid); -static inline struct session * -fiber_get_session(struct fiber *fiber) +static inline void +fiber_set_user(struct fiber *fiber, struct current_user *user) { - return (struct session *) fiber_get_key(fiber, FIBER_KEY_SESSION); + fiber_set_key(fiber, FIBER_KEY_USER, user); } static inline void @@ -138,29 +127,56 @@ fiber_set_session(struct fiber *fiber, struct session *session) fiber_set_key(fiber, FIBER_KEY_SESSION, session); } -#define session() ({\ - struct session *s = fiber_get_session(fiber()); \ - /* Create session on demand */ \ - if (s == NULL) { \ - s = session_create(-1, 0); \ - fiber_set_session(fiber(), s); \ - /* Add a trigger to destroy session on fiber stop */ \ - trigger_add(&fiber()->on_stop, &s->fiber_on_stop); \ - } \ - s; }) - -/** A helper class to create and set session in single-session fibers. */ -struct SessionGuard +/** + * Create a new session on demand, and set fiber on_stop + * trigger to destroy it when this fiber ends. + */ +struct session * +session_create_on_demand(); + +/* + * For use in local hot standby, which runs directly + * from ev watchers (without current fiber), but needs + * to execute transactions. + */ +extern struct current_user admin_user; + +/* + * When creating a new fiber, the database (box) + * may not be initialized yet. When later on + * this fiber attempts to access the database, + * we have no other choice but initialize fiber-specific + * database state (something like a database connection) + * on demand. This is why this function needs to + * check whether or not the current session exists + * and create it otherwise. + */ +static inline struct session * +current_session() { - struct session *session; - SessionGuard(int fd, uint64_t cookie); - ~SessionGuard(); -}; + struct session *s = (struct session *) + fiber_get_key(fiber(), FIBER_KEY_SESSION); + return s ? 
s : session_create_on_demand(); +} -struct SessionGuardWithTriggers: public SessionGuard +/* + * Return the current user. Create it if it doesn't + * exist yet. + * The same rationale for initializing the current + * user on demand as in current_session() applies. + */ +static inline struct current_user * +current_user() { - SessionGuardWithTriggers(int fd, uint64_t cookie); - ~SessionGuardWithTriggers(); -}; + struct current_user *u = + (struct current_user *) fiber_get_key(fiber(), + FIBER_KEY_USER); + if (u == NULL) { + session_create_on_demand(); + u = (struct current_user *) fiber_get_key(fiber(), + FIBER_KEY_USER); + } + return u; +} #endif /* INCLUDES_TARANTOOL_SESSION_H */ diff --git a/src/box/space.cc b/src/box/space.cc index 7947cd3b0b5f5c9e2eac4ee7ab36e5ff4f5e152a..0713814cdaa2fc40ddd2b41b4db6f80b8c0b330d 100644 --- a/src/box/space.cc +++ b/src/box/space.cc @@ -33,12 +33,14 @@ #include "tuple.h" #include "scoped_guard.h" #include "trigger.h" -#include "access.h" +#include "user_def.h" +#include "user_cache.h" +#include "session.h" void access_check_space(struct space *space, uint8_t access) { - struct user *user = user(); + struct current_user *user = current_user(); /* * If a user has a global permission, clear the respective * privilege from the list of privileges required @@ -46,11 +48,12 @@ access_check_space(struct space *space, uint8_t access) * No special check for ADMIN user is necessary * since ADMIN has universal access. 
*/ - access &= ~user->universal_access.effective; + access &= ~user->universal_access; if (access && space->def.uid != user->uid && access & ~space->access[user->auth_token].effective) { + struct user_def *def = user_cache_find(user->uid); tnt_raise(ClientError, ER_SPACE_ACCESS_DENIED, - priv_name(access), user->name, space->def.name); + priv_name(access), def->name, space->def.name); } } diff --git a/src/box/tree_index.cc b/src/box/tree_index.cc index cc45bdf8bad326cf499d32b4e50cbc35ceddc8b0..d877a09e4611e869852e024340f119cec95808f2 100644 --- a/src/box/tree_index.cc +++ b/src/box/tree_index.cc @@ -36,9 +36,12 @@ #include <third_party/qsort_arg.h> /** For all memory used by all tree indexes. */ +extern struct quota memtx_quota; +static struct slab_arena index_arena; +static struct slab_cache index_arena_slab_cache; static struct mempool tree_extent_pool; /** Number of allocated extents. */ -static int tree_extent_pool_initialized = 0; +static bool index_arena_initialized = false; /* {{{ Utilities. 
*************************************************/ @@ -203,10 +206,17 @@ TreeIndex::TreeIndex(struct key_def *key_def_arg) : Index(key_def_arg), build_array(0), build_array_size(0), build_array_alloc_size(0) { - if (tree_extent_pool_initialized == 0) { - mempool_create(&tree_extent_pool, &cord()->slabc, + if (index_arena_initialized == false) { + const uint32_t SLAB_SIZE = 4 * 1024 * 1024; + if (slab_arena_create(&index_arena, &memtx_quota, + 0, SLAB_SIZE, MAP_PRIVATE)) { + panic_syserror("failed to initialize index arena"); + } + slab_cache_create(&index_arena_slab_cache, &index_arena, + SLAB_SIZE); + mempool_create(&tree_extent_pool, &index_arena_slab_cache, BPS_TREE_EXTENT_SIZE); - tree_extent_pool_initialized = 1; + index_arena_initialized = true; } bps_tree_index_create(&tree, key_def, extent_alloc, extent_free); } diff --git a/src/box/tuple.cc b/src/box/tuple.cc index 2df74eea68db664e894596789b35587e7b1d4307..b1b873df4d6df227fdccdc1976a1f9e5d96b2e18 100644 --- a/src/box/tuple.cc +++ b/src/box/tuple.cc @@ -29,6 +29,7 @@ #include "tuple.h" #include "small/small.h" +#include "small/quota.h" #include "tbuf.h" #include "key_def.h" @@ -45,8 +46,10 @@ static uint32_t formats_size, formats_capacity; uint32_t snapshot_version; +struct quota memtx_quota; + struct slab_arena memtx_arena; -struct slab_cache memtx_slab_cache; +static struct slab_cache memtx_slab_cache; struct small_alloc memtx_alloc; /** Extract all available type info from keys. */ @@ -517,15 +520,16 @@ tuple_compare_with_key(const struct tuple *tuple, const char *key, } void -tuple_init(float arena_prealloc, uint32_t objsize_min, +tuple_init(float tuple_arena_max_size, uint32_t objsize_min, float alloc_factor) { tuple_format_ber = tuple_format_new(&rlist_nil); /* Make sure this one stays around. 
*/ tuple_format_ref(tuple_format_ber, 1); - uint32_t slab_size = 4*1024*1024; - size_t prealloc = arena_prealloc * 1024 * 1024 * 1024; + const uint32_t SLAB_SIZE = 4 * 1024 * 1024; + size_t max_size = tuple_arena_max_size * 1024 * 1024 * 1024; + quota_init(&memtx_quota, max_size); int flags; if (access("/proc/user_beancounters", F_OK) == 0) { @@ -534,23 +538,24 @@ tuple_init(float arena_prealloc, uint32_t objsize_min, flags = MAP_PRIVATE; } else { say_info("mapping %zu bytes for a shared arena...", - prealloc); + max_size); flags = MAP_SHARED; } - if (slab_arena_create(&memtx_arena, prealloc, prealloc, - slab_size, flags)) { - if (ENOMEM == errno) + if (slab_arena_create(&memtx_arena, &memtx_quota, + max_size, SLAB_SIZE, flags)) { + if (ENOMEM == errno) { panic("failed to preallocate %zu bytes: " - "Cannot allocate memory, " - "check option 'slab_alloc_arena' in box.cfg(..)", - prealloc); - else + "Cannot allocate memory, check option " + "'slab_alloc_arena' in box.cfg(..)", + max_size); + } else { panic_syserror("failed to preallocate %zu bytes", - prealloc); + max_size); + } } slab_cache_create(&memtx_slab_cache, &memtx_arena, - slab_size); + SLAB_SIZE); small_alloc_create(&memtx_alloc, &memtx_slab_cache, objsize_min, alloc_factor); } diff --git a/src/box/tuple.h b/src/box/tuple.h index 959a75da265aa6dca1af5e4e3d2d34673e74b8a7..5c973d512e063059dc7854c293c4edb8e6a19a2e 100644 --- a/src/box/tuple.h +++ b/src/box/tuple.h @@ -36,8 +36,11 @@ enum { FORMAT_REF_MAX = INT32_MAX, TUPLE_REF_MAX = UINT16_MAX }; struct tbuf; +/** Common quota for tuples and indexes */ +extern struct quota memtx_quota; +/** Tuple allocator */ extern struct small_alloc memtx_alloc; -extern struct slab_cache memtx_slab_cache; +/** Tuple slab arena */ extern struct slab_arena memtx_arena; /** @@ -519,7 +522,7 @@ tuple_to_buf(struct tuple *tuple, char *buf); /** Initialize tuple library */ void -tuple_init(float slab_alloc_arena, uint32_t slab_alloc_minimal, +tuple_init(float 
alloc_arena_max_size, uint32_t slab_alloc_minimal, float alloc_factor); /** Cleanup tuple library */ diff --git a/src/box/access.cc b/src/box/user_cache.cc similarity index 82% rename from src/box/access.cc rename to src/box/user_cache.cc index 1521b5f6e4a8f4557a4df1db829659034f60b8eb..d82555cd3b845eaf75a46b05e4fd171ef5fc7ae9 100644 --- a/src/box/access.cc +++ b/src/box/user_cache.cc @@ -26,11 +26,12 @@ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ -#include "access.h" +#include "user_cache.h" +#include "user_def.h" #include "assoc.h" #include "schema.h" -struct user users[BOX_USER_MAX]; +struct user_def users[BOX_USER_MAX]; /** Bitmap type for used/unused authentication token map. */ typedef unsigned long user_map_t; @@ -69,7 +70,7 @@ user_map_get_slot() void user_map_put_slot(uint8_t auth_token) { - memset(users + auth_token, 0, sizeof(struct user)); + memset(users + auth_token, 0, sizeof(struct user_def)); uint32_t bit_no = auth_token & (sizeof(user_map_t) * CHAR_BIT - 1); auth_token /= sizeof(user_map_t) * CHAR_BIT; user_map[auth_token] |= ((user_map_t) 1) << bit_no; @@ -77,20 +78,10 @@ user_map_put_slot(uint8_t auth_token) user_map_idx = auth_token; } -const char * -priv_name(uint8_t access) -{ - if (access & PRIV_R) - return "Read"; - if (access & PRIV_W) - return "Write"; - return "Execute"; -} - void -user_cache_replace(struct user *user) +user_cache_replace(struct user_def *user) { - struct user *old = user_cache_find(user->uid); + struct user_def *old = user_by_id(user->uid); if (old == NULL) { uint8_t auth_token = user_map_get_slot(); old = users + auth_token; @@ -107,7 +98,7 @@ user_cache_replace(struct user *user) void user_cache_delete(uint32_t uid) { - struct user *old = user_cache_find(uid); + struct user_def *old = user_by_id(uid); if (old) { assert(old->auth_token > ADMIN); user_map_put_slot(old->auth_token); @@ -118,22 +109,37 @@ user_cache_delete(uint32_t uid) } /** Find user by id. 
*/ -struct user * -user_cache_find(uint32_t uid) +struct user_def * +user_by_id(uint32_t uid) { mh_int_t k = mh_i32ptr_find(user_registry, uid, NULL); if (k == mh_end(user_registry)) return NULL; - return (struct user *) mh_i32ptr_node(user_registry, k)->val; + return (struct user_def *) mh_i32ptr_node(user_registry, k)->val; +} + +struct user_def * +user_cache_find(uint32_t uid) +{ + struct user_def *user = user_by_id(uid); + if (user) + return user; + tnt_raise(ClientError, ER_NO_SUCH_USER, int2str(uid)); } /** Find user by name. */ -struct user * -user_by_name(const char *name, uint32_t len) +struct user_def * +user_cache_find_by_name(const char *name, uint32_t len) { uint32_t uid = schema_find_id(SC_USER_ID, 2, name, len); - struct user *user = user_cache_find(uid); - return user && user->type == SC_USER ? user : NULL; + struct user_def *user = user_by_id(uid); + if (user == NULL || user->type != SC_USER) { + char name_buf[BOX_NAME_MAX + 1]; + /* \0 - to correctly print user name the error message. */ + snprintf(name_buf, sizeof(name_buf), "%.*s", len, name); + tnt_raise(ClientError, ER_NO_SUCH_USER, name_buf); + } + return user; } void @@ -149,7 +155,7 @@ user_cache_init() * for 'guest' and 'admin' users here, they will be * updated with snapshot contents during recovery. 
*/ - struct user guest; + struct user_def guest; memset(&guest, 0, sizeof(guest)); snprintf(guest.name, sizeof(guest.name), "guest"); guest.owner = ADMIN; @@ -160,7 +166,7 @@ user_cache_init() guest.uid == GUEST && users[guest.auth_token].uid == guest.uid); - struct user admin; + struct user_def admin; memset(&admin, 0, sizeof(admin)); snprintf(admin.name, sizeof(admin.name), "admin"); admin.uid = admin.owner = ADMIN; diff --git a/src/box/access.h b/src/box/user_cache.h similarity index 51% rename from src/box/access.h rename to src/box/user_cache.h index 385807aa4e37729031181450e945b76952ee21df..d3da115278ce7b9998f10ea751955358b9225c69 100644 --- a/src/box/access.h +++ b/src/box/user_cache.h @@ -1,5 +1,5 @@ -#ifndef INCLUDES_TARANTOOL_BOX_ACCESS_H -#define INCLUDES_TARANTOOL_BOX_ACCESS_H +#ifndef INCLUDES_TARANTOOL_BOX_USER_CACHE_H +#define INCLUDES_TARANTOOL_BOX_USER_CACHE_H /* * Redistribution and use in source and binary forms, with or * without modification, are permitted provided that the following @@ -28,51 +28,9 @@ * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF * SUCH DAMAGE. */ -#include "iproto_constants.h" -#include "key_def.h" -#include "scramble.h" -#include "fiber.h" -#include "session.h" - -enum { - /* SELECT */ - PRIV_R = 1, - /* INSERT, UPDATE, DELETE, REPLACE */ - PRIV_W = 2, - /* CALL */ - PRIV_X = 4, - /** Everything. */ - PRIV_ALL = PRIV_R + PRIV_W + PRIV_X -}; - -/* Privilege name for error messages */ -const char * -priv_name(uint8_t access); - -/** - * A cache entry for an existing user. Entries for all existing - * users are always present in the cache. The entry is maintained - * in sync with _user and _priv system spaces by system space - * triggers. - * @sa alter.cc - */ -struct user { - /** User id. 
*/ - uint32_t uid; - /** Creator of the user */ - uint32_t owner; - /** 'user' or 'role' */ - enum schema_object_type type; - /** User password - hash2 */ - char hash2[SCRAMBLE_SIZE]; - /** User name - for error messages and debugging */ - char name[BOX_NAME_MAX + 1]; - /** Global privileges this user has on the universe. */ - struct access universal_access; - /** An id in users[] array to quickly find user */ - uint8_t auth_token; -}; +#include <stdint.h> +struct user_def; /** * For best performance, all users are maintained in this array. * Position in the array is store in user->auth_token and also @@ -84,7 +42,7 @@ struct user { * is also used to find out user privileges when accessing stored * objects, such as spaces and functions. */ -extern struct user users[]; +extern struct user_def users[]; /* * Insert or update user object (a cache entry @@ -99,7 +57,7 @@ extern struct user users[]; * with an index in the users[] array. */ void -user_cache_replace(struct user *user); +user_cache_replace(struct user_def *user); /** * Find a user by id and delete it from the @@ -109,47 +67,17 @@ void user_cache_delete(uint32_t uid); /** Find user by id. */ -struct user * -user_cache_find(uint32_t uid); +struct user_def * +user_by_id(uint32_t uid); + +#define user_by_token(token) (users + token) /* Find a user by name. Used by authentication. */ -struct user * -user_by_name(const char *name, uint32_t len); +struct user_def * +user_cache_find(uint32_t uid); -/** - * Return the current user. - * - * @todo: this doesn't account for the case when a user - * was dropped, its slot in users array was reused - * for a new user, and some sessions exist which still - * use the old auth_token. In this case, already - * authenticated sessions use grants of the new user, - * not the old one. 
- * - * This can be easily fixed by checking that uid of the - * user found by means of auth_token matches the uid - * stored in the session, and invalidating the session - * auth_token when it doesn't. - * - * Alternatively, one could invalidate the session - * auth_token whenever sc_version changes. Alternatively, - * one could invalidate auth_token in all sessions whenever - * any tuple in _user or _priv spaces is modified. - * - * None of these 3 solutions seems to be worth the while - * at the moment. - */ -#define user() \ -({ \ - struct session *s = session(); \ - struct user *u = &users[s->auth_token]; \ - if (u->auth_token != s->auth_token || \ - u->uid != s->uid) { \ - tnt_raise(ClientError, ER_NO_SUCH_USER, \ - int2str(s->uid)); \ - } \ - u; \ -}) +struct user_def * +user_cache_find_by_name(const char *name, uint32_t len); /** Initialize the user cache and access control subsystem. */ void @@ -159,4 +87,4 @@ user_cache_init(); void user_cache_free(); -#endif /* INCLUDES_TARANTOOL_BOX_ACCESS_H */ +#endif /* INCLUDES_TARANTOOL_BOX_USER_CACHE_H */ diff --git a/src/box/user_def.cc b/src/box/user_def.cc new file mode 100644 index 0000000000000000000000000000000000000000..5f178be766c4735b99964f2b5bfa715a2fe507b4 --- /dev/null +++ b/src/box/user_def.cc @@ -0,0 +1,39 @@ +/* + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * 1. Redistributions of source code must retain the above + * copyright notice, this list of conditions and the + * following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL + * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF + * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ +#include "user_def.h" +const char * +priv_name(uint8_t access) +{ + if (access & PRIV_R) + return "Read"; + if (access & PRIV_W) + return "Write"; + return "Execute"; +} + diff --git a/src/box/user_def.h b/src/box/user_def.h new file mode 100644 index 0000000000000000000000000000000000000000..26de9fa1adbfac9bb419e225cb794cbe34426e04 --- /dev/null +++ b/src/box/user_def.h @@ -0,0 +1,84 @@ +#ifndef TARANTOOL_BOX_USER_DEF_H_INCLUDED +#define TARANTOOL_BOX_USER_DEF_H_INCLUDED +/* + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * 1. Redistributions of source code must retain the above + * copyright notice, this list of conditions and the + * following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. 
+ * + * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL + * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF + * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ +#include "key_def.h" /* for SCHEMA_OBJECT_TYPE */ +#include "scramble.h" /* for SCRAMBLE_SIZE */ + +enum { + /* SELECT */ + PRIV_R = 1, + /* INSERT, UPDATE, DELETE, REPLACE */ + PRIV_W = 2, + /* CALL */ + PRIV_X = 4, + /** Everything. */ + PRIV_ALL = PRIV_R + PRIV_W + PRIV_X +}; + +/* Privilege name for error messages */ +const char * +priv_name(uint8_t access); + +/** + * A cache entry for an existing user. Entries for all existing + * users are always present in the cache. The entry is maintained + * in sync with _user and _priv system spaces by system space + * triggers. + * @sa alter.cc + */ +struct user_def { + /** User id. */ + uint32_t uid; + /** Creator of the user */ + uint32_t owner; + /** 'user' or 'role' */ + enum schema_object_type type; + /** User password - hash2 */ + char hash2[SCRAMBLE_SIZE]; + /** User name - for error messages and debugging */ + char name[BOX_NAME_MAX + 1]; + /** Global privileges this user has on the universe. */ + struct access universal_access; + /** An id in users[] array to quickly find user */ + uint8_t auth_token; +}; + +/** Predefined user ids. 
*/ +enum { GUEST = 0, ADMIN = 1, PUBLIC = 2 /* role */ }; + +static inline void +current_user_init(struct current_user *user, struct user_def *def) +{ + user->auth_token = def->auth_token; + user->universal_access = def->universal_access.effective; + user->uid = def->uid; +} + +#endif /* TARANTOOL_BOX_USER_DEF_H_INCLUDED */ diff --git a/src/exception.cc b/src/exception.cc index d42b42cdbc3259b5da5c05d0fc426c74f17d135e..aefc9f26e53c02cdd8f4d13f5c48b3543e781e3f 100644 --- a/src/exception.cc +++ b/src/exception.cc @@ -66,6 +66,33 @@ Exception::operator new(size_t size) throw cord->exception; } +void +Exception::init(struct cord *cord) +{ + cord->exception = NULL; + cord->exception_size = 0; +} + +void +Exception::cleanup(struct cord *cord) +{ + if (cord->exception != NULL && cord->exception != &out_of_memory) { + cord->exception->~Exception(); + free(cord->exception); + } + Exception::init(cord); +} + +void +Exception::move(struct cord *from, struct cord *to) +{ + Exception::cleanup(to); + to->exception = from->exception; + to->exception_size = from->exception_size; + Exception::init(from); +} + + void Exception::operator delete(void * /* ptr */) { diff --git a/src/exception.h b/src/exception.h index 3cf645e399b1d833505128745ca4b92d0c4a87c6..d9afb096db2bb0d1903cb40884c2b996beba49c6 100644 --- a/src/exception.h +++ b/src/exception.h @@ -33,6 +33,7 @@ #include "errcode.h" #include "say.h" +struct cord; class Exception: public Object { public: @@ -53,6 +54,11 @@ class Exception: public Object { virtual void log() const = 0; virtual ~Exception() {} + static void init(struct cord *cord); + /** Clear the last error saved in the current thread's TLS */ + static void cleanup(struct cord *cord); + /** Move an exception from one thread to another. 
*/ + static void move(struct cord *from, struct cord *to); protected: Exception(const char *file, unsigned line); /* The copy constructor is needed for C++ throw */ diff --git a/src/fiber.cc b/src/fiber.cc index fe7b9ec570582f0ffef8d817531f47ad472eb7c9..b8b2dd07d87bf27b52501ef3793add9d7875e41f 100644 --- a/src/fiber.cc +++ b/src/fiber.cc @@ -37,6 +37,7 @@ #include "assoc.h" #include "memory.h" #include "trigger.h" +#include "coeio.h" static struct cord main_cord; __thread struct cord *cord_ptr = NULL; @@ -533,6 +534,7 @@ cord_create(struct cord *cord, const char *name) cord->sp = cord->stack; cord->max_fid = 100; + Exception::init(cord); ev_async_init(&cord->ready_async, fiber_ready_async); ev_async_start(cord->loop, &cord->ready_async); @@ -550,11 +552,7 @@ cord_destroy(struct cord *cord) } slab_cache_destroy(&cord->slabc); ev_loop_destroy(cord->loop); - /* Cleanup memory allocated for exceptions */ - if (cord->exception && cord->exception != &out_of_memory) { - cord->exception->~Exception(); - free(cord->exception); - } + Exception::cleanup(cord); } struct cord_thread_arg @@ -571,15 +569,33 @@ struct cord_thread_arg void *cord_thread_func(void *p) { struct cord_thread_arg *ct_arg = (struct cord_thread_arg *) p; - struct cord *cord = cord() = ct_arg->cord; + cord() = ct_arg->cord; + struct cord *cord = cord(); cord_create(cord, ct_arg->name); + /** Can't possibly be the main thread */ + assert(cord->id != main_thread_id); tt_pthread_mutex_lock(&ct_arg->start_mutex); void *(*f)(void *) = ct_arg->f; void *arg = ct_arg->arg; ct_arg->is_started = true; tt_pthread_cond_signal(&ct_arg->start_cond); tt_pthread_mutex_unlock(&ct_arg->start_mutex); - return f(arg); + void *res; + try { + res = f(arg); + /* + * Clear a possible leftover exception object + * to not confuse the invoker of the thread. + */ + Exception::cleanup(cord); + } catch (Exception *) { + /* + * The exception is now available to the caller + * via cord->exception. 
+ */ + res = NULL; + } + return res; } int @@ -604,16 +620,47 @@ cord_start(struct cord *cord, const char *name, void *(*f)(void *), void *arg) int cord_join(struct cord *cord) { - int res = 0; - if (tt_pthread_join(cord->id, NULL)) { - /* We can't recover from this in any reasonable way. */ - say_syserror("%s: thread join failed", cord->name); - res = -1; + assert(cord() != cord); /* Can't join self. */ + void *retval = NULL; + int res = tt_pthread_join(cord->id, &retval); + if (res == 0 && cord->exception) { + /* + * cord_thread_func guarantees that + * cord->exception is only set if the subject cord + * has terminated with an uncaught exception, + * transfer it to the caller. + */ + Exception::move(cord, cord()); + cord_destroy(cord); + cord()->exception->raise(); } cord_destroy(cord); return res; } +ssize_t +cord_cojoin_cb(va_list ap) +{ + struct cord *cord = va_arg(ap, struct cord *); + void *retval = NULL; + int res = tt_pthread_join(cord->id, &retval); + return res; +} + +int +cord_cojoin(struct cord *cord) +{ + assert(cord() != cord); /* Can't join self. */ + int rc = coeio_custom(cord_cojoin_cb, TIMEOUT_INFINITY, cord); + if (rc == 0 && cord->exception) { + Exception::move(cord, cord()); + cord_destroy(cord); + cord()->exception->raise(); /* re-throw exception from cord */ + } + cord_destroy(cord); + return rc; +} + void fiber_init(void) { diff --git a/src/fiber.h b/src/fiber.h index eda96e01b620afe8c2c0625192ac77bf2668552a..c6433d0b902a63794f6f68f7f0ad72768fb2aa06 100644 --- a/src/fiber.h +++ b/src/fiber.h @@ -85,7 +85,9 @@ enum fiber_key { FIBER_KEY_LUA_STORAGE = 1, /** transaction */ FIBER_KEY_TXN = 2, - FIBER_KEY_MAX = 3 + /** User global privilege and authentication token */ + FIBER_KEY_USER = 3, + FIBER_KEY_MAX = 4 }; struct fiber { @@ -171,13 +173,46 @@ extern __thread struct cord *cord_ptr; #define fiber() cord()->fiber #define loop() (cord()->loop) +/** + * Start a cord with the given thread function. 
+ * The return value of the function can be collected + * with cord_join(). If the function terminates with + * an exception, the return value is NULL, and cord_join() + * moves the exception from the terminated cord to + * the caller of cord_join(). + */ int cord_start(struct cord *cord, const char *name, void *(*f)(void *), void *arg); +/** + * Wait for \a cord to terminate. If \a cord has already + * terminated, then returns immediately. + * + * @post If the subject cord terminated with an exception, + * preserves the exception in the caller's cord. + * + * @param cord cord + * @retval 0 pthread_join succeeded. + * If the thread function terminated with an + * exception, the exception is raised in the + * caller cord. + * @retval -1 pthread_join failed. + */ int cord_join(struct cord *cord); +/** + * \brief Yield until \a cord has terminated. + * If \a cord has terminated with an uncaught exception + * **re-throws** this exception in the calling cord/fiber. + * \param cord cord + * \sa pthread_join() + * \return 0 on success + */ +int +cord_cojoin(struct cord *cord); + static inline void cord_set_name(const char *name) { diff --git a/src/lib/salad/rlist.h b/src/lib/salad/rlist.h index 7952d07cf5157732f5a6dea3e26b5415e4087947..b9a973b01d7147437bc592330c775e344e98ac1b 100644 --- a/src/lib/salad/rlist.h +++ b/src/lib/salad/rlist.h @@ -47,6 +47,7 @@ struct rlist { struct rlist *next; }; +/* Used for static initialization of an empty list. */ extern struct rlist rlist_nil; /** diff --git a/src/lib/small/quota.h b/src/lib/small/quota.h new file mode 100644 index 0000000000000000000000000000000000000000..96312f3305bff1f6904ad65262507bd5b23ef822 --- /dev/null +++ b/src/lib/small/quota.h @@ -0,0 +1,188 @@ +#ifndef INCLUDES_TARANTOOL_SMALL_QUOTA_H +#define INCLUDES_TARANTOOL_SMALL_QUOTA_H +/* + * Redistribution and use in source and binary forms, with or + * without modification, are permitted provided that the following + * conditions are met: + * + * 1. 
Redistributions of source code must retain the above + * copyright notice, this list of conditions and the + * following disclaimer. + * + * 2. Redistributions in binary form must reproduce the above + * copyright notice, this list of conditions and the following + * disclaimer in the documentation and/or other materials + * provided with the distribution. + * + * THIS SOFTWARE IS PROVIDED BY <COPYRIGHT HOLDER> ``AS IS'' AND + * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED + * TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR + * A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL + * <COPYRIGHT HOLDER> OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, + * INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL + * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF + * SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR + * BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF + * LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT + * (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF + * THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF + * SUCH DAMAGE. + */ +#include <stddef.h> +#include <stdlib.h> +#include <stdint.h> +#include <assert.h> + +#if defined(__cplusplus) +extern "C" { +#endif /* defined(__cplusplus) */ + +#define QUOTA_UNIT_SIZE 1024ULL + +static const uint64_t QUOTA_MAX = QUOTA_UNIT_SIZE * UINT32_MAX; + +/** A basic limit on memory usage */ +struct quota { + /** + * High order dword is the total available memory + * and the low order dword is the currently used amount. + * Both values are represented in units of size + * QUOTA_UNIT_SIZE. + */ + uint64_t value; +}; + +/** + * Initialize quota with a given memory limit + */ +static inline void +quota_init(struct quota *quota, size_t total) +{ + uint64_t new_total = (total + (QUOTA_UNIT_SIZE - 1)) / + QUOTA_UNIT_SIZE; + quota->value = new_total << 32; +} + +/** + * Provide wrappers around gcc built-ins for now. 
+ * These built-ins work with all numeric types - may not + * be the case when another implementation is used. + * Private use only. + */ +#define atomic_cas(a, b, c) __sync_val_compare_and_swap(a, b, c) + +/** + * Get current quota limit + */ +static inline size_t +quota_get(const struct quota *quota) +{ + return (quota->value >> 32) * QUOTA_UNIT_SIZE; +} + +/** + * Get current quota usage + */ +static inline size_t +quota_used(const struct quota *quota) +{ + return (quota->value & UINT32_MAX) * QUOTA_UNIT_SIZE; +} + +static inline void +quota_get_total_and_used(struct quota *quota, size_t *total, size_t *used) +{ + uint64_t value = quota->value; + *total = (value >> 32) * QUOTA_UNIT_SIZE; + *used = (value & UINT32_MAX) * QUOTA_UNIT_SIZE; +} + +/** + * Set quota memory limit. + * @retval > 0 aligned size set on success + * @retval -1 error, i.e. when it is not possible to decrease + * limit due to greater current usage + */ +static inline ssize_t +quota_set(struct quota *quota, size_t new_total) +{ + assert(new_total <= QUOTA_MAX); + /* Align the new total */ + uint32_t new_total_in_units = (new_total + (QUOTA_UNIT_SIZE - 1)) / + QUOTA_UNIT_SIZE; + while (1) { + uint64_t value = quota->value; + uint32_t used_in_units = value & UINT32_MAX; + if (new_total_in_units < used_in_units) + return -1; + uint64_t new_value = + ((uint64_t) new_total_in_units << 32) | used_in_units; + if (atomic_cas(&quota->value, value, new_value) == value) + break; + } + return new_total_in_units * QUOTA_UNIT_SIZE; +} + +/** + * Use up a quota + * @retval > 0 aligned value on success + * @retval -1 on error - if quota limit reached + */ +static inline ssize_t +quota_use(struct quota *quota, size_t size) +{ + assert(size < QUOTA_MAX); + uint32_t size_in_units = (size + (QUOTA_UNIT_SIZE - 1)) + / QUOTA_UNIT_SIZE; + assert(size_in_units); + while (1) { + uint64_t value = quota->value; + uint32_t total_in_units = value >> 32; + uint32_t used_in_units = value & UINT32_MAX; + + uint32_t
new_used_in_units = used_in_units + size_in_units; + assert(new_used_in_units > used_in_units); + + if (new_used_in_units > total_in_units) + return -1; + + uint64_t new_value = + ((uint64_t) total_in_units << 32) | new_used_in_units; + + if (atomic_cas(&quota->value, value, new_value) == value) + break; + } + return size_in_units * QUOTA_UNIT_SIZE; +} + +/** Release used memory */ +static inline void +quota_release(struct quota *quota, size_t size) +{ + assert(size < QUOTA_MAX); + uint32_t size_in_units = (size + (QUOTA_UNIT_SIZE - 1)) + / QUOTA_UNIT_SIZE; + assert(size_in_units); + while (1) { + uint64_t value = quota->value; + uint32_t total_in_units = value >> 32; + uint32_t used_in_units = value & UINT32_MAX; + + assert(size_in_units <= used_in_units); + uint32_t new_used_in_units = used_in_units - size_in_units; + + uint64_t new_value = + ((uint64_t) total_in_units << 32) | new_used_in_units; + + if (atomic_cas(&quota->value, value, new_value) == value) + break; + } +} + +#undef atomic_cas +#undef QUOTA_UNIT_SIZE + +#if defined(__cplusplus) +} /* extern "C" { */ +#endif /* defined(__cplusplus) */ +#endif /* INCLUDES_TARANTOOL_SMALL_QUOTA_H */ diff --git a/src/lib/small/slab_arena.c b/src/lib/small/slab_arena.c index 5e54359785df3eb017dc821041f850943c79e1df..12e2ff27432baaad76fe21170adf90678fd36ceb 100644 --- a/src/lib/small/slab_arena.c +++ b/src/lib/small/slab_arena.c @@ -27,6 +27,7 @@ * SUCH DAMAGE. */ #include "small/slab_arena.h" +#include "small/quota.h" #include <stdio.h> #include <stdint.h> #include <string.h> @@ -105,9 +106,8 @@ pow2round(size_t size) #define MIN(a, b) (a) < (b) ?
(a) : (b) int -slab_arena_create(struct slab_arena *arena, - size_t prealloc, size_t maxalloc, - uint32_t slab_size, int flags) +slab_arena_create(struct slab_arena *arena, struct quota *quota, + size_t prealloc, uint32_t slab_size, int flags) { assert(flags & (MAP_PRIVATE | MAP_SHARED)); lf_lifo_init(&arena->cache); @@ -118,9 +118,9 @@ slab_arena_create(struct slab_arena *arena, */ arena->slab_size = small_round(MAX(slab_size, SLAB_MIN_SIZE)); - arena->maxalloc = maxalloc; - /** Prealloc can not be greater than maxalloc */ - prealloc = MIN(prealloc, maxalloc); + arena->quota = quota; + /** Prealloc can not be greater than the quota */ + prealloc = MIN(prealloc, quota_get(quota)); /** Extremely large sizes can not be aligned properly */ prealloc = MIN(prealloc, SIZE_MAX - arena->slab_size); /* Align prealloc around a fixed number of slabs. */ @@ -165,15 +165,14 @@ slab_map(struct slab_arena *arena) if ((ptr = lf_lifo_pop(&arena->cache))) return ptr; + if (quota_use(arena->quota, arena->slab_size) < 0) + return NULL; + /** Need to allocate a new slab. */ size_t used = __sync_add_and_fetch(&arena->used, arena->slab_size); if (used <= arena->prealloc) return arena->arena + used - arena->slab_size; - if (used > arena->maxalloc) { - __sync_sub_and_fetch(&arena->used, arena->slab_size); - return NULL; - } return mmap_checked(arena->slab_size, arena->slab_size, arena->flags); } diff --git a/src/lib/small/slab_arena.h b/src/lib/small/slab_arena.h index 322fb80a08d19ad1d8227999a70f2adb1b252aa3..29943436efad3c74625751180ccb72df737c0968 100644 --- a/src/lib/small/slab_arena.h +++ b/src/lib/small/slab_arena.h @@ -38,15 +38,16 @@ extern "C" { enum { /* Smallest possible slab size. */ - SLAB_MIN_SIZE = USHRT_MAX, + SLAB_MIN_SIZE = ((size_t)USHRT_MAX) + 1, /** The largest allowed amount of memory of a single arena. */ - SMALL_UNLIMITED = SIZE_MAX/2 + 1 + SMALL_UNLIMITED = SIZE_MAX/2 + 1 }; /** * slab_arena -- a source of large aligned blocks of memory. * MT-safe. 
* Uses a lock-free LIFO to maintain a cache of used slabs. + * Uses a lock-free quota to limit allocating memory. * Never returns memory to the operating system. */ struct slab_arena { @@ -59,6 +60,22 @@ struct slab_arena { struct lf_lifo cache; /** A preallocated arena of size = prealloc. */ void *arena; + /** + * How much memory is preallocated during initialization + * of slab_arena. + */ + size_t prealloc; + /** + * How much memory in the preallocated arena has + * already been initialized for slabs. + * @invariant used <= prealloc. + */ + size_t used; + /** + * An external quota to which we must adhere. + * A quota exists to set a common limit on two arenas. + */ + struct quota *quota; /* * Each object returned by arena_map() has this size. * The size is provided at arena initialization. @@ -75,25 +92,6 @@ struct slab_arena { * allocate objects of size up to ~1MB. */ uint32_t slab_size; - /** - * How much memory is preallocated during initialization - * of slab_arena. - */ - size_t prealloc; - /** - * How much above 'prealloc' size we can go after - * the arena is exhausted (arena_used == prealloc). - * Each new slab is allocated by a dedicated mmap() - * call. When returned, slabs allocated this way - * are munmap()-ed right away. - */ - size_t maxalloc; - /** - * How much memory in the preallocated arena has - * already been initialized for slabs. - * @invariant arena_used <= prealloc. - */ - size_t used; /** * mmap() flags: MAP_SHARED or MAP_PRIVATE */ @@ -102,8 +100,8 @@ struct slab_arena { /** Initialize an arena. */ int -slab_arena_create(struct slab_arena *arena, size_t prealloc, - size_t maxalloc, uint32_t slab_size, int flags); +slab_arena_create(struct slab_arena *arena, struct quota *quota, + size_t prealloc, uint32_t slab_size, int flags); /** Destroy an arena. */ void @@ -121,7 +119,10 @@ slab_unmap(struct slab_arena *arena, void *ptr); void slab_arena_mprotect(struct slab_arena *arena); -/** Align a size. 
Alignment must be a power of 2 */ +/** + * Align a size - round up to nearest divisible by the given alignment. + * Alignment must be a power of 2 + */ static inline size_t small_align(size_t size, size_t alignment) { @@ -132,13 +133,14 @@ small_align(size_t size, size_t alignment) return (size - 1 + alignment) & ~(alignment - 1); } -/** Round a number to the nearest power of two. */ +/** Round up a number to the nearest power of two. */ static inline size_t small_round(size_t size) { if (size < 2) return size; - assert(size <= SMALL_UNLIMITED); + assert(size <= SIZE_MAX / 2 + 1); + assert(size - 1 <= ULONG_MAX); size_t r = 1; return r << (sizeof(unsigned long) * CHAR_BIT - __builtin_clzl((unsigned long) (size - 1))); @@ -148,7 +150,7 @@ small_round(size_t size) static inline size_t small_lb(size_t size) { - assert(size <= SMALL_UNLIMITED); + assert(size <= ULONG_MAX); return sizeof(unsigned long) * CHAR_BIT - __builtin_clzl((unsigned long) size) - 1; } diff --git a/src/lua/fiber.cc b/src/lua/fiber.cc index 6d9e42201880a301fa87d2b898787edc349ffdb9..03a155d397fd71b4db07cd40128ead67b279917a 100644 --- a/src/lua/fiber.cc +++ b/src/lua/fiber.cc @@ -260,7 +260,7 @@ lbox_fiber_info(struct lua_State *L) } static void -box_lua_fiber_run_detached(va_list ap) +box_lua_fiber_run(va_list ap) { LuarefGuard coro_guard(va_arg(ap, int)); struct lua_State *L = va_arg(ap, struct lua_State *); @@ -270,7 +270,6 @@ box_lua_fiber_run_detached(va_list ap) fiber_get_key(fiber(), FIBER_KEY_LUA_STORAGE); if (storage_ref > 0) lua_unref(L, storage_ref); - fiber_set_key(fiber(), FIBER_KEY_LUA_STORAGE, NULL); }); try { @@ -293,7 +292,7 @@ lbox_fiber_create(struct lua_State *L) luaL_error(L, "fiber.create(function, ...): bad arguments"); fiber_checkstack(); - struct fiber *f = fiber_new("lua", box_lua_fiber_run_detached); + struct fiber *f = fiber_new("lua", box_lua_fiber_run); /* Not a system fiber. 
*/ f->flags |= FIBER_USER_MODE; struct lua_State *child_L = lua_newthread(L); diff --git a/src/lua/init.cc b/src/lua/init.cc index 17d3cc78f5b90381bcd840056cee92a83c3d8d68..667cfbc314d010f6ec51eaabe6726bd3876c381a 100644 --- a/src/lua/init.cc +++ b/src/lua/init.cc @@ -57,14 +57,17 @@ extern "C" { #include "lua/pickle.h" #include "lua/fio.h" -#include <ctype.h> -#include "small/region.h" -#include <stdio.h> -#include <unistd.h> #include <readline/readline.h> #include <readline/history.h> +/** + * The single Lua state of the transaction processor (tx) thread. + */ struct lua_State *tarantool_L; +/** + * The fiber running the startup Lua script + */ +struct fiber *script_fiber; /* contents of src/lua/ files */ extern char uuid_lua[], @@ -436,19 +439,30 @@ tarantool_lua_run_script(char *path) * To work this problem around we must run init script in * a separate fiber. */ - struct fiber *loader = fiber_new(title, run_script); - fiber_call(loader, tarantool_L, path); + script_fiber = fiber_new(title, run_script); + fiber_call(script_fiber, tarantool_L, path); /* * Run an auxiliary event loop to re-schedule run_script fiber. * When this fiber finishes, it will call ev_break to stop the loop. */ ev_run(loop(), 0); + /* The fiber running the startup script has ended. */ + script_fiber = NULL; } void tarantool_lua_free() { + /* + * Some part of the start script panicked, and called + * exit(). The call stack in this case leads us back to + * luaL_call() in run_script(). Trying to free a Lua state + * from within luaL_call() is not the smartest idea (@sa + * gh-612). + */ + if (script_fiber) + return; /* * Got to be done prior to anything else, since GC * handlers can refer to other subsystems (e.g. fibers). diff --git a/src/memory.cc b/src/memory.cc index fa88038b4bbc1f4720c97c0ed985d51809cb0a15..95e9f97675f19482826cb0ab8627dd5dcebe4d09 100644 --- a/src/memory.cc +++ b/src/memory.cc @@ -27,16 +27,20 @@ * SUCH DAMAGE. 
*/ #include "memory.h" +#include "small/quota.h" struct slab_arena runtime; -static const size_t SLAB_SIZE = 4194304; - void memory_init() { + static struct quota runtime_quota; + static const size_t SLAB_SIZE = 4 * 1024 * 1024; + /* default quota initialization */ + quota_init(&runtime_quota, QUOTA_MAX); + /* No limit on the runtime memory. */ - slab_arena_create(&runtime, 0, SMALL_UNLIMITED, + slab_arena_create(&runtime, &runtime_quota, 0, SLAB_SIZE, MAP_PRIVATE); } diff --git a/src/trivia/util.h b/src/trivia/util.h index f6d1a7f49e638e662209e7adad1c4be6c9440115..d14e80249bb60012f571b6f6f2e35d5e7b4913e5 100644 --- a/src/trivia/util.h +++ b/src/trivia/util.h @@ -172,7 +172,7 @@ char *find_path(const char *argv0); char *abspath(const char *filename); char * -int2str(int val); +int2str(long int val); #ifndef HAVE_MEMMEM /* Declare memmem(). */ diff --git a/src/tt_pthread.h b/src/tt_pthread.h index c0a6683651ac1d429e3b491cd172a95322651459..6f83a950e90160ca54020800b363b79102a57d81 100644 --- a/src/tt_pthread.h +++ b/src/tt_pthread.h @@ -46,7 +46,7 @@ #define tt_pthread_error(e) \ if (e != 0) \ - say_error("%s error %d", __func__, e);\ + say_syserror("%s error %d", __func__, e);\ assert(e == 0); \ e diff --git a/src/util.cc b/src/util.cc index f0cd3ca37cf1eb6a557f4322f97eaf9d8589a3c2..557beb94a662b488ba43e4c83d41e917a72afdd7 100644 --- a/src/util.cc +++ b/src/util.cc @@ -367,10 +367,10 @@ abspath(const char *filename) } char * -int2str(int val) +int2str(long int val) { static char __thread buf[22]; - snprintf(buf, sizeof(buf), "%d", val); + snprintf(buf, sizeof(buf), "%ld", val); return buf; } diff --git a/test/app/console.test.lua b/test/app/console.test.lua index 5a611cc4e4b7c7961534b53c1d136311cdcef5f1..148926c580d8c8ed53df3fddcd5a33cfb9237ae3 100755 --- a/test/app/console.test.lua +++ b/test/app/console.test.lua @@ -72,7 +72,7 @@ test:ok(yaml.decode(client:read(EOL))[1].error:find('access denied'), box.schema.user.create('test', { password = 'pass' }) 
client:write(string.format("require('console').connect('test:pass@%s')\n", IPROTO_SOCKET)) --- error: Execute access denied for user 'tester' to function 'dostring +-- error: Execute access denied for user 'test' to function 'dostring test:ok(yaml.decode(client:read(EOL))[1].error:find('access denied'), 'remote access denied') diff --git a/test/app/session.test.lua b/test/app/session.test.lua index d885b845ff584442ed63645dc55f09a43460df4a..9c1a674bf1c0f612b924f75148fae538b8f4aae4 100755 --- a/test/app/session.test.lua +++ b/test/app/session.test.lua @@ -3,7 +3,7 @@ -- -- Check that Tarantool creates ADMIN session for #! script -- -box.cfg{logger="tarantool.log"} +box.cfg{logger="tarantool.log", slab_alloc_arena=0.1} print('session.id()', box.session.id()) print('session.uid()', box.session.uid()) os.exit(0) diff --git a/test/app/trigger_atexit.test.lua b/test/app/trigger_atexit.test.lua index 1e85558d9effec3aea8b4ae2fcb4bf96ea5a4439..7ac54bd5e961396560ef15d4637ce6c8110205d6 100755 --- a/test/app/trigger_atexit.test.lua +++ b/test/app/trigger_atexit.test.lua @@ -13,7 +13,8 @@ box.cfg { wal_dir = tempdir, snap_dir = tempdir, sophia_dir = tempdir, - logger = fio.pathjoin(tempdir, 'tarantool.log') + logger = fio.pathjoin(tempdir, 'tarantool.log'), + slab_alloc_arena=0.1 -- for small systems } local function test_replace(old_tuple, new_tuple) diff --git a/test/box/alter.result b/test/box/alter.result index e5974dc8195617b14807d660197f2cbc2561970e..06f8a885e1b89c26a265c84f7c840a141bb86349 100644 --- a/test/box/alter.result +++ b/test/box/alter.result @@ -134,7 +134,7 @@ space_deleted ... space:replace{0} --- -- error: Space '#321' does not exist +- error: Space '321' does not exist ... 
_index:insert{_space.id, 0, 'primary', 'tree', 1, 1, 0, 'num'} --- diff --git a/test/box/errinj.result b/test/box/errinj.result index 204d1ddda9c0ba993046521b96ec3227f0c86583..5435850f509c716131f38889df07223bbec5de1c 100644 --- a/test/box/errinj.result +++ b/test/box/errinj.result @@ -271,7 +271,7 @@ box.space['withdata'] ... index7 = s_withdata:create_index('another', { type = 'tree', parts = { 5, 'num' }, unique = false}) --- -- error: Space '#514' does not exist +- error: Space '514' does not exist ... s_withdata.index.another --- diff --git a/test/box/session.storage.result b/test/box/session.storage.result index b10f215b103064ba95113155f248109c5c7250b8..a292e1b2f287f2e476444ae66b9637ac7f928e8b 100644 --- a/test/box/session.storage.result +++ b/test/box/session.storage.result @@ -31,7 +31,7 @@ all = getmetatable(session).aggregate_storage ... dump(all) --- -- '''[null,null,null,{"abc":"cde"}]''' +- '''[null,null,{"abc":"cde"}]''' ... --# create connection second to default --# set connection second diff --git a/test/box/sql.result b/test/box/sql.result index 9c1c375829e19b4d40aaba65013fd1306a8d6199..9bab5ad7f66dd8bf9f2c33e9409767686f3574c8 100644 --- a/test/box/sql.result +++ b/test/box/sql.result @@ -168,19 +168,19 @@ select * from t1 where k0 = 0 --- - error: errcode: ER_NO_SUCH_SPACE - errmsg: Space '#1' does not exist + errmsg: Space '1' does not exist ... select * from t65537 where k0 = 0 --- - error: errcode: ER_NO_SUCH_SPACE - errmsg: Space '#65537' does not exist + errmsg: Space '65537' does not exist ... select * from t4294967295 where k0 = 0 --- - error: errcode: ER_NO_SUCH_SPACE - errmsg: Space '#4294967295' does not exist + errmsg: Space '4294967295' does not exist ... 
box.space[0]:drop() --- diff --git a/test/unit/CMakeLists.txt b/test/unit/CMakeLists.txt index 32306d8db765db812ed0f6190d3d4700bd21a375..dc113ed4955596b58e92f319083d3bfde8e54aae 100644 --- a/test/unit/CMakeLists.txt +++ b/test/unit/CMakeLists.txt @@ -47,6 +47,8 @@ add_executable(vclock.test vclock.cc test.c ${CMAKE_SOURCE_DIR}/src/box/vclock.c) target_link_libraries(vclock.test core small ${LIBEV_LIBRARIES} ${LIBEIO_LIBRARIES} ${LIBCORO_LIBRARIES}) +add_executable(quota.test quota.cc test.c) +target_link_libraries(quota.test pthread) set(MSGPUCK_DIR ${PROJECT_SOURCE_DIR}/src/lib/msgpuck/) add_executable(msgpack.test diff --git a/test/unit/arena_mt.c b/test/unit/arena_mt.c index 725d63b6fd34d39c0b4b54f717d870460eb36d48..b8eca23477d30141fedd41e04fba2d3ce3386de4 100644 --- a/test/unit/arena_mt.c +++ b/test/unit/arena_mt.c @@ -1,4 +1,5 @@ #include "small/slab_arena.h" +#include "small/quota.h" #include <stdio.h> #include <limits.h> #include <stdlib.h> @@ -7,6 +8,7 @@ #include <pthread.h> struct slab_arena arena; +struct quota quota; int THREADS = 8; int ITERATIONS = 1009 /* 100003 */; @@ -63,7 +65,8 @@ int main() { size_t maxalloc = THREADS * (OSCILLATION + 1) * SLAB_MIN_SIZE; - slab_arena_create(&arena, maxalloc/8, maxalloc*2, + quota_init(&quota, maxalloc); + slab_arena_create(&arena, &quota, maxalloc/8, SLAB_MIN_SIZE, MAP_PRIVATE); bench(THREADS); slab_arena_destroy(&arena); diff --git a/test/unit/mempool.c b/test/unit/mempool.c index aef5fcb5a14439032cd6da5692a12ee66aae2562..90c72d5f53b09e61969d2ee2ffa1208f88cc861f 100644 --- a/test/unit/mempool.c +++ b/test/unit/mempool.c @@ -1,4 +1,5 @@ #include "small/mempool.h" +#include "small/quota.h" #include "unit.h" #include <stdio.h> #include <stdlib.h> @@ -14,6 +15,7 @@ enum { struct slab_arena arena; struct slab_cache cache; +struct quota quota; struct mempool pool; int objsize; size_t used; @@ -101,7 +103,9 @@ int main() if (objsize < OBJSIZE_MIN) objsize = OBJSIZE_MIN; - slab_arena_create(&arena, 0, UINT_MAX, + quota_init(&quota,
UINT_MAX); + + slab_arena_create(&arena, &quota, 0, 4000000, MAP_PRIVATE); slab_cache_create(&cache, &arena, 0); diff --git a/test/unit/quota.cc b/test/unit/quota.cc new file mode 100644 index 0000000000000000000000000000000000000000..0f66a391bf08d43e8a4ddbc59c1dfaf4158be4fa --- /dev/null +++ b/test/unit/quota.cc @@ -0,0 +1,98 @@ +#include "small/quota.h" + +#include <pthread.h> +#include "test.h" + +struct quota quota; + +const size_t THREAD_CNT = 10; +const size_t RUN_CNT = 128 * 1024; + +struct thread_data { + size_t use_change; + size_t last_lim_set; + long use_change_success; + long lim_change_success; +}; + +pthread_t threads[THREAD_CNT]; +thread_data datum[THREAD_CNT]; + +void *thread_routine(void *vparam) +{ + struct thread_data *data = (struct thread_data *)vparam; + size_t check_fail_count = 0; + ssize_t allocated_size = 0; + for (size_t i = 0; i < RUN_CNT; i++) { + { + size_t total, used; + quota_get_total_and_used(&quota, &total, &used); + if (used > total) + check_fail_count++; + } + ssize_t max = rand() % QUOTA_MAX; + max = quota_set(&quota, max); + pthread_yield(); + if (max > 0) { + data->last_lim_set = max; + data->lim_change_success++; + } + if (allocated_size > 0) { + quota_release(&quota, allocated_size); + allocated_size = -1; + data->use_change = 0; + data->use_change_success++; + pthread_yield(); + } else { + allocated_size = rand() % max + 1; + allocated_size = quota_use(&quota, allocated_size); + if (allocated_size > 0) { + data->use_change = allocated_size; + data->use_change_success++; + } + pthread_yield(); + } + } + return (void *)check_fail_count; +} + +int +main(int n, char **a) +{ + (void)n; + (void)a; + quota_init(&quota, 0); + srand(time(0)); + + plan(5); + + for (size_t i = 0; i < THREAD_CNT; i++) { + pthread_create(threads + i, 0, thread_routine, (void *)(datum + i)); + } + + size_t check_fail_count = 0; + for (size_t i = 0; i < THREAD_CNT; i++) { + void *ret; + check_fail_count += (size_t)pthread_join(threads[i], &ret); + } + + bool one_set_successed =
false; + size_t total_alloc = 0; + long set_success_count = 0; + long use_success_count = 0; + for (size_t i = 0; i < THREAD_CNT; i++) { + if (datum[i].last_lim_set == quota_get(&quota)) + one_set_successed = true; + total_alloc += datum[i].use_change; + use_success_count += datum[i].use_change_success; + set_success_count += datum[i].lim_change_success; + } + + ok(check_fail_count == 0, "no fails detected"); + ok(one_set_successed, "one of thread limit set is final"); + ok(total_alloc == quota_used(&quota), "total alloc match"); + ok(use_success_count > THREAD_CNT * RUN_CNT * .1, "uses are mosly successful"); + ok(set_success_count > THREAD_CNT * RUN_CNT * .1, "sets are mosly successful"); + + return check_plan(); +} diff --git a/test/unit/quota.result b/test/unit/quota.result new file mode 100644 index 0000000000000000000000000000000000000000..84fddb5dab8adc71b5d9f8f6f86a49b182ab443f --- /dev/null +++ b/test/unit/quota.result @@ -0,0 +1,6 @@ +1..5 +ok 1 - no fails detected +ok 2 - one of thread limit set is final +ok 3 - total alloc match +ok 4 - uses are mosly successful +ok 5 - sets are mosly successful diff --git a/test/unit/region.c b/test/unit/region.c index ee3ede14d72fd3c75a1d682402531bc43afaf3a6..b1fbd342387bfc1dee65bf72cadaa98efd38fddf 100644 --- a/test/unit/region.c +++ b/test/unit/region.c @@ -1,9 +1,11 @@ #include "small/region.h" +#include "small/quota.h" #include "unit.h" #include <stdio.h> struct slab_cache cache; struct slab_arena arena; +struct quota quota; void region_basic() @@ -75,7 +77,8 @@ region_test_truncate() int main() { - slab_arena_create(&arena, 0, UINT_MAX, + quota_init(&quota, UINT_MAX); + slab_arena_create(&arena, &quota, 0, 4000000, MAP_PRIVATE); slab_cache_create(&cache, &arena, 0); diff --git a/test/unit/slab_arena.c b/test/unit/slab_arena.c index e0c27cb1c8dd1e6a7b0b1c34b083b92e325e9ddd..fe71bc713454bd8cd3041f79c2633d04dbb7b919 100644 --- a/test/unit/slab_arena.c +++ b/test/unit/slab_arena.c @@ -1,4 +1,5 @@ #include "small/slab_arena.h"
+#include "small/quota.h" #include <stdio.h> #include <limits.h> #include <stdlib.h> @@ -10,17 +11,22 @@ slab_arena_print(struct slab_arena *arena) { printf("arena->prealloc = %zu\narena->maxalloc = %zu\n" "arena->used = %zu\narena->slab_size = %u\n", - arena->prealloc, arena->maxalloc, + arena->prealloc, quota_get(arena->quota), arena->used, arena->slab_size); } int main() { + struct quota quota; struct slab_arena arena; - slab_arena_create(&arena, 0, 0, 0, MAP_PRIVATE); + + quota_init(&quota, 0); + slab_arena_create(&arena, &quota, 0, 0, MAP_PRIVATE); slab_arena_print(&arena); slab_arena_destroy(&arena); - slab_arena_create(&arena, 1, 1, 1, MAP_PRIVATE); + + quota_init(&quota, SLAB_MIN_SIZE); + slab_arena_create(&arena, &quota, 1, 1, MAP_PRIVATE); slab_arena_print(&arena); void *ptr = slab_map(&arena); slab_arena_print(&arena); @@ -30,9 +36,10 @@ int main() slab_unmap(&arena, ptr); slab_unmap(&arena, ptr1); slab_arena_print(&arena); - slab_arena_destroy(&arena); - slab_arena_create(&arena, 2000000, 3000000, 1, MAP_PRIVATE); + + quota_init(&quota, 2000000); + slab_arena_create(&arena, &quota, 3000000, 1, MAP_PRIVATE); slab_arena_print(&arena); slab_arena_destroy(&arena); } diff --git a/test/unit/slab_arena.result b/test/unit/slab_arena.result index 5af7304e7433542a38978afa3798c659d0d8bd4e..fb73025acbdff3314c646dca93edef2e95c8cf0c 100644 --- a/test/unit/slab_arena.result +++ b/test/unit/slab_arena.result @@ -3,23 +3,23 @@ arena->maxalloc = 0 arena->used = 0 arena->slab_size = 65536 arena->prealloc = 65536 -arena->maxalloc = 1 +arena->maxalloc = 65536 arena->used = 0 arena->slab_size = 65536 arena->prealloc = 65536 -arena->maxalloc = 1 +arena->maxalloc = 65536 arena->used = 65536 arena->slab_size = 65536 going beyond the limit: (nil) arena->prealloc = 65536 -arena->maxalloc = 1 +arena->maxalloc = 65536 arena->used = 65536 arena->slab_size = 65536 arena->prealloc = 65536 -arena->maxalloc = 1 +arena->maxalloc = 65536 arena->used = 65536 arena->slab_size = 65536 arena->prealloc = 2031616
-arena->maxalloc = 3000000 +arena->maxalloc = 2000896 arena->used = 0 arena->slab_size = 65536 diff --git a/test/unit/small_alloc.c b/test/unit/small_alloc.c index 85bec1474a41e8784dc90e50fa346e2def05abd2..96126ac3ad2ed401ebe7bd798c449ee5a9f8e50b 100644 --- a/test/unit/small_alloc.c +++ b/test/unit/small_alloc.c @@ -1,4 +1,5 @@ #include "small/small.h" +#include "small/quota.h" #include "unit.h" #include <stdio.h> #include <stdlib.h> @@ -15,6 +16,7 @@ enum { struct slab_arena arena; struct slab_cache cache; struct small_alloc alloc; +struct quota quota; /* Streak type - allocating or freeing */ bool allocating = true; /** Keep global to easily inspect the core. */ @@ -92,7 +94,9 @@ int main() srand(seed); - slab_arena_create(&arena, 0, UINT_MAX, 4000000, + quota_init(&quota, UINT_MAX); + + slab_arena_create(&arena, &quota, 0, 4000000, MAP_PRIVATE); slab_cache_create(&cache, &arena, 0); diff --git a/test/wal/oom.result b/test/wal/oom.result index 4a27adc6142d909ce5bc7ab0aa724603b1a67b07..1256e4946451bae12e270fa922f59beffc3d6022 100644 --- a/test/wal/oom.result +++ b/test/wal/oom.result @@ -15,11 +15,11 @@ while true do i = i + 1 end; --- -- error: Failed to allocate 268 bytes in slab allocator for tuple +- error: Failed to allocate 252 bytes in slab allocator for tuple ... space:len(); --- -- 62 +- 58 ... i = 1; --- @@ -29,11 +29,11 @@ while true do i = i + 1 end; --- -- error: Failed to allocate 268 bytes in slab allocator for tuple +- error: Failed to allocate 252 bytes in slab allocator for tuple ... space:len(); --- -- 124 +- 116 ... i = 1; --- @@ -43,12 +43,12 @@ while true do i = i + 1 end; --- -- error: Failed to allocate 265 bytes in slab allocator for tuple +- error: Failed to allocate 249 bytes in slab allocator for tuple ... --# setopt delimiter '' space:len() --- -- 185 +- 173 ... space.index['primary']:get{0} ---