Ivan Koptelov
authored
If a UTF-8 string containing a '\0' symbol is passed to a built-in function such as LIKE or LENGTH, the '\0' is assumed to be the end of the string. This behavior is considered inappropriate. Let's fix it: treat '\0' as an ordinary UTF-8 symbol and process strings containing it in full. Consider the examples:

LENGTH(CHAR(65,00,65)) == 3
LIKE(CHAR(65,00,65), CHAR(65,00,66)) == False

The patch also changes the way we count the length of UTF-8 strings. Before, we processed each byte of the string. Now we use the following algorithm. Starting from the first byte of the string, we determine what kind of byte it is: the first byte of a 1-, 2-, 3- or 4-byte sequence. Then we skip the corresponding number of bytes (e.g. 2 bytes for a 2-byte sequence) and increment the symbol length. If the current byte is not a valid first byte of any sequence, then we skip it and increment the symbol length.

Note that the new approach might increase performance of LENGTH(), INSTR() and TRIM().

Closes #3542

@TarantoolBot document
Title: null-term is treated now as usual character in str funcs
User-visible behavior of sql functions dealing with strings would change as it is described in the commit message.
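The length-counting algorithm described in the commit message can be sketched in C roughly as follows. This is only an illustration and not the code from the patch: the names `utf8_seq_len` and `utf8_char_count` are hypothetical, and the sketch assumes that the byte length of the string is known up front, so that '\0' needs no special treatment.

```c
#include <stddef.h>

/*
 * Hypothetical helper (not the actual Tarantool function): return how
 * many bytes to skip for a sequence starting with lead byte c.
 * An invalid lead byte is skipped as a single byte.
 */
static int
utf8_seq_len(unsigned char c)
{
	if (c < 0x80)
		return 1;	/* 0xxxxxxx: 1-byte sequence, includes '\0' */
	if ((c & 0xE0) == 0xC0)
		return 2;	/* 110xxxxx: first byte of a 2-byte sequence */
	if ((c & 0xF0) == 0xE0)
		return 3;	/* 1110xxxx: first byte of a 3-byte sequence */
	if ((c & 0xF8) == 0xF0)
		return 4;	/* 11110xxx: first byte of a 4-byte sequence */
	return 1;		/* not a valid first byte: skip it alone */
}

/*
 * Hypothetical sketch: count symbols in the first byte_len bytes of s.
 * The loop is bounded by byte_len, not by '\0', so an embedded '\0'
 * is counted as one more symbol.
 */
static size_t
utf8_char_count(const char *s, size_t byte_len)
{
	size_t count = 0;
	size_t i = 0;
	while (i < byte_len) {
		i += utf8_seq_len((unsigned char)s[i]);
		count++;
	}
	return count;
}
```

With this sketch, the three bytes 65, 0, 65 count as three symbols, which matches the LENGTH(CHAR(65,00,65)) == 3 example above. Only lead bytes are inspected, so a multi-byte sequence costs one loop iteration instead of one per byte.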