MessagePack extensions
Tarantool uses predefined MessagePack extension types to represent some
of the special values. Extension types include MP_DECIMAL
, MP_UUID
,
MP_ERROR
, MP_DATETIME
, and MP_INTERVAL
.
These types require special attention from the connector developers,
as they must be treated separately from the default MessagePack types,
and correctly mapped to programming language types.
The MessagePack EXT type MP_EXT
together with the extension type
MP_DECIMAL
is a header for values of the DECIMAL type.
MP_DECIMAL
type is 1.
MessagePack specification defines two kinds of types:
fixext 1/2/4/8/16
types have fixed length so the length is not encoded explicitly.ext 8/16/32
types require the data length to be encoded.
MP_EXP
+ optional length
imply using one of these types.
The decimal MessagePack representation looks like this:
+--------+-------------------+------------+===============+
| MP_EXT | length (optional) | MP_DECIMAL | PackedDecimal |
+--------+-------------------+------------+===============+
Here length
is the length of PackedDecimal
field, and it is of type
MP_UINT
, when encoded explicitly (i.e. when the type is ext 8/16/32
).
PackedDecimal
has the following structure:
<--- length bytes -->
+-------+=============+
| scale | BCD |
+-------+=============+
Here scale
is either MP_INT
or MP_UINT
.
scale
= number of digits after the decimal point
BCD
is a sequence of bytes representing decimal digits of the encoded number
(each byte has two decimal digits each encoded using 4-bit nibbles
),
so byte >> 4
is the first digit and byte & 0x0f
is the second digit.
The leftmost digit in the array is the most significant.
The rightmost digit in the array is the least significant.
The first byte of the BCD
array contains the first digit of the number,
represented as follows:
| 4 bits | 4 bits |
= 0x = the 1st digit
(The first nibble
contains 0 if the decimal number has an even number of digits.)
The last byte of the BCD
array contains the last digit of the number and the
final nibble
, represented as follows:
| 4 bits | 4 bits |
= the last digit = nibble
The final nibble
represents the number’s sign:
0x0a
,0x0c
,0x0e
,0x0f
stand for plus,0x0b
and0x0d
stand for minus.
Examples
The decimal -12.34
will be encoded as 0xd6,0x01,0x02,0x01,0x23,0x4d
:
|MP_EXT (fixext 4) | MP_DECIMAL | scale | 1 | 2,3 | 4 (minus) |
| 0xd6 | 0x01 | 0x02 | 0x01 | 0x23 | 0x4d |
The decimal 0.000000000000000000000000000000000010
will be encoded as 0xc7,0x03,0x01,0x24,0x01,0x0c
:
| MP_EXT (ext 8) | length | MP_DECIMAL | scale | 1 | 0 (plus) |
| 0xc7 | 0x03 | 0x01 | 0x24 | 0x01 | 0x0c |
The MessagePack EXT type MP_EXT
together with the extension type
MP_UUID
for values of the UUID type. Since version 2.4.1.
MP_UUID
type is 2.
The MessagePack specification
defines d8
to mean fixext
with size 16, and a UUID’s size is always 16.
So the UUID MessagePack representation looks like this:
+--------+------------+-----------------+
| MP_EXT | MP_UUID | UuidValue |
| = d8 | = 2 | = 16-byte value |
+--------+------------+-----------------+
The 16-byte value has 2 digits per byte. Typically, it consists of 11 fields, which are encoded as big-endian unsigned integers in the following order:
time_low
(4 bytes)time_mid
(2 bytes)time_hi_and_version
(2 bytes)clock_seq_hi_and_reserved
(1 byte)clock_seq_low
(1 byte)node[0]
, …,node[5]
(1 byte each)
Some of the functions in Module uuid can produce values which are compatible with the UUID data type. For example, after
uuid = require('uuid')
box.schema.space.create('t')
box.space.t:create_index('i', {parts={1,'uuid'}})
box.space.t:insert{uuid.fromstr('f6423bdf-b49e-4913-b361-0740c9702e4b')}
box.space.t:select()
a peek at the server response packet will show that it contains
d8 02 f6 42 3b df b4 9e 49 13 b3 61 07 40 c9 70 2e 4b
Since version 2.4.1, responses for errors have extra information
following what was described in
Box protocol – responses for errors.
This is a “compatible” enhancement, because clients that expect old-style
server responses should ignore map components that they do not recognize.
Notice, however, that there has been a renaming of a constant:
formerly IPROTO_ERROR
in ./box/iproto_constants.h
was 0x31
,
now IPROTO_ERROR
is 0x52
and IPROTO_ERROR_24
is 0x31
.
MP_ERROR
type is 3.
++=========================+============================+
|| | |
|| 0x31: IPROTO_ERROR_24 | 0x52: IPROTO_ERROR |
|| MP_INT: MP_STRING | MP_MAP: extra information |
|| | |
++=========================+============================+
MP_MAP
The extra information, most of which is also in error object fields, is:
MP_ERROR_TYPE
(0x00) (MP_STR) Type that implies source, as in error_object.base_type
, for example “ClientError”.
MP_ERROR_FILE
(0x01) (MP_STR) Source code file where error was caught, as in error_object.trace
.
MP_ERROR_LINE
(0x02) (MP_UINT) Line number in source code file, as in error_object.trace
.
MP_ERROR_MESSAGE
(0x03) (MP_STR) Text of reason, as in error_object.message
.
The value here will be the same as in the IPROTO_ERROR_24
value.
MP_ERROR_ERRNO
(0x04) (MP_UINT) Ordinal number of the error, as in error_object.errno
.
Not to be confused with MP_ERROR_ERRCODE
.
MP_ERROR_ERRCODE
(0x05) (MP_UINT) Number of the error as defined in errcode.h, as in error_object.code
,
which can also be retrieved with the C function box_error_code().
The value here will be the same as the lower part of the Response-Code-Indicator value.
MP_ERROR_FIELDS
(0x06) (MP_MAPs) Additional fields depending on error
type. For example, if MP_ERROR_TYPE
is “AccessDeniedError”, then MP_ERROR_FIELDS
will include “object_type”, “object_name”, “access_type”. This field will be
omitted from the response body if there are no additional fields available.
Client and connector programmers should ensure that unknown map keys are ignored, and should check for addition of new keys in the Tarantool source code file where error object creation is defined. In version 2.4.1 the name of this source code file is mp_error.cc.
For example, in version 2.4.1 or later, if we try to create a duplicate space with
conn:eval([[box.schema.space.create('_space');]])
the server response will look like this:
ce 00 00 00 88 MP_UINT = HEADER + BODY SIZE
83 MP_MAP, size 3 (i.e. 3 items in header)
00 Response-Code-Indicator
ce 00 00 80 0a MP_UINT = hexadecimal 800a
01 IPROTO_SYNC
cf 00 00 00 00 00 00 00 05 MP_UINT = sync value
05 IPROTO_SCHEMA_VERSION
ce 00 00 00 4e MP_UINT = schema version value
82 MP_MAP, size 2
31 IPROTO_ERROR_24
bd 53 70 61 63 etc. MP_STR = "Space '_space' already exists"
52 IPROTO_ERROR
81 MP_MAP, size 1
00 MP_ERROR_STACK
91 MP_ARRAY, size 1
86 MP_MAP, size 6
00 MP_ERROR_TYPE
ab 43 6c 69 65 6e 74 etc. MP_STR = "ClientError"
02 MP_ERROR_LINE
cd MP_UINT = line number
01 MP_ERROR_FILE
aa 01 b6 62 75 69 6c etc. MP_STR "builtin/box/schema.lua"
03 MP_ERROR_MESSAGE
bd 53 70 61 63 65 20 etc. MP_STR = Space.'_space'.already.exists"
04 MP_ERROR_ERRNO
00 MP_UINT = error number
05 MP_ERROR_ERRCODE
0a MP_UINT = error code ER_SPACE_EXISTS
Since version 2.10.0.
The MessagePack EXT type MP_EXT
together with the extension type
MP_DATETIME
is a header for values of the DATETIME type.
It creates a container with a payload of 8 or 16 bytes.
MP_DATETIME
type is 4.
The MessagePack specification
defines d7
to mean fixext
with size 8 or d8
to mean fixext
with size 16.
So the datetime MessagePack representation looks like this:
+---------+----------------+==========+-----------------+
| MP_EXT | MP_DATETIME | seconds | nsec; tzoffset; |
| = d7/d8 | = 4 | | tzindex; |
+---------+----------------+==========+-----------------+
MessagePack data contains:
- Seconds (8 bytes) as an unencoded 64-bit signed integer stored in the little-endian order.
- The optional fields (8 bytes), if any of them have a non-zero value.
The fields include
nsec
,tzoffset
, andtzindex
packed in the little-endian order.
For more information about the datetime type, see datetime field type details and reference for the datetime module.
Since version 2.10.0.
The MessagePack EXT type MP_EXT
together with the extension type
MP_INTERVAL
is a header for values of the INTERVAL type.
MP_INTERVAL
type is 6.
The interval is saved as a variant of a map with a predefined number of known attribute names. If some attributes are undefined, they are omitted from the generated payload.
The interval MessagePack representation looks like this:
+--------+-------------------------+-------------+----------------+
| MP_EXT | Size of packed interval | MP_INTERVAL | PackedInterval |
+--------+-------------------------+-------------+----------------+
Packed interval consists of:
- Packed number of non-zero fields.
- Packed non-null fields.
Each packed field has the following structure:
+----------+=====================+
| field ID | field value |
+----------+=====================+
The number of defined (non-null) fields can be zero. In this case, the packed interval will be encoded as integer 0.
List of the field IDs:
- 0 – year
- 1 – month
- 2 – week
- 3 – day
- 4 – hour
- 5 – minute
- 6 – second
- 7 – nanosecond
- 8 – adjust
Example
Interval value 1 years, 200 months, -77 days
is encoded in the following way:
tarantool> I = datetime.interval.new{year = 1, month = 200, day = -77}
---
...
tarantool> I
---
- +1 years, 200 months, -77 days
...
tarantool> M = msgpack.encode(I)
---
...
tarantool> M
---
- !!binary xwsGBAABAczIA9CzCAE=
...
tarantool> tohex = function(s) return (s:gsub('.', function(c) return string.format('%02X ', string.byte(c)) end)) end
---
...
tarantool> tohex(M)
---
- 'C7 0B 06 04 00 01 01 CC C8 03 D0 B3 08 01 '
...
Where:
- C7 – MP_EXT
- 0B – size of a packed interval value (11 bytes)
- 06 – MP_INTERVAL type
- 04 – number of defined fields
- 00 – field ID (year)
- 01 – packed value
1
- 01 – field ID (month)
- CCC8 – packed value
200
- 03 – field ID (day)
- D0B3 – packed value
-77
- 08 – field ID (adjust)
- 01 – packed value
1
(DT_LIMIT)
For more information about the interval type, see interval field type details and description of the datetime module.