RLP encoding

RLP (recursive length prefix) is a common algorithm for encoding variable-length binary data. Data is RLP-encoded before it is stored on disk or transmitted over the network.

Theory

Encoding

Plain RLP can only deal with one type, the “item”, which is defined as:

  1. Byte string (bytes or bytearray in Python) or

  2. Sequence of items (usually list).

Some examples are:

  • b'\x00\xff'

  • empty list []

  • list of bytes [b'\x00', b'\x01\x03']

  • list of combinations [[], b'\x00', [b'\x00']]
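The definition above is recursive, so it can be checked with a short recursive function. The following is a minimal sketch; the name is_item is hypothetical and is not part of thor_devkit.

```python
from typing import Any


def is_item(value: Any) -> bool:
    """Return True if value is an RLP "item" as defined above (hypothetical helper)."""
    if isinstance(value, (bytes, bytearray)):
        return True  # case 1: a byte string
    if isinstance(value, (list, tuple)):
        # case 2: a sequence whose members are all items, checked recursively
        return all(is_item(member) for member in value)
    return False


print(is_item([[], b"\x00", [b"\x00"]]))  # True
print(is_item("0x00"))                    # False: str is not a byte string
```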

The encoded result is always a byte string:

RLP encoding diagram


Encoding algorithm

Given an item x as input, we define rlp_encode by the following algorithm:

Let concat be a function that joins the given bytes into a single byte sequence.

  1. If x is a single byte and 0x00 <= x <= 0x7F, rlp_encode(x) = x.

  2. Otherwise, if x is a byte string, let len(x) be the length of x in bytes and define the encoding as follows:

    • If 0 <= len(x) < 0x38 (note that the empty byte string meets this condition, as does b'\x80'):

      rlp_encode(x) = concat(0x80 + len(x), x)

      In this case the first byte is in the range [0x80; 0xB7].

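    • If 0x38 <= len(x) <= 0xFFFFFFFFFFFFFFFF:

      rlp_encode(x) = concat(0xB7 + len(len(x)), len(x), x)

      Here len(x) is written as a big-endian integer and len(len(x)) is the number of bytes it occupies (1 to 8). In this case the first byte is in the range [0xB8; 0xBF].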

    • For longer strings encoding is undefined.

  3. Otherwise, if x is a list, let s = concat(map(rlp_encode, x)) be the concatenation of the RLP encodings of all its items.

    • If 0 <= len(s) < 0x38 (note that the empty list matches):

      rlp_encode(x) = concat(0xC0 + len(s), s)

      In this case the first byte is in the range [0xC0; 0xF7].

    • If 0x38 <= len(s) <= 0xFFFFFFFFFFFFFFFF:

      rlp_encode(x) = concat(0xF7 + len(len(s)), len(s), s)

      In this case the first byte is in the range [0xF8; 0xFF].

    • For longer lists encoding is undefined.

See the Ethereum wiki for more details.

Encoding examples

x               rlp_encode(x)

b''             0x80
b'\x00'         0x00
b'\x0F'         0x0F
b'\x79'         0x79
b'\x80'         0x81 0x80
b'\xFF'         0x81 0xFF
b'foo'          0x83 0x66 0x6F 0x6F
[]              0xC0
[b'\x0F']       0xC1 0x0F
[b'\xEF']       0xC1 0x81 0xEF
[[], [[]]]      0xC3 0xC0 0xC1 0xC0

Serialization

However, real-world inputs are rarely pure bytes or lists. Some are complex key-value structures such as dict; others are numbers written as strings such as "0x123".

This module provides predefined conversions (serialization) for such inputs:

Actual RLP encoding diagram


API documentation

RLP Encoding/Decoding layer for “real-world” objects.

Classes:

ComplexCodec

Wrapper around BaseWrapper that implements RLP encoding.

BytesKind

Convert bytes type of Python object to RLP "item".

NumericKind

Serializer for number-like objects.

BlobKind

Serializer for 0x.... hex strings.

FixedBlobKind

Serializer for 0x.... fixed-length hex strings.

OptionalFixedBlobKind

Serializer for 0x.... fixed-length hex strings that may be None.

CompactFixedBlobKind

Serializer for 0x.... fixed-length hex strings that may start with zeros.

DictWrapper

A container for working with dict-like objects.

ListWrapper

Container for parsing a heterogeneous list.

HomoListWrapper

Container for parsing a homogeneous list.

AbstractSerializer

Abstract class for all serializers.

ScalarKind

Abstract class for all scalar serializers (they accept "basic" values).

BaseWrapper

Abstract serializer for complex types.

Deprecated:

NoneableFixedBlobKind

Deprecated alias for OptionalFixedBlobKind.

pack

Pack a Python object according to wrapper.

unpack

Unpack a serialized thing back into a dict/list or a Python basic type.

class thor_devkit.rlp.ComplexCodec(wrapper: thor_devkit.rlp.AbstractSerializer[Any])[source]

Bases: object

Wrapper around BaseWrapper that implements RLP encoding.

Abstract layer to join serialization and encoding (and reverse operations) together.

Attributes:

wrapper

BaseWrapper or ScalarKind to use for serialization.

Methods:

encode

Serialize and RLP-encode given high-level data to bytes.

decode

RLP-decode and deserialize given bytes into higher-level structure.

wrapper: thor_devkit.rlp.AbstractSerializer[Any]

BaseWrapper or ScalarKind to use for serialization.

encode(data: Any) bytes[source]

Serialize and RLP-encode given high-level data to bytes.

decode(data: bytes) Any[source]

RLP-decode and deserialize given bytes into higher-level structure.
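To make the "serialize, then RLP-encode" layering concrete, here is a small self-contained sketch of the encode direction only. Everything in it is hypothetical (SketchCodec, NumberListSerializer, encode_short) and only mimics the shape of ComplexCodec; the encoder is restricted to payloads shorter than 56 bytes to keep the example brief.

```python
from typing import Any, List, Union

Item = Union[bytes, List[Any]]


def encode_short(item: Item) -> bytes:
    """RLP-encode an item; long payloads (>= 56 bytes) are left out of this sketch."""
    if isinstance(item, bytes):
        if len(item) == 1 and item[0] <= 0x7F:
            return item
        payload, offset = item, 0x80
    else:
        payload, offset = b"".join(encode_short(m) for m in item), 0xC0
    if len(payload) >= 0x38:
        raise NotImplementedError("long form omitted from this sketch")
    return bytes([offset + len(payload)]) + payload


class NumberListSerializer:
    """Hypothetical serializer: list of non-negative ints -> list of byte strings."""

    def serialize(self, obj: List[int]) -> List[bytes]:
        return [n.to_bytes((n.bit_length() + 7) // 8, "big") for n in obj]


class SketchCodec:
    """Joins a serializer with the encoder, in the spirit of ComplexCodec.encode."""

    def __init__(self, wrapper: Any) -> None:
        self.wrapper = wrapper

    def encode(self, data: Any) -> bytes:
        # Step 1: high-level data -> RLP items; step 2: items -> bytes.
        return encode_short(self.wrapper.serialize(data))


codec = SketchCodec(NumberListSerializer())
print(codec.encode([1, 0x123, 0]).hex())  # c50182012380
```

Decoding would apply the same two steps in reverse order: RLP-decode the bytes into items, then hand the items to the wrapper's deserialize method.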

class thor_devkit.rlp.BytesKind[source]

Bases: thor_devkit.rlp.ScalarKind[bytes]

Convert bytes type of Python object to RLP “item”.

Methods:

is_valid_type

Confirm that obj is bytes or bytearray.

serialize

Serialize the object into a RLP encodable "item".

deserialize

Deserialize a RLP "item" back to bytes.

classmethod is_valid_type(obj: object) TypeGuard[bytes][source]

Confirm that obj is bytes or bytearray.

serialize(obj: bytes) bytes[source]

Serialize the object into a RLP encodable “item”.

Parameters

obj (bytes) – The input.

Returns

The “item” in bytes.

Return type

bytes

Raises

TypeError – If input is not bytes.

deserialize(serial: bytes) bytes[source]

Deserialize a RLP “item” back to bytes.

Parameters

serial (bytes) – The input.

Returns

Original bytes.

Return type

bytes

Raises

TypeError – If input is not bytes.

class thor_devkit.rlp.NumericKind(max_bytes: Optional[int] = None)[source]

Bases: rlp.sedes.big_endian_int.BigEndianInt, thor_devkit.rlp.ScalarKind[int]

Serializer for number-like objects.

Good examples are:

'0x0', '0x123', '0', '100', 0, 0x123, True

Bad examples are:

'0x123z', {}, '0x', -1, '0x12345678123456780'

Changed in version 2.0.0: Allowed bool values True and False.

Initialize a NumericKind.

Parameters

max_bytes (Optional[int], optional) – Max bytes in the encoded result (prepend 0 if there’s not enough)

Attributes:

max_bytes

Maximal allowed size of number, in bytes.

Methods:

serialize

Serialize the object into a RLP encodable "item".

deserialize

Deserialize bytes to int.

max_bytes: Optional[int]

Maximal allowed size of number, in bytes.

serialize(obj: Union[str, int]) bytes[source]

Serialize the object into a RLP encodable “item”.

Parameters

obj (str or int) – Either an int or a string representation of an int that int() can parse.

Returns

Serialized data

Return type

bytes

Raises
deserialize(serial: bytes) int[source]

Deserialize bytes to int.

Parameters

serial (bytes) – Bytes to deserialize.

Returns

Deserialized number.

Return type

int

Raises

DeserializationError – If the bytes contain a leading zero.
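The good and bad examples listed for NumericKind can be reproduced with a few lines of standard-library Python. This is a sketch of the described behaviour, not the library code: the function names are hypothetical, ValueError and TypeError stand in for the library's own exception classes, and max_bytes is treated purely as an upper bound on the encoded size.

```python
from typing import Optional, Union


def serialize_number(obj: Union[str, int], max_bytes: Optional[int] = None) -> bytes:
    """Turn an int, bool, or numeric string into big-endian bytes without leading zeros."""
    if isinstance(obj, str):
        base = 16 if obj.lower().startswith("0x") else 10
        try:
            number = int(obj, base)  # '0x' and '0x123z' both fail here
        except ValueError as exc:
            raise ValueError(f"not a number: {obj!r}") from exc
    elif isinstance(obj, int):       # bool is a subclass of int, so True is accepted
        number = int(obj)
    else:
        raise TypeError(f"unsupported type: {type(obj).__name__}")
    if number < 0:
        raise ValueError("negative numbers are not allowed")
    result = number.to_bytes((number.bit_length() + 7) // 8, "big")
    if max_bytes is not None and len(result) > max_bytes:
        raise ValueError(f"number does not fit in {max_bytes} bytes")
    return result


def deserialize_number(serial: bytes) -> int:
    """Inverse operation; a leading zero byte is rejected as non-canonical."""
    if serial[:1] == b"\x00":
        raise ValueError("leading zero byte")
    return int.from_bytes(serial, "big")


print(serialize_number("0x123").hex())       # 0123
print(deserialize_number(b"\x01\x23"))       # 291
```

Zero serializes to the empty byte string here, which is why a leading zero byte never appears in valid output.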

class thor_devkit.rlp.BlobKind[source]

Bases: thor_devkit.rlp.ScalarKind[str]

Serializer for 0x.... hex strings.

Used for strings that shouldn’t be interpreted as a number, usually an identifier.

Examples: address, block_ref, data to smart contract.

Methods:

serialize

Serialize a 0x... string to bytes.

deserialize

Deserialize bytes to 0x... string.

serialize(obj: str) bytes[source]

Serialize a 0x... string to bytes.

Parameters

obj (str) – 0x... style string.

Returns

Encoded string.

Return type

bytes

Raises

SerializationError – If input data is malformed.

deserialize(serial: bytes) str[source]

Deserialize bytes to 0x... string.

Parameters

serial (bytes) – Encoded string.

Returns

String of style 0x...

Return type

str

Raises

TypeError – If input is neither bytes nor bytearray.
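The conversion a blob serializer performs is a plain hex round trip. The sketch below uses hypothetical function names and ValueError in place of the library's SerializationError; rejecting an odd number of hex digits is an assumption about what counts as "malformed".

```python
import string


def serialize_blob(obj: str) -> bytes:
    """Convert a 0x-prefixed hex string to the bytes it denotes."""
    if not isinstance(obj, str) or not obj.startswith("0x"):
        raise ValueError("expected a 0x-prefixed hex string")
    digits = obj[2:]
    if len(digits) % 2 or any(c not in string.hexdigits for c in digits):
        raise ValueError("expected an even number of hex digits")
    return bytes.fromhex(digits)


def deserialize_blob(serial: bytes) -> str:
    """Convert bytes back to a 0x-prefixed hex string."""
    if not isinstance(serial, (bytes, bytearray)):
        raise TypeError("expected bytes or bytearray")
    return "0x" + bytes(serial).hex()


print(serialize_blob("0x1234"))          # b'\x124'
print(deserialize_blob(b"\x12\x34"))     # 0x1234
```

Unlike a numeric serializer, leading zeros are kept: "0x0012" and "0x12" are different blobs.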

class thor_devkit.rlp.FixedBlobKind(byte_length: int)[source]

Bases: thor_devkit.rlp.BlobKind

Serializer for 0x.... fixed-length hex strings.

Used for strings that shouldn’t be interpreted as a number, usually an identifier. Examples: address, block_ref, data to smart contract.

Note

This kind has a fixed length in bytes, so the input hex string must also be of fixed length.

Attributes:

byte_length

Length of blob, in bytes.

Methods:

serialize

Serialize a 0x... string to bytes.

deserialize

Deserialize bytes to 0x... string.

byte_length: int

Length of blob, in bytes.

serialize(obj: str) bytes[source]

Serialize a 0x... string to bytes.

Parameters

obj (str) – 0x... style string.

Returns

Encoded string.

Return type

bytes

Raises

SerializationError – If input data is malformed (e.g. wrong length)

deserialize(serial: bytes) str[source]

Deserialize bytes to 0x... string.

Parameters

serial (bytes) – Encoded string.

Returns

String of style 0x...

Return type

str

Raises

DeserializationError – If input is malformed (e.g. wrong length)
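The only difference from an unconstrained blob is a length check in both directions. A minimal sketch with hypothetical function names, passing byte_length as an argument instead of storing it on an instance, and using ValueError in place of the library's exception classes:

```python
def serialize_fixed_blob(obj: str, byte_length: int) -> bytes:
    """Hex string -> bytes, insisting on exactly `byte_length` bytes."""
    if not obj.startswith("0x"):
        raise ValueError("expected a 0x-prefixed hex string")
    data = bytes.fromhex(obj[2:])  # raises ValueError on non-hex input
    if len(data) != byte_length:
        raise ValueError(f"expected {byte_length} bytes, got {len(data)}")
    return data


def deserialize_fixed_blob(serial: bytes, byte_length: int) -> str:
    """Bytes -> hex string, insisting on exactly `byte_length` bytes."""
    if len(serial) != byte_length:
        raise ValueError(f"expected {byte_length} bytes, got {len(serial)}")
    return "0x" + serial.hex()


print(deserialize_fixed_blob(serialize_fixed_blob("0x12345678", 4), 4))  # 0x12345678
```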

class thor_devkit.rlp.OptionalFixedBlobKind(byte_length: int)[source]

Bases: thor_devkit.rlp.FixedBlobKind

Serializer for 0x.... fixed-length hex strings that may be None.

Used for strings that shouldn’t be interpreted as a number, usually an identifier. Examples: address, block_ref, data to smart contract.

Note

This kind has a fixed length in bytes, so the input hex string must also be of fixed length.

For this kind the input may be None, in which case the decoded value is also None.

Methods:

serialize

Serialize a 0x... string or None to bytes.

deserialize

Deserialize bytes to 0x... string or None.

serialize(obj: Optional[str] = None) bytes[source]

Serialize a 0x... string or None to bytes.

Parameters

obj (Optional[str], default: None) – 0x... style string.

Returns

Encoded string.

Return type

bytes

deserialize(serial: bytes) Optional[str][source]

Deserialize bytes to 0x... string or None.

Parameters

serial (bytes) – Serialized data.

Returns

String of style 0x... or None

Return type

Optional[str]
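One way to let None survive a round trip is to map it to the empty byte string, which can never collide with a valid fixed-length blob of non-zero length. That mapping is an assumption made for this sketch, as are the function names; the section above only states that None in gives None out.

```python
from typing import Optional


def serialize_optional_blob(obj: Optional[str], byte_length: int) -> bytes:
    """None -> b'' (assumed convention); otherwise a fixed-length hex blob."""
    if obj is None:
        return b""
    if not obj.startswith("0x"):
        raise ValueError("expected a 0x-prefixed hex string")
    data = bytes.fromhex(obj[2:])
    if len(data) != byte_length:
        raise ValueError(f"expected {byte_length} bytes, got {len(data)}")
    return data


def deserialize_optional_blob(serial: bytes, byte_length: int) -> Optional[str]:
    """b'' -> None (assumed convention); otherwise a fixed-length hex string."""
    if len(serial) == 0:
        return None
    if len(serial) != byte_length:
        raise ValueError(f"expected {byte_length} bytes, got {len(serial)}")
    return "0x" + serial.hex()


print(deserialize_optional_blob(serialize_optional_blob(None, 4), 4))  # None
```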

class thor_devkit.rlp.CompactFixedBlobKind(byte_length: int)[source]

Bases: thor_devkit.rlp.FixedBlobKind

Serializer for 0x.... fixed-length hex strings that may start with zeros.

Used for strings that shouldn’t be interpreted as a number, usually an identifier. Examples: address, block_ref, data to smart contract.

Note

When encoding, leading zeros are removed from the fixed-length bytes, e.g. 000123 -> 123.

When decoding, the input is expected to be no longer than the fixed length, and the leading zeros are padded back, giving output of the form '0x{"0" * n}xxx...'.

Methods:

serialize

Serialize a 0x... string to bytes, stripping leading zeroes.

deserialize

Deserialize bytes to 0x... string.

serialize(obj: str) bytes[source]

Serialize a 0x... string to bytes, stripping leading zeroes.

Parameters

obj (str) – 0x... style string.

Returns

Encoded string with leading zeroes removed.

Return type

bytes

deserialize(serial: bytes) str[source]

Deserialize bytes to 0x... string.

Parameters

serial (bytes) – Encoded data.

Returns

String of style 0x... of fixed length

Return type

str

Raises

DeserializationError – If input is malformed.
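The strip-then-pad behaviour described in the note can be sketched with bytes.lstrip and bytes.rjust. The function names are hypothetical, ValueError replaces the library's exception classes, and representing an all-zero blob as the empty byte string is an assumption of this sketch that may differ from the library.

```python
def serialize_compact_blob(obj: str, byte_length: int) -> bytes:
    """Fixed-length hex string -> bytes with leading zero bytes removed."""
    if not obj.startswith("0x"):
        raise ValueError("expected a 0x-prefixed hex string")
    data = bytes.fromhex(obj[2:])
    if len(data) != byte_length:
        raise ValueError(f"expected {byte_length} bytes, got {len(data)}")
    return data.lstrip(b"\x00")


def deserialize_compact_blob(serial: bytes, byte_length: int) -> str:
    """Bytes of at most `byte_length` -> hex string padded back to full length."""
    if len(serial) > byte_length:
        raise ValueError(f"expected at most {byte_length} bytes, got {len(serial)}")
    return "0x" + serial.rjust(byte_length, b"\x00").hex()


packed = serialize_compact_blob("0x00000123", 4)
print(packed.hex())                            # 0123
print(deserialize_compact_blob(packed, 4))     # 0x00000123
```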

class thor_devkit.rlp.DictWrapper(codecs: Union[Sequence[Tuple[str, thor_devkit.rlp.AbstractSerializer[Any]]], Mapping[str, thor_devkit.rlp.AbstractSerializer[Any]]])[source]

Bases: thor_devkit.rlp.BaseWrapper[Mapping[str, Any]]

A container for working with dict-like objects.

Create wrapper from items.

Parameters

codecs (Mapping[str, BaseWrapper or ScalarKind] or its .items()-like list) –

Codecs to use. Possible values (codec is any BaseWrapper or ScalarKind):

  • Any mapping from str to codec, e.g. {'foo': NumericKind()}

  • Any sequence of tuples (name, codec), e.g. [('foo', NumericKind())]

Attributes:

keys

Field names.

codecs

Codecs to use for each field.

Methods:

serialize

Serialize dictionary to sequence of serialized values.

deserialize

Deserialize sequence of encoded values to dictionary with deserialized values.

keys: Sequence[str]

Field names.

codecs: Sequence[thor_devkit.rlp.AbstractSerializer[Any]]

Codecs to use for each field.

serialize(obj: Mapping[str, Any]) Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Any]]]]]]][source]

Serialize dictionary to sequence of serialized values.

New in version 2.0.0.

Parameters

obj (Mapping[str, Any]) – Dictionary to serialize.

Returns

Sequence of serialized values.

Return type

Sequence[bytes or Sequence[…]] (recursive)

Raises

SerializationError – If input is malformed.

deserialize(serial: Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Any]]]]]]]) Dict[str, Any][source]

Deserialize sequence of encoded values to dictionary with deserialized values.

New in version 2.0.0.

Parameters

serial (Sequence[bytes or Sequence[...]] (recursive)) – Sequence of values to deserialize.

Returns

Deserialized values, mapping field names to decoded values.

Return type

Mapping[str, Any]

Raises

DeserializationError – If input is malformed.
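Since RLP has no notion of keys, a dictionary has to be flattened into a list whose order is fixed by the codec declaration. The sketch below shows that idea with hypothetical classes (DictSketch and the stand-in codec NumberCodec); it is not the library's DictWrapper, and it raises ValueError where the library would raise its own exceptions.

```python
from typing import Any, Dict, List, Mapping, Sequence, Tuple


class NumberCodec:
    """Hypothetical scalar codec: non-negative int <-> minimal big-endian bytes."""

    def serialize(self, obj: int) -> bytes:
        return obj.to_bytes((obj.bit_length() + 7) // 8, "big")

    def deserialize(self, serial: bytes) -> int:
        return int.from_bytes(serial, "big")


class DictSketch:
    def __init__(self, codecs: Sequence[Tuple[str, Any]]) -> None:
        self.keys = [name for name, _ in codecs]
        self.codecs = [codec for _, codec in codecs]

    def serialize(self, obj: Mapping[str, Any]) -> List[Any]:
        if set(obj) != set(self.keys):
            raise ValueError("keys do not match the declared fields")
        # Output order follows the declaration, not the order of the input dict.
        return [c.serialize(obj[k]) for k, c in zip(self.keys, self.codecs)]

    def deserialize(self, serial: Sequence[Any]) -> Dict[str, Any]:
        if len(serial) != len(self.keys):
            raise ValueError("wrong number of fields")
        return {k: c.deserialize(part)
                for k, c, part in zip(self.keys, self.codecs, serial)}


wrapper = DictSketch([("foo", NumberCodec()), ("bar", NumberCodec())])
print(wrapper.serialize({"bar": 2, "foo": 1}))    # [b'\x01', b'\x02']
print(wrapper.deserialize([b"\x01", b"\x02"]))    # {'foo': 1, 'bar': 2}
```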

class thor_devkit.rlp.ListWrapper(codecs: Sequence[thor_devkit.rlp.AbstractSerializer[Any]])[source]

Bases: thor_devkit.rlp.BaseWrapper[Sequence[Any]]

Container for parsing a heterogeneous list.

The items in the list can be of different types.

Create wrapper from items.

Parameters

codecs (Sequence[AbstractSerializer]) – A list of codecs, e.g. [codec, codec, codec…], where each codec is either a BaseWrapper or a ScalarKind.

Attributes:

codecs

Codecs to use for each element of sequence.

Methods:

serialize

Serialize sequence (list) of values to sequence of serialized values.

deserialize

Deserialize sequence of encoded values to sequence.

codecs: Sequence[thor_devkit.rlp.AbstractSerializer[Any]]

Codecs to use for each element of sequence.

serialize(obj: Sequence[Any]) Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Any]]]]]]][source]

Serialize sequence (list) of values to sequence of serialized values.

New in version 2.0.0.

Parameters

obj (Sequence[Any]) – Sequence of values to serialize.

Returns

Sequence of serialized values.

Return type

Sequence[bytes or Sequence[…]] (recursive)

Raises

SerializationError – If input is malformed.

deserialize(serial: Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Any]]]]]]]) Sequence[Any][source]

Deserialize sequence of encoded values to sequence.

New in version 2.0.0.

Parameters

serial (Sequence[bytes or Sequence[...]] (recursive)) – Sequence of values to deserialize.

Returns

Deserialized values.

Return type

Sequence[Any]

Raises

DeserializationError – If input is malformed.
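A heterogeneous list pairs each position with its own codec, so the list length is fixed by the number of codecs. A minimal sketch, with hypothetical stand-in codecs (NumberCodec, BlobCodec) and helper functions rather than the library classes:

```python
from typing import Any, List, Sequence


class NumberCodec:
    """Hypothetical codec: non-negative int <-> minimal big-endian bytes."""

    def serialize(self, obj: int) -> bytes:
        return obj.to_bytes((obj.bit_length() + 7) // 8, "big")

    def deserialize(self, serial: bytes) -> int:
        return int.from_bytes(serial, "big")


class BlobCodec:
    """Hypothetical codec: 0x-prefixed hex string <-> bytes."""

    def serialize(self, obj: str) -> bytes:
        return bytes.fromhex(obj[2:])

    def deserialize(self, serial: bytes) -> str:
        return "0x" + serial.hex()


def serialize_list(obj: Sequence[Any], codecs: Sequence[Any]) -> List[Any]:
    if len(obj) != len(codecs):
        raise ValueError("one codec is required per list element")
    return [codec.serialize(value) for codec, value in zip(codecs, obj)]


def deserialize_list(serial: Sequence[Any], codecs: Sequence[Any]) -> List[Any]:
    if len(serial) != len(codecs):
        raise ValueError("one codec is required per list element")
    return [codec.deserialize(part) for codec, part in zip(codecs, serial)]


codecs = [NumberCodec(), BlobCodec()]
print(serialize_list([291, "0xabcd"], codecs))            # [b'\x01#', b'\xab\xcd']
print(deserialize_list([b"\x01\x23", b"\xab\xcd"], codecs))  # [291, '0xabcd']
```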

class thor_devkit.rlp.HomoListWrapper(codec: thor_devkit.rlp.AbstractSerializer[Any])[source]

Bases: thor_devkit.rlp.BaseWrapper[Sequence[Any]]

Container for parsing a homogeneous list.

Used when the items in the list are of the same type.

Create wrapper from items.

Parameters

codec (AbstractSerializer) – Either a BaseWrapper or a ScalarKind.

Attributes:

codec

Codec to use for each element of array.

Methods:

serialize

Serialize sequence (list) of values to sequence of serialized values.

deserialize

Deserialize sequence of encoded values to sequence.

codec: thor_devkit.rlp.AbstractSerializer[Any]

Codec to use for each element of array.

serialize(obj: Sequence[Any]) Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Any]]]]]]][source]

Serialize sequence (list) of values to sequence of serialized values.

New in version 2.0.0.

Parameters

obj (Sequence[Any]) – Sequence of values to serialize.

Returns

Sequence of serialized values.

Return type

Sequence[bytes or Sequence[…]] (recursive)

Raises

SerializationError – If input is malformed.

deserialize(serial: Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Any]]]]]]]) Sequence[Any][source]

Deserialize sequence of encoded values to sequence.

New in version 2.0.0.

Parameters

serial (Sequence[bytes or Sequence[...]] (recursive)) – Sequence of values to deserialize.

Returns

Deserialized values.

Return type

Sequence[Any]

Raises

DeserializationError – If input is malformed.
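For a homogeneous list a single codec is applied to every element, so the list may have any length, including zero. The sketch below uses hypothetical names (NumberCodec and the two helper functions) and is not the library's HomoListWrapper.

```python
from typing import Any, List, Sequence


class NumberCodec:
    """Hypothetical codec: non-negative int <-> minimal big-endian bytes."""

    def serialize(self, obj: int) -> bytes:
        return obj.to_bytes((obj.bit_length() + 7) // 8, "big")

    def deserialize(self, serial: bytes) -> int:
        return int.from_bytes(serial, "big")


def serialize_homo_list(obj: Sequence[Any], codec: Any) -> List[Any]:
    # The same codec handles every element.
    return [codec.serialize(value) for value in obj]


def deserialize_homo_list(serial: Sequence[Any], codec: Any) -> List[Any]:
    return [codec.deserialize(part) for part in serial]


codec = NumberCodec()
print(serialize_homo_list([1, 2, 256], codec))   # [b'\x01', b'\x02', b'\x01\x00']
print(deserialize_homo_list([b"\x01", b"\x02", b"\x01\x00"], codec))  # [1, 2, 256]
```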

class thor_devkit.rlp.AbstractSerializer[source]

Bases: Generic[thor_devkit.rlp._T], abc.ABC

Abstract class for all serializers.

New in version 2.0.0.

Methods:

serialize

Serialize the object into a RLP encodable "item".

deserialize

Deserialize given bytes into higher-level object.

abstract serialize(_AbstractSerializer__obj: thor_devkit.rlp._T) Union[bytes, Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Any]]]]]]]][source]

Serialize the object into a RLP encodable “item”.

abstract deserialize(_AbstractSerializer__serial: Any) thor_devkit.rlp._T[source]

Deserialize given bytes into higher-level object.
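The serialize/deserialize pair is the whole contract a serializer has to meet. The sketch below rebuilds that contract with the standard library to show how a custom kind could look; SerializerSketch and Utf8Kind are hypothetical names, and a real custom serializer would presumably subclass the library's own abstract classes instead.

```python
from abc import ABC, abstractmethod
from typing import Any, Generic, TypeVar

T = TypeVar("T")


class SerializerSketch(Generic[T], ABC):
    """Stand-in for an abstract serializer: two abstract methods, nothing else."""

    @abstractmethod
    def serialize(self, obj: T) -> Any:
        """Turn a high-level object into an RLP item."""

    @abstractmethod
    def deserialize(self, serial: Any) -> T:
        """Turn an RLP item back into a high-level object."""


class Utf8Kind(SerializerSketch[str]):
    """Hypothetical scalar kind: text <-> UTF-8 bytes."""

    def serialize(self, obj: str) -> bytes:
        return obj.encode("utf-8")

    def deserialize(self, serial: bytes) -> str:
        return serial.decode("utf-8")


kind = Utf8Kind()
print(kind.serialize("foo"))        # b'foo'
print(kind.deserialize(b"foo"))     # foo
```

Because both methods are abstract, instantiating SerializerSketch directly raises TypeError; only concrete subclasses such as Utf8Kind can be created.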

class thor_devkit.rlp.ScalarKind[source]

Bases: thor_devkit.rlp.AbstractSerializer[thor_devkit.rlp._T]

Abstract class for all scalar serializers (they accept “basic” values).

Methods:

serialize

Serialize the object into a RLP encodable "item".

deserialize

Deserialize given bytes into higher-level object.

abstract serialize(_ScalarKind__obj: thor_devkit.rlp._T) bytes[source]

Serialize the object into a RLP encodable “item”.

abstract deserialize(_ScalarKind__serial: bytes) thor_devkit.rlp._T[source]

Deserialize given bytes into higher-level object.

class thor_devkit.rlp.BaseWrapper[source]

Bases: thor_devkit.rlp.AbstractSerializer[thor_devkit.rlp._T]

Abstract serializer for complex types.

Methods:

serialize

Serialize the object into a RLP encodable "item".

deserialize

Deserialize given bytes into higher-level object.

abstract serialize(_BaseWrapper__obj: thor_devkit.rlp._T) Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Any]]]]]]][source]

Serialize the object into a RLP encodable “item”.

New in version 2.0.0.

abstract deserialize(_BaseWrapper__serial: Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Any]]]]]]]) thor_devkit.rlp._T[source]

Deserialize given bytes into higher-level object.

New in version 2.0.0.

class thor_devkit.rlp.NoneableFixedBlobKind(*args: Any, **kwargs: Any)[source]

Bases: thor_devkit.rlp.OptionalFixedBlobKind

Deprecated alias for OptionalFixedBlobKind.

Deprecated since version 2.0.0: Use OptionalFixedBlobKind instead.

thor_devkit.rlp.pack(obj: Any, wrapper: thor_devkit.rlp.AbstractSerializer[Any]) Union[bytes, Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Any]]]]]]]][source]

Pack a Python object according to wrapper.

Deprecated since version 2.0.0: Use <wrapper>.serialize directly instead.

Parameters
  • obj (Any) – A dict, a list, or a string/int/any…

  • wrapper (AbstractSerializer[Any]) – A Wrapper.

Returns

  • bytes – If obj is a basic type.

  • List of packed items – If obj is dict/list.

Raises
thor_devkit.rlp.unpack(packed: Union[bytes, Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Union[bytes, Sequence[Any]]]]]]]], wrapper: thor_devkit.rlp.AbstractSerializer[Any]) Union[Dict[str, Any], List[Any], Any][source]

Unpack a serialized thing back into a dict/list or a Python basic type.

Deprecated since version 2.0.0: Use <wrapper>.deserialize directly instead.

Parameters
  • packed (bytes or sequence of them) – A list of RLP encoded or pure bytes (may be nested).

  • wrapper (AbstractSerializer[Any]) – The Wrapper.

Returns

dict/list if the wrapper instruction is dict/list, Python basic type if input is bytes.

Return type

Dict[str, Any] or List[Any] or Any

Raises