
Bencode

Bencode (pronounced like Bee-encode) is the encoding used by the peer-to-peer file sharing system BitTorrent for storing and transmitting loosely structured data.[1]

It supports four different types of values: byte strings, integers, lists, and dictionaries (associative arrays).

Bencoding is most commonly used in torrent files, and as such is part of the BitTorrent specification. These metadata files are simply bencoded dictionaries.

Bencoding is simple and (because numbers are encoded as text in decimal notation) is unaffected by endianness, which is important for a cross-platform application like BitTorrent. It is also fairly flexible, as long as applications ignore unexpected dictionary keys, so that new ones can be added without creating incompatibilities.
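The endianness point can be illustrated with a short sketch. It contrasts a fixed-width binary integer, whose bytes depend on byte order, with the decimal-text form that bencode uses. The variable names are illustrative only and come from no BitTorrent implementation.

```python
import struct

n = 42

# A fixed-width binary integer depends on the byte order chosen by the writer.
little = struct.pack("<I", n)  # b'*\x00\x00\x00'
big = struct.pack(">I", n)     # b'\x00\x00\x00*'

# Bencode writes the decimal digits as ASCII text between 'i' and 'e',
# so every platform produces the same bytes.
bencoded = b"i" + str(n).encode("ascii") + b"e"  # b'i42e'

print(little != big)  # the two binary layouts differ
print(bencoded)
```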

Encoding algorithm

Bencode uses ASCII characters as delimiters and digits.

  • An integer is encoded as i<integer encoded in base ten ASCII>e. Leading zeros are not allowed (although the number zero is still represented as "0"). Negative values are encoded by prefixing the number with a hyphen-minus. The number 42 would thus be encoded as i42e, 0 as i0e, and -42 as i-42e. Negative zero is not permitted.
  • A byte string (a sequence of bytes, not necessarily characters) is encoded as <length>:<contents>. The length is encoded in base 10, like integers, but must be non-negative (zero is allowed); the contents are just the bytes that make up the string. The string "spam" would be encoded as 4:spam. This format is identical to netstrings, except that netstrings additionally append a comma after the byte sequence. The specification does not deal with encoding of characters outside the ASCII set; to mitigate this, some BitTorrent applications explicitly communicate the encoding (most commonly UTF-8) in various non-standard ways.
  • A list of values is encoded as l<contents>e. The contents consist of the bencoded elements of the list, in order, concatenated. A list consisting of the string "spam" and the number 42 would be encoded as l4:spami42ee. Note that there are no separators between elements, and that the first character is the letter 'l', not the digit '1'.
  • A dictionary is encoded as d<contents>e. The elements of the dictionary are encoded with each key immediately followed by its value. All keys must be byte strings and must appear in lexicographical order (compared as raw byte strings). A dictionary that associates the values 42 and "spam" with the keys "foo" and "bar", respectively (in other words, {"bar": "spam", "foo": 42}), would be encoded as follows: d3:bar4:spam3:fooi42ee.
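The four rules above can be sketched as a small recursive encoder. This is a minimal illustration rather than the reference implementation: the function name bencode is hypothetical, and mapping the four types to Python's int, bytes, list, and dict is an assumption made for the example.

```python
def bencode(value):
    """Encode an int, bytes, list, or dict as bencoded bytes (illustrative sketch)."""
    if isinstance(value, bool):
        # bool is a subclass of int in Python; bencode has no boolean type.
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        # i<decimal digits>e; str() never emits leading zeros or "-0".
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, bytes):
        # <length>:<contents>
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        # l<elements concatenated without separators>e
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        if not all(isinstance(key, bytes) for key in value):
            raise TypeError("dictionary keys must be byte strings")
        # Keys are written in sorted order, each followed by its value.
        body = b"".join(bencode(key) + bencode(value[key]) for key in sorted(value))
        return b"d" + body + b"e"
    raise TypeError("cannot bencode value of type %s" % type(value).__name__)


print(bencode(42))                              # b'i42e'
print(bencode(b"spam"))                         # b'4:spam'
print(bencode([b"spam", 42]))                   # b'l4:spami42ee'
print(bencode({b"foo": 42, b"bar": b"spam"}))   # b'd3:bar4:spam3:fooi42ee'
```

Python compares bytes objects byte by byte, so sorted() yields the lexicographical key order the format requires.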

There are no restrictions on what kind of values may be stored in lists and dictionaries; they may (and usually do) contain other lists and dictionaries. This allows for arbitrarily complex data structures to be encoded.
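Because values nest freely, a decoder is naturally recursive. The sketch below is one possible strict decoder, not the one used by any particular client. The names bdecode, _decode_at, and _check_number are hypothetical, and rejecting unsorted or duplicate dictionary keys is a design choice made here to enforce the single valid encoding.

```python
def _check_number(text):
    """Reject empty text, leading zeros, and negative zero."""
    digits = text[1:] if text.startswith(b"-") else text
    if not digits.isdigit() or (len(digits) > 1 and digits.startswith(b"0")) or text == b"-0":
        raise ValueError("malformed number: %r" % text)


def _decode_at(data, pos):
    """Decode one value starting at offset pos; return (value, next offset)."""
    lead = data[pos:pos + 1]
    if lead == b"i":
        end = data.index(b"e", pos)          # raises ValueError if unterminated
        _check_number(data[pos + 1:end])
        return int(data[pos + 1:end]), end + 1
    if lead.isdigit():
        colon = data.index(b":", pos)
        _check_number(data[pos:colon])
        start = colon + 1
        stop = start + int(data[pos:colon])
        if stop > len(data):
            raise ValueError("byte string is truncated")
        return data[start:stop], stop
    if lead == b"l":
        pos += 1
        items = []
        while data[pos:pos + 1] != b"e":
            item, pos = _decode_at(data, pos)  # elements may themselves be lists or dicts
            items.append(item)
        return items, pos + 1
    if lead == b"d":
        pos += 1
        result, last_key = {}, None
        while data[pos:pos + 1] != b"e":
            key, pos = _decode_at(data, pos)
            if not isinstance(key, bytes):
                raise ValueError("dictionary keys must be byte strings")
            if last_key is not None and key <= last_key:
                raise ValueError("dictionary keys must be sorted and unique")
            result[key], pos = _decode_at(data, pos)
            last_key = key
        return result, pos + 1
    raise ValueError("unexpected data at offset %d" % pos)


def bdecode(data):
    """Decode a complete bencoded value from bytes."""
    value, pos = _decode_at(data, 0)
    if pos != len(data):
        raise ValueError("trailing data after bencoded value")
    return value


# A dictionary holding a list, which in turn holds a string and an integer.
print(bdecode(b"d4:listl4:spami42eee"))
```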

Features & drawbacks

Bencode is a very specialized kind of binary coding with some unique properties:

  • For each possible (complex) value, there is only a single valid bencoding; i.e. there is a bijection between values and their encodings. This has the advantage that applications may compare bencoded values by comparing their encoded forms, eliminating the need to decode the values.
  • Many bencoded values can be decoded by hand. However, because bencoded values often contain binary data, manual decoding can become quite complex, and bencode is not considered a human-readable encoding format.
  • Bencoding serves similar purposes as data languages like JSON and YAML, allowing complex yet loosely structured data to be stored in a platform independent way.
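The first property above, one valid encoding per value, means two values can be compared or hashed through their encoded bytes alone. The sketch below assumes a compact helper, canonical_bytes, written only for this example; using SHA-1 as the digest is likewise an illustrative choice.

```python
import hashlib


def canonical_bytes(value):
    """Compact illustrative encoder; dictionary items are written in sorted key order."""
    if isinstance(value, int):
        return b"i" + str(value).encode("ascii") + b"e"
    if isinstance(value, bytes):
        return str(len(value)).encode("ascii") + b":" + value
    if isinstance(value, list):
        return b"l" + b"".join(map(canonical_bytes, value)) + b"e"
    if isinstance(value, dict):
        items = sorted(value.items())  # keys are unique, so this sorts by key
        return b"d" + b"".join(canonical_bytes(k) + canonical_bytes(v) for k, v in items) + b"e"
    raise TypeError("unsupported type")


# The same dictionary built in two different insertion orders.
first = {b"foo": 42, b"bar": b"spam"}
second = {b"bar": b"spam", b"foo": 42}
changed = {b"bar": b"spam", b"foo": 43}

# Equal values give identical bytes, so their digests match as well.
same_bytes = canonical_bytes(first) == canonical_bytes(second)
digest_first = hashlib.sha1(canonical_bytes(first)).hexdigest()
digest_second = hashlib.sha1(canonical_bytes(second)).hexdigest()

print(same_bytes, digest_first == digest_second)
print(canonical_bytes(first) == canonical_bytes(changed))
```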

However, this uniqueness can cause some problems:

  • There are very few bencode editors.[2]
  • Because bencoded files contain binary data, and because of some of the intricacies involved in the way binary strings are typically stored, it is often not safe to edit bencode files in text editors.
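The risk with text editors can be reproduced in a few lines. The sketch assumes an editor that decodes the file as UTF-8, substitutes a replacement character for each invalid byte, and saves the result as UTF-8 again; real editors vary, so treat this as one plausible failure mode rather than a description of any specific program.

```python
# A bencoded byte string whose 4 content bytes are not valid UTF-8 text.
raw = b"4:\xff\xfe\x00\x01"

# What the assumed editor does on open and on save.
as_text = raw.decode("utf-8", errors="replace")
saved = as_text.encode("utf-8")

# Each invalid byte became a 3-byte replacement character, so the contents
# grew from 4 bytes to 8 while the length prefix still says 4.
print(len(raw), len(saved))
print(saved == raw)
```

After such a round trip, the length prefix no longer matches the contents, so the file no longer decodes to the original value.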

See also

  • BitTorrent

References

  1. ^ The BitTorrent Protocol Specification. Archived 2019-07-26 at the Wayback Machine. BitTorrent.org. Retrieved 8 October 2018.
  2. ^ "BEncode Editor". μTorrent Community Forums. Archived from the original on 24 October 2014. Retrieved 24 October 2014.

External links

  • Bencoding specification
  • File_Bittorrent2 - Another PHP Bencode/decode implementation
  • The original BitTorrent implementation in Python as standalone package
  • Torrent File Editor - a cross-platform GUI editor for BEncode files
  • bencode-tools - a C library for manipulating bencoded data, and an XML-schema-like validator for bencode messages in Python
  • Bento - Bencode library in Elixir
  • Beecoder - a file stream parser that encodes and decodes the "B-encode" data format in Java using the java.io.* stream API
  • Bencode library in Scala
  • There are numerous Perl implementations on CPAN
