Null character

The null character (also null terminator) is a control character with the value zero.[1][2][3][4] It is present in many character sets, including those defined by the Baudot and ITA2 codes, ISO/IEC 646 (or ASCII), the C0 control code, the Universal Coded Character Set (or Unicode), and EBCDIC. It is available in nearly all mainstream programming languages.[5] It is often abbreviated as NUL (or NULL, though in some contexts that term is used for the null pointer). In 8-bit codes, it is known as a null byte.

The original meaning of this character was like NOP: when sent to a printer or a terminal, it has no effect (some terminals, however, incorrectly display it as a space). When electromechanical teleprinters were used as computer output devices, one or more null characters were sent at the end of each printed line to allow time for the mechanism to return to the first printing position on the next line.[citation needed] On punched tape, the character is represented with no holes at all, so a new unpunched tape is initially filled with null characters, and text could often be inserted into a reserved run of null characters by punching the new characters into the tape over the nulls.

Today the character has much more significance in the programming language C and its derivatives and in many data formats, where it serves as a reserved character used to signify the end of a string,[6] often called a null-terminated string.[7] This allows the string to be any length with an overhead of only one byte; the alternative of storing a count requires either limiting the string length to 255 (with a one-byte count) or an overhead of more than one byte (other advantages and disadvantages are described in the null-terminated string article).
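The trade-off between the two layouts can be sketched in C. This is a minimal illustration, not a library API: the type `counted_string` and the functions `terminated_length` and `counted_from_terminated` are hypothetical names introduced here, and the one-byte count is an assumption chosen to show the 255-character limit.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical counted string: a one-byte count caps the length at 255. */
struct counted_string {
    unsigned char length;   /* number of characters stored, 0..255 */
    char data[255];         /* characters, with no terminator */
};

/* Count the characters before the first null character, as strlen does. */
size_t terminated_length(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Convert a null-terminated string to a counted string.
   Returns 0 on success, or -1 if the text does not fit in a one-byte count. */
int counted_from_terminated(struct counted_string *out, const char *s)
{
    assert(out != NULL && s != NULL);
    size_t n = terminated_length(s);
    if (n > 255) {
        return -1;
    }
    out->length = (unsigned char)n;
    memcpy(out->data, s, n);
    return 0;
}
```

The terminated form pays one extra byte regardless of length, but it cannot contain a null character itself: scanning "abc\0def" stops after three characters.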

Representation

In source code, the null character is often represented as the escape sequence \0 in string literals (for example, "abc\0def") or in character constants ('\0'); the latter may also be written simply as 0 (without quotes or backslash).[8] In many languages (such as C, which introduced this notation), \0 is not a separate escape sequence but an octal escape sequence with the single octal digit 0; as a consequence, \0 must not be followed by any of the digits 0 through 7, or it is interpreted as the start of a longer octal escape sequence.[9] Other escape sequences found in various languages are \000, \x00, \z, and \u0000. A null character can be placed in a URL with the percent code %00.
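The octal pitfall can be shown with two C string literals. This is a small sketch; the names `split_literal`, `fused_literal`, `split_size`, and `fused_size` are made up for the illustration. Splitting the literal works because C converts escape sequences before it concatenates adjacent string literals.

```c
#include <assert.h>
#include <stddef.h>

/* Two adjacent literals: a null character, then '1' and '2',
   then the implicit terminating null. */
static const char split_literal[] = "\0" "12";

/* One literal: \012 is read as a single octal escape (value 10),
   followed by the implicit terminating null. */
static const char fused_literal[] = "\012";

/* Size in bytes of each array, including the implicit terminator. */
size_t split_size(void) { return sizeof split_literal; }
size_t fused_size(void) { return sizeof fused_literal; }
```

Here `split_size()` is 4 and `fused_size()` is 2. Writing the null as the full three-digit escape \000 is another way to avoid the ambiguity, since an octal escape takes at most three digits.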

The ability to represent a null character does not always mean the resulting string will be correctly interpreted, as many programs will consider the null to be the end of the string. Thus, when user input is not checked, the ability to type a null character creates a vulnerability known as null byte injection, which can lead to security exploits.[10]
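A minimal sketch of how such a mismatch arises, assuming the input arrives as a buffer with an explicit length: a length-aware check sees the whole buffer, while C string functions stop at the first null. The helper names `ends_with` and `has_embedded_null` are hypothetical, and the file name in the note that follows is invented for the illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the len-byte buffer ends with the given suffix.
   The check is length-aware, so it also looks at bytes after a null. */
bool ends_with(const char *buf, size_t len, const char *suffix)
{
    assert(buf != NULL && suffix != NULL);
    size_t slen = strlen(suffix);
    return len >= slen && memcmp(buf + len - slen, suffix, slen) == 0;
}

/* True if any of the len bytes in the buffer is a null character. */
bool has_embedded_null(const char *buf, size_t len)
{
    assert(buf != NULL);
    return memchr(buf, '\0', len) != NULL;
}
```

For the 16-byte input "secret.conf\0.jpg", `ends_with` reports a ".jpg" suffix, yet `strlen` sees only the 11 characters of "secret.conf", which is the name a null-terminated API would act on. Rejecting input for which `has_embedded_null` is true is one way to close the gap.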

In caret notation, the null character is ^@. On some keyboards, one can enter a null character by holding down Ctrl and pressing @ (on US layouts, Ctrl+2 alone will often work, with no need for ⇧ Shift to get the @ sign).

The hexadecimal notation for null is 00. Decoding the Base64 string AA== also yields the null character.
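The Base64 case can be checked with a small decoder for a single four-character group. This is a sketch only, assuming the standard Base64 alphabet; `b64_value` and `b64_decode_group` are hypothetical names and not part of any library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Map one Base64 character to its 6-bit value, or -1 if it is not in the alphabet. */
static int b64_value(char c)
{
    static const char alphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    const char *p = (c != '\0') ? strchr(alphabet, c) : NULL;
    return p ? (int)(p - alphabet) : -1;
}

/* Decode one 4-character Base64 group into out (room for 3 bytes).
   Returns the number of bytes written (1 to 3), or -1 on invalid input. */
int b64_decode_group(const char *in, unsigned char *out)
{
    assert(in != NULL && out != NULL);
    int v[4];
    int pad = 0;
    for (int i = 0; i < 4; i++) {
        if (in[i] == '=' && i >= 2) {
            v[i] = 0;               /* padding contributes zero bits */
            pad++;
        } else {
            if (pad > 0) return -1; /* data after padding is invalid */
            v[i] = b64_value(in[i]);
            if (v[i] < 0) return -1;
        }
    }
    /* Pack four 6-bit values into 24 bits, then split into bytes. */
    unsigned long bits = ((unsigned long)v[0] << 18) | ((unsigned long)v[1] << 12)
                       | ((unsigned long)v[2] << 6)  |  (unsigned long)v[3];
    out[0] = (unsigned char)((bits >> 16) & 0xFF);
    if (pad < 2) out[1] = (unsigned char)((bits >> 8) & 0xFF);
    if (pad < 1) out[2] = (unsigned char)(bits & 0xFF);
    return 3 - pad;
}
```

In AA==, each A has the value 0 and the two = signs mark padding, so the group decodes to a single byte with all bits clear, that is, hexadecimal 00.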

In documentation, the null character is sometimes represented as a single-em-width symbol containing the letters "NUL". In Unicode, there is a character for this: U+2400 ␀ SYMBOL FOR NULL.

Encoding

In all modern character sets, the null character has a code point value of zero. In most encodings, this is translated to a single code unit with a zero value. For instance, in UTF-8 it is a single zero byte. However, in Modified UTF-8 the null character is encoded as two bytes: 0xC0, 0x80. This allows the byte with the value of zero, which is now not used for any character, to be used as a string terminator.
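The difference can be sketched as an encoder for one 16-bit code unit. The two-byte form of the null character comes from the text above; the other branches follow the ordinary UTF-8 bit layout and are included only for context. The function name `mutf8_encode_unit` is hypothetical, and the sketch does not handle supplementary characters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one 16-bit code unit into out (room for 3 bytes) using the
   Modified UTF-8 rule for null. Returns the number of bytes written. */
size_t mutf8_encode_unit(uint16_t unit, unsigned char *out)
{
    assert(out != NULL);
    if (unit != 0 && unit < 0x80) {
        /* U+0001..U+007F: a single byte, as in standard UTF-8. */
        out[0] = (unsigned char)unit;
        return 1;
    }
    if (unit < 0x800) {
        /* U+0080..U+07FF, and also U+0000, which becomes 0xC0 0x80. */
        out[0] = (unsigned char)(0xC0 | (unit >> 6));
        out[1] = (unsigned char)(0x80 | (unit & 0x3F));
        return 2;
    }
    /* U+0800..U+FFFF: three bytes. */
    out[0] = (unsigned char)(0xE0 | (unit >> 12));
    out[1] = (unsigned char)(0x80 | ((unit >> 6) & 0x3F));
    out[2] = (unsigned char)(0x80 | (unit & 0x3F));
    return 3;
}
```

Because no code unit produces a zero byte under this scheme, a buffer of encoded text can still be terminated with a single 0x00 and handled by null-terminated string functions.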

References

  1. ASCII format for Network Interchange. IETF. sec. 5.2. doi:10.17487/RFC0020. RFC 20. "NUL (Null): The all-zeros character which may serve to accomplish time fill and media fill."
  2. "The set of control characters of the ISO 646" (PDF). Secretariat ISO/TC 97/SC 2. 1975-12-01. p. 4.4. Archived from the original (PDF) on 2014-05-12. "Position: 0/0, Name: Null, Abbreviation: Nul"
  3. "Unicode Character 'NULL' (U+0000)". Retrieved 2018-10-20.
  4. "C0 Controls and Basic Latin" (PDF). Unicode Consortium. 2018. Retrieved 2018-10-20.
  5. "A byte with all bits set to 0, called the null character, shall exist in the basic execution character set; it is used to terminate a character string literal." ANSI/ISO 9899:1990 (the ANSI C standard), section 5.2.1.
  6. "A string is a contiguous sequence of characters terminated by and including the first null character." ANSI/ISO 9899:1990 (the ANSI C standard), section 7.1.1.
  7. Working Draft, Standard for Programming Language C++ (PDF) (ISO 14882 standard working draft), ISO/IEC, 28 February 2011, p. 427, N3242=11-0012, retrieved 27 February 2013. "A null-terminated byte string, or NTBS, is a character sequence whose highest-addressed element with defined content has the value zero (the terminating null character); no other element in the sequence has the value zero."
  8. Kernighan and Ritchie, C, p. 38: "The character constant '\0' represents the character with value zero, the null character. '\0' is often written instead of 0 to emphasize the character nature of some expression, but the numeric value is just 0."
  9. In YAML this combination is a separate escape sequence.
  10. Null Byte Injection. WASC Threat Classification, Null Byte Attack section.

External links

  • Null Byte Injection WASC Threat Classification Null Byte Attack section
  • Poison Null Byte Introduction Introduction to Null Byte Attack
  • Apple null byte injection QR code vulnerability
