C character classification is an operation provided by a group of functions in the ANSI C Standard Library for the C programming language. These functions test whether a character belongs to a particular class of characters, such as alphabetic characters or control characters. Both single-byte and wide characters are supported.[1]
History
Early C-language programmers working on the Unix operating system developed programming idioms for classifying characters into different types. For example, for the ASCII character set, the following expression is true when c is a letter:
('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z')
Because such tests can be written in many different ways, it became desirable to introduce short, standardized forms of them, which were placed in the system-wide header file ctype.h.
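As a rough illustration of the difference, the sketch below places the hand-written idiom next to the standardized isalpha test from ctype.h. The helper names is_ascii_letter and is_letter are hypothetical, and the two are only expected to agree for ASCII input in the default "C" locale.

```c
#include <ctype.h>

/* Hand-written idiom from the text: true for ASCII letters only. */
static int is_ascii_letter(int c)
{
    return ('A' <= c && c <= 'Z') || ('a' <= c && c <= 'z');
}

/* Standardized form using isalpha() from <ctype.h>.
 * The argument must be representable as unsigned char (or be EOF),
 * so a plain char is converted before the call. */
static int is_letter(char ch)
{
    return isalpha((unsigned char)ch) != 0;
}
```

The standardized form is shorter at the call site and leaves the details of the character set to the library.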
Implementation
Unlike the above example, the character classification routines are usually not written as comparison tests. In most C libraries, they are implemented as lookups in a static table rather than as chains of comparisons.
For example, the library defines an array of 256 eight-bit integers that are used as bit fields, where each bit corresponds to one property of the character, such as isdigit or isalpha. If the lowest-order bit of each integer corresponds to the isdigit property, the test could be written as:
#define isdigit(x) (TABLE[x] & 1)
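A slightly fuller sketch of the table-based approach follows. Everything named here (ct_table, ct_init, the CT_ flags and macros) is hypothetical and chosen only for illustration. The letter ranges assume ASCII, and the table is filled at run time for brevity, whereas a real library would normally ship a precomputed constant table.

```c
#include <limits.h>

/* Hypothetical property bits; real libraries choose their own layout. */
enum { CT_DIGIT = 1 << 0, CT_UPPER = 1 << 1, CT_LOWER = 1 << 2 };

/* One entry of property bits per possible unsigned char value. */
static unsigned char ct_table[UCHAR_MAX + 1];

/* Fill the table once. The letter ranges assume ASCII. */
static void ct_init(void)
{
    int c;
    for (c = '0'; c <= '9'; c++) ct_table[c] |= CT_DIGIT;
    for (c = 'A'; c <= 'Z'; c++) ct_table[c] |= CT_UPPER;
    for (c = 'a'; c <= 'z'; c++) ct_table[c] |= CT_LOWER;
}

/* Each test is one index and one mask, so the argument appears once. */
#define CT_ISDIGIT(x) (ct_table[(unsigned char)(x)] & CT_DIGIT)
#define CT_ISALPHA(x) (ct_table[(unsigned char)(x)] & (CT_UPPER | CT_LOWER))
```

After calling ct_init() once, CT_ISDIGIT('5') is nonzero and CT_ISDIGIT('a') is zero. The cast to unsigned char keeps the index inside the table even when plain char is signed.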
Early versions of Linux used a potentially faulty method similar to the first code sample:
#define isdigit(x) ((x) >= '0' && (x) <= '9')
This can cause problems if the expression substituted for x has a side effect, as in isdigit(x++) or isdigit(run_some_program()), because the macro expands its argument twice. It is not immediately evident at the call site that the argument to isdigit is evaluated twice. For this reason, the table-based approach is generally used.
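The double evaluation can be made visible with a small sketch. The macro below is the comparison-style definition from the text under a different name (NAIVE_ISDIGIT) so that it does not clash with ctype.h. The names read_char, isdigit_once, reads_via_macro and reads_via_function are hypothetical helpers introduced only for this demonstration.

```c
/* Comparison-style macro from the text, renamed for this sketch. */
#define NAIVE_ISDIGIT(x) ((x) >= '0' && (x) <= '9')

/* Hypothetical input source that counts how often it is called. */
static int read_count = 0;

static int read_char(void)
{
    read_count++;
    return '7';
}

/* A function evaluates its argument exactly once before the call. */
static int isdigit_once(int c)
{
    return c >= '0' && c <= '9';
}

/* Number of read_char() calls made by one macro-based test. */
static int reads_via_macro(void)
{
    read_count = 0;
    (void)NAIVE_ISDIGIT(read_char());
    return read_count;
}

/* Number of read_char() calls made by one function-based test. */
static int reads_via_function(void)
{
    read_count = 0;
    (void)isdigit_once(read_char());
    return read_count;
}
```

Here reads_via_macro() reports two calls, because the first comparison succeeds and the second one evaluates read_char() again, while reads_via_function() reports one. If the first comparison fails, the && operator short-circuits and the macro evaluates its argument only once, which makes the behavior depend on the input.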
Overview of functions
The functions that operate on single-byte characters are declared in the ctype.h header file (cctype in C++). The functions that operate on wide characters are declared in the wctype.h header file (cwctype in C++).
The classification is evaluated according to the effective locale.
Byte character   Wide character   Description
isalnum          iswalnum         checks whether the operand is alphanumeric
isalpha          iswalpha         checks whether the operand is alphabetic
islower          iswlower         checks whether the operand is lowercase
isupper          iswupper         checks whether the operand is uppercase
isdigit          iswdigit         checks whether the operand is a digit
isxdigit         iswxdigit        checks whether the operand is a hexadecimal digit
iscntrl          iswcntrl         checks whether the operand is a control character
isgraph          iswgraph         checks whether the operand is a graphical character
isspace          iswspace         checks whether the operand is a whitespace character
isblank          iswblank         checks whether the operand is a blank character (for example, space or tab)
isprint          iswprint         checks whether the operand is a printable character
ispunct          iswpunct         checks whether the operand is punctuation
tolower          towlower         converts the operand to lowercase
toupper          towupper         converts the operand to uppercase
-                iswctype         checks whether the operand falls into a specific class
-                towctrans        converts the operand using a specific mapping
-                wctype           returns a wide-character class to be used with iswctype
-                wctrans          returns a transformation mapping to be used with towctrans
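The following sketch shows how a few of the functions listed above might be used together. The helper names (count_digits, upcase, is_wide_digit, wide_upper) are hypothetical. The class name "digit" and the mapping name "toupper" are standard strings accepted by wctype and wctrans, and the results described assume the default "C" locale.

```c
#include <ctype.h>
#include <stddef.h>
#include <wctype.h>

/* Count the decimal digits in a byte string. */
static size_t count_digits(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; s++) {
        /* Convert to unsigned char before calling a <ctype.h> function. */
        if (isdigit((unsigned char)*s))
            n++;
    }
    return n;
}

/* Convert a byte string to uppercase in place. */
static void upcase(char *s)
{
    for (; *s != '\0'; s++)
        *s = (char)toupper((unsigned char)*s);
}

/* Generic wide-character test: look up the class by name, then test it. */
static int is_wide_digit(wint_t wc)
{
    return iswctype(wc, wctype("digit")) != 0;
}

/* Generic wide-character conversion: look up the mapping by name, then apply it. */
static wint_t wide_upper(wint_t wc)
{
    return towctrans(wc, wctrans("toupper"));
}
```

The generic iswctype and towctrans forms are mainly useful when the class or mapping name is only known at run time. When it is fixed, calling iswdigit or towupper directly is simpler.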
References
[1] ISO/IEC 9899:1999 specification (PDF). p. 193, § 7.4.
External links
The Wikibook A Little C Primer has a page on the topic of: C Character Class Test Library
The Wikibook C Programming has a page on the topic of: C Programming/C Reference
April 11, 2024