Wikipedia

wildmat

wildmat is a pattern-matching library developed by Rich Salz. Based on the wildcard syntax already used in the Bourne shell, wildmat provides a uniform mechanism for matching patterns across applications, with a simpler syntax than regular expressions typically offer. Patterns are implicitly anchored at the beginning and end of each string when testing for a match.

In June 2019, Rich Salz released the original version of the now-defunct library on GitHub under a public domain dedication.[1]

Pattern matching operations

Apart from literal characters, which must match the source string one-to-one, wildmat provides five pattern matching operations:

  • Asterisk (*) to match any sequence of zero or more characters.
  • Question mark (?) to match any single character.
  • Set of specified characters, enclosed in square brackets. The set is specified as a list of characters, as a range of characters whose beginning and end are separated by a minus (or dash) character, or as any combination of lists and ranges. The dash can itself be included in the set if it is the first or last character of the set, and the close square bracket (]) can be included if it is the first character of the set.
  • Negation of a set. It is specified the same way as a set, with the addition of a caret character (^) at the beginning of the set, just inside the open square bracket. (NNTP specifies ! as an alternative; the implementation can be configured to accept either.)
  • Backslash (\) character to remove the special meaning of the open square bracket ([), the asterisk, the question mark or the backslash itself. Two backslashes in sequence are therefore evaluated as a single backslash character with no special meaning.
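
The five operations above can be sketched as a small recursive matcher. This is a minimal illustration in Python, not Salz's C implementation: the function names are hypothetical, only ^ is accepted for negation, and an unclosed set is simply treated as a failed match, which is an assumption made here for safety.

```python
def _parse_set(pattern, p):
    """Parse the set that opens at pattern[p] == '['.

    Returns (negated, ranges, next_index), or None if the set is unclosed.
    Single characters are stored as one-element ranges (c, c).
    """
    i = p + 1
    negated = False
    if i < len(pattern) and pattern[i] == "^":
        negated = True
        i += 1
    ranges = []
    first = True
    while i < len(pattern):
        c = pattern[i]
        if c == "]" and not first:
            return negated, ranges, i + 1
        first = False
        # "a-z" is a range unless the dash is the last character of the set.
        if i + 2 < len(pattern) and pattern[i + 1] == "-" and pattern[i + 2] != "]":
            ranges.append((c, pattern[i + 2]))
            i += 3
        else:
            ranges.append((c, c))
            i += 1
    return None  # ran off the end of the pattern: unclosed '['


def _match(text, t, pattern, p):
    while p < len(pattern):
        c = pattern[p]
        if c == "*":
            while p < len(pattern) and pattern[p] == "*":
                p += 1
            if p == len(pattern):
                return True  # trailing '*' matches the rest of the text
            # Try every possible length for the part consumed by '*'.
            return any(_match(text, k, pattern, p) for k in range(t, len(text) + 1))
        if t >= len(text):
            return False
        if c == "[":
            parsed = _parse_set(pattern, p)
            if parsed is None:
                return False  # assumption: malformed pattern never matches
            negated, ranges, p = parsed
            hit = any(lo <= text[t] <= hi for lo, hi in ranges)
            if hit == negated:
                return False
            t += 1
            continue
        if c == "\\" and p + 1 < len(pattern):
            p += 1  # the next character loses its special meaning
            if pattern[p] != text[t]:
                return False
        elif c != "?" and c != text[t]:
            return False
        t += 1
        p += 1
    return t == len(text)  # implicit anchoring at the end of the string


def wildmat(text, pattern):
    """Return True if the whole of `text` matches `pattern`."""
    return _match(text, 0, pattern, 0)


print(wildmat("a*b", "a\\*b"))   # True: the escaped asterisk is literal
print(wildmat("axb", "a\\*b"))   # False
```

The recursion on * is exponential in the worst case; it is kept here because it mirrors the description directly rather than for performance.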

Examples

  • *foo* matches any string containing "foo".
  • mini* matches anything that begins with "mini" (including the string "mini" itself).
  • ???* matches any string of three or more characters.
  • [0-9a-zA-Z] matches any single alphanumeric ASCII character.
  • [^]-] matches any single character other than a close square bracket or a dash.
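
One way to check the examples above is to translate each pattern into a regular expression and apply it to the whole string. The sketch below uses Python's re module; wildmat_to_regex and matches are hypothetical helper names, only ^ is handled for negation, and treating an unclosed [ as a literal character is an assumption of this sketch. re.fullmatch stands in for wildmat's implicit anchoring at both ends.

```python
import re


def wildmat_to_regex(pattern: str) -> str:
    """Translate a wildmat-style pattern into a Python regular expression."""
    out = []
    i, n = 0, len(pattern)
    while i < n:
        c = pattern[i]
        if c == "*":
            out.append(".*")
        elif c == "?":
            out.append(".")
        elif c == "\\" and i + 1 < n:
            i += 1
            out.append(re.escape(pattern[i]))
        elif c == "[":
            j = i + 1
            if j < n and pattern[j] == "^":
                j += 1
            if j < n and pattern[j] == "]":
                j += 1  # a leading ']' is a set member, not the terminator
            while j < n and pattern[j] != "]":
                j += 1
            if j < n:
                body = pattern[i + 1 : j]
                negate = body.startswith("^")
                members = body[1:] if negate else body
                # Escape characters that are special inside a regex class.
                members = re.sub(r"([\\\]\[^])", r"\\\1", members)
                out.append("[" + ("^" if negate else "") + members + "]")
                i = j
            else:
                out.append(re.escape(c))  # assumption: unclosed '[' is literal
        else:
            out.append(re.escape(c))
        i += 1
    return "".join(out)


def matches(text: str, pattern: str) -> bool:
    """Anchored match: the pattern must cover the entire text."""
    regex = wildmat_to_regex(pattern)
    return re.fullmatch(regex, text, re.DOTALL) is not None


print(matches("seafood", "*foo*"))   # True
print(matches("gemini", "mini*"))    # False: the pattern is anchored at the start
print(matches("a1!", "???*"))        # True: '?' is not limited to letters
```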

Usage

wildmat is most commonly seen in NNTP implementations such as Salz's own INN, but it has also appeared in unrelated software such as GNU tar and Transmission. GNU tar replaced wildmat with the POSIX fnmatch glob matcher in September 1992. The early version could access memory out of bounds when a pattern contained an unclosed [.[2]

The original byte-oriented wildmat implementation cannot handle multibyte character sets, and it poses problems when the text being searched may contain multiple incompatible character sets. A simplified version of wildmat oriented toward UTF-8 encoding was developed by the IETF NNTP working group. It is part of RFC 3977 (section 4), the 2006 standard for NNTP.
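
The multibyte problem can be illustrated by comparing what a "single character" means at the byte level and at the code-point level. The sketch below only models the ? and set operations as counts and membership tests; the helper names are hypothetical and it does not reproduce either wildmat or the RFC 3977 matcher.

```python
def question_mark_matches(subject) -> bool:
    """Model a lone `?` pattern: it matches exactly one unit of `subject`.

    For `bytes` the unit is a byte; for `str` it is a Unicode code point.
    """
    return len(subject) == 1


def in_set(unit, members) -> bool:
    """Model a `[...]` set as membership of one unit among the listed units."""
    return unit in members


e_acute = "\u00e9"                    # "é", one code point
a_tilde = "\u00e3"                    # "ã", one code point
e_bytes = e_acute.encode("utf-8")     # b"\xc3\xa9", two bytes
a_bytes = a_tilde.encode("utf-8")     # b"\xc3\xa3", two bytes

# Byte-oriented view: "é" is two units, so a single `?` cannot cover it.
print(question_mark_matches(e_bytes))   # False
# Code-point view: "é" is one unit.
print(question_mark_matches(e_acute))   # True

# A byte-oriented set written as [é] really holds the bytes 0xC3 and 0xA9,
# so it accepts the lead byte of the unrelated character "ã".
print(in_set(a_bytes[0], e_bytes))      # True
print(in_set(a_tilde, e_acute))         # False
```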

Newer versions of INN, which support UTF-8, add "uwildmat", which supports all the features of wildmat. This rewrite, done by Russ Allbery in 2000, fixes the out-of-bounds access in the original implementation and breaks its tightly wound C loops up into smaller statements.[3][4]

Rsync includes a GPLv3-licensed descendant of wildmat known as wildmatch, modified by Wayne Davison. The Git version control system imports and makes use of it. wildmatch does not support UTF-8, but it fixes the out-of-bounds access and adds support for character classes and for star globs (** to match at arbitrary directory depth).[5]
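
The idea behind ** can be sketched with a small translation to regular expressions. This is a simplified illustration under stated assumptions, not wildmatch's actual rules: it assumes pathname-style matching in which a single * or ? stops at a slash, treats "**/" as zero or more leading directories and any other "**" as "anything", and ignores sets and escapes. The function names are hypothetical.

```python
import re


def path_glob_to_regex(pattern: str) -> str:
    """Translate a path glob with `*`, `?` and `**` into a regular expression."""
    out = []
    i, n = 0, len(pattern)
    while i < n:
        if pattern.startswith("**/", i):
            out.append("(?:.*/)?")   # zero or more whole directories
            i += 3
        elif pattern.startswith("**", i):
            out.append(".*")         # anything, including slashes
            i += 2
        elif pattern[i] == "*":
            out.append("[^/]*")      # anything within one path component
            i += 1
        elif pattern[i] == "?":
            out.append("[^/]")       # one character, but not a slash
            i += 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return "".join(out)


def path_matches(path: str, pattern: str) -> bool:
    """Anchored match of a whole path against a glob."""
    regex = path_glob_to_regex(pattern)
    return re.fullmatch(regex, path, re.DOTALL) is not None


print(path_matches("src/a.c", "src/*.c"))         # True
print(path_matches("src/lib/a.c", "src/*.c"))     # False: '*' stops at '/'
print(path_matches("src/lib/a.c", "src/**/*.c"))  # True: '**/' spans directories
```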

See also

  • glob (programming)
  • Kleene star
  • Matching wildcards

References

  1. Salz, Rich (25 June 2019). "wildmat: The hoary classic wildmat pattern matcher; public domain". GitHub. Retrieved 25 November 2019.
  2. Salz, Rich (25 June 2019). "wildmat.c". GitHub. Might not be robust in face of malformed patterns; e.g., "foo[a-" could cause a segmentation violation.
  3. uwildmat(3) – Linux Library Functions Manual
  4. "uwildmat.c in trunk/lib – INN". inn.eyrie.org. Retrieved 27 November 2019.
  5. "git/git: wildmatch.c". GitHub. 16 February 2022.

External links

  • Rich Salz (April 4, 1991). "v17i079: wildmat-1.4 - a /bin/sh-style pattern matcher, Part01/01". Newsgroup: comp.sources.misc. Usenet: 1991Apr4.034350.3923@sparky.IMD.Sterling.COM.
  • Rich Salz (March 9, 1991). "v17i034: wildmat - a /bin/sh-style pattern matcher, Part01/01". Newsgroup: comp.sources.misc. Usenet: 1991Mar9.044016.2409@sparky.IMD.Sterling.COM.

Retrieved from "https://en.wikipedia.org/w/index.php?title=Wildmat&oldid=1072127464"