Wikipedia

Writeprint

Writeprint is a method in forensic linguistics of establishing author identification over the internet, likened to a digital fingerprint. Identity is established through a comparison of distinguishing stylometric characteristics of an unknown written text with known samples of the suspected author (writer invariants). Even without a suspect, writeprint provides potential background characteristics of the author, such as nationality and education.[1]
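The comparison of an unknown text with known samples can be illustrated with a minimal sketch. It assumes character trigram counts as the style profile and cosine similarity as the comparison measure; both are illustrative choices, not the procedure of the cited studies, and the function names are hypothetical.

```python
import math
import re
from collections import Counter


def char_trigram_profile(text: str) -> Counter:
    """Count overlapping character trigrams in lowercased, whitespace-normalised text."""
    normalised = re.sub(r"\s+", " ", text.lower()).strip()
    return Counter(normalised[i:i + 3] for i in range(len(normalised) - 2))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(unknown: str, known_samples: dict) -> list:
    """Rank candidate authors by how closely their known sample matches the unknown text."""
    unknown_profile = char_trigram_profile(unknown)
    scores = {
        author: cosine_similarity(unknown_profile, char_trigram_profile(sample))
        for author, sample in known_samples.items()
    }
    # Highest similarity first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

A ranking of this kind only orders the candidates supplied; it says nothing about whether the true author is among them, and short texts give unreliable profiles.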

There are five broad aspects to author identification in writeprint:

  • Lexical features - the analysis of the lexicon, the author's choice of vocabulary, using characters and words to identify preferences of an individual;
    • examples include the use of uppercase and lowercase letters, frequency of certain letters, average word length, and mean length of the utterance itself;[2]
  • Syntactic features - the analysis of the author's writing style and sentence structure, such as punctuation and hyphenation, use of passive voice, and sentence complexity;
  • Structural features - the analysis of the author's organization and structural arrangement of the work, including paragraph length, spacing, and indentation;
    • this also encompasses the arrangement of sentences within paragraphs and, in an email setting for example, the use of greetings, farewells, and signatures;
  • Content-specific features - the analysis of the language that is contextually significant to the subject of the written work, including the use of slang or acronyms. More specifically, these features indicate the author's interests by pinpointing the keywords they use;
  • Idiosyncratic features - the analysis of errors and other ungrammatical elements that may be unique to the author, such as incorrect spelling, misuse of words, and inaccurate verb forms. Because such errors are hard for an author to control consciously, this category has achieved high accuracy in author identification when combined with other features.[3]
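Several of the lexical and structural measures listed above can be computed directly from raw text. The following is a minimal sketch under stated assumptions: the function names, the regular expressions used to split words and sentences, and the tiny `GREETINGS` list are all hypothetical simplifications, not part of any cited method.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny list; a real system would need a far larger one.
GREETINGS = ("hi", "hello", "dear", "hey")


def lexical_features(text: str) -> dict:
    """Character- and word-level measures from the lexical category."""
    letters = [ch for ch in text if ch.isalpha()]
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)
    # Naive sentence split on runs of terminal punctuation.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    letter_counts = Counter(ch.lower() for ch in letters)
    return {
        "uppercase_ratio": sum(ch.isupper() for ch in letters) / len(letters) if letters else 0.0,
        "avg_word_length": sum(len(w) for w in words) / len(words) if words else 0.0,
        "words_per_sentence": len(words) / len(sentences) if sentences else 0.0,
        "letter_frequency": {ch: n / len(letters) for ch, n in letter_counts.items()},
    }


def structural_features(text: str) -> dict:
    """Layout measures from the structural category (paragraphs, greeting)."""
    # Paragraphs are assumed to be separated by blank lines.
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    first_words = re.findall(r"[A-Za-z]+", paragraphs[0].lower())[:1] if paragraphs else []
    return {
        "paragraph_count": len(paragraphs),
        "avg_paragraph_length_chars": (
            sum(len(p) for p in paragraphs) / len(paragraphs) if paragraphs else 0.0
        ),
        "opens_with_greeting": bool(first_words) and first_words[0] in GREETINGS,
    }
```

Syntactic, content-specific, and idiosyncratic features generally need more than string handling (for example a parser, a keyword list, or a spelling dictionary), so they are left out of this sketch.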

While the five feature categories above are the traditional basis of author identification, some features are unique to online text. Choice of font, the use of emojis, and links to other websites all provide paths to identification that are absent in traditional text analysis.[4]
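Two of these online-specific signals, emoji use and links, can be counted from plain text; font choice cannot, since it lives in the markup rather than the characters. The sketch below is an illustration only: the URL pattern and the emoji code point ranges are rough assumptions that will miss some cases, and `online_features` is a hypothetical name.

```python
import re
from urllib.parse import urlparse

# Simplified pattern (assumption): matches http/https links up to the next space.
URL_PATTERN = re.compile(r'https?://[^\s<>"]+')

# Approximate emoji code point ranges (assumption: not the full Unicode emoji set).
EMOJI_RANGES = ((0x1F300, 0x1FAFF), (0x2600, 0x27BF))


def online_features(text: str) -> dict:
    """Count links and emoji, and list the distinct domains linked to."""
    urls = URL_PATTERN.findall(text)
    emoji_count = sum(
        any(low <= ord(ch) <= high for low, high in EMOJI_RANGES) for ch in text
    )
    return {
        "url_count": len(urls),
        "emoji_count": emoji_count,
        "link_domains": sorted({urlparse(url).netloc for url in urls}),
    }
```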

See also

  • Author profiling
  • Stylometry
  • Forensic linguistics

References

  1. ^ Li, Jiexun; Zheng, Rong; Chen, Hsinchun (April 2006). "From Fingerprint to Writeprint". Communications of the ACM. 49 (4): 76–82. doi:10.1145/1121949.1121951. S2CID 14341797.
  2. ^ Iqbal, F; Binsalleeh, H; Fung, B; Debbabi, M (October 2010). "Mining writeprints from anonymous e-mails for forensic investigation". Digital Investigation. 7 (1–2): 56–64. doi:10.1016/j.diin.2010.03.003.
  3. ^ Abbasi, Ahmed; Chen, Hsinchun; Nunamaker Jr., Jay F. (Summer 2008). "Stylometric Identification in Electronic Markets: Scalability and Robustness". Journal of Management Information Systems. 25 (1): 49–78. doi:10.2753/MIS0742-1222250103. JSTOR 40398926. S2CID 3941985.
  4. ^ Rehmeyer, Juli (Jan 13, 2007). "Digital Fingerprints". Science News. 171 (2): 26–28. doi:10.1002/scin.2007.5591710210. JSTOR 3982506.

