AI Detection Tool Testing — Initial Results

We’ve limited our testing to Turnitin’s AI detection tool. Why? Turnitin has undergone privacy and risk reviews and is a college-approved technology. Other detection tools haven’t been reviewed and may not meet recommended data privacy standards.

What We’ve Learned So Far

  • Unedited AI-generated content often receives a 100% AI-generated score, although more complex writing produced with GPT-4 can score far below 100%.
  • Adding typos and grammar mistakes, or prompting the AI generator to include errors throughout a document, can change the AI-generated score from 100% to 0%.
  • Adding I-statements throughout a document dramatically lowers the AI score.
  • Interrupting the flow of text by replacing one word every couple of sentences with a less likely word increases the perplexity of the wording and lowers the AI-generated percentage. AI text generators act like text predictors, creating text by adding the most likely next word. If the detector is perplexed by a word (because that word is not the most likely choice), it treats the passage as more likely to be human-written.
  • Unlike human-generated writing, AI sentences tend to be uniform in length. Varying sentence length throughout a document, making some sentences shorter and others longer and more complex, alters the burstiness and lowers the generated-by-AI score.
  • By replacing one or two words per paragraph and modifying the length of sentences here and there throughout a chunk of text, i.e., by making minor tweaks to both perplexity and burstiness (illustrated in the sketch after this list), the AI-generated score changes from 100% to 0%.
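
The two signals described above can be approximated in a rough, illustrative way. The sketch below is not how Turnitin works: it uses a toy unigram frequency model and simple sentence splitting as stand-ins for the perplexity and burstiness measures a real detector would compute with a large language model, and the reference corpus, sample texts, and function names are all hypothetical.

```python
import math
import re
import statistics
from collections import Counter

def sentence_lengths(text):
    """Word counts per sentence, split on simple end punctuation."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    return [len(s.split()) for s in sentences]

def burstiness(text):
    """Burstiness proxy: standard deviation of sentence lengths.
    Uniform, AI-style prose tends to score low; varied prose scores higher."""
    lengths = sentence_lengths(text)
    return statistics.pstdev(lengths) if len(lengths) > 1 else 0.0

def unigram_surprise(text, reference_counts, reference_total):
    """Crude perplexity proxy: average negative log-probability of each word
    under a unigram model built from a reference corpus. Swapping in rare,
    'less likely' words pushes this score up."""
    words = re.findall(r"[a-z']+", text.lower())
    if not words:
        return 0.0
    vocab = len(reference_counts) + 1  # Laplace smoothing for unseen words
    log_probs = [
        math.log((reference_counts.get(w, 0) + 1) / (reference_total + vocab))
        for w in words
    ]
    return -sum(log_probs) / len(log_probs)

if __name__ == "__main__":
    # Hypothetical reference corpus; a real detector would use a large language model.
    reference = "the quick brown fox jumps over the lazy dog " * 100
    ref_words = re.findall(r"[a-z']+", reference.lower())
    counts = Counter(ref_words)

    predictable = "The fox jumps over the dog. The dog jumps over the fox. The fox jumps again."
    surprising = ("The fox jumps over the dog. Pandemonium. "
                  "An iridescent axolotl vaults the bewildered canine with operatic flair.")

    for label, sample in [("predictable", predictable), ("surprising", surprising)]:
        print(label,
              "burstiness:", round(burstiness(sample), 2),
              "avg surprise:", round(unigram_surprise(sample, counts, len(ref_words)), 2))
```

Running the sketch on the two samples shows the pattern the bullets describe: uniform, predictable text scores low on both proxies, while unexpected word choices and varied sentence lengths push both numbers up.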

To learn more about how AI detection tools work, read AI Classifiers — What’s the problem with detection tools?