brainsteam.co.uk/brainsteam/content/annotations/2023/01/29/1674989184.md

---
date: '2023-01-29T10:46:24'
hypothesis-meta:
  created: '2023-01-29T10:46:24.271948+00:00'
  document:
    title:
    - 2301.11305.pdf
  flagged: false
  group: __world__
  hidden: false
  id: LNKuap_CEe2NNLuZfhdxTA
  links:
    html: https://hypothes.is/a/LNKuap_CEe2NNLuZfhdxTA
    incontext: https://hyp.is/LNKuap_CEe2NNLuZfhdxTA/arxiv.org/pdf/2301.11305.pdf
    json: https://hypothes.is/api/annotations/LNKuap_CEe2NNLuZfhdxTA
  permissions:
    admin:
    - acct:ravenscroftj@hypothes.is
    delete:
    - acct:ravenscroftj@hypothes.is
    read:
    - group:__world__
    update:
    - acct:ravenscroftj@hypothes.is
  tags:
  - chatgpt
  - detecting gpt
  target:
  - selector:
    - end: 31791
      start: 31366
      type: TextPositionSelector
    - exact: Figure 5. We simulate human edits to machine-generated text byreplacing
        varying fractions of model samples with T5-3B gener-ated text (masking out
        random five word spans until r% of text ismasked to simulate human edits to
        machine-generated text). Thefour top-performing methods all generally degrade
        in performancewith heavier revision, but DetectGPT is consistently most accurate.Experiment
        is conducted on the XSum dataset
      prefix: etectGPTLogRankLikelihoodEntropy
      suffix: .XSum SQuAD WritingPromptsMethod
      type: TextQuoteSelector
    source: https://arxiv.org/pdf/2301.11305.pdf
  text: DetectGPT shows 95% AUROC for texts that have been modified by about 10% and
    this drops off to about 85% when text is changed up to 24%.
  updated: '2023-01-29T10:46:24.271948+00:00'
  uri: https://arxiv.org/pdf/2301.11305.pdf
  user: acct:ravenscroftj@hypothes.is
  user_info:
    display_name: James Ravenscroft
in-reply-to: https://arxiv.org/pdf/2301.11305.pdf
tags:
- chatgpt
- detecting gpt
- hypothesis
type: annotation
url: /annotations/2023/01/29/1674989184

---


<blockquote>Figure 5. We simulate human edits to machine-generated text by replacing varying fractions of model samples with T5-3B generated text (masking out random five word spans until r% of text is masked to simulate human edits to machine-generated text). The four top-performing methods all generally degrade in performance with heavier revision, but DetectGPT is consistently most accurate. Experiment is conducted on the XSum dataset</blockquote>

DetectGPT shows about 95% AUROC for texts that have been modified by about 10%, and this drops off to about 85% when up to 24% of the text is changed.
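
To make the masking step in the caption concrete, here is a minimal sketch of how random five-word spans might be masked until roughly r% of the words are covered. This is my own illustration, not the paper's code: the function name `mask_random_spans`, the `<mask>` placeholder, and the choice of non-overlapping spans are all assumptions, and the T5-3B infilling step that would replace each mask is left out entirely.

```python
import random


def mask_random_spans(text, r, span_len=5, mask_token="<mask>", seed=None):
    """Mask random non-overlapping spans of `span_len` words until at least
    a fraction `r` of the words is masked (hypothetical helper, not from the paper).

    Returns the masked text and the fraction of words actually masked.
    """
    rng = random.Random(seed)
    words = text.split()
    if not words:
        return "", 0.0

    masked = [False] * len(words)
    target = int(r * len(words))  # number of words we aim to mask

    # Try candidate span starts in random order, skipping any that overlap
    # a span we have already masked.
    starts = list(range(max(len(words) - span_len + 1, 0)))
    rng.shuffle(starts)
    n_masked = 0
    for start in starts:
        if n_masked >= target:
            break
        if any(masked[start:start + span_len]):
            continue
        for i in range(start, start + span_len):
            masked[i] = True
        n_masked += span_len

    # Collapse each run of masked words into one placeholder token.
    # Adjacent spans therefore merge into a single placeholder.
    out = []
    i = 0
    while i < len(words):
        if masked[i]:
            out.append(mask_token)
            while i < len(words) and masked[i]:
                i += 1
        else:
            out.append(words[i])
            i += 1
    return " ".join(out), n_masked / len(words)
```

Calling `mask_random_spans(sample, r=0.1)` would correspond to the roughly 10% revision level mentioned in the note above. In the paper's setup each placeholder would then be filled in by T5-3B, which this sketch does not attempt.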