---
date: '2023-01-29T10:46:24'
hypothesis-meta:
  created: '2023-01-29T10:46:24.271948+00:00'
  document:
    title:
    - 2301.11305.pdf
  flagged: false
  group: __world__
  hidden: false
  id: LNKuap_CEe2NNLuZfhdxTA
  links:
    html: https://hypothes.is/a/LNKuap_CEe2NNLuZfhdxTA
    incontext: https://hyp.is/LNKuap_CEe2NNLuZfhdxTA/arxiv.org/pdf/2301.11305.pdf
    json: https://hypothes.is/api/annotations/LNKuap_CEe2NNLuZfhdxTA
  permissions:
    admin:
    - acct:ravenscroftj@hypothes.is
    delete:
    - acct:ravenscroftj@hypothes.is
    read:
    - group:__world__
    update:
    - acct:ravenscroftj@hypothes.is
  tags:
  - chatgpt
  - detecting gpt
  target:
  - selector:
    - end: 31791
      start: 31366
      type: TextPositionSelector
    - exact: Figure 5. We simulate human edits to machine-generated text byreplacing
        varying fractions of model samples with T5-3B gener-ated text (masking out
        random five word spans until r% of text ismasked to simulate human edits to
        machine-generated text). Thefour top-performing methods all generally degrade
        in performancewith heavier revision, but DetectGPT is consistently most accurate.Experiment
        is conducted on the XSum dataset
      prefix: etectGPTLogRankLikelihoodEntropy
      suffix: .XSum SQuAD WritingPromptsMethod
      type: TextQuoteSelector
    source: https://arxiv.org/pdf/2301.11305.pdf
  text: DetectGPT shows 95% AUROC for texts that have been modified by about 10% and
    this drops off to about 85% when text is changed up to 24%.
  updated: '2023-01-29T10:46:24.271948+00:00'
  uri: https://arxiv.org/pdf/2301.11305.pdf
  user: acct:ravenscroftj@hypothes.is
  user_info:
    display_name: James Ravenscroft
in-reply-to: https://arxiv.org/pdf/2301.11305.pdf
tags:
- chatgpt
- detecting gpt
- hypothesis
type: annotation
url: /annotations/2023/01/29/1674989184
---
<blockquote>Figure 5. We simulate human edits to machine-generated text by replacing varying fractions of model samples with T5-3B generated text (masking out random five word spans until r% of text is masked to simulate human edits to machine-generated text). The four top-performing methods all generally degrade in performance with heavier revision, but DetectGPT is consistently most accurate. Experiment is conducted on the XSum dataset</blockquote>

DetectGPT reaches about 95% AUROC on texts that have been modified by about 10%, and this drops off to about 85% once up to 24% of the text has been changed.
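
To make the masking step in the caption concrete, here is a rough Python sketch of how "mask random five word spans until r% of text is masked" could work. It is only an illustration under my own assumptions, not the paper's code: the function name `mask_random_spans`, the `<mask>` placeholder and the whitespace tokenisation are all hypothetical, and the step where T5-3B fills the masked spans back in is left out entirely.

```python
import math
import random


def mask_random_spans(text, r, span_len=5, mask_token="<mask>", seed=0):
    """Mask random spans of `span_len` words until at least a fraction `r`
    of the words is masked. Returns (masked_text, fraction_masked).

    Assumption: words are split on whitespace; the paper may tokenise differently.
    """
    rng = random.Random(seed)
    words = text.split()
    n = len(words)
    target = min(n, math.ceil(r * n))  # number of words we want masked
    masked = [False] * n

    # Visit candidate span starts in random order so the loop always terminates.
    starts = list(range(max(1, n - span_len + 1)))
    rng.shuffle(starts)

    count = 0
    for start in starts:
        if count >= target:
            break
        for i in range(start, min(start + span_len, n)):
            if not masked[i]:
                masked[i] = True
                count += 1

    # Collapse each run of masked words into a single placeholder token.
    out = []
    for word, is_masked in zip(words, masked):
        if is_masked:
            if not out or out[-1] != mask_token:
                out.append(mask_token)
        else:
            out.append(word)

    return " ".join(out), (count / n if n else 0.0)


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(100))
    masked_text, fraction = mask_random_spans(sample, r=0.10)
    print(masked_text)
    print(f"fraction masked: {fraction:.2f}")
```

Because whole spans are masked at a time, the fraction actually masked can overshoot `r` by a few words. In the paper's setup a mask-filling model would then rewrite each placeholder, and the detector would be scored on the revised text.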