---
date: '2023-01-29T10:46:24'
hypothesis-meta:
  created: '2023-01-29T10:46:24.271948+00:00'
  document:
    title:
    - 2301.11305.pdf
  flagged: false
  group: __world__
  hidden: false
  id: LNKuap_CEe2NNLuZfhdxTA
  links:
    html: https://hypothes.is/a/LNKuap_CEe2NNLuZfhdxTA
    incontext: https://hyp.is/LNKuap_CEe2NNLuZfhdxTA/arxiv.org/pdf/2301.11305.pdf
    json: https://hypothes.is/api/annotations/LNKuap_CEe2NNLuZfhdxTA
  permissions:
    admin:
    - acct:ravenscroftj@hypothes.is
    delete:
    - acct:ravenscroftj@hypothes.is
    read:
    - group:__world__
    update:
    - acct:ravenscroftj@hypothes.is
  tags:
  - chatgpt
  - detecting gpt
  target:
  - selector:
    - end: 31791
      start: 31366
      type: TextPositionSelector
    - exact: Figure 5. We simulate human edits to machine-generated text byreplacing
        varying fractions of model samples with T5-3B gener-ated text (masking out
        random five word spans until r% of text ismasked to simulate human edits to
        machine-generated text). Thefour top-performing methods all generally degrade
        in performancewith heavier revision, but DetectGPT is consistently most accurate.Experiment
        is conducted on the XSum dataset
      prefix: etectGPTLogRankLikelihoodEntropy
      suffix: .XSum SQuAD WritingPromptsMethod
      type: TextQuoteSelector
    source: https://arxiv.org/pdf/2301.11305.pdf
  text: DetectGPT shows 95% AUROC for texts that have been modified by about 10% and
    this drops off to about 85% when text is changed up to 24%.
  updated: '2023-01-29T10:46:24.271948+00:00'
  uri: https://arxiv.org/pdf/2301.11305.pdf
  user: acct:ravenscroftj@hypothes.is
  user_info:
    display_name: James Ravenscroft
in-reply-to: https://arxiv.org/pdf/2301.11305.pdf
tags:
- chatgpt
- detecting gpt
- hypothesis
type: annotation
url: /annotations/2023/01/29/1674989184
---
<blockquote>Figure 5. We simulate human edits to machine-generated text by replacing varying fractions of model samples with T5-3B generated text (masking out random five word spans until r% of text is masked to simulate human edits to machine-generated text). The four top-performing methods all generally degrade in performance with heavier revision, but DetectGPT is consistently most accurate. Experiment is conducted on the XSum dataset</blockquote>

DetectGPT shows about 95% AUROC for text that has been modified by about 10%, and this drops to about 85% when up to 24% of the text is changed.
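
To make the masking step in the caption concrete, here is a minimal sketch of "masking out random five word spans until r% of text is masked". It is an illustration under stated assumptions, not the paper's code: the function name `mask_random_spans`, its parameters, the `<mask>` placeholder, and the choice to start each span at a not-yet-masked word are all hypothetical. The T5-3B fill-in step that replaces the masked spans is omitted.

```python
import random


def mask_random_spans(words, r, span_len=5, mask_token="<mask>", seed=0):
    """Mask random spans of `span_len` words until at least a fraction `r` is masked.

    Hypothetical helper: names, defaults, and the span-selection rule are
    assumptions for illustration, not taken from the paper.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must be a fraction between 0 and 1")

    rng = random.Random(seed)
    masked = [False] * len(words)
    target = r * len(words)

    while sum(masked) < target:
        # Start each span at an unmasked word so every iteration makes progress.
        start = rng.choice([i for i, m in enumerate(masked) if not m])
        # Spans are clipped at the end of the text and may overlap earlier spans.
        for i in range(start, min(start + span_len, len(words))):
            masked[i] = True

    return [mask_token if m else w for w, m in zip(words, masked)]


# Usage: mask roughly 10% of a 100-word toy text.
toy_words = [f"w{i}" for i in range(100)]
perturbed = mask_random_spans(toy_words, r=0.10)
print(perturbed.count("<mask>"), "of", len(perturbed), "words masked")
```

Because whole spans are masked at a time, the masked fraction can overshoot `r` by up to `span_len - 1` words.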