diff --git a/brainsteam/content/annotations/2023/01/29/1674989184.md b/brainsteam/content/annotations/2023/01/29/1674989184.md new file mode 100644 index 0000000..505560c --- /dev/null +++ b/brainsteam/content/annotations/2023/01/29/1674989184.md @@ -0,0 +1,62 @@ +--- +date: '2023-01-29T10:46:24' +hypothesis-meta: + created: '2023-01-29T10:46:24.271948+00:00' + document: + title: + - 2301.11305.pdf + flagged: false + group: __world__ + hidden: false + id: LNKuap_CEe2NNLuZfhdxTA + links: + html: https://hypothes.is/a/LNKuap_CEe2NNLuZfhdxTA + incontext: https://hyp.is/LNKuap_CEe2NNLuZfhdxTA/arxiv.org/pdf/2301.11305.pdf + json: https://hypothes.is/api/annotations/LNKuap_CEe2NNLuZfhdxTA + permissions: + admin: + - acct:ravenscroftj@hypothes.is + delete: + - acct:ravenscroftj@hypothes.is + read: + - group:__world__ + update: + - acct:ravenscroftj@hypothes.is + tags: + - chatgpt + - detecting gpt + target: + - selector: + - end: 31791 + start: 31366 + type: TextPositionSelector + - exact: Figure 5. We simulate human edits to machine-generated text byreplacing + varying fractions of model samples with T5-3B gener-ated text (masking out + random five word spans until r% of text ismasked to simulate human edits to + machine-generated text). Thefour top-performing methods all generally degrade + in performancewith heavier revision, but DetectGPT is consistently most accurate.Experiment + is conducted on the XSum dataset + prefix: etectGPTLogRankLikelihoodEntropy + suffix: .XSum SQuAD WritingPromptsMethod + type: TextQuoteSelector + source: https://arxiv.org/pdf/2301.11305.pdf + text: DetectGPT shows 95% AUROC for texts that have been modified by about 10% and + this drops off to about 85% when text is changed up to 24%. 
+ updated: '2023-01-29T10:46:24.271948+00:00' + uri: https://arxiv.org/pdf/2301.11305.pdf + user: acct:ravenscroftj@hypothes.is + user_info: + display_name: James Ravenscroft +in-reply-to: https://arxiv.org/pdf/2301.11305.pdf +tags: +- chatgpt +- detecting gpt +- hypothesis +type: annotation +url: /annotations/2023/01/29/1674989184 + +--- + + + +
Figure 5. We simulate human edits to machine-generated text by replacing varying fractions of model samples with T5-3B generated text (masking out random five word spans until r% of text is masked to simulate human edits to machine-generated text). The four top-performing methods all generally degrade in performance with heavier revision, but DetectGPT is consistently most accurate. Experiment is conducted on the XSum dataset
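The masking step described in the caption can be sketched in Python roughly as follows. This is a minimal illustration, not the paper's code: the function name `mask_random_spans`, the `<mask>` placeholder, whitespace tokenization, and treating r as a fraction of words are all assumptions, and the T5-3B step that would fill the masked spans is left out.

```python
import random


def mask_random_spans(text, r, span_len=5, seed=None, mask_token="<mask>"):
    """Mask random `span_len`-word spans until at least a fraction `r` of words is masked.

    Hypothetical helper: returns the masked text and the fraction of words masked.
    """
    rng = random.Random(seed)
    words = text.split()  # assumption: whitespace tokenization
    n = len(words)
    masked = [False] * n
    target = int(r * n)  # number of words that should end up masked

    # Visit candidate span starts in random order; spans may overlap.
    starts = list(range(max(n - span_len + 1, 1)))
    rng.shuffle(starts)
    for start in starts:
        if sum(masked) >= target:
            break
        for i in range(start, min(start + span_len, n)):
            masked[i] = True

    # Collapse each run of masked words into a single placeholder token.
    out = []
    prev_masked = False
    for word, is_masked in zip(words, masked):
        if is_masked:
            if not prev_masked:
                out.append(mask_token)
        else:
            out.append(word)
        prev_masked = is_masked

    fraction = sum(masked) / n if n else 0.0
    return " ".join(out), fraction
```

Calling `mask_random_spans(sample, r=0.1)` on a hypothetical string `sample` would mask roughly 10% of its words. Because spans are five words long, the realized fraction can overshoot r by up to one span.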
DetectGPT achieves roughly 95% AUROC when about 10% of the text has been modified, dropping to about 85% when up to 24% of the text is changed. \ No newline at end of file