--- date: '2023-01-29T10:46:24' hypothesis-meta: created: '2023-01-29T10:46:24.271948+00:00' document: title: - 2301.11305.pdf flagged: false group: __world__ hidden: false id: LNKuap_CEe2NNLuZfhdxTA links: html: https://hypothes.is/a/LNKuap_CEe2NNLuZfhdxTA incontext: https://hyp.is/LNKuap_CEe2NNLuZfhdxTA/arxiv.org/pdf/2301.11305.pdf json: https://hypothes.is/api/annotations/LNKuap_CEe2NNLuZfhdxTA permissions: admin: - acct:ravenscroftj@hypothes.is delete: - acct:ravenscroftj@hypothes.is read: - group:__world__ update: - acct:ravenscroftj@hypothes.is tags: - chatgpt - detecting gpt target: - selector: - end: 31791 start: 31366 type: TextPositionSelector - exact: Figure 5. We simulate human edits to machine-generated text byreplacing varying fractions of model samples with T5-3B gener-ated text (masking out random five word spans until r% of text ismasked to simulate human edits to machine-generated text). Thefour top-performing methods all generally degrade in performancewith heavier revision, but DetectGPT is consistently most accurate.Experiment is conducted on the XSum dataset prefix: etectGPTLogRankLikelihoodEntropy suffix: .XSum SQuAD WritingPromptsMethod type: TextQuoteSelector source: https://arxiv.org/pdf/2301.11305.pdf text: DetectGPT shows 95% AUROC for texts that have been modified by about 10% and this drops off to about 85% when text is changed up to 24%. updated: '2023-01-29T10:46:24.271948+00:00' uri: https://arxiv.org/pdf/2301.11305.pdf user: acct:ravenscroftj@hypothes.is user_info: display_name: James Ravenscroft in-reply-to: https://arxiv.org/pdf/2301.11305.pdf tags: - chatgpt - detecting gpt - hypothesis type: annotation url: /annotations/2023/01/29/1674989184 ---
> Figure 5. We simulate human edits to machine-generated text by replacing varying fractions of model samples with T5-3B generated text (masking out random five word spans until r% of text is masked to simulate human edits to machine-generated text). The four top-performing methods all generally degrade in performance with heavier revision, but DetectGPT is consistently most accurate. Experiment is conducted on the XSum dataset.

DetectGPT shows 95% AUROC for texts that have been modified by about 10%, and this drops off to about 85% when text is changed up to 24%.
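
The masking step described in the caption can be sketched roughly as below. This is only an illustration, not the paper's code: the function name `mask_random_spans`, the `<mask>` placeholder, and whitespace tokenisation are all assumptions, and the step where T5-3B fills the masked spans is left out entirely.

```python
import random


def mask_random_spans(text, r, span_len=5, mask_token="<mask>", seed=0):
    """Mask random `span_len`-word spans until at least a fraction `r`
    of the words is masked.

    Hypothetical helper for illustration only. Returns the word list
    (masked words replaced by `mask_token`) and the fraction masked.
    """
    rng = random.Random(seed)
    words = text.split()  # assumption: simple whitespace tokenisation
    n = len(words)
    masked = [False] * n
    target = round(r * n)  # number of words we want masked

    # Every possible span start, visited in random order.
    starts = list(range(0, n - span_len + 1))
    rng.shuffle(starts)

    count = 0
    for start in starts:
        if count >= target:
            break
        for i in range(start, start + span_len):
            if not masked[i]:  # spans may overlap; count each word once
                masked[i] = True
                count += 1

    out = [mask_token if m else w for w, m in zip(words, masked)]
    frac = count / n if n else 0.0
    return out, frac


if __name__ == "__main__":
    sample = " ".join(f"w{i}" for i in range(100))
    masked_words, frac = mask_random_spans(sample, r=0.10)
    print(" ".join(masked_words))
    print(f"fraction masked: {frac:.2f}")
```

Because whole spans are masked at a time, the fraction actually masked can overshoot `r` by up to `span_len - 1` words. In the experiment the masked spans would then be filled by a mask-filling model before the detector is scored.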