77 lines
2.5 KiB
Markdown
77 lines
2.5 KiB
Markdown
---
|
||
date: '2022-12-19T14:57:08'
|
||
hypothesis-meta:
|
||
created: '2022-12-19T14:57:08.575784+00:00'
|
||
document:
|
||
title:
|
||
- My AI Safety Lecture for UT Effective Altruism
|
||
flagged: false
|
||
group: __world__
|
||
hidden: false
|
||
id: aQ51un-tEe29v2MBjEX6Xw
|
||
links:
|
||
html: https://hypothes.is/a/aQ51un-tEe29v2MBjEX6Xw
|
||
incontext: https://hyp.is/aQ51un-tEe29v2MBjEX6Xw/scottaaronson.blog/?p=6823
|
||
json: https://hypothes.is/api/annotations/aQ51un-tEe29v2MBjEX6Xw
|
||
permissions:
|
||
admin:
|
||
- acct:ravenscroftj@hypothes.is
|
||
delete:
|
||
- acct:ravenscroftj@hypothes.is
|
||
read:
|
||
- group:__world__
|
||
update:
|
||
- acct:ravenscroftj@hypothes.is
|
||
tags:
|
||
- explainability
|
||
- nlproc
|
||
target:
|
||
- selector:
|
||
- endContainer: /div[2]/div[2]/div[2]/div[1]/p[99]
|
||
endOffset: 386
|
||
startContainer: /div[2]/div[2]/div[2]/div[1]/p[99]
|
||
startOffset: 0
|
||
type: RangeSelector
|
||
- end: 40910
|
||
start: 40524
|
||
type: TextPositionSelector
|
||
- exact: "Anyway, we actually have a working prototype of the watermarking scheme,\
|
||
\ built by OpenAI engineer Hendrik Kirchner. It seems to work pretty well\u2014\
|
||
empirically, a few hundred tokens seem to be enough to get a reasonable signal\
|
||
\ that yes, this text came from GPT. In principle, you could even take a\
|
||
\ long text and isolate which parts probably came from GPT and which parts\
|
||
\ probably didn\u2019t."
|
||
prefix: 'irst hundred prime numbers).
|
||
|
||
|
||
|
||
|
||
'
|
||
suffix: '
|
||
|
||
|
||
|
||
|
||
Now, this can all be defeate'
|
||
type: TextQuoteSelector
|
||
source: https://scottaaronson.blog/?p=6823
|
||
text: Scott's team hsas already developed a prototype watermarking scheme at OpenAI
|
||
and it works pretty well
|
||
updated: '2022-12-19T14:57:08.575784+00:00'
|
||
uri: https://scottaaronson.blog/?p=6823
|
||
user: acct:ravenscroftj@hypothes.is
|
||
user_info:
|
||
display_name: James Ravenscroft
|
||
in-reply-to: https://scottaaronson.blog/?p=6823
|
||
tags:
|
||
- explainability
|
||
- nlproc
|
||
- hypothesis
|
||
type: annotation
|
||
url: /annotations/2022/12/19/1671461828
|
||
|
||
---
|
||
|
||
|
||
|
||
<blockquote>Anyway, we actually have a working prototype of the watermarking scheme, built by OpenAI engineer Hendrik Kirchner. It seems to work pretty well—empirically, a few hundred tokens seem to be enough to get a reasonable signal that yes, this text came from GPT. In principle, you could even take a long text and isolate which parts probably came from GPT and which parts probably didn’t.</blockquote>Scott's team hsas already developed a prototype watermarking scheme at OpenAI and it works pretty well |