2022-12-19T14:57:08.575784+00:00 |
title |
My AI Safety Lecture for UT Effective Altruism |
|
|
false |
__world__ |
false |
aQ51un-tEe29v2MBjEX6Xw |
|
admin |
delete |
read |
update |
acct:ravenscroftj@hypothes.is |
|
acct:ravenscroftj@hypothes.is |
|
|
acct:ravenscroftj@hypothes.is |
|
|
|
selector |
source |
endContainer |
endOffset |
startContainer |
startOffset |
type |
/div[2]/div[2]/div[2]/div[1]/p[99] |
386 |
/div[2]/div[2]/div[2]/div[1]/p[99] |
0 |
RangeSelector |
|
end |
start |
type |
40910 |
40524 |
TextPositionSelector |
|
exact |
prefix |
suffix |
type |
Anyway, we actually have a working prototype of the watermarking scheme, built by OpenAI engineer Hendrik Kirchner. It seems to work pretty well—empirically, a few hundred tokens seem to be enough to get a reasonable signal that yes, this text came from GPT. In principle, you could even take a long text and isolate which parts probably came from GPT and which parts probably didn’t. |
irst hundred prime numbers).
|
Now, this can all be defeate |
TextQuoteSelector |
|
|
https://scottaaronson.blog/?p=6823 |
|
|
Scott's team has already developed a prototype watermarking scheme at OpenAI and it works pretty well |
2022-12-19T14:57:08.575784+00:00 |
https://scottaaronson.blog/?p=6823 |
acct:ravenscroftj@hypothes.is |
display_name |
James Ravenscroft |
|