brainsteam.co.uk/brainsteam/content/annotations/2022/12/19/1671461752.md


---
date: 2022-12-19T14:55:52
hypothesis-meta:
  created: 2022-12-19T14:55:52.384335+00:00
  document:
    title: My AI Safety Lecture for UT Effective Altruism
  flagged: false
  group: __world__
  hidden: false
  id: O7YUan-tEe29vjfmuBFMKQ
  links:
    html: https://hypothes.is/a/O7YUan-tEe29vjfmuBFMKQ
    incontext: https://hyp.is/O7YUan-tEe29vjfmuBFMKQ/scottaaronson.blog/?p=6823
    json: https://hypothes.is/api/annotations/O7YUan-tEe29vjfmuBFMKQ
  permissions:
    admin: acct:ravenscroftj@hypothes.is
    delete: acct:ravenscroftj@hypothes.is
    read: group:__world__
    update: acct:ravenscroftj@hypothes.is
  tags:
  - explainability
  - nlproc
  target:
  - selector:
    - endContainer: /div[2]/div[2]/div[2]/div[1]/p[95]
      endOffset: 193
      startContainer: /div[2]/div[2]/div[2]/div[1]/p[95]
      startOffset: 0
      type: RangeSelector
    - end: 38138
      start: 37945
      type: TextPositionSelector
    - exact: "So then to watermark, instead of selecting the next token randomly, the idea will be to select it pseudorandomly, using a cryptographic pseudorandom function, whose key is known only to OpenAI."
      prefix: "of output tokens) each time."
      suffix: "That wont make any detectable"
      type: TextQuoteSelector
    source: https://scottaaronson.blog/?p=6823
  text: Watermarking by applying cryptographic pseudorandom functions to the model output instead of true random (true pseudo-random)
  updated: 2022-12-19T14:55:52.384335+00:00
  uri: https://scottaaronson.blog/?p=6823
  user: acct:ravenscroftj@hypothes.is
  user_info:
    display_name: James Ravenscroft
in-reply-to: https://scottaaronson.blog/?p=6823
tags:
- explainability
- nlproc
- hypothesis
type: annotation
url: /annotations/2022/12/19/1671461752
---

> So then to watermark, instead of selecting the next token randomly, the idea will be to select it pseudorandomly, using a cryptographic pseudorandom function, whose key is known only to OpenAI.

Watermarking by applying a keyed cryptographic pseudorandom function to token selection in the model's output, in place of ordinary random sampling (so the choice is pseudorandom rather than truly random).
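
To make the idea concrete, here is a minimal sketch of keyed pseudorandom token selection. It is an illustration only and not necessarily the scheme described in the lecture. It assumes HMAC-SHA256 as the pseudorandom function, the preceding tokens as its input, and plain inverse-CDF sampling. The names `keyed_uniform` and `select_token`, the key, and the toy probabilities are all hypothetical.

```python
import hashlib
import hmac


def keyed_uniform(key: bytes, context: list[str]) -> float:
    """Map (key, context) to a float in [0, 1), using HMAC-SHA256 as the PRF (an assumption)."""
    message = "\x1f".join(context).encode("utf-8")
    digest = hmac.new(key, message, hashlib.sha256).digest()
    # Keep the top 53 bits of the first 8 bytes so the result is an exact float below 1.
    return (int.from_bytes(digest[:8], "big") >> 11) / 2**53


def select_token(key: bytes, context: list[str], probs: dict[str, float]) -> str:
    """Pick the next token by inverse-CDF sampling, driven by the keyed PRF instead of an RNG."""
    if not probs:
        raise ValueError("probs must contain at least one token")
    u = keyed_uniform(key, context)
    cumulative = 0.0
    for token, p in probs.items():
        cumulative += p
        if u < cumulative:
            return token
    # Fallback for floating-point rounding when the probabilities sum to just under 1.
    return token


# Hypothetical key and toy next-token distribution, for illustration only.
secret_key = b"known-only-to-the-model-provider"
context = ["So", "then", "to"]
next_token_probs = {"watermark": 0.6, "sample": 0.3, "stop": 0.1}
chosen = select_token(secret_key, context, next_token_probs)
```

Because the choice depends only on the key and the context, anyone holding the key can recompute it, while without the key the selection looks like ordinary sampling. Detecting a watermark from text alone would need more machinery than this sketch shows.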