---
date: '2022-11-23T20:50:17'
hypothesis-meta:
  created: '2022-11-23T20:50:17.668925+00:00'
  document:
    title:
    - 2022.naacl-main.167.pdf
  flagged: false
  group: __world__
  hidden: false
  id: b_EbpGtwEe2m8tfhSKM2EQ
  links:
    html: https://hypothes.is/a/b_EbpGtwEe2m8tfhSKM2EQ
    incontext: https://hyp.is/b_EbpGtwEe2m8tfhSKM2EQ/aclanthology.org/2022.naacl-main.167.pdf
    json: https://hypothes.is/api/annotations/b_EbpGtwEe2m8tfhSKM2EQ
  permissions:
    admin:
    - acct:ravenscroftj@hypothes.is
    delete:
    - acct:ravenscroftj@hypothes.is
    read:
    - group:__world__
    update:
    - acct:ravenscroftj@hypothes.is
  tags:
  - prompt-models
  - NLProc
  target:
  - selector:
    - end: 2221
      start: 1677
      type: TextPositionSelector
    - exact: "Suppose a human is given two sentences: \u201CNoweapons of mass destruction found in Iraq yet.\u201Dand \u201CWeapons of mass destruction found in Iraq.\u201DThey are then asked to respond 0 or 1 and receive areward if they are correct. In this setup, they wouldlikely need a large number of trials and errors be-fore figuring out what they are really being re-warded to do. This setup is akin to the pretrain-and-fine-tune setup which has dominated NLP in recentyears, in which models are asked to classify a sen-tence representation (e.g., a CLS token) into some"
      prefix: 'task instructions.1 Introduction '
      suffix: "\u2217Unabridged version available on"
      type: TextQuoteSelector
    source: https://aclanthology.org/2022.naacl-main.167.pdf
  text: This is a really excellent illustration of the difference in paradigm between
    "normal" text model fine tuning and prompt-based modelling
  updated: '2022-11-23T20:50:17.668925+00:00'
  uri: https://aclanthology.org/2022.naacl-main.167.pdf
  user: acct:ravenscroftj@hypothes.is
  user_info:
    display_name: James Ravenscroft
in-reply-to: https://aclanthology.org/2022.naacl-main.167.pdf
tags:
- prompt-models
- NLProc
- hypothesis
type: reply
url: /replies/2022/11/23/1669236617
---
> Suppose a human is given two sentences: “No weapons of mass destruction found in Iraq yet.” and “Weapons of mass destruction found in Iraq.” They are then asked to respond 0 or 1 and receive a reward if they are correct. In this setup, they would likely need a large number of trials and errors before figuring out what they are really being rewarded to do. This setup is akin to the pretrain-and-fine-tune setup which has dominated NLP in recent years, in which models are asked to classify a sentence representation (e.g., a CLS token) into some
This is a really excellent illustration of the difference in paradigm between "normal" text model fine-tuning and prompt-based modelling.
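
To make the contrast concrete, here is a minimal sketch of the two paradigms using the Hugging Face `transformers` library. The model name (`bert-base-uncased`), the prompt template, and the "true"/"false" verbalizers are illustrative assumptions on my part, not the paper's exact setup.

```python
from transformers import (
    AutoModelForMaskedLM,
    AutoModelForSequenceClassification,
    AutoTokenizer,
)

premise = "No weapons of mass destruction found in Iraq yet."
hypothesis = "Weapons of mass destruction found in Iraq."

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# --- Paradigm 1: pretrain-and-fine-tune -------------------------------------
# A fresh classification head is bolted onto the [CLS] representation; the
# model must learn from many labelled examples what 0/1 actually mean.
clf = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)
inputs = tokenizer(premise, hypothesis, return_tensors="pt")
logits = clf(**inputs).logits
# The untrained head outputs are meaningless until fine-tuning — exactly the
# "trial and error" phase the quoted passage describes.
print("fine-tune paradigm logits:", logits)

# --- Paradigm 2: prompt-based -----------------------------------------------
# The task is rephrased in natural language, so the pretrained LM's existing
# masked-language-modelling ability applies directly: we compare verbalizer
# tokens at a [MASK] position instead of training a new head.
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
prompt = f"{premise} Question: {hypothesis} True or false? Answer: [MASK]."
inputs = tokenizer(prompt, return_tensors="pt")
mask_pos = (inputs.input_ids == tokenizer.mask_token_id).nonzero()[0, 1]
token_logits = mlm(**inputs).logits[0, mask_pos]
for word in ("true", "false"):
    token_id = tokenizer.convert_tokens_to_ids(word)
    print(f"prompt paradigm logit for '{word}':", token_logits[token_id].item())
```

In the second setup no new parameters are introduced at all: the pretrained model can express a (rough) preference between "true" and "false" zero-shot, because the prompt tells it what the reward is for.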