---
date: '2022-11-23T20:55:44'
hypothesis-meta:
  created: '2022-11-23T20:55:44.414977+00:00'
  document:
    title:
    - 2022.naacl-main.167.pdf
  flagged: false
  group: __world__
  hidden: false
  id: MrGLumtxEe21b1OADBLmyg
  links:
    html: https://hypothes.is/a/MrGLumtxEe21b1OADBLmyg
    incontext: https://hyp.is/MrGLumtxEe21b1OADBLmyg/aclanthology.org/2022.naacl-main.167.pdf
    json: https://hypothes.is/api/annotations/MrGLumtxEe21b1OADBLmyg
  permissions:
    admin:
    - acct:ravenscroftj@hypothes.is
    delete:
    - acct:ravenscroftj@hypothes.is
    read:
    - group:__world__
    update:
    - acct:ravenscroftj@hypothes.is
  tags:
  - prompt-models
  - NLProc
  target:
  - selector:
    - end: 20146
      start: 19539
      type: TextPositionSelector
    - exact: Misleading Templates There is no consistent re-lation between the performance
        of models trainedwith templates that are moderately misleading (e.g.{premise}
        Can that be paraphrasedas "{hypothesis}"?) vs. templates that areextremely
        misleading (e.g., {premise} Isthis a sports news? {hypothesis}).T0 (both 3B
        and 11B) perform better givenmisleading-moderate (Figure 3), ALBERT andT5
        3B perform better given misleading-extreme(Appendices E and G.4), whereas
        T5 11B andGPT-3 perform comparably on both sets (Figure 2;also see Table 2
        for a summary of statisticalsignificances.) Despite a lack of pattern between
      prefix: structiveand misleading-extreme.
      suffix: 4 8 16 32 64 128 2560.50.550.60.
      type: TextQuoteSelector
    source: https://aclanthology.org/2022.naacl-main.167.pdf
  text: "Their misleading templates really are misleading \n\n{premise} Can that be\
    \ paraphrased as \"{hypothesis}\" \n\n{premise} Is this a sports news? {hypothesis}"
  updated: '2022-11-23T20:55:44.414977+00:00'
  uri: https://aclanthology.org/2022.naacl-main.167.pdf
  user: acct:ravenscroftj@hypothes.is
  user_info:
    display_name: James Ravenscroft
in-reply-to: https://aclanthology.org/2022.naacl-main.167.pdf
tags:
- prompt-models
- NLProc
- hypothesis
type: reply
url: /replies/2022/11/23/1669236944

---


<blockquote>Misleading Templates There is no consistent relation between the performance of models trained with templates that are moderately misleading (e.g. {premise} Can that be paraphrased as "{hypothesis}"?) vs. templates that are extremely misleading (e.g., {premise} Is this a sports news? {hypothesis}). T0 (both 3B and 11B) perform better given misleading-moderate (Figure 3), ALBERT and T5 3B perform better given misleading-extreme (Appendices E and G.4), whereas T5 11B and GPT-3 perform comparably on both sets (Figure 2; also see Table 2 for a summary of statistical significances.) Despite a lack of pattern between</blockquote>

Their misleading templates really are misleading:

{premise} Can that be paraphrased as "{hypothesis}"?

{premise} Is this a sports news? {hypothesis}
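
To see how odd these prompts look once filled in, here is a minimal sketch that renders both templates with plain Python string formatting. The two template strings are the ones quoted above; the premise/hypothesis pair and the `render` helper are made up for illustration and are not taken from the paper.

```python
# Templates as quoted from the paper; the dictionary keys follow the
# paper's category names (misleading-moderate / misleading-extreme).
TEMPLATES = {
    "misleading-moderate": '{premise} Can that be paraphrased as "{hypothesis}"?',
    "misleading-extreme": "{premise} Is this a sports news? {hypothesis}",
}


def render(template: str, premise: str, hypothesis: str) -> str:
    """Fill the {premise} and {hypothesis} slots of a template (hypothetical helper)."""
    return template.format(premise=premise, hypothesis=hypothesis)


# Invented example pair, used only to show what a model would actually see.
example = {
    "premise": "A man is playing a guitar on stage.",
    "hypothesis": "A person is performing music.",
}

for name, template in TEMPLATES.items():
    print(f"{name}: {render(template, **example)}")
```

Neither rendered prompt actually asks whether the premise supports the hypothesis, which is what makes them misleading.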