---
date: '2022-11-23T20:55:44'
hypothesis-meta:
  created: '2022-11-23T20:55:44.414977+00:00'
  document:
    title:
    - 2022.naacl-main.167.pdf
  flagged: false
  group: __world__
  hidden: false
  id: MrGLumtxEe21b1OADBLmyg
  links:
    html: https://hypothes.is/a/MrGLumtxEe21b1OADBLmyg
    incontext: https://hyp.is/MrGLumtxEe21b1OADBLmyg/aclanthology.org/2022.naacl-main.167.pdf
    json: https://hypothes.is/api/annotations/MrGLumtxEe21b1OADBLmyg
  permissions:
    admin:
    - acct:ravenscroftj@hypothes.is
    delete:
    - acct:ravenscroftj@hypothes.is
    read:
    - group:__world__
    update:
    - acct:ravenscroftj@hypothes.is
  tags:
  - prompt-models
  - NLProc
  target:
  - selector:
    - end: 20146
      start: 19539
      type: TextPositionSelector
    - exact: 'Misleading Templates There is no consistent re-lation between the performance of models trainedwith templates that are moderately misleading (e.g.{premise} Can that be paraphrasedas "{hypothesis}"?) vs. templates that areextremely misleading (e.g., {premise} Isthis a sports news? {hypothesis}).T0 (both 3B and 11B) perform better givenmisleading-moderate (Figure 3), ALBERT andT5 3B perform better given misleading-extreme(Appendices E and G.4), whereas T5 11B andGPT-3 perform comparably on both sets (Figure 2;also see Table 2 for a summary of statisticalsignificances.) Despite a lack of pattern between'
      prefix: 'structiveand misleading-extreme. '
      suffix: '4 8 16 32 64 128 2560.50.550.60.'
      type: TextQuoteSelector
    source: https://aclanthology.org/2022.naacl-main.167.pdf
  text: |-
    Their misleading templates really are misleading

    {premise} Can that be paraphrased as "{hypothesis}"

    {premise} Is this a sports news? {hypothesis}
  updated: '2022-11-23T20:55:44.414977+00:00'
  uri: https://aclanthology.org/2022.naacl-main.167.pdf
  user: acct:ravenscroftj@hypothes.is
  user_info:
    display_name: James Ravenscroft
in-reply-to: https://aclanthology.org/2022.naacl-main.167.pdf
tags:
- prompt-models
- NLProc
- hypothesis
type: annotation
url: /annotation/2022/11/23/1669236944
---
Misleading Templates There is no consistent relation between the performance of models trained with templates that are moderately misleading (e.g. {premise} Can that be paraphrased as "{hypothesis}"?) vs. templates that are extremely misleading (e.g., {premise} Is this a sports news? {hypothesis}). T0 (both 3B and 11B) perform better given misleading-moderate (Figure 3), ALBERT and T5 3B perform better given misleading-extreme (Appendices E and G.4), whereas T5 11B and GPT-3 perform comparably on both sets (Figure 2; also see Table 2 for a summary of statistical significances.) Despite a lack of pattern between
Their misleading templates really are misleading

{premise} Can that be paraphrased as "{hypothesis}"?

{premise} Is this a sports news? {hypothesis}
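
To see how odd these prompts look once filled in, here is a minimal Python sketch that substitutes an NLI pair into the two templates quoted above. The premise and hypothesis strings, the `fill` helper and the constant names are invented for illustration and are not taken from the paper; only the template wording comes from the quoted passage.

```python
# Templates as quoted in the annotated passage.
MISLEADING_MODERATE = '{premise} Can that be paraphrased as "{hypothesis}"?'
MISLEADING_EXTREME = "{premise} Is this a sports news? {hypothesis}"


def fill(template: str, premise: str, hypothesis: str) -> str:
    """Substitute an NLI premise/hypothesis pair into a prompt template."""
    return template.format(premise=premise, hypothesis=hypothesis)


# Hypothetical NLI pair, made up for this example (not from the paper).
premise = "A man is playing a guitar on stage."
hypothesis = "A person is performing music."

moderate_prompt = fill(MISLEADING_MODERATE, premise, hypothesis)
extreme_prompt = fill(MISLEADING_EXTREME, premise, hypothesis)

print(moderate_prompt)
print(extreme_prompt)
```

Neither filled-in prompt asks whether the hypothesis follows from the premise, which is the actual task, so the wording really does point the reader somewhere else.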