brainsteam.co.uk/brainsteam/content/annotations/2022/11/23/1669236944.md at 8ba49d27127540df99ef72d0c281b022e16fe3c3

date

hypothesis-meta

in-reply-to

tags

target

text

updated

uri

user

user_info

2022-11-23T20:55:44.414977+00:00

title

2022.naacl-main.167.pdf

false

__world__

false

MrGLumtxEe21b1OADBLmyg

html	incontext	json
https://hypothes.is/a/MrGLumtxEe21b1OADBLmyg	https://hyp.is/MrGLumtxEe21b1OADBLmyg/aclanthology.org/2022.naacl-main.167.pdf	https://hypothes.is/api/annotations/MrGLumtxEe21b1OADBLmyg

admin

delete

read

update

acct:ravenscroftj@hypothes.is

group:__world__

acct:ravenscroftj@hypothes.is

prompt-models

NLProc

selector

source

end	start	type
20146	19539	TextPositionSelector

exact	prefix	suffix	type
Misleading Templates There is no consistent re-lation between the performance of models trainedwith templates that are moderately misleading (e.g.{premise} Can that be paraphrasedas "{hypothesis}"?) vs. templates that areextremely misleading (e.g., {premise} Isthis a sports news? {hypothesis}).T0 (both 3B and 11B) perform better givenmisleading-moderate (Figure 3), ALBERT andT5 3B perform better given misleading-extreme(Appendices E and G.4), whereas T5 11B andGPT-3 perform comparably on both sets (Figure 2;also see Table 2 for a summary of statisticalsignificances.) Despite a lack of pattern between	structiveand misleading-extreme.	4 8 16 32 64 128 2560.50.550.60.	TextQuoteSelector

https://aclanthology.org/2022.naacl-main.167.pdf

Their misleading templates really are misleading {premise} Can that be paraphrased as "{hypothesis}" {premise} Is this a sports news? {hypothesis}

2022-11-23T20:55:44.414977+00:00

https://aclanthology.org/2022.naacl-main.167.pdf

acct:ravenscroftj@hypothes.is

display_name
James Ravenscroft

https://aclanthology.org/2022.naacl-main.167.pdf

prompt-models

NLProc

hypothesis

annotation

/annotation/2022/11/23/1669236944

Misleading Templates There is no consistent relation between the performance of models trained with templates that are moderately misleading (e.g. {premise} Can that be paraphrased as "{hypothesis}"?) vs. templates that are extremely misleading (e.g., {premise} Is this a sports news? {hypothesis}). T0 (both 3B and 11B) perform better given misleading-moderate (Figure 3), ALBERT and T5 3B perform better given misleading-extreme (Appendices E and G.4), whereas T5 11B and GPT-3 perform comparably on both sets (Figure 2; also see Table 2 for a summary of statistical significances.) Despite a lack of pattern between

Their misleading templates really are misleading

{premise} Can that be paraphrased as "{hypothesis}"?

{premise} Is this a sports news? {hypothesis}
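
To see how odd these prompts look once filled in, here is a minimal sketch that substitutes a premise/hypothesis pair into the two templates. The template strings are taken from the quoted passage; the `fill` helper and the example sentences are made up for illustration and are not from the paper.

```python
# Illustrative only: the two template strings come from the quoted passage;
# the premise/hypothesis pair below is a made-up example, not from the paper.
MISLEADING_MODERATE = '{premise} Can that be paraphrased as "{hypothesis}"?'
MISLEADING_EXTREME = "{premise} Is this a sports news? {hypothesis}"


def fill(template: str, premise: str, hypothesis: str) -> str:
    """Substitute a premise/hypothesis pair into a prompt template."""
    return template.format(premise=premise, hypothesis=hypothesis)


# Hypothetical input pair, chosen only to make the prompts readable.
premise = "A dog is running through a field."
hypothesis = "An animal is outdoors."

moderate_prompt = fill(MISLEADING_MODERATE, premise, hypothesis)
extreme_prompt = fill(MISLEADING_EXTREME, premise, hypothesis)

print(moderate_prompt)
print(extreme_prompt)
```

Neither filled prompt asks the question the pair is presumably meant to test, which is what makes the templates misleading.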
