Dissociating language and thought in large language models
- PMID: 38508911
- PMCID: PMC11416727
- DOI: 10.1016/j.tics.2024年01月01日1
Abstract
Large language models (LLMs) have come closest among all models to date to mastering human language, yet opinions about their linguistic and cognitive capabilities remain split. Here, we evaluate LLMs using a distinction between formal linguistic competence (knowledge of linguistic rules and patterns) and functional linguistic competence (understanding and using language in the world). We ground this distinction in human neuroscience, which has shown that formal and functional competence rely on different neural mechanisms. Although LLMs are surprisingly good at formal competence, their performance on functional competence tasks remains spotty and often requires specialized fine-tuning and/or coupling with external modules. We posit that models that use language in human-like ways would need to master both of these competence types, which, in turn, could require the emergence of separate mechanisms specialized for formal versus functional linguistic competence.
Keywords: cognitive neuroscience; computational modeling; language and thought; large language models; linguistic competence.
Copyright © 2024 Elsevier Ltd. All rights reserved.
Conflict of interest statement
Declaration of interests: The authors declare no conflicts of interest.