Toward an Integration of Deep Learning and Neuroscience

Adam H. Marblestone et al.

Front Comput Neurosci. 2016 Sep 14;10:94. doi: 10.3389/fncom.2016.00094. eCollection 2016.

Abstract

Neuroscience has focused on the detailed implementation of computation, studying neural codes, dynamics and circuits. In machine learning, however, artificial neural networks tend to eschew precisely designed codes, dynamics or circuits in favor of brute force optimization of a cost function, often using simple and relatively uniform initial architectures. Two recent developments have emerged within machine learning that create an opportunity to connect these seemingly divergent perspectives. First, structured architectures are used, including dedicated systems for attention, recursion and various forms of short- and long-term memory storage. Second, cost functions and training procedures have become more complex and are varied across layers and over time. Here we think about the brain in terms of these ideas. We hypothesize that (1) the brain optimizes cost functions, (2) the cost functions are diverse and differ across brain locations and over development, and (3) optimization operates within a pre-structured architecture matched to the computational problems posed by behavior. In support of these hypotheses, we argue that a range of implementations of credit assignment through multiple layers of neurons are compatible with our current knowledge of neural circuitry, and that the brain's specialized systems can be interpreted as enabling efficient optimization for specific problem classes. Such a heterogeneously optimized system, enabled by a series of interacting cost functions, serves to make learning data-efficient and precisely targeted to the needs of the organism. We suggest directions by which neuroscience could seek to refine and test these hypotheses.
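
To make hypothesis (2) concrete, the sketch below (our illustration, not a model from the paper) optimizes one layer with an unsupervised reconstruction cost and a separate readout layer with a supervised cost; the toy data, layer sizes, and learning rate are arbitrary placeholders.

```python
# Minimal sketch of "diverse cost functions across layers", assuming toy data.
# Layer 1 (W1, V) is trained on an unsupervised reconstruction cost;
# layer 2 (W2) is trained on a supervised squared-error cost.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples, 10 input features, binary regression target.
X = rng.normal(size=(200, 10))
y = (X[:, :3].sum(axis=1, keepdims=True) > 0).astype(float)

W1 = rng.normal(scale=0.1, size=(10, 5))   # encoder
V = rng.normal(scale=0.1, size=(5, 10))    # decoder (reconstruction head)
W2 = rng.normal(scale=0.1, size=(5, 1))    # supervised readout

lr = 0.01
for step in range(500):
    # Cost function 1 (unsupervised): reconstruction error shapes W1 and V.
    H = X @ W1                      # hidden code
    X_hat = H @ V                   # linear reconstruction
    err_rec = X_hat - X             # gradient of 0.5*||X_hat - X||^2 w.r.t. X_hat
    grad_V = H.T @ err_rec / len(X)
    grad_W1 = X.T @ (err_rec @ V.T) / len(X)
    W1 -= lr * grad_W1
    V -= lr * grad_V

    # Cost function 2 (supervised): squared error shapes only the readout W2.
    H = X @ W1                      # features recomputed with updated W1
    y_hat = H @ W2
    err_sup = y_hat - y
    grad_W2 = H.T @ err_sup / len(X)
    W2 -= lr * grad_W2

print("reconstruction MSE:", float((err_rec ** 2).mean()))
print("supervised MSE:   ", float((err_sup ** 2).mean()))
```

The point of the layer-local costs here is the contrast with end-to-end backpropagation through a single global objective: each set of weights is driven by its own error signal, which is the sense in which the hypothesized system is "heterogeneously optimized".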

Keywords: cognitive architecture; cost functions; neural networks; neuroscience.


Figures

Figure 1
Putative differences between conventional and brain-like neural network designs. (A) In conventional deep learning, supervised training is based on externally supplied, labeled data. (B) In the brain, supervised training of networks can still occur via gradient descent on an error signal, but this error signal must arise from internally generated cost functions. These cost functions are themselves computed by neural modules specified by both genetics and learning. Internally generated cost functions create heuristics that are used to bootstrap more complex learning. For example, an area that recognizes faces might first be trained to detect faces using simple heuristics, like the presence of two dots above a line, and then further trained to discriminate salient facial expressions using representations arising from unsupervised learning and error signals from other brain areas related to social reward processing. (C) Internally generated cost functions and error-driven training of cortical deep networks form part of a larger architecture containing several specialized systems. Although the trainable cortical areas are schematized as feedforward neural networks here, LSTMs or other types of recurrent networks may be a more accurate analogy, and many neuronal and network properties, such as spiking, dendritic computation, neuromodulation, adaptation and homeostatic plasticity, timing-dependent plasticity, direct electrical connections, transient synaptic dynamics, excitatory/inhibitory balance, spontaneous oscillatory activity, axonal conduction delays (Izhikevich, 2006), and others, will influence what and how such networks learn.
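
As an illustration of panel (B), the sketch below (ours, not the authors') trains a toy detector by gradient descent on an error signal produced by an internally generated cost, with a crude "two dots above a line" heuristic standing in for a genetically specified prior; the image generator, the heuristic, and all parameters are hypothetical.

```python
# Minimal sketch: labels come from an internal heuristic module, not from
# externally supplied annotations; the detector is trained on that signal.
import numpy as np

rng = np.random.default_rng(1)

def heuristic_label(img):
    """Internal cost module (assumed): 'face-like' if the top half has two
    bright pixels ('eyes') and the bottom half a bright horizontal run ('mouth')."""
    top, bottom = img[:4], img[4:]
    eyes = (top > 0.8).sum() >= 2
    mouth = (bottom.max(axis=0) > 0.8).sum() >= 3
    return float(eyes and mouth)

def make_image(face):
    """Toy 8x8 image: uniform noise, plus two dots and a line if 'face'."""
    img = rng.uniform(0.0, 0.5, size=(8, 8))
    if face:
        img[1, 2] = img[1, 5] = 1.0       # two dots ("eyes")
        img[6, 2:6] = 1.0                 # a line ("mouth")
    return img

images = [make_image(face=rng.random() < 0.5) for _ in range(400)]
X = np.stack([im.ravel() for im in images])           # (400, 64)
y = np.array([heuristic_label(im) for im in images])  # internally generated labels

# Logistic "detector" trained by gradient descent on the internal error signal.
w, b, lr = np.zeros(64), 0.0, 0.5
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid output
    err = p - y                               # cross-entropy gradient term
    w -= lr * X.T @ err / len(X)
    b -= lr * err.mean()

print("agreement with internal heuristic:", ((p > 0.5) == (y > 0.5)).mean())
```

In the caption's terms, the detector trained this way could then be refined by further error signals (e.g., from reward-related areas), with the heuristic cost serving only to bootstrap the representation.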

References

    1. Abbott L., DePasquale B., Memmesheimer R. (2016). Building functional networks of spiking model neurons. Available online at: neurotheory.columbia.edu.
    2. Abbott L. F., Blum K. I. (1996). Functional significance of long-term potentiation for sequence learning and prediction. Cereb. Cortex 6, 406–416.
    3. Ackley D., Hinton G., Sejnowski T. (1985). A learning algorithm for Boltzmann machines. Cogn. Sci. 9, 147–169.
    4. Acuna D. E., Wymbs N. F., Reynolds C. A., Picard N., Turner R. S., Strick P. L., et al. (2014). Multifaceted aspects of chunking enable robust algorithms. J. Neurophysiol. 112, 1849–1856. doi: 10.1152/jn.00028.2014.
    5. Alain G., Lamb A., Sankar C., Courville A., Bengio Y. (2015). Variance reduction in SGD by distributed importance sampling. arXiv:1511.06481.