Toward an Integration of Deep Learning and Neuroscience

Adam H. Marblestone et al.

Front Comput Neurosci. 2016 Sep 14;10:94. doi: 10.3389/fncom.2016.00094. eCollection 2016.

Abstract

Neuroscience has focused on the detailed implementation of computation, studying neural codes, dynamics and circuits. In machine learning, however, artificial neural networks tend to eschew precisely designed codes, dynamics or circuits in favor of brute force optimization of a cost function, often using simple and relatively uniform initial architectures. Two recent developments have emerged within machine learning that create an opportunity to connect these seemingly divergent perspectives. First, structured architectures are used, including dedicated systems for attention, recursion and various forms of short- and long-term memory storage. Second, cost functions and training procedures have become more complex and are varied across layers and over time. Here we think about the brain in terms of these ideas. We hypothesize that (1) the brain optimizes cost functions, (2) the cost functions are diverse and differ across brain locations and over development, and (3) optimization operates within a pre-structured architecture matched to the computational problems posed by behavior. In support of these hypotheses, we argue that a range of implementations of credit assignment through multiple layers of neurons are compatible with our current knowledge of neural circuitry, and that the brain's specialized systems can be interpreted as enabling efficient optimization for specific problem classes. Such a heterogeneously optimized system, enabled by a series of interacting cost functions, serves to make learning data-efficient and precisely targeted to the needs of the organism. We suggest directions by which neuroscience could seek to refine and test these hypotheses.
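
To make hypothesis (2) concrete, the sketch below (our illustration, not a model from the paper) optimizes one layer with an unsupervised reconstruction cost and a separate readout layer with a supervised cost; the toy data, layer sizes, and learning rate are arbitrary placeholders.

```python
# Minimal sketch of "diverse cost functions across layers", assuming toy data.
# Layer 1 (W1, V) is trained on an unsupervised reconstruction cost;
# layer 2 (W2) is trained on a supervised squared-error cost.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 200 samples, 10 input features, binary regression target.
X = rng.normal(size=(200, 10))
y = (X[:, :3].sum(axis=1, keepdims=True) > 0).astype(float)

W1 = rng.normal(scale=0.1, size=(10, 5))   # encoder
V = rng.normal(scale=0.1, size=(5, 10))    # decoder (reconstruction head)
W2 = rng.normal(scale=0.1, size=(5, 1))    # supervised readout

lr = 0.01
for step in range(500):
    # Cost function 1 (unsupervised): reconstruction error shapes W1 and V.
    H = X @ W1                      # hidden code
    X_hat = H @ V                   # linear reconstruction
    err_rec = X_hat - X             # gradient of 0.5*||X_hat - X||^2 w.r.t. X_hat
    grad_V = H.T @ err_rec / len(X)
    grad_W1 = X.T @ (err_rec @ V.T) / len(X)
    W1 -= lr * grad_W1
    V -= lr * grad_V

    # Cost function 2 (supervised): squared error shapes only the readout W2.
    H = X @ W1                      # features recomputed with updated W1
    y_hat = H @ W2
    err_sup = y_hat - y
    grad_W2 = H.T @ err_sup / len(X)
    W2 -= lr * grad_W2

print("reconstruction MSE:", float((err_rec ** 2).mean()))
print("supervised MSE:   ", float((err_sup ** 2).mean()))
```

The point of the layer-local costs here is the contrast with end-to-end backpropagation through a single global objective: each set of weights is driven by its own error signal, which is the sense in which the hypothesized system is "heterogeneously optimized".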

Keywords: cognitive architecture; cost functions; neural networks; neuroscience.


Figures

Figure 1
Putative differences between conventional and brain-like neural network designs. (A) In conventional deep learning, supervised training is based on externally supplied, labeled data. (B) In the brain, supervised training of networks can still occur via gradient descent on an error signal, but this error signal must arise from internally generated cost functions. These cost functions are themselves computed by neural modules specified by both genetics and learning. Internally generated cost functions create heuristics that are used to bootstrap more complex learning. For example, an area that recognizes faces might first be trained to detect faces using simple heuristics, like the presence of two dots above a line, and then further trained to discriminate salient facial expressions using representations arising from unsupervised learning and error signals from other brain areas related to social reward processing. (C) Internally generated cost functions and error-driven training of cortical deep networks form part of a larger architecture containing several specialized systems. Although the trainable cortical areas are schematized as feedforward neural networks here, LSTMs or other types of recurrent networks may be a more accurate analogy, and many neuronal and network properties, such as spiking, dendritic computation, neuromodulation, adaptation and homeostatic plasticity, timing-dependent plasticity, direct electrical connections, transient synaptic dynamics, excitatory/inhibitory balance, spontaneous oscillatory activity, axonal conduction delays (Izhikevich, 2006), and others, will influence what and how such networks learn.
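
As an illustration of panel (B), the sketch below (ours, not the authors') trains a toy detector by gradient descent on an error signal produced by an internally generated cost, with a crude "two dots above a line" heuristic standing in for a genetically specified prior; the image generator, the heuristic, and all parameters are hypothetical.

```python
# Minimal sketch: labels come from an internal heuristic module, not from
# externally supplied annotations; the detector is trained on that signal.
import numpy as np

rng = np.random.default_rng(1)

def heuristic_label(img):
    """Internal cost module (assumed): 'face-like' if the top half has two
    bright pixels ('eyes') and the bottom half a bright horizontal run ('mouth')."""
    top, bottom = img[:4], img[4:]
    eyes = (top > 0.8).sum() >= 2
    mouth = (bottom.max(axis=0) > 0.8).sum() >= 3
    return float(eyes and mouth)

def make_image(face):
    """Toy 8x8 image: uniform noise, plus two dots and a line if 'face'."""
    img = rng.uniform(0.0, 0.5, size=(8, 8))
    if face:
        img[1, 2] = img[1, 5] = 1.0       # two dots ("eyes")
        img[6, 2:6] = 1.0                 # a line ("mouth")
    return img

images = [make_image(face=rng.random() < 0.5) for _ in range(400)]
X = np.stack([im.ravel() for im in images])           # (400, 64)
y = np.array([heuristic_label(im) for im in images])  # internally generated labels

# Logistic "detector" trained by gradient descent on the internal error signal.
w, b, lr = np.zeros(64), 0.0, 0.5
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # sigmoid output
    err = p - y                               # cross-entropy gradient term
    w -= lr * X.T @ err / len(X)
    b -= lr * err.mean()

print("agreement with internal heuristic:", ((p > 0.5) == (y > 0.5)).mean())
```

In the caption's terms, the detector trained this way could then be refined by further error signals (e.g., from reward-related areas), with the heuristic cost serving only to bootstrap the representation.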

References

    1. Abbott L., DePasquale B., Memmesheimer R. (2016). Building functional networks of spiking model neurons. Available online at: neurotheory.columbia.edu.
    2. Abbott L. F., Blum K. I. (1996). Functional significance of long-term potentiation for sequence learning and prediction. Cereb. Cortex 6, 406–416.
    3. Ackley D., Hinton G., Sejnowski T. (1985). A learning algorithm for Boltzmann machines. Cogn. Sci. 9, 147–169.
    4. Acuna D. E., Wymbs N. F., Reynolds C. A., Picard N., Turner R. S., Strick P. L., et al. (2014). Multifaceted aspects of chunking enable robust algorithms. J. Neurophysiol. 112, 1849–1856. doi: 10.1152/jn.00028.2014.
    5. Alain G., Lamb A., Sankar C., Courville A., Bengio Y. (2015). Variance reduction in SGD by distributed importance sampling. arXiv:1511.06481.