REVIEWS

What we can do and what we cannot do with fMRI

Nikos K. Logothetis¹

Functional magnetic resonance imaging (fMRI) is currently the mainstay of neuroimaging in cognitive neuroscience. Advances in scanner technology, image acquisition protocols, experimental design, and analysis methods promise to push forward fMRI from mere cartography to the true study of brain organization. However, fundamental questions concerning the interpretation of fMRI data abound, as the conclusions drawn often ignore the actual limitations of the methodology. Here I give an overview of the current state of fMRI, and draw on neuroimaging and physiological data to present the current understanding of the haemodynamic signals and the constraints they impose on neuroimaging data interpretation.

Magnetic resonance imaging (MRI) is the most important imaging advance since the introduction of X-rays by Conrad Röntgen in 1895. Since its introduction in the clinic in the 1980s, it has assumed a role of unparalleled importance in diagnostic medicine and more recently in basic research. In medicine, MRI is primarily used to produce structural images of organs, including the central nervous system, but it can also provide information on the physico-chemical state of tissues, their vascularization, and perfusion. Although all of these capacities have long been widely appreciated, it was the emergence of functional MRI (fMRI), a technique for measuring haemodynamic changes after enhanced neural activity, in the early 1990s that had a real impact on basic cognitive neuroscience research. A recent database (ISI/Web of Science) query using the keywords ‘fMRI’ or ‘functional MRI’ or ‘functional magnetic resonance imaging’ returned over 19,000 peer-reviewed articles. Given that the first fMRI study without exogenous contrast agents was published in 1991, this corresponds to approximately 1,100 papers per year, or over 3 papers per day.
This average obscures the actual rate of publications, as in 1992 there were four publications in total, increasing to about eight per day by 2007. About 43% of papers explore functional localization and/or cognitive anatomy associated with some cognitive task or stimulus, constructing statistical parametric maps from changes in haemodynamic responses from every point in the brain. Another 22% are region of interest studies examining the physiological properties of different brain structures, analogous to single-unit recordings; 8% are on neuropsychology; 5% on the properties of the fMRI signal; and the rest is on a variety of other topics including plasticity, drug action, experimental designs and analysis methods.

In humans, fMRI is used routinely not just to study sensory processing or control of action, but also to draw provocative conclusions about the neural mechanisms of cognitive capacities, ranging from recognition and memory to pondering ethical dilemmas. Its popular fascination is reflected in countless articles in the press speculating on potential applications, and seeming to indicate that with fMRI we can read minds better than direct tests of behaviour itself. Unsurprisingly, criticism has been just as vigorous, both among scientists and the public. In fact, fMRI is not and will never be a mind reader, as some of the proponents of decoding-based methods suggest, nor is it a worthless and non-informative ‘neophrenology’ that is condemned to fail, as has been occasionally argued. Perhaps the extreme positions on both sides result from a poor understanding of the actual capacities and limitations of this technology, as well as, frequently, a confusion between fMRI shortcomings and potential flaws in modelling the organizational principles of the faculties under investigation. For example, a frequently made assumption is that the mind can be subdivided into modules or parts whose activity can then be studied with fMRI.
If this assumption is false, then even if the brain’s architecture is modular, we would never be able to map mind modules onto brain structures, because a unified mind has no components to speak of. Even if true, the challenge remains in coming up with the correct recursive decompositions, in each of which any given cognitive capacity, however abstract, is divided into increasingly smaller functional units that are localized to specific brain parts, which in turn can be detected and studied with fMRI. This is not a neuroimaging problem but a cognitive one. Hierarchical decompositions are clearly possible within different sensory modalities and motor systems. Their mapping, which reflects the brain’s functional organization, is evidently possible and certainly meaningful beyond any reasonable doubt¹.

Here, I offer an assessment of fMRI methodology itself, leaving aside such epistemological and ontological issues. I take the modular organization of many brain systems as a well established fact, and discuss only how far fMRI can go in revealing the neuronal mechanisms of behaviour by mapping different system modules and their dynamic inter-relationships. In this context the term module captures the classical local neuronal circuits repeated iteratively within a structure (for example, the columns or swirling, slab-like tangential arrangements of the neocortex), as well as the entities within which modules might be grouped by sets of dominating external connections. The often used term functional segregation refers to such specialized and spatially separated modules. Segregated entities that are interconnected might further result in nested distributed systems, the activity of which, often termed functional integration, can only be visualized by large-scale neuroimaging.
The principal advantages of fMRI lie in its noninvasive nature, ever-increasing availability, relatively high spatiotemporal resolution, and its capacity to demonstrate the entire network of brain areas engaged when subjects undertake particular tasks. One disadvantage is that, like all haemodynamic-based modalities, it measures a surrogate signal whose spatial specificity and temporal response are subject to both physical and biological constraints. A more important shortcoming is that this surrogate signal reflects neuronal mass activity. Although this fact is acknowledged by the vast majority of investigators, its implications for drawing judicious conclusions from fMRI data are most frequently ignored.

The aim of this review is first to describe briefly the fMRI technology used in cognitive neuroscience, and then to discuss the neurobiological principles that very often limit data interpretation. I hope to point out that the ultimate limitations of fMRI are mainly due to the very fact that it reflects mass action, and much less to limitations imposed by the existing hardware or the acquisition methods. Functional MRI is an excellent tool for formulating intelligent, data-based hypotheses, but only in certain special cases can it be really useful for unambiguously selecting one of them, or for explaining the detailed neural mechanisms underlying the studied cognitive capacities. In the vast majority of cases, it is the combination of fMRI with other techniques and the parallel use of animal models that will be the most effective strategy for understanding brain function.

¹ Max Planck Institute for Biological Cybernetics, 72076 Tuebingen, Germany, and Imaging Science and Biomedical Engineering, University of Manchester, Manchester M13 9PL, UK.

NATURE | Vol 453 | 12 June 2008 | doi:10.1038/nature06976. ©2008 Macmillan Publishers Limited. All rights reserved.

A brief overview of fMRI

The beautiful graphics MRI and fMRI produce, and the excitement about what they imply, often mask the immense complexity of the physical, biophysical and engineering procedures generating them. The actual details of MRI can only be correctly described via quantum mechanics, but a glimpse of the method’s foundation can also be afforded with the tools of classical physics using a few simple equations. (See refs 2 and 3 for a comprehensive account of the theoretical and practical aspects of MRI, and ref. 4 for its functional variants.) Here I offer a brief overview that permits an understandable definition of the terms and parameters commonly used in magnetic resonance imaging (see ‘MRI and fMRI principles’ in the Supplementary Information for a description of the principles and terms of anatomical and functional MRI).
Functional activation of the brain can be detected with MRI via direct measurements of tissue perfusion, blood-volume changes, or changes in the concentration of oxygen. The blood-oxygen-level-dependent (BOLD) contrast mechanism⁵,⁶ is currently the mainstay of human neuroimaging. Critical factors determining the utility of fMRI for drawing conclusions in brain research are signal specificity and spatial and temporal resolution. Signal specificity ensures that the generated maps reflect actual neural changes, whereas spatial and temporal resolution determine our ability to discern the elementary units of the activated networks and the time course of various neural events, respectively. The interpretability of BOLD fMRI data also depends critically on the experimental design used.

Spatiotemporal properties of BOLD fMRI. The spatiotemporal properties of fMRI are covered in some detail in the Supplementary Information. Briefly, spatial specificity increases with increasing magnetic field strength and for a given magnetic field can be optimized by using pulse sequences that are less sensitive to signals from within and around large vessels (see Fig. 1 and ‘Spatial and temporal specificity’ in the Supplementary Information). Spatiotemporal resolution is likely to increase with the optimization of pulse sequences, the improvement of resonators, the application of high magnetic fields, and the invention of intelligent strategies such as parallel imaging, for example, the sensitivity encoding (SENSE) method (see ‘Spatial resolution’ section in the Supplementary Information). Human fMRI can profit a great deal from the use of high-field scanners and from the optimization of the pulse sequences used. Surprisingly, only a minority of the studies in the cognitive sciences seem to exploit the technical innovations reported from laboratories working on magnetic resonance methodologies.
Most of the top-cited cognitive neuroscience studies (approximately 70%) were carried out at 1.5 T scanners, 20% were carried out at 3 T scanners, and very few at 2 T or 4 T field strengths. About 87% of all studies used the conventional gradient-echo echoplanar imaging (GE-EPI), whereas the rest used different variants of the spin-echo echoplanar imaging (SE-EPI) sequence. This combination of low magnetic field and traditional GE-EPI is prone to many localization errors. However, as of the beginning of the twenty-first century the percentage of middle-field (3 T) studies has increased, to reach about 56% in 2007. High magnetic fields are likely to dominate magnetic resonance research facilities in the future, and this should definitely improve the quality of data obtained in human magnetic resonance studies. At the same time, high magnetic field scanners are likely to require even tighter interaction between magnetic resonance physicists and application scientists, as the much larger inhomogeneity of both B0 (main static field) and B1 (the field generated by the excitation pulses) at high field will demand a great deal of expertise and experimental skill to achieve the desired image quality.

Figure 1 | Specificity of GE-EPI and SE-EPI. Examples of high-resolution GE-EPI and SE-EPI (courtesy J. Goense, MPI for Biological Cybernetics). a, b, Two slices of GE-EPI demonstrating the high functional signal-to-noise ratio (SNR) of the images, but also the strong contribution of macrovessels. The yellow areas (indicated with the green arrows) are pial vessels, an example of which is shown in the inset scanning electron microscopy image (total width of inset, 2 mm). For the functional images red indicates low and yellow indicates high. In-plane resolution 333 × 333 µm²; slice thickness 2 mm. c, Anatomical scan, SE-EPI, 250 × 188 µm², 2 mm slice, with time to echo (TE) and repetition time (TR) 70 and 3,000 ms respectively. d, e, Two slices of SE-EPI showing the reduction of vascular contribution at the pial side of the cortex. In-plane resolution 250 × 175 µm², slice thickness 2 mm. f, The anatomical scan is the SE-EPI used for obtaining the functional scans (TE/TR = 48/2,000 ms) but at different greyscale and contrast. The resolution of the anatomical scan permits the clear visualization of the Gennari line (red arrow), the characteristic striation of the primary visual cortex.

All in all, MRI may soon provide us with images of a fraction of a millimetre (for example, 300 × 300 µm² with a couple of millimetres slice thickness or 500 × 500 × 500 µm³ isotropic), which amount to voxel volumes of about two–three orders of magnitude smaller than those currently used in human imaging (see ‘Developments and perspectives’ in the Supplementary Information). With an increasing number of acquisition channels such resolution may ultimately be attained in whole-head imaging protocols, yielding unparalleled maps of distributed brain activity in great regional detail and with reasonable temporal resolution (a couple of seconds).

Would that be enough for using fMRI to understand brain function? The answer obviously depends on the scientific question and the spatial scale at which this question could be addressed: “it makes no sense to read a newspaper with a microscope”, as neuroanatomist Valentino Braitenberg once pointed out. To understand the functioning of the microcircuits in cortical columns or of the cell assemblies in the striosomes of basal ganglia, one must know a great deal about synapses, neurons and their interconnections. To understand the functioning of a distributed large-scale system, such as that underlying our memory or linguistic capacities, one must first know the architectural units that organize neural populations of similar properties, and the interconnections of such units.
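The voxel-volume comparison made above (sub-millimetre voxels being two to three orders of magnitude smaller than conventional ones) can be checked with a few lines of arithmetic. The sketch below is only illustrative: the two high-resolution voxel sizes are those quoted in the text, whereas the 3 × 3 × 3 mm "conventional" voxel is an assumption introduced here as a stand-in for typical human protocols, and the function names are hypothetical.

```python
import math


def voxel_volume_mm3(dx_mm, dy_mm, dz_mm):
    """Volume of a rectangular voxel in cubic millimetres."""
    return dx_mm * dy_mm * dz_mm


def orders_of_magnitude(reference, candidate):
    """How many powers of ten smaller `candidate` is than `reference`."""
    return math.log10(reference / candidate)


# High-resolution voxels quoted in the text.
high_res_slab = voxel_volume_mm3(0.3, 0.3, 2.0)  # 300 x 300 um in-plane, 2 mm slice
high_res_iso = voxel_volume_mm3(0.5, 0.5, 0.5)   # 500 um isotropic

# ASSUMPTION (not from the text): 3 mm isotropic voxels as the conventional case.
conventional = voxel_volume_mm3(3.0, 3.0, 3.0)

slab_gain = orders_of_magnitude(conventional, high_res_slab)
iso_gain = orders_of_magnitude(conventional, high_res_iso)

print(f"slab voxel: {high_res_slab:.3f} mm^3, {slab_gain:.2f} orders smaller")
print(f"isotropic voxel: {high_res_iso:.3f} mm^3, {iso_gain:.2f} orders smaller")
```

Under the assumed 27 mm³ reference voxel, both high-resolution protocols fall between two and three orders of magnitude, consistent with the estimate in the text; a different choice of reference voxel would shift the figure accordingly.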
With 10¹⁰ neurons and 10¹⁴ connections in the cortex alone, attempting to study dynamic interactions between subsystems at the level of single neurons would probably make little sense, even if it were technically feasible. It is probably much more important to understand better the differential activity of functional subunits (whether subcortical nuclei, or cortical columns, blobs and laminae) and the instances of their joint or conditional activation. If so, whole-head imaging with a spatial resolution, say, of 0.7 × 0.7 mm² in slices of 1-mm thickness, and a sampling time of a couple of seconds, might prove optimal for the vast majority of questions in basic and clinical research. All the more so because of the great sensitivity of the fMRI signal to neuromodulation (see below and Supplementary Information). Neuromodulatory effects, such as those effected by arousal, attention, memory, and so on, are slow and have reduced spatiotemporal resolution and specificity⁷,⁸.

Designs and analyses. Many studies initially used block designs, reminiscent of earlier positron emission tomography (PET) paradigms. These designs use time-integrated averaging procedures, and usually analyse the data by means of subtraction methods. The central idea is to compare a task state designed to place specific demands on the brain with an investigator-defined control state. Under these conditions, both enhancements and reductions of the fMRI signal are observed. In the early cognitive fMRI studies the prevailing block design was cognitive subtraction, with an emphasis on serial subtraction designs⁹. Such designs rely strictly on pure insertion, which asserts that a single cognitive process can be inserted into a task without affecting the remainder, an assumption that all too often is not tenable (see ‘On pure insertion’ in the Supplementary Information).
Even if an experimental design could satisfy this assumption at the cognitive level, the assumption would be condemned to fail at the level of its neuronal instantiation¹⁰ owing to the highly nonlinear nature of most brain processes. To overcome this kind of problem and ensure better interpretation of the neuroimaging data it is necessary to perform a detailed task analysis to determine subtraction components and their interactions. Yet most neuroimaging studies provide no formal task analysis that would ensure that the particular cognitive process of interest is indeed being isolated by the subtraction¹¹. Traditional block designs have excellent functional contrast-to-noise ratio (that is, signal difference between test and control epochs, normalized to the mean signal of all epochs), but they are usually long (from 20 to 60 s), and may be confounded by the general state of arousal of the subject. High-speed fMRI methods, capable of whole-brain imaging with a temporal resolution of a few seconds, enabled the employment of so-called event-related designs¹². The time course of the response in such experiments is closer to the underlying neural activity.

The block designs discussed so far may reveal differential patterns of activation only in those cases in which different stimulus attributes or different cognitive processes have distinct, non-overlapping spatial organizations. Overlapping networks of neurons subserving different functions are likely to go unnoticed owing to the spatial averaging that characterizes the blocked subtraction paradigms. Functional MRI adaptation designs were conceived as tools that might, at least to some extent, tackle the problem of spatially overlapping neural networks¹³. In this experimental design, a stimulus is presented repeatedly with the expectation that it will eventually induce response adaptation in neurons selective for its various properties.
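As an aside on the block-design measure defined above, the functional contrast-to-noise ratio (the test-minus-control signal difference normalized to the mean signal of all epochs) can be written out directly. The following is a minimal sketch that takes the parenthetical definition in the text literally; the function name and the signal values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from statistics import mean


def functional_contrast(test_samples, control_samples):
    """Test-minus-control signal difference, normalized to the mean of all samples.

    Follows the definition given in the text; both arguments are flat lists of
    signal values (arbitrary scanner units) pooled over the respective epochs.
    """
    all_samples = list(test_samples) + list(control_samples)
    return (mean(test_samples) - mean(control_samples)) / mean(all_samples)


# HYPOTHETICAL values: a 2% signal increase during task blocks.
control = [1000.0, 1000.0, 1000.0, 1000.0]
test = [1020.0, 1020.0, 1020.0, 1020.0]

contrast = functional_contrast(test, control)
print(f"functional contrast: {contrast:.4f}")
```

With equal numbers of test and control samples the normalizing term is simply the midpoint of the two epoch means, so the measure is close to the fractional signal change relative to baseline.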
In general, repetition of an identical stimulus does indeed produce a reduction in the fMRI signal. After adaptation, the subject is presented with a stimulus that is varied along one dimension (for example, the direction of a moving pattern or the view of a human face) and the possibility of a response rebound is examined. If the underlying neural representation is insensitive to the changes in the stimulus then the fMRI signal will be reduced, similar to the reduction produced by the repetition of identical stimuli. Alternatively, if the neurons are sensitive to the transformation, the signal will show a clear rebound to its original, pre-adaptation level. Functional MRI adaptation designs have been widely used in cognitive neuroscience, but they also have shortcomings, as any area receiving input from another region may reveal adaptation effects that actually occurred in that other region, even if the receiving area itself has no neuronal specificity for the adapted property¹³. Moreover, the conclusions of experiments relying on adaptation designs depend strongly on existing electrophysiological evidence, which itself may hold true for one area and not for another⁷².

Finally, clever analysis is required to exploit clever design. Most studies so far have used voxel-based conventional analyses of MRI time series from one or more subjects¹⁴. The approach is predicated on an extension of the general linear model that allows for correlations between error terms owing to physiological noise or correlations that ensue after temporal smoothing. The method is reliable and, when well implemented, offers the best analysis strategy for most studies. Another approach is to take into account the full spatial pattern of brain activity, measured simultaneously at many locations¹⁵.
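The difference between a voxel-by-voxel analysis and one that uses the full spatial pattern can be illustrated with a small simulation. The sketch below is not the method of any cited study: it assumes unit-variance Gaussian noise, a weak condition effect of random sign in every voxel, a simple nearest-centroid classifier in place of more elaborate decoders, and an approximate Bonferroni-style threshold on the t statistic. All names and parameter values are hypothetical, and only the Python standard library is used.

```python
import math
import random
from statistics import mean, stdev


def simulate(n_voxels=200, n_trials=40, effect=0.3, seed=0):
    """Return (trials_a, trials_b): one voxel pattern (list of floats) per trial.

    ASSUMPTION: every voxel carries a weak effect of random sign plus
    independent unit-variance Gaussian noise.
    """
    rng = random.Random(seed)
    signs = [rng.choice((-1.0, 1.0)) for _ in range(n_voxels)]

    def trial(direction):
        return [direction * 0.5 * effect * s + rng.gauss(0.0, 1.0) for s in signs]

    trials_a = [trial(+1.0) for _ in range(n_trials)]
    trials_b = [trial(-1.0) for _ in range(n_trials)]
    return trials_a, trials_b


def univariate_hits(a, b, t_threshold=3.8):
    """Count voxels whose two-sample t statistic exceeds the threshold.

    The default of 3.8 is roughly a Bonferroni-corrected two-sided 5% threshold
    for 200 tests with 78 degrees of freedom (an approximation, not exact).
    Assumes equal numbers of trials in both conditions.
    """
    n = len(a)
    hits = 0
    for col_a, col_b in zip(zip(*a), zip(*b)):
        pooled_sd = math.sqrt((stdev(col_a) ** 2 + stdev(col_b) ** 2) / 2.0)
        t = (mean(col_a) - mean(col_b)) / (pooled_sd * math.sqrt(2.0 / n))
        if abs(t) > t_threshold:
            hits += 1
    return hits


def centroid(trials):
    """Voxel-wise mean pattern across trials."""
    return [mean(col) for col in zip(*trials)]


def nearest_centroid_accuracy(a, b):
    """Train on the first half of the trials, test on the held-out second half."""
    half = len(a) // 2
    centre_a, centre_b = centroid(a[:half]), centroid(b[:half])

    def dist2(x, c):
        return sum((xi - ci) ** 2 for xi, ci in zip(x, c))

    correct = sum(dist2(x, centre_a) < dist2(x, centre_b) for x in a[half:])
    correct += sum(dist2(x, centre_b) < dist2(x, centre_a) for x in b[half:])
    return correct / ((len(a) - half) + (len(b) - half))


trials_a, trials_b = simulate()
n_hits = univariate_hits(trials_a, trials_b)
accuracy = nearest_centroid_accuracy(trials_a, trials_b)
print(f"voxels passing the corrected univariate test: {n_hits} of {len(trials_a[0])}")
print(f"held-out pattern classification accuracy: {accuracy:.2f}")
```

Under these assumptions one would expect very few voxels to pass the corrected voxel-wise test, while the pattern classifier, which pools weak evidence across all voxels, should perform well above chance on held-out trials. The simulation says nothing about which neural populations generate the patterns, only that the two conditions are statistically separable.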
Such multivariate analyses or pattern-classification-based techniques (decoding techniques) can often detect small differences between two task or stimulus conditions, differences that are not picked up by conventional univariate methods. However, this is not equivalent to saying that they unequivocally reveal the neural mechanisms underlying the activation patterns. The presence, for instance, of voxels selective to two different stimulus attributes could be potentially detected by modern classifiers, yet the existence of two types of patterns does not necessarily imply the existence of two different types of neural populations⁷².

What do activation maps represent?

Does the activation of an area mean that it is truly involved in the task at hand? This question implies that we understand what neural activity in a given area would unequivocally show its participation in the studied behaviour. But do we? It is usually alleged that cognitive capacities reflect the ‘local processing of inputs’ or the ‘output’ of a region, instantiated in the patterns of action potentials, with their characteristic frequency and timing. In principle, brain structures can be conceptualized as information processing entities, with an input, a local-processing capacity, and an output. Yet, although such a scheme may describe the function of subcortical nuclei, its implementation
in different areas of cortex is anything but straightforward. In fact, we now know that the traditional cortical input–elaboration–output scheme, commonly presented as an instantiation of the tripartite perception–cognition–action model, is probably a misleading oversimplification¹⁶. Research shows that the subcortical input to cortex is weak; the feedback is massive, the local connectivity reveals strong excitatory and inhibitory recurrence, and the output reflects changes in the balance between excitation and inhibition, rather than simple feedforward integration of subcortical inputs¹⁷. In the context of this review, the properties of these excitation–inhibition networks (EIN) deserve special attention, and are briefly discussed below.

Feedforward and feedback cortical processing. Brain connectivity is mostly bidirectional. To the extent that different brain regions can be thought of as hierarchically organized processing steps, connections are often described as feedforward and feedback, forward and backward, ascending and descending, or bottom-up and top-down¹⁸. Although all terms agree on processing direction, endowing backward connections with a role of engineering-type or functional ‘feedback’ might occasionally be misleading, as under a theoretical generative model perspective on brain function, it is the backward connections that generate predictions and the forward connections that convey the traditional feedback, in terms of mismatch or prediction error signals¹⁹. In the sensory systems, patterns of long-range cortical connectivity to some extent define feedforward and feedback pathways²⁰. The primary thalamic input goes mainly to middle layers, whereas second-order thalamic afferents and the nonspecific diffuse afferents from basal forebrain and brain-stem are, respectively, distributed diffusely regionally or over many cortical areas, making synapses mainly in superficial and/or deep layers.
Cortical output has thalamic and other subcortical projections originating in layers VI and V, respectively, and corticocortical projections mostly from supragranular layers. The primary thalamic input innervates both excitatory and inhibitory neurons, and communication between all cell types includes horizontal and vertical connections within and between cortical layers. Such connections are divergent and convergent, so that the final response of each neuron is determined by all feedforward, feedback and modulatory synapses¹⁷. Very few of the pyramid synapses are thalamocortical (less than 10–20% in the input layers of cortex, and less than 5% across its entire depth; in the primary visual cortex the numbers are even lower, with the thalamocortical synapses on stellate cells being about 5%²¹), with the rest originating from other cortical pyramidal cells. Pyramidal axon collateral branches ascend back to and synapse in superficial layers, whereas others distribute excitation in the horizontal plane, forming a strongly recurrent excitatory network¹⁷. The strong amplification of the input signal caused by this kind of positive feedback loop is set under tight control by an inhibitory network interposed among pyramidal cells and consisting of a variety of GABAergic interneurons²²,²³. These can receive both excitatory and inhibitory synapses on to their somata, and have only local connections. About 85% of them in turn innervate the local pyramidal cells. Different GABAergic cells target different subdomains of neurons²²,²⁴.
Some (for example, basket cells) target somata and proximal dendrites, and are excellent candidates for the role of gain adjustment of the integrated synaptic response; others (for example, chandelier cells) target directly the axons of nearby pyramidal neurons, and appear to have a context-dependent role²⁵: they can facilitate spiking during low activity periods, or act like gatekeepers that shunt most complex somatodendritic integrative processes during high activity periods (for example, see up and down states below). Such nonlinearities might generate substantial dissociations between subthreshold population activity and its concomitant metabolic demand and the spiking of pyramidal cells.

Modules and their microcircuits. A large number of structural, immunochemical and physiological studies, in all cortical areas examined so far, suggested that the functional characteristics of a cortical module are instantiated in a simple basic EIN, referred to as a canonical microcircuit¹⁷ (see also Fig. 2a). Activation of a microcircuit sets in motion a sequence of excitation and inhibition in every neuron of the module, rather than initiating a sequential activation of separate neurons at different hypothetical processing stages. Re-excitation is tightly controlled by local inhibition, and the time evolution of excitation–inhibition is far longer than the synaptic delays of the circuits involved. This means the magnitude and timing of any local mass activation arise as properties of the microcircuits. Computational modelling suggested that EIN microcircuits, containing such a precisely balanced excitation and inhibition, can account for a large variety of observations of cortical activity, including amplification of sensory input, noise reduction, gain control²⁶, stochastic properties of discharge rates²⁷, modulation of excitability with attention²⁸, or even generation of persisting activity during the delay periods of working memory tasks²⁹.
The principle of excitation–inhibition balance implies that microcircuits are capable of large changes in activity while maintaining proportionality in their excitatory and inhibitory synaptic conductances. This hypothesis has been tested directly in experiments examining conductance changes during periods of high (up) and low (down) cortical activity. Alternating up states and down states can be readily observed in cerebral cortex during natural sleep or anaesthesia³⁰, but they can also be induced in vitro by manipulating the ionic concentrations in a preparation so that they match those found in situ. Research showed that the up state is characterized by persisting synaptically mediated depolarization of the cell membranes owing to strong barrages of synaptic potentials, and a concomitant increase in spiking rate, whereas the down state is marked by membrane hyperpolarization and reduction or cessation of firing³¹,³². Most importantly, the excitation–inhibition conductances indeed changed proportionally throughout the duration of the up state despite large changes in membrane conductance³¹,³².

Microcircuits therefore have the following distinct features: (1) the final response of each neuron is determined by all feedforward, feedback and modulatory synapses; (2) transient excitatory responses may result from leading excitation, for example, due to small synaptic delays or differences in signal propagation speed, whereupon inhibition is rapidly engaged, followed by balanced activity³¹,³²; (3) net excitation or inhibition might occur when the afferents drive the overall excitation–inhibition balance in opposite directions; and (4) responses to large sustained input changes may occur while maintaining a well balanced excitation–inhibition. In the latter case, experimentally induced hyperpolarization of pyramidal cells may abolish their spiking without affecting the barrages of postsynaptic potentials (see ref. 31 and references therein).
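Why proportional changes in excitatory and inhibitory conductances can leave the membrane potential, and hence spiking, comparatively unaffected while synaptic activity changes greatly can be seen in a toy single-compartment model. The sketch below is not taken from the studies cited above: it uses the standard steady-state expression for a passive membrane, V = (g_L·E_L + g_E·E_E + g_I·E_I) / (g_L + g_E + g_I), and the reversal potentials, leak conductance and 3:1 inhibition-to-excitation ratio are illustrative assumptions.

```python
def steady_state_voltage(g_exc, g_inh, g_leak=10.0,
                         e_exc=0.0, e_inh=-80.0, e_leak=-70.0):
    """Steady-state membrane potential (mV) of a passive single compartment.

    Conductances are in nS, reversal potentials in mV. All default values are
    ASSUMPTIONS chosen for illustration, not measurements from the text.
    """
    total = g_leak + g_exc + g_inh
    return (g_leak * e_leak + g_exc * e_exc + g_inh * e_inh) / total


RATIO = 3.0  # ASSUMPTION: inhibitory conductance is three times the excitatory one

# Balanced regime: excitation and inhibition scale together.
v_low = steady_state_voltage(10.0, RATIO * 10.0)
v_high = steady_state_voltage(40.0, RATIO * 40.0)

# Unbalanced regime: excitation quadruples while inhibition stays put.
v_unbalanced = steady_state_voltage(40.0, RATIO * 10.0)

print(f"balanced, low input:   {v_low:.2f} mV")
print(f"balanced, high input:  {v_high:.2f} mV")
print(f"unbalanced excitation: {v_unbalanced:.2f} mV")
```

In this toy model a fourfold, proportional increase in both synaptic conductances moves the potential by little more than a millivolt, whereas the same increase in excitation alone depolarizes the cell by more than 20 mV. This is only meant to make features (3) and (4) above concrete; real microcircuits involve dynamics, spiking thresholds and cell-type-specific inhibition that the sketch ignores.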
It is reasonable to assume that any similar hyperpolarization under normal conditions would decrease spiking of stimulus-selective neurons without affecting presynaptic activity. In visual cortex, recurrent connections among spiny stellate cells in the input layers can provide a significant source of recurrent excitation26. If driven by proportional excitation–inhibition synaptic currents, the impact of their sustained activity might, once again, minimally change the spiking of the pyramidal cells. This last property of microcircuits suggests that changes with balanced excitation–inhibition are good candidates for mechanisms adjusting the overall excitability and the signal-to-noise ratio (SNR) of the cortical output. Thus, depending on their mode of operation, microcircuits can in principle act either as drivers, faithfully transmitting stimulus-related information, or as modulators, adjusting the overall sensitivity and context-specificity of the responses28. Figure 2b summarizes the different types of excitation–inhibition changes and their potential effect on the haemodynamic responses. This interesting and important driver/modulator distinction was initially drawn in the thalamus33, in which the afferents in the major sensory thalamic relays were assigned to one of two major classes on the basis of the morphological characteristics of the axon terminals, the synaptic relationships and the type of activated receptors, the
degree of input convergence, and the activity patterns of postsynaptic neurons. The same concept also broadly applies to the afferents of the cerebral cortex34, wherein the thalamic or corticocortical axons terminating in layer IV can be envisaged as drivers, and other feedback afferents terminating in the superficial layers as modulators. It can also be applied to the cortical output, whereby the projections of layer VI back to the primary relays of the thalamus are modulatory, whereas the cortico-thalamo-cortical paths originating in layer V of cortex, reaching higher-order thalamic nuclei (for example, pulvinar), and then re-entering cortex via layer IV, are drivers33. The initial information reaching a cortical region is elaborated and evaluated in a context-dependent manner, under the influence of strong intra- and cross-regional cortical interactions. The cortical output reflects ascending input but also cortico-thalamo-cortical pathways, whereas its responsiveness and SNR reflect the activity of feedback, and likely input from the ascending diffuse systems of the brainstem. The neuromodulation (see ‘Neurotransmission and neuromodulation’ in Supplementary Information) afforded by these systems, which is thought to underlie the altered states of cognitive capacities, such as motivation, attention, learning and memory, is likely to affect large masses of cells, and potentially induce larger changes in the fMRI signal than the sensory signals themselves.

Excitation–inhibition networks and fMRI. The organization discussed above evidently complicates both the precise definition of the conditions that would justify the assignment of a functional role to an ‘active’ area, and interpretation of the fMRI maps.
Changes in excitation–inhibition balance, whether they lead to net excitation, inhibition, or simple sensitivity adjustment, inevitably and strongly affect the regional metabolic energy demands and the concomitant regulation of cerebral blood flow (CBF) (that is, they significantly alter the fMRI signal). A frequent explanation of the fMRI data simply assumes an increase in the spiking of many task- or stimulus-specific neurons. This might be correct in some cases, but increases of the BOLD signal may also occur as a result of balanced proportional increases in the excitatory and inhibitory conductances, with potential concomitant increases in spontaneous spiking, but still without net excitatory activity in the stimulus-related cortical output. In the same vein, an increase in recurrent inhibition with concomitant decreases in excitation may result in reduction of an area’s net spiking output, but would the latter decrease the fMRI signal? The answer to this question seems to depend on the brain region that is inhibited, as well as on experimental conditions. Direct haemodynamic measurements with autoradiography suggested that metabolism increases with increased inhibition35. An exquisite example is the inhibition-induced increase in metabolism in the cat lateral superior olive (LSO). This nucleus, which contains the representations of low-, middle- and high-tone frequencies, receives afferents from both ears: over a two-neuron pathway from the ipsilateral ear and over a three-neuron pathway from the contralateral ear. Furthermore, it has no presynaptic axo-axonic endings that might mediate presynaptic inhibition via excitatory terminals. Electrophysiology showed that the LSO afferents from the ipsilateral ear are excitatory whereas the afferents from the contralateral ear are inhibitory.
This unusual combination of anatomical and physiological features suggests that if one ear is surgically deafened and the animal is exposed to a high-frequency pure tone, a band of tissue in the LSO on the side opposite to the remaining active ear is subjected to strictly inhibitory synaptic activity without complications by presynaptic inhibition, concurrent lateral excitation, disinhibition/excitation, or other kinds of possibly excitatory action. Under these conditions, maps obtained with [14C]2-deoxyglucose (2DG) autoradiography36 demonstrated clear increases in metabolism in the contralateral LSO37, suggesting that the presynaptic activity in that area is sufficient to show strong energy consumption despite the ensuing spiking reduction. Similar increases in metabolism during the reduction of spike rates were observed during long-lasting microstimulation of the fornix, which induces sustained suppression of pyramidal cell firing in hippocampus38.

[Figure 2 panel labels. a: cortical layers I, II, III, IVA, IVB, IVC, V, VI; Thalamus; GABA cells; Glu cells. b: E, I; Baseline, Up, Down, Net excitation, Net inhibition; fMRI response: Increase, Decrease, Increase, Decrease? (circuit dependent).]

Figure 2 | Principles of excitation–inhibition circuits. a, Model of a canonical cerebral microcircuit (adapted from ref. 71). Three neuronal populations interact with each other: supragranular–granular and infragranular glutamatergic spiny neurons, and GABAergic cells. Excitatory synapses are shown in red and inhibitory synapses in black. All groups receive excitatory thalamic input. The line width indicates the strength of connection. The circuit is characterized by the presence of weak thalamic input and strong recurrence (see text for details). Glu, glutamatergic. b, Potential proportional and opposite-direction changes of cortical excitation (E) and inhibition (I). Responses to large sustained input changes may occur while maintaining a well-balanced excitation–inhibition (up and down). The commonly assumed net excitation or inhibition might occur when the afferents drive the overall excitation–inhibition balance in opposite directions. The balanced proportional changes in excitation–inhibition activity, which occur as a result of neuromodulatory input, are likely to strongly drive the haemodynamic responses.
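The four kinds of excitation–inhibition change distinguished above (up, down, net excitation, net inhibition) can be tabulated with a small bookkeeping sketch. It is not a haemodynamic model. It assumes, purely for illustration, that metabolic demand scales with a weighted sum of excitatory and inhibitory synaptic activity and that net cortical output scales with their difference. The activity levels and the weighting parameter `inh_cost` are hypothetical names and values introduced here, not quantities taken from the text.

```python
# Bookkeeping sketch of excitation (E) and inhibition (I) changes.
# Assumptions (illustrative only): demand ~ E + inh_cost * I, output ~ E - I.

BASELINE = (1.0, 1.0)  # arbitrary units of excitatory and inhibitory activity

REGIMES = {
    "up": (2.0, 2.0),              # proportional increase
    "down": (0.5, 0.5),            # proportional decrease
    "net excitation": (2.0, 0.5),  # E and I pushed in opposite directions
    "net inhibition": (0.5, 2.0),
}


def summarize(e, i, inh_cost=1.0, baseline=BASELINE):
    """Return changes in the demand proxy and the net-output proxy vs. baseline."""
    e0, i0 = baseline
    demand_change = (e + inh_cost * i) - (e0 + inh_cost * i0)
    output_change = (e - i) - (e0 - i0)
    return {"demand_change": demand_change, "output_change": output_change}


if __name__ == "__main__":
    for inh_cost in (1.0, 0.2):
        print(f"inh_cost = {inh_cost}")
        for name, (e, i) in REGIMES.items():
            print(f"  {name:15s} {summarize(e, i, inh_cost)}")
```

Under these assumptions the up and down regimes change the demand proxy while leaving the output proxy untouched. For net inhibition, the sign of the demand change flips with the assumed cost of inhibitory activity, which is one way to read the "circuit dependent" entry in the figure.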