Functional connectome fingerprinting: identifying individuals using patterns of brain connectivity
Functional magnetic resonance imaging (fMRI) studies typically collapse data from many subjects, but brain functional organization varies between individuals.
Here we establish that this individual variability is both robust and reliable, using data from the Human Connectome Project to demonstrate that functional connectivity profiles act as a ‘fingerprint’ that can accurately identify subjects from a large group.
Identification was successful across scan sessions and even between task and rest conditions, indicating that an individual’s connectivity profile is intrinsic, and can be used to distinguish that individual regardless of how the brain is engaged during imaging.
Characteristic connectivity patterns were distributed throughout the brain, but the frontoparietal network emerged as most distinctive. Furthermore, we show that connectivity profiles predict levels of fluid intelligence: the same networks that were most discriminating of individuals were also most predictive of cognitive behavior.
Results indicate the potential to draw inferences about single subjects on the basis of functional connectivity fMRI.
All individuals are unique. Nevertheless, human neuroimaging studies have traditionally collapsed data from many subjects to draw inferences about general patterns of brain activity that are common across people.
Studies that contrast two populations—such as patients and healthy controls—typically ignore the considerable heterogeneity within each group. Despite the predominance of such population-level inferences, researchers have long recognized that even among the neurologically healthy, brain structure1–3 and function4–6 show high individual variability.
In terms of function, variability is found in activation patterns during cognitive tasks4–6 as well as intrinsic organization as measured by functional connectivity analyses of data acquired while subjects are resting7.
Recently, the Human Connectome Project (HCP) set out to map the connections in the human brain by acquiring high-quality structural and functional MRI scans from a large number of healthy subjects8. Many of the early analyses of HCP data have focused on elucidating the general blueprint for brain connectivity that is shared across people.
Yet despite the gross similarities, there is reason to believe that a substantial portion of the brain connectome is unique to each individual9. An open question is whether this uniqueness is sufficiently observable by fMRI to enable a transition from population-level studies to investigations of single subjects.
Here we show that functional connectivity profiles act as an identifying fingerprint, establishing that individual variability in connectivity is both substantial and reproducible.
Using HCP data from 126 subjects, each scanned during six separate sessions over 2 d, we demonstrate that a functional connectivity profile obtained from one session can be used to uniquely identify a given individual from the set of profiles obtained from the second session.
We show that identification is successful between rest sessions, task sessions and even across rest and task. These results indicate that although changes in brain state may modulate connectivity patterns to some degree, an individual’s underlying intrinsic functional architecture is reliable enough across sessions and distinct enough from that of other individuals to identify him or her from the group regardless of how the brain is engaged during imaging.
Furthermore, we establish the relevance of these connectivity profiles to behavior by demonstrating, in a fully cross-validated analysis, that functional connectivity profiles can be used to predict the fundamental cognitive trait of fluid intelligence in subjects.
These results provide a critical foundation for future work to novel test inferences about single subjects to reveal how individual functional brain organization relates to distinct behavioral phenotypes
RESULTS
Data for this study consisted of scans from 126 subjects provided in the Q2 data release of the HCP8. Each subject was scanned over a period of 2 d.
We used data from six separate imaging conditions: two rest sessions (one on each day, here referred to as Rest1 and Rest2) and four task sessions (working memory, emotion, motor and language; two on each day).
Functional connectivity was assessed using a functional brain atlas10 consisting of 268 nodes covering the whole brain; this atlas was defined on the basis of a separate population of healthy subjects (Online Methods, Yale data set).
The Pearson correlation coefficients between the time courses of each possible pair of nodes were calculated and used to construct 268 × 268 symmetrical connectivity matrices, where each element represents a connection strength, or edge,
Figure 1
Identification analysis procedure and network definitions.
(a) Database and target design. Data for each subject was collected in six fMRI sessions: a resting-state session (R1), a working-memory task (WM) and a motor task (Mt) on day 1 and a restingstate session (R2), a language task (Lg) and an emotion task (Em) on day 2. For identification, we used a set of connectivity matrices from one session for the database and connectivity matrices from a second session acquired on a different day as the target set. All possible combinations of database and target sessions are indicated by the arrows connecting session pairs.
(b) Identification procedure. Given a query connectivity matrix from the target set, we computed the correlations between this matrix and all the connectivity matrices in the database. The predicted identity ID* is the one with the highest correlation coefficient (argmax). Subj, subject.
(c) Node and network definitions. We used a 268-node functional atlas defined on an independent data set of healthy control subjects using a group-wise spectral clustering algorithm10. Nodes were further grouped into networks (1–10) using the same clustering algorithm, and networks were named according to their correspondence to other existing resting-state network definitions
between two nodes. This was done for each subject for each condition separately, such that each subject had a total of six matrices reflecting connectivity patterns during each of the different scan sessions.
Identification was performed across pairs of scans consisting of one ‘target’ and one ‘database’ session, with the requirement that the target and database sessions be taken from different days: for example, Rest1 matrices were used as the target and compared to a database of Rest2 matrices (Fig. 1a).
In an iterative process, one individual’s connectivity matrix was selected from the target set and compared against each of the connectivity matrices in the database to find the matrix that was maximally similar (Fig. 1b).
Similarity was defined as the Pearson correlation coefficient between vectors of edge values taken from the target matrix and each of the database matrices.
Once an identity had been predicted, the true identity of the target matrix was decoded and that iteration was assigned a score of 1, if the predicted identity matched the true identity, or 0, if it did not.
Within a target-database pair, each individual target connectivity matrix was tested against the database in an independent trial. We tested identification across all possible pairs of scan sessions (Fig. 1a; nine pairs, each with two possible configurations created by exchanging the roles of target and database session).
In each case, the success rate was measured as the percentage of subjects whose identity was correctly predicted out of the total number of subjects. Connectivity-based identification of individual subjects Whole-brain identification.
As a first pass, identification was performed using the whole-brain connectivity matrix (268 nodes; 35,778 edges), with no a priori network definitions. The success rate was 117/126 (92.9%) and 119/126 (94.4%) based on a targetdatabase of Rest1-Rest2 and the reverse Rest2-Rest1, respectively.
The success rate ranged from 68/126 (54.0%) to 110/126 (87.3%) with other database and target pairs, including rest-to-task and taskto-task comparisons.
Given that the 126 identification trials were not independent from one another, we performed nonparametric permutation testing to assess the statistical significance of these results (Online Methods).
Across 1,000 iterations, the highest success rate achieved was 6/126, or roughly 5%. Thus the P value associated with obtaining at least 68 correct identifications (the minimum rate we achieved) is 0.
Network-based identification. We next tested identification accuracy on the basis of each of eight specific functional networks to test the hypothesis that certain brain networks contribute more to individual subject discriminability than others.
These networks were derived from the same set of healthy subjects used to define the 268-node atlas (Fig. 1c). Two networks emerged as the most successful in individual subject identification: the medial frontal network (network 1) and the frontoparietal network (network 2), both comprised of higher-order association cortices in the frontal, parietal and temporal lobes.
We also tested a combination of these networks to determine whether this combination would afford even higher predictive power than each network on its own.
Thus, we determined identification rates based on each network separately, the combination of networks 1 and 2 (referred to for convenience as the frontoparietal networks) as well as the whole brain for each of the nine database and target pairs (Fig. 2a,b). Frontoparietal-based identification accuracy was extremely high between Rest1 and Rest2 (98–99%).
Accuracy dropped slightly when identification was performed between rest and task or two task conditions, but remained high (80–90% for most condition pairs).
The combination of these two frontoparietal networks significantly outperformed either network alone and the whole brain across all 18 database-to-target pairs (one-tailed paired t-test versus network 1, t17 = 10.4, P < 10−9; versus network 2, t17 = 1.97, P = 0.03; versus whole brain, t17 = 5.1, P < 10−5).
Figure 2
Identification accuracy across session pairs and networks.
(a) Identification accuracy based on all nine database and target pairs, where each row shares the same database session and each column shares the same target session. Bar shading (black or gray) indicates which session was used as the database (with the other session serving as the target), according to the legend above each graph. Graphs show accuracy based on each individual network as well as a combination of networks 1 and 2 (the frontoparietal networks) and the whole brain (all).
(b) Identification (ID) results from the combined frontoparietal networks (top) are highlighted in color-coded matrices (bottom) to compare accuracy across rest-rest, rest-task and task-task session pairs. R1, Rest1; WM, working-memory task; Mt, motor task; R2, Rest2; Lg, language task; Em, emotion task.
Factors affecting identification accuracy
We next explored several factors affecting identification accuracy, including quantifying contributions of individual edges to subject discriminability, varying the length of the time courses used to calculate connectivity matrices, expanding the database set to two matrices, comparing different node and network schemes and ruling out potential confounds (motion and anatomic differences).
Quantifying edgewise contributions to identification.
To quantify the extent to which different edges contribute to subject identification, we derived two measures: the differential power (DP) and the group consistency (Φ).
DP reflects each edge’s ability to distinguish an individual subject by quantifying how ‘characteristic’ that edge tends to be, such that edges with a high DP tend to have similar values within an individual across conditions, but different values across individuals regardless of condition.
Φ quantifies the consistency of a connection within a subject and across the group (Online Methods). Restricting analysis to Rest1 and Rest2, we calculated DP and Φ for all edges in the brain and determined which edges were in the 99.5 percentile on each of these two measures (Fig. 3a).
We found that majority of edges in the 99.5 percentile of DP are in the frontal, temporal and parietal lobes and involve nodes in the frontoparietal networks (networks 1 and 2) or default mode (network 3). This result was stable across percentile thresholds
(Supplementary Table 1).
This data-driven mathematical definition of characteristic edges recapitulates the results of our network-based analysis, showing that in general, connections involving higher-order association cortices are the most discriminating of subjects. Approximately 28% of the edges with high DP were within and between networks 1 and 2.
Another 48% were edges linking these two networks to other networks (Fig. 3a), suggesting that levels of interaction between the frontoparietal networks and the rest of the brain are also highly discriminating.
Conversely, the majority of high-Φ edges connected crosshemisphere homologs in the occipital and parietal lobes (Fig. 3a). Many of these edges were within the motor network (network 5) or the primary visual network (network 6; Fig. 3a).
These edges are highly consistent both within and across subjects, and thus do not contribute substantially to individual subject identification
Identification using shorter time courses.
Scan sessions differed in duration, meaning that the amount of data used to compute the connectivity profiles varied between sessions. Sessions Rest1 and Rest2 contained two runs of 1,200 brain volumes (time points) each, and were substantially longer than the working memory scan (two runs of 405 time points), the language scan (316), the motor scan (284) and the emotion scan (176).
To investigate the effect of the number of time points on identification power, we performed frontoparietal-based identification between the two rest sessions while varying the number of time points used to calculate connectivity matrices between 100 and 1,100.
We observed that longer time courses better preserved individual characteristics in connectivity profiles (Fig. 3b) and that temporal variability in the connectivity profiles degraded identification based on shorter time courses, especially those with fewer than ~500 time points.
Identification based on two-matrix database.
We found, as expected, that individual identification is more challenging when performed across connectivity matrices obtained from different task conditions or from task and rest conditions (even after taking into account the fact that task runs contain fewer time points) (Fig. 2).
It is well known that connectivity patterns can be modulated by different imaging conditions11–13, and such modulation contributes to intra-subject variation, making identification more difficult. In an effort to capture the intra-subject variation, we performed identification using an expanded database that included two connectivity matrices per
Figure 3
Factors affecting identification accuracy.
(a) Highly unique (DP, top, red) and highly consistent (Φ, bottom, blue) edges in individual connectivity profiles. For visualization, both sets of edges were thresholded at the 99.5 percentile.
In the circle plots (left), the 268 nodes (inner circle) are organized into a lobe scheme (outer circle) roughly reflecting brain anatomy from anterior (top of circle) to posterior (bottom of circle), and split into left and right hemispheres; lines indicate edges.
In the colored matrices (right), the same data are plotted as percentage of edges within and between each pair of networks; a darkly shaded cell indicates a relative overrepresentation of that network pair in the DP (top) or Φ (bottom) masks.
PFC, prefrontal cortex; mot, motor; ins, insula; par, parietal; tem, temporal; occ, occipital; lim, limbic (including cingulate cortex, amygdala and hippocampus); cer, cerebellum; sub, subcortical (including thalamus and striatum); bsm, brainstem; L, left hemisphere; R, right hemisphere.
(b) Longer time series improve identification accuracy. To control for the fact that task sessions contained fewer time points than rest sessions, we recalculated rest connectivity matrices using truncated time series containing between 100 and 1,100 time points.
Results shown are from 500 randomizations using Rest1 and Rest2 as the database and target sessions, respectively. Boxplots represent median (stripe), 25th and 75th percentiles (box) and range (whiskers). (c) Use of a two-matrix database improves identification rate relative to a single matrix (task or rest).
Dots and error bars represent mean and range of identification rate across all possible database and target pairs, where the target matrix was always from a task session and the database consisted of a rest-task pair (n = 8 combinations), task only (n = 8) or rest only (n = 4). *P < 0.01, Mann-Whitney U test.
subject (one rest session and one task session from the same day, with the target matrix from a task session on the other day; networks 1 and 2 were used for this analysis).
In all cases, accuracy was improved using the two-matrix database over a database of either the rest or task session alone (rank sum = 68, two-sided P = 0.004 versus rest alone; rank sum = 100, two-sided P = 0.0002 versus task alone, MannWhitney U test).
The average success rate increased to 97.2%, with a maximum rate of 100% (average rates were 82.1% with a task-only database and 86.9% with a rest-only database) (Fig. 3c). This suggests that the within-subject variability is well captured by a linear model that includes a baseline (connectivity at rest) and a deviation introduced by an independent task (connectivity during task).
Effects of parcellation scheme.
To investigate the sensitivity of identification to the specific choice of parcellation atlas and network definitions, we repeated the identification experiments using connectivity matrices calculated from the 68-node FreeSurfer atlas14 and grouped into seven networks using the scheme described in Yeo et al.
15 (below referred to as the FreeSurfer-Yeo scheme), where networks 3, 4 and 6 represent the frontoparietal networks. Between the two rest sessions, the identification rate based on this atlas was ~89% when the whole brain was used and ~75% when the frontoparietal networks were used (Fig. 4a).
The reduction in identification accuracy compared to our 268-node functional parcellation and corresponding network definitions, especially in the case of frontoparietal-based identification, suggests that a relatively high-resolution parcellation contributes to the detection of individual variability and boosts identification rate.
To further investigate the difference in identification accuracy, we calculated correlations between connectivity matrices of all 126 subjects from Rest1 and Rest2 on the basis of our atlas and network scheme and the FreeSurfer-Yeo scheme.
In a comparison of the raw cross-subject correlation coefficients for the frontoparietal networks (Fig. 4b), we found that with the FreeSurfer-Yeo scheme, the raw coefficients were larger for both matched (diagonal) subjects (t250 = −4.3, P < 10−5) and unmatched (off-diagonal) subjects (t31,750 = −18.0, P < 10−72) (Fig. 4b), which does not explain the difference in identification accuracy.
To control for equal global distribution of correlation coefficients, we normalized both matrices (Fig. 4c). After normalization, there was no difference between schemes in off-diagonal elements (t31,750 = 0.27, P = 0.79); however, using our parcellation and network scheme, the diagonal elements were significantly larger than with the FreeSurfer-Yeo scheme (t250 = 14.6, P < 10−35), underpinning the increase in identification rate (Fig. 4c).
Effects of head motion.
As head motion is a known confound of connectivity analyses16, we tested whether subjects could be identified on the basis of the distribution of their frame-to-frame motion during functional scans (Online Methods).
Identification based on each subject’s motion distribution had an average success rate of 2.4%, well below the identification accuracy achieved using connectivity profiles. Thus it is unlikely that identification power is based on idiosyncratic patterns related to motion in the scanner
Effects of anatomical differences.
The HCP provides functional data that have been normalized to a common space. However, small errors are inherent to the normalization process, and such errors result in part from individual differences in anatomy, which are constant in time.
Thus, the influence of individual brain anatomy is hard to eliminate completely from the functional data, as these errors could bias the individual identification to select the same subject on two different days.
Nonetheless, any such influence from anatomy should remain largely static and should not be modulated by task conditions. To confirm that identification power came from true differences in functional connectivity rather than anatomic idiosyncrasies, we recalculated connectivity matrices using different smoothing kernels (4 mm, 6 mm and 8 mm) for the blood oxygen level–dependent (BOLD) data.
With larger smoothing kernels, the registration advantages for the same brain compared to different brains should be vastly reduced or eliminated. Yet we saw only a slight drop in identification power: with Rest1 and Rest2 pairs, identification using the frontoparietal
Figure 4
Effect of node and network scheme on identification accuracy.
(a) Identification accuracy with the node atlas and network definitions described in Shen et al.10 (Shen) versus the FreeSurfer-Yeo scheme14,15 (FS + Yeo).
(b) Raw 126 × 126 cross-subject correlations of frontoparietal connectivity patterns from Rest1 and Rest2 (top). Row and column subject order is symmetric; thus, diagonal elements are correlation scores from matched subjects. Mean correlation coefficients for both diagonal (matched) and off-diagonal (unmatched) elements are also shown (bottom, error bars represent ±s.d.). The overall correlation coefficients are higher using the FreeSurfer-Yeo scheme, for both diagonal elements (n = 126) and off-diagonal elements (n = 15,876). **P < 10−5, two-tailed t-test.
(c) Cross-subject correlation matrices after z-score normalization (top). The global difference in correlation values is eliminated as there is no significant (n.s.) difference in the off-diagonal z scores. However, correlations between diagonal elements are significantly higher using the Shen scheme than the FreeSurfer-Yeo scheme (bottom; error bars represent ±s.
(d.), which helps account for the increase in identification accuracy using the Shen scheme. **P < 10−5, two-tailed t-test. networks remained above 96% for all three smoothing levels
(Supplementary Table 2).
We also investigated whether individuals could be identified by BOLD signal variance in each node (Online Methods). Although it is likely that BOLD variance reflects metabolic function to a substantial degree, it could also be influenced by anatomic factors, such as differences among subjects in number of gray-matter voxels per node.
This identification was successful, ranging 48–87% across different combinations of database and target sessions, but in all cases it was less successful than connectivity-based identification
(Fig. 2b and Supplementary Table 3).
Crucially, if BOLD variance were driven mostly by anatomic properties, it should be fairly constant regardless of how the brain is engaged during imaging, and thus the accuracy of variance-based identification should not differ according to brain state.
That there was considerable variability with brain state further suggests that functional rather than anatomic features are responsible for the high degree of identification accuracy observed.
Connectivity profiles predict cognitive behavior
To determine whether individual differences in functional connectivity are relevant to individual differences in behavior, we tested whether connectivity profiles could be used to predict subjects’ level of fluid intelligence.
Fluid intelligence (gF) is the capacity for on-the-spot reasoning to discern patterns and solve problems independently of acquired knowledge17. Levels of gF vary widely in the population18 and correlate with many other cognitive abilities and life outcomes19–22; investigating the biological basis of gF is of interest, as it is thought to reflect intrinsic cognitive ability.
In the HCP protocol, gF was assessed using a form of Raven’s progressive matrices with 24 items23 (scores are integers indicating number of correct items). We used leave-one-subject-out cross-validation to demonstrate that gF can be predicted based solely on the connectivity profile of a previously unseen individual.
In this iterative process, data from one subject is set aside as the test set, and data from the remaining n − 1 subjects is used as the training set. Each iteration consisted of three steps:
(i) feature selection, in which edges with a significant relationship to gF are identified in the training set and separated into two tails according to sign (positive or negative);
(ii) model building, in which training data are used to fit two simple linear regressions between gF and a summary statistic of connectivity strength in the positive- and negative-feature network, respectively, and
(iii) prediction, in which data from the excluded subject are input into each model to generate a predicted gF score (Online Methods). Following all iterations, we assessed the predictive power of each model by correlating predicted and observed gF scores across all subjects. Data from the Rest1 session were used here.
Similarly to the identification analysis, as a first pass, we tested whether whole-brain connectivity could predict gF in novel subjects. On the basis of a feature-selection threshold of P < 0.01, the correlation between predicted and observed gF scores was r = 0.50 (P < 10−9) for the positive-feature model (Fig. 5a) and r = 0.26 (P = 0.005) for the negative-feature model.
Although both models generated significant predictions, the positive-feature model was more accurate than the negative-feature model (z = 2.15, two-tailed P = 0.03). Prediction was significant across all three feature-selection thresholds tested: all r > 0.29, P ≤ 0.001 for the positive tail; all r > 0.22, P ≤ 0.01 for the negative tail. (The predicted range was narrower than the observed range in Fig.
5a; thus the model was most successful at generating predictions of gF level relative to other subjects.) Owing to the nature of our cross-validated approach, a slightly different set of edges was selected in each iteration.
To explore which networks contributed the most predictive power, for each of the eight networks we calculated the average number of within-network edges selected across all iterations and normalized by the total number of within-network edges (to control for differences in overall network sizes).
Networks 1 and 2 contributed the highest fraction of edges to the positive model, and network 3 contributed the highest fraction of edges to the negative model; this was consistent across statistical thresholds for feature selection (Fig. 5b).
Thus, edges that show a consistent positive correlation with gF are disproportionately located in the frontoparietal networks, and edges that show a consistent negative correlation with gF are mostly in the default-mode network.
As a second-pass analysis, we tested directly whether predictive power varied across the different networks; specifically, whether the networks that performed best for identification (the frontoparietal networks) also performed best for predicting cognitive behavior.
To do this, we repeated the leave-one-subject-out cross-validated procedure described above, this time restricting the feature selection step to features (within-network edges) from each of the eight networks in turn.
We also tested a combination of networks 1 and 2. Thus, nine sets of predicted gF scores were generated. Each of these was correlated with observed gF scores to assess the predictive power of each network or network combination (Fig. 5c,d).
Figure 5
Individual connectivity profiles predict cognitive behavior.
(a) Results from a leave-one-subject-out cross-validation (LOOCV) analysis comparing predicted and observed fluid intelligence (gF) scores (n = 118 subjects).
Scatter plot shows predictions based on the whole brain in the positive-feature network at a feature-selection threshold of P < 0.01. Each dot represents one subject; gray area represents 95% confidence interval for best-fit line, used to assess predictive power of the model.
(b) Mean fraction of within-network edges selected in the whole-brain positive-feature (left, red) and negative-feature (right, blue) models, shown at a range of statistical thresholds for feature selection. y axis indicates mean fraction of edges selected across all LOOCV iterations; x axis indicates network (labeled as in Fig. 1c).
(c) Results from a LOOCV analysis in which feature selection was restricted to within-network edges in the frontoparietal networks (1 and 2), at a feature-selection threshold of P < 0.01. Each dot represents one subject; gray area represents 95% confidence interval for best-fit line.
(d) Results from nine LOOCV analyses in which feature selection was restricted to within-network edges in each of the eight networks plus a combination of networks 1 and 2. y axis indicates correlation between predicted and observed gF scores; x axis indicates network label. *P < 0.05, uncorrected. Results based on a range of feature-selection thresholds (P values) are shown to demonstrate consistency across thresholds. For some networks, no features passed the statistical thresholding step and thus it was not possible to generate predictions; this is reflected by missing bars.
As hypothesized, predictive power based on the positive features was highest for the frontoparietal networks (at a feature-selection threshold of P < 0.01, r = 0.42, P < 10−6 and r = 0.39, P < 10−5 for networks 1 and 2, respectively; r = 0.50, P < 10−9 for the two networks combined; this pattern was consistent across different statistical thresholds used for feature selection).
The subcortical-cerebellar network (network 4) also had significant predictive power at a featureselection threshold of P < 0.01 (r = 0.22, P = 0.01) but this result was less consistent across feature-selection thresholds. Predictive power based on the negative features was significant only for the default mode (network 3; r = 0.35, P < 10−5 at P < 0.01); this result was consistent across feature-selection thresholds.
These results reinforce the functional relevance of our identification analyses, in that the networks most discriminating of individuals are also the most relevant to individual differences in behavior. Crucially, the relationship between connectivity and cognitive ability is sufficiently robust to generalize to previously unseen subjects.
DISCUSSION
Here we show that an individual’s functional brain connectivity profile is both unique and reliable, similarly to a fingerprint. We demonstrate that it is possible, with near-perfect accuracy in many cases, to identify an individual from a large group of subjects solely on the basis of his or her connectivity matrix.
Although inter-individual consistency in functional brain networks has been well characterized across task and rest conditions24,25 and even states of consciousness26, the intra-individual reliability observed here suggests that the general blueprint may be shared, but functional organization within individual subjects is idiosyncratic and relatively robust to changes in brain state, and provides meaningful information beyond the common template27.
We also demonstrate that this individual variability is relevant to individual differences in behavior, in that connectivity profiles can be used to predict the fundamental cognitive trait of fluid intelligence in novel subjects.
These results underscore the potential to discover fMRI-based connectivity ‘neuromarkers’ of present or future behavior that may eventually be used to personalize educational and clinical practices and improve outcomes28–30.
Anatomic loci of distinguishing connectivity features
Although identification based on the whole-brain connectivity matrix was highly successful, performance was best using a combination of the two frontoparietal networks. These networks are comprised of higher-order association cortices rather than primary sensory regions; the cortical regions are also the most evolutionarily recent31 and show the highest inter-subject variance7,32,33.
That the frontoparietal networks were most distinguishing of individuals—and the most predictive of behavior—is consistent with their function in cognition. Nodes in these networks tend to act as flexible hubs, switching connectivity patterns according to task demands34.
Additionally, broadly distributed across-network connectivity has been reported in these regions35, suggesting a role in large-scale coordination
a r t ic l e s
of brain activity. Although the frontoparietal network is particularly active in tasks requiring a high degree of cognitive control, here we show that it can identify individuals regardless of whether the data are collected during a task or at rest.
Training cross-subject classifiers based on frontoparietal connectivity to predict which task a subject is performing yields classification accuracy that is statistically significant but still low34; the present findings of high inter-individual variability may help explain these results.
In light of this, future work might use within-subject classification to explore how the frontoparietal networks reorganize according to task demands in individual subjects. Similarly, the frontoparietal networks emerged as most predictive of gF, a finding that is consistent with previous reports that structural and functional properties of these networks relate to intelligence36–38.
Aberrant functional connectivity in the frontoparietal networks has been linked to a variety of neuropsychiatric illnesses39,40; the work presented here, although focused on healthy subjects, suggests that sensitivity may be compromised in studies of disease if inferences are drawn only at the group level.
New insights into neuropsychiatric illnesses may be gained from an approach that links individual functional connectivity profiles to a spectrum of behavioral and symptom measures rather than a single diagnosis41,42.
Additional considerations
The discriminating power of connectivity profiles here is a result of integration over a relatively long period of time (that is, runs that last several minutes). There is a growing body of literature43 showing that in the resting state, functional connectivity is dynamic and varies considerably over short periods of time (that is, intervals shorter than 1 min).
Future work may seek to characterize individuals on the basis of properties of these dynamic fluctuations; however, our results indicate that single measures of time-averaged functional connectivity based on relatively long scan sessions provide meaningful information about individuals.
We note also that our cross-session identification was done between sessions separated by a single day; it remains unclear to what degree individual connectivity profiles are consistent across the lifespan.
Cross-sectional studies have shown changes in functional connectivity with age44–46, but future work should employ longitudinal designs to test the stability or evolution of the functional connectivity fingerprint over months or years rather than days.
From a methodological perspective, the scale of the node atlas seems to influence identification accuracy. The parcellation used in our primary analysis consisted of 268 nodes across the whole brain, with each node optimized to contain voxels with similar resting-state time courses10.
This number is consistent with the range postulated by other groups47,48 but represents a more fine-grained scheme than other atlases, such as the automatic anatomic labeling atlas (90 nodes)49 or the FreeSurfer atlas (68 nodes).
In a comparison of identification rates, network definitions based on our high-resolution parcellation outperformed networks based on the FreeSurfer atlas; it is likely that the coarser node size of the latter diminishes accuracy by averaging out individual variability
Conclusion
Together, these findings suggest that analysis of individual fMRI data is possible and indeed desirable. Given this foundation, human neuroimaging studies have an opportunity to move beyond population-level inferences, in which general networks are derived from the whole sample, to inferences about single subjects, examining how individuals’ networks are functionally organized in unique ways and relating this functional organization to behavioral phenotypes in both health and disease.
Methods
Methods and any associated references are available in the online version of the paper.
Acknowledgments
Data were provided in part by the Human Connectome Project, WU-Minn Consortium (principal investigators, D. Van Essen and K. Ugurbil; 1U54MH091657) funded by the 16 US National Institutes of Health (NIH) institutes and centers that support the NIH Blueprint for Neuroscience Research; and by the McDonnell Center for Systems Neuroscience at Washington University. This work was also supported by NIH grant EB009666 (R.T.C.), T32 DA022975 (D.S.) and the US National Science Foundation Graduate Research Fellowship Program (E.S.F. and M.D.R.)
AUTHOR CONTRIBUTIONS
E.S.F., X.S., D.S., X.P. and R.T.C. conceptualized the study. X.S. designed and performed the identification analyses with support from E.S.F. and D.S. E.S.F. designed and performed the behavioral analyses with support from M.D.R. and J.H. X.S. and X.P. contributed unpublished data analysis tools and visualization software. X.P., M.M.C. and R.T.C. provided support and guidance with data interpretation. E.S.F. wrote the manuscript, with contributions from X.S. and comments from all other authors.
COMPETING FINANCIAL INTERESTS
The authors declare no competing financial interests.
ONLINE METHODS
Subject information. The primary data set used in this work is from the Human Connectome Project (HCP). A second data set (the Yale data set), was used for node and network definitions. These two data sets are described below. HCP data. We used the Q2 HCP data release, which was all the HCP data publicly available at the time that this project began.
The full Q2 release contains data on 142 healthy subjects; we restricted our analysis to subjects for whom all six fMRI sessions were available (n = 126; 40 males, age 22–35). This represents a relatively large sample size compared to most neuroimaging studies and has the advantage of being an open-source data set, facilitating replication and extension of this work by other researchers. Most subjects have at least one blood relative in the group, many of them twins.
A more heterogeneous population should make the identification problem easier, and therefore the high accuracy rate observed here, despite the homogeneity of the sample, underscores the power of functional connectivity–based identification. The resting-state runs (HCP filenames: rfMRI_REST1 and rfMRI_REST2) were acquired in separate sessions on two different days.
Task runs included the following: working memory (tfMRI_WM), motor (tfMRI_MOTOR), language (tfMRI_LANGUAGE, including both a story-listening and arithmetic task) and emotion (tfMRI_EMOTION). The working memory task and motor task were acquired on the first day, and the language and emotion tasks were acquired on the second day. In total, there were six conditions: day 1 rest, day 2 rest, working memory, motor, language and emotion. The HCP scanning protocol was approved by the local Institutional Review Board at Washington University in St.
Louis. For all sessions, data from both the left-right (LR) and right-left (RL) phase-encoding runs were used to calculate connectivity matrices. Full details on the HCP data set have been published previously8. Yale data. This data set consisted of 45 healthy subjects scanned on a Siemens 3T Tim Trio at Yale University (28 males; age = 31 ± 7.3 (s.d.), range 19–50).
Restingstate fMRI data were acquired using a multiband gradient echo EPI sequence with similar parameters to the HCP acquisition (FOV = 210 mm, matrix size 84 × 84, TR = 0.956 ms, TE = 30 ms, resolution = 2.5 mm3). Eight 5.6-min runs were acquired, totaling 45 min of data. Additionally, an MPRAGE image (TR = 2,530 ms, TE = 2.77 ms, TI = 1,100 ms, flip angle = 7 degrees, resolution = 1 mm3) and a two-dimensional anatomical T1 image (TR = 285 ms, TE = 2.61 ms, resolution = 2.5 mm3) were acquired for registration purposes.
All participants provided written informed consent in accordance with a protocol approved by the Human Research Protection Program of Yale University. Preprocessing. The HCP minimal preprocessing pipeline was used50 for the HCP data set. This pipeline includes artifact removal, motion correction and registration to standard space.
For the Yale data set, images were motion corrected using SPM8 and were warped to common space using a series of linear and nonlinear transformations as previously described10.
ures were applied to the fMRI data and included removal of linear components related to the six motion parameters (Yale data) or 12 motion parameters (HCP data; these include first derivatives, given as Movement_Regressors_dt.txt), regression of the mean time courses of the white matter and CSF as well as the global signal, removal of the linear trend, and low-pass filtering.
For the HCP data set, we investigated a range of spatial smoothing Gaussian kernel sizes— from no smoothing to a full-width half-maximum (FWHM) of 4 mm, 6 mm or 8 mm—and found that smoothing level had essentially no effect on identification accuracy (Supplementary Table 2); thus, results based on data with no spatial smoothing are presented in the main text.
(Our node-based analysis, in contrast to a voxel-wise analysis, contains a considerable degree of inherent smoothing because the time courses of many contiguous voxels are averaged into a single node.) Image preprocessing and calculation of connectivity matrices was done using BioImage Suite51.
Pearson correlation coefficients between pairs of node time courses were calculated and normalized to z scores using the Fisher transformation, resulting in a 268 × 268 symmetric connectivity matrix for each session for each subject. Connectivity matrices were not thresholded or binarized in any way.
Functional parcellation and network definition. Using the 45 subjects in the Yale data set, we constructed a 268-node functional atlas using a group-wise spectral clustering algorithm10. The two hemispheres were segmented separately into a target number of 150 regions. The final parcellation was examined to ensure each node contained a reasonable number of voxels.
Note that this single wholebrain parcellation atlas was defined in MNI space and was applied to all subjects in the HCP data set via traditional registration techniques. The parcellation image is publicly available on the BioImage Suite NITRC page (https://www.nitrc.org/ frs/?group_id=51).
In addition to parcellating the brain into 268 functionally coherent nodes, we further clustered these nodes into large-scale networks. To define the networks, the same group-wise spectral clustering algorithm was applied to connectivity matrices from the 45 Yale subjects to group the 268 nodes into eight networks10.
The eight networks were evaluated and compared visually to existing definitions of resting-state networks published by other groups15,25. Despite the fact that we included subcortical regions and cerebellum whereas other definitions excluded these regions, our network configuration matched well with these other network definitions.
Our eight clusters represent approximately the following networks: (i) medial frontal,
(ii) frontoparietal,
(iii) default mode,
(iv) subcortical-cerebellum,
(v) motor,
(vi) visual I,
(vii) visual II and
(viii) visual association (Fig. 1c).
ure. First, a database was created that consisted of all the individual subjects’ connectivity matrices from a single condition, D = = [ , , , ] X i i 1 126 … , where Xi is a 268 × 268 correlation matrix and the subscript i denotes subject.
In the identification step, the identity of the target matrix was predicted using a correlation matrix obtained from a different session. To predict the subject identity, the similarity between the current target matrix Yi and all other matrices in D was computed, and the predicted identity was that with the maximal similarity score.
Similarity was defined as the Pearson correlation between two vectors of edge values taken from the target matrix and each of the database matrices. Note that we performed prediction with replacement, such that the algorithm was not forced to predict a unique subject on each iteration within a condition.
To assess the statistical significance of identification accuracy, we performed nonparametric permutation testing. In each iteration, we first randomly selected one condition from day 1 to serve as the database set and a second condition from day 2 to serve as the target set.
Next, subject identity was permuted—such that each subject in the target set was assigned a ‘correct’ identity corresponding to a different subject in the database set—and identification performed. Then the roles of database and target sets were reversed. This procedure was repeated 1,000 times.
To investigate the contributions of individual networks (as described above) to identification accuracy, a sub-matrix was used corresponding to a single network or combination of networks. If we denote the set of nodes belonging to network j as V k K j jk = = j [ , , , ] u 1 … , where Kj is the total number of nodes in network j, the sub-matrix of network j is X(Vj , Vj ), thus only connections within the selected network(s) are included.
based analyses were performed to determine whether specific edges contribute more to individual identification than other edges. When performing subject identification, the Pearson correlation coefficients were computed between the target connectivity pattern and all connectivity patterns in the database, and the subject identity was chosen to be the one that had the largest correlation coefficient.
Computationally, the Pearson correlation of two vectors is the sum of element-wise products, given that the two vectors are z-score normalized (0 mean with unit s.d.). Therefore, this score can be broken down to quantify the individual amount contributed from each entry in the vector, where some edges contribute positively to the total coefficient and others contribute negatively.
Given two sets of connectivity matrices [ ],[ ] X X i R i 1 2R obtained from Rest1 and Rest2 runs after z-score normalization, we computed the corresponding edge-wise product vector (ϕi), ji i R i R ( ) ( ) e X e X e e M * = ( ), , , = 1 2 1 … where i indexes subject, e indexes edge and M is the total number of edges in the selected network (or the whole brain). The sum of ϕi over all edges Σe i j ( )e is the correlation between [ ] Xi R1 and [ ] Xi R2 .
The group consistency measure (Φ) was computed as the mean of ϕi across all subjects. Large positive entries in Φ are edges that are consistent both within a subject and across the group. npg © 2015 Nature America, Inc.
All rights reserved. doi:10.1038/nn.4135 nature NEUROSCIENCE In the same way, we can calculate ϕi between patterns from different subjects, for example jij i R j R ( ) ( ) e X e X e i j * = ( ), ≠ 1 2 It is possible that an edge e is equally correlated within the same subject and between different subjects. In other words, ϕij(e) (when subject subscript is not matched) and ϕii(e) (when subject subscript is matched) are of similar value.
Therefore, such an edge will not contribute to distinguishing an individual from the other subjects. For an edge to be truly helpful in individual identification, the following property must hold, j j j j ii ij ii ji ( ) ( ), ( ) ( ), e e e e i j > and > ≠ In this way, the particular edge contributes to maximize the correlation between connectivity patterns from matched subjects.
To quantify the differential power of an edge for the purpose of subject identification, we computed an empirical probability P e P e e e e e e e i ji ii ij ii ji ii ij ( ) ( ) ( ) ( ) ( ) ( ( ) ( ) ( ) = > > = > + > j j j j j j j or jii( ) )/ ( ),2 1 126 e N N − = We defined Pi(e) in this way so that it can be interpreted similarly as the P value in a standard statistical test. The smaller the Pi(e) value the better differential power the edge has to identify a single subject i.
The overall differential power of an edge across all subjects is then defined by the differential power measure (DP), DP e P e i i ( ) { ln( ( ))} = − ∑ The results of these edgewise analyses are visualized in Figure 3a. Identification using shorter time courses.
Scan sessions differed in duration: rest sessions were substantially longer than task sessions, and thus the discrepancy in amount of data could at least partially account for the differences in identification accuracy between rest-rest, rest-task and task-task pairs.
To explore this possibility, we tested frontoparietal-based identification between the two rest sessions while varying the number of time points used to calculate connectivity matrices between 100 and 1,100 in increments of 100.
For each number of time points n, identification accuracy was tested for each of 500 randomizations; these randomizations were generated by choosing among 50 starting points for each of the two rest runs and using n brain volumes beginning with that starting point to calculate matrices.
Results based on a database and target of Rest1 and Rest2, respectively, are shown in Figure 3b; results based on reversing the database and target were extremely similar. Identification based on two-matrix database. We also tested a database design option in which two matrices were included for each subject.
The current design of the database and target set makes identification challenging because the two sets of data were not only acquired on separate days, but also under different conditions (except for the Rest1 and Rest2 pair) with the brain engaged in a different cognitive task.
Inputting additional information about the difference between task and rest could potentially improve identification performance. Therefore, we created a database that included a connectivity matrix obtained from a resting-state session and another matrix obtained from a task session acquired on the same day: D = [( , ), , , ] X X i i = R i T 1 126 … . For identification, we always required that the target matrix be obtained on a different day.
To predict the subject identity, we projected the current target matrix Yi to the subspace spanned by the pair [ , ] X X i R i T , to obtain a projection Y i , and then computed the similarity between Y i and Yi to find the best match (Fig. 3c). Effects of parcellation scheme.
To investigate the effect of brain parcellation and network assignment on identification accuracy, we also computed correlation matrices based on the 68-node FreeSurfer atlas included as part of the HCP data. We used Yeo et al.’s previously published seven-network definition15 which included the following networks:
(i) visual,
(ii) motor,
(iii) pre-motor/parietal (dorsal attention),
(iv) ventral attention,
(v) ventromedial prefrontal,
(vi) frontal-parietal control and
(vii) default mode.
These networks do not include subcortical structures or cerebellum. Network labels were assigned to each node in the FreeSurfer atlas as follows. Given a node, we counted the number of voxels belonging to each network. The network with the largest number of voxels was designated the primary network label for the given node.
Because the 68-node parcellation is created independently from the seven networks, their boundaries do not align in a one-to-one manner. Therefore, we also allowed a secondary network association for a node when the number of voxels in this network ranked second and the proportion to the total number of voxels in the node exceeded 30%.
All 68 nodes have at least one primary network association. When defining a sub-matrix that represents a single network, we included nodes for which the target network was either the primary or secondary label. Comparisons of identification accuracy between the Shen and FreeSurfer and Yeo schemes are shown in Figure 4. Effects of head motion.
In order to rule out the possibility that successful identification was driven simply by characteristic movement patterns leading to predictable motion artifacts, we performed prediction using motion estimates only.
HCP data collection provides an estimate of frame-to-frame displacement for each run (Movement_RelativeRMS.txt). These data were used to generate a discrete motion distribution vector from each of the six relative RMS vectors (from the six conditions, using data from the left-right phase encoding run).
we computed the mean and s.d. of the relative RMS across all conditions and subjects. We then specified 60 bins that spanned the grand mean ± 3 s.d., and the motion distribution vectors were calculated accordingly.
The motion distribution vectors were then used in the same way as the correlation matrices for individual identity prediction purposes.
Effects of anatomic differences. Although connectivity calculations are based on functional BOLD data, subtle effects of anatomic variability could potentially confer a preference between the same subject on different days when it comes to applying the 268-node parcellation, defined in standard space, to each individual subject.
To help rule out confounds of anatomy introduced at the registration step, we recalculated connectivity matrices based on BOLD data smoothed with three different kernel sizes (4 mm, 6 mm and 8 mm, keeping all other preprocessing steps the same), and performed the identification analysis again; at higher levels of smoothing, any registration advantage for the same brain relative to a different brain should be eliminated or vastly reduced.
The resulting identification accuracies are presented in
Supplementary Table 2.
We also performed a second analysis to help rule out the possibility that identification was driven mainly by anatomic rather than functional differences between subjects. Rather than connectivity between pairs of nodes, we tested whether individuals could be identified on the basis of a measure of BOLD variance in each node (calculation described below).
In principle, although BOLD variance probably reflects baseline metabolic function to a substantial degree, it could also be influenced by anatomic factors such as partial volume effects introduced by the gray- and white-matter segmentation and/or differing numbers of gray-matter voxels per node owing to underlying variation in regional tissue volumes and gyral folding patterns
This analysis helps to address these potential confounds. BOLD variance was calculated and identification performed as follows: 1. Within each node, we computed the mean BOLD signal in each frame. This yields an N × 268 matrix of node-wise mean BOLD intensities for each subject for each condition, where N is the number of frames (1,200 for the resting-state runs and fewer for the task runs).
(This is identical to the first step in calculating connectivity matrices. Note that the mean across the time dimension is zero because of the drift removal and band-pass steps.) 2. For each node, using its N × 1 time course vector, we compute its variance as follows: variance sum mean time course = ( )/ 2 N This results in a single 1 × 268 vector of node-wise BOLD variances for each brain state for each subject.
In signal processing terms, this variance is also known as mean energy. npg © 2015 Nature America, Inc. All rights reserved. nature NEUROSCIENCE doi:10.1038/nn.4135 3. We then performed identification by computing the correlation between these mean BOLD variance vectors across the different conditions (states) and sessions (days).
The results of this analysis are presented in Supplementary Table 3. Behavioral prediction. In the HCP protocol, fluid intelligence (gF) was assessed using a form of Raven’s progressive matrices with 24 items23 (scores are integers indicating number of correct items; mean = 16.8, s.d. = 4.7, median = 18, mode = 20, range 5–24; HCP: PMAT24_A_CR). In light of evidence that head motion is a substantial confound in functional connectivity analyses, we first checked for any correlation between head motion and gF score.
In the full sample of n = 126, this correlation was trending toward significance (r = −0.15, P = 0.09). Upon further examination, we found that this relationship was driven by a small number of individuals with both high head motion and low gF score.
Thus, for purposes of the behavioral analysis, we excluded subjects with particularly high motion during the Rest1 run; specifically, eight subjects with >0.14 frame-to-frame head motion estimate (averaged across both day 1 rest runs; HCP: Movement_RelativeRMS_mean) were excluded.
There was no correlation between head motion and gF in the remaining set of n = 118 subjects (r = −0.05, P = 0.55). Leave-one-subject-out cross-validation was used for the prediction analysis. In this iterative analysis, features are selected and a predictive model is built based on n − 1 subjects (the training set) and the model is then tested on the remaining subject (the test set).
Each subject is left out once. Each iteration consisted of
(i) feature selection,
(ii) model building, and
(iii) prediction, described in turn below.
In the feature-selection step, Pearson correlation was performed between each edge in the connectivity matrices and gF score across subjects in the training set.
(Note that it is not necessary to correct for multiple comparisons in this step because the nature of a predictive analysis includes a built-in guard against false positives: if the proportion of false positives in the feature-selection step is high, the model should not generalize well to independent data.)
On the basis of the signs of the resulting correlation values, edges were separated into two tails: those positively correlated with gF and those inversely correlated with gF. Edges were then thresholded on the basis of the statistical significance of their correlation with gF, resulting in two sets of features (positive and negative).
Because the choice of statistical threshold at this step is somewhat arbitrary, a range of thresholds was tested (P < 0.01, 0.05, 0.10) to ensure that results were consistent. In the model-building step, we first defined a single-subject summary statistic, ‘network strength’, by summing values of all edges in each feature set in individual connectivity matrices.
In graph-theoretic terms, this statistic can be thought of as a type of weighted degree for each feature network52.
This summary statistic can be represented as follows: [ ] ( ) , Positive feature network strength s ij ij i j = c m + ∑ [ ] ( ) , Negative feature network strength s ij ij i j = c m − ∑ Where c is an individual s’s connectivity matrix, and m(+) and m(−) are binary matrices indexing the edges (i,j) that are significantly positively or negatively correlated with gF, respectively.
After obtaining network strength for each subject in the training set, simple linear regression was used to model the relationship between network strength (the explanatory variable) and gF (the dependent variable).
Two models were built: one based on strength in the positive-feature network and one based on strength in the negative-feature network.
Each model—positive and negative—consisted of a first-degree polynomial that fit the training data best in a least-squares sense: [Predicted gF score a Network strength b ] * ( ) pos pos = + [Predicted gF score a Network strength b ] * ( ) neg neg = + Finally, in the prediction step, positive and negative network strengths from the excluded subject were calculated and input into each of the two respective models to generate predicted gF scores for that subject.
These three steps were repeated iteratively such that each subject was excluded once. Finally, we assessed predictive power of both models by correlating observed versus predicted gF scores across all subjects (Fig. 5). Code availability.
The 268-node functional parcellation is available online on the BioImage Suite NITRC page (https://www.nitrc.org/frs/?group_id=51). Matlab scripts were written to perform the analyses described; this code is available from the authors upon request. A Supplementary Methods Checklist is available.