UVM Theses and Dissertations
Complex Systems and Data Science
An immense collective effort has been put towards the development of methods forquantifying brain activity and structure. In parallel, a similar effort has focused on collecting experimental data, resulting in ever-growing data banks of complex human in vivo neuroimaging data. Machine learning, a broad set of powerful and effective tools for identifying multivariate relationships in high-dimensional problem spaces, has proven to be a promising approach toward better understanding the relationships between the brain and different phenotypes of interest. However, applied machine learning within a predictive framework for the study of neuroimaging data introduces several domain-specific problems and considerations, leaving the overarching question of how to best structure and run experiments ambiguous. In this work, I cover two explicit pieces of this larger question, the relationship between data representation and predictive performance and a case study on issues related to data collected from disparate sites and cohorts. I then present the Brain Predictability toolbox, a soft- ware package to explicitly codify and make more broadly accessible to researchers the recommended steps in performing a predictive experiment, everything from framing a question to reporting results. This unique perspective ultimately offers recommendations, explicit analytical strategies, and example applications for using machine learning to study the brain.