General intelligence is the exhibition of intelligent behavior across multiple problems in a variety of settings, however intelligence is defined and measured. Endemic in approaches to realize such intelligence in machines is catastrophic forgetting, in which sequential learning corrupts knowledge obtained earlier in the sequence or in which tasks antagonistically compete for system resources. Methods for obviating catastrophic forgetting have either sought to identify and preserve features of the system necessary to solve one problem when learning to solve another, or enforce modularity such that minimally overlapping sub-functions contain task-specific knowledge. While successful in some domains, both approaches scale poorly because they require larger architectures as the number of training instances grows, causing different parts of the system to specialize for separate subsets of the data. Presented here is a method called developmental compression that addresses catastrophic forgetting in the neural networks of embodied agents. It exploits the mild impacts of developmental mutations to lessen adverse changes to previously evolved capabilities and `compresses' specialized neural networks into a single generalized one. In the absence of domain knowledge, developmental compression produces systems that avoid overt specialization, alleviating the need to engineer a bespoke system for every task permutation, and does so in a way that suggests better scalability than existing approaches. This method is validated on a robot control problem and may be extended to other machine learning domains in the future.