Synonymous codon choice can have dramatic effects on ribosome speed, RNA stability, and protein expression. Ribosome profiling experiments have underscored that ribosomes do not move uniformly along mRNAs, exposing a need for models of translation that capture the full range of empirically observed variation. Starting from ribosome profiling data, we modeled variation in translation elongation using a feedforward neural network that predicts the distribution of ribosomes along an mRNA as a function of its sequence. We applied our model to design synonymous variants of a fluorescent protein in yeast and concluded that control of translation elongation alone is sufficient to produce large, quantitative differences in protein output. This approach gives us an empirical method to learn codon preferences and apply that knowledge to synthetic sequence design. It also opens a path to measuring the impact of so-called silent mutations in human genes: synonymous substitutions that, as we now realize, might affect protein expression without changing the amino acid sequence.
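To make the modeling idea concrete, here is a minimal sketch of a feedforward network that maps a coding sequence to a per-codon ribosome distribution. It is an illustration under stated assumptions, not the model from the talk: the window size (`flank`), the hidden-layer width, the `tanh` activation, and the softmax normalization across positions are all hypothetical choices. The weights are random and untrained, so the output shows only the shape of the computation.

```python
import itertools
import math
import random

# All 64 codons in a fixed order; a codon's index is its one-hot position.
CODONS = ["".join(c) for c in itertools.product("ACGT", repeat=3)]
CODON_INDEX = {c: i for i, c in enumerate(CODONS)}


def split_codons(cds):
    """Split an in-frame coding sequence (DNA or RNA letters) into codons."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def encode_window(codons, center, flank):
    """One-hot encode the codons from center-flank to center+flank.

    Positions that fall off either end of the sequence stay all-zero.
    """
    features = []
    for pos in range(center - flank, center + flank + 1):
        one_hot = [0.0] * len(CODONS)
        if 0 <= pos < len(codons):
            one_hot[CODON_INDEX[codons[pos]]] = 1.0
        features.extend(one_hot)
    return features


def init_network(n_inputs, n_hidden, seed=0):
    """Random, UNTRAINED weights for one hidden layer (assumption: sizes are arbitrary)."""
    rng = random.Random(seed)
    w1 = [[rng.gauss(0.0, 0.1) for _ in range(n_inputs)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    w2 = [rng.gauss(0.0, 0.1) for _ in range(n_hidden)]
    b2 = 0.0
    return w1, b1, w2, b2


def forward(params, x):
    """Feedforward pass: input -> tanh hidden layer -> scalar score."""
    w1, b1, w2, b2 = params
    hidden = [
        math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(w1, b1)
    ]
    return sum(w * h for w, h in zip(w2, hidden)) + b2


def predict_profile(cds, params, flank=2):
    """Score every codon position, then normalize with a softmax over positions
    (assumption) so the result is a distribution along the mRNA."""
    codons = split_codons(cds)
    scores = [
        math.exp(forward(params, encode_window(codons, i, flank)))
        for i in range(len(codons))
    ]
    total = sum(scores)
    return [s / total for s in scores]


# Usage: a toy six-codon sequence and a window of two codons on each side.
flank = 2
params = init_network(n_inputs=len(CODONS) * (2 * flank + 1), n_hidden=8)
profile = predict_profile("ATGGCTAAAGGTGAATAA", params, flank=flank)
```

In a real setting the weights would be fit to ribosome profiling counts. Synonymous variants of a gene could then be compared by their predicted profiles, which is the kind of use the abstract describes for sequence design.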
Giving Voice to Silent Mutations:
Machine Learning Models of Ribosome Decoding
Assistant Professor, Department of Bioengineering
University of California, Berkeley