A foundation model of transcription across human cell types
The Paper of the Day introduces the General Expression Transformer (GET), a model designed to accurately predict gene expression in both seen and unseen cell types. It does so by learning transcriptional regulatory syntax from chromatin accessibility data across 213 human fetal and adult cell types. GET outperforms previous state-of-the-art models at identifying cis-regulatory elements and upstream regulators of fetal hemoglobin. The model also provides rich cell-type-specific regulatory insights: it identifies potential motif-motif interactions and supports the construction of a structural interaction catalogue of human transcription factors and coactivators. Finally, the article demonstrates that GET adapts to different sequencing platforms and assay types, as well as to non-physiological cell types such as tumor cells.