Protein structure detection in intermediate-resolution cryo-EM maps using deep learning
Owing to recent technological breakthroughs, cryo-EM has established itself as an indispensable method for determining macromolecular structures in structural biology. Recent years have seen a steep increase in the number of biomolecular structures solved by cryo-EM, including many determined at high, near-atomic resolution. Nevertheless, many structures are still solved each year at an intermediate resolution of 5 to 10 Å, or lower. Here, we developed a computational method, Emap2sec, which identifies protein secondary structures (α helices, β sheets, and other structures) in EM maps of 5 to 10 Å resolution. At its core, Emap2sec uses a deep convolutional neural network, an architecture that has recently been very successful in 2D and 3D image recognition tasks. Emap2sec assigns a secondary structure class to each grid point in an EM map, and each local prediction is made in the context of a larger region of the map.
The architecture of Emap2sec is shown in Figure 1. Emap2sec scans an EM map with a cube of 11³ Å³ (11 Å on each side) and detects the secondary structure at the center position of each cube. Emap2sec has a characteristic two-phase neural network architecture, in which the output of the first network is refined by a second network to reduce noise in the prediction. The performance of Emap2sec was tested on two datasets: 34 simulated EM maps at 6.0 Å and 10.0 Å, and 43 experimental EM maps with resolutions ranging from 5.0 to 9.5 Å. The average residue-level accuracy was 83.1% and 79.8% for the simulated maps at 6.0 Å and 10.0 Å, respectively. On the experimental map dataset, the average accuracy was 64.4%, with the highest reaching 91.6%.
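The scan-and-refine idea can be illustrated with a minimal Python sketch. This is not the Emap2sec implementation. It assumes a map stored as a nested list with 1 Å grid spacing, so that a cube of 11 grid points per side spans 11 Å. The names scan_cubes, phase1, and phase2 are hypothetical, the classify argument stands in for the trained first network, and the neighbor averaging in phase2 is only a simple substitute for the second network, used here to show how isolated noisy predictions can be suppressed.

```python
from itertools import product

LABELS = ("helix", "sheet", "other")
CUBE = 11  # cube edge in grid points (assumption: 1 Å spacing, so 11 Å per edge)


def scan_cubes(density, cube=CUBE):
    """Yield (center, sub_cube) for every cube that fits fully inside the map.

    density is a nested list indexed as density[z][y][x].
    """
    nz, ny, nx = len(density), len(density[0]), len(density[0][0])
    half = cube // 2
    for z, y, x in product(range(half, nz - half),
                           range(half, ny - half),
                           range(half, nx - half)):
        sub = [[row[x - half:x + half + 1]
                for row in plane[y - half:y + half + 1]]
               for plane in density[z - half:z + half + 1]]
        yield (z, y, x), sub


def phase1(density, classify, cube=CUBE):
    """Map each cube center to class probabilities.

    classify is a user-supplied callable (a placeholder for the first
    network) that returns one probability per entry in LABELS.
    """
    return {center: classify(sub) for center, sub in scan_cubes(density, cube)}


def phase2(probs, radius=1):
    """Stand-in for the second network: average the phase-1 probabilities
    over neighboring centers, then pick the most probable label."""
    offsets = list(product(range(-radius, radius + 1), repeat=3))
    refined = {}
    for (z, y, x) in probs:
        neighbors = [probs[(z + dz, y + dy, x + dx)]
                     for dz, dy, dx in offsets
                     if (z + dz, y + dy, x + dx) in probs]
        avg = [sum(p[i] for p in neighbors) / len(neighbors)
               for i in range(len(LABELS))]
        refined[(z, y, x)] = LABELS[avg.index(max(avg))]
    return refined
```

Under these assumptions, calling phase2(phase1(density, classify)) returns one label per scanned grid point. A single center that the first pass labels as helix, surrounded by centers labeled as other, is relabeled as other after averaging, which is the kind of noise reduction the two-phase design targets.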
Emap2sec is available at kiharalab.org (https://kiharalab.org/emsuites/). An updated method, Emap2sec+, which also detects DNA/RNA structures in cryo-EM maps, has since been developed and is available at the same website.
A preprint of the Emap2sec+ paper is available on bioRxiv (https://www.biorxiv.org/content/10.1101/2020.08.22.262675v1).
Sai Raghavendra Maddhuri Venkata Subramaniya
Department of Computer Science, Purdue University, West Lafayette, IN, 47907, USA
Protein secondary structure detection in intermediate-resolution cryo-EM maps using deep learning
Maddhuri Venkata Subramaniya SR, Terashi G, Kihara D
Nat Methods. 2019 Sep