[Paper Review] Protein Secondary Structure Prediction Using Deep Multi-scale Convolutional Neural Networks and Next-Step Conditioning
This paper proposes a deep multi-scale convolutional neural network with residual connections and next-step conditioning for protein secondary structure prediction, achieving 70.0% Q8 accuracy on CB513 using a single model and 70.6% via ensembling with a conditional model. The method improves over prior state-of-the-art by leveraging modern deep learning techniques and a novel ensemble strategy to mitigate overfitting in conditional prediction.
Recently developed deep learning techniques have significantly improved the accuracy of various speech and image recognition systems. In this paper we adapt some of these techniques for protein secondary structure prediction. We first train a series of deep neural networks to predict eight-class secondary structure labels given a protein's amino acid sequence information and find that using recent methods for regularization, such as dropout and weight-norm constraining, leads to measurable gains in accuracy. We then adapt recent convolutional neural network architectures--Inception, ResNet, and DenseNet with Batch Normalization--to the problem of protein structure prediction. These convolutional architectures make heavy use of multi-scale filter layers that simultaneously compute features on several scales, and use residual connections to prevent underfitting. Using a carefully modified version of these architectures, we achieve state-of-the-art performance of 70.0% per amino acid accuracy on the public CB513 benchmark dataset. Finally, we explore additions from sequence-to-sequence learning, altering the model to make its predictions conditioned on both the protein's amino acid sequence and its past secondary structure labels. We introduce a new method of ensembling such a conditional model with our convolutional model, an approach which reaches 70.6% Q8 accuracy on CB513. We argue that these results can be further refined for larger boosts in prediction accuracy through more sophisticated attempts to control overfitting of conditional models. We aim to release the code for these experiments as part of the TensorFlow repository.
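As a reading aid, the "per amino acid" (Q8) accuracy quoted in the abstract can be sketched as the fraction of residues whose predicted eight-class label matches the reference label. The snippet below is a minimal illustration, not the authors' evaluation code: the label alphabet `HGIEBTSL` (with `L` standing for loop/irregular), the function name `q8_accuracy`, and the choice to pool over residues rather than average per protein are all assumptions.

```python
# Assumed eight-class alphabet; "L" is used here for the loop/irregular class.
Q8_CLASSES = "HGIEBTSL"

def q8_accuracy(predicted, actual):
    """Fraction of residues whose predicted class matches the true class.

    `predicted` and `actual` are lists of label strings, one string per protein.
    Accuracy is pooled over all residues (an assumption of this sketch).
    """
    correct = 0
    total = 0
    for pred_seq, true_seq in zip(predicted, actual):
        if len(pred_seq) != len(true_seq):
            raise ValueError("prediction and label lengths differ")
        for p, t in zip(pred_seq, true_seq):
            correct += (p == t)
            total += 1
    return correct / total if total else 0.0

# Toy data: 16 residues, 2 mismatches.
predicted = ["HHHHLLEE", "LLEEEELL"]
actual = ["HHHGLLEE", "LTEEEELL"]
print(f"Q8 accuracy: {q8_accuracy(predicted, actual):.3f}")  # Q8 accuracy: 0.875
```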
Motivation & Objective
- To improve protein secondary structure prediction accuracy using modern deep learning architectures adapted from image recognition.
- To investigate the impact of multi-scale convolutional layers and residual connections on secondary structure prediction.
- To explore sequence-to-sequence conditioning on past secondary structure labels to enhance prediction accuracy.
- To address overfitting in conditional models through a novel ensemble strategy with unconditional convolutional models.
- To establish a new state-of-the-art result on the CB513 benchmark dataset using a single model and an ensembled approach.
Proposed method
- Adapts deep convolutional neural network architectures—Inception, ResNet, and DenseNet—with Batch Normalization and multi-scale filters for protein sequence data.
- Applies regularization techniques such as dropout and weight-norm constraining to improve generalization and reduce overfitting.
- Introduces residual connections to preserve local sequence context and prevent information loss across layers.
- Designs a conditional model that predicts secondary structure labels conditioned on both amino acid sequence and prior predicted labels, inspired by sequence-to-sequence learning.
- Employs a weighted beam search ensemble method that combines predictions from the unconditional CNN and the conditional model to reduce error propagation.
- Uses a 42-dimensional input representation combining one-hot amino acid encoding and normalized PSSM profiles from PSI-BLAST.
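The multi-scale and residual ideas in the list above can be sketched in a few lines of plain Python. This is an illustrative toy under stated assumptions, not the paper's architecture: the filter widths (1, 3, 5), the branch size, the width-1 projection back to the input channel count, and all function names are hypothetical, and Batch Normalization and dropout are left out for brevity.

```python
import random

def conv1d_same(x, weights):
    """1D convolution along the sequence with zero padding (output length = input length).

    x: list of positions, each a list of c_in floats.
    weights: weights[k][i][o] for kernel offset k, input channel i, output channel o.
    """
    width, c_in, c_out = len(weights), len(weights[0]), len(weights[0][0])
    half = width // 2
    out = []
    for pos in range(len(x)):
        acc = [0.0] * c_out
        for k in range(width):
            src = pos + k - half
            if 0 <= src < len(x):  # positions outside the sequence act as zeros
                for i in range(c_in):
                    xi = x[src][i]
                    for o in range(c_out):
                        acc[o] += xi * weights[k][i][o]
        out.append(acc)
    return out

def relu(x):
    return [[max(0.0, v) for v in row] for row in x]

def random_weights(width, c_in, c_out, rng, scale=0.1):
    return [[[rng.uniform(-scale, scale) for _ in range(c_out)]
             for _ in range(c_in)] for _ in range(width)]

def multiscale_residual_block(x, branch_weights, proj_weights):
    """Run several filter widths in parallel, concatenate, project, then add the input."""
    branches = [relu(conv1d_same(x, w)) for w in branch_weights]
    # Concatenate the branch outputs along the channel axis at each position.
    concat = [sum((b[pos] for b in branches), []) for pos in range(len(x))]
    # Width-1 convolution maps the concatenated features back to the input channel count.
    projected = conv1d_same(concat, proj_weights)
    # Residual (skip) connection: output = input + transformed features.
    return [[xi + pi for xi, pi in zip(x_row, p_row)]
            for x_row, p_row in zip(x, projected)]

rng = random.Random(0)
seq_len, channels, branch_channels = 12, 42, 8  # 42 matches the input size described above
x = [[rng.random() for _ in range(channels)] for _ in range(seq_len)]
branch_weights = [random_weights(w, channels, branch_channels, rng) for w in (1, 3, 5)]
proj_weights = random_weights(1, 3 * branch_channels, channels, rng)
y = multiscale_residual_block(x, branch_weights, proj_weights)
print(len(y), len(y[0]))  # sequence length and channel count are preserved
```

Because the block adds its input back to the transformed features, it reduces to the identity when the projection weights are zero, which is one common intuition for why residual connections make deeper stacks easier to train.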
Experimental results
Research questions
- RQ1: Can modern deep convolutional architectures from image recognition be effectively adapted for protein secondary structure prediction?
- RQ2: How do multi-scale convolutional layers and residual connections improve performance on the eight-class secondary structure problem?
- RQ3: Does conditioning future predictions on past secondary structure labels lead to measurable accuracy gains in protein structure prediction?
- RQ4: To what extent does overfitting in conditional models limit performance, and can it be mitigated through ensemble learning?
- RQ5: Can a single model achieve state-of-the-art performance without ensembling or multitask learning?
Key findings
- The proposed multi-scale residual convolutional network achieves 70.0% Q8 accuracy on the CB513 benchmark, surpassing the previous state of the art by 0.3 percentage points.
- The addition of residual connections leads to a larger accuracy gain than simply adding more convolutional blocks, indicating improved information retention.
- The conditional model alone achieves 81.7% next-step accuracy on validation with ground truth context but drops to 67.1% on test under beam search, indicating strong overfitting.
- Ensembling the conditional model with the unconditional CNN improves test accuracy to 70.6%, a 0.9 percentage point improvement over the prior best result.
- The conditional ensemble outperforms a simple ensemble of two unconditional models (70.6% vs. 70.4%), suggesting a measurable benefit from conditioning despite overfitting.
- The results indicate that overfitting in conditional models is primarily due to a tendency to copy previous labels, which can be mitigated through strategic ensemble weighting.
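To make the ensembling findings above more concrete, here is a minimal sketch of beam search in which each step is scored by a weighted mix of an unconditional model and a conditional (next-step) model. It is an assumption-laden illustration rather than the paper's exact scoring rule: the weight `alpha`, the label alphabet, the beam size, and the toy `copying_conditional` model (which mimics the label-copying tendency noted above) are all hypothetical.

```python
import math

CLASSES = "HGIEBTSL"  # assumed eight-class alphabet

def ensemble_beam_search(uncond_logp, cond_logp_fn, beam_size=4, alpha=0.5):
    """Decode labels left to right, mixing both models' log-probabilities at each step.

    uncond_logp: list over positions of {label: log-prob} from the unconditional model.
    cond_logp_fn(t, prefix): {label: log-prob} from the conditional model,
        given the labels chosen so far.
    alpha: weight on the conditional model (hypothetical parameter).
    """
    beam = [(0.0, "")]  # (cumulative score, label prefix)
    for t, uncond in enumerate(uncond_logp):
        candidates = []
        for score, prefix in beam:
            cond = cond_logp_fn(t, prefix)
            for label in CLASSES:
                step = alpha * cond[label] + (1.0 - alpha) * uncond[label]
                candidates.append((score + step, prefix + label))
        candidates.sort(key=lambda c: c[0], reverse=True)
        beam = candidates[:beam_size]  # keep only the best partial labelings
    return beam[0][1]

def peaked(label, p=0.6):
    """Log-distribution with probability p on `label` and the rest spread evenly."""
    rest = (1.0 - p) / (len(CLASSES) - 1)
    return {c: math.log(p if c == label else rest) for c in CLASSES}

def copying_conditional(t, prefix):
    """Toy conditional model that mostly repeats the previous label."""
    if not prefix:
        return peaked("L", p=1.0 / len(CLASSES))  # uniform at the first step
    return peaked(prefix[-1], p=0.9)

# Toy unconditional predictions that favour the labeling "HHHLLEEE".
uncond = [peaked(c) for c in "HHHLLEEE"]
for alpha in (0.0, 0.5, 1.0):
    print(alpha, ensemble_beam_search(uncond, copying_conditional, alpha=alpha))
```

With `alpha=0.0` the search ignores the conditional model, and with `alpha=1.0` it follows the copying model alone, so intermediate weights show how the unconditional model can counteract label copying.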
This review was created by AI and reviewed by human editors.