QUICK REVIEW

[Paper Review] Protein Secondary Structure Prediction Using Cascaded Convolutional and Recurrent Neural Networks

Zhen Li, Yizhou Yu | arXiv (Cornell University) | Apr 25, 2016

Machine Learning in Bioinformatics | 33 references | 105 citations

TL;DR

The paper presents an end-to-end deep network (DCRNN) that combines multiscale CNNs for local context and stacked bidirectional GRUs for global context to predict 8-state protein secondary structures and solvent accessibility, achieving state-of-the-art results on CB6133, CB513, CASP10, and CASP11 datasets.

ABSTRACT

Protein secondary structure prediction is an important problem in bioinformatics. Inspired by the recent successes of deep neural networks, in this paper, we propose an end-to-end deep network that predicts protein secondary structures from integrated local and global contextual features. Our deep architecture leverages convolutional neural networks with different kernel sizes to extract multiscale local contextual features. In addition, considering long-range dependencies existing in amino acid sequences, we set up a bidirectional neural network consisting of gated recurrent unit to capture global contextual features. Furthermore, multi-task learning is utilized to predict secondary structure labels and amino-acid solvent accessibility simultaneously. Our proposed deep network demonstrates its effectiveness by achieving state-of-the-art performance, i.e., 69.7% Q8 accuracy on the public benchmark CB513, 76.9% Q8 accuracy on CASP10 and 73.1% Q8 accuracy on CASP11. Our model and results are publicly available.

Motivation & Objective

Improve protein secondary structure prediction by integrating local and global sequence context with deep learning.
Leverage multiscale CNNs to capture local contextual features at multiple window sizes.
Utilize stacked bidirectional GRUs to model long-range dependencies in protein sequences.
Perform multi-task learning to jointly predict secondary structure and solvent accessibility.
Demonstrate state-of-the-art performance on public benchmarks CB6133, CB513, CASP10, and CASP11.

Proposed method

Embed sparse amino acid sequence features into dense representations via an embedding layer.
Concatenate embedded sequence features with PSI-BLAST-derived profile features.
Apply multiscale CNNs with kernel sizes 3, 7, and 11 to extract local contexts (each with 64 channels).
Feed concatenated local contexts into three stacked bidirectional GRU layers (600 hidden units each) with dropout.
Concatenate outputs from CNNs and BGRUs and pass through two fully connected layers with ReLU activations.
Train with a multi-task loss to predict 8-state secondary structure and 4-state solvent accessibility, with L2 regularization and dropout; use Adam optimizer and bagging ensembles of 10 models for robustness.
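
The steps above can be sketched, very roughly, as feature-width bookkeeping plus a joint loss. This is a minimal illustration rather than the authors' implementation. The embedding width (50), the profile width (21), the concatenation of forward and backward GRU states, and the weights `sa_weight` and `l2_weight` are assumptions made for the example; only the kernel sizes, the 64 channels, and the 600 hidden units come from the summary above.

```python
import math


def feature_dims(embed_dim=50, profile_dim=21, kernel_sizes=(3, 7, 11),
                 channels=64, gru_hidden=600):
    """Per-residue feature width after each stage.

    embed_dim and profile_dim are assumed placeholder values.
    """
    cnn_in = embed_dim + profile_dim       # embedded residues + PSI-BLAST profile
    local = channels * len(kernel_sizes)   # one 64-channel map per kernel size
    global_ = 2 * gru_hidden               # assumes forward/backward states are concatenated
    return {"cnn_in": cnn_in, "local": local,
            "global": global_, "fc_in": local + global_}


def cross_entropy(probs, label):
    """Negative log-likelihood of the true class for one residue."""
    return -math.log(probs[label])


def multitask_loss(ss_probs, ss_labels, sa_probs, sa_labels, params,
                   sa_weight=1.0, l2_weight=1e-3):
    """Secondary-structure loss + weighted solvent-accessibility loss + L2 penalty.

    sa_weight and l2_weight are hypothetical hyperparameters.
    """
    n = len(ss_labels)
    ss = sum(cross_entropy(p, y) for p, y in zip(ss_probs, ss_labels)) / n
    sa = sum(cross_entropy(p, y) for p, y in zip(sa_probs, sa_labels)) / n
    l2 = sum(w * w for w in params)
    return ss + sa_weight * sa + l2_weight * l2


dims = feature_dims()
# Uniform predictions over 8 structure states and 4 accessibility states.
loss = multitask_loss([[1 / 8] * 8] * 3, [0, 1, 2],
                      [[1 / 4] * 4] * 3, [0, 1, 2], params=[])
```

Under these assumptions the fully connected layers see the local and global features side by side for every residue, and both tasks share all layers below the two output heads.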

Experimental results

Research questions

RQ1: Can an end-to-end deep architecture effectively integrate local and global contextual features to improve 8-state secondary structure prediction?
RQ2: Do multiscale CNNs combined with stacked bidirectional GRUs outperform prior methods on standard benchmarks?
RQ3: Does multi-task learning for secondary structure and solvent accessibility yield further accuracy gains?
RQ4: How well does the model generalize across CB6133, CB513, CASP10, and CASP11 datasets?

Key findings

Trained on CB6133, the single model achieves 73.2% Q8 accuracy (state of the art) and 76.1% solvent accessibility accuracy on the test set.
On CB513, the single model reaches 69.4% Q8 accuracy, surpassing DeepCNF and other baselines.
Bagging 10 models into an ensemble raises Q8 accuracy on CB513 to 69.7%, with a statistically significant p-value, and improves robustness.
On CASP10 and CASP11, the model achieves Q8 accuracies of 76.9% and 73.1%, and Q3 accuracies of 87.8% and 85.3% (single model), respectively.
Ablation shows that stacked BGRUs and the integration of local and global contexts are critical for performance; removing embedding, multiscale CNNs, backward passes, or using plain RNNs degrades results.
The model demonstrates strong generalization and outperforms multiple existing methods across diverse benchmarks.
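
For readers unfamiliar with the metrics, the following sketch shows how Q8 and Q3 accuracy and a bagged prediction could be computed. It is illustrative only. The 8-to-3 state reduction used here (H, G, I to H; E, B to E; everything else to C) is one common convention and may differ from the paper's. Averaging class probabilities across models is likewise an assumed way to combine an ensemble, and the state ordering in `STATES` is arbitrary.

```python
# Assumed 8-state alphabet and one common reduction to 3 states.
STATES = "HGIEBTSL"
Q8_TO_Q3 = {"H": "H", "G": "H", "I": "H",
            "E": "E", "B": "E",
            "T": "C", "S": "C", "L": "C"}


def accuracy(pred, true):
    """Fraction of residues whose predicted label matches the true label."""
    if len(pred) != len(true):
        raise ValueError("sequences must have equal length")
    return sum(p == t for p, t in zip(pred, true)) / len(true)


def to_q3(labels):
    """Collapse an 8-state label string to 3 states."""
    return "".join(Q8_TO_Q3[s] for s in labels)


def ensemble_predict(model_probs):
    """Average per-residue class probabilities over models, then take the argmax.

    model_probs[m][i] is the 8-way probability vector of model m for residue i.
    """
    n_models = len(model_probs)
    labels = []
    for per_model in zip(*model_probs):            # all models' vectors for one residue
        mean = [sum(col) / n_models for col in zip(*per_model)]
        labels.append(STATES[mean.index(max(mean))])
    return "".join(labels)


true = "HHHGEETL"
pred = "HHHHEEBL"
q8 = accuracy(pred, true)                # 6 of 8 residues match
q3 = accuracy(to_q3(pred), to_q3(true))  # 7 of 8 match after reduction
```

Q3 is always at least as high as Q8 for the same predictions, because confusions within a merged group (such as H versus G) no longer count as errors. This is why the Q3 figures above are well above the Q8 figures.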

This review was created by AI and reviewed by human editors.