Institute of Computational Biology, College of Information Science and Technology, Northeast Normal University, Changchun, 130117
Objective To explore, by computational means, an efficient deep learning method that improves the prediction accuracy of outer membrane protein topology given the small number of outer membrane protein samples currently available. Methods First, data sets suitable for predicting outer membrane protein topology were selected and constructed. Second, the optimal model input was determined through feature screening and comparative experiments. Third, a capsule-network-based topology prediction model, TopOMP-capsnet, was built and optimized. Finally, model performance was evaluated and validated by comparison with similar methods. Results and conclusions TopOMP-capsnet performs better than similar methods, which shows that deep learning can identify the relevant sequence patterns under limited sample conditions and can support large-scale classification and screening of outer membrane protein structure and function. Innovation TopOMP-capsnet achieves a three-state prediction accuracy (Q3) of 87.7%, which is superior to traditional machine learning methods.
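For readers unfamiliar with the metric, Q3 is the fraction of residues whose predicted state matches the annotated state. The sketch below illustrates that definition only; it is not the authors' evaluation code. The function name `q3_accuracy`, the one-letter state encoding (I, M, O) and the example sequences are assumptions made for illustration.

```python
def q3_accuracy(true_labels: str, predicted_labels: str) -> float:
    """Return the fraction of residues whose predicted state equals the true state."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label strings must have the same length")
    if not true_labels:
        raise ValueError("label strings must not be empty")
    # Count per-residue agreements between annotation and prediction.
    correct = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return correct / len(true_labels)


# Hypothetical per-residue labels (assumed encoding, not taken from the paper):
# I = inside loop, M = membrane-spanning strand, O = outside loop.
true_topology = "IIMMMMOOOMMMMII"
pred_topology = "IIMMMOOOOMMMMMI"

q3 = q3_accuracy(true_topology, pred_topology)
print(f"Q3 = {q3:.3f}")
```

Computed this way over every residue of a test set, a Q3 of 87.7% would mean that roughly 88 of every 100 residues are assigned the correct state.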