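ProtPIC (Protein and Peptide Isoelectric Point Calculator) is a machine-learning-based bioinformatics tool for predicting the isoelectric point (pI) of proteins and peptides, together with residue-level pKa values. The tool detects the input type automatically from sequence length: short sequences (≤60 aa) go to the peptide model, which needs only the amino acid sequence; long sequences (>60 aa) go to the protein model, which first computes structural features (RSA, pLDDT, secondary structure) and feeds them into the prediction.
The page returns the isoelectric point; for proteins, it also returns pKa values for the charged residues.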
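The length-based routing described above can be sketched as follows. This is a minimal illustration, not ProtPIC's actual code: the function name `choose_model` and the constant `PEPTIDE_MAX_LEN` are hypothetical, and only the 60-residue threshold comes from the page.

```python
PEPTIDE_MAX_LEN = 60  # threshold stated on the page: <=60 aa uses the peptide model


def choose_model(sequence: str) -> str:
    """Return 'peptide' or 'protein' for a raw amino acid sequence.

    Hypothetical helper: whitespace is stripped before measuring length.
    """
    cleaned = "".join(sequence.split()).upper()
    if not cleaned:
        raise ValueError("empty sequence")
    return "peptide" if len(cleaned) <= PEPTIDE_MAX_LEN else "protein"
```

For example, a 60-residue input would be routed to the peptide model and a 61-residue input to the protein model.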
1. Amino acid sequences (up to 10 FASTA records):
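As a rough sketch of how such input might be validated before submission, the snippet below parses FASTA text, enforces the 10-record limit stated above, and reports the sequence and residue counts. The names `parse_fasta`, `summarize`, and `MAX_RECORDS` are hypothetical, and the handling of headerless input is an assumption.

```python
MAX_RECORDS = 10  # limit stated on the page


def parse_fasta(text: str) -> list[tuple[str, str]]:
    """Parse FASTA text into (header, sequence) pairs.

    Assumption: a bare sequence without a '>' header is accepted as 'unnamed'.
    """
    records: list[tuple[str, str]] = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Close the previous record before starting a new one.
            if header is not None or chunks:
                records.append((header or "unnamed", "".join(chunks)))
            header, chunks = line[1:].strip() or "unnamed", []
        else:
            chunks.append(line.upper())
    if header is not None or chunks:
        records.append((header or "unnamed", "".join(chunks)))
    if len(records) > MAX_RECORDS:
        raise ValueError(f"at most {MAX_RECORDS} FASTA records are supported")
    return records


def summarize(records: list[tuple[str, str]]) -> tuple[int, int]:
    """Return (number of sequences, total number of residues)."""
    return len(records), sum(len(seq) for _, seq in records)
```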
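Technical highlights
- Structural information fusion: integrates RSA (relative solvent accessibility), pLDDT (structure confidence), secondary structure, and related features
- Physical priors: includes the theoretical value computed from the Henderson-Hasselbalch equation as an input feature
- High accuracy: pKa prediction MAE = 0.338 pH units; protein pI prediction MAE = 0.581 pH units; peptide pI prediction MAE = 0.118 pH units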
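To illustrate what a Henderson-Hasselbalch theoretical value looks like, the sketch below computes the net charge of a sequence at a given pH and finds the pH of zero net charge by bisection. The pKa constants are illustrative placeholders and an assumption; the page does not state which pKa set ProtPIC uses, nor how the value enters the model.

```python
# Illustrative pKa values (an assumption, not ProtPIC's parameters).
POSITIVE = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERM, C_TERM = 8.6, 3.6


def net_charge(seq: str, ph: float) -> float:
    """Net charge at a given pH from the Henderson-Hasselbalch equation."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERM))   # protonated N-terminus
    charge -= 1.0 / (1.0 + 10 ** (C_TERM - ph))  # deprotonated C-terminus
    for aa in seq.upper():
        if aa in POSITIVE:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE[aa]))
        elif aa in NEGATIVE:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE[aa] - ph))
    return charge


def hh_isoelectric_point(seq: str, tol: float = 1e-4) -> float:
    """Bisect on pH in [0, 14]; net charge decreases monotonically with pH."""
    lo, hi = 0.0, 14.0
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if net_charge(seq, mid) > 0:
            lo = mid  # still positive, so the pI lies at higher pH
        else:
            hi = mid
    return (lo + hi) / 2.0
```

Bisection is used here because the net charge is a monotonically decreasing function of pH, so the zero crossing is unique within the interval.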
Model performance metrics
Performance comparison (vs IPC 2.0)
------------------------------------------------------------
[pKa prediction] Rosetta dataset, 260 residues
------------------------------------------------------------
Metric             IPC 2.0        ProtPIC (ours)
------------------------------------------------------------
MAE                0.3364         0.338
RMSE               0.5762         0.624
Outliers (>0.5)    54 (20.8%)     48 (18.7%)
------------------------------------------------------------
------------------------------------------------------------
[Protein pI prediction] 581 proteins, IPC2.protein.svr.19
------------------------------------------------------------
Metric             IPC 2.0        ProtPIC (ours)
------------------------------------------------------------
MAE                0.5906         0.581
RMSE               0.8479         0.839
R²                 0.5934         0.623
Outliers (>0.5)    247 (42.5%)    247 (42.5%)
------------------------------------------------------------
------------------------------------------------------------
[Peptide pI prediction] 29,774 peptides, IPC2.peptide.Conv2D
------------------------------------------------------------
Metric              IPC 2.0         ProtPIC (ours)
------------------------------------------------------------
MAE                 0.1216          0.118
RMSE                0.2216          0.228
R²                  0.9761          0.975
Outliers (>0.25)    2,691 (9.0%)    2,878 (9.7%)
------------------------------------------------------------
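The metrics in the comparison tables above could be computed along the following lines. This is a sketch under stated assumptions: R² is taken to be the coefficient of determination (the page does not say whether it is that or a squared correlation), an outlier is a prediction whose absolute error exceeds the threshold shown in the table row, and the function name `regression_metrics` is hypothetical.

```python
import math


def regression_metrics(y_true, y_pred, outlier_threshold):
    """Return MAE, RMSE, R², and outlier count/percentage for paired values."""
    errors = [p - t for t, p in zip(y_true, y_pred)]
    n = len(errors)
    if n == 0:
        raise ValueError("no data")
    mae = sum(abs(e) for e in errors) / n
    ss_res = sum(e * e for e in errors)
    rmse = math.sqrt(ss_res / n)
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    # Assumption: R² = 1 - SS_res / SS_tot (coefficient of determination).
    r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else float("nan")
    outliers = sum(abs(e) > outlier_threshold for e in errors)
    return {
        "mae": mae,
        "rmse": rmse,
        "r2": r2,
        "outliers": outliers,
        "outlier_pct": 100.0 * outliers / n,
    }
```

With this definition, the protein pI row "247 (42.5%)" corresponds to 247 of 581 predictions having an absolute error above 0.5 pH units.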
Performance by pKa / pI range
------------------------------------------------------------
[pKa] ProtPIC (ours), 257 residues
------------------------------------------------------------
pKa Range    Count    MAE      RMSE     Outliers (>0.5)
------------------------------------------------------------
pKa < 4      90       0.339    0.761    15 (16.7%)
pKa 4-6      63       0.190    0.386    3 (4.8%)
pKa 6-8      60       0.370    0.555    13 (21.7%)
pKa 8-10     6        0.538    0.726    3 (50.0%)
pKa > 10     38       0.502    0.668    14 (36.8%)
------------------------------------------------------------
------------------------------------------------------------
[Protein pI] ProtPIC (ours), 581 proteins
------------------------------------------------------------
pI Range           Count    MAE      RMSE     Outliers (>0.5)
------------------------------------------------------------
Acidic (<5)        162      0.602    0.814    81 (50.0%)
Neutral (5-7)      313      0.393    0.542    90 (28.8%)
Basic (7-9)        86       0.949    1.211    57 (66.3%)
Very Basic (>9)    20       1.776    2.041    19 (95.0%)
------------------------------------------------------------
------------------------------------------------------------
[Peptide pI] ProtPIC (ours), 29,774 peptides
------------------------------------------------------------
pI Range                Count    MAE      RMSE     Outliers (>0.25)
------------------------------------------------------------
Very Acidic (<4)        4,578    0.064    0.098    130 (2.8%)
Acidic (4-5)            9,879    0.087    0.156    330 (3.3%)
Neutral-Acidic (5-6)    2,160    0.323    0.423    1,167 (54.0%)
Neutral (6-7)           7,297    0.145    0.273    818 (11.2%)
Basic (7-9)             5,795    0.099    0.230    373 (6.4%)
Very Basic (>9)         65       0.534    0.630    60 (92.3%)
------------------------------------------------------------
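A per-range breakdown like the tables above can be produced by binning each sample and aggregating the absolute errors per bin. The sketch below makes two assumptions the page does not confirm: samples are binned by their reference (experimental) value rather than the predicted one, and each bin includes its lower edge. The function name `mae_by_range` is hypothetical.

```python
from bisect import bisect_right


def mae_by_range(y_true, y_pred, edges, labels):
    """Group absolute errors by the bin of the reference value.

    `edges` are ascending bin boundaries; `labels` has len(edges) + 1 entries.
    Returns {label: (count, MAE or None if the bin is empty)}.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than edges")
    groups = {label: [] for label in labels}
    for t, p in zip(y_true, y_pred):
        # bisect_right puts a value equal to an edge into the upper bin.
        groups[labels[bisect_right(edges, t)]].append(abs(p - t))
    return {
        label: (len(errs), sum(errs) / len(errs) if errs else None)
        for label, errs in groups.items()
    }


# Bin layout mirroring the protein pI table.
PROTEIN_EDGES = [5.0, 7.0, 9.0]
PROTEIN_LABELS = ["Acidic (<5)", "Neutral (5-7)", "Basic (7-9)", "Very Basic (>9)"]
```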
Last updated: 2026-05-03
References
Kozlowski LP. IPC 2.0: prediction of isoelectric point and pKa dissociation constants. Nucleic Acids Res. 2021 Jul 2;49(W1):W285-W292. doi: 10.1093/nar/gkab295. PMID: 33905510; PMCID: PMC8262712.