UniProt database notes


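Take the full set of human protein sequences in UniProt as an example: the number of Swiss-Prot entries plus the number of TrEMBL entries equals the total number of human protein entries in UniProt. The unreviewed TrEMBL section and the reviewed Swiss-Prot section are therefore mutually exclusive; neither one contains the other.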
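The partition described above can be checked on a UniProt FASTA download by counting headers per section. The following is a minimal sketch, assuming the standard `>sp|...` (Swiss-Prot) and `>tr|...` (TrEMBL) header prefixes. The function name `count_by_section`, the two `tr` headers, and all sequence lines in the toy input are made up for illustration and are not real entries.

```python
from collections import Counter


def count_by_section(lines):
    """Count FASTA headers per UniProtKB section ('sp' or 'tr')."""
    counts = Counter()
    for line in lines:
        if not line.startswith(">"):
            continue  # sequence line, not a header
        # The section prefix is the text between '>' and the first '|'.
        section = line[1:].split("|", 1)[0]
        counts[section] += 1
    return counts


# Toy input: only the first header is a real entry; the 'tr' headers and
# the sequence lines are hypothetical placeholders.
sample = [
    ">sp|P40337|VHL_HUMAN von Hippel-Lindau disease tumor suppressor "
    "OS=Homo sapiens OX=9606 GN=VHL PE=1 SV=2",
    "PLACEHOLDERSEQ",
    ">tr|X0000001|X0000001_HUMAN Placeholder protein OS=Homo sapiens OX=9606 PE=4 SV=1",
    "PLACEHOLDERSEQ",
    ">tr|X0000002|X0000002_HUMAN Placeholder protein OS=Homo sapiens OX=9606 PE=4 SV=1",
    "PLACEHOLDERSEQ",
]

counts = count_by_section(sample)
reviewed = counts["sp"]
unreviewed = counts["tr"]
total = sum(counts.values())

# Every header belongs to exactly one section, so the two counts add up.
assert reviewed + unreviewed == total
```

To run this on a real download, pass the open file object instead of `sample`; iterating over it yields lines in the same way.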

Entry status




This subsection of the ‘Entry information’ section indicates whether the entry has been manually annotated and reviewed by UniProtKB curators or not, in other words, if the entry belongs to the Swiss-Prot section of UniProtKB (reviewed) or to the computer-annotated TrEMBL section (unreviewed).

UniProtKB/Swiss-Prot entries are tagged with a yellow reviewed icon

UniProtKB/TrEMBL entries are tagged with a blue unreviewed icon

Automatic annotation



UniProt’s Automatic Annotation pipeline enhances the unreviewed records in UniProtKB by enriching them with automatic classification and annotation.

Interpreting the information in the entry header

Taking the VHL protein as an example, its FASTA header reads: >sp|P40337|VHL_HUMAN von Hippel-Lindau disease tumor suppressor OS=Homo sapiens OX=9606 GN=VHL PE=1 SV=2
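  • sp: the database prefix for Swiss-Prot (reviewed); TrEMBL (unreviewed) entries use tr instead

  • P40337: the UniProtKB accession number

  • VHL_HUMAN: the UniProtKB entry name

  • von Hippel-Lindau disease tumor suppressor: the protein name

  • OS=Homo sapiens: OS gives the organism's scientific name; Homo sapiens is the Latin binomial for human

  • OX=9606: the organism's taxonomy identifier, i.e. its NCBI Taxonomy ID

  • GN=VHL: Gene Name; the gene is VHL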

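  • PE=1: Protein Existence, the type of evidence supporting the protein's existence. There are five levels, and a lower number means stronger evidence:

    • 1: Experimental evidence at protein level
    • 2: Experimental evidence at transcript level
    • 3: Protein inferred from homology
    • 4: Protein predicted
    • 5: Protein uncertain
  • SV=2: Sequence Version, the version number of the sequence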
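The field breakdown above can be turned into a small parser. This is a minimal sketch, assuming the header follows the standard pipe-delimited UniProtKB layout `>db|accession|entry_name protein_name OS=... OX=... [GN=...] PE=... SV=...`. The names `HEADER_RE`, `PE_LEVELS`, and `parse_header` are illustrative choices and not part of any UniProt tool.

```python
import re

# Regular expression for a UniProtKB FASTA header; GN= is optional.
HEADER_RE = re.compile(
    r"^>(?P<db>sp|tr)\|(?P<accession>[^|]+)\|(?P<entry_name>\S+)\s+"
    r"(?P<protein_name>.+?)\s+OS=(?P<organism>.+?)\s+OX=(?P<taxonomy_id>\d+)"
    r"(?:\s+GN=(?P<gene_name>\S+))?"
    r"\s+PE=(?P<protein_existence>[1-5])\s+SV=(?P<sequence_version>\d+)\s*$"
)

# Meaning of each Protein Existence level.
PE_LEVELS = {
    1: "Experimental evidence at protein level",
    2: "Experimental evidence at transcript level",
    3: "Protein inferred from homology",
    4: "Protein predicted",
    5: "Protein uncertain",
}


def parse_header(line):
    """Split one FASTA header line into a dict of named fields."""
    match = HEADER_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a recognised UniProtKB header: {line!r}")
    fields = match.groupdict()
    # Convert the numeric fields from strings to integers.
    for key in ("taxonomy_id", "protein_existence", "sequence_version"):
        fields[key] = int(fields[key])
    fields["reviewed"] = fields["db"] == "sp"  # sp = Swiss-Prot = reviewed
    return fields


vhl = parse_header(
    ">sp|P40337|VHL_HUMAN von Hippel-Lindau disease tumor suppressor "
    "OS=Homo sapiens OX=9606 GN=VHL PE=1 SV=2"
)
print(vhl["accession"], vhl["entry_name"], vhl["gene_name"])
print(PE_LEVELS[vhl["protein_existence"]])
```

A regular expression is used here instead of splitting on spaces because both the protein name and the organism name can contain spaces; the `OS=`, `OX=`, `PE=`, and `SV=` tags serve as the delimiters.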


