Why do we need homology modeling?
Release 2015_08 of 22-Jul-15 of UniProtKB/Swiss-Prot contains 549008 sequence entries, comprising 195692017 amino acids abstracted from 238140 references.
PDB Current Holdings Breakdown
Exp. Method | Proteins | Nucleic Acids | Protein/NA Complexes | Other
X-ray | 72610 | 1435 | 3730 | …
NMR | 8599 | 1022 | 192 | …
Electron microscopy | 34 | 41 | 123 | …
Hybrid | … | … | … | …
Other | … | … | … | …
Total | … | 2505 | 4052 | …
The gap is huge!
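To put a number on the gap, the slide's own figures can be compared directly. The short sketch below uses only the Swiss-Prot counts and the X-ray and NMR protein counts quoted above; the variable names are illustrative, and the remaining experimental methods are left out because their protein counts are not legible here.

```python
# Figures quoted on the slide (UniProtKB/Swiss-Prot release 2015_08).
sequence_entries = 549008
total_amino_acids = 195692017

# Protein-only counts from the PDB holdings table on the slide.
xray_proteins = 72610
nmr_proteins = 8599

# Structures solved by the two dominant methods.
structures = xray_proteins + nmr_proteins

# How many curated sequences exist per deposited protein structure.
sequences_per_structure = sequence_entries / structures

# Average length of a Swiss-Prot entry, in residues.
mean_length = total_amino_acids / sequence_entries

print(f"X-ray + NMR protein structures: {structures}")
print(f"Swiss-Prot sequences per structure: {sequences_per_structure:.1f}")
print(f"Mean Swiss-Prot sequence length: {mean_length:.0f} residues")
```

With these inputs there are roughly 6.8 Swiss-Prot sequences for every deposited protein structure. Many PDB entries describe the same or nearly the same protein, so the gap per distinct protein is wider than this ratio suggests.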
How often can we do it?
• There are currently ~90000 protein structures in the PDB database.
• This reduces to <20000 structures when restricted to <30% pairwise sequence identity and a resolution <3.0 Å.
• 25% of all sequences can be modeled.
• 50% can be assigned to a fold class.
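The reduction from all PDB entries to a non-redundant set can be pictured as a filter on resolution followed by a redundancy check on sequence identity. The sketch below is one simple greedy way to do this, not necessarily the procedure behind the numbers above. The `Structure` class, the identifiers `A` to `D`, and the identity values are all hypothetical toy data.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Structure:
    pdb_id: str        # hypothetical identifier, for illustration only
    resolution: float  # in Å; smaller is better


def select_nonredundant(structures, identity, max_resolution=3.0, max_identity=30.0):
    """Greedy selection: visit entries from best to worst resolution and keep
    one only if it is below max_identity (%) to every entry already kept."""
    candidates = sorted(
        (s for s in structures if s.resolution < max_resolution),
        key=lambda s: s.resolution,
    )
    kept = []
    for s in candidates:
        if all(identity(s.pdb_id, k.pdb_id) < max_identity for k in kept):
            kept.append(s)
    return kept


# Toy data: four entries and made-up pairwise sequence identities (%).
toy = [
    Structure("A", 1.8),
    Structure("B", 2.2),
    Structure("C", 3.5),  # fails the resolution cutoff
    Structure("D", 2.9),
]
toy_identity = {
    frozenset(("A", "B")): 95.0,  # B is redundant with A
    frozenset(("A", "D")): 22.0,
    frozenset(("B", "D")): 25.0,
}


def lookup(a, b):
    # Identity is symmetric, so the pair is stored as an unordered set.
    return toy_identity[frozenset((a, b))]


selected = select_nonredundant(toy, lookup)
print([s.pdb_id for s in selected])
```

On the toy data, `C` is dropped for resolution and `B` for redundancy with `A`, leaving `A` and `D`. Sorting by resolution first means that, within a group of similar proteins, the best-resolved structure is the one retained.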
Structural genomics project
• Aim to solve the structure of all proteins: this is too much work experimentally!
• Solve enough structures so that the remaining structures can be inferred from those experimental structures.
• The number of experimental structures needed depends on our ability to generate a model.
Structural genomics
[Figure: proteins with known structure shown among proteins with unknown structure]
Homology Modeling: why it works
High sequence identity → High structure similarity
The more similar the sequences, the more similar the structures.
[Figure: plot for native sequences, with a cRMS axis]
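The two quantities being related here, sequence identity and cRMS (coordinate root-mean-square deviation), are both simple to compute once an alignment and a superposition are available. The sketch below assumes the sequences are already aligned (gaps written as `-`) and the coordinates are already optimally superimposed; it does not perform either step. The function names and toy inputs are illustrative, and percent identity is taken over columns where both sequences have a residue, which is only one of several conventions in use.

```python
import math


def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    identical = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            identical += 1
    if compared == 0:
        raise ValueError("no aligned residue pairs to compare")
    return 100.0 * identical / compared


def crms(coords_a, coords_b):
    """Coordinate RMSD between equivalent atoms of two pre-superimposed structures."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


# Toy inputs (made up): one gap column and one mismatch in nine columns.
identity = percent_identity("MKT-AYIAK", "MKTQAYLAK")

# Three equivalent atoms, each displaced by exactly 1 Å along z.
rmsd = crms(
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (2.0, 0.0, 1.0)],
)
print(f"identity = {identity:.1f}%, cRMS = {rmsd:.2f} Å")
```

Computed over many pairs of related proteins, these two numbers give a scatter in which high identity goes with low cRMS. In practice the superposition would be done first, for example with a least-squares fit, before the deviation is measured.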