Hypothetical protein
In biochemistry, a hypothetical protein is a protein whose existence has been predicted but for which there is no experimental evidence that it is expressed in vivo. Genome sequencing has produced numerous predicted open reading frames to which functions cannot be readily assigned. These proteins, classed as either orphan or conserved hypothetical proteins, make up roughly 20% to 40% of the proteins encoded in each newly sequenced genome.[1] Even when techniques such as microarray analysis and mass spectrometry provide enough evidence that the gene product is expressed, a function is difficult to assign because the sequence lacks identity to protein sequences with annotated biochemical function. Most protein sequences are now inferred from computational analysis of genomic DNA sequence, and hypothetical protein annotations are produced by gene prediction software during genome analysis. When the bioinformatic tool used for gene identification finds a large open reading frame without a characterised homologue in the protein database, it returns "hypothetical protein" as an annotation remark.
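The annotation step described above can be illustrated with a minimal sketch. It is not the method of any particular gene prediction tool: the names `find_orfs`, `annotate` and `known_genes` are hypothetical, only the three forward reading frames are scanned, and the homologue search is replaced by an exact dictionary lookup purely for illustration.

```python
# Toy illustration: find open reading frames (ORFs) on the forward strand
# and label those without a known match as "hypothetical protein".

STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=5):
    """Return (start, end, sequence) tuples for ORFs in the three forward frames.

    An ORF here runs from an ATG to the first in-frame stop codon and must
    contain at least `min_codons` codons, not counting the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i  # first start codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, dna[start:i + 3]))
                start = None  # look for the next ORF in this frame
    return orfs


def annotate(orf_seq, known_genes):
    """Look up an ORF in a (hypothetical) table of characterised genes.

    Assumption: real pipelines use similarity searches against protein
    databases; an exact lookup stands in for that step here.
    """
    return known_genes.get(orf_seq, "hypothetical protein")


if __name__ == "__main__":
    dna = "CC" + "ATGGCTGCTGCTGCTGCTTAA" + "GG" + "ATGAAATAG"
    for start, end, seq in find_orfs(dna):
        print(start, end, annotate(seq, known_genes={}))
```

The `min_codons` threshold stands in for the "large open reading frame" criterion: short ORFs arise frequently by chance, so only longer ones are reported as candidate genes.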
The function of a hypothetical protein can be predicted, with varying levels of confidence, by domain homology searches.[2] Conserved domains found in a hypothetical protein are compared with the domains of known families, which allows the protein to be classified into a particular protein family even though it has not been investigated in vivo. Function can also be predicted by homology modelling, in which the hypothetical protein is aligned with a protein sequence whose three-dimensional structure is known; if a structure can be modelled, the protein's capacity to function can be assessed computationally.[2][3] Further approaches to functional annotation include determining the three-dimensional structure of these proteins through structural genomics initiatives, characterising the nature and mode of prosthetic group or metal ion binding, identifying fold similarity with proteins of known function, and annotating possible catalytic and regulatory sites.[4] Structure prediction combined with biochemical function assessment, by screening against various substrates, is another promising approach to functional annotation.[2]
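The idea of assigning a hypothetical protein to a family through a conserved domain can be sketched as follows. This is a simplified stand-in, not a real domain search: the `FAMILY_MOTIFS` table and the `classify` function are hypothetical, the two short sequence patterns are illustrative simplifications of well-known motifs, and actual domain databases generally rely on profile-based models rather than regular expressions.

```python
import re

# Illustrative motif table (simplified patterns, not a curated database).
FAMILY_MOTIFS = {
    "P-loop NTPase (Walker A motif)": re.compile(r"[AG].{4}GK[ST]"),
    "C2H2 zinc finger": re.compile(r"C.{2,4}C.{12}H.{3,5}H"),
}


def classify(protein_seq):
    """Return the sorted names of families whose motif occurs in the sequence.

    An empty list means no known motif was found, so the protein would
    remain annotated as hypothetical.
    """
    seq = protein_seq.upper()
    return sorted(name for name, pattern in FAMILY_MOTIFS.items()
                  if pattern.search(seq))


if __name__ == "__main__":
    for seq in ("MSTLLGAPGSGKSTLAR", "YKCPECGKSFSQKSNLQKHQRTH", "MKVLAAGIV"):
        print(seq, classify(seq) or "no family assigned")
```

A motif hit of this kind only suggests a family; as the text notes, such predictions carry varying confidence and do not replace experimental characterisation.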
References
- Galperin MY (2001). "Conserved 'hypothetical' proteins: new hints and new puzzles". Comp Funct Genomics. 2 (1): 14–18. doi:10.1002/cfg.66. PMC 2447192. PMID 18628897.
- Srinivasan B; et al. (2015). "Prediction of substrate specificity and preliminary kinetic characterization of the hypothetical protein PVX_123945 from Plasmodium vivax". Exp. Parasitol. 151-152: 56–63. doi:10.1016/j.exppara.2015.01.013. PMID 25655405.
- P S Kewate; R C Urade; D G Gore; M A Soni; A P Kopulwar (2015). "In silico enzyme function prediction in hypothetical proteins of Mycobacterium bovis AF2122/97". Journal of Pharmacy Research. 9 (3): 182–189.
- Eisenstein E; et al. (2000). "Biological function made crystal clear - annotation of hypothetical proteins via structural genomics". Curr Opin Biotechnol. 11 (1): 25–30. PMID 10679350.
- Sunil Pande; Dilip Gore (2015). "Does hypothetical proteins of Yersinia pestis CO92 Capable of Coding Enzymes?". Journal of Pharmacy Research. 9: 278–287.
- Dilip Gore; Ashish Chakule (2012). "Homology modeling and function prediction in uncharacterized proteins of Pseudoxanthomonas spadix". Biocompx. 1: 23–32.
- Zarembinski TI, Hung LW, Mueller-Dieckmann HJ, Kim KK, Yokota H, Kim R, Kim SH (December 1998). "Structure-based assignment of the biochemical function of a hypothetical protein: a test case of structural genomics". Proceedings of the National Academy of Sciences of the United States of America. 95 (26): 15189–93. doi:10.1073/pnas.95.26.15189. PMC 28018. PMID 9860944.
- Nan J, Brostromer E, Liu XY, Kristensen O, Su XD (2009). "Bioinformatics and structural characterization of a hypothetical protein from Streptococcus mutans: implication of antibiotic resistance". PLoS ONE. 4 (10): e7245. doi:10.1371/journal.pone.0007245. PMC 2749211. PMID 19798411.
- Hernández S, Gómez A, Cedano J, Querol E (October 2009). "Bioinformatics annotation of the hypothetical proteins found by omics techniques can help to disclose additional virulence factors". Current Microbiology. 59 (4): 451–6. doi:10.1007/s00284-009-9459-y. PMID 19636617.
- Dilip Gore (2009). "In silico Prediction of Structure and Enzymatic Activity for Hypothetical Proteins of Shigella flexneri". Biofrontiers. 1 (2): 1–10.
- Dilip Gore; Alankar Raut (2009). "Computational Function and Structural Annotations for Hypothetical proteins of Bacillus anthracis". Biofrontiers. 1 (1): 27–36.
- Dogra Pranay; Dilip Gore (2010). "Prediction of Enzymatic Function and Structure of H. influenzae Hypothetical Proteins - An In silico Approach". International Journal of Soft Computing and Bioinformatics. 1 (in press).
- D G Gore; A P Denge; N M Amrute (2010). "Homology Modeling and Enzyme Function Prediction in the Hypothetical Proteins of Helicobacter pylori - an Insilico Approach". Biomirror. 1: 1–5.