TOOLS FOR PREDICTION AND ANALYSIS OF PROTEIN-CODING GENE STRUCTURE
CpG island prediction
Prediction of protein-coding genes in newly sequenced DNA is becoming very important in large genome sequencing projects. This problem is complicated by the exon-intron structure of eukaryotic genes. CpG islands are an important signature of the 5' region of many mammalian genes.
The WWWCPG program locates CpG islands as defined by Gardiner-Garden M. and Frommer M. (CpG islands in vertebrate genomes, J.Mol.Biol., 1987, 196, 261-276): these are regions greater than 200 bp in length which have more than 50% G+C (p(G)+p(C) > 0.5) and a CpG content greater than 0.6 of that expected on the basis of the G+C content of the region (p(CpG) > 0.6*p(C)*p(G)).
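The three criteria above can be illustrated with a short Python sketch. This is not the WWWCPG implementation, only a minimal check of a single candidate region under stated assumptions: the function names cpg_stats and is_cpg_island are hypothetical, p(CpG) is taken as the number of CG dinucleotides divided by the number of dinucleotide positions, and characters other than A, C, G, T are simply counted toward the region length.

```python
def cpg_stats(seq):
    """Return (G+C fraction, observed/expected CpG ratio) for a DNA string.

    Hypothetical helper, not part of WWWCPG.
    """
    seq = seq.upper()
    n = len(seq)
    if n < 2:
        return 0.0, 0.0
    p_c = seq.count("C") / n
    p_g = seq.count("G") / n
    # Assumption: p(CpG) = CG dinucleotides / number of dinucleotide positions.
    p_cpg = seq.count("CG") / (n - 1)
    expected = p_c * p_g  # CpG frequency expected from base composition alone
    ratio = p_cpg / expected if expected > 0 else 0.0
    return p_c + p_g, ratio


def is_cpg_island(seq, min_len=200, min_gc=0.5, min_ratio=0.6):
    """Apply the three criteria to one region: length, G+C content, CpG ratio."""
    gc, ratio = cpg_stats(seq)
    return len(seq) > min_len and gc > min_gc and ratio > min_ratio


if __name__ == "__main__":
    region = "CG" * 150  # toy 300 bp region, for illustration only
    print(cpg_stats(region), is_cpg_island(region))
```

A full scan of a long sequence would apply such a test to successive windows and merge adjacent windows that pass; the window size and step are choices that the definition above does not fix.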
Milanesi L. and Rogozin I.B. Prediction of human gene structure. In: Guide to Human Genome Computing (2nd ed.) (Ed. M.J.Bishop) Academic Press, Cambridge, 1998, 215-259.
Milanesi L., D'Angelo D., Rogozin I.B. GeneBuilder: interactive in silico prediction of gene structure. Bioinformatics, 1999, (in press - BIO98N149).