| 1. |
Doolittle, R. F. (1981) Similar amino acid sequences: chance or common ancestry? Science 214, 149–159.
|
| |
| 2. |
Pearson, W. R., Sierk, M. L. (2005) The limits of protein sequence comparison? Curr Opin Struct Biol 15, 254–260.
|
| |
| 3. |
Ponting, C. P. (2001) Issues in predicting protein function from sequence. Brief Bioinform 2, 19–29.
|
| |
| 4. |
Ponting, C. P., Dickens, N. J. (2001) Genome cartography through domain annotation. Genome Biol 2, Comment 2006.
|
| |
| 5. |
Fitch, W. M. (2000) Homology: a personal view on some of the problems. Trends Genet 16, 227–231.
|
| |
| 6. |
Henikoff, S., Greene, E. A., Pietrokovski, S., et al. (1997) Gene families: the taxonomy of protein paralogs and chimeras.
Science 278, 609–614.
|
| |
| 7. |
Sonnhammer, E. L., Koonin, E. V. (2002) Orthology, paralogy and proposed classification for paralog subtypes. Trends Genet 18, 619–620.
|
| |
| 8. |
Webber, C., Ponting, C. P. (2004) Genes and homology. Curr Biol 14, R332–333.
|
| |
| 9. |
Tatusov, R. L., Galperin, M. Y., Natale, D. A., et al. (2000) The COG database: a tool for genome-scale analysis of protein
functions and evolution. Nucleic Acids Res 28, 33–36.
|
| |
| 10. |
Hurles, M. (2004) Gene duplication: the genomic trade in spare parts. PLoS Biol 2, E206.
|
| |
| 11. |
Tatusov, R. L., Fedorova, N. D., Jackson, J. D., et al. (2003) The COG database: an updated version includes eukaryotes. BMC Bioinformatics 4, 41.
|
| |
| 12. |
Hubbard, T., Andrews, D., Caccamo, M., et al. (2005) Ensembl 2005. Nucleic Acids Res 33, D447–453.
|
| |
| 13. |
Hubbard, T., Barker, D., Birney, E., et al. (2002) The Ensembl genome database project. Nucleic Acids Res 30, 38–41.
|
| |
| 14. |
Hinrichs, A. S., Karolchik, D., Baertsch, R., et al. (2006) The UCSC Genome Browser Database: update 2006. Nucleic Acids Res 34, D590–598.
|
| |
| 15. |
Karolchik, D., Baertsch, R., Diekhans, M., et al. (2003) The UCSC genome browser database. Nucleic Acids Res 31, 51–54.
|
| |
| 16. |
Marchler-Bauer, A., Anderson, J. B., Cherukuri, P. F., et al. (2005) CDD: a Conserved Domain Database for protein classification.
Nucleic Acids Res 33, D192–196.
|
| |
| 17. |
Marchler-Bauer, A., Anderson, J. B., DeWeese-Scott, C., et al. (2003) CDD: a curated Entrez database of conserved domain
alignments. Nucleic Acids Res 31, 383–387.
|
| |
| 18. |
Marchler-Bauer, A., Panchenko, A. R., Shoemaker, B. A., et al. (2002) CDD: a database of conserved domain alignments with
links to domain three-dimensional structure. Nucleic Acids Res 30, 281–283.
|
| |
| 19. |
Apweiler, R., Attwood, T. K., Bairoch, A., et al. (2001) The InterPro database, an integrated documentation resource for protein
families, domains and functional sites. Nucleic Acids Res 29, 37–40.
|
| |
| 20. |
Zdobnov, E. M., Apweiler, R. (2001) InterProScan – an integration platform for the signature-recognition methods in InterPro.
Bioinformatics 17, 847–848.
|
| |
| 21. |
Bateman, A., Birney, E., Durbin, R., et al. (2000) The Pfam protein families database. Nucleic Acids Res 28, 263–266.
|
| |
| 22. |
Bateman, A., Coin, L., Durbin, R., et al. (2004) The Pfam protein families database. Nucleic Acids Res 32, D138–141.
|
| |
| 23. |
Finn, R. D., Mistry, J., Schuster-Bockler, B., et al. (2006) Pfam: clans, web tools and services. Nucleic Acids Res 34, D247–251.
|
| |
| 24. |
Letunic, I., Copley, R. R., Pils, B., et al. (2006) SMART 5: domains in the context of genomes and networks. Nucleic Acids Res 34, D257–260.
|
| |
| 25. |
Letunic, I., Goodstadt, L., Dickens, N. J., et al. (2002) Recent improvements to the SMART domain-based sequence annotation
resource. Nucleic Acids Res 30, 242–244.
|
| |
| 26. |
Schultz, J., Copley, R. R., Doerks, T., et al. (2000) SMART: a web-based tool for the study of genetically mobile domains.
Nucleic Acids Res 28, 231–234.
|
| |
| 27. |
Altschul, S. F., Gish, W., Miller, W., et al. (1990) Basic local alignment search tool. J Mol Biol 215, 403–410.
|
| |
| 28. |
Altschul, S. F., Madden, T. L., Schaffer, A. A., et al. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database
search programs. Nucleic Acids Res 25, 3389–3402.
|
| |
| 29. |
Lopez, R., Silventoinen, V., Robinson, S., et al. (2003) WU-Blast2 server at the European Bioinformatics Institute. Nucleic Acids Res 31, 3795–3798.
|
| |
| 30. |
Pearson, W. R., Lipman, D. J. (1988) Improved tools for biological sequence comparison. Proc Natl Acad Sci U S A 85, 2444–2448.
|
| |
| 31. |
Ponting, C. P., Russell, R. R. (2002) The natural history of protein domains. Annu Rev Biophys Biomol Struct 31, 45–71.
|
| |
| 32. |
Durbin, R., Eddy, S. R., Krogh, A., et al. (1998) Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids. Cambridge University Press, Cambridge, UK.
|
| |
| 33. |
Eddy, S. R. (1998) Profile hidden Markov models. Bioinformatics 14, 755–763.
|
| |
| 34. |
Eddy, S. R. (2004) What is a hidden Markov model? Nat Biotechnol 22, 1315–1316.
|
| |
| 35. |
Gibbs, R. A., Weinstock, G. M., Metzker, M. L., et al. (2004) Genome sequence of the Brown Norway rat yields insights into
mammalian evolution. Nature 428, 493–521.
|
| |
| 36. |
Hillier, L. W., Miller, W., Birney, E., et al. (2004) Sequence and comparative analysis of the chicken genome provide
unique perspectives on vertebrate evolution. Nature 432, 695–716.
|
| |
| 37. |
Lander, E. S., Linton, L. M., Birren, B., et al. (2001) Initial sequencing and analysis of the human genome. Nature 409, 860–921.
|
| |
| 38. |
Waterston, R. H., Lindblad-Toh, K., Birney, E., et al. (2002) Initial sequencing and comparative analysis of the mouse genome.
Nature 420, 520–562.
|
| |
| 39. |
Bateman, A. (1997) The structure of a domain common to archaebacteria and the homocystinuria disease protein. Trends Biochem Sci 22, 12–13.
|
| |
| 40. |
Emes, R. D., Ponting, C. P. (2001) A new sequence motif linking lissencephaly, Treacher Collins and oral-facial-digital type
1 syndromes, microtubule dynamics and cell migration. Hum Mol Genet 10, 2813–2820.
|
| |
| 41. |
Goodstadt, L., Ponting, C. P. (2004) Vitamin K epoxide reductase: homology, active site and catalytic mechanism. Trends Biochem Sci 29, 289–292.
|
| |
| 42. |
Morett, E., Bork, P. (1999) A novel trans-activation domain in parkin. Trends Biochem Sci 24, 229–231.
|
| |
| 43. |
Dayhoff, M. O., Schwartz, R. M., Orcutt, B. C. (1978) A model for evolutionary change, in (Dayhoff, M. O., ed.), Atlas of Protein Sequence and Structure, vol. 5. National Biomedical Research Foundation, Washington, DC.
|
| |
| 44. |
Henikoff, S., Henikoff, J. G. (1992) Amino acid substitution matrices from protein blocks. Proc Natl Acad Sci U S A 89, 10915–10919.
|
| |
| 45. |
Smith, T. F., Waterman, M. S. (1981) Identification of common molecular subsequences. J Mol Biol 147, 195–197.
|
| |
| 46. |
Altschul, S. F., Gish, W. (1996) Local alignment statistics. Methods Enzymol 266, 460–480.
|
| |
| 47. |
Altschul, S. F., Koonin, E. V. (1998) Iterated profile searches with PSI-BLAST—a tool for discovery in protein databases.
Trends Biochem Sci 23, 444–447.
|
| |
| 48. |
Altschul, S. F., Wootton, J. C., Gertz, E. M., et al. (2005) Protein database searches using compositionally adjusted substitution
matrices. FEBS J 272, 5101–5109.
|
| |
| 49. |
Jones, D. T., Swindells, M. B. (2002) Getting the most from PSI-BLAST. Trends Biochem Sci 27, 161–164.
|
| |
| 50. |
Korf, I., Yandell, M., Bedell, J. (2003) BLAST. O'Reilly, Sebastopol, CA.
|
| |
| 51. |
Wootton, J. C., Federhen, S. (1996) Analysis of compositionally biased regions in sequence databases. Methods Enzymol 266, 554–571.
|
| |
| 52. |
Gribskov, M., Luthy, R., Eisenberg, D. (1990) Profile analysis. Methods Enzymol 183, 146–159.
|
| |
| 53. |
Gribskov, M., McLachlan, A. D., Eisenberg, D. (1987) Profile analysis: detection of distantly related proteins. Proc Natl Acad Sci U S A 84, 4355–4358.
|
| |
| 54. |
Henikoff, S. (1996) Scores for sequence searches and alignments. Curr Opin Struct Biol 6, 353–360.
|
| |
| 55. |
Schaffer, A. A., Aravind, L., Madden, T. L., et al. (2001) Improving the accuracy of PSI-BLAST protein database searches with
composition-based statistics and other refinements. Nucleic Acids Res 29, 2994–3005.
|
| |
| 56. |
Sierk, M. L., Pearson, W. R. (2004) Sensitivity and selectivity in protein structure comparison. Protein Sci 13, 773–785.
|
| |
| 57. |
Henikoff, J. G., Pietrokovski, S., McCallum, C. M., et al. (2000) Blocks-based methods for detecting protein homology. Electrophoresis 21, 1700–1706.
|
| |
| 58. |
Henikoff, S., Pietrokovski, S., Henikoff, J. G. (1998) Superior performance in protein homology detection with the Blocks
Database servers. Nucleic Acids Res 26, 309–312.
|
| |
| 59. |
Schaffer, A. A., Wolf, Y. I., Ponting, C. P., et al. (1999) IMPALA: matching a protein sequence against a collection of PSI-BLAST-constructed
position-specific score matrices. Bioinformatics 15, 1000–1011.
|
| |
| 60. |
Pietrokovski, S. (1996) Searching databases of conserved sequence regions by aligning protein multiple-alignments. Nucleic Acids Res 24, 3836–3845.
|
| |
| 61. |
Sadreyev, R., Grishin, N. (2003) COMPASS: a tool for comparison of multiple protein alignments with assessment of statistical
significance. J Mol Biol 326, 317–336.
|
| |
| 62. |
Sadreyev, R. I., Baker, D., Grishin, N. V. (2003) Profile-profile comparisons by COMPASS predict intricate homologies between
protein families. Protein Sci 12, 2262–2272.
|
| |
| 63. |
Sadreyev, R. I., Grishin, N. V. (2004) Quality of alignment comparison by COMPASS improves with inclusion of diverse confident
homologs. Bioinformatics 20, 818–828.
|
| |
| 64. |
Soding, J. (2005) Protein homology detection by HMM-HMM comparison. Bioinformatics 21, 951–960.
|
| |
| 65. |
Soding, J., Biegert, A., Lupas, A. N. (2005) The HHpred interactive server for protein homology detection and structure prediction.
Nucleic Acids Res 33, W244–248.
|
| |
| 66. |
Emes, R. D., Goodstadt, L., Winter, E. E., et al. (2003) Comparison of the genomes of human and mouse lays the foundation
of genome zoology. Hum Mol Genet 12, 701–709.
|
| |
| 67. |
Kent, W. J. (2002) BLAT—the BLAST-like alignment tool. Genome Res 12, 656–664.
|
| |
| 68. |
Wheeler, D. L., Barrett, T., Benson, D. A., et al. (2006) Database resources of the National Center for Biotechnology Information.
Nucleic Acids Res 34, D173–180.
|
| |
| 69. |
Holm, L., Sander, C. (1998) Removing near-neighbour redundancy from large protein sequence collections. Bioinformatics 14, 423–429.
|
| |