Elaboration of the Homologous Plastid-Encoded Protein Families that Separate Paralogs in Magnoliophytes
Lyubetsky V.A., Seliverstov A.V., Zverkov O.A.
Institute for Information Transmission Problems, the Russian Academy of Sciences (Kharkevich Institute)
Abstract. Dividing proteins into families such that paralogous proteins fall into different families helps to clarify protein annotations and makes it possible to search for a family by its phylogenetic profile. A profile is defined by partitioning the set of species into three parts: species in which a protein binding site (or another species feature) is present, species in which it is absent, and species for which this is undetermined. The families can also be used to search for proteins specific to a narrow taxonomic group (“signatures”). We have developed an algorithm that forms such families and applied it to several sets of proteins, among them the set of plastid-encoded proteins of 186 Magnoliophyta species. For this case, the database and the algorithm for family search by phylogenetic profile are freely accessible on the Web at http://lab6.iitp.ru/ppc/magnoliophyta/. The algorithm was also used to cluster mitochondrion-encoded proteins of 66 species of green plants (Viridiplantae); the corresponding database is provided at http://lab6.iitp.ru/mpc/viridiplantae/. It was further applied to both the rhodophyta-like and the chlorophyta-like (algae and bryophytes) plastid branches; the corresponding databases are available at http://lab6.iitp.ru/ppc/redline/ and http://lab6.iitp.ru/ppc/chlorophyta/. Some biological results were obtained on this basis. For example, in grape (Vitis vinifera) we found unique proteins that are at the same time typical of plastids, which suggests horizontal gene transfer from plastids to mitochondria. A formal statement of the protein clustering problem remains to be completed.
Key words: clustering, protein families, plastids, mitochondria.
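The search of a family by its phylogenetic profile, as described in the abstract, can be illustrated with a minimal sketch. It is not the algorithm behind the databases listed above. The matching rule (a family matches if it occurs in every species marked "present" and in no species marked "absent", while undetermined species are ignored), the names `Profile`, `matches` and `search_families`, and the toy family and species labels are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    """A three-part split of the species set (hypothetical representation).

    Species listed in neither set are treated as undetermined.
    """
    present: frozenset  # species in which the feature is present
    absent: frozenset   # species in which the feature is absent


def matches(family_species, profile):
    """Assumed rule: the family occurs in all 'present' species
    and in none of the 'absent' species."""
    species = frozenset(family_species)
    return profile.present <= species and not (profile.absent & species)


def search_families(families, profile):
    """Return the sorted names of all families that match the profile.

    `families` maps a family name to the set of species that encode
    at least one protein of that family.
    """
    return sorted(name for name, sp in families.items() if matches(sp, profile))


# Toy data with placeholder labels, not taken from any real database.
families = {
    "fam_A": {"sp1", "sp2"},
    "fam_B": {"sp1", "sp3"},
    "fam_C": {"sp2"},
}
profile = Profile(present=frozenset({"sp1"}), absent=frozenset({"sp3"}))
print(search_families(families, profile))
```

In this toy case only `fam_A` satisfies the profile: `fam_B` occurs in a species marked absent, and `fam_C` is missing from a species marked present. Species `sp2` is undetermined and does not affect the result. A practical search would probably tolerate a limited number of mismatches; that refinement is omitted here.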