Assembly of scaffolds of P. knowlesi chromosomes for genomic comparison with P. falciparum 3D7
A. E. Berry1, E. Adlam, S. Banda, M. A. Rajandream1, M. Berriman1.
1 Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Hinxton, Cambridgeshire, CB10 1SA, UK.
Read pair information to confirm linkage of ordered contigs for accurate scaffolds of P. knowlesi chromosomes.
Introduction - Status of sequencing projects
SNAP and Projector gene prediction analysis has resulted in a set of 5186 predicted proteins. These will be presented in GeneDB (www.genedb.org/pknowlesi/) (Hertz-Fowler et al.). Manual review and annotation of the SNAP/Projector gene predictions is in progress; 286 have been manually reviewed thus far.
Preliminary annotation of Schizont-Infected Cell Agglutination antigens (SICAvar) in P. knowlesi.
SICAvar antigens have been shown to play an important role in virulence. The SICA agglutination assay demonstrated that recrudescing parasitemic waves were associated with variant phenotypes (Brown and Brown). The proteins responsible for agglutination of infected erythrocytes were later characterised as the SICAvar antigens (Howard et al., 1983). SICAvar antigens are analogous to P. falciparum erythrocyte membrane protein 1 (PfEMP1) (Leech, 1984; Howard et al., 1988).
A first analysis of P. knowlesi contigs has revealed four full-length SICAvar antigens.
Figure 4 ACT view of a BLASTn comparison of four contigs encoding SICAvar antigens
Comparison of the Ghanaian clinical isolate with 3D7 is in its preliminary stages. This analysis provides an exciting opportunity to analyse the genome of a pathogen in relation to the laboratory-adapted 3D7.
P. knowlesi is now at 8X coverage and entering the finishing phase. This has enabled a preliminary comparison with 3D7 and an analysis of 5 SICAvar genes (schizont-infected cell agglutination variant antigens).
Analysing Plasmodium spp. genomes using 3D7 as a reference
Figure 2 Hypothetical contigs of P. knowlesi (light blue and red horizontal boxes) are shown ordered against 3D7 (dark blue horizontal boxes) using tblastx. Blast hits are shown by red blocks. Matched read pairs are denoted by black and orange inward-facing arrows joined by a dotted line. Orange matched read pairs span the boundary of two ordered contigs, providing evidence for their linkage. Unmatched read pairs are denoted by red, green, orange and violet arrows and accumulate at boundaries that are not linked. Read pair evidence can be used to map contigs, in this case suggesting that contigs A and B should be interchanged, which would result in the read pairs becoming matched.
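The read pair check described in the Figure 2 caption can be sketched in Python. This is a minimal illustration, not the pipeline used for the poster: the function names and the toy read pairs are hypothetical, each read pair is reduced to the two contigs its mates fall in, and read orientation and insert size are ignored.

```python
from collections import Counter

def spanning_pair_counts(read_pairs):
    """Count read pairs whose two mates lie in different contigs."""
    counts = Counter()
    for contig_1, contig_2 in read_pairs:
        if contig_1 != contig_2:
            # frozenset makes the count independent of mate order
            counts[frozenset((contig_1, contig_2))] += 1
    return counts

def junction_support(order, read_pairs):
    """For each pair of neighbouring contigs in a proposed order,
    report how many read pairs span that junction."""
    counts = spanning_pair_counts(read_pairs)
    return [
        (left, right, counts[frozenset((left, right))])
        for left, right in zip(order, order[1:])
    ]

# Hypothetical toy data: (contig holding mate 1, contig holding mate 2)
pairs = [
    ("contigA", "contigA"),
    ("contigA", "contigC"),
    ("contigC", "contigA"),
    ("contigB", "contigA"),
]

original = junction_support(["contigA", "contigB", "contigC"], pairs)
swapped = junction_support(["contigB", "contigA", "contigC"], pairs)
```

In this toy case the original order leaves the contigB/contigC junction with no spanning pairs, whereas interchanging contigA and contigB gives every junction some read pair support, which is the kind of evidence the caption describes.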
Figure 5 The ACT comparison shows four SICAvar genes (red boxes). The first and second have 10 and 7 exons respectively, the third is truncated by the end of the contig, and the fourth has 12 exons. Blast hits (high-scoring pairs) between the genes are denoted by red or blue lines. The hits shown have a minimum nucleotide identity of 80 %. A blue line indicates that the hit is inverted. The green region denotes the 2 kb immediately upstream of the start position.
3D7 is an important reference in the analysis of other Plasmodium spp. genomes. Contigs can be arranged into pseudochromosomes by comparing them to 3D7 with TBlastX and ordering them relative to it. This approach assumes that, since the organisms are closely related, regions of conserved gene order between them will be evident. Such regions of conserved synteny are present throughout the comparisons (Figure 1).
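The ordering idea in the paragraph above can be sketched as follows. This is an assumption-laden illustration rather than the method actually used: hits are represented as hypothetical tuples of (contig, reference chromosome, reference start, bit score), each contig is placed by its single best-scoring hit, and contig orientation is ignored.

```python
from collections import defaultdict

def order_contigs(hits):
    """Assign each contig to the reference chromosome of its best hit
    and order contigs by the reference start coordinate of that hit."""
    best = {}
    for contig, ref_chr, ref_start, bitscore in hits:
        # keep only the highest-scoring hit per contig
        if contig not in best or bitscore > best[contig][2]:
            best[contig] = (ref_chr, ref_start, bitscore)

    by_chr = defaultdict(list)
    for contig, (ref_chr, ref_start, _) in best.items():
        by_chr[ref_chr].append((ref_start, contig))

    # sort by reference position to obtain the contig order
    return {chrom: [c for _, c in sorted(placed)]
            for chrom, placed in by_chr.items()}

# Hypothetical toy hits, not real tblastx output
hits = [
    ("contigB", "chr7", 52000, 310.0),
    ("contigA", "chr7", 8000, 240.0),
    ("contigA", "chr3", 1000, 150.0),
    ("contigC", "chr3", 4000, 200.0),
]
ordering = order_contigs(hits)
```

Placing a contig by one best hit is the simplest possible rule; a contig with hits spread across several reference regions would need a more careful treatment.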
Figure 1 An example of regions of conserved synteny between P. falciparum and P. knowlesi.
Figure 6 Ordering places P. knowlesi contig 4778 at the right-hand telomere
An ordering process has been applied to an 8X PHRAP assembly of P. knowlesi comprising 2766 contigs (median 1.7 kb). The contigs were first size-filtered to remove any below 5 kb, leaving a set of 890 (median 5.8 kb). Z % of the remaining contigs were ordered, resulting in 14 metachromosomes (Figure 3).
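The size-filtering step above amounts to a length cutoff followed by a summary of what remains. A minimal sketch, assuming contig lengths in base pairs are held in a dictionary; the function names and toy lengths are hypothetical, and only the 5 kb cutoff comes from the text.

```python
from statistics import median

def size_filter(contig_lengths, min_length=5000):
    """Drop contigs shorter than min_length (in bp)."""
    return {name: length
            for name, length in contig_lengths.items()
            if length >= min_length}

def summarise(contig_lengths):
    """Return the number of contigs and their median length in bp."""
    lengths = list(contig_lengths.values())
    return len(lengths), median(lengths)

# Hypothetical toy lengths, not the real assembly
contigs = {"c1": 1200, "c2": 1700, "c3": 5800, "c4": 9100, "c5": 400}
kept = size_filter(contigs)
before = summarise(contigs)
after = summarise(kept)
```

As in the assembly described above, removing the short contigs reduces the contig count and raises the median length of the set that goes forward to ordering.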
Figure 3 ACT view of pseudochromosome 7 compared with 3D7 chr7
This analysis, although not conclusive, supports the hypothesis that SICAvar genes are located close to the telomeres. The right-hand end of contig 4778 has heptameric repeats resembling the telomeric heptad of 3D7 (1 arrowed). Regions in the 3’UTR are similar to regions in REP20 (2 arrowed). Regions within the introns of the SICAvar genes shown are similar to regions of VAR introns and/or regions flanking exon/intron boundaries.
Future comparison of P. knowlesi and P. falciparum telomeric/subtelomeric regions should shed light on the analogy between SICAvar and VAR genes and on the mechanisms that generate their antigenic diversity and control their expression throughout the life cycle.
Figure 1. P. knowlesi (top): six-frame translation showing SNAP-generated gene models (blue), with contigs depicted alternately in brown and orange. P. falciparum (bottom): as for P. knowlesi. Near-vertical red bars joining the sequences represent tblastx hits above a score threshold of 135 bits. Conservation of gene order, and to a lesser extent exon organisation, is apparent.
Yellow near-vertical bars show a break in conservation of synteny. A putative orthologue of a proposed lysophospholipase is duplicated in P. knowlesi but is in single copy in P. falciparum.
Figure 3 Ordering of contigs to generate Pkn pseudochromosome 7. Blast hits are shown by red lines joining the two sequences. 3D7 genes are shown on the six-frame translation of 3D7 chr7. Note that it is not possible to order contigs onto the subtelomeric regions or onto the internal VAR gene array.