SynBrowse consists of two components: a web-based front end and a set of relational database back ends. Each database stores pre-computed alignments from a focus sequence (query sequence) to reference sequences, in addition to the genome annotations of the focus sequence. The software provides end users with a web interface for choosing sequence comparisons. The end user is first prompted to select a focus species (query species). Once a focus species is selected, the interface displays all the other genomes that have been aligned to the focus species, along with the list of available feature types, or "tracks", of the focus species in a checkbox group. After a reference species is selected, the interface shows the list of tracks for the reference species, the search field, and the selection menus for the available comparative alignment types (also "tracks") and display levels. The user then selects a set of tracks for each species, a comparative alignment type, and a display mode (level). After these preliminaries, the user can search the database for a BAC, a chromosome, or another landmark in order to view patterns of global synteny (Figure 1), detailed syntenic regions, or homologous gene pairs (Figures 2 and 3).
Figure 1. Global synteny between Lotus chromosome 4 and Medicago chromosome 4
Figure 2. Microsynteny with gene to gene comparisons between Lotus chromosome 4 and Medicago chromosome 4.
Figure 3. Microsynteny with exon to exon comparisons between Lotus chromosome 4 and Medicago chromosome 4.
The global synteny, detailed syntenic regions, or homologous gene pairs found are displayed graphically on both the focus panel (white background) and the reference panels (antique white and/or light yellow background). The comparison of two sequences is based on an alignment feature created using any nucleotide- or protein-based alignment method. A conserved region can be a synteny block (a set of genes or DNA alignments that share the same relative ordering on the chromosomes of two species, or on duplicated chromosomes of the same species), a gene, an exon, or a DNA alignment. Both ends of each conserved region between the focus and reference species are connected by colored lines to show the matching regions. These vertical lines extend through the other features, such as predicted genes, EST alignments, repeats, and motifs, that have been attached to the focus or reference sequences. The conserved regions between the focus sequence and the different reference sequences are shown in different colors based on the level of conservation.
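The connector lines described above can be sketched as a simple coordinate mapping. This is a minimal illustration, not SynBrowse's actual drawing code: the `Block` class, the function names, the linear base-to-pixel scaling, and the panel geometry are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    """Hypothetical conserved region with coordinates on both sequences."""
    focus_start: int
    focus_end: int
    ref_start: int
    ref_end: int


def to_pixel(pos, region_start, region_end, panel_width):
    """Linearly map a base position in the displayed region to an x pixel."""
    span = region_end - region_start
    return round((pos - region_start) * panel_width / span)


def connector_lines(block, focus_region, ref_region, panel_width, focus_y, ref_y):
    """Return two segments joining the block's ends on the focus and reference panels.

    Each segment is ((x_focus, y_focus), (x_ref, y_ref)).
    """
    f_start, f_end = focus_region
    r_start, r_end = ref_region
    left = (
        (to_pixel(block.focus_start, f_start, f_end, panel_width), focus_y),
        (to_pixel(block.ref_start, r_start, r_end, panel_width), ref_y),
    )
    right = (
        (to_pixel(block.focus_end, f_start, f_end, panel_width), focus_y),
        (to_pixel(block.ref_end, r_start, r_end, panel_width), ref_y),
    )
    return [left, right]
```

For a block spanning 1000 to 2000 on the focus sequence and 5000 to 7000 on the reference, with both panels showing bases 0 to 10000 at 800 pixels wide, the two segments start at x = 80 and x = 160 on the focus panel and end at x = 400 and x = 560 on the reference panel.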
Because comparisons between related species can result in a very "busy" display, users can manipulate the display by realigning, zooming in, or flipping the region. Clicking on a syntenic block, a gene, or another alignment on the focus panel will zoom in to a page showing a detailed view of the clicked object.
The focus panel contains one or more tracks that display the protein or nucleotide alignments used for the genome alignments. For example, line 4 on the white panel in Figure 1 holds a set of protein alignments (synteny blocks) from Lotus chromosome 4 to Medicago chromosome 4. Clicking one of these blocks will zoom in to a page showing a more detailed microsynteny structure in gene to gene comparisons in that block.
Each reference panel also contains an alignment track that shows the reciprocal alignment from that reference sequence to the focus sequence (e.g., the line Lj4 on each reference panel). Unlike the alignment tracks of the focus panel, clicking a nucleotide alignment rectangle on a reference panel generates a text-based nucleotide alignment on the fly from the FASTA sequences of the focus and reference species stored in the databases. Clicking a protein alignment rectangle pops up a window showing the detailed text alignment of the homologous gene represented by that rectangle.
Features to Be Compared (using the legume syntenic network visualization system as an example)
All features must be formatted as GFF datasets before they are loaded into the databases.
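As an illustration of what a GFF record looks like, the sketch below builds one tab-separated line with the nine standard GFF columns (sequence ID, source, type, start, end, score, strand, phase, attributes). The helper name `gff_line`, the feature type `protein_match`, the GFF3-style `Target` attribute, and the identifiers are assumptions for the example; the exact GFF dialect and attribute conventions that SynBrowse expects are not stated here.

```python
def gff_line(seqid, source, feature_type, start, end,
             score=None, strand=".", phase=".", attributes=""):
    """Format one feature as a nine-column, tab-separated GFF line.

    A missing score is written as "." following the usual GFF convention.
    """
    if start > end:
        raise ValueError("GFF requires start <= end")
    score_text = "." if score is None else str(score)
    columns = [seqid, source, feature_type, str(start), str(end),
               score_text, strand, phase, attributes]
    return "\t".join(columns)


# Hypothetical protein alignment on Lotus chromosome 4 (names are illustrative).
example = gff_line("Lj4", "GeneSeqer", "protein_match", 1000, 2000,
                   score=0.85, strand="+",
                   attributes="Target=Mt_prot1 1 330")
```

Keeping the attribute column as an opaque string leaves room for whichever attribute syntax the target database loader requires.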
The protein alignments were generated by aligning protein sequences of the
reference species against genomic sequences of the query species with the
spliced alignment program GeneSeqer, developed by Volker Brendel et al. We
used a similarity score of 0.3 for protein vs. DNA matches in the GeneSeqer
scoring system (Brendel et al.) as the cutoff to filter out low-quality
alignments, so the browser shows only quality protein alignments. These
alignments are represented graphically by blue rectangles. Dark blue
rectangles represent conserved regions with a similarity score of 1.0, while
white rectangles represent conserved regions with similarity scores below
0.3. Clicking the graphical representation of an alignment on the query
panel brings up a detailed viewer of the alignment comparison in a popup
window, while clicking it on the reference panel shows the text version of
the alignment.
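The score cutoff described above could be applied to GFF-formatted alignments roughly as follows. This is a sketch under stated assumptions: it supposes the similarity score is stored in the sixth (score) column of each GFF line and that the cutoff is inclusive; neither detail is specified in the text, and the function names are hypothetical.

```python
MIN_SIMILARITY = 0.3  # cutoff stated in the text


def passes_cutoff(gff_line, min_score=MIN_SIMILARITY):
    """Return True if the line's score column is at or above the cutoff."""
    fields = gff_line.rstrip("\n").split("\t")
    if len(fields) < 9:
        return False  # not a complete nine-column GFF record
    score = fields[5]
    # "." means no score was recorded, so the alignment cannot be judged.
    return score != "." and float(score) >= min_score


def filter_gff(lines, min_score=MIN_SIMILARITY):
    """Keep only data lines (not '#' comments) that pass the score cutoff."""
    return [ln for ln in lines
            if not ln.startswith("#") and passes_cutoff(ln, min_score)]
```

Filtering before loading keeps low-quality alignments out of the database entirely, so the display layer does not need to re-check scores.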
The DNA alignments were produced by aligning genomic sequences of the query
species against genomic sequences of the reference species with the Blastz
program developed by Schwartz et al. These alignments are represented by
green rectangles. Dark green rectangles represent conserved alignments with
100% identity, while white rectangles represent conserved alignments with
less than 30% identity. Clicking the graphical representation of an
alignment on either the query or the reference panel brings up the text
version of the alignment in a popup window.
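One way to picture the shading is as an interpolation between white and a dark base color. The text gives only the two endpoints (white below 30% identity, dark green at 100%), so the linear ramp between them, the function name, and the specific RGB value used for "dark green" are assumptions made for this sketch.

```python
WHITE = (255, 255, 255)
DARK_GREEN = (0, 100, 0)  # assumed RGB value for "dark green"


def identity_to_rgb(percent_identity, low=30.0, high=100.0, dark=DARK_GREEN):
    """Map percent identity to an RGB shade between white and a dark color.

    Values at or below `low` are white; values at or above `high` are `dark`.
    The linear ramp in between is an assumption, not documented behavior.
    """
    t = (percent_identity - low) / (high - low)
    t = max(0.0, min(1.0, t))  # clamp to the [0, 1] range
    return tuple(round(w + (d - w) * t) for w, d in zip(WHITE, dark))
```

The same function could shade protein alignments by passing similarity scores with `low=0.3`, `high=1.0`, and a dark blue base color.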
We collected the gene models of a genome when they were publicly
available. When no publicly available gene annotations existed,
we used Fgenesh to predict genes.
The EST alignments were generated using EST sequences against
genomic sequences of a species via spliced alignment program
-----------to be written later---------