Alignment

ikb6 edited this page May 15, 2025 · 9 revisions

image

How Alignment Works

Alignment is the process of matching each base (or gap) of one sequence (e.g. one collected from laboratory samples) to a base (or gap) of another, known sequence (called a reference). For example, 95% of HIV cases in the US are HIV-1 subtype B, for which there is a canonical reference sequence called HXB2.
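To make the idea of matching bases and gaps concrete, here is a minimal sketch of a global pairwise alignment (the textbook Needleman-Wunsch method). It is not MicrobeTrace's aligner; the function name `align_to_reference` and the scoring values are assumptions chosen purely for illustration.

```python
def align_to_reference(query, reference, match=1, mismatch=-1, gap=-2):
    """Globally align `query` to `reference`; return both strings with '-' gaps.

    Illustrative only: the scoring values are arbitrary assumptions.
    """
    n, m = len(query), len(reference)

    def sub(i, j):
        # Score for pairing query[i-1] with reference[j-1].
        return match if query[i - 1] == reference[j - 1] else mismatch

    # score[i][j] = best score for aligning query[:i] with reference[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + sub(i, j),  # base paired with base
                score[i - 1][j] + gap,            # gap in the reference
                score[i][j - 1] + gap,            # gap in the query
            )

    # Trace back from the bottom-right corner to recover the alignment.
    aligned_q, aligned_r = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub(i, j):
            aligned_q.append(query[i - 1])
            aligned_r.append(reference[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            aligned_q.append(query[i - 1])
            aligned_r.append("-")
            i -= 1
        else:
            aligned_q.append("-")
            aligned_r.append(reference[j - 1])
            j -= 1
    return "".join(reversed(aligned_q)), "".join(reversed(aligned_r))


# A query missing one base relative to the reference receives a gap there.
aligned_query, aligned_reference = align_to_reference("ACT", "ACGT")
print(aligned_query)      # AC-T
print(aligned_reference)  # ACGT
```

Each position of the aligned query now corresponds to exactly one position of the reference, which is what allows sequences to be compared column by column.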

MicrobeTrace is not intended to be an aligner

Although MicrobeTrace can perform rudimentary alignments on sequences, alignment has never been its principal application. The best results are therefore likeliest when you use sequences that a capable bioinformatician has already aligned in another program. MicrobeTrace does, however, provide a highly customizable alignment view in which you can inspect your sequences and check and navigate the alignment using the logo and mini map features, along with a number of view options in the settings menu.

The first thing to be aware of is that the included aligner uses HXB2 as its reference by default. If you are using MicrobeTrace to analyze sequences from other pathogens or organisms, YOU MUST PROVIDE YOUR OWN REFERENCE SEQUENCE. We cannot emphasize this strongly enough. Aligning non-HIV sequences to HXB2 will produce nonsense, and MicrobeTrace cannot tell the difference between HIV and non-HIV sequences to warn you. See the image below for the reference selection options and preview.

image

Interpreting sequence diagrams

A sequence diagram is a simple visualization designed to show you, at a glance, whether your alignment is satisfactory. It is a table of colored cells in which each row represents a sequence and each column represents a nucleotide position in the alignment. So, for example:

image

Here we can see that columns of reasonably consistent colors represent nucleotides that are well aligned relative to each other. This view is customizable using the settings button. The mini map lets you click on a section of the alignment, and the view automatically scrolls to that portion. The logo above the alignment shows the composition of nucleotides at any given position across all sequences. Hover over a position to see the percentage breakdown. You can toggle between Logo view and Bar Plot using the settings button. The ruler interval can also be changed.
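The per-position composition that the logo summarizes can be sketched in a few lines of Python. This only illustrates the calculation and is not MicrobeTrace's own code; the function name `column_composition` and the sample sequences are hypothetical.

```python
from collections import Counter


def column_composition(aligned_seqs, position):
    """Return {symbol: percent} for one 0-based column of an alignment.

    Assumes every sequence has the same (aligned) length.
    """
    column = [seq[position].upper() for seq in aligned_seqs]
    counts = Counter(column)
    total = len(column)
    return {symbol: 100.0 * n / total for symbol, n in counts.items()}


# Hypothetical aligned sequences, one per row of the diagram.
sequences = ["ACGT", "ACGA", "ACCT", "A-GT"]
print(column_composition(sequences, 0))  # {'A': 100.0}
print(column_composition(sequences, 3))  # {'T': 75.0, 'A': 25.0}
```

A column dominated by a single symbol corresponds to a consistently colored column in the diagram, while a column with an even split across many symbols is one sign that the alignment deserves a closer look.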

image

Example of poorly aligned sequences

A horrible alignment

A poorly aligned set of sequences like this will produce a spurious network.

Customize the view using the settings button

The settings menu lets you change the size, colors, and width of the displayed nucleotides or amino acids, and choose whether to display all bases or to represent conserved positions as dots. Explore each section shown below.
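The dot display can be illustrated with a short sketch. How MicrobeTrace decides that a position is conserved is not stated here, so this example assumes a common convention: a base is replaced by a dot when it matches the reference at the same position. The function name `mask_conserved` is hypothetical.

```python
def mask_conserved(aligned_seqs, reference):
    """Replace each base that matches the reference with '.'.

    Assumption for illustration: "conserved" means "identical to the
    reference at that position"; the actual rule may differ.
    """
    return [
        "".join("." if base == ref_base else base
                for base, ref_base in zip(seq, reference))
        for seq in aligned_seqs
    ]


reference = "ACGT"
for row in mask_conserved(["ACGT", "ACCT", "A-GA"], reference):
    print(row)
# ....
# ..C.
# .-.A
```

With matching bases hidden, only the differences remain visible, which makes variable positions easier to spot in a long alignment.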

image