Bowtie2AlignOptions

Options to map reads to reference sequence

Description

A Bowtie2AlignOptions object contains options to run the bowtie2 function, which aligns reads to a reference sequence.

Creation

Syntax

alignOptions = Bowtie2AlignOptions

alignOptions = Bowtie2AlignOptions(Name,Value)

alignOptions = Bowtie2AlignOptions(S)

Description

alignOptions = Bowtie2AlignOptions creates a Bowtie2AlignOptions object with default property values.

Bowtie2AlignOptions requires the Bowtie 2 Support Package for Bioinformatics Toolbox™. If this support package is not installed, then the function provides a download link. For details, see Bioinformatics Toolbox Software Support Packages.

alignOptions = Bowtie2AlignOptions(Name,Value) sets properties using one or more name-value pair arguments. Enclose each property name in quotes. For example, alignOptions = Bowtie2AlignOptions('Trim5',10) specifies to trim 10 residues from the 5' end.
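
As a minimal sketch of the name-value syntax, the following creates an options object that overrides two defaults and then reads the values back. It assumes the Bowtie 2 support package is installed; the specific values are illustrative only.

```matlab
% Create an options object, overriding two defaults (illustrative values).
alignOptions = Bowtie2AlignOptions('Trim5',10,'Mode','Local');

% Read the properties back to confirm the settings.
disp(alignOptions.Trim5)
disp(alignOptions.Mode)
```

Any property not named in the call keeps its default value.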

alignOptions = Bowtie2AlignOptions(S) specifies optional parameters in a character vector S.

Input Arguments

`S` — Alignment parameters
character vector

Alignment parameters, specified as a character vector. S must be in the Bowtie 2 option syntax (prefixed by one or two dashes) [1].
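
For illustration, the sketch below passes options in the native Bowtie 2 syntax. The flags --trim3 and --local are taken from the Bowtie 2 manual rather than from this page, and the example assumes the support package is installed.

```matlab
% Options written in native Bowtie 2 syntax (flags per the Bowtie 2 manual).
S = '--trim3 4 --local';

% Create the options object from the option string.
alignOptions = Bowtie2AlignOptions(S);
```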

Properties

`AlignForwardStrand` — Flag to allow unpaired reads to be aligned to forward strand
`true` or 1 (default) | `false` or 0

Since R2023b

Flag to allow unpaired reads to be aligned to the forward (Watson) reference strand, specified as a numeric or logical 1 (true) or 0 (false). Set this option to false to prevent bowtie2 from aligning reads to the forward reference strand.

Data Types: double | logical

`AlignReverseComplementStrand` — Flag to allow unpaired reads to be aligned to reverse strand
`true` or 1 (default) | `false` or 0

Since R2023b

Flag to allow unpaired reads to be aligned to the reverse (Crick) reference strand, specified as a numeric or logical 1 (true) or 0 (false). Set this option to false to prevent bowtie2 from aligning reads to the reverse reference strand.

Data Types: double | logical

`AlignedPairedReadSupplementFile` — Base name of files where aligned paired reads are saved
empty string (default) | character vector | string scalar

Since R2023b

Base name of files where aligned paired reads are saved, specified as a character vector or string scalar. Paired reads that align at least one time are saved to the files. bowtie2 creates two files, one for each mate of the pair. The files have the same format as the input data.

The function appends ".1" or ".2" to the base file name to identify each mate file. If the base file name includes the % symbol, bowtie2 inserts 1 or 2 at the % position instead of appending ".1" or ".2". Use ReadSupplementFileCompression to compress these supplement files.

By default, bowtie2 does not create these supplement files.

Data Types: char | string

`AlignedUnpairedReadSupplementFile` — Name of file where aligned unpaired reads are saved
empty string (default) | character vector | string scalar

Since R2023b

Name of a file where aligned unpaired reads are saved, specified as a character vector or string scalar. Unpaired reads that align at least one time are saved to the file. The file has the same format as the input data. Use ReadSupplementFileCompression to compress these supplement files.

By default, bowtie2 does not create the file.

Data Types: char | string

`AllowDovetail` — Flag to allow dovetail configurations of input reads
`false` or 0 (default) | `true` or 1

Flag to allow dovetail configurations of input reads, specified as a numeric or logical 1 (true) or 0 (false). This property specifies whether the alignment of one mate can extend past the beginning of the alignment of the other mate and be considered concordant.

This property applies to paired-end reads only.

Data Types: double | logical

`AmbiguousPenalty` — Penalty for positions with ambiguous characters
`1` (default) | nonnegative integer

Penalty for positions with ambiguous characters on the read sequence, reference sequence, or both, specified as a nonnegative integer.

Data Types: double

`AppendCommentToSAM` — Flag to append FASTQ or FASTA comments
`false` or 0 (default) | `true` or 1

Since R2023b

Flag to append FASTQ or FASTA comments to the output SAM file, specified as a numeric or logical 1 (true) or 0 (false). A comment is any text after the first space in the read name.

Data Types: double | logical

`BAMAlignPairs` — Flag to align paired-end BAM reads
`false` or 0 (default) | `true` or 1

Since R2023b

Flag to align the paired-end BAM reads, specified as a numeric or logical 1 (true) or 0 (false). This flag is functional only if you also set ReadFormat="BAM".

By default, bowtie2 attempts to align unpaired BAM reads only. Set the value to true to align paired-end reads instead.

Data Types: double | logical

`BAMPreserveTags` — Flag to preserve tags from input BAM file
`false` or 0 (default) | `true` or 1

Since R2023b

Flag to preserve tags from the input BAM file by appending them to the SAM output, specified as a numeric or logical 1 (true) or 0 (false). Set the value to true to add the tags to the end of the corresponding SAM output file.

Data Types: double | logical

`Encoding` — Encoding format of base quality
`'Phred33'` (default) | `'Phred64'` | `'Solexa'`

Encoding format of the base quality in the input files, specified as one of the following: 'Phred33', 'Phred64', or 'Solexa'.

Data Types: char | string

`ExcludeContain` — Flag to exclude alignments where one mate contains other mate
`false` or 0 (default) | `true` or 1

Flag to treat a pair as not concordant when the alignment of one mate contains the alignment of the other mate, specified as a numeric or logical 1 (true) or 0 (false). By default (false), one mate alignment can contain the other and the pair is still considered concordant.

This property applies to paired-end reads only.

Data Types: double | logical

`ExcludeDiscordant` — Flag to exclude discordant alignments
`false` or 0 (default) | `true` or 1

Flag to exclude discordant alignments, specified as a numeric or logical 1 (true) or 0 (false). A discordant alignment is an alignment where both mates align uniquely, but not in a way that satisfies the paired-end constraints.

Data Types: double | logical

`ExcludeMixed` — Flag to exclude mixed alignments
`false` or 0 (default) | `true` or 1

Flag to exclude mixed alignments, specified as a numeric or logical 1 (true) or 0 (false). A mixed alignment consists of mate reads that are not concordant or discordant, but align individually.

This property applies to paired-end reads only.

Data Types: double | logical

`ExcludeOverlap` — Flag to exclude overlapping mate alignments
`false` or 0 (default) | `true` or 1

Flag to treat a pair as not concordant when the alignment of one mate overlaps the alignment of the other mate, specified as a numeric or logical 1 (true) or 0 (false). By default (false), mate alignments can overlap and the pair is still considered concordant.

Data Types: double | logical

`ExcludeSAMHeaders` — Flag to exclude SAM headers
`false` or 0 (default) | `true` or 1

Since R2023b

Flag to exclude SAM headers, specified as a numeric or logical 1 (true) or 0 (false). A SAM header starts with the @ symbol.

Data Types: double | logical

`ExcludeSQSAMHeaders` — Flag to exclude SAM SQ headers
`false` or 0 (default) | `true` or 1

Since R2023b

Flag to exclude SAM reference sequence header lines in the output SAM file, specified as a numeric or logical 1 (true) or 0 (false). A reference sequence header line starts with @SQ.

Data Types: double | logical

`ExcludeUnaligned` — Flag to exclude reads that failed to align
`false` or 0 (default) | `true` or 1

Flag to exclude reads that failed to align, specified as a numeric or logical 1 (true) or 0 (false).

Data Types: double | logical

`ExtraBowtie2Command` — Additional options not included in object properties
`''` (default) | character vector

Additional options not included in the object properties, specified as a character vector. The character vector must be in the Bowtie 2 option syntax (prefixed by one or two dashes). The default value is an empty character vector ''.

Example: 'ExtraBowtie2Command','--version'

Data Types: char | string

`FASTAKMerParameters` — K-mer length and step size
`[]` (default) | two-element vector

Since R2023b

K-mer length and step size to use when you set ReadFormat="FASTAKMer", specified as a two-element vector of positive integers.

Data Types: double
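
A minimal sketch of how ReadFormat and FASTAKMerParameters fit together is shown below. It assumes the support package is installed and that the vector is ordered as [k-mer length, step size], following the order in the description above; the values 20 and 5 are illustrative.

```matlab
alignOptions = Bowtie2AlignOptions;

% Align k-mers extracted from FASTA input files.
alignOptions.ReadFormat = "FASTAKMer";

% Assumed order: [k-mer length, step size]; the values are illustrative.
alignOptions.FASTAKMerParameters = [20 5];
```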

`FilterQSEQ` — Flag to filter reads with nonzero QSEQ field
`false` or 0 (default) | `true` or 1

Since R2023b

Flag to filter reads with nonzero QSEQ filter field, specified as a numeric or logical 1 (true) or 0 (false). This flag is functional only if you also set ReadFormat="QSeq".

Data Types: double | logical

`IgnoreQuality` — Flag to ignore read position quality
`false` or 0 (default) | `true` or 1

Flag to ignore the actual read position quality when a mismatch occurs, specified as a numeric or logical 1 (true) or 0 (false). Setting this property to true allows the quality value at that mismatched position to be the highest possible, regardless of the actual value.

Data Types: double | logical

`IgnoreSoftClippedBasesForTLEN` — Flag to consider soft-clipped bases as unmapped when calculating TLEN
`false` or 0 (default) | `true` or 1

Since R2023b

Flag to consider soft-clipped bases as unmapped when calculating TLEN in the output SAM file, specified as a numeric or logical 1 (true) or 0 (false). This flag is functional only if you also set Mode="Local". TLEN stands for signed observed template length.

Data Types: double | logical

`IntegerQualityEncoding` — Flag to specify quality values as integers
`false` or 0 (default) | `true` or 1

Since R2023b

Flag to specify quality values in the input reads as space-separated integers rather than ASCII characters, specified as a numeric or logical 1 (true) or 0 (false).

Data Types: double | logical

`MatchBonus` — Reward added to alignment score
`2` (default) | nonnegative integer

Reward added to the alignment score when a position in the read matches a position in the reference, specified as a nonnegative integer.

Data Types: double

`MateOrientation` — Orientation of mate pairs
`"ForwardReverse"` (default) | `"ReverseForward"` | `"ForwardForward"`

Since R2023b

Orientation of mate pairs for paired-end alignment, specified as one of the following:

"ForwardReverse" — Aligned pairs are derived from a forward-oriented mate upstream of a reverse-oriented complement mate.
"ReverseForward" — Aligned pairs are derived from a reverse-oriented complement mate upstream of a forward-oriented mate.
"ForwardForward" — Aligned pairs are derived from a forward-oriented mate upstream of a forward-oriented mate.

Data Types: char | string

`MaxAmbiguousFunction` — Function governing maximum number of ambiguous characters
`'L,0,0.15'` (default) | character vector | string scalar

Function governing the maximum number of ambiguous characters allowed in a read, specified as a character vector or string scalar.

The function has the format 'f,B,A', where f is a function type, B is a constant term, and A is a coefficient. Available function types are:

'C' – Constant
'L' – Linear
'S' – Square root
'G' – Natural log

The resulting function is H(x) = B + A * f(x), where x is the read length.

The default function is 'L,0,0.15', that is, H(x) = 0 + 0.15 * x.

Example: 'MaxAmbiguousFunction','L,-0.4,-0.6'

Data Types: char | string
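
To make the 'f,B,A' format concrete, the following base-MATLAB sketch parses a function string and evaluates H(x) for one read length. It is not part of the toolbox. Treating the 'C' type as ignoring x (so that H(x) = B) is an assumption based on the Bowtie 2 manual, and the read length of 100 is illustrative.

```matlab
% Evaluate a function string of the form 'f,B,A' at read length x.
spec = 'L,0,0.15';   % default MaxAmbiguousFunction
x = 100;             % read length (illustrative)

parts = strsplit(spec, ',');
B = str2double(parts{2});
A = 0;
if numel(parts) >= 3
    A = str2double(parts{3});
end

switch parts{1}
    case 'C'
        fx = 0;          % assumption: a constant function ignores x, so H = B
    case 'L'
        fx = x;          % linear
    case 'S'
        fx = sqrt(x);    % square root
    case 'G'
        fx = log(x);     % natural log
    otherwise
        error('Unknown function type: %s', parts{1});
end

H = B + A*fx;
fprintf('H(%d) = %g\n', x, H);
```

With the default 'L,0,0.15', a 100-base read gives H = 15.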

`MaxFragmentLength` — Maximum fragment length for paired-end alignment
500 (default) | positive integer

Since R2023b

Maximum fragment length for the paired-end alignment, specified as a positive integer.

The larger the difference between MaxFragmentLength and MinFragmentLength is, the slower bowtie2 runs.

This option does not take trimming into account. That is, if you specify trimming options, such as Trim3 or Trim5, MaxFragmentLength is applied to the untrimmed mates.

Data Types: double

`MemoryMappedIndex` — Flag to use memory mapping when loading index
`false` or 0 (default) | `true` or 1

Flag to use memory mapping (instead of file I/O) when loading the index, specified as a numeric or logical 1 (true) or 0 (false). Memory mapping allows many concurrent processes to share the memory image of the index, resulting in a more efficient parallelization of the task.

Data Types: double | logical

`MetricsFile` — Name of metrics file
empty string (default) | character vector | string scalar

Since R2023b

Name of the metrics file, specified as a character vector or string scalar. This file contains performance metrics for the alignment generated by bowtie2. By default, bowtie2 does not generate a metrics file.

Data Types: char | string

`MetricsFileWriteFrequency` — Time interval for writing to metrics file
1 (default) | positive integer

Since R2023b

Time interval in seconds for writing to the metrics file, specified as a positive integer. This option is functional only if you also specify MetricsFile. If so, by default, bowtie2 writes a new metrics record every second.

Data Types: double

`MinFragmentLength` — Minimum fragment length for paired-end alignment
0 (default) | nonnegative integer

Since R2023b

Minimum fragment length for the paired-end alignment, specified as a nonnegative integer.

The larger the difference between MaxFragmentLength and MinFragmentLength is, the slower bowtie2 runs.

This option does not take trimming into account. That is, if you specify trimming options, such as Trim3 or Trim5, MinFragmentLength is applied to the untrimmed mates.

Data Types: double
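
As a sketch, the paired-end constraints can be set together on one options object. This assumes the support package is installed; the fragment-length bounds of 100 and 600 are illustrative, not recommended values.

```matlab
alignOptions = Bowtie2AlignOptions;

% Accept fragments between 100 and 600 bases long (illustrative bounds).
alignOptions.MinFragmentLength = 100;
alignOptions.MaxFragmentLength = 600;

% Expect a forward-oriented mate upstream of its reverse-complement mate.
alignOptions.MateOrientation = "ForwardReverse";
```

Keeping the range between the two bounds narrow limits the slowdown noted above.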

`MinScoreFunction` — Function governing minimum score threshold of alignment
character vector | string scalar

Function governing the minimum score threshold of an alignment, specified as a character vector or string scalar.

The function has the format 'f,B,A', where f is a function type, B is a constant term, and A is a coefficient. Available function types are:

'C' – Constant
'L' – Linear
'S' – Square root
'G' – Natural log

The resulting function is H(x) = B + A * f(x), where x is the read length.

For the 'EndToEnd' alignment mode, the default function is 'L,-0.6,-0.6'. For the 'Local' mode, the default function is 'G,20,8'.

Example: 'MinScoreFunction','L,-0.4,-0.6'

Data Types: char | string
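
The two mode-specific defaults produce thresholds on very different scales. The base-MATLAB sketch below evaluates both for a single read length; the 50-base read is an illustrative assumption.

```matlab
x = 50;   % read length (illustrative)

% 'EndToEnd' default 'L,-0.6,-0.6': H(x) = -0.6 + (-0.6)*x
minScoreEndToEnd = -0.6 + (-0.6)*x;

% 'Local' default 'G,20,8': H(x) = 20 + 8*ln(x)
minScoreLocal = 20 + 8*log(x);

fprintf('EndToEnd threshold: %.1f\n', minScoreEndToEnd);
fprintf('Local threshold:    %.2f\n', minScoreLocal);
```

For this read length, the thresholds are about -30.6 in 'EndToEnd' mode and about 51.30 in 'Local' mode.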

`MismatchPenalty` — Maximum and minimum values to compute mismatch penalty
`[6 2]` (default) | two-element vector

Maximum and minimum values to compute the mismatch penalty during alignment, specified as a two-element vector. The first element is the maximum value and the second element is the minimum value.

A number less than or equal to the maximum value, and greater than or equal to the minimum value is subtracted from the alignment score for each position where a read character aligns to a reference character, the characters do not match, and neither is an N character.

Example: 'MismatchPenalty',[5 3]

Data Types: double
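
This page does not state how the subtracted value is chosen between the two bounds. The Bowtie 2 manual gives the rule MN + floor((MX - MN) * min(Q,40)/40), where Q is the Phred quality at the mismatched position. The base-MATLAB sketch below applies that rule, on the assumption that the toolbox passes these values to Bowtie 2 unchanged.

```matlab
mismatchPenalty = [6 2];     % [MX MN], the default
Q = [0 10 20 30 40];         % Phred qualities at mismatched positions

MX = mismatchPenalty(1);
MN = mismatchPenalty(2);

% Rule from the Bowtie 2 manual: higher quality means a larger penalty.
penalty = MN + floor((MX - MN) .* (min(Q, 40) ./ 40));
disp(penalty)
```

For these qualities the penalties are 2, 3, 4, 5, and 6. According to the same manual, when quality values are ignored (see IgnoreQuality), the maximum value MX is always subtracted.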

`Mode` — Alignment mode
`'EndToEnd'` (default) | `'Local'`

Alignment mode, specified as 'EndToEnd' or 'Local'.

In the 'Local' mode, only part of the read must align to the reference, and some residues can be omitted (soft-clipped) to achieve the best alignment score. In the 'EndToEnd' mode, the entire read must align without any soft-clipping.

Data Types: char | string

`Nondeterministic` — Flag to reinitialize pseudo-random generator
`false` or 0 (default) | `true` or 1

Flag to reinitialize the pseudo-random generator for each read using the current time, specified as a numeric or logical 1 (true) or 0 (false). If true, the alignments reported for two identical reads can be different. The default value is false, that is, the pseudo-random generator is reinitialized using a seed derived from read information and the seed number.

Data Types: double | logical

`NoGapPositions` — Number of positions where gaps are not allowed
`4` (default) | nonnegative integer

Number of positions at the beginning or end of each read where gaps are not allowed, specified as a nonnegative integer.

Data Types: double

`NumAlignments` — Maximum number of valid alignments to report
`'Best'` (default) | `'All'` | positive integer

Maximum number of valid alignments to report before terminating the search, specified as a positive integer, 'Best', or 'All'. If you specify a positive integer N, the function searches for up to N distinct, valid alignments for each read. 'Best' reports the best alignment for each read. 'All' reports all the valid alignments for each read sorted by alignment scores.

The alignment score for a paired-end alignment equals the sum of the alignment scores of individual mates.

Data Types: double | char | string

`NumReseedings` — Maximum number of reseeding attempts
`2` (default) | nonnegative integer

Maximum number of reseeding attempts for reads with repetitive seeds, specified as a nonnegative integer. During reseeding, the function chooses a new set of seeds at different offsets to find more alignments.

Data Types: double

`NumSeedExtensions` — Maximum number of consecutive seed extension attempts
`15` (default) | nonnegative integer

Maximum number of consecutive seed extension attempts before getting a new seed, specified as a nonnegative integer. A seed extension fails if it does not yield an alignment with the best (or second-best) score.

Data Types: double

`NumSeedMismatches` — Number of allowed mismatches in seed alignment
`0` (default) | `1`

Number of allowed mismatches in a seed alignment during the multiseed alignment, specified as 0 or 1.

Data Types: double

`NumThreads` — Number of parallel threads to perform alignment
`1` (default) | positive integer

Number of parallel threads to perform the alignment, specified as a positive integer. Threads run on separate processors or cores. Increasing the number of threads provides a significant increase in speed (close to linear) but also increases the memory footprint.

Data Types: double

`Offrate` — Offrate to use when reading index
`NaN` (default) | positive integer

Offrate to use when reading the index to reduce the memory footprint, specified as a positive integer. The offrate must be greater than the offrate used to build the index.

Data Types: double

`OmitSecondarySequence` — Flag to omit SEQ and QUAL fields
`false` or 0 (default) | `true` or 1

Since R2023b

Flag to omit SEQ and QUAL fields, specified as a numeric or logical 1 (true) or 0 (false). When this option is true, bowtie2 prints an asterisk "*" for these fields in the output SAM file.

Data Types: double | logical

`PadPositions` — Number of padding positions
`15` (default) | nonnegative integer

Number of padding positions to include on each side of the dynamic programming table to allow gaps, specified as a nonnegative integer.

Data Types: double

`ReadFormat` — File format of input reads
`""` (default) | `"Interleaved"` | `"BAM"` | `"FASTQ"` | `"FASTAKMer"` | ...

Since R2023b

File format for the input reads, specified as one of the following strings.

"" — Uses the extensions of the input files to determine the file format. All the input files must have the same file extension.
"FASTQ" — FASTQ file format.
"FASTA" — FASTA file format.
"FASTAKMer" — FASTA file format, where k-mers extracted from the input files are aligned. You must also specify FASTAKMerParameters, which defines the k-mer length and step size.
"Interleaved" — Interleaved FASTQ files, where the first two records represent a mate pair.
"BAM" — Sorted and unaligned BAM files.
"RawSequences" — Input files contain a single sequence per line.
"QSeq" — QSEQ file format.
"Tab5" — TAB5 file format, where each read or pair is on a single line. An unpaired read line is [name]\t[seq]\t[qual]\n. A paired-end read line is [name]\t[seq1]\t[qual1]\t[seq2]\t[qual2]\n. An input file can contain a mix of unpaired and paired-end reads, and the function can distinguish and handle both read types.
"Tab6" — TAB6 file format, where an unpaired read line is [name]\t[seq]\t[qual]\n and a paired read line is [name1]\t[seq1]\t[qual1]\t[name2]\t[seq2]\t[qual2]\n.

Data Types: char | string

`ReadGapCosts` — Gap costs for opening and extending gap
`[5 3]` (default) | two-element vector of nonnegative integers

Gap costs for opening and extending a gap on the read, specified as a two-element vector of nonnegative integers. The first element is the cost of opening a gap, and the second element is the cost of extending a gap. Given the cost vector [GO GE], a read gap of length N is assigned a penalty of GO + N * GE.

Example: 'ReadGapCosts',[4 2]

Data Types: double
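
The gap penalty formula can be checked with a few lines of base MATLAB. The sketch below uses the default cost vector; the gap lengths are illustrative.

```matlab
readGapCosts = [5 3];        % [GO GE], the default
N = 1:4;                     % gap lengths (illustrative)

GO = readGapCosts(1);        % cost of opening a gap
GE = readGapCosts(2);        % cost of extending a gap

% Penalty for a gap of length N is GO + N*GE.
penalty = GO + N .* GE;
disp(penalty)
```

With the defaults, gaps of length 1 through 4 cost 8, 11, 14, and 17.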

`ReadGroup` — Read group information to add as field on `@RG` header line
`''` (default) | character vector | string scalar

Read group information to add as a field on the @RG header line in the output SAM report, specified as a character vector or string scalar. This property applies only if you specify ReadGroupID.

Data Types: char | string

`ReadGroupID` — Read group ID to add on `@RG` header line
`''` (default) | character vector | string scalar

Read group ID to add on the @RG header line in the output SAM report, specified as a character vector or string scalar. If you specify any read group ID, the function prints the @RG header line with the tag ID: followed by the specified group ID.

Data Types: char | string

`ReadSupplementFileCompression` — Compression type to use for supplement files
`"None"` (default) | `"gz"` | ...

Since R2023b

Compression type to use for the supplement files, specified as "None", "gz", "bz2", or "lz4". Use the following options to specify supplement files: AlignedPairedReadSupplementFile, AlignedUnpairedReadSupplementFile, UnalignedPairedReadSupplementFile, UnalignedUnpairedReadSupplementFile.

Data Types: char | string

`RefGapCosts` — Gap costs for opening and extending gap
`[5 3]` (default) | two-element vector of nonnegative integers

Gap costs for opening and extending a gap on the reference, specified as a two-element vector of nonnegative integers. The first element is the cost of opening a gap, and the second element is the cost of extending a gap. Given the cost vector [GO GE], a reference gap of length N is assigned a penalty of GO + N * GE.

Example: 'RefGapCosts',[4 2]

Data Types: double

`Reorder` — Flag to reorder SAM records
`false` or 0 (default) | `true` or 1

Flag to reorder SAM records to maintain the same order as in the input files, specified as a numeric or logical 1 (true) or 0 (false). This property applies only when the number of parallel threads is greater than one. When you use one thread, the order of the records in the output is the same as the order of the input.

Data Types: double | logical
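
Because Reorder matters only with more than one thread, the two properties are typically set together. The following is a minimal sketch, assuming the support package is installed and that four cores are available; the thread count is illustrative.

```matlab
alignOptions = Bowtie2AlignOptions;

% Run the alignment on four parallel threads (illustrative count).
alignOptions.NumThreads = 4;

% Keep SAM records in the same order as the reads in the input files.
alignOptions.Reorder = true;
```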

`Seed` — Number to set seed in pseudo-random number generator
`0` (default) | nonnegative integer

Number to set the seed in the pseudo-random number generator, specified as a nonnegative integer.

Example: 'Seed',3

Data Types: double

`SeedIntervalFunction` — Function governing distance between seed substrings
character vector | string scalar

Function governing the distance between seed substrings during the multiseed alignment, specified as a character vector or string scalar.

The function has the format 'f,B,A', where f is a function type, B is a constant term, and A is a coefficient. Available function types are:

'C' – Constant
'L' – Linear
'S' – Square root
'G' – Natural log

The resulting function is H(x) = B + A * f(x), where x is the read length.

For the 'EndToEnd' alignment mode, the default function is 'S,1,1.15'. For the 'Local' mode, the default function is 'S,1,0.75'.

Example: 'SeedIntervalFunction','S,2,2.15'

Data Types: char | string

`SeedLength` — Seed substring length to align during multiseed alignment
`22` (default) | positive integer

Seed substring length to align during the multiseed alignment, specified as a positive integer.

Data Types: double

`Skip` — Number of reads to ignore
`0` (default) | nonnegative integer

Number of reads to ignore from the beginning of the input files, specified as a nonnegative integer.

Data Types: double

`Trim3` — Number of residues to trim from 3' end
`0` (default) | nonnegative integer

Number of residues to trim from the 3' end of each read before aligning, specified as a nonnegative integer.

Data Types: double

`Trim5` — Number of residues to trim from 5' end
`0` (default) | nonnegative integer

Number of residues to trim from the 5' end of each read before aligning, specified as a nonnegative integer.

Data Types: double

`TrimTo` — Threshold to trim reads exceeding given number of bases
`Inf` (default) | nonnegative integer | two-element array

Since R2023b

Threshold to trim reads exceeding a given number of bases, specified as a nonnegative integer or two-element array. By default, no reads are trimmed.

If the value is a nonnegative integer N, reads that contain more than N bases are trimmed from the 3' end.

If the value is a two-element array [M,N], the first number M must be either 3 or 5, which indicates either the 3' or 5' end to trim from. The second number specifies the maximum read length and any reads containing more bases than N are trimmed.

Data Types: double
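
The two accepted forms of TrimTo can be sketched as follows, assuming the support package is installed. The length limit of 50 is illustrative.

```matlab
alignOptions = Bowtie2AlignOptions;

% Scalar form: trim reads longer than 50 bases from the 3' end.
alignOptions.TrimTo = 50;

% Two-element form [M,N]: trim reads longer than 50 bases from the 5' end.
alignOptions.TrimTo = [5 50];
```

The second assignment replaces the first, so the object ends up trimming from the 5' end.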

`TruncateReadName` — Flag to truncate read names
`true` or 1 (default) | `false` or 0

Since R2023b

Flag to truncate read names, specified as a numeric or logical 1 (true) or 0 (false). By default, bowtie2 truncates the read name after the first white space.

Data Types: double | logical

`UnalignedPairedReadSupplementFile` — Base name of files where unaligned paired reads are saved
empty string (default) | character vector | string scalar

Since R2023b

Base name of files where paired reads that are not aligned are saved, specified as a character vector or string scalar. bowtie2 creates two files, one for each mate of the pair. The files have the same format as the input data.

By default, bowtie2 does not create these supplement files.

Data Types: char | string

`UnalignedUnpairedReadSupplementFile` — Name of file where unaligned unpaired reads are saved
empty string (default) | character vector | string scalar

Since R2023b

Name of a file where unpaired reads that are not aligned are saved, specified as a character vector or string scalar. The file has the same format as the input data. Use ReadSupplementFileCompression to compress these supplement files.

By default, bowtie2 does not create the file.

Data Types: char | string

`UpTo` — Number of reads to consider from beginning of input files
`Inf` (default) | positive integer

Number of reads to consider from the beginning of input files, specified as a positive integer. The default value is Inf, that is, all reads are considered.

Data Types: double

`UseOneMismatchPriority` — Flag to indicate the prioritization of 1-mismatch alignments over the multiseed alignment
`true` or 1 (default) | `false` or 0

Since R2023b

Flag to indicate the prioritization of 1-mismatch alignments over the multiseed alignment, specified as a numeric or logical 1 (true) or 0 (false). By default, bowtie2 attempts to find the exact matches or matches with a single mismatch before trying a multiseed alignment.

Data Types: double | logical

Object Functions

`getBowtie2Command`	Translate object properties to Bowtie 2 options
`getBowtie2Table`	Retrieve table with object properties and equivalent Bowtie 2 options
`preset`	Set combination of alignment options
`run`	Map sequence reads to reference sequence using Bowtie 2
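
As a brief sketch of the first two object functions, the code below translates an options object into Bowtie 2 syntax and retrieves the mapping table. It assumes the support package is installed; the property values are illustrative, and the exact text of the outputs is not shown because it depends on the release.

```matlab
% Create an options object with two non-default settings (illustrative).
alignOptions = Bowtie2AlignOptions('Trim3',4,'NumThreads',2);

% Translate the object properties to options in Bowtie 2 syntax.
cmd = getBowtie2Command(alignOptions);
disp(cmd)

% Retrieve the table of properties and their equivalent Bowtie 2 options.
tbl = getBowtie2Table(alignOptions);
disp(tbl)
```

The table is a convenient way to look up which native Bowtie 2 option a given property corresponds to.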

Examples

Align Reads to Reference Sequence Using Bowtie 2

Build a set of index files for the Drosophila genome. An error message appears if you do not have the Bowtie 2 Support Package for Bioinformatics Toolbox installed when you run the function. Click the provided link to download the package from the Add-on menu.

For this example, the reference sequence Dmel_chr4.fa is already provided with the toolbox.

status = bowtie2build('Dmel_chr4.fa', 'Dmel_chr4');

If the index build is successful, the function returns 0 and creates the index files (*.bt2) in the current folder. The files have the prefix 'Dmel_chr4'.

Sometimes the index files exist, and you want to know the reference sequence used to build the index. In this case, use the bowtie2inspect function to get more information about the reference.

bowtie2inspect('Dmel_chr4', 'Dmel_chr4_retrieved.fa');

By default, the output file Dmel_chr4_retrieved.fa contains the sequence of the reference. You can also get summary information about the reference names and lengths instead of the actual sequence. For details on the available options, see Bowtie2InspectOptions.

Once the index is ready, map the read sequences to the reference using the bowtie2 function. The paired-end read files (SRR6008575_10k_1.fq and SRR6008575_10k_2.fq) are already provided with the toolbox.

bowtie2('Dmel_chr4','SRR6008575_10k_1.fq','SRR6008575_10k_2.fq','SRR6008575_10k_chr4.sam');

The output is a SAM-formatted file that contains the mapping results.

You can specify different alignment options by passing in a Bowtie 2 syntax string or using a Bowtie2AlignOptions object.

Suppose you want to trim some residues from the 3' end before aligning. First, create a Bowtie2AlignOptions object.

alignOpt = Bowtie2AlignOptions;

Trim four residues from the 3' end before aligning.

alignOpt.Trim3 = 4;

Map reads to the reference using the specified alignment option.

flag = bowtie2('Dmel_chr4','SRR6008575_10k_1.fq','SRR6008575_10k_2.fq','SRR6008575_10k_chr4_trimmed.sam',alignOpt);

References

[1] Langmead, B., and S. Salzberg. "Fast gapped-read alignment with Bowtie 2." Nature Methods. 9, 2012, 357–359.

Version History

Introduced in R2018a

Bowtie2AlignOptions

Description

Creation

Syntax

Description

Input Arguments

S — Alignment parameters character vector

Properties

AlignForwardStrand — Flag to allow unpaired reads to be aligned to forward strand true or 1 (default) | false or 0

AlignReverseComplementStrand — Flag to allow unpaired reads to be aligned to reverse strand true or 1 (default) | false or 0

AlignedPairedReadSupplementFile — Base name of files where aligned paired reads are saved empty string (default) | character vector | string scalar

AlignedUnpairedReadSupplementFile — Name of file where aligned unpaired reads are saved empty string (default) | character vector | string scalar

AllowDovetail — Flag to allow dovetail configurations of input reads false or 0 (default) | true or 1

AmbiguousPenalty — Penalty for positions with ambiguous characters 1 (default) | nonnegative integer

AppendCommentToSAM — Flag to append FASTQ or FASTA comments false or 0 (default) | true or 1

BAMAlignPairs — Flag to align paired-end BAM reads false or 0 (default) | true or 1

BAMPreserveTags — Flag to preserve tags from input BAM file false or 0 (default) | true or 1

Encoding — Encoding format of base quality 'Phred33' (default) | 'Phred64' | 'Solexa'

ExcludeContain — Flag to allow one mate alignment to contain other mate false or 0 (default) | true or 1

ExcludeDiscordant — Flag to include discordant alignments false or 0 (default) | true or 1

ExcludeMixed — Flag to exclude mixed alignments false or 0 (default) | true or 1

ExcludeOverlap — Flag to allow mate alignment overlap false or 0 (default) | true or 1

`ExcludeSAMHeaders` — Flag to exclude SAM headers
`false` or 0 (default) | `true` or 1

`ExcludeSQSAMHeaders` — Flag to exclude SAM SQ headers
`false` or 0 (default) | `true` or 1

`ExcludeUnaligned` — Flag to exclude reads that failed to align
`false` or 0 (default) | `true` or 1

`ExtraBowtie2Command` — Additional options not included in object properties
`''` (default) | character vector

`FASTAKMerParameters` — K-mer length and step size
`[]` (default) | two-element vector

`FilterQSEQ` — Flag to filter reads with nonzero QSEQ field
`false` or 0 (default) | `true` or 1

`IgnoreQuality` — Flag to ignore read position quality
`false` or 0 (default) | `true` or 1

`IgnoreSoftClippedBasesForTLEN` — Flag to consider soft-clipped bases as unmapped when calculating TLEN
`false` or 0 (default) | `true` or 1

`IntegerQualityEncoding` — Flag to specify quality values as integers
`false` or 0 (default) | `true` or 1

`MatchBonus` — Reward added to alignment score
`2` (default) | nonnegative integer

`MateOrientation` — Orientation of mate pairs
`"ForwardReverse"` (default) | `"ReverseForward"` | `"ForwardForward"`

`MaxAmbiguousFunction` — Function governing maximum number of ambiguous characters
`'L,0,0.15'` (default) | character vector | string scalar

`MaxFragmentLength` — Maximum fragment length for paired-end alignment
`500` (default) | positive integer

`MemoryMappedIndex` — Flag to use memory mapping when loading index
`false` or 0 (default) | `true` or 1

`MetricsFile` — Name of metrics file
empty string (default) | character vector | string scalar

`MetricsFileWriteFrequency` — Time interval for writing to metrics file
`1` (default) | positive integer

`MinFragmentLength` — Minimum fragment length for paired-end alignment
`0` (default) | nonnegative integer

`MinScoreFunction` — Function governing minimum score threshold of alignment
character vector | string scalar

`MismatchPenalty` — Maximum and minimum values to compute mismatch penalty
`[6 2]` (default) | two-element vector

`Mode` — Alignment mode
`'EndToEnd'` (default) | `'Local'`

`Nondeterministic` — Flag to reinitialize pseudo-random generator
`false` or 0 (default) | `true` or 1

`NoGapPositions` — Number of positions where gaps are not allowed
`4` (default) | nonnegative integer

`NumAlignments` — Maximum number of valid alignments to report
`'Best'` (default) | `'All'` | positive integer

`NumReseedings` — Maximum number of reseeding attempts
`2` (default) | nonnegative integer

`NumSeedExtensions` — Maximum number of consecutive seed extension attempts
`15` (default) | nonnegative integer

`NumSeedMismatches` — Number of allowed mismatches in seed alignment
`0` (default) | `1`

`NumThreads` — Number of parallel threads to perform alignment
`1` (default) | positive integer

`Offrate` — Offrate to use when reading index
`NaN` (default) | positive integer

`OmitSecondarySequence` — Flag to omit SEQ and QUAL fields
`false` or 0 (default) | `true` or 1

`PadPositions` — Position in reference sequence where alignment begins
`15` (default) | nonnegative integer

`ReadFormat` — File format of input reads
`""` (default) | `"Interleaved"` | `"BAM"` | `"FASTQ"` | `"FASTAKMer"` | ...

`ReadGapCosts` — Costs for opening and extending gap in read
`[5 3]` (default) | two-element vector of nonnegative integers

`ReadGroup` — Read group information to add as field on `@RG` header line
`''` (default) | character vector | string scalar

`ReadGroupID` — Read group ID to add on `@RG` header line
`''` (default) | character vector | string

`ReadSupplementFileCompression` — Compression type to use for supplement files
`"None"` (default) | `"gz"` | ...

`RefGapCosts` — Costs for opening and extending gap in reference
`[5 3]` (default) | two-element vector of nonnegative integers

`Reorder` — Flag to reorder SAM records
`false` or 0 (default) | `true` or 1

`Seed` — Number to set seed in pseudo-random number generator
`0` (default) | nonnegative integer

`SeedIntervalFunction` — Function governing distance between seed substrings
character vector | string scalar

`SeedLength` — Seed substring length to align during multiseed alignment
`22` (default) | positive integer

`Skip` — Number of reads to ignore
`0` (default) | nonnegative integer

`Trim3` — Number of residues to trim from 3' end
`0` (default) | nonnegative integer

`Trim5` — Number of residues to trim from 5' end
`0` (default) | nonnegative integer

`TrimTo` — Threshold to trim reads exceeding given number of bases
`Inf` (default) | nonnegative integer | two-element array

`TruncateReadName` — Flag to truncate read names
`true` or 1 (default) | `false` or 0

`UnalignedPairedReadSupplementFile` — Base name of files where unaligned paired reads are saved
empty string (default) | character vector | string scalar

`UnalignedUnpairedReadSupplementFile` — Name of file where unaligned unpaired reads are saved
empty string (default) | character vector | string scalar

`UpTo` — Number of reads to consider from beginning of input files
`Inf` (default) | positive integer

`UseOneMismatchPriority` — Flag to indicate the prioritization of 1-mismatch alignments over the multiseed alignment
`true` or 1 (default) | `false` or 0
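
As a minimal sketch of how several of the properties listed above can be combined, the following creates an options object with name-value arguments. The property names come from the list above. The specific values (four threads, fragment lengths of 100 to 800, and so on) are illustrative assumptions rather than recommendations, and the code assumes the Bowtie 2 support package is installed.

```matlab
% Illustrative values only; property names are taken from the list above.
alignOptions = Bowtie2AlignOptions( ...
    'Mode','Local', ...              % local instead of end-to-end alignment
    'NumThreads',4, ...              % run four alignment threads
    'Trim5',10, ...                  % trim 10 residues from the 5' end
    'MinFragmentLength',100, ...     % lower bound for paired-end fragments
    'MaxFragmentLength',800, ...     % upper bound for paired-end fragments
    'ExcludeUnaligned',true);        % omit reads that failed to align

% Inspect a few properties before passing the object to bowtie2.
disp(alignOptions.Mode)
disp(alignOptions.MaxFragmentLength)
```

Properties that are not named in the call keep the defaults shown in the list, so only the settings that differ from the defaults need to be specified.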

Object Functions

Examples

Align Reads to Reference Sequence Using Bowtie 2

References

Version History

See Also

External Websites
