Bowtie2AlignOptions
Options to map reads to reference sequence
Description
A Bowtie2AlignOptions object contains options to run the bowtie2 function, which aligns reads to a reference sequence.
Creation
Syntax
Description
alignOptions = Bowtie2AlignOptions creates a Bowtie2AlignOptions object with default property values.
Bowtie2AlignOptions requires the Bowtie 2 Support Package for Bioinformatics Toolbox™. If this support package is not installed, then the function provides a download link. For details, see Bioinformatics Toolbox Software Support Packages.
alignOptions = Bowtie2AlignOptions(Name,Value) sets properties using one or more name-value pair arguments. Enclose each property name in quotes. For example, alignOptions = Bowtie2AlignOptions('Trim5',10) specifies to trim 10 residues from the 5' end.
Input Arguments
S — Alignment parameters
character vector
Alignment parameters, specified as a character vector. S must be in the Bowtie 2 option syntax
(prefixed by one or two dashes) [1].
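The following is a minimal sketch of passing options in Bowtie 2 syntax. It assumes that the constructor accepts S as its only input (the syntax line for S is not shown above) and that --trim5 and -p are the Bowtie 2 flags for 5' trimming and thread count. The second call uses name-value pairs and is intended to request the same trimming and thread settings.

```matlab
% Requires the Bowtie 2 Support Package for Bioinformatics Toolbox.

% Options written in native Bowtie 2 syntax (assumed single-input form).
optFromString = Bowtie2AlignOptions('--trim5 10 -p 4');

% Comparable options written as name-value pairs.
optFromPairs = Bowtie2AlignOptions('Trim5',10,'NumThreads',4);
```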
Properties
AlignForwardStrand
— Flag to allow unpaired reads to be aligned to forward strand
true
or 1 (default) | false
or 0
Since R2023b
Flag to allow unpaired reads to be aligned to the forward (Watson)
reference strand, specified as a numeric or logical 1
(true
) or 0 (false
). Set this
option to false
to prevent bowtie2
from aligning reads to the forward reference strand.
Data Types: double
| logical
AlignReverseComplementStrand
— Flag to allow unpaired reads to be aligned to reverse strand
true
or 1 (default) | false
or 0
Since R2023b
Flag to allow unpaired reads to be aligned to the reverse (Crick)
reference strand, specified as a numeric or logical 1
(true
) or 0 (false
). Set this
option to false
to prevent bowtie2
from aligning reads to the reverse reference strand.
Data Types: double
| logical
AlignedPairedReadSupplementFile
— Base name of files where aligned paired reads are saved
empty string (default) | character vector | string scalar
Since R2023b
Base name of files where aligned paired reads are saved, specified as a character vector or string scalar. Paired reads that align at least one time are saved to the files. bowtie2 creates two files, one for each mate of the read pair. The files have the same format as the input data.
The function appends ".1" or ".2" to the base file name to specify each read pair file. If the base file name includes the % symbol, bowtie2 inserts 1 or 2 at this % position instead of appending ".1" or ".2". Use ReadSupplementFileCompression to compress these supplement files.
By default, bowtie2 does not create these supplement files.
Data Types: char
| string
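The naming rule for the two supplement files can be sketched in plain MATLAB. This sketch only mirrors the rule as described above. It is not the function's implementation, and the base names are hypothetical.

```matlab
% Hypothetical base names; no files are read or written.
baseNames = {'aligned_pairs.fq', 'aligned_%_pairs.fq'};
names = cell(numel(baseNames), 2);

for k = 1:numel(baseNames)
    base = baseNames{k};
    if any(base == '%')
        % Replace the % placeholder with the mate number.
        names{k,1} = strrep(base, '%', '1');
        names{k,2} = strrep(base, '%', '2');
    else
        % No placeholder: append the mate number as a suffix.
        names{k,1} = [base '.1'];
        names{k,2} = [base '.2'];
    end
    fprintf('%s -> %s, %s\n', base, names{k,1}, names{k,2});
end
```

Under this rule, the first base name yields aligned_pairs.fq.1 and aligned_pairs.fq.2, and the second yields aligned_1_pairs.fq and aligned_2_pairs.fq.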
AlignedUnpairedReadSupplementFile
— Name of file where aligned unpaired reads are saved
empty string (default) | character vector | string scalar
Since R2023b
Name of a file where aligned unpaired reads are saved, specified as a
character vector or string scalar. Unpaired reads that align at least one
time are saved to the file. The file has the same format as the input data.
Use ReadSupplementFileCompression
to compress these
supplement files.
By default, bowtie2
does not create the file.
Data Types: char
| string
AllowDovetail
— Flag to allow dovetail configurations of input reads
false
or 0 (default) | true
or 1
Flag to allow dovetail configurations of input reads, specified as a
numeric or logical 1 (true
) or 0 (
false
). This property specifies whether the alignment
of one mate can extend past the beginning of the alignment of the other mate
and be considered concordant.
This property applies to paired-end reads only.
Data Types: double
| logical
AmbiguousPenalty
— Penalty for positions with ambiguous characters
1
(default) | nonnegative integer
Penalty for positions with ambiguous characters on the read sequence, reference sequence, or both, specified as a nonnegative integer.
Data Types: double
AppendCommentToSAM
— Flag to append FASTQ or FASTA comments
false
or 0 (default) | true
or 1
Since R2023b
Flag to append FASTQ or FASTA comments to the output SAM file, specified
as a numeric or logical 1 (true
) or 0
(false
). A comment is any text after the first space
in the read name.
Data Types: double
| logical
BAMAlignPairs
— Flag to align paired-end BAM reads
false
or 0 (default) | true
or 1
Since R2023b
Flag to align the paired-end BAM reads, specified as a numeric or logical
1 (true
) or 0 ( false
). This flag is
functional only if you also set ReadFormat="BAM"
.
By default, bowtie2
attempts to align unpaired BAM
reads only. Set the value to true
to align paired-end
reads instead.
Data Types: double
| logical
BAMPreserveTags
— Flag to preserve tags from input BAM file
false
or 0 (default) | true
or 1
Since R2023b
Flag to preserve tags from the input BAM file by appending them to the SAM
output, specified as a numeric or logical 1 (true
) or 0 (
false
). Set the value to true
to
add the tags to the end of the corresponding SAM output file.
Data Types: double
| logical
Encoding
— Encoding format of base quality
'Phred33'
(default) | 'Phred64'
| 'Solexa'
Encoding format of the base quality in the input files, specified as one of the following: 'Phred33', 'Phred64', or 'Solexa'.
Data Types: char
| string
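As a brief illustration of what the two Phred encodings mean, the sketch below decodes quality characters by subtracting the conventional ASCII offsets of 33 and 64. The quality strings are hypothetical, and the 'Solexa' scale, which uses a different score definition, is not covered.

```matlab
% Hypothetical quality strings that encode the same three scores.
qual33 = '!5I';                  % as written in a Phred+33 file
qual64 = '@Th';                  % as written in a Phred+64 file

% Subtract the ASCII offset to recover the quality scores.
phred33 = double(qual33) - 33;
phred64 = double(qual64) - 64;

disp(phred33);
disp(phred64);
```

Both strings decode to the scores 0, 20, and 40, which is why the encoding must match the input files.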
ExcludeContain
— Flag to exclude alignments where one mate contains other mate
false
or 0 (default) | true
or 1
Flag to treat an alignment in which one mate alignment contains the alignment of the other mate as not concordant, specified as a numeric or logical 1 (true) or 0 (false). By default (false), such an alignment can be considered concordant.
This property applies to paired-end reads only.
Data Types: double
| logical
ExcludeDiscordant
— Flag to exclude discordant alignments
false
or 0 (default) | true
or 1
Flag to exclude discordant alignments, specified as a numeric or logical 1 (true) or 0 (false). A discordant alignment is an alignment where both mates align uniquely, but not in a way that satisfies the paired-end constraints.
Data Types: double
| logical
ExcludeMixed
— Flag to exclude mixed alignments
false
or 0 (default) | true
or 1
Flag to exclude mixed alignments, specified as a numeric or logical 1
(true
) or 0 ( false
). A mixed
alignment consists of mate reads that are not concordant or discordant, but
align individually.
This property applies to paired-end reads only.
Data Types: double
| logical
ExcludeOverlap
— Flag to exclude overlapping mate alignments
false
or 0 (default) | true
or 1
Flag to treat an alignment in which one mate alignment overlaps the alignment of the other mate as not concordant, specified as a numeric or logical 1 (true) or 0 (false). By default (false), overlapping mate alignments can be considered concordant.
Data Types: double
| logical
ExcludeSAMHeaders
— Flag to exclude SAM headers
false
or 0 (default) | true
or 1
Since R2023b
Flag to exclude SAM headers, specified as a numeric or logical 1
(true
) or 0 ( false
). A SAM header
starts with the @
symbol.
Data Types: double
| logical
ExcludeSQSAMHeaders
— Flag to exclude SAM SQ headers
false
or 0 (default) | true
or 1
Since R2023b
Flag to exclude SAM reference sequence header lines in the output SAM
file, specified as a numeric or logical 1 (true
) or 0 (
false
). A reference sequence header line starts with
@SQ
.
Data Types: double
| logical
ExcludeUnaligned
— Flag to exclude reads that failed to align
false
or 0 (default) | true
or 1
Flag to exclude reads that failed to align, specified as a numeric or
logical 1 (true
) or 0 (false
).
Data Types: double
| logical
ExtraBowtie2Command
— Additional options not included in object properties
''
(default) | character vector
Additional options not included in the object properties, specified as
a character vector. The character vector must be in the Bowtie 2
option syntax (prefixed by one or two dashes). The default value
is an empty character vector ''
.
Example: 'ExtraBowtie2Command','--version'
Data Types: char
| string
FASTAKMerParameters
— K-mer length and step size
[]
(default) | two-element vector
Since R2023b
K-mer length and step size to use when you set
ReadFormat="FASTAKMer"
, specified as a two-element
vector of positive integers.
Data Types: double
FilterQSEQ
— Flag to filter reads with nonzero QSEQ field
false
or 0 (default) | true
or 1
Since R2023b
Flag to filter reads with nonzero QSEQ filter field, specified as a
numeric or logical 1 (true
) or 0 (
false
). This flag is functional only if you also set
ReadFormat="QSeq"
.
Data Types: double
| logical
IgnoreQuality
— Flag to ignore read position quality
false
or 0 (default) | true
or 1
Flag to ignore the actual read position quality when a mismatch occurs,
specified as a numeric or logical 1 (true
) or 0
(false
). Setting this property to
true
allows the quality value at that mismatched
position to be the highest possible, regardless of the actual value.
Data Types: double
| logical
IgnoreSoftClippedBasesForTLEN
— Flag to consider soft-clipped bases as unmapped when calculating TLEN
false
or 0 (default) | true
or 1
Since R2023b
Flag to consider soft-clipped bases as unmapped when calculating TLEN in
the output SAM file, specified as a numeric or logical 1
(true
) or 0 ( false
). This flag is
functional only if you also set Mode="Local"
. TLEN
stands for signed observed template length.
Data Types: double
| logical
IntegerQualityEncoding
— Flag to specify quality values as integers
false
or 0 (default) | true
or 1
Since R2023b
Flag to specify quality values in the input reads as space-separated
integers rather than ASCII characters, specified as a numeric or logical 1
(true
) or 0 ( false
).
Data Types: double
| logical
MatchBonus
— Reward added to alignment score
2
(default) | nonnegative integer
Reward added to the alignment score when a position in the read matches a position in the reference, specified as a nonnegative integer.
Data Types: double
MateOrientation
— Orientation of mate pairs
"ForwardReverse"
(default) | "ReverseForward"
| "ForwardForward"
Since R2023b
Orientation of mate pairs for paired-end alignment, specified as one of the following:
"ForwardReverse" — Aligned pairs are derived from a forward-oriented mate upstream of a reverse-oriented complement mate.
"ReverseForward" — Aligned pairs are derived from a reverse-oriented complement mate upstream of a forward-oriented mate.
"ForwardForward" — Aligned pairs are derived from a forward-oriented mate upstream of a forward-oriented mate.
Data Types: char
| string
MaxAmbiguousFunction
— Function governing maximum number of ambiguous characters
'L,0,0.15'
(default) | character vector | string scalar
Function governing the maximum number of ambiguous characters allowed in a read, specified as a character vector or string scalar.
The function has the format 'f,B,A', where f is a function type, B is a constant term, and A is a coefficient. Available function types are:
'C' – Constant
'L' – Linear
'S' – Square root
'G' – Natural log
The resulting function is H(x) = B + A * f(x), where x is the read length.
The default function is 'L,0,0.15', that is, H(x) = 0 + 0.15 * x.
Example: 'MaxAmbiguousFunction','L,-0.4,-0.6'
Data Types: char
| string
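To make the 'f,B,A' format concrete, the sketch below parses a function specification and evaluates H(x) for one read length. The read length is hypothetical, and treating the constant type 'C' as H(x) = B is an assumption, because the text above does not define f(x) for that type.

```matlab
spec = 'L,0,0.15';               % default MaxAmbiguousFunction value
x = 100;                         % hypothetical read length

parts = strsplit(spec, ',');
B = str2double(parts{2});        % constant term
A = 0;                           % coefficient, if one is given
if numel(parts) >= 3
    A = str2double(parts{3});
end

switch parts{1}
    case 'C'
        fx = 0;                  % assumption: constant type uses B only
    case 'L'
        fx = x;                  % linear
    case 'S'
        fx = sqrt(x);            % square root
    case 'G'
        fx = log(x);             % natural log
    otherwise
        error('Unknown function type: %s', parts{1});
end

H = B + A * fx;
fprintf('H(%d) = %g\n', x, H);
```

With the default specification, a 100-base read gives H(100) = 15.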
MaxFragmentLength
— Maximum fragment length for paired-end alignment
500 (default) | positive integer
Since R2023b
Maximum fragment length for the paired-end alignment, specified as a positive integer.
The larger the difference between MaxFragmentLength and MinFragmentLength is, the slower bowtie2 runs.
This option does not take trimming into account. That is, if you specify trimming options, such as Trim3 or Trim5, MaxFragmentLength is applied to the untrimmed mates.
Data Types: double
MemoryMappedIndex
— Flag to use memory mapping when loading index
false
or 0 (default) | true
or 1
Flag to use memory mapping (instead of file I/O) when loading the index, specified as a numeric or logical 1 (true) or 0 (false). Memory mapping allows many concurrent processes to share the memory image of the index, resulting in a more efficient parallelization of the task.
Data Types: double
| logical
MetricsFile
— Name of metrics file
empty string (default) | character vector | string scalar
Since R2023b
Name of the metrics file, specified as a character vector or string
scalar. This file contains performance metrics for the alignment generated
by bowtie2
. By default, bowtie2
does not generate a metrics file.
Data Types: char
| string
MetricsFileWriteFrequency
— Time interval for writing to metrics file
1 (default) | positive integer
Since R2023b
Time interval in seconds for writing to the metrics file, specified as a
positive integer. This option is functional only if you also specify
MetricsFile
. If so, by default,
bowtie2
writes a new metrics record every
second.
Data Types: double
MinFragmentLength
— Minimum fragment length for paired-end alignment
0 (default) | nonnegative integer
Since R2023b
Minimum fragment length for the paired-end alignment, specified as a nonnegative integer.
The larger the difference between MaxFragmentLength and MinFragmentLength is, the slower bowtie2 runs.
This option does not take trimming into account. That is, if you specify trimming options, such as Trim3 or Trim5, MinFragmentLength is applied to the untrimmed mates.
Data Types: double
MinScoreFunction
— Function governing minimum score threshold of alignment
character vector | string scalar
Function governing the minimum score threshold of an alignment, specified as a character vector or string scalar.
The function has the format 'f,B,A', where f is a function type, B is a constant term, and A is a coefficient. Available function types are:
'C' – Constant
'L' – Linear
'S' – Square root
'G' – Natural log
The resulting function is H(x) = B + A * f(x), where x is the read length.
For the 'EndToEnd' alignment mode, the default function is 'L,-0.6,-0.6'. For the 'Local' mode, the default function is 'G,20,8'.
Example: 'MinScoreFunction','L,-0.4,-0.6'
Data Types: char
| string
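The two default thresholds can be compared by evaluating H(x) = B + A * f(x) directly. The sketch below does this for a hypothetical read length of 100 and hard-codes the two default functions rather than parsing them.

```matlab
x = 100;                                  % hypothetical read length

% 'L,-0.6,-0.6': linear function used by default in 'EndToEnd' mode.
minScoreEndToEnd = -0.6 + (-0.6) * x;

% 'G,20,8': natural-log function used by default in 'Local' mode.
minScoreLocal = 20 + 8 * log(x);

fprintf('EndToEnd: %.1f, Local: %.1f\n', minScoreEndToEnd, minScoreLocal);
```

For this read length, the thresholds are about -60.6 for 'EndToEnd' and about 56.8 for 'Local'.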
MismatchPenalty
— Maximum and minimum values to compute mismatch penalty
[6 2]
(default) | two-element vector
Maximum and minimum values to compute the mismatch penalty during alignment, specified as a two-element vector. The first element is the maximum value and the second element is the minimum value.
For each position where a read character aligns to a reference character, the characters do not match, and neither is an N character, a number between the minimum and maximum values (inclusive) is subtracted from the alignment score.
Example: 'MismatchPenalty',[5 3]
Data Types: double
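This page does not say how the subtracted number is chosen between the two bounds. The sketch below uses the quality-scaled formula that the Bowtie 2 manual gives for its --mp option, MN + floor((MX - MN) * min(Q,40)/40). Applying that formula to this property is an assumption, and the quality values are hypothetical.

```matlab
mismatchPenalty = [6 2];          % default [maximum minimum]
MX = mismatchPenalty(1);
MN = mismatchPenalty(2);

Q = [0 10 20 30 40];              % hypothetical Phred quality values

% Formula from the Bowtie 2 manual (--mp); assumed to apply here.
penalty = MN + floor((MX - MN) .* (min(Q, 40) ./ 40));

disp([Q; penalty]);
```

Under this assumption, the penalties for the five quality values are 2, 3, 4, 5, and 6, so a mismatch at a high-quality position costs more than one at a low-quality position.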
Mode
— Alignment mode
'EndToEnd'
(default) | 'Local'
Alignment mode, specified as 'EndToEnd'
or
'Local'
.
In the 'Local'
mode, only part of the read must align
to the reference, and some residues can be omitted (soft-clipped) to achieve
the best alignment score. In the 'EndToEnd'
mode, the
entire read must align without any soft-clipping.
Data Types: char
| string
Nondeterministic
— Flag to reinitialize pseudo-random generator
false
or 0 (default) | true
or 1
Flag to reinitialize the pseudo-random generator for each read using the
current time, specified as a numeric or logical 1 (true
)
or 0 (false
). If true
, the alignments
reported for two identical reads can be different. The default value is
false
, that is, the pseudo-random generator is
reinitialized using a seed derived from read information and the seed
number.
Data Types: double
| logical
NoGapPositions
— Number of positions where gaps are not allowed
4
(default) | nonnegative integer
Number of positions at the beginning or end of each read where gaps are not allowed, specified as a nonnegative integer.
Data Types: double
NumAlignments
— Maximum number of valid alignments to report
'Best'
(default) | 'All'
| positive integer
Maximum number of valid alignments to report before terminating the
search, specified as a positive integer, 'Best'
, or
'All'
. If you specify a positive integer
N, the function searches for up to
N distinct, valid alignments for each read.
'Best'
reports the best alignment for each read.
'All'
reports all the valid alignments for each read
sorted by alignment scores.
The alignment score for a paired-end alignment equals the sum of the alignment scores of individual mates.
Data Types: double
| char
| string
NumReseedings
— Maximum number of reseeding attempts
2
(default) | nonnegative integer
Maximum number of reseeding attempts with repetitive seeds, specified as a nonnegative integer. During reseeding, the function chooses a new set of reads at different offsets to find more alignments.
Data Types: double
NumSeedExtensions
— Maximum number of consecutive seed extension attempts
15
(default) | nonnegative integer
Maximum number of consecutive seed extension attempts before getting a new seed, specified as a nonnegative integer. A seed extension fails if it does not yield an alignment with the best (or second-best) score.
Data Types: double
NumSeedMismatches
— Number of allowed mismatches in seed alignment
0
(default) | 1
Number of allowed mismatches in a seed alignment during the multiseed
alignment, specified as 0
or 1
.
Data Types: double
NumThreads
— Number of parallel threads to perform alignment
1
(default) | positive integer
Number of parallel threads to perform the alignment, specified as a positive integer. Threads run on separate processors or cores. Increasing the number of threads provides a significant increase in speed (close to linear) but also increases the memory footprint.
Data Types: double
Offrate
— Offrate to use when reading index
NaN
(default) | positive integer
Offrate to use when reading the index to reduce the memory footprint, specified as a positive integer. The offrate must be greater than the offrate used to build the index.
Data Types: double
OmitSecondarySequence
— Flag to omit SEQ and QUAL fields
false
or 0 (default) | true
or 1
Since R2023b
Flag to omit SEQ and QUAL fields, specified as a numeric or logical 1
(true
) or 0 (false
). When this
option is true, bowtie2
prints an asterisk
"*"
for these fields in the output SAM file.
Data Types: double
| logical
PadPositions
— Position in reference sequence where alignment begins
15
(default) | nonnegative integer
Position in the reference sequence where the alignment for each sequence begins, specified as a nonnegative integer.
Data Types: double
ReadFormat
— File format of input reads
""
(default) | "Interleaved"
| "BAM"
| "FASTQ"
| "FASTAKMer"
| ...
Since R2023b
File format for the input reads, specified as one of the following strings.
"" — Uses the extensions of the input files to determine the file format. All the input files must have the same file extension.
"FASTQ" — FASTQ file format.
"FASTA" — FASTA file format.
"FASTAKMer" — FASTA file format, where k-mers from the input files are aligned. You must also specify FASTAKMerParameters, which defines the k-mer length and step size.
"Interleaved" — Interleaved FASTQ files, where the first two records represent a mate pair.
"BAM" — Sorted and unaligned BAM files.
"RawSequences" — Input files contain a single sequence per line.
"QSeq" — QSEQ file format.
"Tab5" — TAB5 file format, where each read or pair is on a single line. An unpaired read line is [name]\t[seq]\t[qual]\n. A paired-end read line is [name]\t[seq1]\t[qual1]\t[seq2]\t[qual2]\n. An input file can contain a mix of unpaired and paired-end reads, and the function can distinguish and handle both read types.
"Tab6" — TAB6 file format, where an unpaired read line is [name]\t[seq]\t[qual]\n and a paired read line is [name1]\t[seq1]\t[qual1]\t[name2]\t[seq2]\t[qual2]\n.
Data Types: char
| string
ReadGapCosts
— Gap costs for opening and extending gap
[5 3]
(default) | two-element vector of nonnegative integers
Gap costs for opening and extending a gap on the read, specified as a two-element vector of nonnegative integers. The first element is the cost of opening a gap, and the second element is the cost of extending a gap. Given the cost vector [GO GE], a read gap of length N is assigned a penalty of GO + N * GE.
Example: 'ReadGapCosts',[4 2]
Data Types: double
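As a worked instance of the penalty GO + N * GE, the sketch below evaluates the default cost vector for a few hypothetical gap lengths.

```matlab
readGapCosts = [5 3];             % default [GO GE]
GO = readGapCosts(1);             % cost of opening a gap
GE = readGapCosts(2);             % cost of extending a gap

N = 1:4;                          % hypothetical gap lengths
penalty = GO + N .* GE;

disp([N; penalty]);
```

With the default costs, gaps of length 1 to 4 are penalized 8, 11, 14, and 17.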
ReadGroup
— Read group information to add as field on @RG
header line
''
(default) | character vector | string scalar
Read group information to add as a field on the @RG
header line in the output SAM report, specified as a character vector or
string. This property applies only if you specify
'ReadGroupID'
.
Data Types: char
| string
ReadGroupID
— Read group ID to add on @RG
header line
''
(default) | character vector | string
Read group ID to add on the @RG
header line in the
output SAM report, specified as a character vector or string. If you specify
any read group ID, the function prints the @RG
header
line with the tag ID:
followed by the specified group
ID.
Data Types: char
| string
ReadSupplementFileCompression
— Compression type to use for supplement files
"None"
(default) | "gz"
| ...
Since R2023b
Compression type to use for the supplement files, specified as "None", "gz", "bz2", or "lz4". Use the following options to specify supplement files: AlignedPairedReadSupplementFile, AlignedUnpairedReadSupplementFile, UnalignedPairedReadSupplementFile, and UnalignedUnpairedReadSupplementFile.
Data Types: char
| string
RefGapCosts
— Gap costs for opening and extending gap
[5 3]
(default) | two-element vector of nonnegative integers
Gap costs for opening and extending a gap on the reference, specified as a two-element vector of nonnegative integers. The first element is the cost of opening a gap, and the second element is the cost of extending a gap. Given the cost vector [GO GE], a reference gap of length N is assigned a penalty of GO + N * GE.
Example: 'RefGapCosts',[4 2]
Data Types: double
Reorder
— Flag to reorder SAM records
false
or 0 (default) | true
or 1
Flag to reorder SAM records to maintain the same order as in the input
files, specified as a numeric or logical 1 (true
) or 0
(false
). This property applies only when the number
of parallel threads is greater than one. When you use one thread, the order
of the records in the output is the same as the order of the input.
Data Types: double
| logical
Seed
— Number to set seed in pseudo-random number generator
0
(default) | nonnegative integer
Number to set the seed in the pseudo-random number generator, specified as a nonnegative integer.
Example: 'Seed',3
Data Types: double
SeedIntervalFunction
— Function governing distance between seed substrings
character vector | string scalar
Function governing the distance between seed substrings during the multiseed alignment, specified as a character vector or string scalar.
The function has the format 'f,B,A', where f is a function type, B is a constant term, and A is a coefficient. Available function types are:
'C' – Constant
'L' – Linear
'S' – Square root
'G' – Natural log
The resulting function is H(x) = B + A * f(x), where x is the read length.
For the 'EndToEnd' alignment mode, the default function is 'S,1,1.15'. For the 'Local' mode, the default function is 'S,1,0.75'.
Example: 'SeedIntervalFunction','S,2,2.15'
Data Types: char
| string
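The effect of the two default seed interval functions can be seen by evaluating them over a few hypothetical read lengths. This sketch hard-codes the defaults and reports the raw function values; any rounding that bowtie2 applies afterward is not shown.

```matlab
x = [50 100 150];                          % hypothetical read lengths

% 'S,1,1.15': default in 'EndToEnd' mode.
intervalEndToEnd = 1 + 1.15 * sqrt(x);

% 'S,1,0.75': default in 'Local' mode.
intervalLocal = 1 + 0.75 * sqrt(x);

disp([x; intervalEndToEnd; intervalLocal]);
```

For a 100-base read, the values are 12.5 in 'EndToEnd' mode and 8.5 in 'Local' mode, so the 'Local' default spaces seeds more closely.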
SeedLength
— Seed substring length to align during multiseed alignment
22
(default) | positive integer
Seed substring length to align during the multiseed alignment, specified as a positive integer.
Data Types: double
Skip
— Number of reads to ignore
0
(default) | nonnegative integer
Number of reads to ignore from the beginning of the input files, specified as a nonnegative integer.
Data Types: double
Trim3
— Number of residues to trim from 3' end
0
(default) | nonnegative integer
Number of residues to trim from the 3' end of each read before aligning, specified as a nonnegative integer.
Data Types: double
Trim5
— Number of residues to trim from 5' end
0
(default) | nonnegative integer
Number of residues to trim from the 5' end of each read before aligning, specified as a nonnegative integer.
Data Types: double
TrimTo
— Threshold to trim reads exceeding given number of bases
Inf
(default) | nonnegative integer | two-element array
Since R2023b
Threshold to trim reads exceeding a given number of bases, specified as a nonnegative integer or two-element array. By default, no reads are trimmed.
If the value is a nonnegative integer N, reads that contain more than N bases are trimmed from the 3' end.
If the value is a two-element array [M,N], the first number M must be either 3 or 5, which indicates whether to trim from the 3' or 5' end. The second number N specifies the maximum read length, and any reads containing more than N bases are trimmed.
Data Types: double
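The two forms of the TrimTo value can be illustrated on a single read. The sketch below mirrors the rule as described above on a hypothetical 10-base read. It is not the function's implementation.

```matlab
read = 'AACCGGTTAC';        % hypothetical read, written 5' to 3'
trimTo = [5 6];             % trim from the 5' end down to 6 bases

if isscalar(trimTo)
    side = 3;               % scalar form trims from the 3' end
    maxLen = trimTo;
else
    side = trimTo(1);       % 3 or 5
    maxLen = trimTo(2);     % maximum read length
end

trimmed = read;
if numel(read) > maxLen
    if side == 3
        trimmed = read(1:maxLen);            % drop bases at the 3' end
    else
        trimmed = read(end-maxLen+1:end);    % drop bases at the 5' end
    end
end

disp(trimmed);
```

Here the read is shortened to GGTTAC. Setting trimTo = 6 instead would keep AACCGG, and the default Inf leaves the read unchanged.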
TruncateReadName
— Flag to truncate read names
true
or 1 (default) | false
or 0
Since R2023b
Flag to truncate read names, specified as a numeric or logical 1
(true
) or 0 (false
). By default,
bowtie2
truncates the read name after the first
white space.
Data Types: double
| logical
UnalignedPairedReadSupplementFile
— Base name of files where unaligned paired reads are saved
empty string (default) | character vector | string scalar
Since R2023b
Base name of files where paired reads that are not aligned are saved, specified as a character vector or string scalar. bowtie2 creates two files, one for each mate of the read pair. The files have the same format as the input data.
The function appends ".1" or ".2" to the base file name to specify each read pair file. If the base file name includes the % symbol, bowtie2 inserts 1 or 2 at this % position instead of appending ".1" or ".2". Use ReadSupplementFileCompression to compress these supplement files.
By default, bowtie2 does not create these supplement files.
Data Types: char
| string
UnalignedUnpairedReadSupplementFile
— Name of file where unaligned unpaired reads are saved
empty string (default) | character vector | string scalar
Since R2023b
Name of a file where unpaired reads that are not aligned are saved,
specified as a character vector or string scalar. The file has the same
format as the input data. Use
ReadSupplementFileCompression
to compress these
supplement files.
By default, bowtie2
does not create the file.
Data Types: char
| string
UpTo
— Number of reads to consider from beginning of input files
Inf
(default) | positive integer
Number of reads to consider from the beginning of input files, specified
as a positive integer. The default value is Inf
, that is,
all reads are considered.
Data Types: double
UseOneMismatchPriority
— Flag to indicate the prioritization of 1-mismatch alignments over the multiseed alignment
true
or 1 (default) | false
or 0
Since R2023b
Flag to indicate the prioritization of 1-mismatch alignments over the
multiseed alignment, specified as a numeric or logical 1
(true
) or 0 (false
). By default,
bowtie2
attempts to find the exact matches or
matches with a single mismatch before trying a multiseed alignment.
Data Types: double
| logical
Object Functions
getBowtie2Command | Translate object properties to Bowtie 2 options |
getBowtie2Table | Retrieve table with object properties and equivalent Bowtie 2 options |
preset | Set combination of alignment options |
run | Map sequence reads to reference sequence using Bowtie 2 |
Examples
Align Reads to Reference Sequence Using Bowtie 2
Build a set of index files for the Drosophila genome. An error message appears if you do not have the Bowtie 2 Support Package for Bioinformatics Toolbox installed when you run the function. Click the provided link to download the package from the Add-on menu.
For this example, the reference sequence Dmel_chr4.fa is already provided with the toolbox.
status = bowtie2build('Dmel_chr4.fa', 'Dmel_chr4_index');
If the index build is successful, the function returns 0 and creates the index files (*.bt2) in the current folder. The files have the prefix 'Dmel_chr4_index'.
Sometimes the index files exist, and you want to know the reference sequence used to build the index. In this case, use the bowtie2inspect function to get more information about the reference.
bowtie2inspect('Dmel_chr4', 'Dmel_chr4_retrieved.fa');
By default, the output file Dmel_chr4_retrieved.fa contains the sequence of the reference. You can also get summary information about the reference names and lengths instead of the actual sequence. For details on the available options, see Bowtie2InspectOptions.
Once the index is ready, map the read sequences to the reference using the bowtie2 function. The paired-end read files (SRR6008575_10k_1.fq and SRR6008575_10k_2.fq) are already provided with the toolbox.
bowtie2('Dmel_chr4','SRR6008575_10k_1.fq','SRR6008575_10k_2.fq','SRR6008575_10k_chr4.sam');
The output is a SAM-formatted file that contains the mapping results.
You can specify different alignment options by passing in a Bowtie 2 syntax string or using a Bowtie2AlignOptions object.
Suppose you want to trim some residues from the 3' end before aligning. First, create a Bowtie2AlignOptions object.
alignOpt = Bowtie2AlignOptions;
Trim four residues from the 3' end before aligning.
alignOpt.Trim3 = 4;
Map reads to the reference using the specified alignment option.
flag = bowtie2('Dmel_chr4','SRR6008575_10k_1.fq','SRR6008575_10k_2.fq','SRR6008575_10k_chr4_trimmed.sam',alignOpt);
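Before running an alignment, it can help to check how the object properties translate to Bowtie 2 options. The sketch below uses the getBowtie2Command object function listed above. It assumes that the function takes the options object as its only input, and the NumThreads value is an arbitrary illustration.

```matlab
% Requires the Bowtie 2 Support Package for Bioinformatics Toolbox.
alignOpt = Bowtie2AlignOptions;
alignOpt.Trim3 = 4;           % trim four residues from the 3' end
alignOpt.NumThreads = 2;      % illustrative thread count

% Translate the object properties to Bowtie 2 options and display them.
cmd = getBowtie2Command(alignOpt)
```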
References
[1] Langmead, B., and S. Salzberg. "Fast gapped-read alignment with Bowtie 2." Nature Methods. 9, 2012, 357–359.
Version History
Introduced in R2018a
See Also
bowtie2
| bowtie2inspect
| bowtie2build
| Bowtie2BuildOptions
| Bowtie2InspectOptions