
bioinfo.pipeline.block.CuffNorm

Bioinformatics pipeline block to normalize transcript expression levels

Since R2023a


Description

A CuffNorm block generates expression tables that contain the normalized expression level of each isoform, gene, transcript start site (TSS), and coding sequence (CDS). The block normalizes the expression levels based on library size.

The block requires the Cufflinks Support Package for the Bioinformatics Toolbox™. If the support package is not installed, then a download link is provided. For details, see Bioinformatics Toolbox Software Support Packages.

Creation

Description

b = bioinfo.pipeline.block.CuffNorm creates a CuffNorm block.


b = bioinfo.pipeline.block.CuffNorm(options) creates a CuffNorm block that uses the additional options specified by options.

b = bioinfo.pipeline.block.CuffNorm(Name=Value) specifies additional options as the property names and values of a CuffNormOptions object. This object is set as the value of the Options property of the block.
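
The three creation syntaxes above can be sketched as follows. This is a minimal illustration, not a definitive recipe: it assumes the Cufflinks support package is installed, and it uses only the IncludeAll option and the native flag string that appear on this page. The variable names b1 through b4 are arbitrary.

```matlab
import bioinfo.pipeline.block.CuffNorm

% Syntax 1: block with a default CuffNormOptions object.
b1 = CuffNorm;

% Syntax 2a: options given as a string in the native cuffnorm syntax.
b2 = CuffNorm("--library-type fr-secondstrand");

% Syntax 2b: options given as a CuffNormOptions object.
opt = CuffNormOptions;
opt.IncludeAll = true;
b3 = CuffNorm(opt);

% Syntax 3: options given as name-value arguments.
b4 = CuffNorm(IncludeAll=true);

% In each case, the options end up in the Options property of the block.
disp(class(b4.Options))
```

Whichever syntax you use, you can inspect or adjust the resulting options afterward through the Options property of the block.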

Input Arguments


options

CuffNorm options, specified as a CuffNormOptions object, string, or character vector.

If you specify a string or character vector, it must use the CuffNorm native syntax, in which each option is prefixed by one or two dashes [1].

Name-Value Arguments


Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

Note

The following is a partial list of arguments. For the complete list, see the properties of the CuffNormOptions object.

ExtraCommand

Additional commands, specified as a character vector or string. The commands must be in the native syntax (prefixed by one or two dashes). Use this option to apply undocumented flags and flags that have no corresponding MATLAB® properties.

When the software converts the original flags to MATLAB properties, it stores any unrecognized flags in this property.

Example: '--library-type fr-secondstrand'

Data Types: char | string

IncludeAll

Flag to include all object properties, with their default values, when converting to the original options syntax, specified as true or false. Use getCommand to convert the properties to the original syntax, in which each option is prefixed by one or two dashes (such as '-d 100 -e 80'). When the value is false (the default), getCommand(optionsObject) converts only the properties that you specified. When the value is true, getCommand converts all available properties and uses the default values for the properties that you did not specify.

Note

If you set IncludeAll to true, the software converts all available properties, using default values for unspecified properties. The only exception is when the default value of a property is NaN, Inf, [], '', or "". In this case, the software does not translate the corresponding property.

Example: true

Data Types: logical
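
As a rough illustration of how IncludeAll affects getCommand, the sketch below creates a block from the native flag string used in the earlier example and converts its options back to the native syntax twice. It assumes the Cufflinks support package is installed. The variable names cmdSpecified and cmdAll are arbitrary, and the exact command text depends on the installed version, so no output is shown.

```matlab
% Create a block from a native-syntax option string.
b = bioinfo.pipeline.block.CuffNorm("--library-type fr-secondstrand");

% IncludeAll is false by default: only the specified options are converted.
cmdSpecified = getCommand(b.Options);

% With IncludeAll set to true, properties left at their defaults are
% converted too, except those whose default is NaN, Inf, [], '', or "".
b.Options.IncludeAll = true;
cmdAll = getCommand(b.Options);

disp(cmdSpecified)
disp(cmdAll)
```

Comparing the two displayed commands is a quick way to see which default values the block would pass to cuffnorm.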

Properties


ErrorHandler

Function to handle errors from the run method of the block, specified as a function handle. The handle specifies the function to call if the run method encounters an error within a pipeline. For the pipeline to continue after a block fails, ErrorHandler must return a structure that is compatible with the output ports of the block. The error handling function is called with the following two inputs:

  • Structure with these fields:

    Field       Description
    identifier  Identifier of the error that occurred
    message     Text of the error message
    index       Linear index indicating which block process failed in the parallel run. By default, the index is 1 because there is only one run per block. For details on how block inputs can be split across different dimensions for multiple run calls, see Bioinformatics Pipeline SplitDimension.

  • Input structure passed to the run method when it fails

Data Types: function_handle
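
A minimal sketch of such an error handler is shown below, written as a script with a local function. The function name handleCuffNormError, the warning identifier, and the mock error structure are hypothetical. Returning empty strings as placeholder values is an assumption about what counts as compatible with the output ports, so adapt the returned values to your pipeline. Attaching the handler requires the Cufflinks support package.

```matlab
% Attach the handler to a CuffNorm block.
b = bioinfo.pipeline.block.CuffNorm;
b.ErrorHandler = @handleCuffNormError;

% Exercise the handler directly with a mock error structure.
mockErr = struct("identifier","demo:failure", ...
                 "message","Simulated failure","index",1);
out = handleCuffNormError(mockErr, struct());
disp(out)

function out = handleCuffNormError(err, inputs) %#ok<INUSD>
    % err has the fields identifier, message, and index.
    % inputs is the input structure that was passed to the run method.
    warning("CuffNormExample:blockFailed", "Run %d failed (%s): %s", ...
        err.index, err.identifier, err.message);

    % Placeholder values (an assumption), one field per output port,
    % so that the pipeline can continue after the failure.
    out = struct("IsoformFPKMFile","", "GeneFPKMFile","", ...
                 "TSSFPKMFile","", "CDSFPKMFile","");
end
```

The handler reports the failure as a warning instead of rethrowing it. That choice suits exploratory runs. For production pipelines you might prefer to log the error elsewhere or to let the failure stop the pipeline.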

Inputs

This property is read-only.

Input ports of the block, specified as a structure. The field names of the structure are the names of the block input ports, and the field values are bioinfo.pipeline.Input objects. These objects describe the input port behaviors. The input port names are the expected field names of the input structure that you pass to the block run method.

The CuffNorm block Inputs structure has the following fields:

  • GenomicAnnotationFile — Name of the transcript annotation file. The file can be a GTF or GFF file produced by Cufflinks, CuffCompare, or another source of GTF annotations. This input is a required input that must be satisfied.

  • GenomicAlignmentFiles — Names of SAM, BAM, or CXB files containing alignment records for each sample. This input is a required input that must be satisfied.

The default value for each input field is a bioinfo.pipeline.datatypes.Unset object, which means that the input value is not set yet.

Data Types: struct
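
To make the relationship between the Inputs property and the run method concrete, the following sketch lists the input port names of a block and creates the matching input structure with emptyInputs. It assumes the Cufflinks support package is installed. The variable names are arbitrary.

```matlab
b = bioinfo.pipeline.block.CuffNorm;

% The field names of Inputs are the input port names.
inputNames = fieldnames(b.Inputs);
disp(inputNames)

% emptyInputs returns a structure with one field per input port,
% ready to be filled in before it is passed to the run method.
in = emptyInputs(b);
disp(fieldnames(in))
```

Both lists should contain GenomicAnnotationFile and GenomicAlignmentFiles, the two required inputs described above.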

Outputs

This property is read-only.

Output ports of the block, specified as a structure. The field names of the structure are the names of the block output ports, and the field values are bioinfo.pipeline.Output objects. These objects describe the output port behaviors. The field names of the output structure returned by the block run method are the same as the output port names.

The CuffNorm block Outputs structure has the following fields:

  • IsoformFPKMFile — Name of a file containing the normalized expression level for each isoform.

  • GeneFPKMFile — Name of a file containing the normalized expression level for each gene.

  • TSSFPKMFile — Name of a file containing the normalized expression level for each transcript start site (TSS).

  • CDSFPKMFile — Name of a file containing the normalized expression level for each coding sequence.

Tip

To see the actual location of an output file, first get the results of the block. Then call the unwrap method on the corresponding field of the results, as shown in the Examples section.

Data Types: struct
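
The tip above can be extended to collect the locations of all four output files in one step. The sketch below is one possible approach, assuming the Cufflinks support package is installed and using the sample files that ship with the toolbox. The variable name paths is arbitrary, and the printed locations depend on your results folder.

```matlab
import bioinfo.pipeline.block.*
import bioinfo.pipeline.Pipeline

% Build and run a small pipeline that feeds sample files to CuffNorm.
FC1 = FileChooser(which("gyrAB.gtf"));
FC2 = FileChooser({which("Myco_1_1.sam"),which("Myco_1_2.sam")});
CN = CuffNorm;
P = Pipeline;
addBlock(P,[FC1,FC2,CN]);
connect(P,FC1,CN,["Files","GenomicAnnotationFile"]);
connect(P,FC2,CN,["Files","GenomicAlignmentFiles"]);
run(P);

% Get the block results, then unwrap every output port.
R = results(P,CN);
paths = structfun(@(f) unwrap(f), R, UniformOutput=false);

% Print each output port name with the location of its file.
names = fieldnames(paths);
for k = 1:numel(names)
    fprintf("%s: %s\n", names{k}, paths.(names{k}));
end
```

Using structfun keeps the port names as field names of paths, so you can refer to a location by port name, for example paths.GeneFPKMFile.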

Options

CuffNorm options, specified as a CuffNormOptions object. The default value is a default CuffNormOptions object.

Object Functions

compile      Perform block-specific additional checks and validations
copy         Copy array of handle objects
emptyInputs  Create input structure for use with run method
eval         Evaluate block object
run          Run block object

Examples


Generate normalized expression tables using CuffNorm.

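import bioinfo.pipeline.block.*
import bioinfo.pipeline.Pipeline

% Select the annotation file and the two sample alignment files.
FC1 = FileChooser(which("gyrAB.gtf"));
samFiles = {which("Myco_1_1.sam"),which("Myco_1_2.sam")};
FC2 = FileChooser(samFiles);
CN = CuffNorm;

% Build the pipeline and connect the file choosers to the CuffNorm inputs.
P = Pipeline;
addBlock(P,[FC1,FC2,CN]);
connect(P,FC1,CN,["Files","GenomicAnnotationFile"]);
connect(P,FC2,CN,["Files","GenomicAlignmentFiles"]);

% Run the pipeline, then get the results of the CuffNorm block.
run(P);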
R = results(P,CN)
R = 

  struct with fields:

    IsoformFPKMFile: [1×1 bioinfo.pipeline.datatypes.File]
       GeneFPKMFile: [1×1 bioinfo.pipeline.datatypes.File]
        TSSFPKMFile: [1×1 bioinfo.pipeline.datatypes.File]
        CDSFPKMFile: [1×1 bioinfo.pipeline.datatypes.File]

Call unwrap on each field of the result structure R to see the location of each output file. For example, to see the location of IsoformFPKMFile, enter the following.

unwrap(R.IsoformFPKMFile)
ans = 

    "C:\PipelineResults\CuffNorm_1\1\isoforms.fpkm_table"

References

[1] Trapnell, Cole, Brian A Williams, Geo Pertea, Ali Mortazavi, Gordon Kwan, Marijke J van Baren, Steven L Salzberg, Barbara J Wold, and Lior Pachter. “Transcript Assembly and Quantification by RNA-Seq Reveals Unannotated Transcripts and Isoform Switching during Cell Differentiation.” Nature Biotechnology 28, no. 5 (May 2010): 511–15.

Version History

Introduced in R2023a