KNN Search

Examples

Find Nearest Neighbors Using KNN Search Block

Train a nearest neighbor searcher model, and then use the KNN Search block for label prediction.

Since R2023b

Ports

Input

x — Query point
row vector

Query point, specified as a row vector. x must have the same number of columns as the number of predictor variables in the searcher object specified by Select nearest neighbor searcher. The columns of x must be in the same order as those in the searcher object.

Output

Idx — Indices of nearest neighbors
numeric row vector | 1-by-1 cell array

Indices of the nearest neighbors in the data, returned as a numeric row vector or 1-by-1 cell array.

If you do not select Include ties on the Main tab of the Block Parameters dialog box, then the block returns a 1-by-k numeric row vector, where k is the number of nearest neighbors searched. Each column of the row vector contains the index of a nearest neighbor point in the data, ordered by increasing distance to the query point x.
If you select Include ties on the Main tab of the Block Parameters dialog box, then the block returns a 1-by-1 cell array as a variable-size signal containing a numeric row vector of at least k indices of the closest observations in the data to the query point x. The columns of the vector are ordered by increasing distance to the query point.

D — Distances of nearest neighbors
numeric row vector | 1-by-1 cell array

Distances of the nearest neighbors to the query point, returned as a numeric row vector or 1-by-1 cell array.

If you do not select Include ties on the Main tab of the Block Parameters dialog box, then the block returns a 1-by-k numeric row vector, where k is the number of nearest neighbors searched. Each column of the row vector contains the distance of a nearest neighbor point in the data to the query point x, according to the distance metric. The columns of the row vector are ordered by increasing distance to the query point.
If you select Include ties on the Main tab of the Block Parameters dialog box, then the block returns a 1-by-1 cell array as a variable-size signal containing a numeric row vector of at least k distances of the closest observations in the data to the query point x. The columns of the vector are ordered by increasing distance to the query point.

Dependencies

To enable this port, select Add output port for nearest neighbor distances in the KNN Search block.

Parameters

Main

Select nearest neighbor searcher — Nearest neighbor search method
`searcher` (default) | `ExhaustiveSearcher` object | `KDTreeSearcher` object

Specify the name of a workspace variable that contains an ExhaustiveSearcher or KDTreeSearcher object.

Note

The software uses the default settings for all parameters that you can specify in the Block Parameters dialog box. The parameters in the dialog box override those of the searcher object.
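
As an illustration of preparing the workspace variable, the following sketch creates a searcher object from the Fisher iris sample data. The data set and the choice of searcher type are assumptions made for the example; only the variable name `searcher` comes from the block default.

```matlab
% Load sample predictor data (meas is a 150-by-4 numeric matrix).
load fisheriris
X = meas;

% Option 1: Kd-tree searcher.
searcher = KDTreeSearcher(X);

% Option 2: exhaustive searcher, required for metrics such as "cosine".
% searcher = ExhaustiveSearcher(X, "Distance", "cosine");

% The query point x must have this many columns.
numPredictors = size(searcher.X, 2);
```

Any distance setting stored in the object is only a starting point, because the Distance metric block parameter overrides the Distance property of the searcher.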

Programmatic Use

Block Parameter: NeighborhoodSearcher

Type: workspace variable

Values: ExhaustiveSearcher object | KDTreeSearcher object

Default: "searcher"

Add output port for nearest neighbor distances — Add second output port for nearest neighbor distances
`off` (default) | `on`

Select the check box to include the second output port D in the KNN Search block.

Programmatic Use

Block Parameter: ShowOutputDistances

Type: character vector

Values: "off" | "on"

Default: "off"

Number of nearest neighbors — Number of nearest neighbors
`1` (default) | positive integer

Specify the number of nearest neighbors to find in the data for the query point.

Programmatic Use

Block Parameter: NumNeighbors

Type: positive integer

Values: single | double

Default: 1
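
The programmatic parameter names listed so far can be set with `set_param`. The following is a sketch only: the model name `myModel` and the block path are hypothetical, and the model is assumed to be loaded already.

```matlab
% Hypothetical block path; replace with the path in your own model.
blk = "myModel/KNN Search";

% Point the block at the workspace variable that holds the searcher.
set_param(blk, "NeighborhoodSearcher", "searcher");

% Add the second output port D and request three neighbors.
set_param(blk, "ShowOutputDistances", "on");
set_param(blk, "NumNeighbors", "3");

% Read a setting back; get_param returns the value as text.
k = get_param(blk, "NumNeighbors");
```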

Include ties — Flag to include all nearest neighbors
`off` (default) | `on`

If you do not select Include ties on the Main tab of the Block Parameters dialog box, then the block selects the observation with the smallest index among the observations that have the same distance from the query point.

If you select Include ties:

The block output includes all nearest neighbors whose distances equal the kth smallest distance. If more than five neighbors are tied at the kth smallest distance, the block output includes only the five tied neighbors with the smallest index values.
The Idx and D block outputs are 1-by-1 cell arrays where each cell contains a vector of at least k indices and distances, respectively. The columns in the vectors are ordered by increasing distance to the query point.
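
To see how tie handling changes the shape of the outputs, it can help to experiment at the command line with the `knnsearch` function. This sketch uses the function, not the block, on made-up one-dimensional data; it assumes the function's tie handling resembles the block's, and the five-neighbor limit described above might not apply to the function.

```matlab
% Five 1-D observations; observations 2 and 3 are equally far from x.
X = [0; 1; 3; 10; 20];
x = 2;
searcher = ExhaustiveSearcher(X);

% Without ties: a numeric result with exactly k = 1 index.
idx = knnsearch(searcher, x, "K", 1);

% With ties: 1-by-1 cell arrays that hold every neighbor tied at the
% kth smallest distance.
[idxTies, dTies] = knnsearch(searcher, x, "K", 1, "IncludeTies", true);
numTied = numel(idxTies{1});
```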

Programmatic Use

Block Parameter: IncludeTies

Type: character vector

Values: "off" | "on"

Default: "off"

Distance metric — Distance metric
`euclidean` (default) | `chebychev` | `cityblock` | `minkowski` | `correlation` | `cosine` | `hamming` | `jaccard` | `mahalanobis` | `seuclidean` | `spearman`

Specify the distance metric used to find nearest neighbors in the data to the query point.

For both ExhaustiveSearcher and KDTreeSearcher objects, the block supports these distance metrics.

Value	Description
`"chebychev"`	Chebychev distance (maximum coordinate difference)
`"cityblock"`	City block distance
`"euclidean"`	Euclidean distance
`"minkowski"`	Minkowski distance. The default exponent is 2. You can specify a different exponent in the Block Parameters dialog box.

For an ExhaustiveSearcher object, the block also supports these distance metrics.

Value	Description
`"correlation"`	One minus the sample linear correlation between observations (treated as sequences of values)
`"cosine"`	One minus the cosine of the included angle between observations (treated as row vectors)
`"hamming"`	Hamming distance, which is the percentage of coordinates that differ
`"jaccard"`	One minus the Jaccard coefficient, which is the percentage of nonzero coordinates that differ
`"mahalanobis"`	Mahalanobis distance, computed using a positive definite covariance matrix. The block computes the covariance matrix from the data in the `searcher` object, by default. You can specify a customized covariance matrix in the Block Parameters dialog box.
`"seuclidean"`	Standardized Euclidean distance. Each coordinate difference between the query point x and the data is scaled by dividing by the corresponding element of the standard deviation computed from the data. You can specify a different scaling method in the Block Parameters dialog box.
`"spearman"`	One minus the sample Spearman's rank correlation between observations (treated as sequences of values)
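
The difference between some of these metrics is easiest to see on a small example. The sketch below evaluates the Hamming, Jaccard, and cosine descriptions from the table by hand in base MATLAB for two made-up row vectors; it illustrates the definitions and is not the block's implementation.

```matlab
x = [1 0 0 1];
y = [1 1 0 0];

differ  = x ~= y;                  % coordinates that differ
nonzero = (x ~= 0) | (y ~= 0);     % coordinates where either value is nonzero

% Hamming: fraction of all coordinates that differ (2 of 4).
hammingDist = mean(differ);

% Jaccard: fraction of the nonzero coordinates that differ (2 of 3).
jaccardDist = nnz(differ & nonzero) / nnz(nonzero);

% Cosine: one minus the cosine of the angle between x and y.
cosineDist = 1 - dot(x, y) / (norm(x) * norm(y));
```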

Note

The distance metric setting overrides the Distance property of the specified searcher object.
The KNN Search block does not support the "fasteuclidean" or "fastseuclidean" distance metric (see Distance Metrics).

Programmatic Use

Block Parameter: DistanceMetric

Type: character vector

Values: "euclidean" | "chebychev" | "cityblock" | "minkowski" | "correlation" | "cosine" | "hamming" | "jaccard" | "mahalanobis" | "seuclidean" | "spearman"

Default: "euclidean"

Covariance matrix — Covariance matrix for Mahalanobis distance metric
`Computed using data in searcher` (default) | `Customized`

The block computes the covariance matrix from the data in the searcher object, by default. You can specify a customized covariance matrix by selecting Customized and entering a positive definite matrix in the Customized matrix box.
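
Before entering a customized matrix, you might want to confirm that it is positive definite. The following base MATLAB sketch checks a made-up matrix `C` and evaluates the Mahalanobis distance formula for one pair of points; the values are illustrative only.

```matlab
C = [4 0; 0 1];                    % candidate customized covariance matrix

% chol reports p == 0 when the factorization succeeds; combined with a
% symmetry check, this indicates that C is positive definite.
[~, p] = chol(C);
isPosDef = issymmetric(C) && (p == 0);

% Mahalanobis distance between a query point x and an observation y.
x = [2 0];
y = [0 0];
d = sqrt((x - y) / C * (x - y)');  % equals 1 for these values
```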

Note

This setting overrides the DistParameter property of the specified searcher object.

Programmatic Use

Block Parameter: CovarianceMatrix

Type: positive definite matrix

Values: "Computed using data in searcher" | "Customized"

Default: "Computed using data in searcher"

Dependencies

To enable this parameter, set Distance metric to "mahalanobis".

Scale — Scale parameter value for standardized Euclidean distance metric
`Standard deviation of data in searcher` (default) | `Customized`

The block computes the scale parameter value from the data in the searcher object, by default. You can specify a customized scale parameter value by selecting Customized and entering a nonnegative numeric row vector in the Customized scale text box. The row vector must have the same number of columns as the number of predictor variables in the searcher object. When the block computes the standardized Euclidean distance, each coordinate of the data is scaled by the corresponding element of Scale, as is the query point.
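
The scaling described above can be written out in a few lines of base MATLAB. This is a sketch of the arithmetic with made-up values, not the block's implementation.

```matlab
x = [6 2];        % query point
y = [0 0];        % one observation from the searcher data
s = [2 0.5];      % customized scale, one nonnegative value per predictor

% Divide each coordinate difference by its scale value, then take the
% Euclidean norm: sqrt(3^2 + 4^2) = 5.
d = sqrt(sum(((x - y) ./ s).^2));

% A default-style scale: the standard deviation of each data column.
X = [0 0; 2 1; 4 2];
sDefault = std(X, 0, 1);           % [2 1]
```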

Note

This setting overrides the DistParameter property of the specified searcher object.

Programmatic Use

Block Parameter: Scale

Type: nonnegative numeric row vector

Values: "Standard deviation of data in searcher" | "Customized"

Default: "Standard deviation of data in searcher"

Dependencies

To enable this parameter, set Distance metric to "seuclidean".

P — Exponent for Minkowski distance metric
`2` (default) | positive integer

Specify the exponent for the Minkowski distance metric. For the default case of P = 2, the Minkowski distance gives the Euclidean distance. For the special case of P = 1, the Minkowski distance gives the city block distance. For the special case of P = ∞, the Minkowski distance gives the Chebychev distance.
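
The special cases can be checked numerically. The sketch below defines a small anonymous function for the Minkowski distance (the name `minkowski` is chosen for this example) and evaluates it on a made-up vector of coordinate differences.

```matlab
d = [3 -4];                              % coordinate differences x - y

% Minkowski distance with exponent p.
minkowski = @(d, p) sum(abs(d).^p)^(1/p);

dCity  = minkowski(d, 1);                % 7, same as sum(abs(d))
dEucl  = minkowski(d, 2);                % 5, same as norm(d)
dCheb  = max(abs(d));                    % 4, the Chebychev distance
dLarge = minkowski(d, 50);               % approaches dCheb as p grows
```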

Note

This setting overrides the DistParameter property of the specified searcher object.

Programmatic Use

Block Parameter: MinkExp

Type: positive integer

Values: positive integer

Default: 2

Dependencies

To enable this parameter, set Distance metric to "minkowski".

Data Types

Fixed-Point Operational Parameters

Integer rounding mode — Rounding mode for fixed-point operations
`Floor` (default) | `Ceiling` | `Convergent` | `Nearest` | `Round` | `Simplest` | `Zero`

Specify the rounding mode for fixed-point operations. For more information, see Rounding Modes (Fixed-Point Designer).

Block parameters always round to the nearest representable value. To control the rounding of a block parameter, enter an expression into the mask field using a MATLAB® rounding function.
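
For reference, these are the base MATLAB rounding functions that such an expression might use, shown on a made-up value. Which function suits a given parameter is a design decision that this page does not prescribe.

```matlab
v = 2.5;

rFloor = floor(v);     % 2, rounds toward negative infinity
rCeil  = ceil(v);      % 3, rounds toward positive infinity
rRound = round(v);     % 3, rounds to nearest, ties away from zero
rFix   = fix(-v);      % -2, rounds toward zero
```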

Programmatic Use

Block Parameter: RndMeth

Type: character vector

Values: "Ceiling" | "Convergent" | "Floor" | "Nearest" | "Round" | "Simplest" | "Zero"

Default: "Floor"

Saturate on integer overflow — Method of overflow action
`off` (default) | `on`

Specify whether overflows saturate or wrap.

Action	Rationale	Impact on Overflows	Example
Select this check box (`on`).	Your model has possible overflow, and you want explicit saturation protection in the generated code.	Overflows saturate to either the minimum or maximum value that the data type can represent.	The maximum value that the `int8` (signed 8-bit integer) data type can represent is 127. Any block operation result greater than this maximum value causes overflow of the 8-bit integer. With the check box selected, the block output saturates at 127. Similarly, the block output saturates at a minimum output value of –128.
Clear this check box (`off`).	You want to optimize the efficiency of your generated code. You want to avoid overspecifying how a block handles out-of-range signals. For more information, see Troubleshoot Signal Range Errors (Simulink).	Overflows wrap to the appropriate value that the data type can represent.	The maximum value that the `int8` (signed 8-bit integer) data type can represent is 127. Any block operation result greater than this maximum value causes overflow of the 8-bit integer. With the check box cleared, the software interprets the value causing the overflow as `int8`, which can produce an unintended result. For example, a block result of 130 (binary 1000 0010) expressed as `int8` is –126.
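
The `int8` example from the table can be reproduced at the MATLAB command line. This sketch only mimics the two overflow behaviors with base MATLAB functions; it does not run the block.

```matlab
result = 130;                                 % out-of-range result

% Saturation: MATLAB integer conversion clips to the int8 range.
saturated = int8(result);                     % 127

% Wrapping: reinterpret the low 8 bits (1000 0010) as a signed value.
wrapped = typecast(uint8(result), 'int8');    % -126

% The same wrap written as modular arithmetic.
wrappedMod = mod(result + 128, 256) - 128;    % -126
```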

Programmatic Use

Block Parameter: SaturateOnIntegerOverflow

Type: character vector

Values: "off" | "on"

Default: "off"

Lock output data type setting against changes by the fixed-point tools — Prevention of fixed-point tools from overriding data type
`off` (default) | `on`

Select this parameter to prevent the fixed-point tools from overriding the data type you specify for the block. For more information, see Use Lock Output Data Type Setting (Fixed-Point Designer).

Programmatic Use

Block Parameter: LockScale

Type: character vector

Values: "off" | "on"

Default: "off"

Data Type

Index data type — Data type of index output
`Inherit: auto` (default) | `double` | `single` | `half` | `int8` | `uint8` | `int16` | `uint16` | `int32` | `uint32` | `int64` | `uint64` | `fixdt(1,16,0)` | `fixdt(1,16,2^0,0)` | `<data type expression>`

Specify the data type for the Idx output. The type can be inherited, specified directly, or expressed as a data type object such as Simulink.NumericType.

Click the Show data type assistant button to display the Data Type Assistant, which helps you set the data type attributes. For more information, see Specify Data Types Using Data Type Assistant (Simulink).

Programmatic Use

Block Parameter: IndicesDataTypeStr

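Type: character vector

Values: "Inherit: auto" | "double" | "single" | "half" | "int8" | "uint8" | "int16" | "uint16" | "int32" | "uint32" | "int64" | "uint64" | "fixdt(1,16,0)" | "fixdt(1,16,2^0,0)" | "<data type expression>"

Default: "Inherit: auto"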
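
A sketch of setting the index data type programmatically follows. The block path is hypothetical and the model is assumed to be loaded; the parameter names are the ones listed on this page.

```matlab
% Hypothetical block path; replace with the path in your own model.
blk = "myModel/KNN Search";

% Use a built-in integer type for the Idx output.
set_param(blk, "IndicesDataTypeStr", "int32");

% Alternatively, a fixed-point type: signed, 16-bit word length,
% 0 fraction bits.
% set_param(blk, "IndicesDataTypeStr", "fixdt(1,16,0)");

% Lock the setting so that the fixed-point tools do not override it.
set_param(blk, "LockScale", "on");
```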

Index data type Minimum — Minimum of index output
`[]` (default) | scalar

Specify the minimum value of the Idx output range that Simulink® checks.

Simulink uses the minimum value to perform:

Parameter range checking for some blocks (see Specify Minimum and Maximum Values for Block Parameters (Simulink)).
Simulation range checking (see Specify Signal Ranges (Simulink) and Enable Simulation Range Checking (Simulink)).
Optimization of the code that you generate from the model. This optimization can remove algorithmic code and affect the results of some simulation modes, such as software-in-the-loop (SIL) mode or external mode. For more information, see Optimize using the specified minimum and maximum values (Embedded Coder).

Note

The Index data type Minimum parameter does not saturate or clip the actual Idx output signal. To do so, use the Saturation (Simulink) block instead.

Programmatic Use

Block Parameter: IndicesOutMin

Type: scalar

Values: "[]" | scalar

Default: "[]"

Index data type Maximum — Maximum of index output
`[]` (default) | scalar

Specify the maximum value of the Idx output range that Simulink checks.

Simulink uses the maximum value to perform:

Parameter range checking for some blocks (see Specify Minimum and Maximum Values for Block Parameters (Simulink)).
Simulation range checking (see Specify Signal Ranges (Simulink) and Enable Simulation Range Checking (Simulink)).
Optimization of the code that you generate from the model. This optimization can remove algorithmic code and affect the results of some simulation modes, such as software-in-the-loop (SIL) mode or external mode. For more information, see Optimize using the specified minimum and maximum values (Embedded Coder).

Note

The Index data type Maximum parameter does not saturate or clip the actual Idx output signal. To do so, use the Saturation (Simulink) block instead.

Programmatic Use

Block Parameter: IndicesOutMax

Type: scalar

Values: "[]" | scalar

Default: "[]"

Distance data type — Data type of distance output
`Inherit: auto` (default) | `double` | `single` | `half` | `int8` | `uint8` | `int16` | `uint16` | `int32` | `uint32` | `int64` | `uint64` | `fixdt(1,16,0)` | `fixdt(1,16,2^0,0)` | `<data type expression>`

Specify the data type for the distance (D) output. The type can be inherited, specified directly, or expressed as a data type object such as Simulink.NumericType.

Click the Show data type assistant button to display the Data Type Assistant, which helps you set the data type attributes. For more information, see Specify Data Types Using Data Type Assistant (Simulink).

Programmatic Use

Block Parameter: DistanceDataTypeStr

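Type: character vector

Values: "Inherit: auto" | "double" | "single" | "half" | "int8" | "uint8" | "int16" | "uint16" | "int32" | "uint32" | "int64" | "uint64" | "fixdt(1,16,0)" | "fixdt(1,16,2^0,0)" | "<data type expression>"

Default: "Inherit: auto"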

Note

Fixed-point data types are not supported for the Spearman distance metric.

Dependencies

To enable this parameter, select Add output port for nearest neighbor distances on the Main tab of the Block Parameters dialog box.

Distance data type Minimum — Minimum of distance output
`[]` (default) | scalar

Specify the minimum value of the distance (D) output range that Simulink checks.

Simulink uses the minimum value to perform:

Parameter range checking for some blocks (see Specify Minimum and Maximum Values for Block Parameters (Simulink)).
Simulation range checking (see Specify Signal Ranges (Simulink) and Enable Simulation Range Checking (Simulink)).
Optimization of the code that you generate from the model. This optimization can remove algorithmic code and affect the results of some simulation modes, such as software-in-the-loop (SIL) mode or external mode. For more information, see Optimize using the specified minimum and maximum values (Embedded Coder).

Note

The Distance data type Minimum parameter does not saturate or clip the actual D output signal. To do so, use the Saturation (Simulink) block instead.

Programmatic Use

Block Parameter: DistanceOutMin

Type: scalar

Values: "[]" | scalar

Default: "[]"

Dependencies

To enable this parameter, select Add output port for nearest neighbor distances on the Main tab of the Block Parameters dialog box.

Distance data type Maximum — Maximum of distance output
`[]` (default) | scalar

Specify the maximum value of the distance (D) output range that Simulink checks.

Simulink uses the maximum value to perform:

Parameter range checking for some blocks (see Specify Minimum and Maximum Values for Block Parameters (Simulink)).
Simulation range checking (see Specify Signal Ranges (Simulink) and Enable Simulation Range Checking (Simulink)).
Optimization of the code that you generate from the model. This optimization can remove algorithmic code and affect the results of some simulation modes, such as software-in-the-loop (SIL) mode or external mode. For more information, see Optimize using the specified minimum and maximum values (Embedded Coder).

Note

The Distance data type Maximum parameter does not saturate or clip the actual D output signal. To do so, use the Saturation (Simulink) block instead.

Programmatic Use

Block Parameter: DistanceOutMax

Type: scalar

Values: "[]" | scalar

Default: "[]"

Dependencies

To enable this parameter, select Add output port for nearest neighbor distances on the Main tab of the Block Parameters dialog box.

Block Characteristics

Data Types	`Boolean` | `double` | `enumerated` | `fixed point` | `half` | `integer` | `single`
Direct Feedthrough	`yes`
Multidimensional Signals	`no`
Variable-Size Signals	`yes`
Zero-Crossing Detection	`no`

Extended Capabilities

C/C++ Code Generation
Generate C and C++ code using Simulink® Coder™.

Fixed-Point Conversion
Design and simulate fixed-point systems using Fixed-Point Designer™.

Version History

Introduced in R2023b
