partitionRead

Import data from partitions of Apache Cassandra database table

Since R2021a

Description

results = partitionRead(conn,keyspace,tablename) returns imported data by reading all Cassandra® database columns from all partitions of a Cassandra database table. The partitionRead function imports data from a Cassandra database into MATLAB® without using a Cassandra Query Language (CQL) query.

results = partitionRead(conn,keyspace,tablename,keyValue1...keyValueN) returns imported data by reading all Cassandra columns from one or more partitions specified by the partition key values.

results = partitionRead(___,Name,Value) specifies options using one or more name-value arguments in addition to any of the previous input argument combinations. For example, 'ConsistencyLevel',"TWO" sets the consistency level so that two replica nodes must respond before the query results are returned.

Examples

Using the Apache™ Cassandra® database C++ interface, create a Cassandra database connection and import data from a Cassandra database table into MATLAB®. The Cassandra database contains a database table with employee data.

Create a Cassandra database connection using the configured data source CassandraDataSource and a blank user name and password. The apacheCassandra function returns conn as a connection object.

datasource = "CassandraDataSource";
username = "";
password = "";
conn = apacheCassandra(datasource,username,password);

Import employee data into MATLAB from the employeedata keyspace and the employees_by_job database table by using the Cassandra database connection.

keyspace = "employeedata";
tablename = "employees_by_job";
results = partitionRead(conn,keyspace,tablename);

Display the first few rows of the returned employee data.

head(results)
ans=8×13 table
      job_id       hire_date     employee_id    commission_pct    department_id      email       first_name      last_name      manager_id         office         performance_ratings     phone_number     salary
                                                                                                                                              building    room                                                   
    __________    ___________    ___________    ______________    _____________    __________    __________    _____________    __________    ________________    ___________________    ______________    ______

    "ST_CLERK"    08-Mar-2008        128             NaN               50          "SMARKLE"     "Steven"      "Markle"            120        "North"     171         {3×1 int32}        "650.124.1434"     2200 
    "ST_CLERK"    06-Feb-2008        136             NaN               50          "HPHILTAN"    "Hazel"       "Philtanker"        122        "North"     303         {[      2]}        "650.127.1634"     2200 
    "ST_CLERK"    12-Dec-2007        135             NaN               50          "KGEE"        "Ki"          "Gee"               122        "West"      287         {2×1 int32}        "650.127.1734"     2400 
    "ST_CLERK"    10-Apr-2007        132             NaN               50          "TJOLSON"     "TJ"          "Olson"             121        "North"     256         {[      7]}        "650.124.8234"     2100 
    "ST_CLERK"    14-Jan-2007        127             NaN               50          "JLANDRY"     "James"       "Landry"            120        "West"      273         {2×1 int32}        "650.124.1334"     2400 
    "ST_CLERK"    28-Sep-2006        126             NaN               50          "IMIKKILI"    "Irene"       "Mikkilineni"       120        "East"      246         {4×1 int32}        "650.124.1224"     2700 
    "ST_CLERK"    26-Aug-2006        134             NaN               50          "MROGERS"     "Michael"     "Rogers"            122        "East"      246         {3×1 int32}        "650.127.1834"     2900 
    "ST_CLERK"    09-Jul-2006        144             NaN               50          "PVARGAS"     "Peter"       "Vargas"            124        "North"     129         {3×1 int32}        "650.121.2004"     2500 

results is a table that contains these variables:

  • job_id — Job identifier

  • hire_date — Hire date

  • employee_id — Employee identifier

  • commission_pct — Commission percentage

  • department_id — Department identifier

  • email — Email address

  • first_name — First name

  • last_name — Last name

  • manager_id — Manager identifier

  • office — Office location (table that contains two variables for the building and room)

  • performance_ratings — Performance ratings

  • phone_number — Phone number

  • salary — Salary

Close the Cassandra database connection.

close(conn)

Using the Apache™ Cassandra® database C++ interface, create a Cassandra® database connection and import data from a Cassandra database table into MATLAB®. Use the values of two partition keys in the database table to import data. The Cassandra database contains a database table with employee data.

Create a Cassandra database connection using the configured data source CassandraDataSource and a blank user name and password. The apacheCassandra function returns conn as a connection object.

datasource = "CassandraDataSource";
username = "";
password = "";
conn = apacheCassandra(datasource,username,password);

Import employee data into MATLAB from the employeedata keyspace and the employees_by_name database table by using the Cassandra database connection. This database table has the first_name and last_name partition keys. Specify the first and last names of two employees as values of the partition keys to import data for those two employees.

keyspace = "employeedata";
tablename = "employees_by_name";
keyValue1 = ["Christopher","Alexander"];
keyValue2 = ["Olsen","Hunold"];
results = partitionRead(conn,keyspace,tablename,keyValue1,keyValue2);

Display the returned employee data for the two employees.

results
results=2×13 table
     first_name      last_name     hire_date     employee_id    commission_pct    department_id      email       job_id      manager_id         office         performance_ratings        phone_number        salary
                                                                                                                                           building    room                                                         
    _____________    _________    ___________    ___________    ______________    _____________    _________    _________    __________    ________________    ___________________    ____________________    ______

    "Alexander"      "Hunold"     03-Jan-2006        103             NaN               60          "AHUNOLD"    "IT_PROG"       102        "West"      155         {2×1 int32}        "590.423.4567"           9000 
    "Christopher"    "Olsen"      30-Mar-2006        153             0.2               80          "COLSEN"     "SA_REP"        145        "South"     333         {4×1 int32}        "011.44.1344.498718"     8000 

results is a table that contains these variables:

  • first_name — First name

  • last_name — Last name

  • hire_date — Hire date

  • employee_id — Employee identifier

  • commission_pct — Commission percentage

  • department_id — Department identifier

  • email — Email address

  • job_id — Job identifier

  • manager_id — Manager identifier

  • office — Office location (table that contains two variables for the building and room)

  • performance_ratings — Performance ratings

  • phone_number — Phone number

  • salary — Salary

Close the Cassandra database connection.

close(conn)

Using the Apache™ Cassandra® database C++ interface, create a Cassandra database connection and import data from a Cassandra database table into MATLAB®. Use the value of the partition key in the database table to import data. Specify a consistency level for returning results. The Cassandra database contains a database table with employee data.

Create a Cassandra database connection using the configured data source CassandraDataSource and a blank user name and password. The apacheCassandra function returns conn as a connection object.

datasource = "CassandraDataSource";
username = "";
password = "";
conn = apacheCassandra(datasource,username,password);

Import employee data into MATLAB from the employeedata keyspace and the employees_by_job database table by using the Cassandra database connection. This database table has the job_id partition key. Specify the IT_PROG value of the partition key to import all data for only those employees who are programmers. Also, specify the consistency level as one node.

keyspace = "employeedata";
tablename = "employees_by_job";
keyValue = "IT_PROG";
level = "ONE";
results = partitionRead(conn,keyspace,tablename,keyValue, ...
    'ConsistencyLevel',level);

One replica node responds with the returned data.

Display the returned employee data.

results
results=5×13 table
     job_id       hire_date     employee_id    commission_pct    department_id      email       first_name      last_name     manager_id         office         performance_ratings     phone_number     salary
                                                                                                                                            building    room                                                   
    _________    ___________    ___________    ______________    _____________    __________    ___________    ___________    __________    ________________    ___________________    ______________    ______

    "IT_PROG"    21-May-2007        104             NaN               60          "BERNST"      "Bruce"        "Ernst"           103        "North"     371         {[      8]}        "590.423.4568"     6000 
    "IT_PROG"    07-Feb-2007        107             NaN               60          "DLORENTZ"    "Diana"        "Lorentz"         103        "West"      133         {3×1 int32}        "590.423.5567"     4200 
    "IT_PROG"    05-Feb-2006        106             NaN               60          "VPATABAL"    "Valli"        "Pataballa"       103        "East"      231         {5×1 int32}        "590.423.4560"     4800 
    "IT_PROG"    03-Jan-2006        103             NaN               60          "AHUNOLD"     "Alexander"    "Hunold"          102        "West"      155         {2×1 int32}        "590.423.4567"     9000 
    "IT_PROG"    25-Jun-2005        105             NaN               60          "DAUSTIN"     "David"        "Austin"          103        "South"     393         {2×1 int32}        "590.423.4569"     4800 

results is a table that contains these variables:

  • job_id — Job identifier

  • hire_date — Hire date

  • employee_id — Employee identifier

  • commission_pct — Commission percentage

  • department_id — Department identifier

  • email — Email address

  • first_name — First name

  • last_name — Last name

  • manager_id — Manager identifier

  • office — Office location (table that contains two variables for the building and room)

  • performance_ratings — Performance ratings

  • phone_number — Phone number

  • salary — Salary

Close the Cassandra database connection.

close(conn)

Input Arguments

conn

Apache Cassandra database connection, specified as a connection object.

keyspace

Keyspace, specified as a character vector or string scalar. If you do not know the keyspace, then access the Keyspaces property of the connection object using dot notation to view the keyspaces in the Cassandra database.

Example: "employeedata"

Data Types: char | string

tablename

Cassandra database table name, specified as a character vector or string scalar. If you do not know the name of the table, then use the tablenames function to find it.

Example: "employees_by_job"

Data Types: char | string
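If neither the keyspace nor the table name is known in advance, both can be looked up from the connection. The following is a minimal sketch, assuming the CassandraDataSource data source from the examples is configured and that tablenames accepts the connection object as its first argument. The variable names keyspaces and tables are chosen here for illustration.

```matlab
% Assumes a configured data source named "CassandraDataSource",
% as in the examples on this page, with a blank user name and password.
conn = apacheCassandra("CassandraDataSource","","");

% View the keyspaces in the database through the Keyspaces property.
keyspaces = conn.Keyspaces

% List the database tables (assumed call form: connection object only).
tables = tablenames(conn)

close(conn)
```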

keyValue1...keyValueN

Partition key values, specified as one of these data types:

  • numeric scalar

  • numeric array

  • character vector

  • cell array of character vectors

  • string scalar

  • string array

  • logical

  • logical array

  • datetime array

  • duration array

If you do not specify the keyValue1...keyValueN input argument, then the partitionRead function imports data from all partitions of the Cassandra database table (same as the CQL query SELECT * FROM tablename).

Specify one key value for each partition key of the Cassandra database table. The maximum number of partition key values that you can specify is the number of primary keys, which includes the partition keys and clustering columns in the Cassandra database.

If you specify a scalar value, then the CQL query equivalent is an = clause in the CQL WHERE clause. If you specify an array of values, then the CQL query equivalent is an IN clause in the CQL WHERE clause.
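The scalar-versus-array rule can be pictured with a small string-building sketch. It is an illustration only: partitionRead constructs its query internally, and keyNames, keyValues, and whereClause are hypothetical names introduced for this sketch. It handles text key values only.

```matlab
% Hypothetical illustration; this is not how partitionRead is implemented.
keyNames  = ["first_name","last_name"];          % partition key columns
keyValues = {"Alexander",["Olsen","Hunold"]};    % a scalar, then an array

clauses = strings(1,numel(keyNames));
for k = 1:numel(keyNames)
    v = keyValues{k};
    quoted = "'" + v + "'";                      % quote text values for CQL
    if isscalar(v)
        % A scalar key value corresponds to an = clause.
        clauses(k) = keyNames(k) + " = " + quoted;
    else
        % An array of key values corresponds to an IN clause.
        clauses(k) = keyNames(k) + " IN (" + join(quoted,", ") + ")";
    end
end
whereClause = "WHERE " + join(clauses," AND ")
```

Here the scalar first_name value yields an = clause and the two-element last_name array yields an IN clause.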

If all partition key values are scalar values, then the partitionRead function imports data from one partition. If some partition key values are arrays, then the partitionRead function imports data by searching multiple partitions that correspond to all possible key combinations.
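To make "all possible key combinations" concrete, the sketch below enumerates the pairings for the two-key example shown earlier on this page. It uses only base MATLAB and does not contact the database. The variable combos is a name introduced here for illustration.

```matlab
keyValue1 = ["Christopher","Alexander"];   % first_name values
keyValue2 = ["Olsen","Hunold"];            % last_name values

% Build index grids that cover every pairing of the two key arrays.
[i1,i2] = ndgrid(1:numel(keyValue1),1:numel(keyValue2));

% One row per combination: column 1 is first_name, column 2 is last_name.
combos = [reshape(keyValue1(i1),[],1) reshape(keyValue2(i2),[],1)]
```

Two arrays of two values give four combinations to search. The earlier example returned only two rows, which suggests that only the Alexander Hunold and Christopher Olsen pairings exist in that table.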

The following table describes supported Cassandra partition keys.

Supported Cassandra Partition Key    MATLAB Valid Data Types for One Partition             MATLAB Valid Data Types for Multiple Partitions
_________________________________    __________________________________________________    ________________________________________________________________

ascii                                character vector or string scalar                     cell array of character vectors or string array
bigint                               numeric scalar or logical scalar                      numeric array or logical array
blob                                 numeric array                                         cell array of numeric arrays
boolean                              numeric scalar or logical scalar                      numeric array or logical array
date                                 datetime array, string scalar, or character vector    datetime array, string array, or cell array of character vectors
decimal                              numeric scalar, logical scalar, or string scalar      numeric array, logical array, or string array
double                               numeric scalar or logical scalar                      numeric array or logical array
float                                numeric scalar or logical scalar                      numeric array or logical array
inet                                 character vector or string scalar                     cell array of character vectors or string array
int                                  numeric scalar or logical scalar                      numeric array or logical array
smallint                             numeric scalar or logical scalar                      numeric array or logical array
text                                 character vector or string scalar                     cell array of character vectors or string array
time                                 duration array, string scalar, or character vector    duration array, string array, or cell array of character vectors
timestamp                            datetime array, string scalar, or character vector    datetime array, string array, or cell array of character vectors
timeuuid                             character vector or string scalar                     cell array of character vectors or string array
tinyint                              numeric scalar or logical scalar                      numeric array or logical array
uuid                                 character vector or string scalar                     cell array of character vectors or string array
varchar                              character vector or string scalar                     cell array of character vectors or string array
varint                               numeric scalar, logical scalar, or string             numeric array, logical array, or string array

These Cassandra partition keys are not supported:

  • counter

  • list

  • map

  • set

  • tuple

  • user-defined types (UDTs)

Example: ["MA","CT"]

Example: 1,2,'DataProvider1','AmbientTemp'

Data Types: double | logical | char | string | struct | cell | datetime | duration

Name-Value Arguments

Specify optional pairs of arguments as Name1=Value1,...,NameN=ValueN, where Name is the argument name and Value is the corresponding value. Name-value arguments must appear after other arguments, but the order of the pairs does not matter.

You can also use commas to separate each name and value, and enclose Name in quotes, as the examples on this page do.

Example: results = partitionRead(conn,keyspace,tablename,'ConsistencyLevel',"ONE",'RequestTimeout',15) returns imported data after receiving a read response from one node. The database waits up to 15 seconds for the read operation to complete before throwing an error.

ConsistencyLevel

Consistency level, specified as one of these values.

Consistency Level Value    Consistency Level Description
_______________________    ____________________________________________________________________________________________________________

"ALL"                      Return query results when all replica nodes respond.
"QUORUM"                   Return query results when most replica nodes respond.
"LOCAL_QUORUM"             Return query results when most replica nodes in the local data center respond.
"ONE" (default)            Return query results when one replica node responds.
"TWO"                      Return query results when two replica nodes respond.
"THREE"                    Return query results when three replica nodes respond.
"LOCAL_ONE"                Return query results when one replica node in the local data center responds.
"SERIAL"                   Return query results for current (and possibly uncommitted) data for replica nodes in any data center.
"LOCAL_SERIAL"             Return query results for current (and possibly uncommitted) data for replica nodes in the local data center.

You can specify the value of the consistency level as a character vector or string scalar.

For details about consistency levels, see Configuring Data Consistency.

Data Types: char | string
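For the "QUORUM" levels, "most replica nodes" means a majority. Cassandra computes this threshold as floor(RF/2) + 1, where RF is the replication factor. The sketch below tabulates that rule for a few replication factors, which are chosen only for illustration. It does not query the database.

```matlab
% Replication factors picked for illustration only.
replicationFactor = (1:5)';

% Number of replicas that must respond to satisfy a quorum.
quorum = floor(replicationFactor/2) + 1;

table(replicationFactor,quorum)
```

With a replication factor of 3, for example, a quorum read needs responses from two replicas.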

RequestTimeout

Request timeout, specified as a positive numeric scalar. The request timeout is the number of seconds that the database waits for a CQL query to return before throwing an error.

Data Types: double
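Because a timed-out request throws an error, a script may want to catch it and continue. The following is a minimal sketch, assuming conn, keyspace, and tablename are already defined as in the examples on this page. This page does not document the error identifier, so the catch block handles any error generically.

```matlab
% Assumes conn, keyspace, and tablename exist, as in the examples above.
try
    results = partitionRead(conn,keyspace,tablename, ...
        'ConsistencyLevel',"ONE",'RequestTimeout',15);
catch err
    % The specific error identifier is not documented here,
    % so report whatever message the failed read produced.
    warning("partitionRead failed: %s",err.message)
    results = table();   % fall back to an empty table
end
```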

Output Arguments

results

Imported data results, returned as a table. The table contains imported data from the partitions that correspond to the keyValue1...keyValueN input argument. Each Cassandra database column from the partitions becomes a variable in the table. The variable names match the names of the Cassandra database columns in the specified partitions.

The data types of the variables in the table depend on the Cassandra data types. For details about how CQL data types convert to MATLAB data types, see Convert CQL Data Types to MATLAB Data Types Using Apache Cassandra Database C++ Interface.
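The example outputs on this page show two variable shapes that need extra indexing: office is a nested table, and performance_ratings is a cell array of int32 vectors. The sketch below builds a small mock of such a result in base MATLAB, so it runs without a database. The names, buildings, rooms, and rating counts are copied from the second example, but the rating values themselves are made up.

```matlab
% Mock of two result rows; the rating values are invented for illustration.
office = table(["West";"South"],[155;333], ...
    'VariableNames',["building","room"]);
ratings = {int32([2;7]); int32([1;5;3;4])};
results = table(["Alexander";"Christopher"],office,ratings, ...
    'VariableNames',["first_name","office","performance_ratings"]);

% Reach into the nested table with chained dot notation.
buildings = results.office.building

% Use curly braces to get the contents of one cell.
firstRatings = results.performance_ratings{1}

% Count the ratings stored for each employee.
numRatings = cellfun(@numel,results.performance_ratings)
```

The same indexing should apply to a table returned by partitionRead when its variables have these shapes.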

Version History

Introduced in R2021a