CERT C++: STR34-C

Cast characters to unsigned char before converting to larger integer sizes

Description

Rule Definition

Cast characters to unsigned char before converting to larger integer sizes.[1]

Polyspace Implementation

The rule checker checks for Misuse of sign-extended character value.

Examples

Issue

Misuse of sign-extended character value occurs when you convert a signed or plain char value to a wider integer data type with sign extension, and then use the resulting sign-extended value as an array index, in a comparison with EOF, or as an argument to a character-handling function.

Risk

Comparison with EOF: Suppose that your compiler implements the plain char type as signed. In this implementation, the character with the decimal value 255 (bit pattern 0xFF) is stored as the signed value –1 in two’s complement form. When you convert the char variable to a wider data type such as int, the sign bit is preserved (sign extension), so the character with the decimal value 255 is converted to the integer –1. EOF is a negative int constant, commonly defined as –1, so the converted character cannot be distinguished from EOF.
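
As a minimal sketch of this effect, the following C code widens the same byte with and without the intermediate cast. The helper names widen_sign_extended, widen_as_unsigned, and check_widening are hypothetical and are not part of the rule or of Polyspace. The sketch uses signed char rather than plain char so that the result does not depend on how your compiler implements plain char.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Hypothetical helper: widens the value the way a signed plain char is widened. */
static int widen_sign_extended(signed char ch)
{
    return ch;                  /* sign extension: -1 stays -1 */
}

/* Hypothetical helper: converts to unsigned char before widening. */
static int widen_as_unsigned(signed char ch)
{
    return (unsigned char)ch;   /* -1 becomes UCHAR_MAX (255 for an 8-bit char) */
}

/* Self-check that states the expected values. Call it from main(). */
static void check_widening(void)
{
    assert(widen_sign_extended(-1) == -1);      /* collides with EOF where EOF is -1 */
    assert(widen_as_unsigned(-1) == UCHAR_MAX); /* nonnegative after the cast */
    assert(widen_as_unsigned(-1) != EOF);       /* EOF is always negative */
}
```

Because the value that goes through unsigned char is never negative, it can never compare equal to EOF, whatever negative value your implementation assigns to EOF.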

Use as array index: By similar reasoning, you cannot use sign-extended plain char variables as an array index. If the sign bit is preserved, the conversion from char to int can result in a negative integer. An array index must be nonnegative, because a negative index accesses memory before the start of the array.
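
The following C sketch illustrates one way to index a table by character value safely. The table byte_counts and the functions count_byte and count_bytes are hypothetical names chosen for illustration. The sketch assumes that the table has one slot for each possible unsigned char value.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical table with one slot per possible unsigned char value. */
static unsigned int byte_counts[UCHAR_MAX + 1];

/* Counts one occurrence of ch. The cast keeps the index in 0..UCHAR_MAX. */
static void count_byte(char ch)
{
    size_t index = (unsigned char)ch;   /* never negative, even if char is signed */
    byte_counts[index]++;
}

/* Tallies every byte of a null-terminated string. */
static void count_bytes(const char *text)
{
    for (; *text != '\0'; ++text) {
        count_byte(*text);
    }
}
```

Without the cast in count_byte, a byte such as 0xFF could produce the index –1 on implementations where plain char is signed, and the increment would write outside the table.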

Argument to character-handling function: By similar reasoning, you cannot use sign-extended plain char variables as arguments to the character-handling functions declared in ctype.h, for instance, isalpha() or isdigit(). According to the C11 standard (Section 7.4), if the argument is an integer value that is neither representable as unsigned char nor equal to EOF, the behavior is undefined.
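
A minimal sketch of a compliant call to a ctype.h function is shown here. The function count_alpha is a hypothetical helper, not something the standard or Polyspace defines. It assumes a null-terminated input string.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts the alphabetic characters in a null-terminated string. */
static size_t count_alpha(const char *text)
{
    size_t count = 0;
    for (; *text != '\0'; ++text) {
        /* The cast keeps the argument in 0..UCHAR_MAX, as isalpha() requires. */
        if (isalpha((unsigned char)*text)) {
            ++count;
        }
    }
    return count;
}
```

Which characters isalpha() classifies as alphabetic depends on the current locale, but with the cast the call has defined behavior for every byte in the string.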

Fix

Before conversion to a wider integer data type, cast the signed or plain char value explicitly to unsigned char.

Example - Sign-Extended Character Value Compared with EOF
#include <stdio.h>
#include <stdlib.h>
#define fatal_error() abort()

extern char parsed_token_buffer[20];

static int parser(char *buf)
{
    int c = EOF;
    if (buf && *buf) {
        c = *buf++;    /* plain char converted to int; sign-extended if char is signed */
    }
    return c;
}

void func()
{
    if (parser(parsed_token_buffer) == EOF) {  //Noncompliant
        /* Handle error */
        fatal_error();
    }
}

In this example, the function parser reads a character from the string input buf. If that character has the decimal value 255 and plain char is signed, its value becomes –1 when converted to the int variable c, which is indistinguishable from EOF. The later comparison with EOF in func then succeeds even though the buffer holds a valid character, and func calls fatal_error() in error.

Correction — Cast to unsigned char Before Conversion

One possible correction is to cast the plain char value to unsigned char before conversion to the wider int type.

#include <stdio.h>
#include <stdlib.h>
#define fatal_error() abort()

extern char parsed_token_buffer[20];

static int parser(char *buf)
{
    int c = EOF;
    if (buf && *buf) {
        c = (unsigned char)*buf++;    /* cast first, so c is in 0..UCHAR_MAX */
    }
    return c;
}

void func()
{
    if (parser(parsed_token_buffer) == EOF) { 
        /* Handle error */
        fatal_error();
    }
}

Check Information

Group: 05. Characters and Strings (STR)

Version History

Introduced in R2019a


[1] This software has been created by MathWorks incorporating portions of: the “SEI CERT-C Website,” © 2017 Carnegie Mellon University, the SEI CERT-C++ Web site © 2017 Carnegie Mellon University, “SEI CERT C Coding Standard – Rules for Developing safe, Reliable and Secure systems – 2016 Edition,” © 2016 Carnegie Mellon University, and “SEI CERT C++ Coding Standard – Rules for Developing safe, Reliable and Secure systems in C++ – 2016 Edition” © 2016 Carnegie Mellon University, with special permission from its Software Engineering Institute.

ANY MATERIAL OF CARNEGIE MELLON UNIVERSITY AND/OR ITS SOFTWARE ENGINEERING INSTITUTE CONTAINED HEREIN IS FURNISHED ON AN "AS-IS" BASIS. CARNEGIE MELLON UNIVERSITY MAKES NO WARRANTIES OF ANY KIND, EITHER EXPRESSED OR IMPLIED, AS TO ANY MATTER INCLUDING, BUT NOT LIMITED TO, WARRANTY OF FITNESS FOR PURPOSE OR MERCHANTABILITY, EXCLUSIVITY, OR RESULTS OBTAINED FROM USE OF THE MATERIAL. CARNEGIE MELLON UNIVERSITY DOES NOT MAKE ANY WARRANTY OF ANY KIND WITH RESPECT TO FREEDOM FROM PATENT, TRADEMARK, OR COPYRIGHT INFRINGEMENT.

This software and associated documentation has not been reviewed nor is it endorsed by Carnegie Mellon University or its Software Engineering Institute.