Chapter 2 Variables and Basic Types

2016-03-21

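2.1 Primitive Built-in Types

C++ defines a set of primitive types that include the arithmetic types and a special type named void. The arithmetic types represent characters, integers, boolean values, and floating-point numbers. The void type has no associated values and can be used in only a few circumstances, most commonly as the return type of a function.

2.1.1 Arithmetic Types

The arithmetic types fall into two categories: integral types (which include the character and boolean types) and floating-point types. The size of an arithmetic type may differ from one machine to another.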

Type	Meaning	Minimum Size
bool	boolean	NA
char	character	8 bits
wchar_t	wide character	16 bits
char16_t	Unicode character	16 bits
char32_t	Unicode character	32 bits
short	short integer	16 bits
int	integer	16 bits
long	long integer	32 bits
long long	long integer	64 bits
float	single-precision floating-point	6 significant digits
double	double-precision floating-point	10 significant digits
long double	extended-precision floating-point	10 significant digits

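The bool type represents the truth values true and false.

The basic character type is char. A char is guaranteed to be large enough to hold the characters of the machine's basic character set; that is, a char is the same size as a single machine byte. The remaining character types, wchar_t, char16_t, and char32_t, are used for extended character sets. A wchar_t is guaranteed to be large enough to hold any character in the machine's largest extended character set. The types char16_t and char32_t are intended for Unicode characters.

The remaining integral types represent integer values of different sizes. C++ guarantees that an int is at least as large as a short, a long is at least as large as an int, and a long long is at least as large as a long. The long long type was introduced by the new standard (C++11).

The floating-point types represent single-, double-, and extended-precision values. The standard specifies a minimum number of significant digits, and most compilers provide more precision than that minimum. Typically, float is 32 bits, double is 64 bits, and long double is 96 or 128 bits. The float and double types typically yield about 7 and 16 significant digits, respectively.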
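
The size guarantees described above can be checked on a particular machine. The following is a minimal sketch rather than part of the original notes; the helper names size_guarantees_hold, float_digits, and double_digits are hypothetical and exist only for illustration.

```cpp
#include <climits>
#include <limits>

// Returns true when the minimum sizes and the ordering described above
// hold on this implementation (they should on any conforming compiler).
bool size_guarantees_hold() {
    return sizeof(char) == 1                        // a char is one byte
        && CHAR_BIT >= 8                            // a byte has at least 8 bits
        && sizeof(short) <= sizeof(int)
        && sizeof(int) <= sizeof(long)
        && sizeof(long) <= sizeof(long long)
        && sizeof(short) * CHAR_BIT >= 16
        && sizeof(long) * CHAR_BIT >= 32
        && sizeof(long long) * CHAR_BIT >= 64;
}

// Number of decimal digits each type can represent without loss.
int float_digits()  { return std::numeric_limits<float>::digits10; }
int double_digits() { return std::numeric_limits<double>::digits10; }
```

Calling float_digits() and double_digits() reports the digits guaranteed to survive a round trip through the type. On common IEEE 754 machines these values are slightly lower than the "typical" 7 and 16 digits quoted above.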

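Signed and Unsigned Types

Except for bool and the extended character types, the integral types may be signed or unsigned. A signed type represents negative or positive numbers (including zero); an unsigned type represents only values greater than or equal to zero.

Adding the unsigned keyword in front of an integral type yields the corresponding unsigned type. The type unsigned int may be abbreviated as unsigned.

Unlike the other integer types, there are three distinct basic character types: char, signed char, and unsigned char. The char type uses the same representation as either signed char or unsigned char; which of the two is up to the compiler.

The standard does not define how signed types are represented, but it does specify that the range should be divided evenly between positive and negative values.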

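Advice: Deciding Which Type to Use

Use an unsigned type when you know the values cannot be negative.
Use int for integer arithmetic. short is usually too small, and long often has the same size as int. If your values exceed the range of int, use long long.
Do not use plain char or bool in arithmetic expressions. Computations using char are especially problematic because char is signed on some machines and unsigned on others.
Use double for floating-point computations. float usually does not have enough precision, and the cost of double-precision calculations compared with single-precision is negligible.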

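2.1.2 Type Conversions

The type of an object defines the data the object might contain and the operations it can perform. Among the operations that many types support is the ability to convert objects of the given type to other, related types.

Type conversions happen automatically when we use an object of one type where an object of another type is expected.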

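When we assign one arithmetic type to another, what happens depends on the range of values that the type on the left-hand side permits:

When we assign a nonbool arithmetic type to a bool object, the result is false if the value is 0 and true otherwise.
When we assign a bool to another arithmetic type, the result is 1 if the bool is true and 0 otherwise.
When we assign a floating-point value to an integral type, the value is truncated: the part after the decimal point is discarded.
When we assign an integral value to a floating-point type, the fractional part is zero. Precision may be lost if the integer has more digits than the floating-point type can represent.
If we assign an out-of-range value to an unsigned type, the result is the remainder of the value modulo the number of values the target type can hold. For example, an 8-bit unsigned char holds 256 values (0 through 255), so assigning -1 yields 255.
If we assign an out-of-range value to a signed type, the result is undefined.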
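
The conversion rules above can be tried out directly. The sketch below is illustrative only; the helper names (to_bool, bool_to_int, to_int, to_uchar) are hypothetical and simply wrap one implicit conversion each.

```cpp
#include <climits>

// Nonbool to bool: zero becomes false, anything else becomes true.
bool to_bool(int value) {
    bool b = value;
    return b;
}

// bool to int: true becomes 1, false becomes 0.
int bool_to_int(bool flag) {
    int i = flag;
    return i;
}

// Floating-point to int: the fractional part is discarded.
int to_int(double value) {
    int i = value;
    return i;
}

// Out-of-range value to an unsigned type: the result wraps around,
// modulo the number of values unsigned char can hold (UCHAR_MAX + 1).
unsigned char to_uchar(int value) {
    unsigned char c = value;
    return c;
}
```

For instance, to_int(3.99) yields 3, and to_uchar(-1) yields UCHAR_MAX, which is 255 when a char has 8 bits.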

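Advice: Avoid Undefined and Implementation-Defined Behavior

Expressions Involving Unsigned Types

Although we are unlikely to assign a negative value to an unsigned object on purpose, it is surprisingly easy to write code that does so implicitly: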

unsigned u = 10;
int i = -42;
std::cout << i + i << std::endl;  // prints -84
std::cout << u + i << std::endl;  // if 32-bit ints, prints 4294967264

Whether one or both operands are unsigned, when we subtract a value from an unsigned value we must be sure that the result cannot be negative:

unsigned u1 = 42, u2 = 10;
std::cout << u1 - u2 << std::endl; // ok: result is 32
std::cout << u2 - u1 << std::endl; // ok: but the result will wrap around

The fact that an unsigned value cannot be less than zero also affects how we write loops:

// WRONG: u can never be less than 0; the condition will always succeed
for (unsigned u = 10; u >= 0; --u)
    std::cout << u << std::endl;
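
One possible way to repair the loop above is to decrement before using the value, so that the test compares against a value that can actually be reached. This is a sketch under that assumption; count_down is a hypothetical helper that collects the values instead of printing them.

```cpp
#include <vector>

// Collects start, start - 1, ..., 0 using an unsigned counter.
std::vector<unsigned> count_down(unsigned start) {
    std::vector<unsigned> values;
    unsigned u = start + 1;   // begin one past the first value we want
    while (u > 0) {
        --u;                  // decrement first, so the last value stored is 0
        values.push_back(u);
    }
    return values;
}
```

With this shape the condition u > 0 eventually becomes false, so the loop terminates. One caveat of this sketch: if start is the largest unsigned value, start + 1 wraps to 0 and the result is empty.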

Caution: Don't mix signed and unsigned types. Remember that in an expression that mixes the two, signed values are automatically converted to unsigned.

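2.1.3 Literals

A value such as 42 is known as a literal because its value is self-evident. Every literal has a type, which is determined by the literal's form and value.

Integer and Floating-Point Literals

We can write an integer literal using decimal, octal, or hexadecimal notation. An octal literal begins with 0; a hexadecimal literal begins with 0x or 0X.

By default, decimal literals are signed, whereas octal and hexadecimal literals can be either signed or unsigned. A decimal literal has the smallest type among int, long, and long long in which its value fits. Octal and hexadecimal literals have the smallest type among int, unsigned int, long, unsigned long, long long, and unsigned long long in which the value fits. There are no literals of type short.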

Table 2.2 Specifying the Type of a Literal

Character and String Literals
Prefix	Meaning	Type
u	Unicode 16 character	char16_t
U	Unicode 32 character	char32_t
L	wide character	wchar_t
u8	UTF-8 (string literals only)	char

Integer Literals
Suffix	Minimum Type
u or U	unsigned
l or L	long
ll or LL	long long

Floating-Point Literals
Suffix	Type
f or F	float
l or L	long double

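A floating-point literal includes either a decimal point or an exponent written in scientific notation, where the exponent is introduced by E or e: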

3.14159    3.14159E0    0.    0e0    .001

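By default, floating-point literals have type double. We can override the default with a suffix.

Character and String Literals

A character enclosed in single quotes is a literal of type char. Zero or more characters enclosed in double quotes form a string literal: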

'a'  // character literal
"Hello World!"  // string literal

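The type of a string literal is an array of const char. The compiler appends a null character ('\0') to every string literal.

Two string literals that appear adjacent to one another and are separated only by whitespace are concatenated into a single literal: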

// multiline string literal
std::cout << "a really, really long string literal "
             "that spans two lines" << std::endl;

Escape Sequences

Escape Sequence	Meaning
\n	newline
\t	horizontal tab
\a	alert (bell)
\v	vertical tab
\b	backspace
\"	double quote
\\	backslash
\?	question mark
\'	single quote
\r	carriage return
\f	formfeed

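We can also write a generalized escape sequence: \x followed by one or more hexadecimal digits, or \ followed by one, two, or three octal digits. The value is the numeric value of the character: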

\7 (bell)    \12 (newline)     \40 (blank)
\0 (null)    \115 ('M')    \x4d ('M')

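Note that if \ is followed by more than three octal digits, only the first three are part of the escape sequence. In contrast, \x uses all of the hexadecimal digits that follow it.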
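
The difference between octal and hexadecimal escapes is easy to see in a small example. This is a sketch that assumes an ASCII-compatible execution character set; the function names two_ms and octal_then_digit are hypothetical.

```cpp
#include <string>

// "\115" (octal) and "\x4d" (hexadecimal) both denote 'M' in ASCII.
std::string two_ms() {
    return "\115\x4d";
}

// An octal escape uses at most three digits, so this literal holds two
// characters: '\123' (which is 'S' in ASCII) followed by the digit '4'.
std::string octal_then_digit() {
    return "\1234";
}
```

Under the ASCII assumption, two_ms() is the string MM and octal_then_digit() is the two-character string S4.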

Specifying the Type of a Literal

We can override the default type of an integer, floating-point, or character literal by supplying a prefix or suffix:

L'a'     // wide character literal, type is wchar_t
u8"hi!"  // utf-8 string literal (utf-8 encodes Unicode characters as 8-bit code units)
42ULL    // unsigned integer literal, type is unsigned long long
1E-3F    // single-precision floating-point literal, type is float
3.14159L // extended-precision floating-point literal, type is long double
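
The types named in the comments above can be confirmed by the compiler itself. The following sketch uses std::is_same from <type_traits>; the function name literal_types_ok is a hypothetical helper, and the check is an illustration rather than part of the original notes.

```cpp
#include <type_traits>

// True when each literal has the type its prefix or suffix requests.
constexpr bool literal_types_ok() {
    return std::is_same<decltype(42), int>::value
        && std::is_same<decltype(42u), unsigned>::value
        && std::is_same<decltype(42ULL), unsigned long long>::value
        && std::is_same<decltype(3.14), double>::value
        && std::is_same<decltype(1E-3F), float>::value
        && std::is_same<decltype(3.14159L), long double>::value
        && std::is_same<decltype('a'), char>::value
        && std::is_same<decltype(L'a'), wchar_t>::value;
}

// Fails to compile if any of the expectations above is wrong.
static_assert(literal_types_ok(), "literal types differ from expectations");
```

Because the check runs at compile time, a mismatch shows up as a build error rather than a run-time surprise.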

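Best Practice: When you write a long literal, use the uppercase L; the lowercase letter l is too easily mistaken for the digit 1.

Boolean and Pointer Literals

The words true and false are literals of type bool, and nullptr is a pointer literal.

2.2 Variables

A variable provides us with named storage that our programs can manipulate. Each variable in C++ has a type. The type determines the size and layout of the variable's memory, and the set of operations that can be applied to the variable.

2.2.1 Variable Definitions

A simple variable definition consists of a type specifier followed by a list of one or more variable names separated by commas, and ends with a semicolon: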

int sum = 0, value, // sum, value, and units_sold have type int
    units_sold = 0; // sum and units_sold have initial value 0
Sales_item item;    // item has type Sales_item (see § 1.5.1 (p. 20))
// string is a library type, representing a variable-length sequence of characters
std::string book("0-201-78345-X"); // book initialized from string literal

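Initializers

An object that is initialized gets the specified value at the moment it is created. The values used to initialize a variable can be arbitrarily complicated expressions. When a definition defines two or more variables, a variable defined earlier is visible to those that follow it, so it can be used to initialize a later variable in the same definition: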

// ok: price is defined and initialized before it is used to initialize discount
double price = 109.99, discount = price * 0.16;
// ok: call applyDiscount and use the return value to initialize salePrice
double salePrice = applyDiscount(price, discount);

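Warning: Initialization is not assignment. Initialization happens when a variable is given a value as it is created. Assignment obliterates an object's current value and replaces it with a new one.

List Initialization

One reason initialization is a complicated topic is that the language defines several different forms of initialization. For example, each of the following four definitions defines an int named units_sold and initializes it to 0 (they are alternatives, not meant to appear together in one scope):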

int units_sold = 0;
int units_sold = {0};
int units_sold{0};
int units_sold(0);

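The new standard introduced curly braces as a general-purpose form of initialization, known as list initialization. Braced initializers can now be used whenever we initialize an object and, in some cases, when we assign a new value to an object.

When used with variables of built-in type, list initialization has one important property: the compiler will not let us list initialize a variable of built-in type if the initializer might lead to the loss of information: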

long double ld = 3.1415926536;
int a{ld}, b = {ld}; // error: narrowing conversion required
int c(ld), d = ld;   // ok: but value will be truncated

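Default Initialization

When we define a variable without an initializer, the variable is default initialized and given a default value. What that default value is depends on the type of the variable and on where the variable is defined.

The default value of an object of built-in type depends on where it is defined. Variables defined outside any function body are initialized to zero. Variables of built-in type defined inside a function are uninitialized, and the value of an uninitialized variable of built-in type is undefined.

Each class controls how objects of that class type are initialized. Most classes let us define objects without explicit initializers, and such classes supply an appropriate default value for us.

Note: Uninitialized objects of built-in type defined inside a function body have an undefined value. Objects of class type that we do not explicitly initialize have a value defined by the class.

Warning: Uninitialized variables cause run-time problems.

Tip: We recommend initializing every object of built-in type. It is not always necessary, but it is safer.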

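2.2.2 Variable Declarations and Definitions

To allow programs to be written in logical parts, C++ supports what is commonly known as separate compilation. Separate compilation lets us split our programs into several files, each of which can be compiled independently.

To support separate compilation, C++ distinguishes between declarations and definitions. A declaration makes a name known to the program. A definition creates the associated entity.

To obtain a declaration that is not also a definition, we add the extern keyword and do not provide an explicit initializer: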

extern int i;   // declares but does not define i
int j;          // declares and defines j

Any declaration that includes an explicit initializer is a definition. An extern that has an initializer is a definition:

extern double pi = 3.1416; // definition

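It is an error to provide an initializer on an extern inside a function.

Note: Variables must be defined exactly once but can be declared many times.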
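
The "declare many times, define once" rule can be shown even within a single file. In a real program the declarations would usually live in a header and the definition in one source file; the single-file layout and the names shared_count and read_shared_count below are assumptions made for illustration.

```cpp
extern int shared_count;   // declaration only: no storage is created
extern int shared_count;   // repeating a declaration is allowed

int shared_count = 5;      // the one and only definition

// Uses the variable; any declaration above is enough for this to compile.
int read_shared_count() {
    return shared_count;
}
```

Adding a second definition such as int shared_count = 6; would be rejected, because a variable may be defined only once.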

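2.2.3 Identifiers

Identifiers in C++ are composed of letters, digits, and the underscore character. The language imposes no limit on name length. Identifiers must begin with either a letter or an underscore, and they are case-sensitive.

C++ reserves a set of names for its own use; these names may not be used as identifiers.

C++ Keywords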

alignas          continue         friend           register         true
alignof          decltype         goto             reinterpret_cast try
asm              default          if               return           typedef
auto             delete           inline           short            typeid
bool             do               int              signed           typename
break            double           long             sizeof           union
case             dynamic_cast     mutable          static           unsigned
catch            else             namespace        static_assert    using
char             enum             new              static_cast      virtual
char16_t         explicit         noexcept         struct           void
char32_t         export           nullptr          switch           volatile
class            extern           operator         template         wchar_t
const            false            private          this             while
constexpr        float            protected        thread_local
const_cast       for              public           throw

C++ Alternative Operator Names

and    bitand compl not_eq or_eq xor_eq
and_eq bitor  not   or     xor

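The standard also reserves a set of names for use in the standard library. The identifiers we define in our own programs should not contain two consecutive underscores, nor should they begin with an underscore followed immediately by an uppercase letter. In addition, identifiers defined outside a function should not begin with an underscore.

Conventions for Variable Names

There are a number of generally accepted conventions for naming variables. Following these conventions can improve the readability of a program.

An identifier should give some indication of its meaning.
Variable names are normally lowercase.
Class names usually begin with an uppercase letter, as in Sales_item.
Identifiers made of multiple words should visually distinguish each word, for example student_loan or studentLoan.

Best Practice: Naming conventions are most useful when followed consistently.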

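2.2.4 Scope of a Name

A scope is a part of the program in which a name has a particular meaning. Most scopes in C++ are delimited by curly braces. The same name can refer to different entities in different scopes. Names are visible from the point where they are declared until the end of the scope in which the declaration appears.

Advice: Define Variables Where You First Use Them

Nested Scopes

Scopes can contain other scopes. The contained (or nested) scope is referred to as an inner scope, and the containing scope is the outer scope. Names declared in an outer scope are visible in the inner scope. When the same name is declared again in an inner scope, the inner declaration hides the outer one: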

#include <iostream>

int reused = 42;  // reused has global scope
int main()
{
    std::cout << reused << std::endl; // global reused
    int reused = 0; // local reused hides global one
    std::cout << reused << std::endl;
    std::cout << ::reused << std::endl; // explicitly requests the global reused
}

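Warning: It is almost always a bad idea to define a local variable with the same name as a global variable.

2.3 Compound Types

A compound type is a type that is defined in terms of another type. C++ has several compound types, two of which are references and pointers.

2.3.1 References

A reference defines an alternative name for an object: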

int ival = 1024;
int &refVal = ival;  // refVal refers to (is another name for) ival
int &refVal2;        // error: a reference must be initialized

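When we define a reference, we bind the reference to its initializer. Once initialized, a reference remains bound to its initial object; there is no way to rebind it to a different object. For that reason, references must be initialized.

A Reference Is an Alias

A reference is not an object. It is just another name for an already existing object. After a reference has been defined, all operations on that reference are actually operations on the object to which it is bound. Because references are not objects, we may not define a reference to a reference.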
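
A short example may make the "alias" behavior concrete. This is a minimal sketch; the function name update_through_reference is hypothetical, while ival and refVal follow the names used earlier in this section.

```cpp
// Every operation on refVal is really an operation on ival.
int update_through_reference() {
    int ival = 1024;
    int &refVal = ival;   // refVal is another name for ival

    refVal = 2;           // assigns 2 to ival
    int copy = refVal;    // reads ival through the reference; copy is 2
    refVal += copy;       // ival is now 4

    return ival;
}
```

The function returns 4 even though ival is never named on the left-hand side of an assignment after its definition.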

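Reference Definitions

We can define multiple references in a single definition. Each identifier that is a reference must be preceded by the & symbol: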

int i = 1024, i2 = 2048;  // i and i2 are both ints
int &r = i, r2 = i2;      // r is a reference bound to i; r2 is an int
int i3 = 1024, &ri = i3;  // i3 is an int; ri is a reference bound to i3
int &r3 = i3, &r4 = i2;   // both r3 and r4 are references

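With two exceptions, the type of a reference and the type of the object to which it is bound must match exactly. Moreover, a reference may be bound only to an object, not to a literal or to the result of a more general expression: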

int &refVal4 = 10;   // error: initializer must be an object
double dval = 3.14;
int &refVal5 = dval; // error: initializer must be an int object

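2.3.2 Pointers

A pointer is a compound type that points to another type. Like references, pointers are used for indirect access to other objects. Unlike a reference, a pointer is an object in its own right: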

int *ip1, *ip2;  // both ip1 and ip2 are pointers to int
double dp, *dp2; // dp2 is a pointer to double; dp is a double

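Taking the Address of an Object

A pointer holds the address of another object. We get the address of an object by using the address-of operator (&):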

int ival = 42;
int *p = &ival; // p holds the address of ival; p is a pointer to ival

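Because references are not objects, they do not have addresses. Hence, we may not define a pointer to a reference.

With two exceptions, the type of a pointer and the type of the object to which it points must match: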

double dval;
double *pd = &dval;  // ok: initializer is the address of a double
double *pd2 = pd;    // ok: initializer is a pointer to double
int *pi = pd;  // error: types of pi and pd differ
pi = &dval;    // error: assigning the address of a double to a pointer to int

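Pointer Value

The value stored in a pointer can be in one of four states:

It can point to an object.
It can point to the location just past the end of an object.
It can be a null pointer, indicating that it is not bound to any object.
It can be invalid; any value other than the preceding three is invalid.

Using a Pointer to Access an Object

When a pointer points to an object, we can use the dereference operator (*) to access that object: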

int ival = 42;
int *p = &ival; // p holds the address of ival; p is a pointer to ival
std::cout << *p; // * yields the object to which p points; prints 42

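Note: We may dereference only a valid pointer that points to an object.

Null Pointers

A null pointer does not point to any object. Code can check whether a pointer is null before attempting to use it. There are several ways to obtain a null pointer: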

int *p1 = nullptr; // equivalent to int *p1 = 0;
int *p2 = 0;       // directly initializes p2 from the literal constant 0
// must #include cstdlib
int *p3 = NULL;    // equivalent to int *p3 = 0;

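It is illegal to assign an int variable to a pointer, even if the variable's value happens to be 0.

Advice: Initialize All Pointers

Assignment and Pointers

When we assign to a pointer, we give the pointer itself a new value. Assignment makes the pointer point to a different object.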
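
The distinction between assigning to a pointer and assigning through a pointer can be sketched as follows. The function name reseat_pointer and the variables first and second are hypothetical, chosen only for this illustration.

```cpp
// Assigning to p changes where p points;
// assigning to *p changes the object p currently points to.
int reseat_pointer() {
    int first = 1, second = 2;

    int *p = &first;   // p points to first
    *p = 10;           // writes through p: first is now 10

    p = &second;       // p itself gets a new value and now points to second
    *p = 20;           // writes through p: second is now 20

    return first + second;
}
```

The function returns 30: first was changed while p pointed at it, and second was changed after p was reassigned.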

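Other Pointer Operations

A constant expression is an expression whose value cannot change and that can be evaluated at compile time.

constexpr Variables

Variables declared as constexpr are implicitly const and must be initialized by constant expressions.

Pointers and constexpr

When we define a pointer in a constexpr declaration, the constexpr specifier applies to the pointer, not to the type to which the pointer points: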

const int *p = nullptr;     // p is a pointer to a const int
constexpr int *q = nullptr; // q is a const pointer to int

constexpr imposes a top-level const on the objects it defines.
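
The compiler can confirm where the const ends up in each declaration. The sketch below reuses the names p and q from the example above and adds a hypothetical helper, constexpr_is_top_level, that compares the declared types with std::is_same.

```cpp
#include <type_traits>

const int *p = nullptr;      // low-level const: pointer to const int
constexpr int *q = nullptr;  // top-level const: const pointer to int

// True when q is a const pointer to (nonconst) int
// and p is a (nonconst) pointer to const int.
constexpr bool constexpr_is_top_level() {
    return std::is_same<decltype(q), int *const>::value
        && std::is_same<decltype(p), const int *>::value;
}

static_assert(constexpr_is_top_level(), "unexpected pointer types");
```

In other words, p may later be made to point elsewhere but cannot be used to modify what it points to, whereas q can never be reseated.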

2.5 Dealing with Types

2.5.1 Type Aliases

Traditionally, we use a typedef to define a type alias.
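
As a rough illustration of typedef, the sketch below defines a few aliases and checks what they stand for. The alias names wages, base, and ptr, and the helper typedef_aliases_ok, are illustrative choices, not names from the original notes.

```cpp
#include <type_traits>

typedef double wages;      // wages is a synonym for double
typedef wages base, *ptr;  // base is a synonym for double, ptr for double*

// True when each alias names the type described in the comments above.
constexpr bool typedef_aliases_ok() {
    return std::is_same<wages, double>::value
        && std::is_same<base, double>::value
        && std::is_same<ptr, double *>::value;
}

static_assert(typedef_aliases_ok(), "unexpected alias types");
```

Note that a typedef declarator can include modifiers such as *, so a single typedef can introduce aliases for both a type and a compound type built from it.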

C++11 introduced a second way to define a type alias, the alias declaration:

using SI = Sales_item;  // SI is a synonym for Sales_item

typedef char *pstring;
const pstring cstr = 0; // cstr is a constant pointer to char
const pstring *ps;      // ps is a pointer to a constant pointer to char

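2.5.2 The auto Type Specifier

auto tells the compiler to deduce the type from the initializer. By implication, a variable that uses auto as its type specifier must have an initializer: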

auto item = val1 + val2;
auto i = 0, *p = &i;      // ok: i is int and p is a pointer to int
auto sz = 0, pi = 3.14;   // error: inconsistent types for sz and pi

Compound Types, const, and auto

int i = 0, &r = i;
auto a = r;  // a is an int (r is an alias for i, which has type int)

auto ordinarily ignores top-level const. Low-level const, such as when an initializer is a pointer to const, is kept:

const int ci = i, &cr = ci;
auto b = ci;  // b is an int (top-level const in ci is dropped)
auto c = cr;  // c is an int (cr is an alias for ci whose const is top-level)
auto d = &i;  // d is an int*(& of an int object is int*)
auto e = &ci; // e is const int*(& of a const object is low-level const)

If we want the deduced type to have a top-level const, we must say so explicitly:

const auto f = ci; // deduced type of ci is int; f has type const int

We can also specify that we want a reference to the auto-deduced type:

auto &g = ci;       // g is a const int& that is bound to ci
auto &h = 42;       // error: we can't bind a plain reference to a literal
const auto &j = 42; // ok: we can bind a const reference to a literal

auto k = ci, &l = i;    // k is int; l is int&
auto &m = ci, *p = &ci; // m is a const int&;p is a pointer to const int
// error: type deduced from i is int; type deduced from &ci is const int
auto &n = i, *p2 = &ci;
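
The deductions claimed in the comments above can be verified mechanically. This sketch repeats a handful of the declarations inside a hypothetical function, auto_deduction_ok, and compares each deduced type with the expected one using std::is_same.

```cpp
#include <type_traits>

// True when auto deduces the types described in this section.
bool auto_deduction_ok() {
    int i = 0;
    const int ci = i, &cr = ci;

    auto b = ci;        // top-level const dropped: int
    auto c = cr;        // cr is an alias for ci: int
    auto d = &i;        // int*
    auto e = &ci;       // low-level const kept: const int*
    const auto f = ci;  // top-level const requested explicitly: const int
    auto &g = ci;       // reference keeps the const: const int&

    return std::is_same<decltype(b), int>::value
        && std::is_same<decltype(c), int>::value
        && std::is_same<decltype(d), int *>::value
        && std::is_same<decltype(e), const int *>::value
        && std::is_same<decltype(f), const int>::value
        && std::is_same<decltype(g), const int &>::value;
}
```

The function returns true on a conforming compiler; changing any expected type in the return expression makes it return false.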

2.5.3 The decltype Type Specifier

decltype(f()) sum = x; // sum has whatever type f returns

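The way decltype handles top-level const and references differs subtly from the way auto does. When the expression to which we apply decltype is a variable, decltype returns the type of that variable, including top-level const and references: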

const int ci = 0, &cj = ci;
decltype(ci) x = 0; // x has type const int
decltype(cj) y = x; // y has type const int& and is bound to x
decltype(cj) z;     // error: z is a reference and must be initialized

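decltype and References

When we apply decltype to an expression that is not a variable, we get the type that the expression yields. decltype returns a reference type for expressions that yield objects that can stand on the left-hand side of an assignment: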

int i = 42, *p = &i, &r = i;
decltype(r + 0) b;  // ok: addition yields an int; b is an (uninitialized) int
decltype(*p) c;     // error: c is int& and must be initialized

// decltype of a parenthesized variable is always a reference
decltype((i)) d;    // error: d is int& and must be initialized
decltype(i) e;      // ok: e is an (uninitialized) int
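
The decltype rules in this section can likewise be checked with std::is_same. The following is an illustrative sketch; the function name decltype_rules_ok is hypothetical, and the variables mirror those used in the examples above.

```cpp
#include <type_traits>

// True when decltype yields the types described in this section.
bool decltype_rules_ok() {
    int i = 42, *p = &i, &r = i;
    const int ci = 0, &cj = ci;

    return std::is_same<decltype(ci), const int>::value      // keeps top-level const
        && std::is_same<decltype(cj), const int &>::value    // keeps the reference
        && std::is_same<decltype(r + 0), int>::value         // addition yields an int
        && std::is_same<decltype(*p), int &>::value          // dereference yields an lvalue
        && std::is_same<decltype((i)), int &>::value         // parenthesized variable
        && std::is_same<decltype(i), int>::value;            // plain variable
}
```

The pair decltype((i)) and decltype(i) is the one most likely to surprise: the extra parentheses turn the variable into an expression, and that expression is an lvalue.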

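In-class initializers are restricted in form: they must either be enclosed in curly braces or follow an = sign; they may not be specified inside parentheses. Headers usually contain entities that can be defined only once in any given file, such as class definitions and const and constexpr variables.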
