Arrays
An array is a series of elements of the same type placed in contiguous memory locations that can be individually referenced by adding an index to a unique identifier.
That means that, for example, five values of type int can be declared as an array without having to declare five different variables (each with its own identifier). Instead, using an array, the five int values are stored in contiguous memory locations, and all five can be accessed using the same identifier, with the proper index. The elements of an array can be of almost any type supported by C++ (int, double, pointers, objects, …); the notable exception is references, since arrays of references are not allowed.
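Elements of an array declared at global or namespace scope (or otherwise given static storage duration) are zero-initialised by default, while elements of fundamental type in an array declared in a local scope (a function or code block) are left uninitialised unless an initialiser is provided. An array that is a class member is initialised along with the enclosing object. An array can also be given explicit initial values, enclosed in braces, when it is declared: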
int foo[5] = { 16, 2, 77, 40, 12071 };
This statement declares an array of five int elements, foo[0] to foo[4], holding 16, 2, 77, 40 and 12071 in that order.
The number of values between braces {} shall not be greater than the number of elements in the array. In the example above, foo was declared with 5 elements (as specified by the number enclosed in square brackets, []), and the braces {} contained exactly 5 values, one for each element. If fewer values are given, the remaining elements are set to their default values (which, for fundamental types, means they are filled with zeroes). For example:
int bar[5] = { 10, 20, 30 };
This creates an array whose five elements are 10, 20, 30, 0 and 0.
It is important to clearly distinguish between the two uses that brackets [] have in relation to arrays. They perform two different tasks: one is to specify the size of an array when it is declared, and the other is to specify the index of a concrete array element when it is accessed.
int foo[5]; // declaration of a new array
foo[2] = 75; // access to an element of the array.
The main difference is that the declaration is preceded by the type of the elements, while the access is not.
Some other valid operations with arrays:
foo[0] = a;
foo[a] = 75;
b = foo [a+2];
foo[foo[a]] = foo[2] + 5;
Arrays can have any number of dimensions, within implementation limits (declared with multiple []). One- and two-dimensional (matrix) arrays are the most commonly used. Three-dimensional arrays are used in 3-D modelling and similar domains where data in three dimensions or planes is represented. Higher dimensions, though supported, are hard to reason about and rarely used. The amount of memory needed for an array is multiplied by the size of each added dimension, so it grows very quickly. For example:
char century[100][365][24][60][60];
declares an array with an element of type char for each second in a century. This amounts to more than 3 billion char! So this declaration would consume more than 3 gigabytes of memory!
In the end, multidimensional arrays are just an abstraction for programmers, since the same results can be achieved with a simple array whose size is the product of the dimensions and whose index is computed from the individual indices:
int jimmy[3][5]; // is equivalent to
int jimmy[15]; // (3 * 5 = 15)
The only difference is that, with multidimensional arrays, the compiler automatically keeps track of the size of each imaginary dimension. Accessing jimmy[n][m] in the two-dimensional version corresponds to accessing jimmy[n * 5 + m] in the simple array, and both produce exactly the same result.
In a function declaration, it is also possible to include multidimensional arrays. The format for a tridimensional array parameter is:
base_type[][depth][depth]
For example, a function with a multidimensional array as a parameter could be:
void procedure (int myarray[][3][4])
Notice that the first brackets [] are left empty, while the following ones specify sizes for their respective dimensions. This is necessary in order for the compiler to be able to determine the depth of each additional dimension.
Pointer Equivalence
Arrays and pointers are closely related, although they are not the same thing. In C, it is common to represent a string (char array) as either char[] or char*, with the latter being more common. In most expressions, the name of an array is converted to a pointer to its first element, that is, to the start of the block of memory the array occupies. This makes it possible to use pointer arithmetic and manipulation to efficiently access individual items represented by that block of memory.
Associative Arrays
Associative arrays are a special type of data structure where data is stored keyed by a user-specified value (instead of a monotonically increasing number). Often the internal implementation of such a structure is an array in which the user-specified key is converted into a numeric value (hashed), and the data is then stored at the corresponding location within the array. Neither C nor C++ provides an associative array as a built-in language feature; however, the C++ standard library (STL) provides both std::map and std::unordered_map for this purpose.
Recommendation
Arrays are low-level constructs (from the C heritage of C++) that should in general be avoided. C++ provides much better (safer, friendlier, …) higher-level constructs that provide the benefits of contiguous memory storage without the attendant pitfalls (buffer overflows, lack of uniform iteration, …) of using low-level arrays.
std::vector
The vector class provided by the STL is probably the most commonly used data structure in C++. A vector provides all the features of an array and also supports growing the number of items stored. A vector may optionally shrink its storage, although implementations are free to ignore a request to shrink. Growing a vector beyond its current capacity will usually involve allocating a new block of memory and then copying (or moving) the current contents into the new block.
- reserve: Used to indicate to the vector how much memory it should reserve for storing items. Generally the preferred way of indicating how big a memory block the vector should use before adding items. It avoids frequent reallocation when a rough estimate of the number of items to be added is available (mainly worthwhile when adding a large number of items).
- resize: Used to grow or shrink the vector. A smaller size than the current one leads to items being removed. A larger size leads to new items being appended to the vector (either value-initialised or copies of a value specified in the function invocation).
- push_back: Used to add an item to the end of the vector. Generally involves a copy (or move) of the specified value being stored in the vector.
- emplace/emplace_back: Used to add items to the vector. Constructs the new item directly inside the vector from the supplied constructor arguments, avoiding a (potentially expensive) copy.
std::array
Introduced in C++11. These have the same memory profile as C-style arrays, and just add useful functions to operate on them. As with most other STL containers that support the subscript operator ([]), accessing data using the subscript operator is not bounds checked, while the at function performs bounds checking. Multi-dimensional arrays may be created by nesting, as with raw arrays.
Why Arrays?
- Arrays are a fundamental part of computer science, and most (if not all) higher level abstractions around the concept will internally use arrays.
- They may be the only option in certain environments (embedded devices), where the memory profile (code and data) has to be kept to an absolute minimum.
- From a C++ perspective, a lot of the libraries in use are written in pure C, and will involve passing arrays of data back and forth. A C++ programmer needs to be able to work with C libraries, which in turn requires good familiarity with arrays.
- A lot of interviews (at any stage in your career) will ask you to implement a map, a vector or a similar data structure, which will involve using an array internally.