I have written the following code in C:

int main(){
    int a = {1, 2, 3};
}

It seems like the variable assigned, in this case a, always takes the value of the first array element. Now I'm wondering if the other array elements are discarded, or written to the memory after a, thus causing a buffer overflow.

Vlad from Moscow
Daniel D.
  • [Build with warnings](https://godbolt.org/z/473E8ohzj), the compiler will not like that. – Some programmer dude Oct 29 '21 at 08:03
  • You should check this tutorial: [C-Array](https://www.tutorialspoint.com/cprogramming/c_arrays.htm). In an array, if you want to access the other cells you have to specify the index, for example `a[0] = 1` or `a[1] = 2`. Array indices always start at 0. – Tsirsuna Oct 29 '21 at 08:05
  • You must compile the code with a valid C compiler. [What compiler options are recommended for beginners learning C?](https://software.codidact.com/posts/282565) – Lundin Oct 29 '21 at 08:07
  • What is the objective of such code? – Iron Fist Oct 29 '21 at 09:14
  • Re “the value of the first array element”: `{1, 2, 3}` is not an array. It is a list of values inside braces. That list can be used to initialize an array, but it is not an array. No book and no teacher told you it is an array. – Eric Postpischil Oct 29 '21 at 10:51

2 Answers

This declaration

 int a = {1, 2, 3};

is semantically invalid (it breaks the semantic rule quoted below). A scalar object may not be initialized by a braced initializer list with more than one initializer.

From the C Standard (6.7.9 Initialization)

11 The initializer for a scalar shall be a single expression, optionally enclosed in braces. The initial value of the object is that of the expression (after conversion); the same type constraints and conversions as for simple assignment apply, taking the type of the scalar to be the unqualified version of its declared type.

That is, the comma in the braced initializer list is treated as a separator of initializers, and for a scalar object only a single expression is allowed.

When more than one initializer is present then the compiler assumes that the initialized object is an aggregate.

To declare an array you need to write

 int a[] = {1, 2, 3};

or

 int a[N] = {1, 2, 3};

where N is an integer constant expression with a value equal to or greater than 3.
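
As a minimal sketch of the two array forms above: with `[]` the size is deduced from the number of initializers, and with an explicit size larger than the list, the remaining elements are zero-initialized. The function names and the size 5 are illustrative assumptions, not part of the original answer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: element count when the array size is deduced. */
size_t deduced_length(void)
{
    int a[] = {1, 2, 3};            /* size deduced from the initializer list */
    return sizeof a / sizeof a[0];  /* total bytes divided by bytes per element */
}

/* Hypothetical helper: with N = 5, elements without an initializer are zero. */
int last_of_five(void)
{
    int a[5] = {1, 2, 3};           /* a[3] and a[4] are implicitly set to 0 */
    assert(a[0] == 1 && a[2] == 3); /* the listed values land in order */
    return a[4];
}
```

Calling `deduced_length()` yields 3 and `last_of_five()` yields 0, so in neither form is anything written past the end of the array.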

Vlad from Moscow
  • Is it _syntactically_ invalid though? Serious question. I do not know for sure. – Ted Lyngmo Oct 29 '21 at 08:14
  • The syntax is OK. The only problem is the excess elements when initializing scalar objects. – 0___________ Oct 29 '21 at 08:22
  • @0___________ That's how I was thinking until I saw the above and I was pretty sure that Vlad would dig up a standard clause for it :-) Thanks Vlad! – Ted Lyngmo Oct 29 '21 at 08:32
  • @TedLyngmo No, it is syntactically fine. The syntax is (6.7.9) _initializer:_ `{` _initializer-list_ `}`. Vlad's quote is kind of irrelevant. This is a _constraint violation_ of C17 6.7.9/2 "No initializer shall attempt to provide a value for an object not contained within the entity being initialized." – Lundin Oct 29 '21 at 08:47
  • @Vlad Feel free to update the answer with my reference in the comment above, since pedantically this is _not_ a syntax error and you are quoting the wrong paragraph. – Lundin Oct 29 '21 at 08:50
  • Yeah well since you didn't include the relevant quote I posted an answer containing it, in case someone is interested in knowing the reason. But as far as beginners are concerned, they just need to know that it is wrong. – Lundin Oct 29 '21 at 09:23
  • @Lundin Why is it a non-relevant quote? It is a relevant quote.:) It says how scalar objects may be initialized.:) – Vlad from Moscow Oct 29 '21 at 09:28
  • @Lundin The quote provided by you refers to aggregates.:) It is aggregates that can contain objects within their entities.:) – Vlad from Moscow Oct 29 '21 at 09:33
  • Err... no it doesn't. It refers to scalars and aggregates both. Also see §12 "The rest of this subclause deals with initializers for objects that have aggregate or union type." – Lundin Oct 29 '21 at 09:39
  • As for §11 it does indeed say how a scalar should be initialized, but it doesn't say how it _shouldn't_ be initialized, because that's already covered by the earlier §2. – Lundin Oct 29 '21 at 09:41
  • @Lundin Scalar entities do not contain objects. They are objects themselves.:) The quote 11 says how scalar shall be initialized. All other initializations of scalar objects are prohibited.:) There is no sense to list how they may not be initialized.:) – Vlad from Moscow Oct 29 '21 at 09:44
  • The standard sometimes uses the term "entity" where the item being discussed is something unknown or vague. Had it meant to say aggregate, it would have done so. §2 refers to the normative syntax above, where _initializer_ could be either an assignment expression or a brace-enclosed initializer-list. This is further clarified by §3 "The type of the entity to be initialized shall be an array of unknown size or a **complete object type**". A scalar is always a complete object type, as is an array with specified size. – Lundin Oct 29 '21 at 09:51
  • It is a fun fact that `int a = (1, 2, 3);` and `int a = {(1, 2, 3)};` and `int a; a = 1, 2, 3;` are all valid as explained here: https://stackoverflow.com/questions/52550/what-does-the-comma-operator-do. The cases where this use of the comma is needed or useful are probably rare. – nielsen Oct 29 '21 at 11:51
  • @Lundin For example, you might say that milk contains lactose and proteins. But it makes no sense to say that milk contains milk. The verb "contain" implies that some entity is composed of something. Thus you may say that a structure (array, union) as an entity contains objects. But to say that an object contains an object does not make sense. – Vlad from Moscow Oct 29 '21 at 12:58
  • This is getting silly. The text doesn't say that. It says should not provide a _value_ for an object. Given an expression `int a = 0;`, then `a` is an object and `0` is a value. The formal definition of an object being _"region of data storage in the execution environment, the contents of which can represent values"_. – Lundin Oct 29 '21 at 13:04
  • @Lundin You are mistaken. It is clearly written: "...a value for an object not contained within the entity being initialized." That is, "an object CONTAINED within an entity". But you are in fact trying to say an object contained in an object.:) – Vlad from Moscow Oct 29 '21 at 13:10
  • That concept exists elsewhere in the standard too, such as in the definition of the additive operators/pointer arithmetic. "For the purposes of these operators, a pointer to an object that is not an element of an array behaves the same as a pointer to the first element of an array of length one with the type of the object as its element type." Either way, there is nothing in §11 that would make `int a = {1, 2, 3};` invalid. This is an initialiser to a scalar in the form of single expression optionally enclosed in braces, it fulfils all the requirements of §11. – Lundin Oct 29 '21 at 13:20
  • Also see §3 that I already quoted. What's your definition of _complete object type_ then? – Lundin Oct 29 '21 at 13:22
  • @Lundin This has no relation to the quote you provided in your answer. Do not try to rewrite the quote as you like. – Vlad from Moscow Oct 29 '21 at 13:26
  • Either way, there is nothing in §11 that would make int a = {1, 2, 3}; invalid. This is an initialiser to a scalar in the form of single expression optionally enclosed in braces, it fulfils all the requirements of §11. Also see §3 that I already quoted. What's your definition of complete object type then? – Lundin Oct 29 '21 at 13:30
  • @Lundin Again you are wrong. There is no single expression. According to the syntax the comma is used to separate initializers. You would be right if you wrote int a = { ( 1, 2, 3 ) }; It is the same as to write a function call like f( 1, 2, 3 ); and then to say that there is a single expression. – Vlad from Moscow Oct 29 '21 at 13:36
  • No, it isn't... this whole discussion started with me quoting the syntax. Which isn't described in §11 either. Anyway, it's like talking to a wall so I'm done here. – Lundin Oct 29 '21 at 13:39

int a = {1, 2, 3}; is not valid C code.

It is a so-called constraint violation in the C standard, after which a compiler is required to issue a diagnostic message:

C17 6.7.9/2:

Constraints
No initializer shall attempt to provide a value for an object not contained within the entity being initialized.

It's not a syntax error though; we may actually write weird crap such as `int a = {1};` with a single, brace-enclosed initializer. The result is the same whatever the reason for the error, though: compilers must issue diagnostic messages for all constraint and syntax violations.
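
For contrast, here is a minimal sketch of the form that is allowed: a scalar initialized by a single expression enclosed in braces. The function name is an illustrative assumption, and some compilers may still emit an optional style warning about braces around a scalar initializer.

```c
#include <assert.h>

/* Hypothetical helper: a scalar with one brace-enclosed initializer. */
int braced_scalar(void)
{
    int a = {1};   /* valid: a single expression, optionally enclosed in braces */
    /* int b = {1, 2, 3};  would be the constraint violation discussed above */
    return a;
}
```

Here `braced_scalar()` returns 1, and nothing beyond `a` is initialized or written.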

To avoid wasting your time troubleshooting invalid C code such as this, study [What compiler options are recommended for beginners learning C?](https://software.codidact.com/posts/282565)


As for what compilers like gcc and clang do when faced with such non-standard code - they appear to simply discard the superfluous initializers. If I compile this code with gcc/clang for x86 and ignore the diagnostic message:

int foo (void)
{
    int a = {1, 2, 3};
    return a;
}

The resulting x86 assembly is

mov     eax, 1
ret

Which when translated back to C is 100% equivalent to

int foo (void)
{
    return 1;
}

It's important to understand that this is a non-standard compiler extension though, and not guaranteed or portable behavior.

Lundin
  • There's a passage in the provided link where you say "_There's an option `-pedantic` that gives warnings for invalid C. `-pedantic-errors` is the same but gives errors and prevents the code from compiling_" - I tried to understand if that's the case by reading [`-pedantic` and `-pedantic-errors`](https://gcc.gnu.org/onlinedocs/gcc/Warning-Options.html#Warning-Options) a long time ago - but gave up and have used both ever since. :-) If you are sure about this, I'll drop `-pedantic` from now on. – Ted Lyngmo Oct 29 '21 at 09:26
  • @TedLyngmo `-pedantic` gives _warnings_ and `-pedantic-errors` gives errors. If you are using `-pedantic-errors` there's no need to use `-pedantic`. They look for exactly the same things. The advantage of `-pedantic-errors` is that you can use it and only get errors for C language violations. If you use `-pedantic -Werror` you will turn _all_ warnings into errors, which is good for C beginners but not for intermediate and above, who might only be interested in getting errors for outright wrong stuff. – Lundin Oct 29 '21 at 09:33
  • Thanks! I just did what didn't occur to me before. I checked the gcc source. `pedantic-errors` `Common Var(flag_pedantic_errors)` `Like -pedantic but issue them as errors.` :-) – Ted Lyngmo Oct 29 '21 at 09:36
  • (Similarly, people writing stuff like glibc or Linux kernel code under `-std=gnu17` might want to use `-pedantic` just to get a heads-up, as they might be doing non-standard things on purpose.) – Lundin Oct 29 '21 at 09:37