According to the standard (C17 draft, 7.21.7.2), fgets (¶1)
char *fgets(char * restrict s, int n, FILE * restrict stream);
reads at most n-1 characters from stream into s[], stopping after the first '\n' (which is also stored in s[]) or at EOF, and then appends a '\0' (¶2). It returns (¶3):
- NULL:
  - if EOF is encountered immediately (s[] remains unchanged)
  - if there was a read error (s[] has indeterminate contents)
- s: otherwise ("success")
I would therefore expect that for n <= 1, fgets reads "at most" n-1 <= 0 (that is: 0) characters, appending a '\0', and returning s. In any case, there is nothing being read, so the program can't read EOF or have any read errors.
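To make that expectation concrete, here is a minimal sketch of what a word-for-word transcription of ¶2 and ¶3 might look like. The name literal_fgets is hypothetical, and this is not how any particular C library implements fgets; it only illustrates the reading described above, under the assumption that s points to at least one writable char even when n <= 0.

```c
#include <stdio.h>

/* Hypothetical helper: a literal reading of C17 7.21.7.2, for illustration only.
 * Assumes s has room for at least one char, even if n <= 0. */
char *literal_fgets(char *s, int n, FILE *stream)
{
    int count = 0; /* characters stored in s so far */
    int c = 0;     /* last value returned by fgetc; 0 means "nothing read yet" */

    /* "at most n-1 characters": for n <= 1 this loop body never runs,
     * so fgetc is never called and EOF cannot be encountered. */
    while (count + 1 < n) {
        c = fgetc(stream);
        if (c == EOF)
            break;
        s[count++] = (char)c;
        if (c == '\n')
            break; /* the new-line is kept, then reading stops */
    }

    if (c == EOF) {
        if (ferror(stream))
            return NULL; /* read error: contents of s are indeterminate */
        if (count == 0)
            return NULL; /* EOF before any character: s is left unchanged */
    }

    s[count] = '\0'; /* written right after the last character read, if any */
    return s;
}
```

With this sketch, literal_fgets(s, 1, stream) and literal_fgets(s, 0, stream) both store "" and return s without touching the stream, which is the behavior I expected from fgets itself.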
However, with the following code
#include <stdio.h>
#include <string.h>

int main(void) {
    char s[20];
    char *cp;
    int n;

    for (n = 2; n >= -1; --n) {
        strcpy(s, "HHHHH");
        cp = fgets(s, n, stdin);
        printf("n == %d:\n", n);
        printf(" \"%s\"\n", s);
        if (cp == NULL)
            printf(" fgets returned NULL\n");
    }
    printf("The end of main has been reached.\n");
    return 0;
}
and input abcde, I get the following output with GCC
n == 2:
"a"
n == 1:
""
n == 0:
"HHHHH"
fgets returned NULL
n == -1:
"HHHHH"
fgets returned NULL
The end of main has been reached.
and the following output with MSVC
n == 2:
"a"
n == 1:
""
n == 0:
"HHHHH"
fgets returned NULL
I am guessing that the abrupt program termination with MSVC has to do with the invocation of an "invalid parameter handler" (see Microsoft's documentation for fgets).
For the n == 1 case, the output is as expected. But shouldn't fgets(s, n, stream) store an empty string "" in s[] and return s, rather than return NULL, for all n <= 1 and not just for n == 1? Irrespective of what to make of the n == -1 case, both GCC and MSVC return NULL for n == 0.
For what it's worth, the precise wording of ¶3 is:
"[...] If end-of-file is encountered and no characters have been read into the array, the contents of the array remain unchanged and a null pointer is returned. [...]"
But if nothing is being read in the first place, how can end-of-file be "encountered" in that case?
My conclusions, after having read the comments and the content in the duplicate post:
I am convinced now that reading "at most" n-1 characters isn't possible for n <= 0. One can only read a non-negative number of characters (which I interpret to mean: a number of characters in the range [0, SIZE_MAX]). Therefore, in the case of n <= 0, the standard's text invokes what linguists in their subfield of semantics call a presupposition failure.
- A presupposition is an assumption whose negation renders the containing statement uninterpretable. Natural language examples: "the" in a sentence presupposes contextual uniqueness; "we" in a sentence presupposes 2 or more people on whose behalf the subject is speaking; "stopped doing X" in a sentence presupposes that one indeed "was doing X for a while".
- That is, in this case "all bets are off". However, even though the presupposition failure makes this case literally "undefined behavior", I would be more comfortable if we just called the standard out on this omission, because after all it doesn't outlaw an argument of n which is <= 0. (As user "chux" pointed out in a comment (paraphrased): UB comes in 2 flavors, that which is explicitly specified as UB and that about which the standard is silent; both types are common in C.)
The case of n == 1 looks well-formed to me (one can read "at most 0" characters). I find the wording "A null character is written immediately after the last character read into the array." unproblematic, because not having a "last character read" is expected for the boundary condition of 0 read operations. (That is, in this case the presupposition failure can be tolerated, because it is just "one away" from there being a last character read.)
- That said, the wording lacks clarity, and the standard should improve it.
- To make things dependent on the stream's EOF flag (here feof(stdin)) is intriguing, but I think this goes too far in trying to assign meaning to something which is poorly worded in the standard.
It is disappointing that the C2x draft (I'm looking at N3096; there might be newer versions at the time of writing) still contains the same underspecified language.
I believe that there are 4 potential ways we can consider handling the n <= 0 case, given that it's not outlawed:
- setting s[] to "" and returning s
- setting s[] to "" and returning NULL
- leaving s[] unchanged and returning s
- leaving s[] unchanged and returning NULL
Given the existing confusion around the case, I will stay away from discussing their relative merits and consistency with other parts of the standard.
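For anyone who needs predictable behavior today, one way to sidestep the question is to never pass n <= 0 to fgets at all. The following is a rough sketch of such a wrapper with the four options above selectable by the caller. The names fgets_checked and nonpositive_policy are made up for illustration and are not part of any library; the two "set to empty string" policies additionally assume that s points to at least one writable char, which a caller passing n <= 0 has not actually promised.

```c
#include <stdio.h>

/* Hypothetical policy names, one per option in the list above. */
enum nonpositive_policy {
    EMPTY_RETURN_S,        /* set s[] to "" and return s */
    EMPTY_RETURN_NULL,     /* set s[] to "" and return NULL */
    UNCHANGED_RETURN_S,    /* leave s[] unchanged and return s */
    UNCHANGED_RETURN_NULL  /* leave s[] unchanged and return NULL */
};

/* Hypothetical wrapper: forwards to fgets only when n > 0, so the
 * underspecified n <= 0 case never reaches the library. */
char *fgets_checked(char *s, int n, FILE *stream,
                    enum nonpositive_policy policy)
{
    if (n > 0)
        return fgets(s, n, stream);

    switch (policy) {
    case EMPTY_RETURN_S:
        s[0] = '\0'; /* assumption: s has room for one char */
        return s;
    case EMPTY_RETURN_NULL:
        s[0] = '\0'; /* assumption: s has room for one char */
        return NULL;
    case UNCHANGED_RETURN_S:
        return s;
    case UNCHANGED_RETURN_NULL:
    default:
        return NULL;
    }
}
```

Calling fgets_checked(s, 0, stdin, UNCHANGED_RETURN_NULL) reproduces what I observed for n == 0 with both compilers, without depending on how the library treats that argument.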