35

I stumbled upon some C++ code like this:

int $T$S;

First I thought that it was some sort of PHP code or something wrongly pasted in there, but it compiles and runs nicely (on MSVC 2008).

What kind of characters are valid for variables in C++ and are there other weird characters you can use?

Peter Mortensen
Valmond
  • 10
    "Can" != "Should." Using `$` in a variable name is an extension to the language, and will probably not work for other compilers (except probably GCC, which has a flag for every language extension ever). – Chris Lutz Oct 28 '11 at 07:40
  • 1
    I'd even say it clogs up the variable names using weird characters so no, I don't want to use it, just to know about it :-) – Valmond Oct 28 '11 at 08:09
  • 6
    The use of `$` in identifiers is fairly common on VMS, where a lot of system library routines have names like `SYS$SOMETHING`. g++ supports it as an extension, but warns about it if you specify `-pedantic`. – Keith Thompson Oct 28 '11 at 08:31
  • @KeithThompson: As for now 2015-11-17 `gcc 4.9.3` does not even warn with the `-pedantic` switch. Can it be something changed in the meantime? – Peter VARGA Nov 17 '15 at 09:29
  • 1
    @AlBundy: The C standard permits "other implementation-defined characters" in identifiers. No warning is required, even with `-pedantic` (though personally I wish there were an easy way to warn about such things without specifying a separate option for each feature). – Keith Thompson Nov 17 '15 at 16:58
  • I was also caught out relying on "-Wpedantic". If you actually want to prevent this, suggest turning off the extension with "-fno-dollars-in-identifiers" or similar. – tipaye Nov 21 '16 at 16:52
  • If I recall correctly it's implementation-defined. "Are there weird characters you can use?" The answer is yes but please be nice to people who'll read your code in the future. See: [cppreference: c++ identifiers](https://en.cppreference.com/w/cpp/language/identifiers) – viraltaco_ Jul 15 '21 at 05:08

4 Answers

36

According to the standard, the only legal characters are letters, digits and the underscore. The standard also requires that just about anything Unicode considers alphabetic be accepted (but only as single code-point characters). In practice, implementations offer extensions (e.g. some accept a $) and restrictions (most don't accept all of the required Unicode characters). If you want your code to be portable, restrict identifiers to the 26 unaccented letters, upper or lower case, the ten digits, and the '_'.
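As a rough illustration of that portable subset, here is a minimal sketch of a checker. The name `is_portable_identifier` is made up for this example, and the function deliberately ignores keywords and reserved names (such as those starting with a double underscore); it only tests the character set and the no-leading-digit rule, assuming an ASCII-compatible execution character set.

```cpp
#include <string>

// Hypothetical helpers for illustration only (not part of any library).
// Assumes an ASCII-compatible character set where 'a'..'z' are contiguous.
static bool is_portable_letter(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}

static bool is_portable_digit(char c) {
    return c >= '0' && c <= '9';
}

// Returns true when `name` uses only A-Z, a-z, 0-9 and '_' and does not
// start with a digit. Keywords and reserved names are NOT checked.
bool is_portable_identifier(const std::string& name) {
    if (name.empty() || !is_portable_letter(name[0])) {
        return false;
    }
    for (char c : name) {
        if (!is_portable_letter(c) && !is_portable_digit(c)) {
            return false;
        }
    }
    return true;
}
```

With this sketch, `is_portable_identifier("total_count")` is true, while `is_portable_identifier("$T$S")` and `is_portable_identifier("1abc")` are both false.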

James Kanze
17

It's an extension offered by some compilers; it is not part of the C standard.

MSVC:

Microsoft Specific

Only the first 2048 characters of Microsoft C++ identifiers are significant. Names for user-defined types are "decorated" by the compiler to preserve type information. The resultant name, including the type information, cannot be longer than 2048 characters. (See Decorated Names for more information.) Factors that can influence the length of a decorated identifier are:

  • Whether the identifier denotes an object of user-defined type or a type derived from a user-defined type.
  • Whether the identifier denotes a function or a type derived from a function.
  • The number of arguments to a function.

The dollar sign is also a valid identifier in Visual C++.

// dollar_sign_identifier.cpp
struct $Y1$ {
    void $Test$() {}
};

int main() {
    $Y1$ $x$;
    $x$.$Test$();
}

https://web.archive.org/web/20100216114436/http://msdn.microsoft.com/en-us/library/565w213d.aspx

Newest version: https://learn.microsoft.com/en-us/cpp/cpp/identifiers-cpp?redirectedfrom=MSDN&view=vs-2019

GCC:

6.42 Dollar Signs in Identifier Names

In GNU C, you may normally use dollar signs in identifier names. This is because many traditional C implementations allow such identifiers. However, dollar signs in identifiers are not supported on a few target machines, typically because the target assembler does not allow them.

http://gcc.gnu.org/onlinedocs/gcc/Dollar-Signs.html#Dollar-Signs

phuclv
3

To my knowledge, only letters (upper and lower case), digits (0 to 9) and _ are valid in variable names according to the standard (note: a variable name must not start with a digit, though).

All other characters should be compiler extensions.

Peter Mortensen
iammilind
-1

This is not good practice. Generally, you should only use alphanumeric characters and underscores in identifiers ([A-Za-z0-9_]).

Surface Level

Unlike some other languages (bash, Perl), C does not use $ to mark a variable reference, so the character has no special meaning to the language. In C this most likely falls under C11, 6.4.2, which lets an implementation accept additional implementation-defined characters in identifiers. That would explain why modern compilers appear to support it.

As for your C++ question, let's test it!

int main(void) {
    int $ = 0;
    return $;
}

With GCC/G++/Clang/Clang++, this does compile and run just fine.

Deeper Level

Compilers take source code, lex it into a token stream, parse that into an abstract syntax tree (AST), and then use the AST to generate code (e.g. assembly or LLVM IR). Your question really only concerns the first step, lexing.

The grammar (and thus the lexer implementation) of C/C++ does not treat $ as a punctuator, unlike commas, periods, skinny arrows, etc. A lexer that accepts $ in identifiers therefore keeps it inside the identifier token. Take the C code below:

int i_love_$ = 0;

After the lexer, this becomes a token stream like this:

["int", "i_love_$", "=", "0"]

If you were to take this code:

int i_love_$,_and_.s = 0;

The lexer would output a token stream like:

["int", "i_love_$", ",", "_and_", ".", "s", "=", "0"]

As you can see, because C/C++ doesn't give $ any special meaning, it stays part of the identifier, whereas punctuators such as commas and periods split the input into separate tokens.
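To make the token streams above concrete, here is a toy tokenizer. It is only a sketch and not how any real compiler is implemented: `tokenize` and `is_ident_char` are hypothetical names, digits are lumped together with identifier characters for simplicity, and every other non-space character becomes a one-character token. The only point is that `$` is classified with the identifier characters, as a compiler with the dollar-sign extension would do.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Toy classification: letters, digits, '_' and (as an extension) '$'
// can all appear inside an identifier-like token.
static bool is_ident_char(char c) {
    return std::isalnum(static_cast<unsigned char>(c)) || c == '_' || c == '$';
}

// Splits `src` into tokens: runs of identifier characters form one token,
// whitespace is skipped, and any other character is a token on its own.
std::vector<std::string> tokenize(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        const char c = src[i];
        if (std::isspace(static_cast<unsigned char>(c))) {
            ++i;  // skip whitespace between tokens
        } else if (is_ident_char(c)) {
            const std::size_t start = i;
            while (i < src.size() && is_ident_char(src[i])) {
                ++i;
            }
            tokens.push_back(src.substr(start, i - start));
        } else {
            tokens.push_back(std::string(1, c));  // punctuator such as ',' or '.'
            ++i;
        }
    }
    return tokens;
}
```

Calling `tokenize("int i_love_$,_and_.s = 0;")` yields the same split as the second token stream above, with the trailing `;` as one extra token at the end.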

Isacc Barker
  • 1
    "As of C++ 17, this is standards conformant, see Draft n4659" This is false as of current drafts (https://timsong-cpp.github.io/cppwp/uaxid.def), valid identifiers are made from things part of XID_Start / XID_Continue unicode classes and $ is part of neither: http://www.zuga.net/articles/unicode/character/record/0024.html – Jean-Michaël Celerier Aug 01 '21 at 11:11