23

I am trying to save the output of a command to a file. The command is:

clang -Xclang -ast-dump -fsyntax-only main.cpp > output.txt

However, when I open the resulting output.txt (in gedit and jEdit on Ubuntu), I see this:

[0;1;32mTranslationUnitDecl[0m[0;33m 0x4192020[0m <[0;33m<invalid sloc>[0m> [0;33m<invalid sloc>[0m
[0;34m|-[0m[0;1;32mTypedefDecl[0m[0;33m 0x4192558[0m <[0;33m<invalid sloc>[0m> [0;33m<invalid sloc>[0m implicit[0;1;36m __int128_t[0m [0;32m'__int128'[0m
[0;34m| `-[0m[0;32mBuiltinType[0m[0;33m 0x4192270[0m [0;32m'__int128'[0m
[0;34m|-[0m[0;1;32mTypedefDecl[0m[0;33m 0x41925b8[0m <[0;33m<invalid sloc>[0m> [0;33m<invalid sloc>[0m implicit[0;1;36m __uint128_t[0m [0;32m'unsigned __int128'[0m
[0;34m| `-[0m[0;32mBuiltinType[0m[0;33m 0x4192290[0m [0;32m'unsigned __int128'[0m
...

When it should really look like this:

TranslationUnitDecl 0x4e46020 <<invalid sloc>> <invalid sloc>
|-TypedefDecl 0x4e46558 <<invalid sloc>> <invalid sloc> implicit __int128_t '__int128'
| `-BuiltinType 0x4e46270 '__int128'
|-TypedefDecl 0x4e465b8 <<invalid sloc>> <invalid sloc> implicit __uint128_t 'unsigned __int128'
| `-BuiltinType 0x4e46290 'unsigned __int128'
...

I thought it might be an encoding problem, so I checked the file's encoding with file -bi output.txt, which prints text/plain; charset=us-ascii.

I thought converting the file to UTF-8 might fix the problem, so I tried this:

clang -Xclang -ast-dump -fsyntax-only main.cpp | iconv -f us-ascii -t UTF-8 > output.txt

but it didn't make a difference.

What can I do to solve this problem?

The problem isn't that I'm trying to view the syntax-highlighted version (I didn't have a problem viewing it in the first place). I need to save the AST generated by clang to a file and then parse it, which would be difficult with the colour information left in.

Jarmund
abzullah

3 Answers

55

It has nothing to do with code pages or encoding. Your output isn't plain text: it contains sequences like [0;1;32m. Each of these sequences is preceded by an escape character (which your editor does not show), and together they are instructions to the terminal to display text in bold, italics, various colors, and so on. This makes the output easier to read if your terminal supports it.
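To make the hidden escape character visible, here is a minimal sketch. It assumes the missing byte is ESC (octal 033) and rebuilds one line of the question's output with printf. The function names sample_line and show_escapes are made up for illustration.

```shell
#!/bin/sh
# Emit a sample line with a real ESC byte (octal 033) in front of each "[...m" sequence.
# The visible text is taken from the question; the ESC bytes are the assumed missing part.
sample_line() {
    printf '\033[0;1;32mTranslationUnitDecl\033[0m\n'
}

# cat -v prints the non-printing ESC byte as the two visible characters ^[
show_escapes() {
    cat -v
}

sample_line | show_escapes
# prints: ^[[0;1;32mTranslationUnitDecl^[[0m
```

Running cat -v output.txt on the real file should show the same ^[ prefix before each sequence if this assumption holds.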

There should be an option to tell clang not to try to beautify the output, but use plain text instead. Check the manual. (I don't have one handy, so I can't tell you what the proper command would be.)
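If no such option turns up, one possible workaround is to strip the color sequences after the fact. The sketch below assumes the output contains only color/style sequences of the form ESC [ digits-and-semicolons m; strip_colour is a hypothetical helper name. clang does have a -fno-color-diagnostics flag, but whether it also applies to the AST dump in your version is an assumption you would need to check.

```shell
#!/bin/sh
# Remove colour/style sequences (ESC [ digits-and-semicolons m) from stdin.
strip_colour() {
    esc=$(printf '\033')               # a literal ESC byte, built portably
    sed "s/${esc}\[[0-9;]*m//g"
}

# Demo on one line shaped like the question's output (ESC bytes restored).
printf '\033[0;34m|-\033[0m\033[0;1;32mTypedefDecl\033[0m\033[0;33m 0x4192558\033[0m\n' | strip_colour
# prints: |-TypedefDecl 0x4192558

# Possible use with the command from the question:
#   clang -Xclang -ast-dump -fsyntax-only main.cpp | strip_colour > output.txt
# Or, assuming your clang applies the flag to the AST dump as well:
#   clang -Xclang -ast-dump -fsyntax-only -fno-color-diagnostics main.cpp > output.txt
```

Filtering works regardless of which program produced the colors, but it only handles the sequence shape described above; a switch in the program itself is the cleaner fix when one exists.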

techraf
Tonny
13

Alternatively, instead of removing the colours from the output, you can view the coloured output in your terminal by using the raw option of less:

less -r output.txt

2

Those characters, such as [0;33m, look like terminal control codes to me. They're part of a set of escape sequences frequently used to apply colors to text in the terminal. Sequences like these are also often used in raw form to color the bash prompt itself. Here's what I've been using in .bashrc for years on all of my machines:

export PS1='\[\033[1;33m\]\u\[\033[1;35m\]@\[\033[1;32m\]\h\[\033[0;36m\]\w\[\033[1;37m\]\$ \[\033[0;37m\]'

(Most think it's ugly, but I like it).

See if you are able to find a switch to remove any color coding or the like from the output of your commands and see if that helps.

Jarmund