Try file, then file -k, then dos2unix -ih
file will usually be enough. But for tough cases try file -k or dos2unix -ih.
Details below.
Try file -k
Short version: file -k somefile.txt will tell you line terminators:
- It will output "with CRLF line terminators" for DOS/Windows line terminators.
- It will output "with CR line terminators" for old Mac (pre-OS X) line terminators.
- It will just output "text" for Linux/Unix "LF" line terminators. (So if it does not explicitly mention any kind of line terminators, that implicitly means "LF line terminators".)
And for extra weird cases, when you have mixed line terminators:
$ echo -ne '1\n2\r\n3\r' | file -k -
/dev/stdin: ASCII text, with CRLF, CR, LF line terminators
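If you want to double-check a mixed file without relying on how file words its output, you can count the terminator bytes yourself. Here's a minimal sketch in POSIX shell. The name count_eol is something I made up for illustration (it is not part of file or dos2unix), and it assumes your grep supports -o, which GNU and BSD grep do. It reads stdin and prints how many CRLF, lone LF and lone CR terminators it saw.

```shell
# Hypothetical helper (not part of file or dos2unix).
# Reads stdin and prints CRLF / lone LF / lone CR terminator counts.
count_eol() {
    # Keep only CR and LF bytes, then map them to letters: CR -> R, LF -> N.
    marks=$(tr -cd '\r\n' | tr '\r\n' 'RN')
    # Every "RN" pair is one CRLF terminator.
    crlf=$(printf '%s' "$marks" | grep -o 'RN' | wc -l)
    # Total CR and LF bytes, including the ones that belong to CRLF pairs.
    cr=$(printf '%s' "$marks" | tr -cd 'R' | wc -c)
    lf=$(printf '%s' "$marks" | tr -cd 'N' | wc -c)
    # Subtract the pairs to get the lone LF and lone CR counts.
    echo "CRLF=$((crlf)) LF=$((lf - crlf)) CR=$((cr - crlf))"
}

printf '1\n2\r\n3\r' | count_eol
```

For the three-line sample above this prints CRLF=1 LF=1 CR=1, which matches what file -k reported: one terminator of each kind.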
For the long version, see below.
Real world example: Certificate Encoding
I sometimes have to check this for PEM certificate files.
The trouble with regular file is this: Sometimes it's trying to be too smart/too specific.
Let's try a little quiz: I've got some files. And one of these files has different line terminators. Which one?
(By the way: this is what one of my typical "certificate work" directories looks like.)
Let's try regular file:
$ file -- *
0.example.end.cer: PEM certificate
0.example.end.key: PEM RSA private key
1.example.int.cer: PEM certificate
2.example.root.cer: PEM certificate
example.opensslconfig.ini: ASCII text
example.req: PEM certificate request
Huh. It's not telling me the line terminators. And I already knew that those were cert files. I didn't need "file" to tell me that.
Some network appliances are really, really picky about how their certificate files are encoded. That's why I need to know.
What else can you try?
You might try dos2unix with the --info switch like this:
$ dos2unix --info -- *
      37       0       0  no_bom    text    0.example.end.cer
       0      27       0  no_bom    text    0.example.end.key
       0      28       0  no_bom    text    1.example.int.cer
       0      25       0  no_bom    text    2.example.root.cer
       0      35       0  no_bom    text    example.opensslconfig.ini
       0      19       0  no_bom    text    example.req
So that tells you that: yup, "0.example.end.cer" must be the odd man out. But what kind of line terminators are there? Do you know the dos2unix output format by heart? (I don't.)
But fortunately there's the --keep-going (or -k for short) option in file:
$ file --keep-going -- *
0.example.end.cer: PEM certificate\012- , ASCII text, with CRLF line terminators\012- data
0.example.end.key: PEM RSA private key\012- , ASCII text\012- data
1.example.int.cer: PEM certificate\012- , ASCII text\012- data
2.example.root.cer: PEM certificate\012- , ASCII text\012- data
example.opensslconfig.ini: ASCII text\012- data
example.req: PEM certificate request\012- , ASCII text\012- data
Excellent! Now we know that our odd file has DOS (CRLF) line terminators. (And the other files have Unix (LF) line terminators. This is not explicit in this output. It's implicit. It's just the way file expects a "regular" text file to be.)
(If you wanna share my mnemonic: "L" is for "Linux" and for "LF".)
Now let's convert the culprit and try again:
$ dos2unix -- 0.example.end.cer
$ file --keep-going -- *
0.example.end.cer: PEM certificate\012- , ASCII text\012- data
0.example.end.key: PEM RSA private key\012- , ASCII text\012- data
1.example.int.cer: PEM certificate\012- , ASCII text\012- data
2.example.root.cer: PEM certificate\012- , ASCII text\012- data
example.opensslconfig.ini: ASCII text\012- data
example.req: PEM certificate request\012- , ASCII text\012- data
Good. Now all certs have Unix line terminators.
Try dos2unix -ih
I didn't know this when I was writing the example above, but it turns out that dos2unix will give you a header line if you use -ih (short for --info=h), like so:
$ dos2unix -ih -- *
     DOS    UNIX     MAC  BOM       TXTBIN  FILE
       0      37       0  no_bom    text    0.example.end.cer
       0      27       0  no_bom    text    0.example.end.key
       0      28       0  no_bom    text    1.example.int.cer
       0      25       0  no_bom    text    2.example.root.cer
       0      35       0  no_bom    text    example.opensslconfig.ini
       0      19       0  no_bom    text    example.req
And another "actually" moment: The header format is really easy to remember. Here are two mnemonics:
- It's DUMB (left to right: d for Dos, u for Unix, m for Mac, b for BOM).
- And also: "DUM" is just the alphabetical ordering of D, U and M.
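Once you know the column order, you can also filter the dos2unix -ih table mechanically. Here's a minimal sketch, assuming the six-column layout shown above and file names without spaces. The name dos_files is made up for illustration. It reads the table from stdin, skips the header line, and prints the FILE column for every row whose DOS count is non-zero.

```shell
# Hypothetical filter for the "dos2unix -ih" table shown above.
# Assumes six whitespace-separated columns and no spaces in file names.
dos_files() {
    # NR > 1 skips the header line; $1 is the DOS column, $6 is FILE.
    awk 'NR > 1 && $1 > 0 { print $6 }'
}

# Usage: dos2unix -ih -- * | dos_files
```

Run in the example directory before the conversion, this should print only 0.example.end.cer.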
Further reading