492

I need to compare two binary files and get the output in the form

<fileoffset-hex> <file1-byte-hex> <file2-byte-hex>

for every different byte. So if file1.bin is

  00 90 00 11

in binary form and file2.bin is

  00 91 00 10

I want to get something like

  00000001 90 91
  00000003 11 10

Is there a way to do this in Linux? I know about cmp -l but it uses a decimal system for offsets and octal for bytes which I would like to avoid.

18 Answers

282

As ~quack pointed out:

 % xxd b1 > b1.hex
 % xxd b2 > b2.hex

And then

 % diff b1.hex b2.hex

or

 % vimdiff b1.hex b2.hex
akira
  • 63,447
267

This will print the offset and bytes in hex:

cmp -l file1.bin file2.bin | gawk '{printf "%08X %02X %02X\n", $1, strtonum(0$2), strtonum(0$3)}'

Or do $1-1 to have the first printed offset start at 0.

cmp -l file1.bin file2.bin | gawk '{printf "%08X %02X %02X\n", $1-1, strtonum(0$2), strtonum(0$3)}'

Unfortunately, strtonum() is specific to GAWK, so for other versions of awk—e.g., mawk—you will need to use an octal-to-decimal conversion function. For example,

cmp -l file1.bin file2.bin | mawk 'function oct2dec(oct,     dec) {for (i = 1; i <= length(oct); i++) {dec *= 8; dec += substr(oct, i, 1)}; return dec} {printf "%08X %02X %02X\n", $1, oct2dec($2), oct2dec($3)}'

Broken out for readability:

cmp -l file1.bin file2.bin |
    mawk 'function oct2dec(oct,    dec) {
              for (i = 1; i <= length(oct); i++) {
                  dec *= 8;
                  dec += substr(oct, i, 1)
              };
              return dec
          }
          {
              printf "%08X %02X %02X\n", $1, oct2dec($2), oct2dec($3)
          }'
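If neither gawk nor a hand-written conversion function is available, the shell's own printf can do the octal conversion, because printf parses a numeric argument with a leading 0 as an octal constant. The following is a minimal sketch, assuming the usual cmp -l output (a 1-based decimal offset followed by two octal byte values). The function name hexcmp is made up for illustration:

```shell
# hexcmp FILE1 FILE2: print "<offset-hex> <byte1-hex> <byte2-hex>" for each
# differing byte. hexcmp is a hypothetical helper name, not an existing tool.
hexcmp() {
    cmp -l "$1" "$2" |
    while read -r off a b; do
        # cmp -l offsets are 1-based decimal; the byte values are octal, so
        # prefix them with 0 to make printf read them as octal constants.
        printf '%08X %02X %02X\n' "$(( $off - 1 ))" "0$a" "0$b"
    done
}
```

With the two files from the question, hexcmp file1.bin file2.bin prints 00000001 90 91 and 00000003 11 10. If the files differ in length, cmp still reports the EOF on stderr.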
216

diff + xxd

Try diff in the following combination of zsh/bash process substitution:

diff -y <(xxd foo1.bin) <(xxd foo2.bin)

Where:

  • -y shows you differences side-by-side (optional).
  • xxd is a CLI tool that creates a hex dump of a binary file.
  • Add -W200 to diff for wider output (of 200 characters per line).
  • For colors, use colordiff as shown below.

colordiff + xxd

If you have colordiff installed, it can colorize the diff output, e.g.:

colordiff -y <(xxd foo1.bin) <(xxd foo2.bin)

Otherwise install via: sudo apt-get install colordiff.

vimdiff + xxd

You can also use vimdiff, e.g.

vimdiff <(xxd foo1.bin) <(xxd foo2.bin)

Hints:

  • If the files are too big, add a length limit (e.g. -l1000) to each xxd call.
kenorb
  • 26,615
80

There's a tool called DHEX which may do the job, and there's another tool called VBinDiff.

For a strictly command-line approach, try jojodiff.

njd
  • 11,426
48

diff + od method that works for byte addition / deletion

diff <(od -An -tx1 -w1 -v file1) \
     <(od -An -tx1 -w1 -v file2)

which you may want to wrap in a Bash function:

bdiff() {
  diff <(od -An -tx1 -w1 -v "$1") \
       <(od -An -tx1 -w1 -v "$2")
}

Generate a test case with a single removal of byte 64:

for i in `seq 128`; do printf "%02x" "$i"; done | xxd -r -p > file1
for i in `seq 128`; do if [ "$i" -ne 64 ]; then printf "%02x" $i; fi; done | xxd -r -p > file2

Output:

64d63
<  40

If you also want to see the ASCII version of the character:

bdiff() (
  f() (
    od -An -tx1c -w1 -v "$1" | paste -d '' - -
  )
  diff <(f "$1") <(f "$2")
)

bdiff file1 file2

Output:

64d63
<   40   @

Tested on Ubuntu 16.04.

I prefer od over xxd because:

  • it is POSIX, xxd is not (comes with Vim)
  • has the -An to remove the address column without awk.

Command explanation:

  • -An removes the address column. This is important otherwise all lines would differ after a byte addition / removal.
  • -w1 puts one byte per line, so that diff can consume it. It is crucial to have one byte per line, or else every line after a deletion would become out of phase and differ. Unfortunately, this is not POSIX, but present in GNU.
  • -tx1 is the representation you want, change to any possible value, as long as you keep 1 byte per line.
  • -v prevents asterisk repetition abbreviation * which might interfere with the diff
  • paste -d '' - - joins every two lines. We need it because the hex and ASCII go into separate adjacent lines. Taken from: https://stackoverflow.com/questions/8987257/concatenating-every-other-line-with-the-next
  • we use parenthesis () to define bdiff instead of {} to limit the scope of the inner function f, see also: https://stackoverflow.com/questions/8426077/how-to-define-a-function-inside-another-function-in-bash
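Because -w1 is a GNU extension, one portable way to get one byte per line is to let od use its default width and split the output with tr. The following is a minimal sketch under that assumption. hexlines and bdiff_posix are hypothetical names, and temporary files (created with mktemp, which is widely available but not itself POSIX) stand in for process substitution so that it also runs in a plain sh:

```shell
# hexlines FILE: print FILE as one two-digit hex byte per line, no addresses.
hexlines() {
    # tr turns every space into a newline and squeezes repeats;
    # grep . drops the empty line left at the start.
    od -An -v -tx1 "$1" | tr -s ' ' '\n' | grep .
}

# bdiff_posix FILE1 FILE2: diff the two one-byte-per-line dumps.
bdiff_posix() {
    tmp=$(mktemp -d) || return 2
    hexlines "$1" > "$tmp/a"
    hexlines "$2" > "$tmp/b"
    diff "$tmp/a" "$tmp/b"
    status=$?
    rm -rf "$tmp"
    return "$status"    # same exit status as diff: 0 same, 1 different
}
```

Used as bdiff_posix file1 file2, this gives the same kind of output as the GNU-only version above, only without the leading space that od -w1 puts before each byte.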

16

Short answer

vimdiff <(xxd -c1 -p first.bin) <(xxd -c1 -p second.bin)

When using hexdumps and text diff to compare binary files, especially xxd, the additions and removals of bytes become shifts in addressing which might make it difficult to see. This method tells xxd to not output addresses, and to output only one byte per line, which in turn shows exactly which bytes were changed, added, or removed. You can find the addresses later by searching for the interesting sequences of bytes in a more "normal" hexdump (output of xxd first.bin).
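With one byte per line, line N of either dump corresponds to byte offset N-1 in that file, so a line number reported by the diff can also be turned into a hex offset directly. A small sketch of that arithmetic follows; line2off is a hypothetical helper name:

```shell
# line2off N: convert a 1-based line number from a one-byte-per-line dump
# into the 0-based file offset, printed in hex.
line2off() {
    printf '0x%08X\n' "$(( $1 - 1 ))"
}

line2off 64    # prints 0x0000003F
```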

13

The firmware analysis tool binwalk also has this as a feature through its -W/--hexdump command line option, which can, among other things, show only the differing bytes:

    -W, --hexdump                Perform a hexdump / diff of a file or files
    -G, --green                  Only show lines containing bytes that are the same among all files
    -i, --red                    Only show lines containing bytes that are different among all files
    -U, --blue                   Only show lines containing bytes that are different among some files
    -w, --terse                  Diff all files, but only display a hex dump of the first file

In OP's example when doing binwalk -W file1.bin file2.bin:

binwalk -W file1.bin file2.bin

Add | less -r for paging.

phk
  • 405
13

I'd recommend hexdump for dumping binary files to textual format and kdiff3 for diff viewing.

hexdump myfile1.bin > myfile1.hex
hexdump myfile2.bin > myfile2.hex
kdiff3 myfile1.hex myfile2.hex
BugoK
  • 131
10

hexdiff is a program designed to do exactly what you're looking for.

Usage:

hexdiff file1 file2

It displays the hex (and 7-bit ASCII) of the two files one above the other, with any differences highlighted. Look at man hexdiff for the commands to move around in the file, and a simple q will quit.

kenorb
  • 26,615
Mick
  • 241
4

It may not strictly answer the question, but I use this for diffing binaries:

gvim -d <(xxd -c 1 ~/file1.bin | awk '{print $2, $3}') <(xxd -c 1 ~/file2.bin | awk '{print $2, $3}')

It prints both files out as hex and ASCII values, one byte per line, and then uses Vim's diff facility to render them visually.

2

Below is a Perl script, colorbindiff, which performs a binary diff that takes into account not only byte changes but also byte additions and deletions, like a text diff does (many of the solutions proposed here only handle byte changes). It's also available on GitHub.

It displays the results side by side with colors, which greatly facilitates analysis.

To use it:

perl colorbindiff.pl FILE1 FILE2

The script:

#!/usr/bin/perl
#########################################################################
#
# VBINDIFF.PL : A side-by-side visual diff for binary files.
#               Consult usage subroutine below for help.
#
# Copyright (C) 2020 Jerome Lelasseux jl@jjazzlab.com
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program.  If not, see <http://www.gnu.org/licenses/>.
#
#
#########################################################################

use warnings;
use strict;
use Term::ANSIColor qw(colorstrip colored);
use Getopt::Long qw(GetOptions);
use File::Temp qw(tempfile);
use constant BLANK => "..";
use constant BUFSIZE =>  64 * 1024;     # 64kB

sub usage
{
    print "USAGE: $0 [OPTIONS] FILE1 FILE2\n";
    print "Show a side-by-side binary comparison of FILE1 and FILE2. Show byte modifications but also additions and deletions, whatever the number of changed bytes. Rely on the 'diff' external command such as found on Linux or Cygwin. The algorithm is not suited for large and very different files.\n";
    print "Author: Jerome Lelasseux \@2021\n";
    print "OPTIONS: \n";
    print " --cols=N       : display N columns of bytes. Default is 16.\n";
    print " --no-color     : don't colorize output. Needed if you view the output in an editor.\n";
    print " --no-marker    : don't use the change markers (+ for addition, - for deletion, * for modified).\n";
    print " --no-ascii     : don't show the ascii columns.\n";
    print " --only-changes : only display lines with changes.\n";
    exit;
}

# Command line arguments
my $maxCols = 16;
my $noColor = 0;
my $noMarker = 0;
my $noAscii = 0;
my $noCommon = 0;
GetOptions(
    'cols=i'       => \$maxCols,
    'no-ascii'     => \$noAscii,
    'no-color'     => \$noColor,
    'no-marker'    => \$noMarker,
    'only-changes' => \$noCommon
) or usage();
usage() unless ($#ARGV == 1);
my ($file1, $file2) = (@ARGV);


# Convert input files into hex lists
my $fileHex1 = createHexListFile($file1);
my $fileHex2 = createHexListFile($file2);


# Process diff -y output to get an easy-to-read side-by-side view
my $colIndex = 0;
my $oldPtr = 0;
my $newPtr = 0;
my $oldLineBuffer = sprintf("0x%04X ", 0);
my $newLineBuffer = sprintf("0x%04X ", 0);
my $oldCharBuffer;
my $newCharBuffer;
my $isDeleting = 0;
my $isAdding = 0;
my $isUnchangedLine = 1;

open(my $fh, '-|', qq(diff -y $fileHex1 $fileHex2)) or die $!;
while (<$fh>)
{
    # Parse line by line the output of the 'diff -y' on the 2 hex list files.
    # We expect:
    # "xx      | yy" for a modified byte
    # "        > yy" for an added byte
    # "xx      <"    for a deleted byte
    # "xx        xx" for identical bytes

   my ($oldByte, $newByte);
   my ($oldChar, $newChar);
   if (/\|/)
   {
        # Changed
        if ($isDeleting || $isAdding)
        {
            printLine($colIndex);
        }
        $isAdding = 0;
        $isDeleting = 0;
        $isUnchangedLine = 0;

        /([a-fA-F0-9]+)([^a-fA-F0-9]+)([a-fA-F0-9]+)/;
        $oldByte = formatByte($1, 3);
        $oldChar = toPrintableChar($1, 3);
        $newByte = formatByte($3, 3);
        $newChar = toPrintableChar($3, 3);
        $oldPtr++;
        $newPtr++;
   }
   elsif (/</)
   {
        # Deleted in new
        if ($isAdding)
        {
            printLine($colIndex);
        }
        $isAdding = 0;
        $isDeleting = 1;
        $isUnchangedLine = 0;

        /([a-fA-F0-9]+)/;
        $oldByte=formatByte($1, 2);
        $oldChar=toPrintableChar($1, 2);
        $newByte=formatByte(BLANK, 2);
        $newChar=colorize(".", 2);
        $oldPtr++;
   }
   elsif (/>/)
   {
        # Added in new
        if ($isDeleting)
        {
            printLine($colIndex);
        }
        $isAdding = 1;
        $isDeleting = 0;
        $isUnchangedLine = 0;

        /([a-fA-F0-9]+)/;
        $oldByte=formatByte(BLANK, 1);
        $oldChar=colorize(".", 1);
        $newByte=formatByte($1, 1);
        $newChar=toPrintableChar($1, 1);
        $newPtr++;
   }
   else
   {
        # Unchanged
        if ($isDeleting || $isAdding)
        {
            printLine($colIndex);
        }
        $isDeleting = 0;
        $isAdding = 0;

        /([a-fA-F0-9]+)([^a-fA-F0-9]+)([a-fA-F0-9]+)/;
        $oldByte=formatByte($1, 0);
        $oldChar=toPrintableChar($1, 0);
        $newByte=formatByte($3, 0);
        $newChar=toPrintableChar($3, 0);
        $oldPtr++;
        $newPtr++;
   }

   # Append the bytes to the old and new buffers
    $oldLineBuffer .= $oldByte;
    $oldCharBuffer .= $oldChar;
    $newLineBuffer .= $newByte;
    $newCharBuffer .= $newChar;
    $colIndex++;
    if ($colIndex == $maxCols)
    {
        printLine();
    }
}

printLine($colIndex);    # Possible remaining line


#================================================================
# subroutines
#================================================================

# $1 a string representing a data byte
# $2 0=unchanged, 1=added, 2=deleted, 3=changed
# return the formatted string (color/marker)
sub formatByte
{
    my ($byte, $type) = @_;
    my $res;
    if (!$noMarker)
    {
        if    ($type == 0  || $byte eq BLANK)     { $res = "  " . $byte; }    # Unchanged or blank
        elsif ($type == 1)     { $res = " +" . $byte; }    # Added
        elsif ($type == 2)     { $res = " -" . $byte; }    # Deleted
        elsif ($type == 3)     { $res = " *" . $byte; }    # Changed
        else  { die "Error"; }
    } else
    {
        $res = " " . $byte;
    }
    $res = colorize($res, $type);
    return $res;
}

# $1 a string
# $2 0=unchanged, 1=added, 2=deleted, 3=changed
# return the colorized string according to $2
sub colorize
{
    my ($res, $type) = @_;
    if (!$noColor)
    {
        if ($type == 0)     {  }        # Unchanged
        elsif ($type == 1)     { $res = colored($res, 'bright_green'); }  # Added
        elsif ($type == 2)     { $res = colored($res, 'bright_red'); }    # Deleted
        elsif ($type == 3)     { $res = colored($res, 'bright_cyan'); }   # Changed
        else   { die "Error"; }
    }
    return $res;
}

# Print the buffered line
sub printLine
{
    if (length($oldLineBuffer) <=10)
    {
        return;        # No data to display
    }

    if (!$isUnchangedLine)
    {
        # Colorize and add a marker to the address of each line if some bytes are changed/added/deleted
        my $prefix = substr($oldLineBuffer, 0, 6) . ($noMarker ? " " : "*");
        $prefix = colored($prefix, 'magenta') unless $noColor;
        $oldLineBuffer =~ s/^......./$prefix/;
        $prefix = substr($newLineBuffer, 0, 6) . ($noMarker ? " " : "*");
        $prefix = colored($prefix, 'magenta') unless $noColor;
        $newLineBuffer =~ s/^......./$prefix/;
    }

    my $oldCBuf = $noAscii ? "" : $oldCharBuffer;
    my $newCBuf = $noAscii ? "" : $newCharBuffer;
    my $spacerChars = $noAscii ? "" : (" " x ($maxCols - $colIndex));
    my $spacerData = ($noMarker ? "   " : "    ") x ($maxCols - $colIndex);
    if (!($noCommon && $isUnchangedLine))
    {
        print "${oldLineBuffer}${spacerData} ${oldCBuf}${spacerChars}  ${newLineBuffer}${spacerData} ${newCBuf}\n";
    }

    # Reset buffers and counters
    $oldLineBuffer = sprintf("0x%04X ", $oldPtr);
    $newLineBuffer = sprintf("0x%04X ", $newPtr);
    $oldCharBuffer = "";
    $newCharBuffer = "";
    $colIndex = 0;
    $isUnchangedLine = 1;
}

# Convert a hex byte string into a printable char, or '.'.
# $1 = hex str such as A0
# $2 0=unchanged, 1=added, 2=deleted, 3=changed
# Return the corresponding char, possibly colorized
sub toPrintableChar
{
    my ($hexByte, $type) = @_;
    my $char = chr(hex($hexByte));
    $char = ($char =~ /[[:print:]]/) ? $char : ".";
    return colorize($char, $type);
}

# Convert file $1 into a text file with 1 hex byte per line.
# $1=input file name
# Return the output file name
sub createHexListFile
{
    my ($inFileName) = @_;
    my $buffer;
    my $in_fh;
    open($in_fh,  "<:raw", $inFileName) || die "$0: cannot open $inFileName for reading: $!";
    my ($out_fh, $filename) = tempfile();

    while (my $nbReadBytes = read($in_fh, $buffer, BUFSIZE))
    {
        my @hexBytes = unpack("H2" x $nbReadBytes, $buffer);
        foreach my $hexByte (@hexBytes)
        {
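            # 'or' (not '||') so that die applies to print's return value
            print $out_fh "$hexByte\n" or die "couldn't write to $filename: $!";
        }
    }
    close($in_fh);
    # Flush the hex list to disk before diff reads it
    close($out_fh) or die "couldn't close $filename: $!";
    return $filename;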
}
jjazzboss
  • 121
2

How to compare binary files, hex files, and Intel hex firmware files with meld

Mirror mirror on the wall, which is the most amazing solution of them all?

It's this one: from @kenorb!--for sure! I upvoted it. Now, let me show you how amazing it looks and how easy it is to use with meld:

Quick summary

Get the latest version of my hex2xxdhex function in my eRCaGuy_dotfiles repo here: .bash_useful_functions. Copy and paste that function to the bottom of your ~/.bashrc file. Then, re-source your ~/.bashrc file with . ~/.bashrc. Finally, use my hex2xxdhex function like this:

# pass multiple `.hex` files to convert to `.bin`, `.xxd.hex`, and
# `.xxd_short.hex` files
hex2xxdhex path/to/myfile1.hex path/to/myfile2.hex
# then compare the two output ".xxd.hex" or ".xxd_short.hex" files with `meld`
meld path/to/myfile1.xxd_short.hex path/to/myfile2.xxd_short.hex
meld path/to/myfile1.xxd.hex path/to/myfile2.xxd.hex

In the above comparison of the two *.xxd*.hex files, you'll see all of the hex chars, followed by the binary/ASCII chars in a column on the right-hand-side, allowing you to more easily identify hex file differences between the two files.

The .xxd_short.hex files are simply the same as the .xxd.hex files, except with all lines containing only zeros removed. This way, if your hex file places portions of your firmware at drastically different address locations, all of the padded zeros between the two address locations are removed.

If your initial .hex file is 3.5 MB, your .bin file might be 45 MB, your .xxd.hex file might be 200 MB, and your .xxd_short.hex file (with all rows of pure zeros removed) might be 5 MB. meld can compare two 5 MB files just fine, but it struggles with 200 MB files. That's why I generate the .xxd_short.hex version too.
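The zero-row filtering can be approximated with a single grep over xxd's default output. This is only a sketch of the idea, not the hex2xxdhex implementation from the repo. drop_zero_rows is a hypothetical name, and the pattern assumes xxd's default layout of eight 4-digit groups per full 16-byte row:

```shell
# drop_zero_rows: read xxd output on stdin and drop every full 16-byte row
# whose hex columns are all zero. A shorter final row is always kept.
drop_zero_rows() {
    grep -Ev '^[0-9a-f]+: (0000 ){8} '
}

# Example: xxd file1.bin | drop_zero_rows > file1.xxd_short.hex
```

The addresses of the surviving rows are untouched, so the two shortened dumps still line up by address in meld.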

Other options:

# Compare **two** binary files in meld
meld <(xxd file1.bin) <(xxd file2.bin)

# Compare three binary files in meld
meld <(xxd file1.bin) <(xxd file2.bin) <(xxd file3.bin)

# (note that for regular text files, just do this)
meld file1.txt file2.txt

# Compare ASCII hex files which were previously created with
# `xxd file1.bin file1.hex`
meld file1.hex file2.hex

# one-liner to compare Intel hex my_firmware1.hex and my_firmware2.hex
objcopy --input-target=ihex --output-target=binary my_firmware1.hex 1.bin \
 && objcopy --input-target=ihex --output-target=binary my_firmware2.hex 2.bin \
 && meld <(xxd 1.bin) <(xxd 2.bin)

# one-liner to compare Intel hex my_firmware1.hex and my_firmware2.hex
# with the Microchip XC32 compiler toolchain!
xc32-objcopy --input-target=ihex --output-target=binary my_firmware1.hex 1.bin \
 && xc32-objcopy --input-target=ihex --output-target=binary my_firmware2.hex 2.bin \
 && meld <(xxd 1.bin) <(xxd 2.bin)

# Compare two binary files using CLI tools only (no meld GUI) since you might
# be ssh'ed onto a remote machine
diff -u --color=always <(xxd file1.bin) <(xxd file2.bin) | less -RFX

# Another nice tool to use, which is CLI-based, but GUI-like (probably via the
# ncurses library, I'm guessing)
vbindiff file1.bin file2.bin

Details:

Compare binary files with meld

First, install it in Linux Ubuntu with sudo apt install meld. Then, use it to compare binary files like this:

# Compare binary files in meld
meld <(xxd file1.bin) <(xxd file2.bin)

(note that for regular text files, just do this)

meld file1.txt file2.txt

The first command above gives you this view, highlighting the exact differences, on a line-by-line and character-by-character level, between the left and right files. Notice the highlighted slivers in the right scroll bar too, which indicate where lines differ in the entire file:

Navigation in Meld:

  1. You can find the next change with Alt + Down and the previous change with Alt + Up.
    1. Or, you can hover your cursor over the center space exactly between the left and right sides, and scroll up and down with the mouse wheel to jump just between the changes.
  2. You can type into and edit the left or right side, and save afterwards.
  3. You can use Ctrl + F in the left or right side to find.
    1. Limitation: it will not search around line wraps. For that, try vbindiff instead.

Great tool! I am going to use this extensively now as I compare microcontroller .hex firmware files to identify minor differences between some builds, such as changed IP addresses, embedded filenames, or timestamps.

The ingeniousness of the command above is how it uses xxd first to convert a binary file to a hex + binary ASCII-text-side-bar view so that you can see human-readable text as well as the hex code.

Compare standard .hex files with meld

Perhaps you have previously converted binary files to .hex files, like this:

# convert binary files to ASCII hex files with a human-readable binary
# ASCII-text-side-bar on the right
xxd file1.bin file1.hex
xxd file2.bin file2.hex

In that case, just use meld directly:

meld file1.hex file2.hex

Compare Intel .hex microcontroller firmware files with meld

Intel .hex files don't have the nice human-readable binary ASCII-text-side-bar on the right. So, first we must convert them to binary .bin files, using objcopy, as this answer shows, like this:

# Convert an Intel hex firmware file to binary
objcopy --input-target=ihex --output-target=binary my_firmware1.hex my_firmware1.bin
objcopy --input-target=ihex --output-target=binary my_firmware2.hex my_firmware2.bin

Do not forget the my_firmware1.bin part at the end or else you'll get an unexpected behavior: my_firmware1.hex will be converted to binary in-place! Oh no! There goes your hex file!
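One way to guard against that mistake is a small wrapper that refuses to run unless both an input path and a distinct output path are given. This is a sketch only: ihex2bin is a hypothetical name, and the OBJCOPY variable is an assumption added here so that a toolchain-specific binary can be substituted for plain objcopy:

```shell
# ihex2bin IN.hex OUT.bin: convert Intel hex to raw binary, never in place.
# Set OBJCOPY to use a toolchain-specific objcopy (defaults to plain objcopy).
ihex2bin() {
    if [ "$#" -ne 2 ] || [ "$1" = "$2" ]; then
        echo "usage: ihex2bin IN.hex OUT.bin (OUT.bin must differ from IN.hex)" >&2
        return 2
    fi
    "${OBJCOPY:-objcopy}" --input-target=ihex --output-target=binary "$1" "$2"
}
```

Called as ihex2bin my_firmware1.hex my_firmware1.bin, it runs the same objcopy command as above. Called with a single argument, it prints the usage line and leaves the .hex file alone.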

Now, compare the binary files in meld, using xxd to convert them back to ASCII hex with the pretty human-readable side-bar:

meld <(xxd my_firmware1.bin) <(xxd my_firmware2.bin)

Even better, do both steps above in one, like this "one-liner":

# one-liner to compare my_firmware1.hex and my_firmware2.hex 
objcopy --input-target=ihex --output-target=binary my_firmware1.hex 1.bin \
 && objcopy --input-target=ihex --output-target=binary my_firmware2.hex 2.bin \
 && meld <(xxd 1.bin) <(xxd 2.bin)

Keep in mind though you need to use your compiler's version of the objcopy executable to do the above operations. So, for the Microchip MPLAB X XC32 compiler toolchain, for instance, use xc32-objcopy instead of objcopy:

# one-liner to compare my_firmware1.hex and my_firmware2.hex 
# **with the Microchip XC32 compiler toolchain!**
xc32-objcopy --input-target=ihex --output-target=binary my_firmware1.hex 1.bin \
 && xc32-objcopy --input-target=ihex --output-target=binary my_firmware2.hex 2.bin \
 && meld <(xxd 1.bin) <(xxd 2.bin)

Using only non-GUI CLI tools (meld is a GUI)...

If you really need to use non-GUI tools, such as through an ssh session, here are some more options. Alternatively, you could just scp the file back to your local machine over ssh, and then use meld as described above.

Pure CLI tools for binary comparison:

  1. Use diff: refer back to @kenorb's answer above. Here are some of my own spins on those commands which I think are more useful:

    # For short output
    diff -u --color=always <(xxd file1.bin) <(xxd file2.bin)

    # If your output is really long, pipe to `less -RFX`, like git does
    diff -u --color=always <(xxd file1.bin) <(xxd file2.bin) | less -RFX

    Example run and output:

    eRCaGuy_hello_world/c$ diff -u --color=always <(xxd file1.bin) <(xxd file2.bin) 
    --- /dev/fd/63  2023-06-21 23:16:51.649582608 -0700
    +++ /dev/fd/62  2023-06-21 23:16:51.649582608 -0700
    @@ -53,8 +53,8 @@
     00000340: 0500 0000 474e 5500 0200 00c0 0400 0000  ....GNU.........
     00000350: 0300 0000 0000 0000 0280 00c0 0400 0000  ................
     00000360: 0100 0000 0000 0000 0400 0000 1400 0000  ................
    -00000370: 0300 0000 474e 5500 5a84 e8c3 58b3 e81e  ....GNU.Z...X...
    -00000380: d731 5fd2 0c0a 1aaf be99 ec8b 0400 0000  .1_.............
    +00000370: 0300 0000 474e 5500 23ea bebd 1106 9feb  ....GNU.#.......
    +00000380: 14a4 4f55 e90b d6b0 bf57 e851 0400 0000  ..OU.....W.Q....
     00000390: 1000 0000 0100 0000 474e 5500 0000 0000  ........GNU.....
     000003a0: 0300 0000 0200 0000 0000 0000 0000 0000  ................
     000003b0: 0200 0000 0600 0000 0100 0000 0600 0000  ................
    @@ -510,7 +510,7 @@
     00001fd0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
     00001fe0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
     00001ff0: 0000 0000 0000 0000 0000 0000 0000 0000  ................
    -00002000: 0100 0200 4865 6c6c 6f20 576f 726c 642e  ....Hello World.
    +00002000: 0100 0200 4865 4c6c 6f20 576f 526c 642e  ....HeLlo WoRld.
     00002010: 0a00 0000 011b 033b 3000 0000 0500 0000  .......;0.......
     00002020: 0cf0 ffff 6400 0000 2cf0 ffff 8c00 0000  ....d...,.......
     00002030: 3cf0 ffff a400 0000 4cf0 ffff bc00 0000  <.......L.......
    

  2. Use vbindiff: I first learned about this tool here: How-To Geek: How to Compare Binary Files on Linux.

    Here is how to install and use it:

    # 1. Install it
    sudo apt update
    sudo apt install vbindiff

    # 2. Use it; note that we do NOT need xxd here
    vbindiff file1.bin file2.bin

    You should study the manual/help pages too. They are very useful:

    vbindiff -h
    man vbindiff

    Here's what it looks like on the 2nd change shown above. As you can see, it has highlighted the exact character differences in red. It's a pretty nice tool:

    Navigation:

    1. See man vbindiff for details. It's a short and easy manual.
    2. Press Space or Enter to go to the next difference. I hit Space twice to go to the 2nd difference in the screenshot above.
    3. Press Q or Esc to quit.
    4. You can scroll with the mouse scroll wheel, arrow keys, PageUp/Down keys, etc.
    5. Press T to toggle on scrolling only the top window.
    6. Press B to toggle on scrolling only the bottom window.
    7. Press F to find.

    The navigation is really pretty limited. I don't see a way to go back up and find a previous change. Just quit and start again.

How did I produce the binary files used in the examples above?

Easy:

file1.c from my eRCaGuy_hello_world repo here: hello_world_extra_basic.c:

#include <stdbool.h> // For `true` (`1`) and `false` (`0`) macros in C
#include <stdint.h>  // For `uint8_t`, `int8_t`, etc.
#include <stdio.h>   // For `printf()`

// int main(int argc, char *argv[])  // alternative prototype
int main()
{
    printf("Hello World.\n\n");

    return 0;
}

file2.c:

#include <stdbool.h> // For `true` (`1`) and `false` (`0`) macros in C
#include <stdint.h>  // For `uint8_t`, `int8_t`, etc.
#include <stdio.h>   // For `printf()`

// int main(int argc, char *argv[])  // alternative prototype
int main()
{
    printf("HeLlo WoRld.\n\n");

    return 0;
}

Now produce the executables, file1.bin and file2.bin, from those C files above:

gcc -Wall -Wextra -Werror -O3 -std=gnu17 file1.c -o file1.bin && ./file1.bin
gcc -Wall -Wextra -Werror -O3 -std=gnu17 file2.c -o file2.bin && ./file2.bin

Then, of course, compare them!:

meld <(xxd file1.bin) <(xxd file2.bin)

See also

  1. My really-useful hex2bin and hex2xxdhex Bash functions to aid your binary comparisons above, in my answer here: Stack Overflow: Bash function to mass-convert Intel *.hex firmware files to *.bin firmware files, and to *.xxd.hex files for comparison in meld
  2. My answer on how to make meld your git difftool in Windows, Mac, and Linux
2

dead-ranger

dead-ranger is an open source Rust TUI program that will do just this.

  • CLI Diff Viewer for Hex and ASCII.
  • Color highlighting for different data types to enhance readability.
  • Keyboard navigation enables interactive exploration of differences.
  • Displays bit position for focused data, aiding in precise location identification.

Evan Carroll
  • 9,518
1

You can use the gvimdiff tool that is included in the vim-gui-common package

sudo apt-get update

sudo apt-get install vim-gui-common

Then you can compare two hexadecimal files using the following commands:

ubuntu> gvimdiff <hex-file1> <hex-file2>
craken
  • 119
0

I wrote a simple script to diff a binary file. It will print the first different chunk (40 bytes) and offset:

https://gist.github.com/guyskk/98621a9785bd88cf2b4e804978950122

$ bindiff file1 file2
8880> 442408E868330300488D05825337004889042448C744240802000000E84F330300E88A2A0300488B
                      ^^^^^^^^^ ^^
      442408E868330300E59388E59388004889042448C744240802000000E84F330300E88A2A0300488B
guyskk
  • 101
0

Here is a script to use kdiff3 on hex output:

#!/bin/bash

mkdir -p ~/tmp/kdiff3/a
mkdir -p ~/tmp/kdiff3/b

# Name each hex dump after its input file
a="$HOME/tmp/kdiff3/a/$(basename "$1").hex"
b="$HOME/tmp/kdiff3/b/$(basename "$2").hex"
xxd "$1" > "$a"
xxd "$2" > "$b"
kdiff3 "$a" "$b"

Which you could save as e.g. kdiff3bin and use like:

kdiff3bin file1.bin file2.bin
-2

https://security.googleblog.com/2016/03/bindiff-now-available-for-free.html

BinDiff is a great UI tool for comparing binary files that has been open sourced recently.

-2

The go to open source product on Linux (and everything else) is Radare which provides radiff2 explicitly for this purpose.

for every different byte

That's insane though. Because as asked, if you insert one byte at the first byte in the file, you'd find every subsequent byte was different and so the diff would repeat the whole file, for an actual difference of one byte.
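The effect is easy to reproduce. The sketch below builds two small files that differ by a single inserted byte, counts what a positional comparison (cmp -l) reports, and contrasts that with diff over a one-byte-per-line dump. The file names are made up for illustration:

```shell
# Build two files that differ only by one byte inserted at the front.
dir=$(mktemp -d)
printf 'ABCDEF'  > "$dir/a.bin"
printf 'XABCDEF' > "$dir/b.bin"

# Positional comparison: every overlapping byte is reported as different.
positional=$(cmp -l "$dir/a.bin" "$dir/b.bin" 2>/dev/null | wc -l)

# Alignment-aware comparison: dump one hex byte per line, then diff.
for f in a b; do
    od -An -v -tx1 "$dir/$f.bin" | tr -s ' ' '\n' | grep . > "$dir/$f.txt"
done
aligned=$(diff "$dir/a.txt" "$dir/b.txt" | grep -c '^[<>]')

echo "cmp -l reports $((positional)) differing bytes; diff reports $((aligned))"
rm -rf "$dir"
```

Running it prints "cmp -l reports 6 differing bytes; diff reports 1": one inserted byte shifts everything after it, so a byte-by-byte comparison flags the whole overlap.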

Slightly more practical is radiff2 -O. The -O option is described as "Do code diffing with all bytes instead of just the fixed opcode bytes":

0x000000a4 0c01 => 3802 0x000000a4
0x000000a8 1401 => 3802 0x000000a8
0x000000ac 06 => 05 0x000000ac
0x000000b4 02 => 01 0x000000b4
0x000000b8 4c05 => 0020 0x000000b8
0x000000bc 4c95 => 00a0 0x000000bc
0x000000c0 4c95 => 00a0 0x000000c0

Like IDA Pro, Radare is a tool primarily for binary analysis. You can also show delta diffing with -d, or display the disassembled bytes instead of hex with -D.

Evan Carroll
  • 9,518