I'm building a compression program. I want to use LZW for UTF-8 files (any UTF-8 files) and Bzip2 for everything else (usually random binary files). I can't find a way to determine whether a file is UTF-8 or not.
I tried this and many other methods from all over Stack Overflow, but none of them work for me. I can share examples of files that should be recognized as UTF-8 and files that should be recognized as "others".
    else if (args[0] != null && args[1] != null)
    {
        if (random binary detected)
        {
            Console.WriteLine("Started Bzip");
            byte[] res = new Bzip2Compressor(65).Compress(File.ReadAllBytes(args[0]));
            File.WriteAllBytes(args[1], res);
            Console.WriteLine("Done!");
            return;
        }
        else // for UTF-8 cases (both with BOM and without)
        {
            Console.WriteLine("Started LZW");
            byte[] res = LZWCompressor.Compress(File.ReadAllBytes(args[0]));
            File.WriteAllBytes(args[1], res);
            Console.WriteLine("Done");
            return;
        }
    }
Note: I only need to separate UTF-8 from everything else.
EDIT: So I would like to check whether the first n bytes of the file are invalid UTF-8. The files I want to reject are created like this:
    var bytes = new byte[1024 * 1024];
    new Random().NextBytes(bytes);
    File.WriteAllBytes(@"PATH", bytes);
The general goal is to detect files created like in the code above as NOT UTF-8 files.
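To make the "check the first n bytes" idea concrete, here is a minimal sketch of the kind of check I have in mind. It assumes that a strict `UTF8Encoding` (constructed with `throwOnInvalidBytes: true`) is an acceptable validity test. The names `Utf8Sniffer` and `LooksLikeUtf8` and the 4096-byte sample size are made up for illustration, and I am not sure this is sufficient for every file.

```csharp
using System;
using System.IO;
using System.Text;

// Hypothetical helper, not part of any library.
static class Utf8Sniffer
{
    // Strict UTF-8: throws DecoderFallbackException on invalid byte sequences.
    private static readonly UTF8Encoding StrictUtf8 =
        new UTF8Encoding(encoderShouldEmitUTF8Identifier: false, throwOnInvalidBytes: true);

    // Returns true if the first `count` bytes of `buffer` decode as UTF-8.
    // When isEndOfData is false, a multi-byte sequence cut off at the end
    // of the sample is tolerated instead of being reported as invalid.
    public static bool LooksLikeUtf8(byte[] buffer, int count, bool isEndOfData)
    {
        try
        {
            Decoder decoder = StrictUtf8.GetDecoder();
            decoder.GetCharCount(buffer, 0, count, flush: isEndOfData);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    // Reads up to sampleSize bytes from the start of the file and validates them.
    public static bool LooksLikeUtf8(string path, int sampleSize = 4096)
    {
        byte[] buffer = new byte[sampleSize];
        int total = 0;
        using (FileStream fs = File.OpenRead(path))
        {
            int read;
            while (total < buffer.Length &&
                   (read = fs.Read(buffer, total, buffer.Length - total)) > 0)
            {
                total += read;
            }
        }
        // Fewer bytes than requested means the whole file was read.
        return LooksLikeUtf8(buffer, total, isEndOfData: total < buffer.Length);
    }
}
```

With this, the placeholder condition above would become something like `if (!Utf8Sniffer.LooksLikeUtf8(args[0]))`. A BOM is itself a valid UTF-8 sequence, so files with and without one are treated the same. The known weakness is that any binary file whose sampled bytes happen to be valid UTF-8 (for example pure ASCII) would still be classified as UTF-8.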