I'm implementing a word-count-style program on the Google Books Ngram dataset. My input is a binary file from https://aws.amazon.com/datasets/google-books-ngrams/ and I was told to use SequenceFileInputFormat to read it.
I'm using Hadoop 2.6.5. My driver code:
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf, "PartA");
        job.setJarByClass(MyDriver.class);
        job.setMapperClass(MyMapperA.class);
        job.setReducerClass(MyReducerA.class);
        job.setCombinerClass(MyReducerA.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(IntWritable.class);
        job.setInputFormatClass(SequenceFileInputFormat.class); // The new line
        FileInputFormat.addInputPath(job, new Path(args[0]));
        FileOutputFormat.setOutputPath(job, new Path(args[1]));
        System.exit(job.waitForCompletion(true) ? 0 : 1);
Unfortunately, I started getting errors after adding this line:
            job.setInputFormatClass(SequenceFileInputFormat.class);
The error I receive:
java.lang.Exception: java.lang.IllegalArgumentException: Unknown codec: com.hadoop.compression.lzo.LzoCodec
at org.apache.hadoop.mapred.LocalJobRunner$Job.runTasks(LocalJobRunner.java:462)
I've tried adding several Maven dependencies, but without success.
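For example, one dependency I understand is supposed to provide `com.hadoop.compression.lzo.LzoCodec` is the hadoop-lzo artifact. The coordinates, repository, and version below are my best guess (it is not on Maven Central, as far as I can tell):

```xml
<!-- hadoop-lzo is hosted on Twitter's Maven repository, not Maven Central -->
<repositories>
    <repository>
        <id>twitter</id>
        <url>https://maven.twttr.com</url>
    </repository>
</repositories>

<dependencies>
    <!-- Provides com.hadoop.compression.lzo.LzoCodec; version is an example -->
    <dependency>
        <groupId>com.hadoop.gplcompression</groupId>
        <artifactId>hadoop-lzo</artifactId>
        <version>0.4.20</version>
    </dependency>
</dependencies>
```

Even with something like this on the classpath, I still get the same exception, so I'm not sure whether the codec also needs to be registered via the `io.compression.codecs` property, or whether a native library is required as well.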