I want to synchronize MongoDB and Hadoop, but when I delete a document from MongoDB, that document must not be deleted in Hadoop.
I tried using mongo-hadoop and Hive. This is the Hive query:
CREATE EXTERNAL TABLE SubComponentSubmission
(
  id STRING,
  status INT,
  providerId STRING,
  dateCreated TIMESTAMP,
  subComponentId STRING,
  packageName STRING
)
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES('mongo.columns.mapping'=
                    '{"id":"_id", "status":"Status", 
                      "providerId":"ProviderId", 
                      "dateCreated":"DateCreated", 
                      "subComponentId":"SubComponentPackage.SubComponentId", 
                      "packageName":"SubComponentPackage.PackageName"}'
                    )
TBLPROPERTIES('mongo.uri'='mongodb://<host>:27017/<db name>.<collection name>');
This query creates a table that stays synchronized with the corresponding MongoDB collection. Because of that, a document deleted in MongoDB also disappears from the Hive table.
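To illustrate the behavior I am after, here is a minimal sketch of a workaround: periodically append new rows from the MongoDB-backed table into a native Hive table that keeps its own copy. The table name SubComponentSubmissionArchive and the ORC storage format are my own assumptions, not something mongo-hadoop provides, and this only preserves deleted documents; it does not pick up later updates to documents that were already copied.

```sql
-- Hypothetical native Hive table holding its own copy of the data.
-- The name and the ORC format are assumptions made for illustration.
CREATE TABLE IF NOT EXISTS SubComponentSubmissionArchive
(
  id STRING,
  status INT,
  providerId STRING,
  dateCreated TIMESTAMP,
  subComponentId STRING,
  packageName STRING
)
STORED AS ORC;

-- Append only the documents that are not archived yet.
-- Rows already copied stay in the archive even after the
-- source document is deleted from MongoDB.
INSERT INTO TABLE SubComponentSubmissionArchive
SELECT s.id, s.status, s.providerId, s.dateCreated,
       s.subComponentId, s.packageName
FROM SubComponentSubmission s
LEFT OUTER JOIN SubComponentSubmissionArchive a
  ON s.id = a.id
WHERE a.id IS NULL;
```

The INSERT would have to be rerun on a schedule, so I would prefer a built-in option if one exists.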
Does mongo-hadoop have an option to not propagate document deletions? Or is there another tool that solves this problem?
Thanks in advance.
 
    