This answer (by yours truly) provides a graphical, step-by-step walkthrough of how to set up a rule and write a macro that processes the body of each incoming email as it is received.
You would just have to adapt the code so that the processing removes the "repetitive content" you don't want to see.
The general problem of removing repetitive content is actually rather complicated, computationally.
Assuming a "repetitive string" is a substring which has already occurred within the text, you would have a loop structure like this (pseudocode, don't try to copy this into a program):
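    For i = 1 To Len(str)
        For j = i To Len(str)
            needle = substring(str, i, j)          ' candidate: characters i through j
            nlen = Len(needle)
            ' only look at later positions, so the needle never matches itself
            For k = i + 1 To Len(str) - nlen + 1
                match = substring(str, k, k + nlen - 1)
                If needle = match Then
                    '...do stuff
                End If
            Next
        Next
    Next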
Sounds pretty complex: that is three nested loops over the string, so the number of comparisons grows roughly with the cube of the message length. Also, this kind of loop would catch things like "Pettitte" (a last name) and change it to "Peti" (the rest of the characters are substrings of length 1 which already occurred). You'd have to set a minimum length for the "needle"; otherwise the result would contain at most one instance of every letter of the alphabet.
Then you'd have to perform some analysis on each match to determine whether it's "header text" or something else you actually want to remove. Otherwise, it would catch something like "you should not do that. I really, strongly advise that you not do that." and change it to "you should not do that. I really, strongly advise that you".
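To make those two failure cases concrete, here is a rough VBA sketch of the naive approach with a minimum needle length. The function name RemoveRepeats and its minLen parameter are made up for illustration, and it uses InStr to search the earlier text rather than spelling out the innermost loop, so treat it as a simplified variant of the pseudocode, not a finished solution.

```vba
Option Explicit

' Hypothetical helper (not part of any library): drops every run of at least
' minLen characters that already occurred earlier in the source string.
Public Function RemoveRepeats(ByVal src As String, ByVal minLen As Long) As String
    Dim result As String, seen As String
    Dim i As Long, n As Long, runLen As Long

    If minLen < 1 Then minLen = 1
    n = Len(src)
    i = 1
    Do While i <= n
        seen = Left$(src, i - 1)      ' everything before position i
        runLen = 0
        If i + minLen - 1 <= n Then
            If InStr(1, seen, Mid$(src, i, minLen), vbBinaryCompare) > 0 Then
                runLen = minLen
                ' Grow the repeated run for as long as it still occurs earlier
                Do While i + runLen <= n
                    If InStr(1, seen, Mid$(src, i, runLen + 1), vbBinaryCompare) = 0 Then Exit Do
                    runLen = runLen + 1
                Loop
            End If
        End If
        If runLen > 0 Then
            i = i + runLen            ' skip the repeated run
        Else
            result = result & Mid$(src, i, 1)
            i = i + 1
        End If
    Loop
    RemoveRepeats = result
End Function
```

In the Immediate window, `Debug.Print RemoveRepeats("Pettitte", 1)` prints `Peti`, while `Debug.Print RemoveRepeats("Pettitte", 3)` leaves the name alone. A minimum length does not rescue the second example, though: with minLen = 10 the trailing " not do that." is still cut, which is why some extra analysis of what counts as removable is needed.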
If you don't want to go with the general-purpose (naive) way of finding duplicate content, which could delete a lot of meaningful content, you'd have to decide:
- Which substrings to attempt to detect duplicates of;
- Which instances of the duplicates to keep and which to delete.
The InStr and Mid functions in VBA should be helpful for this. Press F2 in the VBA editor to open the Object Browser, which lists the functions available in the various modules; the built-in string functions in the VBA library are the ones to look at.
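As a small illustration of how InStr and Mid fit together once you have made those two decisions, here is a sketch that keeps the first occurrence of a known phrase and deletes every later one. KeepFirstOnly is a hypothetical name, and the case-insensitive comparison is my assumption; switch vbTextCompare to vbBinaryCompare if case matters.

```vba
Option Explicit

' Hypothetical helper: keeps the first occurrence of phrase in src and
' removes all later occurrences.
Public Function KeepFirstOnly(ByVal src As String, ByVal phrase As String) As String
    Dim firstPos As Long, nextPos As Long

    KeepFirstOnly = src
    If Len(phrase) = 0 Then Exit Function

    firstPos = InStr(1, src, phrase, vbTextCompare)
    If firstPos = 0 Then Exit Function        ' phrase never occurs

    ' Start searching just past the first occurrence
    nextPos = InStr(firstPos + Len(phrase), src, phrase, vbTextCompare)
    Do While nextPos > 0
        ' Cut the duplicate out: text before it plus text after it
        src = Left$(src, nextPos - 1) & Mid$(src, nextPos + Len(phrase))
        nextPos = InStr(nextPos, src, phrase, vbTextCompare)
    Loop
    KeepFirstOnly = src
End Function
```

For example, `KeepFirstOnly("abcXabcYabc", "abc")` returns `abcXY`. Keeping the last occurrence instead would work the same way with InStrRev.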
I don't think anything like this already exists in a pre-canned form that you can just take and use, but if all you want to remove is redundant mail headers like From:, To:, Subject:, it should be fairly easy to detect them using a few substring or regex matches. If you get really stuck in the bowels of the code, I think a Stack Overflow question would be more appropriate as a follow-up.
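For the header case, a minimal sketch using plain substring matches might look like the following. The function name StripHeaderLines, the prefix list, and the assumption that body lines are separated by vbCrLf are all mine; it removes every line that starts with one of the prefixes, on the theory that header lines inside a message body are quoted copies.

```vba
Option Explicit

' Hypothetical helper: removes lines of the body that start with a
' quoted-header prefix such as "From:" (leading spaces are ignored).
Public Function StripHeaderLines(ByVal body As String) As String
    Dim prefixes As Variant, parts() As String
    Dim kept As String, trimmed As String
    Dim i As Long, p As Long, keptCount As Long
    Dim isHeader As Boolean

    prefixes = Array("From:", "To:", "Subject:")   ' assumed list, adjust as needed
    parts = Split(body, vbCrLf)                    ' assumes CRLF line endings

    For i = LBound(parts) To UBound(parts)
        trimmed = LTrim$(parts(i))
        isHeader = False
        For p = LBound(prefixes) To UBound(prefixes)
            If StrComp(Left$(trimmed, Len(prefixes(p))), prefixes(p), vbTextCompare) = 0 Then
                isHeader = True
                Exit For
            End If
        Next p
        If Not isHeader Then
            If keptCount > 0 Then kept = kept & vbCrLf
            kept = kept & parts(i)
            keptCount = keptCount + 1
        End If
    Next i
    StripHeaderLines = kept
End Function
```

Inside the macro you would then assign the result back to the message, along the lines of `item.Body = StripHeaderLines(item.Body)` followed by `item.Save`, where `item` is whatever MailItem variable your rule script receives.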