Iterate over the code points of the string and check whether each one belongs to a category you define as "standard" (here the categories are: alphabetic, digit, whitespace, or a modifier applied to a previously accepted character):
static String standartize(String s) {
    if (s == null) return null;
    StringBuilder sb = new StringBuilder();
    boolean based = false;    // was the previous character an accepted base for a modifier?
    int c;
    for (int i = 0; i < s.length(); i += Character.charCount(c)) {
        c = Character.codePointAt(s, i);
        if (based && Character.getType(c) == Character.MODIFIER_SYMBOL) {
            sb.appendCodePoint(c);    // keep a modifier only when it follows an accepted character
        } else if (Character.isAlphabetic(c) || Character.isDigit(c)) {
            sb.appendCodePoint(c);
            based = true;
        } else if (Character.isWhitespace(c)) {
            sb.appendCodePoint(c);
            based = false;
        } else {
            based = false;
        }
    }
    return sb.toString();
}
You can add or remove checks in the else-if chain to widen or narrow the range of characters you consider "standard": Character has many static isXxxx() methods that test whether a character belongs to some category.
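As a rough illustration of widening the range, the sketch below also accepts two punctuation categories by looking at Character.getType(). The class name StandardFilter and the helpers keepOnly and isStandard are hypothetical names made up for this example, and for brevity the sketch leaves out the modifier tracking done by the method above.

```java
import java.util.function.IntPredicate;

public class StandardFilter {
    // Hypothetical helper: keeps only the code points accepted by the predicate.
    static String keepOnly(String s, IntPredicate accepted) {
        if (s == null) return null;
        StringBuilder sb = new StringBuilder();
        s.codePoints().filter(accepted).forEach(sb::appendCodePoint);
        return sb.toString();
    }

    // One possible (assumed) definition of "standard":
    // letters, digits, whitespace, plus two punctuation categories.
    static boolean isStandard(int c) {
        if (Character.isAlphabetic(c) || Character.isDigit(c) || Character.isWhitespace(c)) {
            return true;
        }
        int type = Character.getType(c);
        return type == Character.OTHER_PUNCTUATION   // e.g. '.', ',', '!'
            || type == Character.DASH_PUNCTUATION;   // e.g. '-'
    }

    public static void main(String[] args) {
        // U+263A (a symbol, category So) is dropped; the spaces around it remain.
        System.out.println(keepOnly("Hello, world! \u263A 42", StandardFilter::isStandard));
    }
}
```

Passing the test as an IntPredicate keeps the definition of "standard" in one place, so narrowing it again means deleting a line in isStandard rather than touching the loop.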
Note that the loop iterates over int code points, not over char items. This way it handles not only characters that fit in a single UTF-16 char, but also characters encoded as surrogate pairs.
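To see why this matters, here is a small sketch comparing the two views of a string that contains a character outside the Basic Multilingual Plane. The class name CodePointDemo and the helper countCodePoints are made up for this example; the sample character U+1D538 is just one convenient supplementary letter.

```java
public class CodePointDemo {
    // Counts iterations when stepping by code point, the same way the loop above does.
    static int countCodePoints(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); ) {
            int c = s.codePointAt(i);
            i += Character.charCount(c);   // 1 for BMP characters, 2 for surrogate pairs
            n++;
        }
        return n;
    }

    public static void main(String[] args) {
        // 'a' followed by U+1D538, written as its surrogate pair
        String s = "a\uD835\uDD38";
        System.out.println(s.length());           // number of char items
        System.out.println(countCodePoints(s));   // number of code points
        // The whole code point is a letter, but its first half alone is not.
        System.out.println(Character.isAlphabetic(s.codePointAt(1)));
        System.out.println(Character.isAlphabetic(s.charAt(1)));
    }
}
```

A loop over charAt(i) would see the two surrogates separately, fail the isAlphabetic test on each, and drop the letter.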