I would like to use some text in my app that is kind of messy. I don't have control over the text, so it is what it is.
I'm looking for a lightweight [1] approach to cleaning up all the things shown in the examples here:
original: <p>Occasionally we deal with this.</p>                       desired: Occasionally we deal with this.
original: <p>Sometimes they \emphasize\ like this, I could live with it</p>      desired: Sometimes they emphasize like this, I could live with it
original: <p>This is junk, but it's what I have<\/p>\r\n                         desired: This is junk, but it's what I have
original: <p>This is test1</p>                                                   desired: This is test1
original: <p>This is u\u00f1icode</p>                                            desired: This is uñicode
So we see special characters, Unicode escapes like \u00f1, HTML paragraph tags like <p> and </p>, newline sequences like \r\n, and stray backslashes \ in odd places. The goal is to translate whatever is translatable and remove the rest of the junk.
Although it's possible for me to manipulate the strings directly, taking care of each of these things individually, I wondered if there was a simple [1] way to clean up these strings without too much overhead [1].
A partial answer is provided already, but there are more problems to fix in the examples I've provided. That solution translates HTML special characters, but it does not decode Unicode escapes formatted as \u0000, remove HTML tags, etc.
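To make the two missing pieces concrete, here is a minimal sketch of what decoding \uXXXX escapes and removing tags could look like with Foundation alone. The helper names decodeUnicodeEscapes and stripTags are hypothetical (they are not from the partial answer), and the sketch assumes each escape is exactly four hex digits; surrogate pairs are left untouched.

```swift
import Foundation

// Hypothetical helper: replaces each literal \uXXXX escape (backslash, "u",
// four hex digits) with the character it names.
func decodeUnicodeEscapes(_ text: String) -> String {
    let regex = try! NSRegularExpression(pattern: "\\\\u([0-9A-Fa-f]{4})")
    let source = text as NSString
    var result = ""
    var cursor = 0
    for match in regex.matches(in: text, range: NSRange(location: 0, length: source.length)) {
        // Copy the untouched text that precedes this escape.
        result += source.substring(with: NSRange(location: cursor, length: match.range.location - cursor))
        let hex = source.substring(with: match.range(at: 1))
        if let value = UInt32(hex, radix: 16), let scalar = Unicode.Scalar(value) {
            result.append(Character(scalar))
        } else {
            // Not a valid scalar on its own (e.g. half of a surrogate pair): keep it as-is.
            result += source.substring(with: match.range)
        }
        cursor = match.range.location + match.range.length
    }
    result += source.substring(from: cursor)
    return result
}

// Hypothetical helper: removes anything that looks like an HTML tag,
// including closing tags whose slash was escaped, as in "<\/p>".
func stripTags(_ text: String) -> String {
    return text.replacingOccurrences(of: "<\\\\?/?[A-Za-z][^>]*>",
                                     with: "",
                                     options: .regularExpression)
}

let raw = "<p>This is u\\u00f1icode</p>"
print(stripTags(decodeUnicodeEscapes(raw)))
```

A regex-based tag stripper like this is only reasonable for simple fragments such as the samples here; it is not a general HTML parser.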
Additional Things I Tried
This is not the global solution I was looking for, but it shows the direction one could go to solve the problem.
import Foundation

let samples = ["<p>This is test1</p>                                             ":"This is test1",
           "<p>This is u\\u00f1icode</p>                                      ":"This is uñicode",
           "<p>This is uñicode</p>                                       ":"This is uñicode",
           "<p>This is junk, but it's what I have<\\/p>\\r\\n                   ":"This is junk, but it's what I have",
           "<p>Sometimes they \\emphasize\\ like this, I could live with it</p>":"Sometimes they emphasize like this, I could live with it",
           "<p>Occasionally we deal with this.</p>                 ":"Occasionally we deal with this."]
for (key, value) in samples {
    print("original: \(key)      desired: \(value)")
}
print("\n\n\n")
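for (key, _) in samples {
    // Trim padding, unescape "\/", then peel off a trailing "\r\n" and the <p> wrapper.
    var _key = key.trimmingCharacters(in: CharacterSet.whitespaces)
    _key = _key.replacingOccurrences(of: "\\/", with: "/")
    if _key.hasSuffix("\\r\\n") { _key = String(_key.dropLast(4)) }
    if _key.hasPrefix("<p>") { _key = String(_key.dropFirst(3)) }
    if _key.hasSuffix("</p>") { _key = String(_key.dropLast(4)) }
    // Rewrite each \uXXXX escape as the HTML entity &#xXXXX; so the entity decoder can handle it.
    while let uniRange = _key.range(of: "\\u") {
        // Stop if fewer than four characters follow "\u"; an unchecked offset would trap.
        guard let charDefEnd = _key.index(uniRange.upperBound, offsetBy: 4, limitedBy: _key.endIndex) else { break }
        let charDefRange = uniRange.upperBound..<charDefEnd
        let uniFullRange = uniRange.lowerBound..<charDefEnd
        let charDef = "&#x" + _key[charDefRange] + ";"
        _key = _key.replacingCharacters(in: uniFullRange, with: charDef)
    }
    // stringByDecodingHTMLEntities is not part of Foundation; it is presumably the
    // String extension from the partial answer mentioned above.
    let decoded = _key.stringByDecodingHTMLEntities
    print("decoded: \(decoded)")
}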
OUTPUT
original: <p>Occasionally we deal with this.</p>                       desired: Occasionally we deal with this.
original: <p>Sometimes they \emphasize\ like this, I could live with it</p>      desired: Sometimes they emphasize like this, I could live with it
original: <p>This is uñicode</p>                                          desired: This is uñicode
original: <p>This is junk, but it's what I have<\/p>\r\n                         desired: This is junk, but it's what I have
original: <p>This is test1</p>                                                   desired: This is test1
original: <p>This is u\u00f1icode</p>                                            desired: This is uñicode
decoded: Occasionally we deal with this.
decoded: Sometimes they \emphasize\ like this, I could live with it
decoded: This is uñicode
decoded: This is junk, but it's what I have
decoded: This is test1
decoded: This is uñicode
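Note that the "Sometimes they \emphasize\" line still carries its backslashes in the decoded output. A minimal sketch of one way to handle that is below. The helper name removeBackslashJunk is hypothetical, and the sketch assumes that any \uXXXX escapes have already been converted, since deleting backslashes first would destroy them. It only drops literal \r and \n sequences at the end of the string, so a word such as \really is not damaged.

```swift
import Foundation

// Hypothetical helper: drops trailing literal "\r" / "\n" sequences and then
// treats every remaining backslash as junk.
func removeBackslashJunk(_ text: String) -> String {
    // Remove a run of literal \r or \n escapes (plus padding) at the end only.
    var cleaned = text.replacingOccurrences(of: "(?:\\\\[rn])+\\s*$",
                                            with: "",
                                            options: .regularExpression)
    // Any backslash still present is assumed to be stray.
    cleaned = cleaned.replacingOccurrences(of: "\\", with: "")
    return cleaned.trimmingCharacters(in: .whitespacesAndNewlines)
}

let sample = "Sometimes they \\emphasize\\ like this, I could live with it"
print(removeBackslashJunk(sample))
```

Whether every leftover backslash really is junk depends on the data source, so this step is probably best kept last and easy to switch off.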
Footnotes: 1. There are probably many larger packages or libraries that could do this as a very small part of their total functionality, and those are of less interest here.
 
    