Is there a native way to encode or decode HTML entities using JavaScript or ES6? For example, < would be encoded as &lt;. There are libraries like html-entities for Node.js, but it feels like there should be something built into JavaScript that already handles this common need.
    39
            
            
         
    
    
        Marty Chang
        
- There is not a native JavaScript facility. JavaScript the programming language does not really have much to do with HTML, goofy APIs on the String prototype notwithstanding. – Pointy Oct 26 '16 at 13:39
- @Pointy I think generally speaking you're right. It just feels like since JavaScript is so widely used on the web, and HTML entities are a common feature of web development, something like this would've made its way into the language over the past decade. – Marty Chang Oct 26 '16 at 14:21
- I think the question would benefit from clearly including the existence of such a function in browsers and the Node.js standard library in its scope. – hippietrail Nov 24 '17 at 04:19
5 Answers
30
            
            
        A nice function using ES6 for escaping HTML:
const escapeHTML = str => str.replace(/[&<>'"]/g,
  tag => ({
      '&': '&amp;',
      '<': '&lt;',
      '>': '&gt;',
      "'": '&#39;',
      '"': '&quot;'
    }[tag]));
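One way such a function might be put to use in ES6 is a tagged template that escapes only the interpolated values. This is a minimal sketch, not part of the original answer: the names `ENTITY_MAP`, `html`, and `userInput` are hypothetical, and `escapeHTML` is restated (with a `String()` coercion added) so the snippet runs on its own.

```javascript
// Hypothetical helper names; the escaping table mirrors the answer above.
const ENTITY_MAP = { '&': '&amp;', '<': '&lt;', '>': '&gt;', "'": '&#39;', '"': '&quot;' };
const escapeHTML = (str) => String(str).replace(/[&<>'"]/g, (ch) => ENTITY_MAP[ch]);

// Tagged template: literal parts pass through untouched,
// interpolated values are escaped before being concatenated.
const html = (strings, ...values) =>
  strings.reduce(
    (out, part, i) => out + part + (i < values.length ? escapeHTML(values[i]) : ''),
    ''
  );

const userInput = '<b>"Tom" & Jerry\'s</b>';
console.log(html`<p>${userInput}</p>`);
// <p>&lt;b&gt;&quot;Tom&quot; &amp; Jerry&#39;s&lt;/b&gt;</p>
```

The markup written in the template itself is trusted, while anything interpolated is treated as text.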
 
    
    
        asafel
        
- +1 for being nice and simple and for working without depending on some 3rd-party code ... though the fallback is unnecessary due to the regexp – Thomas Urban Feb 16 '20 at 19:58
6
            There is no native function in the JavaScript API that converts ASCII characters to their "html-entities" equivalent. Here is the beginning of a solution and an easy trick that you may like.
- Thanks for the answer (inconvenient as it may be) that what I want doesn't exist. Can you post a different solution though? Or just remove the solution link? That linked solution neither decodes HTML entities nor handles `&amp;` vs. numeric encoding. – Marty Chang Oct 26 '16 at 14:23
6
            
            
        Roll Your Own (caveat - use HE instead for most use cases)
Without a library, you can encode and decode HTML entities in plain JavaScript like this:
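// Encode every character as a decimal numeric character reference.
let encode = str => {
  let buf = [];
  for (let i = str.length - 1; i >= 0; i--) {
    buf.unshift(['&#', str[i].charCodeAt(0), ';'].join(''));
  }
  return buf.join('');
}
// Decode decimal numeric references only; named entities are left untouched.
let decode = str => {
  return str.replace(/&#(\d+);/g, function(match, dec) {
    return String.fromCharCode(dec);
  });
}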
Usage:
encode("Hello > © <") // "&#72;&#101;&#108;&#108;&#111;&#32;&#62;&#32;&#169;&#32;&#60;"
decode("Hello &gt; &copy; &#169; &lt;") // "Hello &gt; &copy; © &lt;"
However, you can see this approach has a couple shortcomings:
- It encodes even safe characters (H → &#72;)
- It can decode numeric codes (not in the astral plane), but doesn't know anything about the full list of HTML entities / named character references supported by browsers, like &gt;
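Both shortcomings of the numeric approach can be reduced without a library. The following is a rough sketch under stated assumptions, not the answer's own code: the names `encodeNumeric` and `decodeNumeric` are hypothetical, only markup-significant and non-ASCII characters are encoded, and named entities such as &gt; are still not decoded.

```javascript
// Hypothetical names. Encode only markup-significant and non-ASCII characters.
// The `u` flag makes the regex match whole code points, so astral characters
// (outside the Basic Multilingual Plane) are not split into surrogate halves.
const encodeNumeric = (str) =>
  str.replace(/[&<>'"]|[^\x00-\x7F]/gu, (ch) => `&#${ch.codePointAt(0)};`);

// Decode decimal (&#169;) and hexadecimal (&#xA9;) numeric references.
const decodeNumeric = (str) =>
  str.replace(/&#(?:x([0-9a-f]+)|(\d+));/gi, (match, hex, dec) => {
    const codePoint = hex ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave out-of-range references untouched instead of throwing a RangeError.
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
  });

console.log(encodeNumeric('Hello > © 𝌆'));           // Hello &#62; &#169; &#119558;
console.log(decodeNumeric('&#x1D306; &#169; &gt;')); // 𝌆 © &gt;
```

`codePointAt` and `String.fromCodePoint` are the ES6 counterparts of `charCodeAt` and `String.fromCharCode` that work on full code points.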
Use the HE Library (Html Entities)
- Support for all standardized named character references
- Support for unicode
- Works with ambiguous ampersands
- Written by Mathias Bynens
Usage:
he.encode('foo © bar ≠ baz 𝌆 qux');
// Output : 'foo &#xA9; bar &#x2260; baz &#x1D306; qux'
he.decode('foo &copy; bar &ne; baz &#x1D306; qux');
// Output : 'foo © bar ≠ baz 𝌆 qux'
 
    
    
        KyleMit
        
4
            
            
        To unescape HTML entities, your browser is smart and will do it for you.
Way1
function _unescape(html: string): string {
   // Let the browser's HTML parser decode the entities; only use with trusted input.
   const divElement = document.createElement("div");
   divElement.innerHTML = html;
   return divElement.textContent || divElement.innerText || "";
}
Way2
function _unescape(html: string): string {
     let returnText = html;
     returnText = returnText.replace(/&nbsp;/gi, " ");
     returnText = returnText.replace(/&quot;/gi, `"`);
     returnText = returnText.replace(/&lt;/gi, "<");
     returnText = returnText.replace(/&gt;/gi, ">");
     // Decode &amp; last so that input such as "&amp;lt;" is not decoded twice.
     returnText = returnText.replace(/&amp;/gi, "&");
     return returnText;
}
You can also use underscore's or lodash's unescape method, but it ignores &nbsp; and handles only the &amp;, &lt;, &gt;, &quot;, and &#39; entities.
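One pitfall with chained replace calls is ordering: if &amp; is decoded before the other entities, text that was escaped twice gets decoded twice. The sketch below contrasts that with a single-pass lookup. The names `NAMED`, `unescapeOnce`, and `unescapeChained` are hypothetical, and the table covers only the handful of entities mentioned in this answer.

```javascript
// Hypothetical names; the table covers only the entities discussed above.
// &nbsp; maps to U+00A0, the no-break space it actually denotes.
const NAMED = {
  '&nbsp;': '\u00A0', '&amp;': '&', '&quot;': '"',
  '&lt;': '<', '&gt;': '>', '&#39;': "'",
};

// Single pass: every entity is matched against the original input exactly once.
const unescapeOnce = (html) =>
  html.replace(/&(?:nbsp|amp|quot|lt|gt|#39);/g, (entity) => NAMED[entity]);

// Chained replaces that handle &amp; first end up decoding the text twice.
const unescapeChained = (html) =>
  html.replace(/&amp;/g, '&').replace(/&lt;/g, '<').replace(/&gt;/g, '>');

console.log(unescapeOnce('&amp;lt;b&amp;gt;'));    // &lt;b&gt;
console.log(unescapeChained('&amp;lt;b&amp;gt;')); // <b>
```

The single-pass version returns the literal text `&lt;b&gt;`, which is what the doubly escaped input represents.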
 
    
    
        Sunil Garg
        
2
            
            
        The reverse (decode) of the answer (encode) @asafel provided:
const decodeEscapedHTML = (str) =>
  str.replace(
    // Match only the five entities produced by the encoder.
    /&(?:amp|lt|gt|#39|quot);/g,
    (tag) =>
      ({
        '&amp;': '&',
        '&lt;': '<',
        '&gt;': '>',
        '&#39;': "'",
        '&quot;': '"',
      }[tag]),
  )
 
    
    
        Ryan - Llaver
        
 
    