How to avoid XSS when accessing the DOM?

Sometimes we need to manipulate a string of HTML by mounting it on the DOM tree, for example to find and remove certain nodes or classes, or to modify the content. However, doing so can lead to an XSS problem when the external content contains unexpected scripts, as in the following snippet:

const unsafe = '<img src=x onerror=alert(1)>';
$('<div>').html(unsafe).find('img').remove(); // leads to XSS

const div = document.createElement('div');
div.innerHTML = unsafe; // leads to XSS
const img = div.querySelector('img');
img && img.parentNode.removeChild(img);

So how can we solve this problem? One option is to use regular expressions rather than the DOM, but the expressions quickly become complicated. To avoid that, most developers turn to libraries such as js-xss or DOMPurify. Nevertheless, these libraries are strict by default and may also remove some safe tags unless you set up the right configuration.

Here, I want to describe another simple approach that relies on the iframe sandbox technique:

const unsafe = '<img src=x onerror=alert(1)>';
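// the sandbox attribute without allow-scripts is what blocks scripts;
// allow-same-origin keeps the frame's document reachable from the parent
const $frame = $('<iframe sandbox="allow-same-origin">').appendTo('body'), $sandboxDoc = $frame.contents();

// scripts are blocked inside the sandboxed document, so the onerror handler never runs
// note: the browser may still request the image (a 404 here); only script execution is blocked
$('<div>', $sandboxDoc).html(unsafe).find('img').remove();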

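// with plain JavaScript
const frame = document.createElement('iframe');
frame.setAttribute('sandbox', 'allow-same-origin'); // no allow-scripts, so scripts stay blocked
document.body.appendChild(frame); // contentDocument is null until the frame is attached
const sandboxDoc = frame.contentDocument;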
const div = sandboxDoc.createElement('div');
div.innerHTML = unsafe;
const img = div.querySelector('img');
img && img.parentNode.removeChild(img);
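
The plain JavaScript steps above can be wrapped in a small reusable helper. The following is only a minimal sketch under some assumptions: the names `withSandboxedDocument` and `stripImages` are hypothetical (they are not part of jQuery or any library), the iframe is given `sandbox="allow-same-origin"` so that scripts stay disabled while the parent page can still reach its document, and the frame is removed again once the work is done.

```javascript
// Hypothetical helper (an illustration, not a library API): runs `work`
// against the document of a temporary sandboxed iframe and always removes
// the iframe afterwards, even if `work` throws.
function withSandboxedDocument(work, hostDocument = document) {
    const frame = hostDocument.createElement('iframe');
    // Assumption: `allow-same-origin` keeps contentDocument reachable from
    // the parent page; leaving out `allow-scripts` keeps scripting disabled.
    frame.setAttribute('sandbox', 'allow-same-origin');
    frame.style.display = 'none';
    // contentDocument only becomes available once the frame is attached.
    hostDocument.body.appendChild(frame);
    try {
        return work(frame.contentDocument);
    } finally {
        frame.remove();
    }
}

// Hypothetical usage: strip every <img> from an untrusted HTML string.
function stripImages(unsafe, hostDocument = document) {
    return withSandboxedDocument((sandboxDoc) => {
        const div = sandboxDoc.createElement('div');
        div.innerHTML = unsafe;
        div.querySelectorAll('img').forEach((img) => img.remove());
        return div.innerHTML;
    }, hostDocument);
}
```

In a browser, `stripImages('<p>hi</p><img src=x onerror=alert(1)>')` would be called without the second argument. Note that the returned string has only had its images removed; it should still be treated as untrusted before it is inserted into the live page.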