What Regex Removes HTML Tags?

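Strip HTML tags with /<[^>]*>/g, but use a real parser (such as DOMParser) in production. The regex matches a <, then any run of characters other than >, then the closing >, and replaces each match with an empty string. It works for simple, well-formed markup, including nested tags, but fails on attribute values containing >, comments containing >, and malformed HTML. It also leaves the contents of script and style elements in the output and does not decode character entities such as &amp;.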

Quick Regex Approach

JavaScript

const html = '<p>Hello <b>world</b></p>';
const text = html.replace(/<[^>]*>/g, '');
// "Hello world"

Python

import re
html = '<p>Hello <b>world</b></p>'
text = re.sub(r'<[^>]*>', '', html)
# 'Hello world'

Better: Use a Parser

JavaScript (DOMParser)

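// Browser API: DOMParser is not built into Node.js.
function stripHTML(html) {
  // Parse into a separate document; scripts inside it are not executed.
  const doc = new DOMParser()
    .parseFromString(html, 'text/html');
  // textContent concatenates every text node under <body>.
  return doc.body.textContent || '';
}
stripHTML('<p>Hello <b>world</b></p>');
// "Hello world"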

Python (BeautifulSoup)

from bs4 import BeautifulSoup
# 'html.parser' selects the parser from Python's standard library
soup = BeautifulSoup('<p>Hello</p>', 'html.parser')
soup.get_text()
# 'Hello'
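
Python (standard library only)

If installing BeautifulSoup is not an option, the standard library's html.parser module can do a similar job. The following is a minimal sketch, not a drop-in replacement: the names TextExtractor and strip_html are hypothetical, and skipping script and style content is a design choice made here for illustration.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text nodes, ignoring <script> and <style> content."""

    SKIP = {'script', 'style'}  # assumption: these are never wanted as text

    def __init__(self):
        # convert_charrefs=True decodes entities such as &amp; for us
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside a skipped element
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        return ''.join(self._chunks)


def strip_html(html):
    """Return the text content of an HTML string (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return parser.text()


print(strip_html('<p>Hello <b>world</b></p>'))
```

Because the parser tokenizes the markup instead of scanning for the next >, quoted attribute values and entities are handled without extra work.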

Why Regex Fails on HTML
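
The failure cases named in the introduction are easy to reproduce. The following is a minimal Python sketch; the helper name strip_tags and the sample strings are made up for illustration and are not taken from any real page.

```python
import re

TAG_RE = re.compile(r'<[^>]*>')


def strip_tags(html):
    """Apply the simple tag-stripping regex (hypothetical helper)."""
    return TAG_RE.sub('', html)


# Attribute value containing ">": the match ends at the first ">",
# so the rest of the attribute leaks into the text.
print(repr(strip_tags('<a title="a > b">link</a>')))    # ' b">link'

# Comment containing ">": the tail of the comment is left behind.
print(repr(strip_tags('<!-- if a > b -->text')))        # ' b -->text'

# Script element: the tags are removed but the code inside survives.
print(repr(strip_tags('<script>alert(1)</script>Hi')))  # 'alert(1)Hi'

# Malformed HTML: with no closing ">", nothing matches at all.
print(repr(strip_tags('Hello <b world')))               # 'Hello <b world'

# Entities are not decoded.
print(repr(strip_tags('<p>Tom &amp; Jerry</p>')))       # 'Tom &amp; Jerry'
```

The regex has no notion of quoted strings, comments, or element types; it only looks for the next >. An HTML parser tokenizes those constructs according to the language's rules, which is why it is the safer choice for untrusted or messy input.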

Try It Yourself

Test regex patterns with our Regex Tester or encode HTML with our HTML Encoder.

Built by Michael Lip. 100% client-side — no data leaves your browser.