"""Normalize OCR/PDF-converted HTML formatting glitches. This script fixes recurring structural issues observed in converted legal HTML files: 1) Heading tags that actually contain paragraph text ...