pdf-parsing/
├── pipeline/                Core ML pipeline — installable Python package
│   ├── Dolphin/             git submodule — ByteDance layout model
│   ├── html-to-markdown/    git submodule — Go HTML→Markdown converter
│   └── ...