A minimal document indexer written in Python. It computes a positional index and a term-document matrix for a document collection, then supports ranked queries based on the index and the matrix. A college ...
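To make the description above concrete, here is a minimal sketch of the three pieces it names: a positional index, a term-document matrix, and a ranked query. It is not the project's code. The function names (`tokenize`, `build_positional_index`, `build_term_document_matrix`, `rank`), the whitespace tokenizer, the sample documents, and the TF-IDF scoring are all assumptions chosen for illustration; the indexer itself may rank differently.

```python
import math
from collections import defaultdict


def tokenize(text):
    # Assumption: lowercase whitespace tokenization, no stemming or stop words.
    return text.lower().split()


def build_positional_index(docs):
    """Map each term to {doc_id: [positions where the term occurs]}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for pos, term in enumerate(tokenize(text)):
            index[term].setdefault(doc_id, []).append(pos)
    return dict(index)


def build_term_document_matrix(index, doc_ids):
    """Rows are terms, columns follow doc_ids, cells are raw term counts."""
    return {
        term: [len(postings.get(doc_id, [])) for doc_id in doc_ids]
        for term, postings in index.items()
    }


def rank(query, matrix, doc_ids):
    """Score documents by summed tf * idf over the query terms (an assumed scheme)."""
    n = len(doc_ids)
    scores = [0.0] * n
    for term in tokenize(query):
        row = matrix.get(term)
        if row is None:
            continue  # term never seen in the collection
        df = sum(1 for count in row if count > 0)
        idf = math.log(n / df)
        for i, tf in enumerate(row):
            scores[i] += tf * idf
    # Highest score first; ties broken by document id. Zero scores are dropped.
    ranked = sorted(zip(doc_ids, scores), key=lambda pair: (-pair[1], pair[0]))
    return [(doc_id, score) for doc_id, score in ranked if score > 0]


# Hypothetical sample collection.
docs = {
    "d1": "the quick brown fox",
    "d2": "the lazy dog",
    "d3": "the quick dog jumps",
}
doc_ids = sorted(docs)
index = build_positional_index(docs)
matrix = build_term_document_matrix(index, doc_ids)
results = rank("quick fox", matrix, doc_ids)
```

In this sketch the matrix is derived from the positional index, so the two stay consistent. The positions themselves are not used for scoring here; they would matter for phrase or proximity queries.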
Parse Document Model (Python) provides Pydantic models for representing text documents using a hierarchical model. This library allows you to define documents as a hierarchy of (specialised) nodes ...
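The idea of a document as a hierarchy of specialised nodes can be sketched as follows. This is not the library's API: the class names (`Node`, `Text`, `Paragraph`, `Section`, `Document`) and the `text()` method are hypothetical, and standard-library dataclasses stand in for Pydantic models so the example runs without third-party packages.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Generic tree node; specialised node types subclass it."""
    children: List[Node] = field(default_factory=list)

    def text(self) -> str:
        # Default behaviour: concatenate the text of all child nodes.
        return " ".join(child.text() for child in self.children)


@dataclass
class Text(Node):
    """Leaf node holding a run of plain text."""
    content: str = ""

    def text(self) -> str:
        return self.content


@dataclass
class Paragraph(Node):
    """Container for text nodes."""


@dataclass
class Section(Node):
    """Container with a title, followed by its children."""
    title: str = ""

    def text(self) -> str:
        body = super().text()
        return f"{self.title}: {body}" if body else self.title


@dataclass
class Document(Node):
    """Root of the hierarchy."""


# Hypothetical sample document.
doc = Document(children=[
    Section(title="Intro", children=[
        Paragraph(children=[Text(content="Hello world.")]),
        Paragraph(children=[Text(content="Second paragraph.")]),
    ]),
])
flattened = doc.text()
```

With Pydantic, each class would instead derive from `pydantic.BaseModel`, which adds validation and serialization on top of the same tree shape.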