// call `reshape_block_scales_to_sfa()` after this kernel at integration time.
// Keeping the quantize kernel layout-agnostic makes it easier to unit-test
// against a PyTorch reference.
// Launch ...
A vector quantization library originally transcribed from DeepMind's TensorFlow implementation and packaged for convenient use. It uses exponential moving averages to update the dictionary. VQ has ...
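
To make the exponential-moving-average dictionary update concrete, here is a minimal sketch in plain Python. It is not this library's code: the names `nearest_code` and `ema_update`, the `decay` and `eps` parameters, and the use of nested lists instead of tensors are all assumptions made so the example runs without dependencies. The idea it illustrates is that each code keeps a running count of assigned vectors and a running sum of those vectors, and the code itself is their ratio.

```python
def nearest_code(codebook, x):
    """Return the index of the codebook vector closest to x (squared L2 distance)."""
    def sq_dist(code):
        return sum((c - v) ** 2 for c, v in zip(code, x))
    return min(range(len(codebook)), key=lambda i: sq_dist(codebook[i]))


def ema_update(codebook, cluster_size, embed_sum, batch, decay=0.99, eps=1e-5):
    """Run one EMA step in place and return the code index chosen for each input.

    cluster_size[i] is the running count of vectors assigned to code i.
    embed_sum[i] is the running sum of those vectors.
    Hypothetical helper for illustration, not part of any library API.
    """
    k, dim = len(codebook), len(codebook[0])
    counts = [0.0] * k
    sums = [[0.0] * dim for _ in range(k)]

    # Assign every input to its nearest code and accumulate batch statistics.
    assignments = []
    for x in batch:
        i = nearest_code(codebook, x)
        assignments.append(i)
        counts[i] += 1.0
        for d in range(dim):
            sums[i][d] += x[d]

    # Blend batch statistics into the running averages, then refresh each code.
    for i in range(k):
        cluster_size[i] = decay * cluster_size[i] + (1 - decay) * counts[i]
        for d in range(dim):
            embed_sum[i][d] = decay * embed_sum[i][d] + (1 - decay) * sums[i][d]
            # eps only guards against division by zero (an assumption of this sketch).
            codebook[i][d] = embed_sum[i][d] / (cluster_size[i] + eps)
    return assignments


# Usage: two 2-D codes, initialised so that codebook == embed_sum / cluster_size.
codebook = [[0.0, 0.0], [10.0, 10.0]]
cluster_size = [1.0, 1.0]
embed_sum = [row[:] for row in codebook]
assignments = ema_update(
    codebook, cluster_size, embed_sum,
    batch=[[1.0, 1.0], [9.0, 9.0]], decay=0.5,
)
```

With `decay=0.5`, each code moves about halfway toward the mean of the inputs assigned to it. A code that receives no inputs in a batch stays roughly where it is, because its running count and running sum shrink by the same factor.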