No, really... The syntax seems to have been invented by someone who bet he could cram more brackets into code than C++ and Lisp combined. Who could come ...
Full GPT-2 small (117M) forward pass on the GPU via WebGL2 shaders; BPE tokenization using js-tiktoken in the browser (no WASM fetch); simple Python script to download the pretrained weights ...
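
To give a sense of how much weight data a full GPT-2 small forward pass has to hold on the GPU, here is a rough sketch that counts the model's parameters. The hyperparameters (vocabulary 50257, context 1024, width 768, 12 layers) are the commonly published GPT-2 small configuration and are an assumption here, not something stated above; the function name is hypothetical. With these values the count comes to roughly 124M, even though the checkpoint is usually labeled "117M".

```python
def count_gpt2_params(n_vocab=50257, n_ctx=1024, n_embd=768, n_layer=12):
    """Count weights and biases in a GPT-2 style decoder.

    Assumes the standard layout: token and position embeddings, then
    n_layer blocks of (LayerNorm, attention, LayerNorm, 4x-wide MLP),
    then a final LayerNorm. The output projection is assumed to reuse
    the token embedding matrix, so it adds no parameters.
    """
    # Token embedding table plus learned position embedding table.
    embeddings = n_vocab * n_embd + n_ctx * n_embd

    # One LayerNorm has a gain and a bias vector.
    layer_norm = 2 * n_embd

    # Attention: fused QKV projection, then the output projection.
    attention = (n_embd * 3 * n_embd + 3 * n_embd) + (n_embd * n_embd + n_embd)

    # MLP: expand to 4 * n_embd, then project back to n_embd.
    mlp = (n_embd * 4 * n_embd + 4 * n_embd) + (4 * n_embd * n_embd + n_embd)

    per_block = 2 * layer_norm + attention + mlp

    # The trailing layer_norm is the final LayerNorm after the last block.
    return embeddings + n_layer * per_block + layer_norm


if __name__ == "__main__":
    total = count_gpt2_params()
    print(f"{total:,} parameters")
    # Approximate size if every weight is stored as a 32-bit float.
    print(f"{total * 4 / 2**20:.0f} MiB as float32")
```

Most of the total sits in the 12 transformer blocks and the token embedding table, which is a useful thing to keep in mind when deciding how to pack the weights into textures.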