Reproducible recipe: serve abliterated Gemma-4-12B (gemma4_unified) at 50-118 tok/s on no-NVLink Blackwell (SM120) via vLLM nightly + ModelOpt FP8/NVFP4 + MTP spec-decode. Run Gemma 4 locally with ...
If you found this repository helpful in discovering new opportunities, don’t keep it to yourself — share it with your friends or batchmates so they can benefit too! You can also connect with me on ...