Yak Shaving for Token Speed: Chasing MTP and TurboQuant on Linux

Yak Shaving for Token Speed: Chasing MTP and TurboQuant on Linux

Inspired by the performance gains and resource efficiency of Multi-Token Prediction (MTP) and KV cache quantisation, I started what would become another yak-shaving endeavour.

I was an early adopter of Ollama to host LLMs on my MacBook and home Ubuntu server. Between the easy access to the latest models, reliable …

more ...

Interview with Techgoondu

In my role as field CTO, I sometimes speak with analysts and the media. Conversations often focus around a predetermined area of interest of set of questions. The follow text is based on my preparation for an interview with Techgoondu, Sinagpore.

Q1. What’s the Asia-Pacific AI landscape like currently …

more ...