Microsoft made a 70x more efficient LLM

This new model, BitNet b1.58, operates with ternary parameters {-1, 0, 1}, matching the performance of traditional 16-bit models at sizes of 3B parameters and above.
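
To make the ternary idea concrete, here is a minimal sketch (in PyTorch) of the absmean quantization scheme the paper describes: each weight is divided by the mean absolute value of the weight matrix, then rounded and clipped to the nearest value in {-1, 0, 1}. The function name and the per-tensor handling of the scale are illustrative assumptions, not taken from the paper's code.

```python
import torch

def absmean_ternary_quantize(w: torch.Tensor, eps: float = 1e-5):
    """Quantize a weight tensor to ternary values {-1, 0, 1}.

    Sketch of the absmean scheme from the BitNet b1.58 paper:
    scale by the mean absolute value, then round and clip.
    (Name and per-tensor granularity are illustrative.)
    """
    gamma = w.abs().mean()                      # absmean scaling factor
    w_scaled = w / (gamma + eps)                # normalize by the scale
    w_ternary = w_scaled.round().clamp(-1, 1)   # snap to {-1, 0, 1}
    return w_ternary, gamma

# Example: quantize a random weight matrix
w = torch.randn(4, 4)
w_q, gamma = absmean_ternary_quantize(w)
print(w_q)     # entries are only -1.0, 0.0, or 1.0
print(gamma)   # scale for approximate dequantization: w ≈ gamma * w_q
```

Because every weight is -1, 0, or 1, matrix multiplication reduces to additions and subtractions, which is where the energy savings come from.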

The work outlines an approach for developing future Large Language Models (LLMs) that balance strong performance with cost efficiency, which matters because energy consumption is a major hurdle in scaling LLMs.

Moreover, this advancement could enable a new computation paradigm and pave the way for specialized hardware optimized for 1-bit LLMs.
