Anemll

@anemll · 4,113 subscribers

ANEMLL (pronounced like "animal"): Artificial Neural Engine Machine Learning Library, an open-source project.

Shorts

Qwen 3.5 0.8B with Gated DeltaNet attention is running on the Apple Neural Engine at ~56 tokens/s with LUT6 quantization, with some room for optimization left. The stack is CoreML, Swift, and IOSurface on an M4 Pro. It will slow down as the context grows, but not by much. I think the private API opens the way to integrating the ANE with GPU/MLX and possibly some MoE.
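For readers unfamiliar with the term, "LUT6" is commonly used for 6-bit lookup-table (palettized) weights: each weight is stored as a 6-bit index into a table of 2^6 = 64 shared values. The sketch below illustrates that idea with a plain 1-D k-means in pure Python. It is not ANEMLL's or CoreML's implementation; the function names `palettize` and `dequantize` are hypothetical, and real tooling typically clusters per tensor or per group with more careful initialization.

```python
import random

def nearest(lut, w):
    """Index of the lookup-table entry closest to weight w."""
    return min(range(len(lut)), key=lambda j: abs(lut[j] - w))

def palettize(weights, bits=6, iters=10):
    """Cluster scalar weights into 2**bits centroids (hypothetical helper).

    Returns (lut, indices) such that weights[i] is approximated by
    lut[indices[i]]. Uses Lloyd's k-means with evenly spaced initial centroids.
    """
    k = 2 ** bits
    lo, hi = min(weights), max(weights)
    # Start with centroids at the midpoints of k equal-width bins.
    lut = [lo + (hi - lo) * (j + 0.5) / k for j in range(k)]
    for _ in range(iters):
        sums, counts = [0.0] * k, [0] * k
        for w in weights:
            j = nearest(lut, w)
            sums[j] += w
            counts[j] += 1
        # Move each centroid to the mean of its members; keep empty ones as-is.
        lut = [sums[j] / counts[j] if counts[j] else lut[j] for j in range(k)]
    indices = [nearest(lut, w) for w in weights]
    return lut, indices

def dequantize(lut, indices):
    """Rebuild approximate weights from the table and the 6-bit indices."""
    return [lut[i] for i in indices]

# Usage: palettize 1,024 synthetic weights and measure the reconstruction error.
rng = random.Random(0)
weights = [rng.gauss(0.0, 0.02) for _ in range(1024)]
lut, indices = palettize(weights, bits=6)
approx = dequantize(lut, indices)
mse = sum((a - w) ** 2 for a, w in zip(approx, weights)) / len(weights)
```

Storing a 6-bit index per weight plus one small table per tensor is what shrinks the model relative to 16-bit weights; the trade-off is the reconstruction error measured by `mse` above.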

13,589 views

Videos
