Edge AI • Optimization
Transformer Architecture Optimization for Edge Devices
"We introduce MobileFormer, a novel architecture that reduces inference latency by 40% on mobile CPUs while maintaining BERT-base performance levels..."
J. Zhang, Google Research • ICLR 2025 • 142k Reads