Microsoft has launched Fara-7B, a lightweight 7-billion-parameter model designed to operate a computer from a single screenshot. The model works like a digital assistant: it studies what appears on the screen and performs actions such as clicking, typing, and navigating. It builds on Microsoft's earlier work on small language models and arrives as a more efficient approach to agentic automation. Because it runs directly on devices, it avoids the heavy cloud requirements that slow down many multi-model agent stacks, giving users faster responses and stronger privacy.
Fara-7B functions as a Computer Use Agent, focusing on actions rather than long text generation. Unlike traditional agent systems, it does not rely on a large stack of components or heavy server compute; it behaves as a single, unified model built to complete everyday digital tasks. Many AI agents need substantial compute setups just to understand a screen, so this compact approach shifts the focus toward accessibility and ease of use.
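The announcement does not include code, but the single-model loop it describes can be pictured roughly as follows. Everything here, including `predict_action`, `capture_screenshot`, and the `Action` type, is a hypothetical illustration rather than the Fara-7B API.

```python
from dataclasses import dataclass, field

@dataclass
class Action:
    kind: str                      # e.g. "click", "type", "scroll", "done"
    payload: dict = field(default_factory=dict)

def run_task(model, browser, task: str, max_steps: int = 30) -> list[Action]:
    """Repeatedly capture a screenshot, ask the model for the next action,
    execute it, and stop when the model reports the task is complete."""
    history: list[Action] = []
    for _ in range(max_steps):
        screenshot = browser.capture_screenshot()                 # current screen state
        action = model.predict_action(task, screenshot, history)  # single forward pass
        if action.kind == "done":
            break
        browser.execute(action)                                   # click, type, navigate, ...
        history.append(action)
    return history
```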
How Fara-7B Was Built
The model’s design emphasizes simplicity: it studies a screenshot and then decides the next action, which makes deployment easier and more affordable. To produce training data for this behavior, engineers built a synthetic data pipeline called FaraGen, which lets AI agents interact with real websites across 70,000 domains. The pipeline generates realistic multi-step sessions that include retries, mistakes, scrolling, and searching.
Each session passes through multiple AI evaluation layers to ensure accuracy and alignment with what appears onscreen. After filtering, the team retained 145,630 verified sessions containing more than one million individual actions for training. Consequently, Fara-7B learned to handle a wide range of real tasks.
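A rough way to picture that filtering stage, under the assumption that each session carries a score from every evaluation layer; the field names are illustrative, not the actual FaraGen schema.

```python
from dataclasses import dataclass

@dataclass
class Session:
    task: str
    actions: list[dict]            # e.g. {"kind": "click", "x": 412, "y": 233}
    verifier_scores: list[float]   # one score per AI evaluation layer

def keep_for_training(session: Session, threshold: float = 0.9) -> bool:
    # A session is retained only if every evaluation layer judges it accurate
    # and consistent with what appeared onscreen.
    return bool(session.verifier_scores) and all(
        score >= threshold for score in session.verifier_scores
    )
```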
Fara-7B uses around 124,000 input tokens and roughly 1,100 output tokens for each task. A typical task costs around 2.5 cents, which is significantly lower than costs associated with larger reasoning models.
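A quick back-of-the-envelope check of that figure; the per-million-token prices below are illustrative assumptions, not published rates.

```python
# Illustrative cost check: the per-million-token prices are assumptions,
# not published Fara-7B or Microsoft Foundry rates.
input_tokens, output_tokens = 124_000, 1_100
price_in, price_out = 0.20, 0.60   # hypothetical USD per million tokens

cost = input_tokens / 1e6 * price_in + output_tokens / 1e6 * price_out
print(f"${cost:.4f} per task")     # ≈ $0.0255, i.e. roughly 2.5 cents
```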
Performance, Availability, and Developer Access
Benchmark scores show strong results for a lightweight model. Fara-7B achieves 73.5 percent on WebVoyager, 34.1 percent on Online-Mind2Web, 26.2 percent on DeepShop, and 38.4 percent on WebTailBench. These benchmarks cover real-world tasks such as online shopping, job applications, and general web navigation.
Fara-7B is available on Microsoft Foundry and Hugging Face under an MIT license. It also integrates with Magentic-UI, a research interface designed for testing and experimentation. In addition, a quantized and silicon-optimized version supports Copilot+ PCs running Windows 11, enabling users to install and run the model locally.
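For a first local experiment, loading the open weights might look something like the sketch below. This assumes the checkpoint is published under the `microsoft/Fara-7B` identifier on Hugging Face and works with the generic transformers vision-language classes; consult the model card for the exact usage.

```python
# Minimal sketch of pulling the open weights from Hugging Face. Assumes the
# checkpoint lives at "microsoft/Fara-7B" and loads through the generic
# transformers image-text-to-text classes; check the model card for details.
from transformers import AutoModelForImageTextToText, AutoProcessor

model_id = "microsoft/Fara-7B"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(model_id, device_map="auto")
```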
With open weights and simplified deployment, Fara-7B lowers barriers for developers who want to build and experiment with Computer Use Agent technology. It encourages broader innovation in automation, particularly for everyday web-based tasks.