Zhipu AI Open-Sources AutoGLM, Bringing Hands-Free Phone Operation Closer to Reality

Chinese artificial intelligence company Zhipu AI has taken a notable step in the evolution of AI agents by open-sourcing AutoGLM, a model designed to operate smartphones independently. According to reports, AutoGLM is the first AI agent to demonstrate stable and practical phone use directly on devices, moving beyond simple command execution to full interaction with mobile apps. The release highlights growing momentum behind AI systems that can act on behalf of users in everyday digital environments.

From commands to real phone interaction

AutoGLM represents a shift in how AI agents interact with mobile devices. Instead of relying on predefined shortcuts or backend integrations, the model interprets what appears on the screen and responds in a human-like way. It can recognize interface elements, understand context, and simulate taps, swipes, and text input. This allows it to carry out complete workflows such as ordering food, booking flights, or navigating shopping apps without direct human intervention.
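
To make the idea of simulated taps, swipes, and text input concrete, here is a minimal Python sketch of how an agent's chosen action might be translated into Android's standard `adb shell input` commands. The `Tap`, `Swipe`, and `TypeText` classes and the `to_adb_args` helper are hypothetical names invented for illustration; the report does not describe AutoGLM's actual action format or how it drives the device.

```python
from dataclasses import dataclass
from typing import List, Union


# Hypothetical action types; not taken from AutoGLM itself.
@dataclass(frozen=True)
class Tap:
    x: int
    y: int


@dataclass(frozen=True)
class Swipe:
    x1: int
    y1: int
    x2: int
    y2: int
    duration_ms: int = 300


@dataclass(frozen=True)
class TypeText:
    text: str


Action = Union[Tap, Swipe, TypeText]


def to_adb_args(action: Action) -> List[str]:
    """Build the argument list for an `adb shell input ...` command."""
    base = ["adb", "shell", "input"]
    if isinstance(action, Tap):
        return base + ["tap", str(action.x), str(action.y)]
    if isinstance(action, Swipe):
        return base + [
            "swipe",
            str(action.x1), str(action.y1),
            str(action.x2), str(action.y2),
            str(action.duration_ms),
        ]
    if isinstance(action, TypeText):
        # `input text` expects spaces to be written as %s.
        return base + ["text", action.text.replace(" ", "%s")]
    raise TypeError(f"unsupported action: {action!r}")


tap_cmd = to_adb_args(Tap(540, 1200))
swipe_cmd = to_adb_args(Swipe(540, 1600, 540, 400))
text_cmd = to_adb_args(TypeText("fried rice"))
```

On a machine with a connected Android device, each argument list could be handed to `subprocess.run`. The sketch only builds the commands, so it can be read and checked without a phone attached.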

This approach mirrors how people naturally use smartphones, making AutoGLM more flexible than task-specific automation tools. By understanding visual layouts and changing interfaces, the agent can adapt to app updates and different usage scenarios more effectively.

Multi-step tasks handled end to end

One of AutoGLM’s key strengths is its ability to handle multi-step tasks from start to finish. For example, when ordering food, the model can open a delivery app, search for restaurants, select items, customize orders, enter delivery details, and confirm payment. Similar logic applies to travel bookings or online shopping. These tasks require memory, sequencing, and decision-making across multiple screens, all of which AutoGLM is designed to manage autonomously.
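
The memory and sequencing such a task requires can be sketched as a simple observe, decide, act loop. The Python example below is a toy illustration under stated assumptions, not AutoGLM's implementation: `FakePhone`, `scripted_policy`, and `run_task` are hypothetical, and a fixed lookup table stands in for the model's decision-making.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Step:
    screen: str  # what the agent saw
    action: str  # what it did in response


@dataclass
class FakePhone:
    """Hypothetical stand-in for a device: an action may change the screen."""
    transitions: Dict[Tuple[str, str], str]
    screen: str = "home"

    def observe(self) -> str:
        return self.screen

    def perform(self, action: str) -> None:
        # Unknown (screen, action) pairs leave the screen unchanged.
        self.screen = self.transitions.get((self.screen, action), self.screen)


# A policy sees the goal, the current screen, and the history so far.
Policy = Callable[[str, str, List[Step]], Optional[str]]


def run_task(goal: str, phone: FakePhone, policy: Policy,
             max_steps: int = 10) -> List[Step]:
    """Observe, decide, act, and record until the policy stops or steps run out."""
    history: List[Step] = []
    for _ in range(max_steps):
        screen = phone.observe()
        action = policy(goal, screen, history)
        if action is None:  # the policy considers the task finished
            break
        phone.perform(action)
        history.append(Step(screen, action))
    return history


def scripted_policy(goal: str, screen: str, history: List[Step]) -> Optional[str]:
    # Assumption: a fixed table replaces the model; a real agent would
    # choose the next action from the screen contents and the history.
    plan = {
        "home": "open_delivery_app",
        "restaurant_list": "select_restaurant",
        "menu": "add_item",
        "cart": "confirm_order",
    }
    return plan.get(screen)


phone = FakePhone(transitions={
    ("home", "open_delivery_app"): "restaurant_list",
    ("restaurant_list", "select_restaurant"): "menu",
    ("menu", "add_item"): "cart",
    ("cart", "confirm_order"): "order_placed",
})
history = run_task("order lunch", phone, scripted_policy)
```

The `history` list is the loop's memory: every decision has access to the screens already seen and the actions already taken. The `max_steps` cap is a simple guard against an agent that never reaches its goal.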

This capability moves AI agents closer to being true digital assistants rather than simple tools triggered by single commands. It also reduces friction for users by automating repetitive or time-consuming interactions.

Broad support across popular Chinese apps

AutoGLM currently supports core workflows across more than fifty high-frequency Chinese applications. These include widely used platforms such as WeChat, Taobao, Douyin, and Meituan. Supporting such a large and diverse app ecosystem is significant because each app has its own interface logic and interaction patterns.

By demonstrating compatibility with major social, commerce, and service apps, Zhipu AI shows that AutoGLM is not limited to experimental use cases. Instead, it is designed for real-world deployment in scenarios that millions of users encounter daily.

Why on-device operation matters

A notable aspect of AutoGLM is its stable on-device operation. Running directly on the phone reduces reliance on constant cloud connectivity and can improve responsiveness and privacy. On-device agents are also better positioned to interact seamlessly with system-level features and apps.

This design choice aligns with broader industry trends favoring edge AI, where computation happens closer to the user. As smartphones become more powerful, on-device agents like AutoGLM could become standard features rather than niche experiments.

Open-sourcing as a strategic signal

By open-sourcing AutoGLM, Zhipu AI is inviting developers, researchers, and partners to explore and build upon the model. This move can accelerate adoption and refinement, while also positioning Zhipu AI as a key contributor to the emerging AI agent ecosystem. Open-sourcing also increases transparency, allowing the wider community to evaluate performance, limitations, and potential risks.

In a competitive AI landscape, sharing foundational technology can help establish standards and attract talent, even as companies continue to develop proprietary applications on top.

Implications for the future of mobile AI

AutoGLM points toward a future where AI agents handle routine digital tasks with minimal user input. Instead of switching between apps and manually completing steps, users could delegate actions through natural language instructions. This could reshape how people interact with smartphones, shifting focus from screens to outcomes.

At the same time, widespread deployment raises questions about security, consent, and error handling. Ensuring that agents act reliably and transparently will be critical as capabilities expand.

A step toward practical AI agents

Zhipu AI’s open-sourcing of AutoGLM marks a meaningful advance in practical AI agents. By enabling stable, full phone operation across popular apps, the model demonstrates that autonomous digital assistance is moving from concept to reality. While challenges remain, AutoGLM shows how AI can increasingly operate within existing digital environments, bringing hands-free phone use closer to everyday life.