Top Qs
Timeline
Chat
Perspective

Agentic AI

Systems that perform tasks without human intervention From Wikipedia, the free encyclopedia

Remove ads

In the context of generative artificial intelligence, AI agents (also referred to as compound AI systems or agentic AI) are a class of intelligent agents distinguished by their ability to operate autonomously in complex environments. Agentic AI tools prioritize decision-making over content creation and do not require human prompts or continuous oversight.[1]

Overview

Summarize
Perspective

AI agents possess several key attributes, including complex goal structures, natural language interfaces, the capacity to act independently of user supervision, and the integration of software tools or planning systems. Their control flow is frequently driven by large language models (LLMs).[2] Agents also include memory systems for remembering previous user-agent interactions and orchestration software for organizing agent components.[3]

Researchers and commentators have noted that AI agents do not have a standard definition.[2][4][5][6] The concept of agentic AI has been compared to the fictional character J.A.R.V.I.S..[7]

A common application of AI agents is the automation of tasks—for example, booking travel plans based on a user's prompted request.[8][9] Prominent examples include Devin AI, AutoGPT, and SIMA.[10] Further examples of agents released since 2025 include OpenAI Operator,[11] ChatGPT Deep Research,[12] Manus,[13] Quark (based on Qwen),[14] AutoGLM Rumination,[14] and Coze (by ByteDance).[14] Frameworks for building AI agents include LangChain,[15] as well as tools such as CAMEL,[16][17] Microsoft AutoGen,[18] and OpenAI Swarm.[19]

Companies such as Google, Microsoft and Amazon Web Services have offered platforms for deploying pre-built AI agents.[20]

Proposed protocols for standardizing inter-agent communication include the Agent Protocol (by LangChain), the Model Context Protocol (by Anthropic), AGNTCY,[21] Gibberlink,[22] the Internet of Agents,[23] Agent2Agent (by Google),[24] and the Agent Network Protocol.[25] Some of these protocols are also used for connecting agents with external applications.[3] Software frameworks for addressing agent reliability include AgentSpec, ToolEmu, GuardAgent, Agentic Evaluations, and predictive models from H2O.ai.[26]

In February 2025, Hugging Face released Open Deep Research, an open source version of OpenAI Deep Research.[27] Hugging Face also released a free web browser agent, similar to OpenAI Operator.[28] Galileo AI published on Hugging Face a leadership board for agents, which ranks their performance based on their underlying LLMs.[29]

Memory systems for agents include Mem0,[30][31] MemGPT,[32] and MemOS.[33]

Remove ads

History

AI agents have been traced back to research from the 1990s, with Harvard professor Milind Tambe noting that the definition of an AI agent was not clear at the time either. Researcher Andrew Ng has been credited with spreading the term "agentic" to a wider audience in 2024.[34]

Training and testing

Researchers have attempted to build world models[35][36] and reinforcement learning environments[37] to train or evaluate AI agents.

Autonomous capabilities

The Financial Times compared the autonomy of AI agents to the SAE classification of self-driving cars, comparing most applications to level 2 or level 3, with some achieving level 4 in highly specialized circumstances, and level 5 being theoretical.[38]

Multimodal AI agents

In addition to large language models (LLMs), vision-language models (VLMs) and multimodal foundation models can be used as the basis for agents. In September 2024, Allen Institute for AI released an open-source vision-language model, which Wired noted could give AI agents the ability to perform complex computer tasks, including the possibility of automated computer hacking.[39] Nvidia released a framework for developers to use VLMs, LLMs and retrieval-augmented generation for building AI agents that can analyze images and videos, including video search and video summarization.[40][41] Microsoft released a multimodal agent model – trained on images, video, software user interface interactions, and robotics data – that the company claimed can manipulate software and robots.[42]

Remove ads

Applications

Summarize
Perspective

As of April 2025, per the Associated Press, there are few real-world applications of AI agents.[43] As of June 2025, per Fortune, many companies are primarily experimenting with AI agents.[44]

A recruiter for the Department of Government Efficiency proposed in April 2025 to use AI agents to automate the work of about 70,000 United States federal government employees, as part of a startup with funding from OpenAI and a partnership agreement with Palantir. This proposal was criticized by experts for its impracticality, if not impossibility, and the lack of corresponding widespread adoption by businesses.[45]

The Information divided AI agents into seven archetypes: business-task agents, for acting within enterprise software; conversational agents, which act as chatbots for customer support; research agents, for querying and analyzing information (such as OpenAI Deep Research); analytics agents, for analyzing data to create reports; software developer or coding agents (such as Cursor); domain-specific agents, which include specific subject matter knowledge; and web browser agents (such as OpenAI Operator).[3]

By mid-2025, AI agents have been used in video game development,[46] gambling (including sports betting),[47] and cryptocurrency wallets[47] (including cryptocurrency trading and meme coins[48]). In August 2025, New York Magazine described software development as the most definitive use case of AI agents.[49] Likewise, by October 2025, noting a decline in expectations, The Information noted AI coding agents and customer support as the primary use cases by businesses.[50]

AI agents have also been integrated into operating systems. Writing in The Economist, Signal president Meredith Whittaker has noted that agents have been included in operating systems developed by Microsoft, Apple and Google.[51] In November 2025, Microsoft released a test software build of Windows 11 that included agents intended to run background tasks, with the ability to read and write personal files.[52]

In November 2025, The Wall Street Journal reported that few companies that deployed AI agents have received a return on investment.[53]

In November 2025, the Internal Revenue Service stated that it would use Agentforce, AI agents from Salesforce, for the Office of Chief Counsel, Taxpayer Advocate Services and the Office of Appeals.[54] That same month, Staffordshire Police announced that they would trial Agentforce agents for handling non-emergency 101 calls in the United Kingdom starting in 2026.[55]

Web browsing

Web browsers with integrated AI agents are sometimes called agentic browsers. Such agents can perform small tedious tasks during web browsing and potentially even perform browser actions on behalf of the user. Products like OpenAI Operator and Perplexity Comet integrate a spectrum of AI capabilities including the ability to browse the web, interact with websites and perform actions on behalf of the user.[56][57] In 2025, Microsoft launched NLWeb, an agentic web search replacement that would allow websites to use agents to query content from websites by using RSS-like interfaces that allow for the lookup and semantic retrieval of content.[58] Products integrating agentic web capabilities have been criticised for exfiltrating information about their users to third-party servers[59] and exposing security issues since the way the agents communicate often occur through non-standard protocols.[58]

Remove ads

Proposed benefits

Proponents argue that AI agents can increase personal and economic productivity,[9][60] foster greater innovation,[61] and liberate users from monotonous tasks.[61][62] A Bloomberg opinion piece by Parmy Olson argued that agents are best suited for narrow, repetitive tasks with low risk.[63] Conversely, researchers suggest that agents could be applied to web accessibility for people who have disabilities,[64][65] and researchers at Hugging Face propose that agents could be used for coordinating resources such as during disaster response.[66] The R&D Advisory Team of the BBC views AI agents as being most useful when their assigned goal is uncertain.[67] Erik Brynjolfsson suggests that AI agents are more valuable enhancing, rather than replacing, humans.[68]

Remove ads

Concerns

Summarize
Perspective

Concerns include potential issues of liability,[60][67] an increased risk of cybercrime,[8][60] ethical challenges,[60] as well as problems related to AI safety[60] and AI alignment.[8][62] Other issues involve data privacy,[8][69] weakened human oversight,[8][60][66] a lack of guaranteed repeatability,[70] reward hacking,[71] algorithmic bias,[69][72] compounding software errors,[8][10] lack of explainability of agents' decisions,[8][73] security vulnerabilities,[8][74] stifling competition,[51] problems with underemployment,[72] job displacement,[9][72] cognitive offloading,[75] and the potential for user manipulation,[73][76] misinformation[66] or malinformation.[66] They may also complicate legal frameworks and risk assessments, foster hallucinations, hinder countermeasures against rogue agents, and suffer from the lack of standardized evaluation methods.[77][8][78] They have also been criticized for being expensive[2][8] and having a negative impact on internet traffic,[8] and potentially on the environment due to high energy usage.[70][79][80] According to an estimation by Nvidia CEO Jensen Huang, AI agents would require 100 times more computing power than LLMs.[81] There is also the risk of increased concentration of power by political leaders, as AI agents may not question instructions in the same way that humans would.[71]

Journalists have described AI agents as part of a push by Big Tech companies to "automate everything".[82] Several CEOs of those companies have stated in early 2025 that they expect AI agents to eventually "join the workforce".[83][84] However, in a preprint study, Carnegie Mellon University researchers tested the behavior of agents in a simulated software company and found that none of the agents could complete a majority of the assigned tasks.[83][85] Other researchers had similar findings with Devin AI[86] and other agents in business settings[87][88] and freelance work.[89] CNN argued that statements by CEOs on the potential replacement of their employees by AI agents were a strategy to "[keep] workers working by making them afraid of losing their jobs."[90] Tech companies have pressured employees to use generative AI models in their work, including AI coding agents. Brian Armstrong, the CEO of Coinbase, fired several employees who did not.[91][92] Some business leaders have replaced some of their employees with agents, but have said that the agents would need more supervision than those employees.[50] Futurism questioned whether Amazon's previously announced efforts to replace parts of its workforce with generative AI and AI agents could have led to the October 2025 outage of Amazon Web Services.[93]

Yoshua Bengio warned at the 2025 World Economic Forum that "all of the catastrophic scenarios with AGI or superintelligence happen if we have agents".[94]

In March 2025, Scale AI signed a contract with the United States Department of Defense to work with them, in collaboration with Anduril Industries and Microsoft, to develop and deploy AI agents for the purpose of assisting the military with "operational decision-making".[95] In July 2025, Fox Business reported that the company EdgeRunner AI built an offline agent, compressed and fine-tuned on military information, with the CEO seeing more common LLMs as "heavily politicized to the left". As of that time, the company model is being used by the United States Special Operations Command in an overseas deployment.[96] Researchers have expressed concerns that agents and the large language models they are based on could be biased towards aggressive foreign policy decisions.[97][98]

Research-focused agents have the risk of consensus bias and coverage bias due to collecting information available on the public Internet.[99] NY Mag unfavorably compared the user workflow of agent-based web browsers to Amazon Alexa, which was "software talking to software, not humans talking to software pretending to be humans to use software."[100]

Agents have been linked to the dead Internet theory due to their ability to both publish and engage with online content.[101]

Agents may get stuck in infinite loops.[11][102]

Since many inter-agent protocols are being developed by large technology companies, there are concerns that those companies could use these protocols for self-benefit.[25]

A June 2025 Gartner report accused many projects described as agentic AI of being rebrands of previously released products, terming the phenomenon as "agent washing".[49]

Researchers have warned about the impact of providing AI agents access to cryptocurrency and smart contracts.[48]

During a vibe coding experiment, a coding agent by Replit deleted a production database during a code freeze, "[covered] up bugs and issues by creating fake data [and] fake reports" and responded with false information.[103][104]

In July 2025, PauseAI referred OpenAI to the Australian Federal Police, accusing the company of violating Australian laws through ChatGPT agent due to the risk of assisting the development of biological weapons.[105]

Issues with multi-agent systems include few coordination protocols between component agents, inconsistent performance, and challenges debugging.[106]

In November 2025, Anthropic claimed that a group of hackers sponsored by China attempted a cyberattack against at least 30 organizations by using Claude Code in an agentic workflow, and that several of these infiltrations had succeeded.[107] However, independent cybersecurity researchers questioned the significance of Anthropic's findings.[107][108]

Whittaker argued that the push by Big Tech companies to deploy AI agents risked security vulnerabilities across the Internet.[109]

Possible mitigation

Zico Kolter noted the possibility of emergent behavior as a result of interactions between agents, and proposed research in game theory to model the risks of these interactions.[110]

Guardrails, defined by Business Insider as "filters, rules, and tools that can be used to identify and remove inaccurate content" have been suggested to help reduce errors.[111]

To address security vulnerabilities related to data access, language models could be redesigned to separate instructions and data, or agentic applications could be required to include guardrails. These ideas were proposed in response to a zero-click exploit that affected Microsoft 365 Copilot.[44] Confidential computing has been proposed for protecting data security in projects involving AI agents and generative AI.[112]

A pre-print by Nvidia researchers has suggested small language models (SLMs) as an alternative to LLMs for AI agents, arguing that SLMs are cheaper and more energy efficient.[113][114]

The Economist has advised avoiding what Simon Willison has described as the "lethal trifecta" for AI agents and LLMs: "outside-content exposure, private-data access and outside-world communication".[115]

Remove ads

See also

References

Loading related searches...

Wikiwand - on

Seamless Wikipedia browsing. On steroids.

Remove ads