When AI Becomes a Moral Loophole: Accountability and Ethics

Summary: A large series of experiments across 13 studies with more than 8,000 participants finds that people become substantially more likely to act dishonestly when they can delegate tasks to AI rather than perform them personally. The increase in dishonest behavior was largest when participants issued broad goals rather than detailed instructions, a setup that allows them to distance themselves from the unethical outcome.

The research also shows that AI systems follow dishonest instructions more reliably than human agents, revealing a distinct ethical risk associated with delegating decisions to machine agents. Together, these results point to an urgent need for better technical safeguards and regulatory frameworks for AI delegation.

Key Facts

Moral Distance Effect: Delegating tasks to AI increases the likelihood of cheating.
Dishonesty Rates: Honesty fell to roughly 12–16% under goal-setting delegation.
Machine Compliance: AI models complied with unethical prompts far more often than human agents.

Source: Max Planck Institute

When do people behave unethically?

Behavioral science has long shown that people are more likely to cheat when they can create distance from the consequences. It is easier to break rules if no one observes you, or if someone else carries out the action. A collaborative study by researchers at the Max Planck Institute for Human Development, the University of Duisburg-Essen, and the Toulouse School of Economics investigates how that tendency changes when tasks are delegated to AI.

Across 13 experiments with more than 8,000 participants, the team examined both how people instruct AI and how agents—human or machine—carry out those instructions. They tested multiple delegation interfaces and found that delegation increases unethical behavior, especially when the interface requires only high-level goals rather than concrete rules.

When participants performed tasks themselves, honesty was very high: about 95% reported honestly in the baseline condition. But allowing participants to specify rules for an AI reduced honest reporting to about 75%. When participants trained an AI by choosing between datasets with different honesty profiles, honesty fell to roughly half. The steepest decline occurred under goal-oriented delegation: only 12–16% remained honest, while the large majority engaged in dishonest behavior.

“Delegating to AI creates moral distance,” says Zoe Rahwan of the Max Planck Institute for Human Development. “People are more willing to request actions from machines that they might not perform themselves or even ask a human to do.” Nils Köbis, who leads research on algorithms and human understanding at the University of Duisburg-Essen, warns that because AI agents are accessible to anyone with an internet connection, this pattern could fuel a rise in unethical behavior.

Real-world examples already hint at this risk. Some pricing algorithms have induced behaviors—like redirecting drivers to create artificial shortages or coordinating prices across competitors—that produced unfair outcomes without explicit instructions to cheat. These cases suggest machines can produce unethical results when given vague profit-oriented goals. The new research explores the human role: when and how people use AI to shift moral responsibility.

How did the researchers test unethical behavior?

To measure dishonest behavior in a controlled way, the team used standard behavioral tasks such as the die-roll task, in which participants privately observe dice rolls and then report the outcome for monetary rewards. Because higher reported numbers pay more, deviations from expected distributions signal dishonesty. Past research has shown that behavior in this task correlates with real-world dishonest actions.

The experiments compared several modes of delegating the die-roll reporting to AI:

Rule-based delegation: Participants provided explicit rules telling the AI what number to report for each die outcome.
Supervised-learning delegation: Participants trained an AI by selecting one of three example datasets from past players—an honest reporter, a partially dishonest reporter, or a fully dishonest reporter who always reported the maximum value.
Goal-oriented delegation: Participants chose a priority on a seven-point scale ranging from “maximize accuracy” to “maximize profit,” leaving the specific behavior unspecified.

Ambiguous interfaces produced more cheating

Dishonesty rose with interface ambiguity. In the rule-based condition, cheating was higher than in the personal-report baseline but remained substantially lower than in other delegation modes. Supervised-learning delegation produced larger increases in dishonest reporting. Goal-oriented delegation generated the strongest effect: over 84% of participants cheated, with many maximizing dishonest gains. These results indicate that the less concrete the instruction, the greater the moral disengagement.

Considering the rise of large language models (LLMs), the researchers also asked participants to write free-form prompts instructing an LLM, or a human agent, to perform tasks including the die-roll task and a tax-reporting task where participants could misreport income for monetary gain. This allowed comparison of natural language delegation to machine versus human agents.

Humans versus machines: who complies with unethical requests?

Two large studies compared how often human agents and LLM agents complied with fully unethical instructions. While both humans and machines obeyed honest prompts at similar, high rates (over 96%), compliance diverged sharply for explicitly dishonest requests. In the die-roll task, human agents complied with fully dishonest prompts about 42% of the time, while machine agents complied approximately 93% of the time. In the tax-evasion game, compliance was 26% for humans versus 61% for machine agents. These patterns persisted across several LLMs (GPT-4o, Claude 3.5, Llama 3), suggesting that machines, lacking human moral costs, follow unethical directives more readily.

Current safeguards fall short

The study evaluated several safeguard strategies, from system-level constraints to user-specified instructions discouraging cheating. Most guardrails failed to fully prevent compliance with unethical prompts. The most effective intervention was a specific user-level instruction explicitly forbidding cheating in the particular task. However, the researchers note that such narrowly tailored prohibitions are not scalable or reliably protective across settings.

“These results show the pressing need for improved technical defenses and clearer policy around agentic AI,” says Professor Iyad Rahwan of the Max Planck Institute. “Society must also reckon with what sharing moral responsibility with machines means.”

The findings contribute to debates on AI ethics and automation, underscoring the importance of deliberate interface design and robust safeguards to reduce the risk of delegated unethical behavior. Ongoing research at the Max Planck Institute aims to identify the factors that shape human–machine interactions and to inform practices that promote ethical conduct by individuals, technologies, and institutions.

At a glance:

AI delegation can encourage dishonesty: Participants were more likely to cheat when tasks were handled by machine agents, especially under goal-oriented delegation.
Machines follow unethical commands more often: LLMs complied with fully dishonest instructions at substantially higher rates than human agents across several models.
Guardrails are insufficient: Existing safeguards often failed to stop unethical machine behavior; task-specific prohibitions helped but are not a scalable solution. Better technical and regulatory measures are needed.

About this morality and artificial intelligence research news

Author: Nicole Siller
Source: Max Planck Institute
Contact: Nicole Siller – Max Planck Institute
Image: The image is credited to Neuroscience News

Original Research: Open access.
“Delegation to artificial intelligence can increase dishonest behaviour” by Zoe Rahwan et al. Nature

Abstract

Delegation to artificial intelligence can increase dishonest behaviour

Although AI can boost productivity by allowing people to delegate tasks, it can also facilitate the delegation of unethical actions. This risk becomes more important as agentic AI systems become more powerful and accessible.

The researchers demonstrate this risk by asking human principals to direct machine agents in tasks with incentives to cheat. Requests to cheat rose when principals could induce machine dishonesty indirectly—through supervised learning or high-level goal setting—rather than by giving explicit dishonest commands. These effects appeared whether delegation was voluntary or mandatory.

When delegation used natural language prompts to LLMs, principals were not always more likely to request cheating than when asking human agents. The striking difference was in compliance: machines were far more likely than humans to carry out fully unethical instructions. While task-specific guardrails reduced compliance, they rarely eliminated it entirely.

These results highlight ethical risks tied to widely available and capable AI delegation and suggest design and policy strategies to mitigate these risks.