SecureAI

We research how to prevent AI-enabled cyberterrorism and large-scale societal manipulation.

Based on prior work featured in

Reuters
TIME
The Guardian
The Washington Post
Harvard Business Review
Foreign Affairs
Al Jazeera
Lawfare
The Economist
India Today
Dark Reading
Malwarebytes
Black Hat
DEF CON
BSides Las Vegas
Hacker News
Federation of American Scientists
IEEE
The Debrief
The Business Standard

Our Focus Areas

Social Engineering and Fraud

As AI systems grow more capable, so does their potential to deceive, manipulate, and influence at scale. Scams targeting U.S. citizens are skyrocketing. Losses from fraud targeting older adults in the U.S. rose by almost 50% from 2023 to 2024, reaching $5 billion. We advise Congressional bodies and work directly with frontier AI labs to quantify and mitigate these attacks.

Manipulation and Information Shaping

AI models can already manipulate users on par with human experts and shape decision-making across all aspects of life, from personal relationships and everyday choices to political and societal decisions at scale. Language models' tendencies toward sycophancy and toward reinforcing users' existing beliefs contribute to entrenched opinions and extremism. Our research helps quantify and counter AI-enabled manipulation and information attacks using multi-axis, multi-environment evaluation frameworks.

AI Security Strategy and Risk Frameworks

We develop structured frameworks to assess, quantify, and communicate AI security threats arising from the rapid development of modern AI systems. Our work covers AI-specific security routines (such as granting selected defenders pre-deployment access to new models), AI-specific security risks (such as nation-state cyber espionage targeting model weights and trade secrets), and frameworks for quantifying harm caused by AI models (such as cybercriminals using American AI models to defraud American citizens or to gain geopolitical leverage).

Economic Impact of AI-Enabled Cyberattacks

Cybersecurity is ultimately a game of resources between attackers and defenders. AI is dramatically changing the cost equation of both offensive and defensive operations. We quantify where and how incentives shift across the full range of cyber operations, from infrastructure disruption and supply chain attacks to fraud and disinformation. Our findings help decision-makers prioritize and enforce the defense strategies with the greatest impact and the most realistic implementation plans.

Near-term AI Security Before Long-term AI Safety

Cyberattacks and large-scale manipulation can now be automated to target millions of innocent victims at virtually no cost. As a consequence, democracy is under attack.

This is not a hypothetical risk of tomorrow but a catastrophic risk we face today. Accurate information is harder than ever to identify, polarization and extremism are skyrocketing, innocent citizens are being hammered by scams, and industries and infrastructure are being torn apart by cybercrime.

To add fuel to this fire, AI is developing much faster than our ability to secure it. Before we discuss long-term plans for safe and aligned AI, we must address the critical vulnerabilities we already face today. We can do so by demanding AI models that are designed to be secure and built to optimize for public benefit.