What It Does
Redwood Research is a Berkeley-based 501(c)(3) nonprofit AI safety organization whose primary mission is aligning superhuman AI systems. Led by CEO Buck Shlegeris and Chief Scientist Ryan Greenblatt, the organization focuses on three research threads: AI control (techniques for keeping AI systems safe even if they are misaligned), AI scheming detection (evaluating whether frontier models might strategically deceive evaluators), and empirical alignment research (interpretability and evaluation methodology).
Unlike METR, which focuses on capability assessment, Redwood’s work is more directly oriented toward solutions: developing control protocols that assume misalignment and ask whether we can still catch misbehavior in time. The organization also runs Constellation, a ~30,000 sq ft coworking space in Berkeley hosting staff from Open Philanthropy, ARC, Atlas Fellowship, and CEA, making it a physical node in the EA-aligned AI safety ecosystem.
Key Features
- AI Control research: formal threat models for misaligned AI, protocols for catching scheming behavior before it causes harm
- Triframe and related agent architectures for AI evaluation scaffolding
- “Making Deals with Early Schemers” and related work on AI strategic deception
- Pre-deployment evaluation consulting for frontier labs (collaborative with METR)
- Ryan Greenblatt’s publicly tracked personal AI timeline estimates, cited widely in the AI safety community
- Constellation office space: hub for EA-adjacent AI safety organizations in Berkeley
- Public Substack/blog (blog.redwoodresearch.org) publishing research and opinion from staff
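To make the control-protocol idea concrete, here is a minimal, hypothetical sketch of a trusted-monitoring setup in the spirit of Redwood's AI control research: a trusted (weaker) monitor assigns a suspicion score to each action an untrusted model proposes, the most suspicious actions within a small audit budget are escalated to humans, and moderately suspicious ones are deferred to a trusted model. All names, thresholds, and routing logic below are illustrative assumptions, not Redwood's actual protocol.

```python
from dataclasses import dataclass

@dataclass
class Action:
    description: str
    suspicion: float  # score from the trusted monitor, 0.0 (benign) to 1.0

def trusted_monitoring(
    actions: list[Action],
    audit_budget: float = 0.02,    # fraction of actions humans can review
    defer_threshold: float = 0.5,  # above this, substitute a trusted model's action
) -> list[str]:
    """Route each untrusted-model action to execute / defer / audit."""
    n_audits = max(1, int(len(actions) * audit_budget))
    # Spend the scarce human-audit budget on the most suspicious actions.
    by_suspicion = sorted(actions, key=lambda a: a.suspicion, reverse=True)
    audited = {id(a) for a in by_suspicion[:n_audits]}
    decisions = []
    for a in actions:
        if id(a) in audited:
            decisions.append("audit")    # escalate to human review
        elif a.suspicion > defer_threshold:
            decisions.append("defer")    # replace with trusted model's output
        else:
            decisions.append("execute")  # run the untrusted action as-is
    return decisions

actions = [Action("write unit test", 0.1),
           Action("curl unknown host", 0.9),
           Action("refactor module", 0.3)]
print(trusted_monitoring(actions, audit_budget=0.34))
# → ['execute', 'audit', 'execute']
```

The point of the sketch is the safety argument's shape: even if the untrusted model is scheming, the protocol bounds how much unaudited, high-suspicion work it can execute.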
Use Cases
- AI safety policy: Governments and researchers using Redwood’s control protocol work to understand what procedural safeguards could reduce risk from advanced AI systems
- Evaluator methodology: Labs and evaluation bodies drawing on Redwood’s scheming detection research to design evaluations robust to strategic deception
- Research collaboration: AI safety-focused organizations sharing the Constellation space and collaborating on alignment work
- Timeline calibration: technical directors and strategy teams tracking Greenblatt’s published timeline estimates as one data point in scenario planning for AI capability timelines
Adoption Level Analysis
Small teams (<20 engineers): Not directly applicable as a product or toolset. Relevant as a source of public research on AI safety methodology and evaluation design; blog posts and arXiv papers are freely accessible.
Medium orgs (20–200 engineers): Relevant if building AI agents and concerned about safety properties. Redwood’s control protocol research and scheming detection work can inform evaluation design. Not a vendor relationship.
Enterprise (200+ engineers): Primary audience for indirect influence. Frontier AI labs, government bodies, and large-scale AI deployers engage with Redwood’s research for safety policy development. The control protocol work is especially relevant for organizations deploying autonomous agents in high-stakes settings.
Alternatives
| Alternative | Key Difference | Prefer when… |
|---|---|---|
| METR | Focuses on capability benchmarking and pre-deployment evaluation, less on control protocols | You need formal third-party evaluation of whether a model has dangerous autonomous capabilities |
| Apollo Research | Specializes in AI scheming and deception evaluation specifically | You need targeted evaluation of strategic deception and goal-directed misalignment |
| UK AISI / DSIT | Government-backed evaluation body; produces the Inspect framework | You need a government-backed evaluation framework or compliance evidence |
| ARC (Alignment Research Center) | Theory-focused alignment organization (e.g., eliciting latent knowledge); its evals arm spun out to become METR | You need theoretical alignment research rather than empirical control work |
Evidence & Sources
- Redwood Research official site
- Critiques of prominent AI safety labs: Redwood Research (EA Forum)
- Ryan Greenblatt on AI Control, Timelines, and Slowing Down (80,000 Hours Podcast)
- Redwood Research Funding Analysis (Extruct AI)
- AIs can now often do massive easy-to-verify SWE tasks (Ryan Greenblatt, LessWrong)
Notes & Caveats
- Funding concentration: Redwood has received approximately $21M in total funding, with $20M from Open Philanthropy (OP). This creates a single-donor dependency risk. If OP reallocated priorities, Redwood would face significant financial pressure.
- Community selection bias: As a prominent institution in the EA/AI safety community, Redwood’s public communications (including staff blog posts and timeline estimates) may be subject to community norms that reward doom-adjacent updates. This is a soft but real publication bias to account for when consuming timeline estimates.
- Small team, outsized influence: The organization is small (~30–50 people) but its work is referenced by major AI labs and governments. This creates a bottleneck and makes the work potentially fragile to key-person departures.
- Not a product or service: Redwood does not sell tools, APIs, or SaaS. Catalog value is as a reference organization for understanding the AI safety research landscape and tracking influential AI timeline estimates.
- Relationship with Anthropic: Redwood’s leadership and staff have close working ties to Anthropic (e.g., the 2024 alignment-faking study was a joint Anthropic–Redwood project). This creates collaborative advantages but also epistemic proximity that could affect how Anthropic’s model capabilities are perceived and reported.