Skip to content

Redwood Research

★ New
assess
AI / ML vendor N/A (nonprofit research organization) free

At a Glance

Nonprofit AI safety research organization focused on AI control, alignment, and pre-deployment evaluation, home to the triframe agent architecture and influential work on AI scheming detection.

Type
vendor
Pricing
free
License
N/A
Adoption fit
enterprise
Top alternatives

What It Does

Redwood Research is a Berkeley-based 501(c)(3) nonprofit AI safety organization whose primary mission is aligning superhuman AI systems. Founded by Buck Shlegeris and Ryan Greenblatt (now Chief Scientist), the organization focuses on three research threads: AI control (techniques for keeping AI systems safe even if they are misaligned), AI scheming detection (evaluating whether frontier models might strategically deceive evaluators), and empirical alignment research (interpretability, evaluation methodology).

Unlike METR, which focuses on capability assessment, Redwood’s work is more directly oriented toward solutions: developing control protocols that assume misalignment and ask whether we can still catch misbehavior in time. The organization also runs Constellation, a ~30,000 sq ft coworking space in Berkeley hosting staff from Open Philanthropy, ARC, Atlas Fellowship, and CEA, making it a physical node in the EA-aligned AI safety ecosystem.

Key Features

  • AI Control research: formal threat models for misaligned AI, protocols for catching scheming behavior before it causes harm
  • Triframe and related agent architectures for AI evaluation scaffolding
  • “Making Deals with Early Schemers” and related work on AI strategic deception
  • Pre-deployment evaluation consulting for frontier labs (collaborative with METR)
  • Ryan Greenblatt’s publicly-tracked personal AI timeline estimates, cited widely in the AI safety community
  • Constellation office space: hub for EA-adjacent AI safety organizations in Berkeley
  • Public Substack/blog (blog.redwoodresearch.org) publishing research and opinion from staff

Use Cases

  • AI safety policy: Governments and researchers using Redwood’s control protocol work to understand what procedural safeguards could reduce risk from advanced AI systems
  • Evaluator methodology: Labs and evaluation bodies drawing on Redwood’s scheming detection research to design evaluations robust to strategic deception
  • Research collaboration: AI safety-focused organizations sharing the Constellation space and collaborating on alignment work
  • Timeline calibration: Technical Directors and strategy teams tracking Greenblatt’s published timeline estimates as one data point in scenario planning for AI capability timelines

Adoption Level Analysis

Small teams (<20 engineers): Not directly applicable as a product or toolset. Relevant as a source of public research on AI safety methodology and evaluation design. Blog posts and arXiv papers are accessible.

Medium orgs (20–200 engineers): Relevant if building AI agents and concerned about safety properties. Redwood’s control protocol research and scheming detection work can inform evaluation design. Not a vendor relationship.

Enterprise (200+ engineers): Primary audience for indirect influence. Frontier AI labs, government bodies, and large-scale AI deployers engage with Redwood’s research for safety policy development. The control protocol work is especially relevant for organizations deploying autonomous agents in high-stakes settings.

Alternatives

AlternativeKey DifferencePrefer when…
METRFocuses on capability benchmarking and pre-deployment evaluation, less on control protocolsYou need formal third-party evaluation of whether a model has dangerous autonomous capabilities
Apollo ResearchSpecializes in AI scheming and deception evaluation specificallyYou need targeted evaluation of strategic deception and goal-directed misalignment
UK AISI / DSITGovernment body with regulatory mandate; produces Inspect frameworkYou need government-backed evaluation framework or compliance evidence
ARC (Alignment Research Center)Precursor organization; broader interpretability and elicitation workYou need mechanistic interpretability or elicitation research

Evidence & Sources

Notes & Caveats

  • Funding concentration: Redwood has received approximately $21M in total funding, with $20M from Open Philanthropy (OP). This creates a single-donor dependency risk. If OP reallocated priorities, Redwood would face significant financial pressure.
  • Community selection bias: As a prominent institution in the EA/AI safety community, Redwood’s public communications (including staff blog posts and timeline estimates) may be subject to community norms that reward doom-adjacent updates. This is a soft but real publication bias to account for when consuming timeline estimates.
  • Small team, outsized influence: The organization is small (~30–50 people) but its work is referenced by major AI labs and governments. This creates a bottleneck and makes the work potentially fragile to key-person departures.
  • Not a product or service: Redwood does not sell tools, APIs, or SaaS. Catalog value is as a reference organization for understanding the AI safety research landscape and tracking influential AI timeline estimates.
  • Relationship with Anthropic: Redwood’s founders and staff have close ties to Anthropic (Greenblatt previously worked there). This creates collaborative advantages but also potential epistemic proximity that could affect how Anthropic’s model capabilities are perceived and reported.

Related