
Vera


At a Glance

Experimental MIT-licensed programming language designed for LLM code generation that replaces variable names with typed De Bruijn slot references, mandates algebraic effects and formal contracts, and compiles to WebAssembly.

Type: open-source
Pricing: open-source
License: MIT
Adoption fit: small
Top alternatives: Dafny, Python / TypeScript, Lean 4, Koka

What It Does

Vera is an experimental programming language specifically designed for large language models to write, not humans to read. It replaces variable names entirely with typed De Bruijn slot references (@Type.index, where .0 is the most recently bound value of that type), mandates explicit algebraic effect declarations on every function (IO, Http, State, Exn, Inference, Async), and requires full formal contracts — preconditions (requires), postconditions (ensures), and termination proofs (decreases) — verified against the Z3 SMT solver. Programs compile to WebAssembly and run via wasmtime on the CLI or natively in the browser.
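The slot-reference rule described above can be modeled in ordinary Python. The sketch below illustrates only the lookup semantics — @Type.index resolves to the index-th most recent binding *of that type*, counting inward-out — and the `Slots` class and its method names are invented for this example, not Vera API:

```python
# Illustrative Python model (not Vera itself) of typed De Bruijn slot
# references: @Type.index resolves to the index-th most recently bound
# value of that type, counting from the innermost binding.

class Slots:
    """Hypothetical simulation of Vera's @Type.index lookup."""

    def __init__(self):
        self.bindings = []  # innermost binding sits at the end

    def bind(self, value):
        self.bindings.append(value)

    def ref(self, typ, index):
        # Walk outward from the innermost binding, counting only
        # values of the requested type.
        seen = 0
        for value in reversed(self.bindings):
            if type(value) is typ:
                if seen == index:
                    return value
                seen += 1
        raise LookupError(f"@{typ.__name__}.{index} is unbound")

env = Slots()
env.bind(10)        # an Int
env.bind("hello")   # a Str
env.bind(42)        # another Int, now the innermost Int

env.ref(int, 0)     # -> 42       (@Int.0: innermost integer binding)
env.ref(int, 1)     # -> 10       (@Int.1: next-outer integer binding)
env.ref(str, 0)     # -> "hello"  (@Str.0)
```

The model also shows the failure mode the review flags: shifting any binding renumbers every later reference of that type, which is the "index-off-by-one" risk.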

The project’s thesis is that standard programming languages optimise for human authorship in ways that create avoidable failure modes for LLMs: flexible variable naming enables incoherence, implicit effects hide state changes, and optional contracts leave correctness assumptions unverified. Vera removes these degrees of freedom. At v0.0.108 (April 2026) the reference implementation is written in Python, covers ~122 built-in functions, includes a 13-chapter language specification, and has 3,205+ tests at 96% coverage.

Key Features

  • Typed slot references instead of variable names: All bindings addressed as @Type.index (De Bruijn indexing) — @Int.0 is the innermost integer binding, @Int.1 the next. Eliminates naming-coherence errors but introduces index-off-by-one risks.
  • Mandatory effect declarations: Every function’s algebraic effects (IO, Http, State, Exn, Inference, Async, Diverge, pure) must be explicitly declared; the type system enforces effect handling. LLM calls are a first-class effect via Inference.
  • Three-tier verification: Tier 1 (Z3 SMT static proof, covers linear arithmetic and simple recursion), Tier 2 (guided with hints), Tier 3 (runtime WASM trap fallback). Most real programs fall to Tier 3.
  • Refinement types: Type constraints such as { @Int | @Int.0 > 0 } (positive integer) expressed inline and checked by Z3 where decidable.
  • Contract-driven testing: vera test generates input counterexamples from contract specifications via Z3 constraint solving.
  • WebAssembly output: Single WASM binary runnable via wasmtime (CLI) or browser; vera compile --target browser emits WASM + self-contained JS runtime + HTML.
  • SKILL.md agent interface: A machine-readable language reference at /SKILL.md is the primary interface for AI coding agents (Claude Code, Cursor, Windsurf).
  • Native LLM integrations: Inference effect auto-detects Anthropic, OpenAI, and Moonshot API keys; HTTP, JSON, Markdown, HTML, and regex are standard built-ins.
  • One canonical form: vera fmt enforces a single textual representation; no stylistic variation is valid.
  • Modules and visibility: Explicit public/private on every top-level declaration; circular imports are detected at compile time.
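The mandatory effect declarations can likewise be approximated in Python. This is a hedged sketch assuming only what the feature list states; the `effects` decorator and `perform` helper are hypothetical stand-ins, and real Vera rejects undeclared effects at compile time rather than at run time:

```python
# Minimal Python analogy (assumed semantics, not Vera's implementation)
# of mandatory effect declarations: every function states its effects
# up front, and performing an undeclared effect is an error.

DECLARED = []  # stack of effect declarations for the running function

def effects(*declared):
    """Hypothetical decorator standing in for Vera's effect annotations."""
    def wrap(fn):
        def inner(*args, **kwargs):
            DECLARED.append(frozenset(declared))
            try:
                return fn(*args, **kwargs)
            finally:
                DECLARED.pop()
        return inner
    return wrap

def perform(effect):
    # Vera's type system would reject this statically; here we can
    # only detect the undeclared effect when the call happens.
    if not DECLARED or effect not in DECLARED[-1]:
        raise RuntimeError(f"undeclared effect: {effect}")

@effects("IO")
def greet():
    perform("IO")      # declared, allowed
    return "hello"

@effects("pure")
def sneaky():
    perform("Http")    # not declared -> RuntimeError
```

The point of the design, as the review describes it, is that nothing is implicit: an `Http` call inside a function declared `pure` is a type error, not a surprise at run time.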

Use Cases

  • LLM agent code generation research: Building benchmarks or experiments studying how language constraints affect LLM generation accuracy on algorithmic tasks.
  • Exploratory formal-methods tooling: Prototyping contract-verified WebAssembly programs for small, self-contained functions where Z3 coverage is sufficient (linear arithmetic, simple recursion).
  • AI-native tool building: Generating LLM pipeline utilities (HTTP fetching, JSON transformation, Markdown processing) where the code is written entirely by an agent and never read or maintained by humans.

Adoption Level Analysis

Small teams (<20 engineers): Fits experimentally only. The current state of the project is no garbage collector, no package manager, no ecosystem, deliberately human-unreadable syntax, and a 50-problem single-run benchmark. Suitable only for researchers or developers explicitly studying AI-native language design.

Medium orgs (20–200 engineers): Does not fit. No production deployments documented, no hiring pipeline for Vera skills, no debugger, no IDE support beyond basic TextMate/VS Code syntax highlighting, and no path to integration with existing CI/CD or testing infrastructure.

Enterprise (200+ engineers): Does not fit. No compliance story, no SLA, no support contract, no organizational memory beyond the author’s GitHub repository.

Alternatives

| Alternative | Key difference | Prefer when… |
| --- | --- | --- |
| Dafny | Mature verification-aware language with Microsoft backing, years of LLM-target research (POPL 2025), human-readable syntax, and 89–96% LLM success rates documented independently | You need formal verification for LLM-generated code with real-world research validation |
| Python / TypeScript | Established ecosystems, readable, debuggable, broad LLM training data, HumanEval benchmarks widely reproduced | LLM code generation for production workloads |
| Lean 4 | Theorem-prover-grade verification, academic credibility, active community | Mathematical proofs or high-assurance software |
| Koka | Algebraic effects research language from Microsoft Research with human-readable syntax | Studying algebraic effects in a more established setting |

Notes & Caveats

  • No garbage collector: Programs use bump allocation and can exhaust heap memory; this is acknowledged in the author’s own documentation and limits any real workload.
  • Tier 3 covers most real code: Z3 Tier 1 static proof only covers linear arithmetic, basic logic, and recursion with well-founded measures. Float64, container operations, HTTP, LLM calls, and complex recursion all fall to Tier 3 runtime checking — which is effectively a runtime assertion, not a formal proof.
  • Single-author, pre-validation stage: v0.0.108 with 755+ commits shows active development, but the author explicitly acknowledges that the language has not been stress-tested by its intended users (LLMs) at scale.
  • Human readability deliberately sacrificed: The author describes reading Vera as “not a pleasant experience.” Any debugging or human review of generated code requires mental De Bruijn index resolution.
  • VeraBench methodology gaps: 50 problems, one run per model, no pass@k, no held-out validation set, benchmark authored and administered by the language creator. Results should not be treated as independent evidence.
  • No package manager or standard library beyond built-ins: The 122 built-in functions are the full standard library. Composition of reusable Vera modules is possible but there is no registry or distribution mechanism.
  • SKILL.md as primary LLM interface: The design choice to surface a machine-readable spec at /SKILL.md is interesting and consistent with Agent Skills Specification patterns, but the language is too early-stage for this to carry weight beyond novelty.
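The Tier 3 caveat above can be made concrete: when static proof fails, a contract behaves like a runtime assertion rather than a guarantee. A minimal Python analogy follows; the `contract` decorator is invented for illustration, and in Vera the failure manifests as a WASM trap rather than an exception:

```python
# Hedged sketch of what Tier 3 amounts to in practice: requires/ensures
# clauses that Z3 could not discharge statically degrade to runtime
# checks that trap on violation. All names here are hypothetical.

def contract(requires=None, ensures=None):
    def wrap(fn):
        def inner(*args):
            if requires and not requires(*args):
                raise AssertionError("precondition violated (runtime trap)")
            result = fn(*args)
            if ensures and not ensures(result, *args):
                raise AssertionError("postcondition violated (runtime trap)")
            return result
        return inner
    return wrap

# The refinement { @Int | @Int.0 > 0 } becomes a runtime predicate:
@contract(requires=lambda n: n > 0,
          ensures=lambda r, n: r >= n)
def double(n):
    return 2 * n

double(3)     # passes both checks -> 6
# double(-1) would raise at the call site instead of being rejected
# ahead of time, which is the substance of the caveat.
```

Nothing here is proved: a bad input still reaches the program, it just fails loudly instead of silently.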
