OpenLLMetry

Observability · open-source · Apache-2.0

At a Glance

Open-source OpenTelemetry-based instrumentation library for LLM applications, providing standardized traces, metrics, and logs across 16+ LLM providers, 7 vector databases, and 10 AI frameworks.

Type: open-source
Pricing: free (open-source)
License: Apache-2.0
Adoption fit: small, medium, enterprise
Top alternatives: Langfuse, Arize Phoenix, LangSmith, Datadog LLM Observability

What It Does

OpenLLMetry wraps OpenTelemetry’s tracing, metrics, and logging APIs with instrumentation patches for LLM providers, vector databases, and AI frameworks. Installing the package and calling Traceloop.init() injects monkey patches that automatically capture LLM request parameters (model, temperature, token counts), prompt/completion payloads, latency, and errors as OTel spans. Those spans are emitted via standard OTLP and accepted by any OTel-compatible backend — Datadog, Grafana, Honeycomb, New Relic, SigNoz, and 20+ others — without routing through Traceloop’s own infrastructure.
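The patching idea can be illustrated with a toy sketch. This is not the library's actual internals — all names here are hypothetical, for illustration only: a wrapper replaces a client method, records model, token, and latency attributes into a span-like dict, then delegates to the original call.

```python
import functools
import time

# Hypothetical stand-in for an LLM client; illustration only.
class FakeLLMClient:
    def complete(self, model, prompt, temperature=0.7):
        return {"text": "hello", "usage": {"input_tokens": 3, "output_tokens": 1}}

captured_spans = []  # stands in for an OTel span exporter

def instrument(cls, method_name):
    """Replace cls.method_name with a wrapper that records a span-like dict."""
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, model, prompt, **kwargs):
        start = time.perf_counter()
        result = original(self, model, prompt, **kwargs)
        captured_spans.append({
            "gen_ai.request.model": model,
            "gen_ai.usage.input_tokens": result["usage"]["input_tokens"],
            "gen_ai.usage.output_tokens": result["usage"]["output_tokens"],
            "duration_ms": (time.perf_counter() - start) * 1000,
        })
        return result

    setattr(cls, method_name, wrapper)

instrument(FakeLLMClient, "complete")
FakeLLMClient().complete("gpt-4o", "Say hi")
```

The real library does this per provider SDK and emits proper OTel spans rather than dicts, but the mechanism — intercept, time, annotate, delegate — is the same.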

The project contributed its GenAI semantic conventions (the gen_ai.* attribute namespace) upstream to the OpenTelemetry project, where they are now the official experimental spec under the GenAI SIG. This means OpenLLMetry-instrumented traces are interoperable with any tool that also adopts the OTel GenAI conventions — including Datadog’s native LLM observability layer (released late 2024).
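The attribute names below are drawn from the experimental OTel GenAI conventions (and may change as the SIG stabilizes the spec); the small validator is illustrative, showing what namespace-conformant span attributes look like.

```python
# Example span attributes in the experimental gen_ai.* namespace.
# Names follow the OTel GenAI semantic conventions as of this writing;
# being experimental, they may be renamed in future spec revisions.
span_attributes = {
    "gen_ai.system": "openai",
    "gen_ai.request.model": "gpt-4o",
    "gen_ai.request.temperature": 0.2,
    "gen_ai.usage.input_tokens": 128,
    "gen_ai.usage.output_tokens": 42,
}

def in_genai_namespace(attrs):
    """True if every attribute key uses the gen_ai.* namespace."""
    return all(key.startswith("gen_ai.") for key in attrs)

print(in_genai_namespace(span_attributes))  # → True
```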

Key Features

  • Auto-instrumentation for 16+ LLM providers: OpenAI, Anthropic, AWS Bedrock, Google Vertex AI/Gemini, Cohere, Groq, Mistral, HuggingFace, Ollama, IBM Watsonx, Replicate, Together AI, SageMaker, and others
  • Auto-instrumentation for 7 vector databases: Chroma, LanceDB, Marqo, Milvus, Pinecone, Qdrant, Weaviate
  • Framework-level instrumentation for LangChain, LangGraph, LlamaIndex, CrewAI, Haystack, LiteLLM, Langflow, Agno, AWS Strands, OpenAI Agents SDK
  • Sends to 23+ backends via OTLP — no vendor lock-in to Traceloop’s platform
  • Prompt/completion payload capture with optional log disable for privacy-sensitive environments
  • Metrics emission: token usage, latency histograms, error rates per model and operation
  • Python-first with TypeScript, Go, and Ruby support in separate sub-packages
  • Contributes to official OTel GenAI Semantic Conventions SIG — aligned with emerging industry standard
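The payload-capture toggle mentioned above is controlled with an environment variable. A minimal sketch — `TRACELOOP_TRACE_CONTENT` is the toggle documented by the project at the time of writing, but verify the name against the current docs before relying on it:

```python
import os

# Disable prompt/completion payload capture before initializing the SDK.
# Assumption: TRACELOOP_TRACE_CONTENT is read at init time, per the docs.
os.environ["TRACELOOP_TRACE_CONTENT"] = "false"

# from traceloop.sdk import Traceloop   # requires `pip install traceloop-sdk`
# Traceloop.init(app_name="my-app")     # init AFTER setting the env var

content_enabled = os.environ.get("TRACELOOP_TRACE_CONTENT", "true").lower() == "true"
print(content_enabled)  # → False
```

With capture disabled, spans still carry model names, token counts, and latency — only the prompt/completion text is withheld.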

Use Cases

  • Teams already running Datadog, Grafana, or Honeycomb who want LLM call visibility without a separate observability silo
  • Organizations with compliance requirements preferring OTel’s standard data format over proprietary SDKs that may change without notice
  • Multi-cloud or multi-provider LLM deployments needing a single, provider-agnostic instrumentation layer
  • Platform teams building internal LLM observability infrastructure on top of an open standard with upstream governance

Adoption Level Analysis

Small teams (<20 engineers): Fits well as a lightweight addition to an existing observability stack. Setup is minimal: install the package and call Traceloop.init(). If the team has no observability infrastructure at all, Langfuse’s self-hosted stack (with a built-in UI for agent traces) may provide more out-of-the-box value.

Medium orgs (20–200 engineers): Good fit when OTel is already standardized. The 23+ backend destinations mean engineering teams can route LLM traces into whatever APM tooling is already paid for and alert-configured. Requires some work to build LLM-specific dashboards on the backend side.

Enterprise (200+ engineers): Viable, particularly for regulated industries that require data to stay within their own OTel pipeline and never touch a third-party SaaS. The Apache-2.0 license and OTel standards alignment are compliance-friendly. Post-ServiceNow acquisition (March 2026), enterprises should evaluate whether Traceloop’s platform will remain independently purchasable or get bundled into ServiceNow licensing.

Alternatives

  • Langfuse — MIT-licensed; ships a full-stack UI (tracing, evals, prompt management, datasets); 24k+ GitHub stars. Prefer when you want an all-in-one LLM observability platform with a self-hosted option and no existing OTel investment.
  • Arize Phoenix — OTel-native like OpenLLMetry, but with a stronger built-in evaluation toolkit and a local-first dev workflow. Prefer when you need LLM evaluation (hallucination scores, retrieval quality) co-located with tracing.
  • LangSmith — Deepest LangChain/LangGraph agent-chain debugging; proprietary; $39/user/month. Prefer when your stack is LangChain-heavy and you need first-class agent trace visualization today.
  • Datadog LLM Observability — Native GenAI OTel conventions support; enterprise-grade alerting; expensive. Prefer when you already pay for Datadog and want LLM monitoring with zero additional infrastructure.

Notes & Caveats

  • ServiceNow acquisition (March 2026): Traceloop was acquired by ServiceNow for an estimated $60–80M. The team has committed to keeping OpenLLMetry open-source and to continuing OTel contributions. However, the commercial product roadmap will now be set by ServiceNow’s enterprise AI governance strategy (AI Control Tower integration). Teams relying on the Traceloop SaaS platform should monitor for pricing or feature changes.
  • Prompt payload capture = PII risk: By default, OpenLLMetry captures full prompt and completion text as span attributes. Any PII in user prompts will flow through your OTel pipeline and into your observability backend. There is no built-in redaction layer; teams must implement sanitization at the OTel Collector level or disable log capture explicitly.
  • GenAI semantic conventions are experimental: The gen_ai.* attribute namespace is marked experimental in the OTel spec. Attribute names, types, and enums may change in minor releases through 2026–2027 as the GenAI SIG stabilizes the spec. Budget for instrumentation migration work.
  • Agent trace visualization gap: Multi-step agent trace correlation (seeing a full agent session as a unified tree of LLM calls + tool invocations) depends on the downstream backend’s visualization capabilities. OpenLLMetry emits correct spans, but most APM vendors do not yet render them as agent-native waterfall/tree views. Langfuse and LangSmith have better native UI for this today.
  • Python-first maturity: The Python package is the primary implementation. TypeScript/Go/Ruby instrumentations have fewer supported providers and may lag Python releases.
  • Prior telemetry collection: Versions before v0.49.2 collected usage telemetry from end-user applications by default; collection was removed in response to community criticism. Review the changelog if you are pinned to an older version.
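The redaction gap noted above can also be handled app-side before spans leave the process. A minimal, illustrative sketch that masks email-like strings in prompt/completion attributes — the attribute names follow OpenLLMetry’s gen_ai.prompt.{i}.content pattern, and the single regex is nowhere near a complete PII scrub:

```python
import re

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def redact_prompt_attributes(attributes):
    """Mask email-like substrings in gen_ai prompt/completion attributes.

    Illustrative only: real deployments would run a fuller PII pass
    (names, phone numbers, account IDs) in an OTel Collector processor.
    """
    redacted = {}
    for key, value in attributes.items():
        if key.startswith(("gen_ai.prompt", "gen_ai.completion")) and isinstance(value, str):
            value = EMAIL.sub("[REDACTED_EMAIL]", value)
        redacted[key] = value
    return redacted

clean = redact_prompt_attributes({
    "gen_ai.prompt.0.content": "Email me at jane.doe@example.com",
    "gen_ai.request.model": "gpt-4o",
})
print(clean["gen_ai.prompt.0.content"])  # → "Email me at [REDACTED_EMAIL]"
```

Non-payload attributes (model name, token counts) pass through untouched, so dashboards keyed on them keep working.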
