Tracing & Observability¶
otel ¶
OpenTelemetry integration for Heddle distributed tracing.
All public functions in this module are safe to call without
opentelemetry installed — they degrade to no-ops. This lets
production code instrument unconditionally while making OTel an
optional dependency.
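The no-op degradation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the module's actual code: `_NoOpSpan`, `_NoOpTracer`, `get_tracer_sketch`, and `handle_message` are hypothetical names introduced here. Only `opentelemetry.trace.get_tracer` is a real OTel call.

```python
from contextlib import contextmanager
from typing import Any, Iterator

try:  # OTel is optional; fall back to no-ops when it cannot be imported.
    from opentelemetry import trace as _otel_trace
except ImportError:
    _otel_trace = None


class _NoOpSpan:
    """Hypothetical stand-in that accepts common span calls and discards them."""

    def set_attribute(self, key: str, value: Any) -> None:
        pass

    def add_event(self, name: str, attributes: Any = None) -> None:
        pass


class _NoOpTracer:
    """Hypothetical stand-in exposing the one tracer method used below."""

    @contextmanager
    def start_as_current_span(self, name: str, **kwargs: Any) -> Iterator[_NoOpSpan]:
        yield _NoOpSpan()


def get_tracer_sketch(name: str = "heddle") -> Any:
    """Return a real tracer if OTel is importable, otherwise a no-op stand-in."""
    if _otel_trace is None:
        return _NoOpTracer()
    return _otel_trace.get_tracer(name)


def handle_message(payload: dict) -> int:
    # Instrumented unconditionally: no "is OTel installed?" check at the call site.
    with get_tracer_sketch().start_as_current_span("handle_message") as span:
        span.set_attribute("payload.size", len(payload))
        return len(payload)
```

The point of the pattern is that `handle_message` reads the same whether or not the optional dependency is present; the branch lives in one place.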
Trace context propagation uses W3C traceparent format, injected into
NATS message dicts under the _trace_context key.
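To make the wire format concrete, here is a small sketch of a W3C `traceparent` value (version `00`, 32-hex trace id, 16-hex parent span id, 2-hex flags) stored under `_trace_context`. The helper names are hypothetical, and the nested `{"traceparent": ...}` shape is an assumption based on the description "W3C propagation headers"; the ids are the example values from the W3C Trace Context specification.

```python
import re

# version "00", 32-hex trace id, 16-hex parent (span) id, 2-hex trace flags
_TRACEPARENT = re.compile(r"^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def build_traceparent(trace_id: int, span_id: int, sampled: bool = True) -> str:
    """Format integer ids as a version-00 traceparent header value."""
    flags = 0x01 if sampled else 0x00
    return f"00-{trace_id:032x}-{span_id:016x}-{flags:02x}"


def parse_traceparent(header: str):
    """Return (trace_id, span_id, sampled), or None if the header is malformed."""
    match = _TRACEPARENT.match(header)
    if match is None:
        return None
    trace_hex, span_hex, flags_hex = match.groups()
    trace_id, span_id = int(trace_hex, 16), int(span_hex, 16)
    if trace_id == 0 or span_id == 0:  # all-zero ids are invalid per the spec
        return None
    return trace_id, span_id, bool(int(flags_hex, 16) & 0x01)


# Assumed message shape: propagation headers nested under "_trace_context".
message = {"task": "summarize"}
message["_trace_context"] = {
    "traceparent": build_traceparent(
        0x4BF92F3577B34DA6A3CE929D0E0E4736, 0x00F067AA0BA902B7
    )
}
```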
GenAI semantic conventions ~~~~~~~~~~~~~~~~~~~~~~~~~~
LLM call spans (llm.call) in worker/runner.py follow the emerging
OTel GenAI semantic conventions for attribute naming:
- gen_ai.system: provider identifier (anthropic, ollama, openai)
- gen_ai.request.model / gen_ai.response.model: model names
- gen_ai.usage.input_tokens / gen_ai.usage.output_tokens: token counts
- gen_ai.request.temperature / gen_ai.request.max_tokens: request params
When HEDDLE_TRACE_CONTENT=1, prompt and completion text are recorded as
span events (gen_ai.content.prompt, gen_ai.content.completion).
See: https://opentelemetry.io/docs/specs/semconv/gen-ai/
Legacy llm.* attributes are preserved for backward compatibility.
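As an illustration of the attribute naming above, the following sketch builds the attribute dict for one LLM call and gates content events on `HEDDLE_TRACE_CONTENT`. The function names, the placeholder model name, and the event attribute keys (`gen_ai.prompt`, `gen_ai.completion`) are assumptions made for this example; only the `gen_ai.*` span attribute names and the two event names come from the text above. The legacy `llm.*` attributes are not shown because their names are not listed here.

```python
import os
from typing import Any, Optional


def genai_attributes(
    system: str,
    request_model: str,
    response_model: str,
    input_tokens: int,
    output_tokens: int,
    temperature: Optional[float] = None,
    max_tokens: Optional[int] = None,
) -> dict:
    """Build span attributes using the gen_ai.* names listed above."""
    attrs: dict = {
        "gen_ai.system": system,
        "gen_ai.request.model": request_model,
        "gen_ai.response.model": response_model,
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
    }
    # Request params are optional, so only record the ones that were set.
    if temperature is not None:
        attrs["gen_ai.request.temperature"] = temperature
    if max_tokens is not None:
        attrs["gen_ai.request.max_tokens"] = max_tokens
    return attrs


def content_events(prompt: str, completion: str) -> list:
    """Return (event_name, attributes) pairs, or nothing unless opted in."""
    if os.environ.get("HEDDLE_TRACE_CONTENT") != "1":
        return []
    return [
        # Attribute keys inside each event are an assumption of this sketch.
        ("gen_ai.content.prompt", {"gen_ai.prompt": prompt}),
        ("gen_ai.content.completion", {"gen_ai.completion": completion}),
    ]


attrs = genai_attributes("ollama", "example-model", "example-model", 12, 34, temperature=0.2)
```

Keeping content capture behind an explicit environment switch means prompt and completion text stay out of traces by default.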
Setup::
    from heddle.tracing import init_tracing

    init_tracing("heddle-pipeline", endpoint="http://localhost:4317")
init_tracing ¶
Initialize OTel tracing with OTLP exporter.
Idempotent: a second call is a no-op that returns True. Without
this guard, calling init_tracing twice triggered the OTel SDK's
"Overriding of current TracerProvider is not allowed" warning, which
surfaced in tests and in CLI commands that re-imported the tracing
module.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `service_name` | `str` | Service name reported to the collector. | `'heddle'` |
| `endpoint` | `str \| None` | OTLP gRPC endpoint (e.g. | `None` |
Returns:
| Type | Description |
|---|---|
| `bool` | |
Source code in src/heddle/tracing/otel.py
status ¶
Return a snapshot of the current tracing configuration.
Addresses the inspectability-of-defaults guardrail: callers and operators can ask "is OTel active? what's the endpoint? what exporter is it using?" without guessing or reading process state.
Returns:
| Type | Description |
|---|---|
| `dict[str, Any]` | A dict with these keys (always present, types as documented): |
The returned dict is a shallow copy of internal state — mutating
it does not affect future status() calls.
TODO(cli): when a heddle status CLI subcommand is added,
surface this dict in its output (a one-line "OTel: enabled,
endpoint=…" summary plus a verbose mode that prints the full
dict). See workspace AUDIT_TODO.md OTel W1.
Source code in src/heddle/tracing/otel.py
get_tracer ¶
Get a tracer instance (real or no-op depending on OTel availability).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `name` | `str` | Instrumentation scope name (e.g. | `'heddle'` |
Returns:
| Type | Description |
|---|---|
| `Any` | An OTel |
Source code in src/heddle/tracing/otel.py
inject_trace_context ¶
Inject current trace context into a message dict.
Adds a _trace_context key containing W3C propagation headers.
Safe to call without OTel installed (no-op).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `carrier` | `dict[str, Any]` | Message dict (modified in-place). | required |
Source code in src/heddle/tracing/otel.py
extract_trace_context ¶
Extract trace context from a message dict.
Reads the _trace_context key and returns an OTel Context
that can be passed to tracer.start_as_current_span(context=...).
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| `carrier` | `dict[str, Any]` | Message dict with optional | required |
Returns:
| Type | Description |
|---|---|
| `Any` | An OTel |
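The inject and extract halves fit together roughly as in the sketch below, which uses the public `opentelemetry.propagate.inject` and `opentelemetry.propagate.extract` functions when they are importable and does nothing otherwise. `inject_sketch` and `extract_sketch` are hypothetical names, and details such as skipping the key when there is nothing to propagate are assumptions of this example rather than documented behaviour.

```python
from typing import Any

try:  # Optional dependency: both helpers become no-ops without it.
    from opentelemetry.propagate import extract as _otel_extract
    from opentelemetry.propagate import inject as _otel_inject
except ImportError:
    _otel_extract = None
    _otel_inject = None


def inject_sketch(carrier: dict) -> None:
    """Write propagation headers for the active span under "_trace_context"."""
    if _otel_inject is None:
        return
    headers: dict = {}
    _otel_inject(headers)  # fills e.g. "traceparent" when a span is active
    if headers:  # assumption: leave the message untouched if nothing to send
        carrier["_trace_context"] = headers


def extract_sketch(carrier: dict) -> Any:
    """Rebuild an OTel Context from the message, or None without OTel."""
    if _otel_extract is None:
        return None
    return _otel_extract(carrier.get("_trace_context") or {})


# Publisher side: the message dict is modified in place before sending.
outgoing = {"task": "summarize"}
inject_sketch(outgoing)

# Consumer side: a missing "_trace_context" key is tolerated.
parent_context = extract_sketch(outgoing)
```

On the consumer side, the returned context would be passed as `tracer.start_as_current_span("handle", context=parent_context)` so the new span joins the publisher's trace.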
Source code in src/heddle/tracing/otel.py
trace_correlation_processor ¶
trace_correlation_processor(logger: Any, method_name: str, event_dict: dict[str, Any]) -> dict[str, Any]
Structlog processor that tags log records with the active trace context.
When called inside a span, adds trace_id (32-char hex) and
span_id (16-char hex) to the event_dict so downstream renderers
and shippers can correlate logs with their span in any OTel backend.
No-op when OTel is unavailable or when no span is active.
Wire into a structlog.configure(...) call before the renderer,
e.g.::
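    import structlog
    from heddle.tracing.otel import trace_correlation_processor

    structlog.configure(processors=[
        structlog.processors.TimeStamper(fmt="iso"),
        trace_correlation_processor,  # must run before the renderer
        structlog.dev.ConsoleRenderer(),
    ])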
The hex encoding matches the W3C traceparent convention used by
most OTel backends and the heddle _trace_context wire field.
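A processor with the behaviour described above might look like the following sketch. It is an approximation under stated assumptions, not the module's source: `hex_ids` and `correlation_processor_sketch` are hypothetical names, while `trace.get_current_span()`, `Span.get_span_context()`, and `SpanContext.is_valid` are real OTel API calls.

```python
from typing import Any

try:  # Optional dependency: the processor passes events through without it.
    from opentelemetry import trace as _otel_trace
except ImportError:
    _otel_trace = None


def hex_ids(trace_id: int, span_id: int) -> dict:
    """Encode integer ids as zero-padded lowercase hex (32 and 16 chars)."""
    return {"trace_id": format(trace_id, "032x"), "span_id": format(span_id, "016x")}


def correlation_processor_sketch(logger: Any, method_name: str, event_dict: dict) -> dict:
    """structlog-style processor: add trace_id/span_id when a span is active."""
    if _otel_trace is None:
        return event_dict  # OTel not installed: no-op
    span_context = _otel_trace.get_current_span().get_span_context()
    if not span_context.is_valid:
        return event_dict  # no active span: no-op
    event_dict.update(hex_ids(span_context.trace_id, span_context.span_id))
    return event_dict


# Outside any span the event passes through unchanged.
plain = correlation_processor_sketch(None, "info", {"event": "started"})
```

Zero-padding matters here: an id with leading zero bytes must still render at full width to match the `traceparent` encoding used on the wire.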