Structured Metrics via slog JSON Events
Status: accepted Version: 0.1.0 Last Updated: 2026-02-21 Governing ADRs: 📝 ADR-0019 (structured metrics observability), 📝 ADR-0010 (slog structured logging), 📝 ADR-0013 (background job scheduling), 📝 ADR-0008 (OpenAI API / LiteLLM backend)
Overview
The observability subsystem provides lightweight, structured metric events emitted through the existing slog logger infrastructure. Instead of adding external metrics libraries (Prometheus, StatsD), Spotter emits well-defined metric.* log events with standardized attribute keys. These events are human-readable in text mode (default) and machine-parseable in JSON mode (opt-in via SPOTTER_LOG_FORMAT=json). Operators can use jq, Loki, or simple scripts to aggregate, filter, and visualize metrics from log output.
Scope
This spec covers:
- Log format configuration (`SPOTTER_LOG_FORMAT` environment variable)
- Metric event naming convention and attribute key schema
- Background job instrumentation (sync, enrichment, playlist sync)
- HTTP request instrumentation (enhancement of existing middleware)
- LLM/AI operation instrumentation (token usage, latency, model)
- Track matching instrumentation (strategy, confidence, success rate)
Out of scope: Log aggregation infrastructure (Loki, Grafana), alerting rules, log rotation and retention, dashboard templates.
Requirements
Log Format Configuration
REQ-FMT-001 — The application MUST support a SPOTTER_LOG_FORMAT environment variable (or equivalent Viper config key log.format) with two valid values:
- `text` (default) — uses `slog.NewTextHandler(os.Stdout, opts)`
- `json` — uses `slog.NewJSONHandler(os.Stdout, opts)`
REQ-FMT-002 — If SPOTTER_LOG_FORMAT is unset or set to any value other than json, the application MUST default to text format. Invalid values MUST NOT cause a startup failure.
REQ-FMT-003 — The log format selection MUST occur in cmd/server/main.go during logger initialization, before any log statements are emitted.
REQ-FMT-004 — The log level configuration MUST remain independent of the format selection. Both text and JSON modes MUST respect the configured log level (default: Debug).
REQ-FMT-005 — The application MUST log the selected format on startup: logger.Info("logger initialized", "format", format, "level", level).
Metric Event Keys
REQ-KEY-001 — All metric events MUST use the message prefix metric. followed by a category name. The following categories are defined:
| Message | Category | Description |
|---|---|---|
| `metric.sync` | Sync | Listen and playlist sync operations |
| `metric.enricher` | Enricher | Metadata enrichment runs per enricher |
| `metric.background_tick` | Background | Per-tick summary for each background loop |
| `metric.track_match` | Matching | Individual track matching attempts |
| `metric.llm` | LLM | AI/LLM API calls |
| `metric.request` | Request | HTTP request completion (existing middleware) |
REQ-KEY-002 — Each metric event MUST include the attributes defined in its schema below. Optional attributes are marked with (optional).
REQ-KEY-003 — Duration attributes MUST use the key suffix _ms and represent milliseconds as int64. Durations MUST be measured using time.Since(start).Milliseconds().
REQ-KEY-004 — Boolean success/failure attributes MUST use the key success with a bool value. Error details (when success=false) MUST use the key error with a string value (empty string on success).
Background Job Instrumentation
REQ-BG-001 — Each background loop tick MUST emit a metric.background_tick event after processing all users for that tick:
| Attribute | Type | Description |
|---|---|---|
| `loop` | string | Loop identifier: "sync", "metadata", or "playlist_sync" |
| `users_processed` | int | Number of users processed in this tick |
| `duration_ms` | int64 | Total wall-clock time for the tick |
| `errors` | int | Number of users that failed (optional, default 0) |
REQ-BG-002 — The metric.background_tick event MUST be emitted at Info level regardless of whether errors occurred during the tick. Individual user errors MUST continue to be logged separately at Error level.
REQ-BG-003 — The metric.sync event MUST be emitted after each per-user sync operation completes (success or failure):
| Attribute | Type | Description |
|---|---|---|
| `provider` | string | Provider name: "navidrome", "spotify", "lastfm" |
| `listens_synced` | int | Number of new listens imported |
| `playlists_synced` | int | Number of playlists synced |
| `duration_ms` | int64 | Time to complete this provider sync |
| `success` | bool | Whether the sync completed without error |
| `error` | string | Error message if success=false, empty otherwise |
REQ-BG-004 — The metric.enricher event MUST be emitted after each enricher completes processing an entity batch:
| Attribute | Type | Description |
|---|---|---|
| `enricher` | string | Enricher name: "lidarr", "musicbrainz", "navidrome", "spotify", "lastfm", "fanart", "openai" |
| `entity_type` | string | Entity type processed: "artist", "album", "track" |
| `entities_processed` | int | Number of entities processed (optional) |
| `duration_ms` | int64 | Time to complete enrichment |
| `success` | bool | Whether enrichment completed without error |
| `error` | string | Error message if success=false, empty otherwise |
Request Instrumentation
REQ-REQ-001 — The existing request logging middleware (internal/middleware/logging.go) MUST be updated to use the metric.request message prefix for consistency with the metric event convention. The existing attributes (method, path, status, remote_ip, latency, request_id) MUST be preserved.
REQ-REQ-002 — The request logging middleware MUST add a duration_ms attribute (in addition to the existing latency duration) for consistency with other metric events:
| Attribute | Type | Description |
|---|---|---|
| `method` | string | HTTP method |
| `path` | string | Request path |
| `status` | int | Response status code |
| `remote_ip` | string | Client IP address |
| `latency` | duration | Request duration (existing, for human readability) |
| `duration_ms` | int64 | Request duration in milliseconds (for machine parsing) |
| `request_id` | string | Request ID from middleware |
LLM Instrumentation
REQ-LLM-001 — Every LLM API call (OpenAI or LiteLLM-compatible backend) MUST emit a metric.llm event upon completion:
| Attribute | Type | Description |
|---|---|---|
| `model` | string | Model identifier (e.g., "gpt-4o", "gpt-4o-mini") |
| `operation` | string | Operation type: "mixtape_generate", "playlist_enhance", "similar_artists", "artist_bio", "album_review", "track_review", "playlist_metadata" |
| `tokens_used` | int | Total tokens consumed (prompt + completion) |
| `prompt_tokens` | int | Prompt tokens (optional, if available from API response) |
| `completion_tokens` | int | Completion tokens (optional, if available from API response) |
| `duration_ms` | int64 | Wall-clock time for the API call |
| `success` | bool | Whether the API call succeeded |
| `error` | string | Error message if success=false, empty otherwise |
REQ-LLM-002 — Token counts MUST be extracted from the OpenAI API response usage field when available. If the API response does not include usage data (e.g., streaming responses), tokens_used MUST be set to 0 and the event MUST still be emitted.
REQ-LLM-003 — The operation attribute MUST distinguish between different AI use cases so that token consumption can be analyzed per feature.
Track Matching Instrumentation
REQ-MATCH-001 — Each track matching attempt MUST emit a metric.track_match event:
| Attribute | Type | Description |
|---|---|---|
| `strategy` | string | Matching strategy used: "isrc", "exact", "fuzzy" |
| `matched` | bool | Whether a match was found |
| `confidence` | float64 | Match confidence score (0.0-1.0, relevant for fuzzy matching) |
| `duration_ms` | int64 | Time to complete the matching attempt |
| `source_provider` | string | Provider the track originated from (optional) |
| `target_provider` | string | Provider being matched against (optional) |
REQ-MATCH-002 — If a track is matched via the first strategy (ISRC), the event for that strategy MUST have matched=true. Subsequent strategies (exact, fuzzy) MUST NOT emit events for that track — only the successful (or final failing) strategy emits.
REQ-MATCH-003 — For tracks that fail all three strategies, a single metric.track_match event MUST be emitted with strategy="fuzzy" (the last attempted), matched=false, and confidence=0.0.
Metric Event Examples
JSON format (SPOTTER_LOG_FORMAT=json)
{"time":"2026-02-21T10:30:00Z","level":"INFO","msg":"metric.sync","provider":"navidrome","listens_synced":42,"playlists_synced":3,"duration_ms":1234,"success":true,"error":""}
{"time":"2026-02-21T10:31:00Z","level":"INFO","msg":"metric.enricher","enricher":"musicbrainz","entity_type":"artist","entities_processed":15,"duration_ms":8500,"success":true,"error":""}
{"time":"2026-02-21T10:32:00Z","level":"INFO","msg":"metric.llm","model":"gpt-4o","operation":"mixtape_generate","tokens_used":2150,"prompt_tokens":1800,"completion_tokens":350,"duration_ms":4200,"success":true,"error":""}
{"time":"2026-02-21T10:33:00Z","level":"INFO","msg":"metric.track_match","strategy":"fuzzy","matched":true,"confidence":0.85,"duration_ms":12,"source_provider":"spotify","target_provider":"navidrome"}
Text format (default)
time=2026-02-21T10:30:00Z level=INFO msg=metric.sync provider=navidrome listens_synced=42 playlists_synced=3 duration_ms=1234 success=true error=""
Querying with jq
# All sync metrics for navidrome
cat spotter.log | jq 'select(.msg == "metric.sync" and .provider == "navidrome")'
# LLM token usage per operation (pipe into further aggregation to average)
cat spotter.log | jq 'select(.msg == "metric.llm") | {operation, tokens_used}'
# Track matching success rate by strategy
cat spotter.log | jq 'select(.msg == "metric.track_match") | {strategy, matched}'
# Failed enricher runs
cat spotter.log | jq 'select(.msg == "metric.enricher" and .success == false)'
Scenarios
Scenario 1: Operator analyzes sync performance
Given SPOTTER_LOG_FORMAT=json is set
And the sync loop has been running for 24 hours
When the operator runs: cat spotter.log | jq 'select(.msg == "metric.sync")'
Then they see one event per provider per user per tick
And each event includes listens_synced, duration_ms, and success
And they can calculate average sync duration and total listens synced per day
Scenario 2: Operator investigates high token usage
Given SPOTTER_LOG_FORMAT=json is set
And multiple users have been generating mixtapes
When the operator runs: cat spotter.log | jq 'select(.msg == "metric.llm") | .tokens_used' | awk '{sum+=$1} END {print sum}'
Then they see total token consumption across all LLM operations
And they can break down usage by operation type to identify the most expensive feature
Scenario 3: Text format backward compatibility
Given SPOTTER_LOG_FORMAT is not set (or set to "text")
When the application starts and runs background jobs
Then all metric events are emitted in slog TextHandler format
And existing log parsing scripts continue to work
And the only difference from pre-observability output is the addition of new metric.* log lines
Scenario 4: Track matching strategy effectiveness
Given SPOTTER_LOG_FORMAT=json is set
And a playlist sync has matched 100 tracks
When the operator queries metric.track_match events
Then they can determine: 60 matched via ISRC, 25 via exact match, 10 via fuzzy match, 5 unmatched
And they can see the average confidence score for fuzzy matches
And they can identify which provider pairs have the lowest match rates
Implementation Notes
- Logger initialization: `cmd/server/main.go:38-41` — add `SPOTTER_LOG_FORMAT` check and handler selection
- Config addition: `internal/config/config.go` — add `Log.Format` field with Viper binding to `SPOTTER_LOG_FORMAT`
- Request middleware: `internal/middleware/logging.go:18` — change message to `metric.request`, add `duration_ms`
- Sync instrumentation: `internal/services/syncer.go` — emit `metric.sync` after each provider sync
- Enricher instrumentation: `internal/services/metadata.go` — emit `metric.enricher` after each enricher run
- Background tick: `cmd/server/main.go` — emit `metric.background_tick` after each loop iteration
- LLM instrumentation: `internal/vibes/generator.go`, `internal/vibes/enhancer.go`, `internal/services/similar_artists.go` — emit `metric.llm` after each API call
- Track matching: `internal/services/` (track matching code) — emit `metric.track_match` per matching attempt
- Governing comment: `// Governing: ADR-0019 (structured metrics), ADR-0010 (slog), SPEC observability`