Skip to main content
Status:🚀 ACTIVE
Date:📅 unknown
Domain:📦 Observability

Structured Metrics via slog JSON Events

Status: accepted Version: 0.1.0 Last Updated: 2026-02-21 Governing ADRs: 📝 ADR-0019 (structured metrics observability), 📝 ADR-0010 (slog structured logging), 📝 ADR-0013 (background job scheduling), 📝 ADR-0008 (OpenAI API / LiteLLM backend)

Overview

The observability subsystem provides lightweight, structured metric events emitted through the existing slog logger infrastructure. Instead of adding external metrics libraries (Prometheus, StatsD), Spotter emits well-defined metric.* log events with standardized attribute keys. These events are human-readable in text mode (default) and machine-parseable in JSON mode (opt-in via SPOTTER_LOG_FORMAT=json). Operators can use jq, loki, or simple scripts to aggregate, filter, and visualize metrics from log output.

Scope

This spec covers:

  • Log format configuration (SPOTTER_LOG_FORMAT environment variable)
  • Metric event naming convention and attribute key schema
  • Background job instrumentation (sync, enrichment, playlist sync)
  • HTTP request instrumentation (enhancement of existing middleware)
  • LLM/AI operation instrumentation (token usage, latency, model)
  • Track matching instrumentation (strategy, confidence, success rate)

Out of scope: Log aggregation infrastructure (Loki, Grafana), alerting rules, log rotation and retention, dashboard templates.


Requirements

Log Format Configuration

REQ-FMT-001 — The application MUST support a SPOTTER_LOG_FORMAT environment variable (or equivalent Viper config key log.format) with two valid values:

  • text (default) — uses slog.NewTextHandler(os.Stdout, opts)
  • json — uses slog.NewJSONHandler(os.Stdout, opts)

REQ-FMT-002 — If SPOTTER_LOG_FORMAT is unset or set to any value other than json, the application MUST default to text format. Invalid values MUST NOT cause a startup failure.

REQ-FMT-003 — The log format selection MUST occur in cmd/server/main.go during logger initialization, before any log statements are emitted.

REQ-FMT-004 — The log level configuration MUST remain independent of the format selection. Both text and JSON modes MUST respect the configured log level (default: Debug).

REQ-FMT-005 — The application MUST log the selected format on startup: logger.Info("logger initialized", "format", format, "level", level).

Metric Event Keys

REQ-KEY-001 — All metric events MUST use the message prefix metric. followed by a category name. The following categories are defined:

MessageCategoryDescription
metric.syncSyncListen and playlist sync operations
metric.enricherEnricherMetadata enrichment runs per enricher
metric.background_tickBackgroundPer-tick summary for each background loop
metric.track_matchMatchingIndividual track matching attempts
metric.llmLLMAI/LLM API calls
metric.requestRequestHTTP request completion (existing middleware)

REQ-KEY-002 — Each metric event MUST include the attributes defined in its schema below. Optional attributes are marked with (optional).

REQ-KEY-003 — Duration attributes MUST use the key suffix _ms and represent milliseconds as int64. Durations MUST be measured using time.Since(start).Milliseconds().

REQ-KEY-004 — Boolean success/failure attributes MUST use the key success with a bool value. Error details (when success=false) MUST use the key error with a string value (empty string on success).

Background Job Instrumentation

REQ-BG-001 — Each background loop tick MUST emit a metric.background_tick event after processing all users for that tick:

AttributeTypeDescription
loopstringLoop identifier: "sync", "metadata", or "playlist_sync"
users_processedintNumber of users processed in this tick
duration_msint64Total wall-clock time for the tick
errorsintNumber of users that failed (optional, default 0)

REQ-BG-002 — The metric.background_tick event MUST be emitted at Info level regardless of whether errors occurred during the tick. Individual user errors MUST continue to be logged separately at Error level.

REQ-BG-003 — The metric.sync event MUST be emitted after each per-user sync operation completes (success or failure):

AttributeTypeDescription
providerstringProvider name: "navidrome", "spotify", "lastfm"
listens_syncedintNumber of new listens imported
playlists_syncedintNumber of playlists synced
duration_msint64Time to complete this provider sync
successboolWhether the sync completed without error
errorstringError message if success=false, empty otherwise

REQ-BG-004 — The metric.enricher event MUST be emitted after each enricher completes processing an entity batch:

AttributeTypeDescription
enricherstringEnricher name: "lidarr", "musicbrainz", "navidrome", "spotify", "lastfm", "fanart", "openai"
entity_typestringEntity type processed: "artist", "album", "track"
entities_processedintNumber of entities processed (optional)
duration_msint64Time to complete enrichment
successboolWhether enrichment completed without error
errorstringError message if success=false, empty otherwise

Request Instrumentation

REQ-REQ-001 — The existing request logging middleware (internal/middleware/logging.go) MUST be updated to use the metric.request message prefix for consistency with the metric event convention. The existing attributes (method, path, status, remote_ip, latency, request_id) MUST be preserved.

REQ-REQ-002 — The request logging middleware MUST add a duration_ms attribute (in addition to the existing latency duration) for consistency with other metric events:

AttributeTypeDescription
methodstringHTTP method
pathstringRequest path
statusintResponse status code
remote_ipstringClient IP address
latencydurationRequest duration (existing, for human readability)
duration_msint64Request duration in milliseconds (for machine parsing)
request_idstringRequest ID from middleware

LLM Instrumentation

REQ-LLM-001 — Every LLM API call (OpenAI or LiteLLM-compatible backend) MUST emit a metric.llm event upon completion:

AttributeTypeDescription
modelstringModel identifier (e.g., "gpt-4o", "gpt-4o-mini")
operationstringOperation type: "mixtape_generate", "playlist_enhance", "similar_artists", "artist_bio", "album_review", "track_review", "playlist_metadata"
tokens_usedintTotal tokens consumed (prompt + completion)
prompt_tokensintPrompt tokens (optional, if available from API response)
completion_tokensintCompletion tokens (optional, if available from API response)
duration_msint64Wall-clock time for the API call
successboolWhether the API call succeeded
errorstringError message if success=false, empty otherwise

REQ-LLM-002 — Token counts MUST be extracted from the OpenAI API response usage field when available. If the API response does not include usage data (e.g., streaming responses), tokens_used MUST be set to 0 and the event MUST still be emitted.

REQ-LLM-003 — The operation attribute MUST distinguish between different AI use cases so that token consumption can be analyzed per feature.

Track Matching Instrumentation

REQ-MATCH-001 — Each track matching attempt MUST emit a metric.track_match event:

AttributeTypeDescription
strategystringMatching strategy used: "isrc", "exact", "fuzzy"
matchedboolWhether a match was found
confidencefloat64Match confidence score (0.0-1.0, relevant for fuzzy matching)
duration_msint64Time to complete the matching attempt
source_providerstringProvider the track originated from (optional)
target_providerstringProvider being matched against (optional)

REQ-MATCH-002 — If a track is matched via the first strategy (ISRC), the event for that strategy MUST have matched=true. Subsequent strategies (exact, fuzzy) MUST NOT emit events for that track — only the successful (or final failing) strategy emits.

REQ-MATCH-003 — For tracks that fail all three strategies, a single metric.track_match event MUST be emitted with strategy="fuzzy" (the last attempted), matched=false, and confidence=0.0.


Metric Event Examples

JSON format (SPOTTER_LOG_FORMAT=json)

{"time":"2026-02-21T10:30:00Z","level":"INFO","msg":"metric.sync","provider":"navidrome","listens_synced":42,"playlists_synced":3,"duration_ms":1234,"success":true,"error":""}
{"time":"2026-02-21T10:31:00Z","level":"INFO","msg":"metric.enricher","enricher":"musicbrainz","entity_type":"artist","entities_processed":15,"duration_ms":8500,"success":true,"error":""}
{"time":"2026-02-21T10:32:00Z","level":"INFO","msg":"metric.llm","model":"gpt-4o","operation":"mixtape_generate","tokens_used":2150,"prompt_tokens":1800,"completion_tokens":350,"duration_ms":4200,"success":true,"error":""}
{"time":"2026-02-21T10:33:00Z","level":"INFO","msg":"metric.track_match","strategy":"fuzzy","matched":true,"confidence":0.85,"duration_ms":12,"source_provider":"spotify","target_provider":"navidrome"}

Text format (default)

time=2026-02-21T10:30:00Z level=INFO msg=metric.sync provider=navidrome listens_synced=42 playlists_synced=3 duration_ms=1234 success=true error=""

Querying with jq

# All sync metrics for navidrome
cat spotter.log | jq 'select(.msg == "metric.sync" and .provider == "navidrome")'

# Average LLM token usage per operation
cat spotter.log | jq 'select(.msg == "metric.llm") | {operation, tokens_used}'

# Track matching success rate by strategy
cat spotter.log | jq 'select(.msg == "metric.track_match") | {strategy, matched}'

# Failed enricher runs
cat spotter.log | jq 'select(.msg == "metric.enricher" and .success == false)'

Scenarios

Scenario 1: Operator analyzes sync performance

Given SPOTTER_LOG_FORMAT=json is set
And the sync loop has been running for 24 hours
When the operator runs: cat spotter.log | jq 'select(.msg == "metric.sync")'
Then they see one event per provider per user per tick
And each event includes listens_synced, duration_ms, and success
And they can calculate average sync duration and total listens synced per day

Scenario 2: Operator investigates high token usage

Given SPOTTER_LOG_FORMAT=json is set
And multiple users have been generating mixtapes
When the operator runs: cat spotter.log | jq 'select(.msg == "metric.llm") | .tokens_used' | awk '{sum+=$1} END {print sum}'
Then they see total token consumption across all LLM operations
And they can break down usage by operation type to identify the most expensive feature

Scenario 3: Text format backward compatibility

Given SPOTTER_LOG_FORMAT is not set (or set to "text")
When the application starts and runs background jobs
Then all metric events are emitted in slog TextHandler format
And existing log parsing scripts continue to work
And the only difference from pre-observability output is the addition of new metric.* log lines

Scenario 4: Track matching strategy effectiveness

Given SPOTTER_LOG_FORMAT=json is set
And a playlist sync has matched 100 tracks
When the operator queries metric.track_match events
Then they can determine: 60 matched via ISRC, 25 via exact match, 10 via fuzzy match, 5 unmatched
And they can see the average confidence score for fuzzy matches
And they can identify which provider pairs have the lowest match rates

Implementation Notes

  • Logger initialization: cmd/server/main.go:38-41 — add SPOTTER_LOG_FORMAT check and handler selection
  • Config addition: internal/config/config.go — add Log.Format field with Viper binding to SPOTTER_LOG_FORMAT
  • Request middleware: internal/middleware/logging.go:18 — change message to metric.request, add duration_ms
  • Sync instrumentation: internal/services/syncer.go — emit metric.sync after each provider sync
  • Enricher instrumentation: internal/services/metadata.go — emit metric.enricher after each enricher run
  • Background tick: cmd/server/main.go — emit metric.background_tick after each loop iteration
  • LLM instrumentation: internal/vibes/generator.go, internal/vibes/enhancer.go, internal/services/similar_artists.go — emit metric.llm after each API call
  • Track matching: internal/services/ (track matching code) — emit metric.track_match per matching attempt
  • Governing comment: // Governing: <a href="/decisions/ADR-0019-structured-metrics-observability" className="rfc-ref">📝 ADR-0019</a> (structured metrics), <a href="/decisions/ADR-0010-stdlib-slog-structured-logging" className="rfc-ref">📝 ADR-0010</a> (slog), SPEC observability