Design: Listen and Playlist Synchronization Service
Context
Spotter is a companion application for Navidrome that aggregates listening history and playlists from multiple music providers (Spotify, Last.fm, Navidrome). Without a centralized sync service, each provider would need its own ingestion pipeline, leading to duplicated logic for deduplication, scheduling, and event notification. The sync service consolidates all provider data ingestion into a single orchestration layer that runs on a configurable background ticker and supports on-demand triggers from the HTTP layer.
Governing ADRs: [📝 ADR-0007](../../adrs/📝 ADR-0007-in-memory-event-bus) (in-memory event bus for SSE notifications), [📝 ADR-0005](../../adrs/📝 ADR-0005-navidrome-primary-identity-provider) (Navidrome as primary identity), [📝 ADR-0013](../../adrs/📝 ADR-0013-goroutine-ticker-background-scheduling) (goroutine ticker scheduling), [📝 ADR-0016](../../adrs/📝 ADR-0016-pluggable-provider-factory-pattern) (pluggable provider factory).
Goals / Non-Goals
Goals
- Unified orchestration of history and playlist ingestion from all configured providers
- Incremental history sync using a per-provider timestamp watermark (last `PlayedAt`)
- Cross-provider listen deduplication with a 2-minute time window
- Playlist upsert with track position preservation and removal of stale tracks
- Background scheduling on a configurable interval (default 5 minutes)
- On-demand sync triggered from HTTP handlers (returns 202, runs in background)
- Per-provider backoff and error classification for resilient retries
- Structured metric events (`metric.sync`, `metric.background_tick`) for observability
- Sync event audit logging to the database (`SyncEvent` entity)
Non-Goals
- Metadata enrichment (delegated to the Metadata Enrichment Pipeline)
- Playlist write-back to Navidrome (delegated to the Playlist Sync to Navidrome spec)
- Provider OAuth token management (delegated to each provider's factory)
- Cross-user sync coordination or distributed locking
Decisions
Factory-Based Provider Discovery
Choice: Provider instances are created per-user at sync time via registered providers.Factory functions.
Rationale: Factories encapsulate per-user credential lookup (e.g., user.Edges.SpotifyAuth).
Returning nil, nil for unconfigured providers allows silent skipping without error logging.
This decouples the Syncer from provider-specific authentication details.
Alternatives considered:
- Direct provider construction in the Syncer: would couple Syncer to every provider's auth model
- Singleton providers shared across users: would require thread-safe token management per user
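As a rough illustration of this contract, the sketch below shows how a per-user factory might return `(nil, nil)` for an unconfigured provider. The `Factory` signature, the `Provider` interface, and the simplified `User`/`SpotifyAuth` types are assumptions for illustration; the real code works against the Ent-generated user entity and its auth edges.

```go
package providers

import "context"

// Provider is a stand-in for the capability interfaces the real package
// exposes (HistoryFetcher, PlaylistManager, ...).
type Provider interface {
	Name() string
}

// User and SpotifyAuth are simplified stand-ins for the Ent user entity
// with its auth edges loaded.
type User struct {
	SpotifyAuth *SpotifyAuth
}

type SpotifyAuth struct {
	AccessToken string
}

// Factory builds a per-user provider instance at sync time. Returning
// (nil, nil) means "not configured for this user": the Syncer skips the
// provider silently. A non-nil error is logged and the factory is skipped.
type Factory func(ctx context.Context, user *User) (Provider, error)

// NewSpotifyFactory illustrates the (nil, nil) convention.
func NewSpotifyFactory() Factory {
	return func(ctx context.Context, user *User) (Provider, error) {
		if user.SpotifyAuth == nil {
			return nil, nil // user never linked Spotify: skip without error
		}
		return &spotifyProvider{token: user.SpotifyAuth.AccessToken}, nil
	}
}

type spotifyProvider struct{ token string }

func (p *spotifyProvider) Name() string { return "spotify" }
```

The factory closure is the only place that knows which credential edge a provider needs, which is what keeps the Syncer free of provider-specific authentication details.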
Cross-Provider Deduplication via Time Window
Choice: Before persisting a listen, check for an existing listen with the same track name,
artist name, and a PlayedAt within a 2-minute window across all providers.
Rationale: The same song played once may be reported by both Spotify and Last.fm with slightly different timestamps. A 2-minute window catches these duplicates without false positives for genuinely separate plays of the same track.
Alternatives considered:
- Exact timestamp match per provider only: misses cross-provider duplicates
- ISRC-based matching: not all providers populate ISRC on history tracks
- Longer time windows (5+ minutes): risk of false positive dedup for repeated listens
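A minimal sketch of the window check, written as an in-memory scan for clarity; the real `isDuplicateListen()` runs a query against persisted listens, and the field names here are assumptions.

```go
package services

import (
	"strings"
	"time"
)

// dedupWindow is the cross-provider time window described above.
const dedupWindow = 2 * time.Minute

// Listen is a simplified stand-in for the persisted listen entity.
type Listen struct {
	TrackName  string
	ArtistName string
	PlayedAt   time.Time
}

// isDuplicateListen reports whether the candidate matches an existing listen
// with the same track and artist whose PlayedAt falls within +/- dedupWindow,
// regardless of which provider reported it.
func isDuplicateListen(existing []Listen, candidate Listen) bool {
	for _, l := range existing {
		if !strings.EqualFold(l.TrackName, candidate.TrackName) ||
			!strings.EqualFold(l.ArtistName, candidate.ArtistName) {
			continue
		}
		delta := candidate.PlayedAt.Sub(l.PlayedAt)
		if delta < 0 {
			delta = -delta
		}
		if delta <= dedupWindow {
			return true
		}
	}
	return false
}
```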
Error Classification with Backoff
Choice: Sync errors are classified into Fatal and Retriable categories. Fatal errors
trigger user notifications and email alerts; retriable errors use exponential backoff.
Rationale: A revoked OAuth token (fatal) requires user intervention, while a transient 503 (retriable) will resolve on its own. Distinguishing these prevents unnecessary user alerts while ensuring real failures are surfaced promptly.
Alternatives considered:
- Uniform retry for all errors: would spam notifications on transient failures
- No backoff: would hammer failing providers every tick
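A sketch of how classification and exponential backoff could fit together. `BackoffManager` and the Fatal/Retriable split come from the design above; the method names, the jitter, the `classify` helper, and the in-memory map are illustrative assumptions.

```go
package services

import (
	"math/rand"
	"net/http"
	"time"
)

// ErrorClass separates failures that need user intervention from those
// expected to resolve on their own.
type ErrorClass int

const (
	Retriable ErrorClass = iota // e.g. a transient 503: back off and retry
	Fatal                       // e.g. a revoked OAuth token: notify the user
)

// classify is illustrative only: auth failures are fatal, everything else retries.
func classify(statusCode int) ErrorClass {
	if statusCode == http.StatusUnauthorized || statusCode == http.StatusForbidden {
		return Fatal
	}
	return Retriable
}

// BackoffManager tracks per-provider retry state in memory (cleared on
// restart, as noted under Risks).
type BackoffManager struct {
	base    time.Duration
	max     time.Duration
	retries map[string]int // provider name -> consecutive failures
}

func NewBackoffManager(base, max time.Duration) *BackoffManager {
	return &BackoffManager{base: base, max: max, retries: map[string]int{}}
}

// NextDelay records a retriable failure and returns the exponential backoff
// delay (with jitter) before the provider should be tried again.
func (b *BackoffManager) NextDelay(provider string) time.Duration {
	n := b.retries[provider]
	b.retries[provider] = n + 1
	d := b.base << n // base * 2^n; overflow is capped below
	if d <= 0 || d > b.max {
		d = b.max
	}
	return d + time.Duration(rand.Int63n(int64(b.base)))
}

// Reset clears backoff state after a successful sync.
func (b *BackoffManager) Reset(provider string) { delete(b.retries, provider) }
```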
Architecture
Sync Orchestration Flow
Listen Deduplication Flow
Key Implementation Details
Primary file: `internal/services/sync.go` (~920 lines)
- `Syncer` struct holds `[]providers.Factory`, `*BackoffManager`, `SyncNotifier`, and the Ent client, config, logger, and event bus.
- `Register(factory)` appends a factory to the slice. Called at startup in `cmd/server/main.go` for Navidrome, Spotify, and Last.fm.
- `getActiveProviders(ctx, user)` refreshes the user with all auth edges loaded (`WithSpotifyAuth`, `WithNavidromeAuth`, `WithLastfmAuth`), then calls each factory. `nil` returns are silently skipped; errors are logged and the factory is skipped.
- `syncHistory()` iterates HistoryFetcher providers, queries the most recent `Listen.PlayedAt` for the timestamp watermark, then calls `GetRecentListens` with a batched callback.
- `persistListens()` performs the two-layer deduplication: cross-provider (2-min window) and exact match (provider+track+played_at). New listens are published to the event bus.
- `syncPlaylists()` iterates PlaylistManager providers. Spotter-managed Navidrome playlists (those synced from other sources) are excluded from re-import.
- `persistPlaylistTracks()` uses a two-phase approach (see the sketch after this list): move existing tracks to negative positions to avoid unique constraint conflicts, then upsert new positions.
- `logEvent()` persists structured `SyncEvent` records for audit logging.
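A sketch of the two-phase position update, written against a hypothetical plain-SQL `playlist_tracks` table with a unique `(playlist_id, position)` constraint and a `(playlist_id, track_id)` upsert key; the real implementation goes through Ent and, as noted under Risks, is not wrapped in a transaction.

```go
package services

import (
	"context"
	"database/sql"
)

// persistPlaylistTracks sketches the two-phase approach: park existing rows at
// negative positions, upsert the new ordering, then delete stale rows.
func persistPlaylistTracks(ctx context.Context, db *sql.DB, playlistID int, trackIDs []int) error {
	// Phase 1: move existing rows to negative positions so the
	// (playlist_id, position) unique constraint cannot collide while
	// positions are rewritten. (-position - 1 keeps position 0 out of the way.)
	if _, err := db.ExecContext(ctx,
		`UPDATE playlist_tracks SET position = -position - 1 WHERE playlist_id = ?`,
		playlistID); err != nil {
		return err
	}

	// Phase 2: upsert every track at its new position.
	for pos, trackID := range trackIDs {
		if _, err := db.ExecContext(ctx,
			`INSERT INTO playlist_tracks (playlist_id, track_id, position)
			 VALUES (?, ?, ?)
			 ON CONFLICT (playlist_id, track_id) DO UPDATE SET position = excluded.position`,
			playlistID, trackID, pos); err != nil {
			return err
		}
	}

	// Rows still at negative positions were not re-upserted: they are stale
	// tracks that were removed from the source playlist.
	_, err := db.ExecContext(ctx,
		`DELETE FROM playlist_tracks WHERE playlist_id = ? AND position < 0`,
		playlistID)
	return err
}
```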
Background scheduler: `cmd/server/main.go:206-262` — goroutine with `time.NewTicker`, per-user goroutine fan-out with bounded concurrency via a channel semaphore, emits `metric.background_tick` after each tick.
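A condensed sketch of that loop; the helper name and stub types are assumptions, but the ticker, the channel semaphore, and the end-of-tick metric emission mirror the description above.

```go
package scheduler

import (
	"context"
	"sync"
	"time"
)

// Stand-ins for the real Syncer and user entity.
type User struct{ ID int }

type Syncer struct{}

func (s *Syncer) Sync(ctx context.Context, u *User) error { return nil }

// runBackgroundSync ticks on a configurable interval, fans out one goroutine
// per user bounded by a channel semaphore, and emits metric.background_tick
// once the whole tick completes.
func runBackgroundSync(ctx context.Context, syncer *Syncer, listUsers func(context.Context) ([]*User, error), interval time.Duration, maxConcurrent int) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	sem := make(chan struct{}, maxConcurrent)

	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			users, err := listUsers(ctx)
			if err != nil {
				continue // logged in the real implementation
			}
			var wg sync.WaitGroup
			for _, u := range users {
				u := u
				wg.Add(1)
				sem <- struct{}{} // acquire a concurrency slot
				go func() {
					defer wg.Done()
					defer func() { <-sem }() // release the slot
					_ = syncer.Sync(ctx, u)  // per-provider errors handled via backoff
				}()
			}
			wg.Wait()
			// metric.background_tick would be published here
		}
	}
}
```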
On-demand trigger: HTTP handlers call `syncer.Sync(ctx, user)` in a background goroutine and return 202 Accepted immediately.
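A sketch of that handler shape under assumed type and function names: the sync runs on a detached context in a background goroutine so it outlives the request, and the handler responds with 202 right away.

```go
package api

import (
	"context"
	"net/http"
)

// Stand-ins for the real syncer and authenticated-user lookup.
type User struct{ ID int }

type Syncer interface {
	Sync(ctx context.Context, u *User) error
}

// TriggerSync kicks off a sync in the background and returns 202 Accepted.
func TriggerSync(syncer Syncer, currentUser func(*http.Request) *User) http.HandlerFunc {
	return func(w http.ResponseWriter, r *http.Request) {
		user := currentUser(r)
		if user == nil {
			http.Error(w, "unauthorized", http.StatusUnauthorized)
			return
		}
		go func() {
			// Use a fresh context: r.Context() is canceled when the handler returns.
			_ = syncer.Sync(context.Background(), user)
		}()
		w.WriteHeader(http.StatusAccepted) // 202: sync queued, runs in background
	}
}
```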
Risks / Trade-offs
- Cross-provider dedup window is heuristic — A 2-minute window may miss duplicates with larger timestamp drift (e.g., scrobbler delay). It may also false-positive on repeated plays of the same song within 2 minutes. The current window is a pragmatic balance.
- No transaction around playlist track sync — The two-phase position update (negative positions, then real positions) is not wrapped in a single transaction. A crash mid-sync could leave tracks with negative positions. Mitigation: the next sync will correct them.
- Unbounded per-user goroutines — While a semaphore limits concurrency, the goroutine count is still proportional to user count. For Spotter's personal-use deployment model, this is acceptable.
- Backoff state is in-memory — Process restart clears all backoff state. A provider with a persistent error will be retried immediately after restart.
- History sync defaults to epoch for new users — The `since` timestamp falls back to `time.Unix(0, 0)` rather than a configurable lookback. This imports the full available history, which may be very large for some providers.
Migration Plan
The sync service was implemented as part of the initial Spotter architecture. No migration from a prior system was required. Key implementation milestones:
- Core sync loop: `Syncer.Sync()` with factory-based provider discovery
- Background scheduler: goroutine + ticker in `main.go` with per-user fan-out
- Deduplication: cross-provider time-window check added to prevent double-counting
- Backoff and error classification: `BackoffManager` added for resilient retry behavior
- Observability: `metric.sync` and `metric.background_tick` events added per 📝 ADR-0019
- Email notifications: `SyncNotifier` interface added for fatal error alerting
Open Questions
- Should the cross-provider dedup window (2 minutes) be configurable via environment variable? Currently hardcoded in `isDuplicateListen()`.
- Should the sync service support partial playlist sync (only playlists changed since last sync) rather than fetching all playlists every tick?
- Should backoff state be persisted to the database to survive process restarts?
- The `since` timestamp for new users defaults to epoch. Should this be configurable (e.g., `SPOTTER_SYNC_HISTORY_LOOKBACK`) to limit initial import size?