# Agent SOPs
> Token-optimized workflow instructions for AI agents using reklawdbox.
Concise, imperative SOPs for AI agents. Each includes a copyable agent prompt. ### LLM-friendly access Per-SOP plaintext URLs for init prompts: | Workflow | URL | | -------------------- | ------------------------------------------------------------------------------------ | | Batch Import | [`/_llms-txt/batch-import-sop.txt`](/_llms-txt/batch-import-sop.txt) | | Collection Audit | [`/_llms-txt/collection-audit-sop.txt`](/_llms-txt/collection-audit-sop.txt) | | Genre Classification | [`/_llms-txt/genre-classification-sop.txt`](/_llms-txt/genre-classification-sop.txt) | | Genre Audit | [`/_llms-txt/genre-audit-sop.txt`](/_llms-txt/genre-audit-sop.txt) | | Set Building | [`/_llms-txt/set-building-sop.txt`](/_llms-txt/set-building-sop.txt) | | Pool Building | [`/_llms-txt/pool-building-sop.txt`](/_llms-txt/pool-building-sop.txt) | | Chapter Set Planning | [`/_llms-txt/chapter-set-planning-sop.txt`](/_llms-txt/chapter-set-planning-sop.txt) | | Metadata Backfill | [`/_llms-txt/metadata-backfill-sop.txt`](/_llms-txt/metadata-backfill-sop.txt) | | Library Health | [`/_llms-txt/library-health-sop.txt`](/_llms-txt/library-health-sop.txt) | All SOPs combined: [`/_llms-txt/agent-sops.txt`](/_llms-txt/agent-sops.txt) See also [`/llms.txt`](/llms.txt) for the full site index.
## Workflows * [Batch Import](/agent/batch-import/) — prepare newly acquired music for Rekordbox import * [Genre Classification](/agent/genre-classification/) — classify genres using Discogs, Beatport, and audio evidence * [Genre Audit](/agent/genre-audit/) — audit existing genre tags against enrichment and audio evidence * [Set Building](/agent/set-building/) — build transition-scored DJ set sequences * [Pool Building](/agent/pool-building/) — build compatible track pools for live improvisation * [Chapter Set Planning](/agent/chapter-set-planning/) — plan multi-chapter sets with bridge tracks * [Collection Audit](/agent/collection-audit/) — detect and fix naming, tagging, and convention violations * [Metadata Backfill](/agent/metadata-backfill/) — backfill labels, years, and albums from enrichment caches * [Library Health](/agent/library-health/) — scan broken links, orphan files, duplicates, and playlist coverage ## Before You Start Run `cache_coverage()` first. Most workflows require cached data — hydrate if coverage is low. ## Key Principles * **Read-only DB.** Changes staged in memory, exported as XML. * **Human approval required.** Present recommendations, wait for confirmation. * **Cache-first.** Flag gaps rather than triggering external calls mid-workflow. * **Export path:** `update_tracks()` → `preview_changes()` → `write_xml()`
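The export path in Key Principles, written out as the tool-call sequence the individual SOPs use (the `update_tracks` payload shown here is an illustrative assumption; see each SOP for the real fields):

```plaintext
update_tracks(updates=[{track_id: "...", genre: "Deep Techno"}])  # stage edits in memory
preview_changes()                                                 # review the staged diff
write_xml()                                                       # export XML for Rekordbox import
```

Nothing writes to the Rekordbox database directly; the user imports the exported XML manually.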
# Batch Import
> Agent SOP for preparing newly acquired music for Rekordbox import.
Process newly acquired music (downloaded albums, loose tracks, zips) into an organized structure ready for Rekordbox import. **Goal:** After this SOP, any processed track can be imported into Rekordbox with Artist, Title, Album, Year, Track Number, and Label displaying correctly. Genre is handled separately via the genre classification SOP. ### Agent prompt Paste into your agent to start: ```plaintext Process the new music in [/path/to/folder] for Rekordbox import. ``` ## Constraints * **Tags are source of truth.** Write tags before renaming — filenames are derived from tags. * **Never set genre.** Leave genre tags empty; handled via genre classification SOP. * **Stop on ambiguity.** Never guess artist names, album years, or label names — ask the user. * **Process album by album.** Complete one fully before starting the next. * **WAV dual-tag rule.** WAV files need both ID3v2 and RIFF INFO in sync. `write_file_tags` handles this automatically. * **Verify before moving.** Confirm tags and filenames before moving to final location. ## Prerequisites | Tool | Purpose | | ---------------- | ------------------------------------------------------------------ | | `reklawdbox` MCP | Tag reading/writing, Discogs/Beatport lookups, cover art embedding | `lookup_discogs`, `lookup_beatport`, `lookup_bandcamp`, `lookup_musicbrainz`, `read_file_tags`, `write_file_tags`, `embed_cover_art`, and `extract_cover_art` are MCP tool calls. *** ## Phase 1: Assessment ### Step 1: Discover library structure Before importing, understand how the destination library is organized. Use `read_library` to find the content root, then scan 2 levels deep: ```sh find "/path/to/content-root" -maxdepth 2 -type d | head -80 ``` Identify: * **Album directory** — which top-level directory contains `Artist/Album (Year)/` subdirectories? * **Loose track directory** — which top-level directory contains batch folders with flat audio files?
Present your findings to the user and confirm before proceeding: ```plaintext Your library structure appears to be: - Albums: {albums-dir}/{Artist}/{Album (Year)}/ - Loose tracks: {loose-dir}/{batch-name}/ Is this correct? What should this batch's loose tracks folder be named? ``` Use the confirmed paths for all move operations in Phase 2 Step 9 and Phase 3 Step 4. If the library is empty or the structure is unclear, ask the user to define their preferred convention. ### Step 2: List the batch directory ```sh ls -la "/path/to/batch/" ``` Categorize: directories (albums/EPs), audio files at root (loose tracks), zip files (need extraction). ### Step 3: Handle zip files Extract all root-level zips. Already-extracted zips (matching directory exists with files) are archived without re-extraction. Failed extractions go to `_failed_zips/`. ```sh cd "/path/to/batch" mkdir -p "_processed_zips" "_failed_zips" find . -maxdepth 1 -type f -name "*.zip" -print0 | while IFS= read -r -d '' zip; do zip="${zip#./}" dir="${zip%.zip}" if [ -d "$dir" ] && find "$dir" -type f -print -quit | grep -q .; then mv "$zip" "_processed_zips/$zip" echo "Archived already-extracted: $zip" continue fi tmp_dir="$(mktemp -d "${TMPDIR:-/tmp}/batch-import-unzip.XXXXXX")" if unzip -o "$zip" -d "$tmp_dir"; then mkdir -p "$dir" find "$tmp_dir" -mindepth 1 -maxdepth 1 -exec mv -n {} "$dir"/ \; if find "$dir" -type f -print -quit | grep -q .; then mv "$zip" "_processed_zips/$zip" echo "Extracted: $zip" else mv "$zip" "_failed_zips/$zip" echo "No files extracted: $zip" fi else mv "$zip" "_failed_zips/$zip" echo "Extraction failed: $zip" fi rm -rf "$tmp_dir" done ``` ### Step 4: Report to user Summarize what was found: album directories, loose tracks, zip results. *** ## Phase 2: Process Albums For each album subdirectory, follow these steps in order. 
### Step 1: Survey current state ```sh ls -la "/path/to/batch/Album Directory/" ``` Read tags for all files: ```plaintext read_file_tags(directory="/path/to/batch/Album Directory/") ``` Note: current filename pattern, which tags are present, whether cover art exists. ### Step 2: Parse directory name Common incoming patterns: * `Artist Name - Album Name` * `Artist Name - Album Name (Year)` * `Artist Name - Album Name [FLAC 24-96]` Extract: artist, album name, year (if present). ### Step 3: Determine album type * **Same artist on all tracks** → Single Artist * **Different artists per track** → VA Compilation * **“Various Artists” in dir name** → VA Compilation For VA: label name is **required**. Check Publisher tag, directory name, or look up. ### Step 4: Look up metadata ```plaintext lookup_discogs(artist="Artist Name", title="First Track Title", album="Album Name") ``` If no Discogs result, try Beatport: ```plaintext lookup_beatport(artist="Artist Name", title="First Track Title") ``` If still missing data (common with self-released and digital-only music), try Bandcamp: ```plaintext lookup_bandcamp(artist="Artist Name", title="First Track Title") ``` Bandcamp is the authoritative source for Bandcamp-sourced downloads and is often the **only** source for self-released music. The tool handles search and metadata extraction with caching automatically. If still missing year or label, try MusicBrainz as a final fallback: ```plaintext lookup_musicbrainz(artist="Artist Name", title="First Track Title") ``` Use results for: release year, label name, artist/album spelling verification. **Stop and ask** on: multiple matches with different years, no results and year/label unknown, ambiguous artist. Never use lookup results for genre. ### Step 5: Write tags Use `write_file_tags` for all tag writes. It handles WAV dual-tagging automatically. 
**Album-wide + per-track tags:** ```plaintext write_file_tags(writes=[ {path: "/path/to/album/01 original.flac", tags: { artist: "Track Artist", title: "Track Title", track: "1", album: "Album Name", year: "YEAR", publisher: "Label Name", album_artist: "Artist Name" }}, ... ]) ``` **Per-track tags** — parse from filenames. Common incoming patterns: | Pattern | Parse as | | ------------------------------- | ------------------------------- | | `Artist - Album - NN Title.wav` | Track N: Artist - Title | | `NN Artist - Title.wav` | Track N: Artist - Title | | `NN Title.wav` | Track N: \[AlbumArtist] - Title | | `NN. Title.wav` | Track N: \[AlbumArtist] - Title | ### Step 6: Verify tags ```plaintext read_file_tags(directory="/path/to/album/") ``` Confirm every file has: Artist, Title, Track, Album, Year. ### Step 7: Rename files from tags Construct target filenames from tags: `NN Artist - Title.ext` Use `mv -n` (no-clobber) for each file. Validate the target filename is non-empty before renaming. ```sh mv -n "/path/to/album/old filename.flac" "/path/to/album/01 Artist Name - Track Title.flac" ``` If rename produces unexpected results, stop and check tags — rename depends entirely on tag correctness. ### Step 8: Embed cover art Check for existing cover art in the directory: ```sh find "/path/to/album/" -maxdepth 1 -type f \( -iname "*.jpg" -o -iname "*.jpeg" -o -iname "*.png" \) -print ``` **Single obvious cover** (`cover.jpg`, `front.jpg`, `folder.jpg`): ```plaintext embed_cover_art(image_path="/path/to/album/cover.jpg", targets=["/path/to/album/01 Artist - Track.flac", ...]) ``` **Multiple images:** Ask user which is the cover. 
**No images:** Try `lookup_bandcamp(artist="...", title="...")` first — if the result includes a `cover_image` URL, download it: ```sh curl -sL -o "/path/to/album/cover.jpg" "BANDCAMP_COVER_URL" ``` If no Bandcamp result, try `lookup_discogs(...)` — if result includes `cover_image`, download and embed: ```sh curl -sL -o "/path/to/album/cover.jpg" "DISCOGS_COVER_IMAGE_URL" ``` ```plaintext embed_cover_art(image_path="/path/to/album/cover.jpg", targets=["/path/to/album/01 Artist - Track.flac", ...]) ``` If no cover from `lookup_bandcamp` or `lookup_discogs`, note for user to source manually. ### Step 9: Create target directory and move files Use the album directory convention established in Phase 1 Step 1. Determine clean directory name (strip tech specs, add year, clean special chars). ```sh # Single Artist — use the confirmed album directory mkdir -p "/path/to/albums-dir/Artist Name/Album Name (Year)" # Or VA mkdir -p "/path/to/albums-dir/Various Artists/Label Name/Album Name (Year)" # Move audio + cover art find "/path/to/batch/Old Dir" -maxdepth 1 -type f \ \( -iname "*.wav" -o -iname "*.flac" -o -iname "*.mp3" \) \ -exec mv -n {} "/path/to/albums-dir/Artist Name/Album Name (Year)/" \; find "/path/to/batch/Old Dir" -maxdepth 1 -type f -iname "cover.*" \ -exec mv -n {} "/path/to/albums-dir/Artist Name/Album Name (Year)/" \; # Remove old empty directory rmdir "/path/to/batch/Old Dir" ``` ### Step 10: Verify final state ```sh ls -la "/path/to/albums-dir/Artist Name/Album Name (Year)/" ``` ```plaintext read_file_tags(directory="/path/to/albums-dir/Artist Name/Album Name (Year)/") ``` Confirm: files in correct location, `NN Artist - Title.ext` format, all tags present, VA has label subdirectory, cover art embedded. *** ## Phase 3: Process Loose Tracks For audio files at the root of the batch directory. ### Step 1: Clean filenames Strip store-generated `(Original)`, `(Original Mix)`, and `(Original Version)` suffixes.
Keep other parenthetical info (`(Remix)`, `(Edit)`, `(Original Club Mix)`, etc.): ```sh cd "/path/to/batch" find . -maxdepth 1 -type f \( -name "* (Original Mix).*" -o -name "* (Original Version).*" -o -name "* (Original).*" \) -print0 | while IFS= read -r -d '' f; do new_name="$(echo "$f" | sed -E 's/ \(Original( (Mix|Version))?\)//')" mv -n "$f" "$new_name" done ``` Expected format: `Artist Name - Track Title.ext`. If unparseable, ask user. ### Step 2: Read/write tags For each loose track, read existing tags: ```plaintext read_file_tags(paths=["/path/to/batch/Artist - Title.wav"]) ``` If tags are missing, look up with `lookup_discogs(...)` / `lookup_beatport(...)` / `lookup_bandcamp(...)` / `lookup_musicbrainz(...)`. Bandcamp is often the only source for self-released and digital-only music. Write tags (include album — almost always available from the source release): ```plaintext write_file_tags(writes=[{ path: "/path/to/batch/Artist - Title.wav", tags: {artist: "Artist Name", title: "Track Title", album: "Release Name", publisher: "Label Name", year: "YEAR"} }]) ``` ### Step 3: Embed cover art Check for `cover_Artist Name - Track Title.jpg` files. If found: ```plaintext embed_cover_art(image_path="/path/to/cover.jpg", targets=["/path/to/track.flac"]) ``` If no local cover file exists, try `lookup_bandcamp(artist="...", title="...")` — if the result includes a `cover_image` URL, download it: ```sh curl -sL -o "/path/to/cover_Artist - Title.jpg" "BANDCAMP_COVER_URL" ``` ```plaintext embed_cover_art(image_path="/path/to/cover_Artist - Title.jpg", targets=["/path/to/Artist - Title.wav"]) ``` **WAV files:** Keep the cover image file alongside the audio — Rekordbox cannot import cover art from WAV tags, so the user needs the image colocated for manual import. For other formats (FLAC, MP3, AIFF), clean up the downloaded image after embedding. If no Bandcamp result, try `lookup_discogs(...)` for `cover_image`. Note any tracks without cover art in the report. 
### Step 4: Move loose tracks Use the loose track destination established in Phase 1 Step 1. ```sh mkdir -p "/path/to/loose-dir/batch-name" mv -n "/path/to/batch/Artist - Title.wav" "/path/to/loose-dir/batch-name/" ``` *** ## Phase 4: Multi-Disc Albums If an album has disc subdirectories (`CD1/`, `CD2/`, `Disc 1/`, etc.): * Track numbers restart at 01 per disc * Cover art at album root (not in disc folders) * Set Disc Number tag on each track via `write_file_tags` * Album-wide tags (album, year, publisher) go on all tracks across all discs *** ## Phase 5: Final Verification Summarize: albums processed (single artist vs VA), loose tracks processed, any unresolved items, WAV tracks needing manual cover art in Rekordbox, and next steps (import, collection audit SOP, genre classification SOP). *** ## Decision Reference ### Proceed automatically when * Artist clearly identified in tags, filename, or directory name * Year present in tags, directory name, or single clear Discogs match * Label present in tags or Discogs (for VA) * Single artist with consistent tags across album ### Stop and ask when * Multiple matches with different years * No results and year/label unknown * VA album but label unknown * Ambiguous: collaboration vs VA vs single artist * Multiple images — which is album cover? * Conflicting metadata between tags and filenames * Unparseable filenames *** ## Common Incoming Filename Patterns ### Album tracks | Pattern | Parse as | | ----------------------------------- | ------------------------------- | | `Artist - Album - NN Title.wav` | Track N: Artist - Title | | `Artist - Album - NN. Title.wav` | Track N: Artist - Title | | `NN Artist - Title.wav` | Track N: Artist - Title | | `NN. Artist - Title.wav` | Track N: Artist - Title | | `NN Title.wav` | Track N: \[AlbumArtist] - Title | | `NN. Title.wav` | Track N: \[AlbumArtist] - Title | | `Artist - Album - NN AX. Title.wav` | Track N: Artist - Title (vinyl) |
### Loose tracks | Pattern | Status | | ----------------------------------- | -------------------------------------------------------------------- | | `Artist - Title.wav` | Correct | | `Artist - Title (Remix Info).wav` | Correct | | `Artist, Artist B - Title.wav` | Correct | | `Artist - Title (Original Mix).wav` | Remove `(Original)` / `(Original Mix)` / `(Original Version)` suffix | | `Title.wav` | Missing artist — ask user |
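As a sketch of how the patterns above can be parsed mechanically, here is a minimal POSIX-shell example for the `NN Artist - Title` album-track pattern (the sample filename is invented):

```sh
# Split "NN Artist - Title.ext" into track / artist / title using
# parameter expansion; real filenames should be validated first.
f="03 Artist Name - Track Title.wav"
base=${f%.*}                 # drop extension
track=${base%% *}            # leading track number
rest=${base#* }              # "Artist - Title"
artist=${rest%% - *}         # text before the first " - "
title=${rest#* - }           # text after the first " - "
printf '%s | %s | %s\n' "$track" "$artist" "$title"
```

Both expansions split at the first ` - ` separator, which is why names that themselves contain ` - ` still fall under the stop-and-ask rule.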
# Chapter Set Planning
> Agent SOP for planning multi-chapter DJ sets with bridge tracks.
Plan a full set from locked chapters with bridge tracks between them. ### Agent prompt Paste into your agent to start: ```plaintext Plan a set from my locked chapters. ``` ## Constraints * **Read-only.** Chapter set planning never modifies track metadata. * **Human controls the result.** Agent proposes chapter order and bridges. User approves all decisions. * **Cache-first.** All scoring uses cached data. No external API calls. * **Chapters first.** This workflow assumes the user has already built and locked chapters (pools saved as playlists) using the Pool Building workflow. ## Prerequisites * At least 2 locked chapters (playlists) from the Pool Building workflow. Alternatively, `discover_pools` can find chapter candidates automatically from a library subset using Bron-Kerbosch clique enumeration. * User must identify each chapter as **sequenced** (fixed order) or **unordered** (improvise order live). ```plaintext cache_coverage() ``` All providers must be at 100%. ## Steps ### 1. Gather chapters Ask user for: * Chapter playlist IDs (from `get_playlists`) * Whether each chapter is **sequenced** or **unordered** * Target set duration and overall energy arc preference ### 2. Analyze each chapter For each chapter: ```plaintext describe_pool( playlist_id="...", master_tempo=false ) ``` Summarize per chapter: * Energy band, BPM center, dominant genre, key neighborhood * Cohesion score * For unordered chapters: optimal reference BPM Present a table of all chapters side by side. ### 3. Propose chapter order Based on the user’s energy arc preference, propose an ordering that creates a coherent energy progression across the night: * **Warmup** — lowest energy chapters first * **Build** — ascending energy * **Peak** — highest energy chapters * **Release** — descending energy to close Consider BPM progression — avoid large BPM jumps between adjacent chapters. Present the proposed order with energy flow visualization.
Ask user: “Approve order? / reorder / swap chapters” ### 4. Find bridge tracks For each chapter boundary (transition between chapters), find 1-3 bridge tracks that connect them: **Step 4a: Identify boundary tracks** * For **sequenced chapters**: the last 2-3 tracks of the outgoing chapter and first 2-3 of the incoming chapter * For **unordered chapters**: all members are potential boundary tracks **Step 4b: Search for bridges** Use `expand_pool` with seeds from both chapters’ boundary tracks: ```plaintext expand_pool( seed_track_ids=[...outgoing_boundary, ...incoming_boundary], additions=5, master_tempo=false ) ``` Tracks compatible with both sets of boundary tracks are natural bridges. **Step 4c: Score bridge candidates** For each candidate, score against both chapters: ```plaintext score_pool_compatibility( track_id="candidate", pool_track_ids=[...outgoing_boundary], master_tempo=false ) ``` ```plaintext score_pool_compatibility( track_id="candidate", pool_track_ids=[...incoming_boundary], master_tempo=false ) ``` Rank by the minimum of the two min-scores (must work with both sides). **Step 4d: Present bridge options** Present 3-5 bridge candidates per boundary with: * Compatibility scores to both chapters * BPM/energy position relative to the two chapters * Why each works (strongest axes) Ask user: “Pick bridge(s) / skip this boundary / search with different parameters” ### 5. Sequence unordered chapters For unordered chapters, propose internal sequence: ```plaintext build_set( track_ids=[...chapter_track_ids], target_tracks=N, priority="balanced", energy_curve="flat", master_tempo=false, beam_width=3 ) ``` Or use `score_transition` for pairwise evaluation if the chapter is small (≤5 tracks). Present ordering options. Ask user: “Pick sequence / keep unordered / adjust” **Weight presets.** Use `save_weight_preset` to save custom transition or pool weights for reuse. `list_weight_presets` shows all available presets (built-in and custom). 
`delete_weight_preset` removes custom presets. ### 6. Present full set plan Display the complete set plan: * Chapter order with bridge tracks between them * Per-chapter: track list, energy band, BPM range * Overall energy arc visualization * Total track count and estimated duration * Any gaps or weak transitions flagged Ask user: “Approve plan? / swap chapters / change bridges / adjust sequences” ### 7. Export Export the full set as a single ordered playlist: ```plaintext write_xml(playlists=[{ "name": "Set Plan: Venue Date", "track_ids": [...chapter1, ...bridge1, ...chapter2, ...bridge2, ...] }]) ``` Optionally export each chapter as a separate playlist too. Report output path. Remind user: File → Import Collection in Rekordbox.
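The Step 4c ranking rule ("rank by the minimum of the two min-scores") can be sketched in shell; the candidate names and scores below are invented:

```sh
# Each line: candidate, min-score vs outgoing chapter, min-score vs incoming chapter.
# A bridge must work with both sides, so rank by the smaller of the two.
ranked=$(printf '%s\n' \
  "candidate_a 0.82 0.61" \
  "candidate_b 0.74 0.79" \
  "candidate_c 0.55 0.91" |
  awk '{ m = ($2 < $3) ? $2 : $3; print m, $1 }' | sort -rn)
printf '%s\n' "$ranked"
best=$(printf '%s\n' "$ranked" | head -n 1 | awk '{print $2}')
```

Here `candidate_b` ranks first: its weaker side (0.74) still beats the other candidates' weaker sides.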
# Collection Audit
> Agent SOP for detecting and fixing naming, tagging, and convention violations.
Scan audio files for convention violations and fix them with user approval. ### Agent prompt Paste into your agent to start: ```plaintext Audit my collection for naming and tagging issues. ``` ## Constraints * **Never rename imported files.** Before any rename, check `search_tracks(path="...")` — if the file is in Rekordbox, defer the rename. Rekordbox cannot update paths via XML import (creates duplicates). * **Never manually resolve with `resolution="fixed"`** — that value is reserved for the scan engine’s auto-resolution. * Resolution options: `accepted_as_is`, `deferred`, `wont_fix`. * **Read-only by default.** No files modified until user approves a fix plan. * **Never delete audio files.** Renaming and tag editing only. * **WAV dual-tag rule.** `write_file_tags` handles this automatically. ## Prerequisites * reklawdbox MCP server connected * No external tools required — all tag operations use `audit_state` and `write_file_tags` ## Steps ### 0. Determine scope **Default: full collection.** Use the `content_roots` from `read_library` as the scan scope. If the user specified a narrower scope in their prompt (an artist, directory, or subset), use that instead. **Always skip `GENRE_SET`** — genre issues are handled by the Genre Classification and Genre Audit workflows. ### 0b. (Optional) Structural health scans Before the tag audit, consider running these scans for structural library issues: * `scan_broken_links()` — tracks in Rekordbox whose audio files are missing from disk * `scan_orphan_files()` — audio files on disk not imported into Rekordbox * `scan_duplicates()` — duplicate tracks by metadata or file hash * `scan_playlist_coverage()` — tracks not assigned to any playlist These are independent of the tag audit and can be run at any time. Address any findings before or after the tag audit. ### 1. Scan
```plaintext audit_state(scan, scope="/path/", skip_issue_types=["GENRE_SET"]) ``` Review summary: `files_in_scope`, `new_issues` by type, `total_open`. If `total_open` = 0 → skip to Final Report. ### 2. Review and triage Query issues by type: ```plaintext audit_state(query_issues, scope="/path/", status="open", issue_type="WAV_TAG3_MISSING") ``` Categorize by safety tier: | Category | Safety tier | Issue types | Action | | -------------- | ----------- | ------------------------------------------------------ | -------------------------- | | Auto-fixable | Safe | `WAV_TAG3_MISSING`, `WAV_TAG_DRIFT`, `ARTIST_IN_TITLE` | → Step 3 | | Rename-fixable | Rename-safe | `ORIGINAL_MIX_SUFFIX`, `TECH_SPECS_IN_DIR` | → Step 3 (if not imported) | | Needs review | Review | All others | → Step 4 | Present triage summary to user. Ask user: “Start with safe auto-fixes?” ### 3. Fix safe issues Present planned writes for approval, then apply: ```plaintext write_file_tags(writes=[{path: "...", tags: {artist: "...", title: "..."}}]) ``` Re-scan to verify fixes (scan auto-marks resolved issues as `fixed`): ```plaintext audit_state(scan, scope="/path/") ``` Never manually resolve with `resolution="fixed"` — that’s reserved for the scan engine. **WAV\_TAG\_DRIFT and Original suffixes:** When the drift is caused by one tag layer having an `(Original)`, `(Original Mix)`, or `(Original Version)` suffix that the other lacks, strip the suffix from both layers rather than syncing it. These are store-generated filler — the shorter title is correct. More specific Original variants like `(Original Club Mix)` or `(Original Demo Version)` should be kept as-is. ### 4. Review-tier issues Present each issue type in batches. **Empty/missing tags** (`EMPTY_ARTIST`, `EMPTY_TITLE`, `MISSING_TRACK_NUM`, `MISSING_ALBUM`, `MISSING_YEAR`): * Check the `detail` field — it may be `null` for empty/missing-tag issues. Infer fix from filename and parent directory structure.
* Use `lookup_discogs()` / `lookup_beatport()` / `lookup_bandcamp()` / `lookup_musicbrainz()` for missing album/year * Ask user to confirm each fix **Filename issues** (`BAD_FILENAME`, `MISSING_YEAR_IN_DIR`, `FILENAME_TAG_DRIFT`): * For `FILENAME_TAG_DRIFT`: compare `detail.filename` vs `detail.tag` values (see handling below) * For imported files needing rename: defer with note about manual Rekordbox relocate **FILENAME\_TAG\_DRIFT — handling:** On macOS, `/` is the only forbidden filename character. The scanner normalises `/` to `-` before comparison, so `AC/DC` in a tag won’t flag drift against `AC-DC` in a filename. All other characters (`: ? " * | < >`) are valid on macOS — if a download tool substituted them, the file should be renamed to match the tag. Workflow: 1. Read the `detail` JSON — contains `filename` and `tag` values for each drifted field 2. If the file is **not imported** into Rekordbox, rename to match the tag 3. If the file **is imported**, defer with a note about manual Rekordbox relocate, or accept as-is 4. If the drift is a genuinely different name (not just character substitution), look up via `lookup_discogs` / `lookup_beatport` / `lookup_bandcamp` / `lookup_musicbrainz` 5. If ambiguous after lookup, flag for user review — do not guess
**Genre tags** (`GENRE_SET`, if included): * Per file: keep / clear / migrate to comments * When migrating: read existing comment first via `read_file_tags`, prepend genre, never overwrite * `write_file_tags(writes=[{path: "...", tags: {comment: "Genre: Deep House | ", genre: null}}])` **No-tag files** (`NO_TAGS`): * Infer metadata from parent directory name, companion files, filename * Present inferred values to user for confirmation before writing For items user wants to skip: ```plaintext audit_state(resolve_issues, issue_ids=[...], resolution="accepted_as_is", note="...") audit_state(resolve_issues, issue_ids=[...], resolution="deferred", note="...") audit_state(resolve_issues, issue_ids=[...], resolution="wont_fix", note="...") ``` ### 5. Verification scan ```plaintext audit_state(scan, scope="/path/") ``` If `total_open` > 0, return to Step 2 or confirm remaining items are intentionally deferred. ### 6. Final report ```plaintext audit_state(get_summary, scope="/path/") ``` Present: scope, files scanned, pass rate, fixes by type, deferred items, next steps (Rekordbox import, genre classification SOP).
*** ## Issue Type Reference | Issue type | Safety tier | Detection | Fix method | | --------------------- | ----------- | ------------------------------------------------------------------------- | -------------------------------------------------------------- | | `EMPTY_ARTIST` | Review | `artist` field empty/null | Parse from filename | | `EMPTY_TITLE` | Review | `title` field empty/null | Parse from filename | | `MISSING_TRACK_NUM` | Review | `track` field empty/null | Parse from filename | | `MISSING_ALBUM` | Review | `album` field empty/null | Directory name or enrichment lookup | | `MISSING_YEAR` | Review | `year` field empty/null | Enrichment lookup | | `ARTIST_IN_TITLE` | Safe | `title` starts with `{artist} -` | Strip prefix | | `WAV_TAG3_MISSING` | Safe | WAV file with `tag3_missing` non-empty | Copy from tag 2 | | `WAV_TAG_DRIFT` | Safe | WAV `id3v2` and `riff_info` values differ | Sync to tag 2 (but see Original-suffix note in Step 3) | | `GENRE_SET` | Review | `genre` field non-empty | User decides keep/clear/migrate | | `NO_TAGS` | Review | All 14 tag fields empty/null | Infer from filename/dir | | `BAD_FILENAME` | Review | Filename doesn’t match canonical | User review | | `ORIGINAL_MIX_SUFFIX` | Rename-safe | Filename contains `(Original)`, `(Original Mix)`, or `(Original Version)` | Strip suffix from filename and tags (defer rename if imported) | | `TECH_SPECS_IN_DIR` | Rename-safe | Directory contains `[FLAC]`, `[WAV]`, etc. | Strip from dir name | | `MISSING_YEAR_IN_DIR` | Review | Album directory missing `(YYYY)` suffix | Enrichment lookup | | `FILENAME_TAG_DRIFT` | Review | Filename artist/title disagrees with tags | Rename to match tag (defer if imported) |
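To make the Safe tier concrete, an `ARTIST_IN_TITLE` fix from the table above might look like this (path and values invented):

```plaintext
# Before: artist = "Artist Name", title = "Artist Name - Track Title"
write_file_tags(writes=[{path: "/path/to/track.flac", tags: {title: "Track Title"}}])
# The next audit_state scan auto-marks the issue as fixed.
```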
# Genre Audit
> Agent SOP for auditing existing genre tags against enrichment and audio evidence.
Audit existing genre tags across a Rekordbox collection using cached enrichment and audio evidence. ### Agent prompt Paste into your agent to start: ```plaintext Audit my existing genre tags. ``` ## Constraints * **Taxonomy is compiled.** The alias map in `genre.rs` cannot be changed at runtime. If the user disagrees with a mapping, the agent works around it by using the user’s preferred genre directly. * **No auto-tagging.** Every genre change requires explicit human approval. * **Cache-first.** Audit runs from cached data. If cache is empty for a track, it is flagged as insufficient evidence — do not trigger enrichment mid-audit. * **XML export only.** No direct DB writes. All changes flow through `update_tracks` → `preview_changes` → `write_xml`. ## Prerequisites ```plaintext cache_coverage(has_genre=true) ``` All providers must be at 100% before proceeding. If not, hydrate first: ```plaintext analyze_audio_batch(has_genre=true, skip_cached=true, max_tracks=200) enrich_tracks(has_genre=true, skip_cached=true, providers=["discogs"], max_tracks=50) enrich_tracks(has_genre=true, skip_cached=true, providers=["beatport"], max_tracks=50) ``` Repeat until all providers reach 100%. Report progress between batches. **Recommended:** run `suggest_normalizations()` and stage alias fixes before auditing. This ensures the audit compares against canonical genre names, not aliases that inflate conflict counts. ## Steps ### 1. Audit batch ```plaintext audit_genres(max_tracks=50) ``` The server applies the genre decision tree to tracks that already have genres, comparing the evidence-based result against the current tag. Only conflicts and manual-review tracks appear in results. Report the confidence distribution before review: “This batch: N high, N medium, N low, N insufficient out of N total.” ### 2. Review by confidence tier Process each tier separately, highest confidence first.
#### High confidence Present as a numbered list, grouped by suggested genre for scannability: ```plaintext High confidence (12 conflicts): → Deep Techno (5): #1 Artist A — Track X | Techno → Deep Techno (Beatport: Raw/Deep, Discogs: Deep Techno x3) #2 Artist B — Track Y | Techno → Deep Techno (label: Ilian Tape, audio: atmospheric) ... → Tech House (3): #6 Artist C — Track Z | House → Tech House (Beatport: Tech House, Discogs: Tech House x2) ... ``` Ask user: **“Approve all, or reject/change specific numbers.”** Stage approved changes. #### Medium confidence Same numbered-list format. Highlight evidence tensions where sources disagree. Ask user: **“Approve all, or reject/change specific numbers. Say ‘investigate #N’ for any you want more context on.”** For investigated tracks: use artist discography, label roster, and related tracks in the library to form a recommendation. Stage approved changes. #### Low confidence and manual review These need per-track judgment. For each: 1. Read the evidence and flags from the audit result 2. Use artist reputation, label context, and track title to form a recommendation 3. Present reasoning: `Artist on Label — known for [context]. Evidence: [summary]. Recommend: GENRE because [reason].` Present in small groups (3–5) and ask user to approve/reject/change each. Stage approved changes. ### 3. Paginate Repeat Steps 1–2 with `offset` until all tracks are processed. Report cumulative progress between batches: `"Batch 3/N: M audited, X changes staged, Y confirmed."` ### 4. Export ```plaintext preview_changes() ``` Ask user: “Export these changes to XML?” ```plaintext write_xml() ``` Report output path, then walk the user through the Rekordbox import: 1. **Add XML to Rekordbox** — Open Preferences → Advanced → rekordbox xml → **Imported Library** → Browse → select the exported XML file. 2. **Open the XML view** — In the sidebar, click the **“Display rekordbox xml”** icon. The imported tracks appear under “All Tracks”. 3. 
**Import into collection** — Select all tracks (Cmd+A), right-click → **Import To Collection**. When prompted “Do you want to load information in the tag of the library being imported?”, click **Yes** (tick “Don’t ask me again” for bulk imports).
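The batch-and-paginate loop in Steps 1–3 can be sketched as follows. This is a minimal illustration: `audit_genres` here is a hypothetical stand-in for the MCP tool call, and the response shape (`results`, `total`) is an assumption, not the tool's documented schema.

```python
def run_audit(audit_genres, batch_size=50):
    """Paginate audit batches with `offset` until every track is covered.

    `audit_genres(max_tracks, offset)` is assumed to return a dict with
    "results" (the conflicts needing review) and "total" (tracks in scope).
    """
    offset, conflicts = 0, []
    while True:
        batch = audit_genres(max_tracks=batch_size, offset=offset)
        conflicts.extend(batch["results"])
        offset += batch_size
        if offset >= batch["total"]:
            break
    return conflicts

# Fake tool for illustration: 120 tracks, every third one conflicts.
def fake_audit(max_tracks, offset):
    ids = range(offset, min(offset + max_tracks, 120))
    return {"results": [i for i in ids if i % 3 == 0], "total": 120}

conflicts = run_audit(fake_audit)  # three batches of 50/50/20
```

In the real workflow the agent reviews each batch with the user before moving to the next offset; the loop above only shows the pagination mechanics.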
# Genre Classification
> Agent SOP for evidence-based genre tagging.
Classify genres across a Rekordbox collection using cached enrichment and audio evidence. ### Agent prompt [Section titled “Agent prompt”](#agent-prompt) Paste into your agent to start: ```plaintext Classify genres for my ungenred tracks. ``` ## Constraints * **Taxonomy is compiled.** The alias map in `genre.rs` cannot be changed at runtime. If the user disagrees with a mapping, the agent works around it by using the user’s preferred genre directly. * **No auto-tagging.** Every genre change requires explicit human approval. * **Cache-first.** Classification runs from cached data. If cache is empty for a track, it is flagged as insufficient evidence — do not trigger enrichment mid-classification. * **XML export only.** No direct DB writes. All changes flow through `update_tracks` → `preview_changes` → `write_xml`. * **Per-track reasoning is mandatory for low/insufficient confidence.** Do not skip or batch-approve these tracks. Delegate to subagents if the volume is large. ## Prerequisites ```plaintext cache_coverage(has_genre=false) ``` All providers must be at 100% before proceeding. If not, hydrate first: ```plaintext analyze_audio_batch(has_genre=false, skip_cached=true, max_tracks=200) enrich_tracks(has_genre=false, skip_cached=true, providers=["discogs"], max_tracks=50) enrich_tracks(has_genre=false, skip_cached=true, providers=["beatport"], max_tracks=50) ``` Repeat until all providers reach 100%. Report progress between batches. All `classify_tracks` calls support `max_tracks` (default 50, max 200) and `offset` for pagination. Continue until all tracks are covered. ## Steps ### 1. Normalize aliases ```plaintext suggest_normalizations(stage_aliases=true) ``` This auto-stages all non-debatable alias mappings (e.g. `Hip-Hop` → `Hip Hop`, `DnB` → `Drum & Bass`) and returns grouped results showing each mapping and its track count. Review the staged aliases. If any are debatable, unstage them with `clear_changes(track_ids=[...])` and ask the user. 
Store overrides for Step 2. **Unknown genres** (non-canonical, no alias mapping) will be classified in Step 2 using the `has_unknown_genre` filter. ### 2. Classify and get approval policy Get the confidence distribution first using `format="summary"`: ```plaintext classify_tracks(max_tracks=200, format="summary", genre_overrides=[...]) classify_tracks(has_unknown_genre=true, max_tracks=200, format="summary", genre_overrides=[...]) ``` The summary format returns genre-grouped counts with top artists per genre — no per-track data. Report the aggregate distribution: ```plaintext "N tracks classified: X high, Y medium, Z low, W insufficient." ``` Present the confidence distribution and ask the user which tiers to auto-approve: **“Approve all high-confidence? Approve high and medium? Or review everything?”** Default recommendation: approve high, present medium as summary, full review for low/insufficient. ### 3. Stage approved tiers Re-run classification with `auto_stage` to stage the approved tiers in one shot: ```plaintext classify_tracks(max_tracks=200, format="summary", auto_stage=["high", "medium"], genre_overrides=[...]) classify_tracks(has_unknown_genre=true, max_tracks=200, format="summary", auto_stage=["high", "medium"], genre_overrides=[...]) ``` The response includes a `staging` field with the staged count. No separate `update_tracks` call is needed. For medium-confidence tracks that are NOT auto-approved, use `format="compact"` to get the per-track list for review: ```plaintext classify_tracks(max_tracks=200, format="compact", genre_overrides=[...]) classify_tracks(has_unknown_genre=true, max_tracks=200, format="compact", genre_overrides=[...]) ``` Present as a numbered list grouped by genre. Highlight which evidence sources agree vs disagree: ```plaintext → Techno (8): #1 Cristi Cons — Bird In Space (Discogs: Techno, label: Cocoon → Techno, 124bpm) #2 Polar Inertia — Black Sun remix (Discogs: Techno, 131bpm) ... 
``` Ask: **“Approve all, or reject/change specific numbers. Say ‘investigate #N’ for any you want more context on.”** For investigated tracks, use `resolve_tracks_data` to get full evidence, then check artist context via `search_tracks`. Stage approved changes with `update_tracks`. ### 4. Review low and insufficient confidence These tracks need per-track reasoning — do not batch-approve them. Use subagents to parallelize the work. #### 4.1 Get dispatch roster Use `format="dispatch"` to get low/insufficient tracks grouped by artist: ```plaintext classify_tracks(max_tracks=200, format="dispatch", genre_overrides=[...]) classify_tracks(has_unknown_genre=true, max_tracks=200, format="dispatch", genre_overrides=[...]) ``` The dispatch format returns artists sorted by track count descending, with per-track evidence and flags included. Partition into subagent batches: * Artists with 10+ tracks → dedicated subagent * Remaining artists → batch into subagents of \~40–50 tracks #### 4.2 Dispatch review subagents Launch as many subagents in parallel as possible. Each subagent receives: * A list of tracks from the dispatch roster (including evidence and flags) * The review prompt below **Review subagent prompt:** > You are classifying genres for tracks in a Rekordbox library. For each track, produce a genre recommendation and stage it. > > **Evidence and flags are included with each track in the dispatch data. 
You do NOT need to call resolve\_tracks\_data.** > > **Tools available:** > > * `get_genre_taxonomy()` — canonical genre list and BPM ranges > * `search_tracks(artist="...", has_genre=true)` — artist’s other genred tracks in the library (use when you need more context beyond the dispatch data) > * `get_track(track_id="...")` — single track detail > * `update_tracks(changes=[...])` — stage genre changes > > **What the decision tree already considered:** BPM range plausibility, Beatport/Discogs genre tags, label-genre mapping, audio energy profile, same-family depth resolution. > > **What you add:** Artist reputation beyond this library, label/scene context, remix conventions (e.g. remixer known for a specific genre), title interpretation. > > **Workflow:** > > 1. Call `get_genre_taxonomy()` to load the canonical genre list — only recommend genres from this list > 2. Review the evidence and flags included with each track > 3. Check if the artist has other genred tracks in the library via `search_tracks` where useful > 4. Consider artist reputation, label identity, and track title > 5. Recommend a canonical genre with one-sentence reasoning > 6. After classifying all tracks, stage your recommendations via `update_tracks` > > **Output format:** > > ```plaintext > #N Artist — Title > Evidence: [key signals from the evidence array] > Library: [artist's other genres, or "no other tracks"] > Recommend: GENRE — [why] > ``` > > **Tracks to classify:** \[track list here] #### 4.3 Verify staged results Once all subagents have completed, verify the aggregate results: ```plaintext preview_changes(format="summary") ``` Compare the staged track count against `dispatch_stats.total_tracks` from Step 4.1 (plus any tracks staged in Step 3). If the staged count is lower than expected, identify which artist batches may have failed and report them to the user. Present the summary to the user. 
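The partitioning rule from Step 4.1 (dedicated subagent for artists with 10+ tracks, remaining artists packed into roughly 40–50-track batches) can be sketched as follows — a minimal illustration with assumed data shapes, not the dispatch format's actual schema:

```python
def partition_dispatch(roster, dedicated_min=10, batch_target=45):
    """Split a dispatch roster into subagent batches.

    `roster` is a list of (artist, track_count) pairs sorted by count
    descending, as the dispatch format returns them. Artists at or above
    `dedicated_min` tracks get their own batch; the rest are packed into
    batches of roughly `batch_target` tracks.
    """
    batches, current, current_size = [], [], 0
    for artist, count in roster:
        if count >= dedicated_min:
            batches.append([artist])  # dedicated subagent
            continue
        if current and current_size + count > batch_target:
            batches.append(current)   # close the current shared batch
            current, current_size = [], 0
        current.append(artist)
        current_size += count
    if current:
        batches.append(current)
    return batches

roster = [("A", 25), ("B", 12), ("C", 9), ("D", 8), ("E", 8),
          ("F", 8), ("G", 8), ("H", 7), ("I", 3)]
batches = partition_dispatch(roster)
# → [["A"], ["B"], ["C", "D", "E", "F", "G"], ["H", "I"]]
```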
Report: `"Review complete: N tracks staged across M genres."` Ask: **“Any genres or artists you want to review before export?”** ### 5. Export ```plaintext preview_changes(format="summary") ``` Ask user: “Export these changes to XML?” ```plaintext write_xml() ``` Report output path, then walk the user through the Rekordbox import: 1. **Add XML to Rekordbox** — Open Preferences → Advanced → rekordbox xml → **Imported Library** → Browse → select the exported XML file. 2. **Open the XML view** — In the sidebar, click the **“Display rekordbox xml”** icon. The imported tracks appear under “All Tracks”. 3. **Import into collection** — Select all tracks (Cmd+A), right-click → **Import To Collection**. When prompted “Do you want to load information in the tag of the library being imported?”, click **Yes** (tick “Don’t ask me again” for bulk imports).
# Library Health
> Agent SOP for scanning broken links, orphan files, duplicates, and playlist coverage.
Scan a Rekordbox library for broken links, orphan files, duplicates, and playlist coverage gaps. ### Agent prompt [Section titled “Agent prompt”](#agent-prompt) Paste into your agent to start: ```plaintext Run a full health scan on my library. ``` ## Constraints * **Read-only scanning.** No files are modified, moved, or deleted by these tools. * **Results are point-in-time snapshots.** Run scans after any Rekordbox import/export or file system changes. * **Samplers excluded.** All scans ignore Rekordbox factory sampler files. ## Prerequisites * reklawdbox MCP server connected * No external tools required — all scanning is built-in ## Steps ### 0. Start with read\_library ```plaintext read_library ``` Review `content_roots` and `total_tracks` to understand the library scope. ### 1. Scan for broken links ```plaintext scan_broken_links(path_prefix="/path/to/scope", limit=200) ``` Checks every track in the database for missing files on disk. Use `path_prefix` to scope to a specific directory. `suggest_relocations` (default true) reports suggested relocations when a file with the same name exists elsewhere in the content roots. **If broken links found:** Review suggested relocations. File moves must be done manually in Rekordbox (Preferences > Advanced > Database Management > Relocate) or by correcting the directory structure. ### 2. Scan for orphan files ```plaintext scan_orphan_files(path_prefix="/path/to/scope", limit=200) ``` Finds audio files on disk that are not imported into Rekordbox. Groups results by directory. Use `path_prefix` to scope to a specific directory instead of scanning all content roots. **If orphans found:** These files can be imported into Rekordbox via drag-and-drop, or deleted if no longer wanted. ### 3. Check playlist coverage ```plaintext scan_playlist_coverage(limit=200) ``` Finds tracks not assigned to any playlist. Accepts the shared search filters (`genre`, `artist`, `bpm_min`, `bpm_max`, `path_prefix`, `has_label`, etc.) 
for scoping to specific subsets of the library. **If uncovered tracks found:** Review and assign to playlists, or accept that some tracks are intentionally unplaylisted. ### 4. Find duplicates ```plaintext scan_duplicates(detection_level="metadata", limit=50) ``` Groups tracks by matching artist + title. Each group includes a `suggested_keep` recommendation based on audio quality (bitrate > sample rate > play count > rating). Use `path_prefix` to scope to a specific directory. For higher confidence, run exact-level detection: ```plaintext scan_duplicates(detection_level="exact", limit=50) ``` This hashes actual audio files and only reports byte-identical duplicates. **If duplicates found:** Remove duplicates manually in Rekordbox. The `suggested_keep` field indicates which copy to retain. ### 5. Summary Report findings across all four scans: broken link count, orphan count, playlist coverage percentage, and duplicate group count. Suggest next steps based on which issues were found.
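The `suggested_keep` preference order from Step 4 (bitrate > sample rate > play count > rating) amounts to a lexicographic comparison. A minimal sketch — the field names here are illustrative, not the tool's actual result schema:

```python
def suggest_keep(group):
    """Pick the copy to retain from a duplicate group.

    Compares bitrate first, then sample rate, then play count, then
    rating, mirroring the documented preference order.
    """
    return max(
        group,
        key=lambda t: (t["bitrate"], t["sample_rate"],
                       t["play_count"], t["rating"]),
    )

dupes = [
    {"id": "a", "bitrate": 320, "sample_rate": 44100, "play_count": 3, "rating": 4},
    {"id": "b", "bitrate": 1411, "sample_rate": 44100, "play_count": 0, "rating": 0},
]
keep = suggest_keep(dupes)  # the lossless copy wins on bitrate alone
```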
# Metadata Backfill
> Agent SOP for backfilling labels, years, and albums from enrichment caches.
Backfill missing labels, years, and albums across a Rekordbox collection using cached enrichment data and file metadata. ### Agent prompt [Section titled “Agent prompt”](#agent-prompt) Paste into your agent to start: ```plaintext Find tracks missing labels and years in my collection and fill them from enrichment data. ``` ## Constraints * **`backfill_labels` is enrichment-dependent.** The tool pulls labels from Discogs, MusicBrainz, Bandcamp, and Beatport enrichment caches — it does not check file tags or the web. Tracks without cached enrichment are skipped (unless `auto_enrich=true`). For tracks still unlabeled after the tool runs, the agent researches labels manually (step 1c). * **Years use a priority cascade.** `backfill_years` tries multiple sources in order: file tags → folder path `(YYYY)` → Discogs enrichment → Beatport enrichment → MusicBrainz enrichment → Bandcamp enrichment. For tracks still missing years after the tool runs, the agent researches release dates (step 2c). * **No auto-tagging without review (labels).** The backfill tools stage non-conflicting values automatically. Label conflicts require human approval before staging. Year conflicts are resolved automatically using the lowest year (see “Lowest year wins”). * **XML export only.** All changes flow through `update_tracks` → `preview_changes` → `write_xml`. * **Self-released tracks are skipped (labels only).** Discogs “Not On Label” entries are filtered out — they carry no useful signal. * **Year 0 = unset.** Rekordbox uses 0 for tracks with no year. Rekordbox often fails to import year tags from WAV files, so year=0 does not mean the file lacks a year tag — check before assuming it’s missing. * **Lowest year wins.** Year is used as a signal for genre classification — it represents the track’s production era, not its release or reissue date. 
When a conflict arises between current year and enrichment year, always use the lowest (earliest) year unless it is obviously wrong (e.g., a nonsensical date like 1900 for a modern electronic track). Do not present year conflicts for interactive review — resolve them automatically. Web research for year gaps should search for *release dates* (not production dates specifically), since release date is the best publicly available proxy. * **Year gaps are always researched.** Tracks left with year=0 after all automated sources are exhausted must be researched by the agent via web search, store lookups, and model knowledge. The tools handle the bulk; the agent fills every remaining gap it can. * **Never ask whether to research — just do it.** After a backfill tool reports remaining gaps, immediately begin researching them. Do not summarize and wait for instructions, do not ask “should I also research?”, and do not present research as a separate decision. The automated backfill is the easy warm-up; researching the remaining gaps is the core work. Asking whether to proceed is asking whether to finish the job — the answer is always yes. * **Pause only where marked.** This workflow has exactly two pause points: conflict-resolution prompts (where user input is needed) and the final export confirmation. Everything else — including the transition from backfill to research — is automatic. Do not stop, summarize, or ask for direction at unmarked transitions. * **Albums are best-effort.** `backfill_albums` fills empty albums from file tags, folder paths, and enrichment caches. Unlike labels and years, album gaps do not require exhaustive research — fill what the tools and a light web search can find. Empty album is valid for singles, loose tracks, and DJ edits. Album names where the value equals the track title or artist name are noise and are skipped automatically. 
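The “lowest year wins” rule reduces to a small decision function. A minimal sketch — the plausibility floor of 1940 is an illustrative threshold chosen for this example, not a documented tool parameter:

```python
def resolve_year_conflict(current, enrichment, floor=1940):
    """Return (year, flagged) for a year conflict.

    Picks the earliest non-zero year, since year approximates production
    era. Years below the plausibility floor are flagged for human review
    instead of being auto-staged.
    """
    candidates = [y for y in (current, enrichment) if y]  # year 0 = unset
    if not candidates:
        return 0, False
    earliest = min(candidates)
    if earliest < floor:
        return earliest, True  # obviously wrong year: flag, don't stage
    return earliest, False

resolve_year_conflict(2015, 2011)  # → (2011, False): stage earliest
resolve_year_conflict(2015, 1900)  # → (1900, True): flag for review
```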
## Prerequisites Check enrichment cache coverage across **all providers** for unlabeled tracks: ```plaintext cache_coverage(has_label=false) ``` Review the `coverage` section for each provider’s `searched_percent` and `has_result` counts. Hydrate providers with low coverage before backfilling — the more cached enrichment data available, the more labels and years the backfill tools can fill. ### Hydration priority 1. **Discogs** (best for label data, genre/style): ```plaintext enrich_tracks(has_label=false, skip_cached=true, providers=["discogs"], max_tracks=50) ``` 2. **Bandcamp** (strongest for underground/independent electronic music — high hit rate for labels *and* years): ```plaintext enrich_tracks(has_label=false, skip_cached=true, providers=["bandcamp"], max_tracks=50) ``` 3. **Beatport** (label data extracted from search results since v0.15; older cache entries lack labels): ```plaintext enrich_tracks(has_label=false, skip_cached=true, providers=["beatport"], max_tracks=50) ``` If Beatport `searched_percent` is 100% but labels are still missing, existing cache entries predate label extraction. Re-enrich with `force_refresh=true` to populate them. 4. **MusicBrainz** is hydrated automatically by `lookup_musicbrainz` during year research (step 2c) or via `auto_enrich`. Repeat batches until providers reach 100% searched. Report progress between batches. **Alternative: skip manual hydration** and use `auto_enrich=true` on the backfill tools (see steps 1 and 2). This automatically fetches Bandcamp data for uncached tracks before backfilling. Slower for large collections but requires no manual batching. ## Steps ### 1. Fill labels #### 1a. 
Backfill from enrichment ```plaintext backfill_labels() ``` Or, to automatically fetch Bandcamp data for uncached tracks: ```plaintext backfill_labels(auto_enrich=true) ``` The tool scans the entire collection (excluding samples), looks up each track’s enrichment across all four providers (Discogs, MusicBrainz, Bandcamp, Beatport), and: * **Fills** empty labels from enrichment (auto-staged) * **Skips** tracks where current label matches enrichment * **Reports conflicts** where current label differs from enrichment (not staged) The summary includes `no_enrichment_by_provider` showing which providers are missing data for the remaining unlabeled tracks. If `no_bandcamp` is high, hydrate Bandcamp and re-run. Report the summary: “Backfill complete: N labels staged, N already correct, N conflicts, N without enrichment.” Then immediately proceed to conflict resolution. #### 1b. Resolve conflicts Deduplicate conflicts before presenting — a track must appear in exactly one group. Group conflicts by pattern and present each group as a table with columns: **#**, **Artist — Title**, **Current**, **Enrichment**, **Rec** (recommendation). Follow each table with a clear bulk-action prompt. ##### Group A: Artist-name-as-label → actual label When the current label is the artist’s name and enrichment has an actual label, recommend the enrichment label — it provides a stronger signal than the artist name. ```plaintext Group A: Artist-name-as-label → actual label (N tracks) — recommend: use enrichment | # | Artist — Title | Current | Enrichment | Rec | |----|-----------------------------|---------------|------------|----------------| | 1 | Fantastic Man — Antiboudi | Fantastic Man | Mule Musiq | use enrichment | | 2 | Fantastic Man — Diaspora | Fantastic Man | Mule Musiq | use enrichment | ``` **→ “Approve all Group A, reject all, or specify numbers to change (e.g. 
‘approve all except 2’).”** ##### Group B: Label name variations When both labels refer to the same entity with different formatting (e.g. “Palette” vs “Palette Recordings”, “трип” vs “trip recordings”), recommend keeping the current label — it was set intentionally. ```plaintext Group B: Label name variations (N tracks) — recommend: keep current | # | Artist — Title | Current | Enrichment | Rec | |----|----------------------------|---------|--------------------|--------------| | 3 | Dauwd — Kindlinn | Palette | Palette Recordings | keep current | | 4 | Lone — Abraxas | трип | trip recordings | keep current | ``` **→ “Skip all Group B (keep current labels), or specify numbers to change.”** ##### Group C: Wrong enrichment When enrichment returned nonsense — artist names as labels, gibberish, or clearly incorrect data — recommend keeping the current label. ```plaintext Group C: Wrong enrichment (N tracks) — recommend: keep current | # | Artist — Title | Current | Enrichment | Rec | |----|-----------------------------------|--------------------|--------------|--------------| | 5 | Beat Movement — X | WarinD Records | BEAT MOVEMENT | keep current | | 6 | Chaka Demus and Pliers — Murder | Soul Jazz Records | Alex Di Ciò | keep current | ``` **→ “Skip all Group C, or specify numbers to change.”** ##### Group D: Genuine disagreements When the enrichment label is a different entity entirely (e.g. different pressing, compilation, or wrong match), present for individual review with no default recommendation. ```plaintext Group D: Genuine disagreements (N tracks) — review individually | # | Artist — Title | Current | Enrichment | Rec | |----|--------------------------------|------------------|--------------------|--------| | 7 | Rick Wade — Authentideep | Harmonie Park | Unknown Season | review | | 8 | Vril — Haus (Rework) | Giegling | Delsin | review | ``` **→ “For each, reply ‘keep’ or ‘use enrichment’, or skip. 
You can also bulk-skip: ‘skip all Group D’.”** Stage approved changes via `update_tracks`. Then immediately proceed to research remaining gaps — do not ask. #### 1c. Research remaining gaps After steps 1a–1b, tracks may still lack labels. The `backfill_labels` output includes a `research_queue` section listing the count and top artists by frequency. **`write_xml` will refuse to export until this step is done** — it checks whether `backfill_labels` reported unlabeled tracks and blocks export unless `skip_label_gate=true` is passed. Start by fetching the first batch of unlabeled tracks: ```plaintext search_tracks(has_label=false, limit=50) ``` Then for each batch: 1. **Group by artist** for efficient lookup — tracks from the same artist often share a label. 2. **Prioritize by artist frequency** — artists with many unlabeled tracks first, for maximum coverage per lookup. 3. **Research labels** using web search, store lookups (`lookup_beatport`, `lookup_discogs`, `lookup_bandcamp`), label catalogs, and model knowledge. 4. **Present findings** grouped by confidence: * **High confidence** (exact release found on store/catalog): present for bulk approval. * **Uncertain** (multiple candidates or ambiguous match): present individually with source context. 5. **Stage approved labels** via `update_tracks`. Skip tracks where no label can be determined — self-released tracks, private edits, or truly obscure releases may have no public label. ### 2. Fill years #### 2a. Backfill from sources ```plaintext backfill_years() ``` Or, to automatically fetch Bandcamp and MusicBrainz data for uncached year-zero tracks: ```plaintext backfill_years(auto_enrich=true) ``` The tool scans the entire collection (excluding samples) and tries six sources in priority order for each track with year=0: 1. **File tags** — reads year from the audio file’s metadata (ID3v2, Vorbis Comment, RIFF INFO). Most reliable since the user tagged it. 2. 
**Folder path** — extracts year from a `(YYYY)` suffix in the parent directory name (e.g. `Album (2019)/`). 3. **Discogs enrichment** — falls back to the cached Discogs release year. 4. **Beatport enrichment** — falls back to the cached Beatport `publish_date`/`release_date`. 5. **MusicBrainz enrichment** — falls back to the cached MusicBrainz `first-release-date`. 6. **Bandcamp enrichment** — falls back to the cached Bandcamp `release_date`. Particularly effective for underground/independent electronic music. The first source to produce a valid year (1900–2099) wins. Non-conflicting fills are auto-staged. For tracks that already have a non-zero year, the tool checks Discogs enrichment only — if the Discogs year differs from the current year, the track is reported as a conflict. Other providers (Beatport, MusicBrainz, Bandcamp) are not compared against existing years. The summary includes `remaining_uncached_providers` showing which providers are missing data for remaining year-zero tracks. If providers show gaps, hydrate them and re-run (or use `auto_enrich=true`). Note: Beatport date extraction requires enrichment cache entries created after this feature was added. For tracks with older cache entries, re-enrich with `enrich_tracks(providers=["beatport"], force_refresh=true)` to populate the `release_date` field. Report the summary: “Year backfill complete: N years staged (N file tags, N folder paths, N Discogs, N Beatport, N MusicBrainz, N Bandcamp), N already correct, N conflicts, N without any source.” Then immediately proceed to resolve conflicts. #### 2b. Resolve conflicts Year conflicts are resolved automatically — do not present them for interactive review. For each conflict, stage the **lowest (earliest) year** from either the current value or the Discogs enrichment value via `update_tracks`. The earliest year best approximates the track’s production era, which is the signal used for genre classification. 
**Exception — obviously wrong years:** If the lowest year is clearly nonsensical (e.g., 1900 for a modern electronic track, or a year that predates the genre by decades), flag it to the user instead of auto-staging. This should be rare. Report the summary: “Year conflicts resolved: N auto-staged (used earliest year), N already had the earliest year, N flagged for review.” Then immediately proceed to research remaining gaps — do not ask. #### 2c. Research remaining gaps After `backfill_years`, some tracks will still have year=0 with no source data. The tool reports these as `remaining_year_zero`. The agent must research every one of them — this is not optional. **Do not skip this step or proceed to export until all year-zero tracks have been researched.** **If `auto_enrich=true` was used**, Bandcamp and MusicBrainz have already been fetched for these tracks. Skip to the web research sub-step below. **Batch enrich remaining year-zero tracks** Use `enrich_tracks` with the `year_zero` filter to batch-enrich remaining tracks via Bandcamp: ```plaintext enrich_tracks(year_zero=true, skip_cached=true, providers=["bandcamp"], max_tracks=50) ``` Repeat until all year-zero tracks are covered. Then **re-run `backfill_years()`** to pick up the new data. **Re-run backfill** After enrichment, re-run the backfill to incorporate newly cached data: ```plaintext backfill_years() ``` This picks up years from the fresh Bandcamp/MusicBrainz cache entries. **Web research for remaining gaps** For tracks still without years after all enrichment sources are exhausted: 1. **Group by artist** for efficient lookup. 2. **Search for release dates** — use web search, Discogs links, store lookups (`lookup_beatport`, `lookup_discogs`, `lookup_bandcamp`), label catalogs, and model knowledge to find the original release year. 3. **Present findings** grouped by confidence: * **High confidence** (exact release found): present for bulk approval. 
* **Uncertain** (multiple candidates or approximate): present individually with context. 4. **Stage approved years** via `update_tracks`. Only mark a track as unresolvable after genuine research effort — anonymous private edits or truly untraceable tracks exist, but they are the exception. Exhaust web search, store lookups, and artist discographies before giving up on any track. ### 3. Backfill albums ```plaintext backfill_albums() ``` Or, to automatically fetch Bandcamp data for uncached tracks: ```plaintext backfill_albums(auto_enrich=true) ``` The tool scans tracks with empty album and tries four sources in order: 1. **File tags** — reads album from the audio file’s metadata. 2. **Folder path** — extracts album name from the parent directory if it has a `(YYYY)` suffix (e.g. `Album Name (2019)/`). Edition qualifiers like `(Deluxe Edition)` and `(Original Motion Picture Soundtrack)` are stripped automatically. 3. **Bandcamp enrichment** — uses the `album` field from cached Bandcamp data. 4. **Discogs enrichment** — uses the `title` field (release name) from cached Discogs data. Noise is filtered automatically: albums that match the track title or artist name are skipped. No conflict resolution is needed — the tool only fills empty albums. Report the summary: “Album backfill complete: N albums staged (N file tags, N folder paths, N Bandcamp, N Discogs), N already set, N without any source.” The agent may optionally web-research remaining gaps, but this is not required. Empty album is valid for singles, loose tracks, and DJ edits. ### 4. Export **Pre-export gate (enforced by tool).** `write_xml` checks whether `backfill_labels` reported unlabeled tracks. If so, it returns an error unless `skip_label_gate=true` is passed. Before calling `write_xml(skip_label_gate=true)`, confirm: * Step 1c (label research): all unlabeled tracks have been researched. Remaining gaps are genuinely unresolvable (private edits, anonymous tracks), not just unattempted. 
* Step 2c (year research): all year-zero tracks have been researched. Same standard. If either step was skipped, go back and complete it before proceeding. ```plaintext preview_changes() ``` Ask user: “Export these changes to XML?” ```plaintext write_xml(skip_label_gate=true) ``` Report output path, then walk the user through the Rekordbox import: 1. **Add XML to Rekordbox** — Open Preferences → Advanced → rekordbox xml → **Imported Library** → Browse → select the exported XML file. 2. **Open the XML view** — In the sidebar, click the **“Display rekordbox xml”** icon. The imported tracks appear under “All Tracks”. 3. **Import into collection** — Select all tracks (Cmd+A), right-click → **Import To Collection**. When prompted “Do you want to load information in the tag of the library being imported?”, click **Yes** (tick “Don’t ask me again” for bulk imports).
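The `(YYYY)` folder-path extraction used by `backfill_years` (Step 2a) and `backfill_albums` (Step 3) can be sketched as follows. The regex and the qualifier list are illustrative — the tool's actual parsing rules are not documented here:

```python
import re

def parse_album_folder(dirname):
    """Extract (album, year) from a directory name like 'Album Name (2019)'.

    Returns (None, None) when no (YYYY) suffix is present. Edition
    qualifiers such as '(Deluxe Edition)' are stripped from the album.
    """
    m = re.match(r"^(?P<album>.+?)\s*\((?P<year>(19|20)\d{2})\)$", dirname)
    if not m:
        return None, None
    album = re.sub(
        r"\s*\((Deluxe Edition|Original Motion Picture Soundtrack)\)\s*$",
        "", m.group("album"))
    return album, int(m.group("year"))

parse_album_folder("Substrata (1997)")  # → ("Substrata", 1997)
parse_album_folder("Loose Tracks")      # → (None, None)
```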
# Pool Building
> Agent SOP for building compatible track pools.
Build unordered pools of mutually compatible tracks for live improvisation.

### Agent prompt

Paste into your agent to start:

```plaintext
Build a track pool from my collection.
```

## Constraints

* **Read-only.** Pool building never modifies track metadata. It discovers compatible groupings.
* **Human controls the result.** Agent proposes pools and additions. User approves, removes, or adjusts.
* **Cache-first.** All pool scoring uses cached audio analysis and enrichment. No external API calls.
* **Master tempo default.** Default `master_tempo=false` — accounts for pitch shift when scoring key compatibility. Override to `true` only if user confirms they play with master tempo on.
* **Symmetric scoring.** Pool compatibility is symmetric: score(A,B) == score(B,A). No directional context.

## Prerequisites

```plaintext
cache_coverage()
```

All providers must be at 100%. If not, hydrate first (audio analysis, then enrichment). Essentia analysis is especially important — tracks without it lose the timbral axis and brightness/rhythm axes in pool scoring.

## Steps

### 1. Collect seed tracks

Ask user for 2-5 seed tracks that define the pool’s character. Sources:

* Specific track IDs the user already has in mind
* A playlist the user wants to expand from
* Tracks from a recent session (`get_sessions`, `get_session_tracks`)
* Search results (`search_tracks`)

Confirm seeds with user before proceeding. Present each seed’s key, BPM, energy, genre.

**Alternative: automatic seed discovery.** Instead of manual seed selection, `discover_pools` can find natural track pools using Bron-Kerbosch clique enumeration on a compatibility graph. Filter by genre, BPM, or playlist and adjust `threshold` (0.3-0.95) to control pool tightness. Discovered pools can be used directly or as seeds for expansion.

### 2. Expand the pool

```plaintext
expand_pool(
  seed_track_ids=[...],
  additions=5,
  master_tempo=false
)
```

Adjust parameters based on user goals:

* `cross_genre=true` for timbral discovery across genre boundaries
* `preset="timbral"` to prioritize sonic similarity over metadata matching. The `preset` parameter accepts a named preset string (e.g., `"balanced"`), a preset with overrides (e.g., `{preset: "timbral", overrides: {timbral: 0.4}}`), or fully custom weights with axes: bpm, energy, timbral, key, genre, brightness, rhythm. Custom presets can be saved with `save_weight_preset` and listed with `list_weight_presets` for reuse across sessions.
* `playlist_id` to restrict candidates to a specific playlist
* Search filters (`genre`, `bpm_min`, `bpm_max`, `rating_min`) to scope candidates

Present each addition with:

* Track info (title, artist, BPM, key, genre)
* `min_score` — worst compatibility with any pool member (the guarantee)
* `mean_score` — average compatibility across the pool
* `rationale` — strongest/weakest axes, most compatible member

If `stopped_early=true`, explain that either the remaining candidates scored below the quality threshold (0.4) or the candidate pool was exhausted. Suggest widening filters, relaxing `cross_genre`, or adjusting seeds.

Ask user: “Keep all additions? / remove specific ones / expand further / adjust seeds”

### 3. Iterate

If user removes tracks or wants more:

```plaintext
expand_pool(
  seed_track_ids=[...all current pool members...],
  additions=3,
  master_tempo=false
)
```

Use the full current pool as seeds for the next expansion round. This ensures new additions are compatible with everything already approved. Repeat until user is satisfied with pool size and composition.

### 4. Describe the pool

```plaintext
describe_pool(
  pool_track_ids=[...],
  master_tempo=false
)
```

Present the analysis:

* **Cohesion** — mean/min pairwise compatibility scores
* **Energy band** — energy range across the pool
* **BPM center/spread** — tempo characteristics
* **Key neighborhood** — effective keys at reference BPM
* **Dominant genre** — most common genre
* **Analysis coverage** — % of tracks with full Essentia data
* **Weak members** — tracks with low min-compatibility (candidates for removal)
* **Optimal reference BPM** (`master_tempo=false` only) — the BPM that maximizes key compatibility across the pool

If weak members are flagged, ask user: “Remove weak members? / keep them / see pairwise scores”

If `bpm_range_warning` appears, explain that the pool spans too wide a BPM range for reliable harmonic evaluation at a single reference BPM.

### 5. Validate specific pairs (optional)

For tracks the user is uncertain about:

```plaintext
score_pool_compatibility(
  track_a="...",
  track_b="...",
  master_tempo=false
)
```

Show per-axis scores to help the user understand why tracks do or don’t fit.

### 6. Lock the pool

Once the user approves the pool, save it as a playlist:

```plaintext
write_xml(playlists=[{"name": "Pool: Deep House 126", "track_ids": [...]}])
```

Note: `write_xml` also exports and clears any staged metadata changes. Use `preview_changes` first if other workflows have pending edits.

Report output path. Remind user: File → Import Collection in Rekordbox. The pool is now a locked **chapter** — ready for live performance or further set planning.
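The expand/iterate loop above can be sketched in Python. This is a hypothetical illustration, not reklawdbox’s implementation: `expand_pool` and the `bpm_score` stand-in (BPM proximity only, where real scoring spans key, energy, timbral, genre, brightness, and rhythm axes) are assumed names. It shows the `min_score` guarantee — each pick maximizes its worst-case compatibility with every current member — and why expansion stops early at the 0.4 threshold:

```python
def expand_pool(members, candidates, score, additions, threshold=0.4):
    """Greedy pool expansion sketch: each pick maximizes its minimum
    compatibility with all current members (the `min_score` guarantee)."""
    members = list(members)
    pool = [c for c in candidates if c not in members]
    picks = []
    for _ in range(additions):
        best = None
        for cand in pool:
            scores = [score(cand, m) for m in members]  # symmetric: score(a,b) == score(b,a)
            min_s, mean_s = min(scores), sum(scores) / len(scores)
            if min_s >= threshold and (best is None or (min_s, mean_s) > (best[1], best[2])):
                best = (cand, min_s, mean_s)
        if best is None:
            return picks, True  # stopped_early: below threshold or pool exhausted
        cand, min_s, mean_s = best
        picks.append({"track": cand, "min_score": round(min_s, 3), "mean_score": round(mean_s, 3)})
        members.append(cand)
        pool.remove(cand)
    return picks, False

# toy compatibility axis: BPM proximity only (hypothetical stand-in scorer)
def bpm_score(a, b):
    return max(0.0, 1 - abs(a - b) / 16)
```

With seeds at 124 and 126 BPM, a 140 BPM candidate scores 0.0 against the 124 BPM member, so it never enters the pool — the minimum, not the average, gates admission.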
# Set Building
> Agent SOP for building transition-scored DJ set sequences.
Build ordered DJ set tracklists using transition scoring and beam search sequencing.

### Agent prompt

Paste into your agent to start:

```plaintext
Build a DJ set from my collection.
```

## Constraints

* **Read-only.** Set building never modifies track metadata. It only creates playlist orderings.
* **Human controls the result.** Agent proposes candidates. User picks, reorders, swaps. Export only what the user approves.
* **Cache-first.** All transition scoring uses cached audio analysis and enrichment. No external API calls.
* **Multiple candidates.** Always present at least 2 candidate orderings unless the pool is too small.

## Prerequisites

```plaintext
cache_coverage()
```

All providers must be at 100%. If not, hydrate first (audio analysis, then enrichment).

## Steps

### 1. Collect set parameters

Ask user for:

* **Duration** — e.g., 60 min (~10-12 tracks), 90 min (~15-18 tracks)
* **Genre focus** — specific genres, playlist, or “any”
* **BPM range** — e.g., 120-135, or “flexible”
* **Energy curve** — warmup→build→peak→release (default), flat, peak_only, or custom
* **Priority** — balanced (default), harmonic, energy, or genre. Accepts a named preset string (e.g., `"balanced"`), a preset with overrides (e.g., `{preset: "harmonic", overrides: {energy: 0.25}}`), or fully custom weights with axes: key, bpm, energy, genre, brightness, rhythm. Custom presets can be saved with `save_weight_preset` and listed with `list_weight_presets` for reuse across sessions.
* **Starting track** — optional seed track
* **Harmonic style** — conservative / balanced (default) / adventurous
* **BPM drift tolerance** — default 6%
* **BPM trajectory** — optional start→peak BPM ramp (e.g., “start 122, peak at 130”)

Defaults: 60 min, balanced priority, warmup→build→peak→release, balanced harmonic, master tempo on, 6% drift.

Confirm parameters with user before proceeding.

### 2. Review play history

Check recent sessions to inform track selection:

```plaintext
get_sessions(limit=10)
```

Present sessions to user with date, track count, duration. Ask which were gigs vs practice, and how history should influence selection:

* Avoid tracks from specific sessions?
* Prefer battle-tested tracks (high play count)?
* Prioritize unplayed tracks?
* No preference?

If the user wants history-aware selection, check play stats for the candidate pool’s scope:

```plaintext
get_play_stats(genre="...", bpm_min=N, bpm_max=N, include_unplayed=true)
```

Use results to filter or annotate the candidate pool in the next step.

### 3. Build candidate pool

By genre/BPM search:

```plaintext
search_tracks(genre="...", bpm_min=N, bpm_max=N, limit=200)
```

Or from an existing playlist:

```plaintext
get_playlist_tracks(playlist_id="...")
```

Then resolve full data:

```plaintext
resolve_tracks_data(track_ids=[...], max_tracks=200)
```

Filter out tracks missing `stratum_dsp` analysis or outside BPM range.

Present pool summary: track count, genre breakdown, BPM range, key spread.

Ask user: “Proceed with this pool? / adjust filters / add from another playlist”

### 4. Generate candidates

```plaintext
build_set(
  track_ids=[...],
  target_tracks=12,
  start_track_id="...",
  priority="balanced",
  energy_curve="warmup_build_peak_release",
  beam_width=3,
  master_tempo=true,
  harmonic_style="balanced",
  bpm_drift_pct=6.0,
  bpm_range=[122, 130]
)
```

Present 2-3 candidates with: track order, key/BPM/genre per track, per-transition scores, energy curve visualization, overall score.

Ask user: “pick A / pick B / compare position # / regenerate / adjust parameters”

### 5. Refine selected set

Interactive editing commands:

| Command                   | Action                                                    |
| ------------------------- | --------------------------------------------------------- |
| `swap #N TrackID`         | Replace track, re-score adjacent transitions              |
| `move #N to #M`           | Reorder, re-score affected transitions                    |
| `remove #N`               | Remove track, re-score new adjacent pair                  |
| `insert TrackID after #N` | Add track, score both new transitions                     |
| `suggest #N`              | Find best replacement using `query_transition_candidates` |
| `details #N`              | Show full data for track at position N                    |
| `check`                   | Re-score and re-display full set                          |
| `done`                    | Finalize and proceed to export                            |

For `suggest #N`, call:

```plaintext
query_transition_candidates(
  from_track_id="prev_track",
  pool_track_ids=[...remaining...],
  energy_phase="...",
  target_bpm=N,
  master_tempo=true,
  harmonic_style="balanced"
)
```

After each edit, use `score_transition()` to validate and show impact.

Ask user after each edit: “Continue editing? / done”

### 6. Export

```plaintext
write_xml(playlists=[{"name": "Set Name", "track_ids": [...]}])
```

Note: `write_xml` also exports and clears any staged metadata changes. Use `preview_changes` first if other workflows have pending edits.

Report output path. Remind user: File → Import Collection in Rekordbox.
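The beam-search sequencing behind candidate generation can be sketched in Python. This is a minimal illustration under stated assumptions, not reklawdbox’s implementation: `beam_search_set` and the `bpm_transition` toy scorer (BPM jumps only, where the real scorer combines key, BPM, energy, genre, brightness, and rhythm against an energy-curve target) are hypothetical names. The idea: keep only the `beam_width` best partial orderings at each length, extend each with every unused track, and rank by cumulative transition score:

```python
def beam_search_set(tracks, transition_score, target_tracks, beam_width=3, start=None):
    """Beam search sequencing sketch: prune to the top `beam_width`
    partial orderings per step, ranked by summed transition scores."""
    starts = [start] if start is not None else list(tracks)
    beams = [([s], 0.0) for s in starts]  # (sequence, cumulative score)
    for _ in range(target_tracks - 1):
        extended = []
        for seq, total in beams:
            for t in tracks:
                if t in seq:
                    continue  # each track appears at most once
                extended.append((seq + [t], total + transition_score(seq[-1], t)))
        if not extended:
            break  # pool exhausted before reaching target length
        extended.sort(key=lambda b: b[1], reverse=True)
        beams = extended[:beam_width]  # prune to beam width
    return beams[0]  # best (sequence, total score)

# toy transition score: penalize BPM jumps (hypothetical stand-in scorer)
def bpm_transition(a, b):
    return -abs(b - a)
```

Unlike pure greedy selection, the beam keeps runner-up orderings alive, so a transition that scores slightly worse now can win if it sets up better transitions later — which is also why `build_set` can return multiple distinct candidates.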