From 3eff77aa1a74d32392c8e741a8279e1d31988132 Mon Sep 17 00:00:00 2001
From: Ethan J Lewis
Date: Sun, 19 Apr 2026 22:15:34 -0500
Subject: [PATCH] Add implementation plan for tag enhancement

Four tasks: taxonomy append, prompt rewrite, tag cap bump, and end-to-end verification against the Morning Person note.

Co-Authored-By: Claude Opus 4.7
---
 .../plans/2026-04-19-tag-enhancement.md | 230 ++++++++++++++++++
 1 file changed, 230 insertions(+)
 create mode 100644 docs/superpowers/plans/2026-04-19-tag-enhancement.md

diff --git a/docs/superpowers/plans/2026-04-19-tag-enhancement.md b/docs/superpowers/plans/2026-04-19-tag-enhancement.md
new file mode 100644
index 0000000..d9c9ce2
--- /dev/null
+++ b/docs/superpowers/plans/2026-04-19-tag-enhancement.md
@@ -0,0 +1,230 @@
+# Tag Enhancement Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Stop the tagger from force-fitting taxonomy tags onto notes that don't match (e.g., tagging a memoir as "productivity"), by seeding a personal-narrative cluster into the taxonomy and rewriting the prompt to permit zero-taxonomy-tag output when nothing fits.
+
+**Architecture:** Single-script tool; all changes land in `tag-notes.py` (system prompt + tag cap) and `tag-taxonomy.yaml` (new cluster). No new files, no new dependencies, no test harness added. Verification is a manual re-run against a known problem note.
+
+**Tech Stack:** Python, PyYAML, ruamel.yaml, local LM Studio (OpenAI-compatible) endpoint.
+
+**Note on testing:** This project has no test infrastructure (per `CLAUDE.md`: "There are no tests, linter, or build step"). The tagger's correctness is judged by LLM output quality against real notes, not unit tests. Each task below ends in a manual spot-check where meaningful; end-to-end verification lives in Task 4.
+
+**Spec:** `docs/superpowers/specs/2026-04-19-tag-enhancement-design.md`
+
+---
+
+## Task 1: Add Personal Narrative cluster to taxonomy
+
+**Files:**
+- Modify: `tag-taxonomy.yaml` (append at end of file)
+
+- [ ] **Step 1: Append the new cluster**
+
+Append these lines to the end of `tag-taxonomy.yaml` (there is currently a `# Personal Interests` cluster ending with `- gardening`; add a blank line after `gardening`, then this block):
+
+```yaml
+
+  # Personal Narrative & Life
+  - memoir
+  - personal-essay
+  - reflection
+  - family
+  - parenting
+  - recovery
+  - mental-health
+  - aging
+  - relationships
+  - childhood
+  - identity
+```
+
+- [ ] **Step 2: Verify the YAML still parses**
+
+Run: `python3 -c "import yaml; print(len(yaml.safe_load(open('tag-taxonomy.yaml'))['tags']))"`
+
+Expected: prints an integer equal to the old count + 11 (the old file had 31 tags, so expect `42`). If the number isn't old-count + 11, the YAML is malformed; fix the indentation before moving on.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add tag-taxonomy.yaml
+git commit -m "Add personal-narrative cluster to tag taxonomy"
+```
+
+---
+
+## Task 2: Rewrite the system prompt in `request_metadata`
+
+**Files:**
+- Modify: `tag-notes.py:127-145` (the `system_prompt` f-string inside `request_metadata`)
+
+**Context:** The current prompt forces 1-5 taxonomy tags and tells the LLM to be "conservative" about new suggestions. We're flipping both: allow 0 taxonomy tags when nothing fits, and let new suggestions go up to 5.
+
+- [ ] **Step 1: Replace the system_prompt f-string**
+
+In `tag-notes.py`, find the current `system_prompt` assignment inside `request_metadata` (starts at line 127):
+
+```python
+    system_prompt = f"""You analyze markdown notes and return structured metadata.
+
+Return ONLY valid JSON in this exact shape:
+{{
+  "tags_from_taxonomy": ["tag1", "tag2"],
+  "new_tag_suggestions": ["newtag1"],
+  "seo_title_suffix": "Short descriptor that will follow the note title",
+  "seo_description": "Factual summary between {SEO_DESC_MIN} and {SEO_DESC_MAX} characters.",
+  "seo_keywords": ["keyword1", "keyword2"]
+}}
+
+Rules:
+- tags_from_taxonomy: 1-5 tags drawn from the existing taxonomy that best fit the content.
+- new_tag_suggestions: 0-2 NEW tags, only when content truly warrants it (be conservative).
+- seo_title_suffix: a short, clean, non-clickbaity descriptor of the note. Do NOT include the note title or a leading colon — only the text that would follow ": ". Aim for 4-10 words.
+- seo_description: a clean factual summary, STRICTLY between {SEO_DESC_MIN} and {SEO_DESC_MAX} characters inclusive. Count characters carefully before responding.
+- seo_keywords: 10-15 relevant keywords, no duplicates.
+
+Existing tag taxonomy: {taxonomy_str}"""
+```
+
+Replace it with:
+
+```python
+    system_prompt = f"""You analyze markdown notes and return structured metadata.
+
+Return ONLY valid JSON in this exact shape:
+{{
+  "tags_from_taxonomy": ["tag1", "tag2"],
+  "new_tag_suggestions": ["newtag1"],
+  "seo_title_suffix": "Short descriptor that will follow the note title",
+  "seo_description": "Factual summary between {SEO_DESC_MIN} and {SEO_DESC_MAX} characters.",
+  "seo_keywords": ["keyword1", "keyword2"]
+}}
+
+Rules:
+- Tags should describe what the note is substantively about, not topics it merely mentions in passing.
+- tags_from_taxonomy: 0-5 tags drawn from the existing taxonomy, ONLY when they genuinely fit. Do NOT force a taxonomy tag — return an empty list if nothing truly applies.
+- new_tag_suggestions: 0-5 NEW tags when the taxonomy doesn't adequately cover the content. Each must be a reusable category (not hyper-specific to one note). Use lowercase-hyphenated style (e.g., personal-essay).
+- seo_title_suffix: a short, clean, non-clickbaity descriptor of the note. Do NOT include the note title or a leading colon — only the text that would follow "<title>: ". Aim for 4-10 words.
+- seo_description: a clean factual summary, STRICTLY between {SEO_DESC_MIN} and {SEO_DESC_MAX} characters inclusive. Count characters carefully before responding.
+- seo_keywords: 10-15 relevant keywords, no duplicates.
+
+Existing tag taxonomy: {taxonomy_str}"""
+```
+
+The three substantive changes:
+1. Added philosophy line: `- Tags should describe what the note is substantively about, not topics it merely mentions in passing.`
+2. `tags_from_taxonomy: 1-5 ... best fit the content.` → `0-5 ... ONLY when they genuinely fit. Do NOT force a taxonomy tag — return an empty list if nothing truly applies.`
+3. `new_tag_suggestions: 0-2 ... be conservative).` → `0-5 NEW tags when the taxonomy doesn't adequately cover the content. Each must be a reusable category (not hyper-specific to one note). Use lowercase-hyphenated style (e.g., personal-essay).`
+
+- [ ] **Step 2: Syntax check the module**
+
+Run: `python3 -c "import ast; ast.parse(open('tag-notes.py').read()); print('ok')"`
+
+Expected: prints `ok`. If it prints a SyntaxError, the f-string braces or quotes are wrong; fix them before moving on.
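The syntax check above confirms only that the module parses. As a complementary sanity check, the sketch below renders a cut-down prompt and confirms that the doubled braces come out as a single literal pair and that the JSON shape still parses. The constant values and the shortened template are stand-ins for illustration, not the real contents of `tag-notes.py`.

```python
import json

# Stand-in values for illustration only; the real constants live in tag-notes.py.
SEO_DESC_MIN = 120
SEO_DESC_MAX = 160
taxonomy_str = "memoir, personal-essay, reflection"

# Cut-down version of the prompt: doubled braces render as literal braces,
# single braces are substituted with the named values.
system_prompt = f"""Return ONLY valid JSON in this exact shape:
{{
  "tags_from_taxonomy": ["tag1", "tag2"],
  "new_tag_suggestions": ["newtag1"],
  "seo_description": "Factual summary between {SEO_DESC_MIN} and {SEO_DESC_MAX} characters."
}}

Existing tag taxonomy: {taxonomy_str}"""

# Pull the JSON shape back out of the rendered prompt and parse it.
start = system_prompt.index("{")
end = system_prompt.rindex("}") + 1
shape = json.loads(system_prompt[start:end])

print(sorted(shape))           # keys of the example shape
print("{{" in system_prompt)   # doubled braces should not survive rendering
```

If the shape fails to parse, or a doubled brace survives in the rendered text, the escaping in the template is the first place to look.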
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add tag-notes.py
+git commit -m "Allow zero-taxonomy-tag output and more new tag suggestions"
+```
+
+---
+
+## Task 3: Raise the combined tag cap from 5 to 8
+
+**Files:**
+- Modify: `tag-notes.py:225` (inside `process_note`, in the `if needs_tags:` block)
+
+- [ ] **Step 1: Change the slice**
+
+Find this line in `tag-notes.py` (inside `process_note`, ~line 225):
+
+```python
+        combined = list(dict.fromkeys(list(taxonomy_tags) + list(new_suggestions)))[:5]
+```
+
+Change to:
+
+```python
+        combined = list(dict.fromkeys(list(taxonomy_tags) + list(new_suggestions)))[:8]
+```
+
+- [ ] **Step 2: Syntax check**
+
+Run: `python3 -c "import ast; ast.parse(open('tag-notes.py').read()); print('ok')"`
+
+Expected: prints `ok`.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add tag-notes.py
+git commit -m "Raise combined tag cap from 5 to 8"
+```
+
+---
+
+## Task 4: End-to-end verification against the problem note
+
+**Files:**
+- Modify (temporarily): the "Becoming a Morning Person" note in `~/Documents/ejl-zk/40 Public/41 Notes/`
+
+**Context:** The tagger only touches fields that are currently empty. To force reprocessing we clear the `tags:` field on the problem note, run the script, and inspect the result.
+
+- [ ] **Step 1: Locate the note**
+
+Run: `find ~/Documents/ejl-zk/40\ Public/41\ Notes/ -iname "*morning*person*.md"`
+
+Expected: prints one file path. Note it for the following steps (referred to below as `$NOTE`).
+
+- [ ] **Step 2: Confirm LM Studio is up**
+
+Run: `curl -s http://localhost:1234/v1/models | head -c 200`
+
+Expected: JSON response listing at least one model, including `openai/gpt-oss-20b` (the value of `MODEL_NAME`). If the request fails, start LM Studio and load the model before continuing.
+
+- [ ] **Step 3: Clear the existing tags field**
+
+Open `$NOTE` in an editor and set the `tags:` frontmatter value to an empty list (`tags: []`) or delete the line entirely. Save.
Do NOT clear other fields; the script won't touch already-populated ones, which is the desired isolation.
+
+- [ ] **Step 4: Run the tagger**
+
+Run: `cd ~/bin/note-tagger && ./tag-notes.py`
+
+Expected: the script processes every note; for the Morning Person note it prints a line like ` + Added tags: memoir, personal-essay, family, recovery, aging, ...` and does NOT print `productivity` or `learning` in that line.
+
+- [ ] **Step 5: Inspect the frontmatter**
+
+Open `$NOTE` and inspect the `tags:` block.
+
+Pass criteria:
+- Contains at least 3 of: `memoir`, `personal-essay`, `reflection`, `family`, `parenting`, `recovery`, `aging`, `relationships`, `childhood`, `identity`.
+- Does NOT contain `productivity` or `learning`.
+- Has ≤ 8 tags total (confirms the Task 3 cap).
+
+Fail criteria (any one of these triggers a rethink; do NOT patch around it):
+- Still includes `productivity` or `learning`.
+- Zero tags written.
+- More than 8 tags.
+
+If it fails: the follow-up noted in the spec is to add a single few-shot example to the prompt. Stop and report the failing output; don't silently escalate to that change.
+
+- [ ] **Step 6: Spot-check one other already-tagged note**
+
+Run: `git status` in the notes vault (if it's a git repo) OR simply open one other note that already had tags before this run. Confirm its `tags:` field was NOT modified (the script's "only touch empty fields" invariant must still hold).
+
+Expected: no change to that note's tags. If there IS a change, the empty-check logic has regressed; stop and investigate.
+
+- [ ] **Step 7: No commit for this task**
+
+Task 4 is verification only. Any prompt-tuning follow-up is out of scope for this plan.
+
+---
+
+## Out of scope (from spec)
+
+- Two-pass LLM classification.
+- Few-shot examples in the prompt (follow-up candidate only if Task 4 fails).
+- Changes to `seo_*` fields, `CONTENT_CHAR_LIMIT`, retry flow, slug derivation, or YAML round-tripping.
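The pass and fail criteria from Task 4, Step 5 can be expressed as a small checker, which may be handy if the verification is ever repeated. This is a hypothetical helper rather than part of `tag-notes.py`: the name `check_tags` and the idea of passing the tag list in by hand are assumptions made for illustration.

```python
# Hypothetical helper; names are illustrative and not taken from tag-notes.py.
NARRATIVE_TAGS = {
    "memoir", "personal-essay", "reflection", "family", "parenting",
    "recovery", "aging", "relationships", "childhood", "identity",
}
FORBIDDEN_TAGS = {"productivity", "learning"}
MAX_TAGS = 8  # combined cap set in Task 3


def check_tags(tags):
    """Return a list of failure messages; an empty list means the note passes."""
    failures = []
    if not tags:
        failures.append("zero tags written")
    narrative_hits = NARRATIVE_TAGS.intersection(tags)
    if len(narrative_hits) < 3:
        failures.append(f"only {len(narrative_hits)} narrative tags (need at least 3)")
    forced = sorted(FORBIDDEN_TAGS.intersection(tags))
    if forced:
        failures.append("force-fitted tags present: " + ", ".join(forced))
    if len(tags) > MAX_TAGS:
        failures.append(f"{len(tags)} tags exceeds the cap of {MAX_TAGS}")
    return failures


# Example: tags copied by hand from the note's frontmatter after a run.
print(check_tags(["memoir", "family", "recovery", "aging"]))
print(check_tags(["productivity", "learning", "memoir"]))
```

The checker only reports failures. Consistent with the plan, a failing result is a reason to stop and report, not to patch the output.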