switch to using Anthropic API

Raise combined tag cap from 5 to 8
Allow zero-taxonomy-tag output and more new tag suggestions
5 changed files with 499 additions and 145 deletions
@@ -0,0 +1,230 @@
+# Tag Enhancement Implementation Plan
+
+> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.
+
+**Goal:** Stop the tagger from force-fitting taxonomy tags onto notes that don't match (e.g., tagging a memoir as "productivity"), by seeding a personal-narrative cluster into the taxonomy and rewriting the prompt to permit zero-taxonomy-tag output when nothing fits.
+
+**Architecture:** Single-script tool; all changes land in `tag-notes.py` (system prompt + tag cap) and `tag-taxonomy.yaml` (new cluster). No new files, no new dependencies, no test harness added. Verification is a manual re-run against a known problem note.
+
+**Tech Stack:** Python, PyYAML, ruamel.yaml, local LM Studio (OpenAI-compatible) endpoint.
+
+**Note on testing:** This project has no test infrastructure (per `CLAUDE.md`: "There are no tests, linter, or build step"). The tagger's correctness is judged by LLM output quality against real notes, not unit tests. Each task below ends in a manual spot-check where meaningful; end-to-end verification lives in Task 4.
+
+**Spec:** `docs/superpowers/specs/2026-04-19-tag-enhancement-design.md`
+
+---
+
+## Task 1: Add Personal Narrative cluster to taxonomy
+
+**Files:**
+- Modify: `tag-taxonomy.yaml` (append at end of file)
+
+- [ ] **Step 1: Append the new cluster**
+
+Append these lines to the end of `tag-taxonomy.yaml` (there is currently a `# Personal Interests` cluster ending with `- gardening`; add a blank line after `gardening`, then this block):
+
+```yaml
+
+  # Personal Narrative & Life
+  - memoir
+  - personal-essay
+  - reflection
+  - family
+  - parenting
+  - recovery
+  - mental-health
+  - aging
+  - relationships
+  - childhood
+  - identity
+```
+
+- [ ] **Step 2: Verify the YAML still parses**
+
+Run: `python3 -c "import yaml; print(len(yaml.safe_load(open('tag-taxonomy.yaml'))['tags']))"`
+
+Expected: prints an integer equal to the old count + 11 (the old file had 31 tags, so expect `42`). If the command raises an error or prints any other number, the new block is malformed or mis-indented; fix it before moving on.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add tag-taxonomy.yaml
+git commit -m "Add personal-narrative cluster to tag taxonomy"
+```
+
+---
+
+## Task 2: Rewrite the system prompt in `request_metadata`
+
+**Files:**
+- Modify: `tag-notes.py:127-145` (the `system_prompt` f-string inside `request_metadata`)
+
+**Context:** Current prompt forces 1-5 taxonomy tags and tells the LLM to be "conservative" about new suggestions. We're flipping both: allow 0 taxonomy tags when nothing fits, and let new suggestions go up to 5.
+
+- [ ] **Step 1: Replace the system_prompt f-string**
+
+In `tag-notes.py`, find the current `system_prompt` assignment inside `request_metadata` (starts at line 127):
+
+```python
+    system_prompt = f"""You analyze markdown notes and return structured metadata.
+
+Return ONLY valid JSON in this exact shape:
+{{
+  "tags_from_taxonomy": ["tag1", "tag2"],
+  "new_tag_suggestions": ["newtag1"],
+  "seo_title_suffix": "Short descriptor that will follow the note title",
+  "seo_description": "Factual summary between {SEO_DESC_MIN} and {SEO_DESC_MAX} characters.",
+  "seo_keywords": ["keyword1", "keyword2"]
+}}
+
+Rules:
+- tags_from_taxonomy: 1-5 tags drawn from the existing taxonomy that best fit the content.
+- new_tag_suggestions: 0-2 NEW tags, only when content truly warrants it (be conservative).
+- seo_title_suffix: a short, clean, non-clickbaity descriptor of the note. Do NOT include the note title or a leading colon — only the text that would follow "<title>: ". Aim for 4-10 words.
+- seo_description: a clean factual summary, STRICTLY between {SEO_DESC_MIN} and {SEO_DESC_MAX} characters inclusive. Count characters carefully before responding.
+- seo_keywords: 10-15 relevant keywords, no duplicates.
+
+Existing tag taxonomy: {taxonomy_str}"""
+```
+
+Replace it with:
+
+```python
+    system_prompt = f"""You analyze markdown notes and return structured metadata.
+
+Return ONLY valid JSON in this exact shape:
+{{
+  "tags_from_taxonomy": ["tag1", "tag2"],
+  "new_tag_suggestions": ["newtag1"],
+  "seo_title_suffix": "Short descriptor that will follow the note title",
+  "seo_description": "Factual summary between {SEO_DESC_MIN} and {SEO_DESC_MAX} characters.",
+  "seo_keywords": ["keyword1", "keyword2"]
+}}
+
+Rules:
+- Tags should describe what the note is substantively about, not topics it merely mentions in passing.
+- tags_from_taxonomy: 0-5 tags drawn from the existing taxonomy, ONLY when they genuinely fit. Do NOT force a taxonomy tag — return an empty list if nothing truly applies.
+- new_tag_suggestions: 0-5 NEW tags when the taxonomy doesn't adequately cover the content. Each must be a reusable category (not hyper-specific to one note). Use lowercase-hyphenated style (e.g., personal-essay).
+- seo_title_suffix: a short, clean, non-clickbaity descriptor of the note. Do NOT include the note title or a leading colon — only the text that would follow "<title>: ". Aim for 4-10 words.
+- seo_description: a clean factual summary, STRICTLY between {SEO_DESC_MIN} and {SEO_DESC_MAX} characters inclusive. Count characters carefully before responding.
+- seo_keywords: 10-15 relevant keywords, no duplicates.
+
+Existing tag taxonomy: {taxonomy_str}"""
+```
+
+The three substantive changes:
+1. Added philosophy line: `- Tags should describe what the note is substantively about, not topics it merely mentions in passing.`
+2. `tags_from_taxonomy: 1-5 ... best fit the content.` → `0-5 ... ONLY when they genuinely fit. Do NOT force a taxonomy tag — return an empty list if nothing truly applies.`
+3. `new_tag_suggestions: 0-2 ... be conservative).` → `0-5 NEW tags when the taxonomy doesn't adequately cover the content. Each must be a reusable category (not hyper-specific to one note). Use lowercase-hyphenated style (e.g., personal-essay).`
+
+- [ ] **Step 2: Syntax check the module**
+
+Run: `python3 -c "import ast; ast.parse(open('tag-notes.py').read()); print('ok')"`
+
+Expected: prints `ok`. If it raises a `SyntaxError` instead, the f-string braces or quotes are wrong; fix them before moving on.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add tag-notes.py
+git commit -m "Allow zero-taxonomy-tag output and more new tag suggestions"
+```
+
+---
+
+## Task 3: Raise the combined tag cap from 5 to 8
+
+**Files:**
+- Modify: `tag-notes.py:225` (inside `process_note`, in the `if needs_tags:` block)
+
+- [ ] **Step 1: Change the slice**
+
+Find this line in `tag-notes.py` (inside `process_note`, ~line 225):
+
+```python
+        combined = list(dict.fromkeys(list(taxonomy_tags) + list(new_suggestions)))[:5]
+```
+
+Change to:
+
+```python
+        combined = list(dict.fromkeys(list(taxonomy_tags) + list(new_suggestions)))[:8]
+```
+
+- [ ] **Step 2: Syntax check**
+
+Run: `python3 -c "import ast; ast.parse(open('tag-notes.py').read()); print('ok')"`
+
+Expected: prints `ok`.
+
+- [ ] **Step 3: Commit**
+
+```bash
+git add tag-notes.py
+git commit -m "Raise combined tag cap from 5 to 8"
+```
+
+---
+
+## Task 4: End-to-end verification against the problem note
+
+**Files:**
+- Modify (temporarily): the "Becoming a Morning Person" note in `~/Documents/ejl-zk/40 Public/41 Notes/`
+
+**Context:** The tagger only touches fields that are currently empty. To force reprocessing we clear the `tags:` field on the problem note, run the script, and inspect the result.
+
+- [ ] **Step 1: Locate the note**
+
+Run: `find ~/Documents/ejl-zk/40\ Public/41\ Notes/ -iname "*morning*person*.md"`
+
+Expected: prints one file path. Note it for the following steps (referred to below as `$NOTE`).
+
+- [ ] **Step 2: Confirm LM Studio is up**
+
+Run: `curl -s http://localhost:1234/v1/models | head -c 200`
+
+Expected: JSON response listing at least one model, including `openai/gpt-oss-20b` (the value of `MODEL_NAME`). If the request fails, start LM Studio and load the model before continuing.
+
+- [ ] **Step 3: Clear the existing tags field**
+
+Open `$NOTE` in an editor and set the `tags:` frontmatter value to an empty list (`tags: []`) or delete the line entirely. Save. Do NOT clear other fields — the script won't touch already-populated ones, which is the desired isolation.
+
+- [ ] **Step 4: Run the tagger**
+
+Run: `cd ~/bin/note-tagger && ./tag-notes.py`
+
+Expected: the script processes every note; for the Morning Person note it prints a line like `  + Added tags: memoir, personal-essay, family, recovery, aging, ...` and does NOT print `productivity` or `learning` in that line.
+
+- [ ] **Step 5: Inspect the frontmatter**
+
+Open `$NOTE` and inspect the `tags:` block.
+
+Pass criteria:
+- Contains at least 3 of: `memoir`, `personal-essay`, `reflection`, `family`, `parenting`, `recovery`, `aging`, `relationships`, `childhood`, `identity`.
+- Does NOT contain `productivity` or `learning`.
+- Has ≤ 8 tags total (confirms the Task 3 cap is applied).
+
+Fail criteria (any triggers a rethink — do NOT patch around it):
+- Still includes `productivity` or `learning`.
+- Zero tags written.
+- More than 8 tags.
+
+If it fails: the follow-up noted in the spec is to add a single few-shot example to the prompt. Stop and report the failing output; don't silently escalate to that change.
+
+- [ ] **Step 6: Spot-check one other already-tagged note**
+
+Run: `git status` in the notes vault (if it's a git repo) OR simply open one other note that already had tags before this run. Confirm its `tags:` field was NOT modified (the script's "only touch empty fields" invariant must still hold).
+
+Expected: no change to that note's tags. If there IS a change, the empty-check logic regressed — stop and investigate.
+
+- [ ] **Step 7: No commit for this task**
+
+Task 4 is verification only. Any prompt-tuning follow-up is out of scope for this plan.
+
+---
+
+## Out of scope (from spec)
+
+- Two-pass LLM classification.
+- Few-shot examples in the prompt (follow-up candidate only if Task 4 fails).
+- Changes to `seo_*` fields, `CONTENT_CHAR_LIMIT`, retry flow, slug derivation, or YAML round-tripping.
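The Task 4 pass and fail criteria are mechanical enough to check in code rather than by eye. The following is a minimal sketch, not part of the plan or of `tag-notes.py`: `check_tags` and the constant names are hypothetical, and the tag sets are copied from the Step 5 criteria above.

```python
# Hypothetical helper: encodes the Task 4 Step 5 pass/fail criteria.
# The tag sets below are copied from the plan; the names are illustrative.
NARRATIVE_TAGS = {
    "memoir", "personal-essay", "reflection", "family", "parenting",
    "recovery", "aging", "relationships", "childhood", "identity",
}
FORBIDDEN_TAGS = {"productivity", "learning"}
MIN_NARRATIVE = 3  # "at least 3 of" the narrative tags
MAX_TAGS = 8       # the Task 3 cap


def check_tags(tags: list[str]) -> list[str]:
    """Return a list of failure reasons; an empty list means the note passes."""
    failures = []
    if not tags:
        failures.append("zero tags written")
    forbidden = sorted(FORBIDDEN_TAGS.intersection(tags))
    if forbidden:
        failures.append("contains forbidden tags: " + ", ".join(forbidden))
    if len(tags) > MAX_TAGS:
        failures.append(f"more than {MAX_TAGS} tags ({len(tags)})")
    narrative_hits = len(NARRATIVE_TAGS.intersection(tags))
    if narrative_hits < MIN_NARRATIVE:
        failures.append(
            f"only {narrative_hits} narrative tags, need {MIN_NARRATIVE}"
        )
    return failures


if __name__ == "__main__":
    print(check_tags(["memoir", "family", "aging", "morning-routine"]))
    print(check_tags(["productivity", "learning"]))
```

Returning reasons instead of a boolean keeps the "stop and report the failing output" instruction easy to follow: the list can be pasted into the report as-is.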
@@ -0,0 +1,97 @@
+# Tag Enhancement Design
+
+Date: 2026-04-19
+
+## Problem
+
+The tagger is producing wrong tags for personal-narrative content. Concrete example: the essay "Becoming a Morning Person" — a memoir about life stages, family, recovery, aging, and parenting — was tagged `productivity` and `learning`.
+
+Root cause is twofold:
+
+1. **Taxonomy gap.** `tag-taxonomy.yaml` has no categories that cover personal narrative, memoir, family, or reflection. The closest fits the LLM can find are productivity-adjacent tags from the "Knowledge & Learning" cluster.
+2. **Prompt bias.** The system prompt in `request_metadata` (tag-notes.py:127) requires 1-5 taxonomy tags (no zero option) and tells the LLM to be "conservative" about new tag suggestions (0-2 max). Together these force the model to pick taxonomy tags even when none genuinely apply, and discourage it from proposing the new categories that would better describe the content.
+
+## Goals
+
+- The "Becoming a Morning Person" essay should be tagged with memoir/personal-narrative concepts, not productivity/learning.
+- Future notes that fall outside the current taxonomy should surface new tag suggestions rather than force-fit existing ones.
+- Taxonomy can grow over time via the existing `new_tag_accumulator` → end-of-run prompt flow — no change to that mechanism.
+
+## Non-Goals
+
+- No two-pass LLM classification. Single call per note stays.
+- No few-shot examples in the prompt for this iteration. May be added as a follow-up if the 20B local model underperforms on the rewritten prompt.
+- No change to `seo_title_suffix`, `seo_description`, `seo_keywords`, `CONTENT_CHAR_LIMIT`, the SEO-description retry flow, YAML round-tripping, or the slug derivation.
+
+## Changes
+
+### 1. Taxonomy additions (`tag-taxonomy.yaml`)
+
+Add a new cluster at the end of the file:
+
+```yaml
+  # Personal Narrative & Life
+  - memoir
+  - personal-essay
+  - reflection
+  - family
+  - parenting
+  - recovery
+  - mental-health
+  - aging
+  - relationships
+  - childhood
+  - identity
+```
+
+Rationale: these are deliberately broad and reusable. `sobriety` was considered and rejected — not expected to be a recurring theme. `recovery` is retained as a broader concept (recovery from any kind of setback, not only substance-related).
+
+### 2. Prompt rewrite in `request_metadata` (tag-notes.py:127)
+
+Three changes to the system prompt:
+
+**a. Add a tagging philosophy sentence** at the top of the `Rules:` section:
+
+> Tags should describe what the note is substantively about, not topics it merely mentions in passing.
+
+**b. Allow zero taxonomy tags.** Replace:
+
+> `tags_from_taxonomy: 1-5 tags drawn from the existing taxonomy that best fit the content.`
+
+with:
+
+> `tags_from_taxonomy: 0-5 tags drawn from the existing taxonomy, ONLY when they genuinely fit. Do NOT force a taxonomy tag — return an empty list if nothing truly applies.`
+
+**c. Loosen new-tag suggestions.** Replace:
+
+> `new_tag_suggestions: 0-2 NEW tags, only when content truly warrants it (be conservative).`
+
+with:
+
+> `new_tag_suggestions: 0-5 NEW tags when the taxonomy doesn't adequately cover the content. Each must be a reusable category (not hyper-specific to one note). Use lowercase-hyphenated style (e.g., personal-essay).`
+
+### 3. Raise the combined tag cap (tag-notes.py:225)
+
+Change `[:5]` to `[:8]`:
+
+```python
+combined = list(dict.fromkeys(list(taxonomy_tags) + list(new_suggestions)))[:8]
+```
+
+Memoir and reflection-style notes often legitimately touch 6-8 distinct themes; capping at 5 was causing otherwise-accurate tags to be dropped.
+
+## Verification
+
+After implementation:
+
+1. Open the "Becoming a Morning Person" note in `~/Documents/ejl-zk/40 Public/41 Notes/` and clear its `tags:` frontmatter field (set to empty list).
+2. Run `./tag-notes.py`.
+3. Confirm the new `tags` value includes memoir/personal-essay-style tags (e.g., `memoir`, `personal-essay`, `family`, `recovery`, `aging`, `reflection`) and does NOT include `productivity` or `learning`.
+4. Spot-check 1-2 other notes that already have reasonable tags and confirm the rewrite didn't regress them. (Each LLM-backed field is only written when it is empty, so existing tags are never overwritten; a note is skipped entirely only when all of its fields are already populated.)
+
+If the tags still look off, the follow-up is to add a single few-shot example to the system prompt showing a memoir case with zero taxonomy tags and all new suggestions — not included in this change.
+
+## Files Touched
+
+- `tag-taxonomy.yaml` — append new cluster
+- `tag-notes.py` — `request_metadata` system prompt (~20 lines) and the `[:5]` → `[:8]` cap
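The effect of the cap change can be seen in isolation. This sketch wraps the `combined = ...` expression from section 3 in a hypothetical helper, `combine_tags`, with the cap as a parameter; the sample tag lists are invented for illustration (`life-stages` is not a tag from the spec).

```python
# Hypothetical wrapper around the one-line expression in process_note.
def combine_tags(taxonomy_tags, new_suggestions, cap=8):
    """Merge both lists, drop duplicates while keeping first-seen order,
    then truncate to `cap` entries."""
    merged = list(taxonomy_tags) + list(new_suggestions)
    return list(dict.fromkeys(merged))[:cap]


# Illustrative inputs only; not real model output.
taxonomy_tags = ["memoir", "family", "recovery", "aging", "parenting", "reflection"]
new_suggestions = ["family", "morning-routine", "sleep-habits", "life-stages"]

old_cap = combine_tags(taxonomy_tags, new_suggestions, cap=5)
new_cap = combine_tags(taxonomy_tags, new_suggestions, cap=8)

print(old_cap)  # taxonomy tags only; every new suggestion is cut
print(new_cap)  # six taxonomy tags plus the first two new suggestions
```

Because taxonomy tags are concatenated first, truncation always drops new suggestions before taxonomy tags. With a cap of 5 and five or more taxonomy hits, no new suggestion can survive, which is one reason the looser `new_tag_suggestions` rule needs the higher cap to have any effect.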
@@ -1,53 +1,65 @@
 #!/usr/bin/env python3
 """
 Note Tagging and SEO Metadata Script
-Processes markdown notes using a local LLM to add tags, slugs, and SEO metadata
+Processes markdown notes using the Anthropic API to add tags, slugs, and SEO metadata.
 """

-import os
-import sys
 import io
 import json
+import os
 import re
+import sys
 from pathlib import Path

-import requests
+import anthropic
 import yaml
 from ruamel.yaml import YAML

+# ---------------------------------------------------------------------------
 # Configuration
-LM_STUDIO_URL = "http://localhost:1234/v1/chat/completions"
-MODEL_NAME = "openai/gpt-oss-20b"
+# ---------------------------------------------------------------------------
 TAXONOMY_FILE = "tag-taxonomy.yaml"
 NOTES_FOLDER = os.path.expanduser("~/Documents/ejl-zk/40 Public/41 Notes/")
 CONTENT_CHAR_LIMIT = 20000
 SEO_DESC_MIN = 150
 SEO_DESC_MAX = 160
+MODEL = "claude-sonnet-4-6"

 # Round-trip YAML preserves existing frontmatter formatting
 yaml_rt = YAML()
 yaml_rt.preserve_quotes = True
 yaml_rt.width = 4096

+# Anthropic client — reads ANTHROPIC_API_KEY from env automatically
+client = anthropic.Anthropic()

-def load_taxonomy(taxonomy_path):
-    with open(taxonomy_path, 'r') as f:
+
+# ---------------------------------------------------------------------------
+# Taxonomy helpers
+# ---------------------------------------------------------------------------
+
+def load_taxonomy(taxonomy_path: Path) -> list[str]:
+    with open(taxonomy_path) as f:
        data = yaml.safe_load(f) or {}
-    return data.get('tags', []) or []
+    return data.get("tags", []) or []


-def append_tags_to_taxonomy(taxonomy_path, new_tags):
-    with open(taxonomy_path, 'r') as f:
+def append_tags_to_taxonomy(taxonomy_path: Path, new_tags: list[str]) -> None:
+    with open(taxonomy_path) as f:
        data = yaml.safe_load(f) or {}
-    existing = data.get('tags', []) or []
+    existing = data.get("tags", []) or []
    combined = list(dict.fromkeys(existing + list(new_tags)))
-    data['tags'] = combined
-    with open(taxonomy_path, 'w') as f:
+    data["tags"] = combined
+    with open(taxonomy_path, "w") as f:
        yaml.dump(data, f, default_flow_style=False, sort_keys=False, allow_unicode=True)


-def extract_frontmatter(content):
-    pattern = r'^---\s*\n(.*?)\n---\s*\n(.*)$'
+# ---------------------------------------------------------------------------
+# Frontmatter helpers
+# ---------------------------------------------------------------------------
+
+def extract_frontmatter(content: str):
+    pattern = r"^---\s*\n(.*?)\n---\s*\n(.*)$"
    match = re.match(pattern, content, re.DOTALL)
    if not match:
        return None, content
@@ -55,66 +67,65 @@ def extract_frontmatter(content):
    return frontmatter, match.group(2)


-def reconstruct_markdown(frontmatter, body):
+def reconstruct_markdown(frontmatter, body: str) -> str:
    stream = io.StringIO()
    yaml_rt.dump(frontmatter, stream)
    fm_str = stream.getvalue()
-    if not fm_str.endswith('\n'):
-        fm_str += '\n'
+    if not fm_str.endswith("\n"):
+        fm_str += "\n"
    return f"---\n{fm_str}---\n{body}"


-def slugify(text):
+def slugify(text: str) -> str:
    text = text.lower()
-    text = re.sub(r"[’'`]", '', text)
-    text = re.sub(r'[^\w\s-]', ' ', text)
-    text = re.sub(r'[-\s]+', '-', text).strip('-')
+    text = re.sub(r"[’'`]", "", text)
+    text = re.sub(r"[^\w\s-]", " ", text)
+    text = re.sub(r"[-\s]+", "-", text).strip("-")
    return text


-def parse_json_response(content):
+# ---------------------------------------------------------------------------
+# LLM helpers
+# ---------------------------------------------------------------------------
+
+def parse_json_response(content: str | None) -> dict | None:
    if content is None:
        return None
    try:
        return json.loads(content)
    except json.JSONDecodeError:
        pass
-    start = content.find('{')
-    end = content.rfind('}')
+    start = content.find("{")
+    end = content.rfind("}")
    if start != -1 and end > start:
        try:
-            return json.loads(content[start:end + 1])
+            return json.loads(content[start : end + 1])
        except json.JSONDecodeError:
            pass
    return None


-def call_llm_json(system_prompt, user_prompt, max_tokens=900):
-    payload = {
-        "model": MODEL_NAME,
-        "messages": [
-            {"role": "system", "content": system_prompt},
-            {"role": "user", "content": user_prompt},
-        ],
-        "temperature": 0.2,
-        "max_tokens": max_tokens,
-        "response_format": {"type": "text"},
-    }
+def call_llm_json(system_prompt: str, user_prompt: str, max_tokens: int = 1024) -> dict | None:
    try:
-        response = requests.post(LM_STUDIO_URL, json=payload, timeout=120)
-        if not response.ok:
-            print(f"  ! LLM error: {response.status_code} {response.reason}")
-            print(f"    body: {response.text[:500]}")
-            return None
-        result = response.json()
-        content = result['choices'][0]['message']['content']
-        return parse_json_response(content)
-    except Exception as e:
-        print(f"  ! LLM error: {e}")
+        message = client.messages.create(
+            model=MODEL,
+            max_tokens=max_tokens,
+            system=system_prompt,
+            messages=[{"role": "user", "content": user_prompt}],
+            temperature=0.2,
+        )
+        content = message.content[0].text if message.content else ""
+        parsed = parse_json_response(content)
+        if parsed is None:
+            print(f"  ! LLM returned no parseable JSON (stop_reason={message.stop_reason})")
+            print(f"    content: {content[:500]!r}")
+        return parsed
+    except anthropic.APIError as e:
+        print(f"  ! Anthropic API error: {e}")
        return None


-def request_metadata(title, note_content, taxonomy):
+def request_metadata(title: str, note_content: str, taxonomy: list[str]) -> dict | None:
    taxonomy_str = ", ".join(taxonomy)
    system_prompt = f"""You analyze markdown notes and return structured metadata.

@@ -145,13 +156,14 @@ Produce the JSON described in the system prompt."""
    return call_llm_json(system_prompt, user_prompt)


-def request_description_retry(title, note_content, previous_desc):
+def request_description_retry(title: str, note_content: str, previous_desc: str) -> str:
    system_prompt = f"""You rewrite SEO descriptions to a strict length.

 Return ONLY valid JSON of the form:
 {{"seo_description": "..."}}

 The description must be a clean, factual summary of the note, STRICTLY between {SEO_DESC_MIN} and {SEO_DESC_MAX} characters inclusive. Count characters carefully before responding."""
+
    user_prompt = f"""Note title: {title}

 Note content:
@@ -161,48 +173,51 @@ Your previous description was {len(previous_desc)} characters, outside the allow
 "{previous_desc}"

 Rewrite it to fit strictly within {SEO_DESC_MIN}-{SEO_DESC_MAX} characters."""
-    result = call_llm_json(system_prompt, user_prompt, max_tokens=400)
+    result = call_llm_json(system_prompt, user_prompt)
    if result:
-        return (result.get('seo_description') or '').strip()
-    return ''
+        return (result.get("seo_description") or "").strip()
+    return ""


-def process_note(file_path, taxonomy, new_tag_accumulator):
+# ---------------------------------------------------------------------------
+# Note processing
+# ---------------------------------------------------------------------------
+
+def process_note(file_path: Path, taxonomy: list[str], new_tag_accumulator: set) -> None:
    print(f"Processing: {file_path}")
-    with open(file_path, 'r', encoding='utf-8') as f:
-        content = f.read()
+    content = file_path.read_text(encoding="utf-8")

    frontmatter, body = extract_frontmatter(content)
    if frontmatter is None:
        print("  ⚠️  No frontmatter found, skipping")
        return

-    existing_tags = frontmatter.get('tags', []) or []
+    existing_tags = frontmatter.get("tags", []) or []
    if existing_tags == [None]:
        existing_tags = []

    needs_tags = not existing_tags
-    needs_slug = not frontmatter.get('slug')
-    needs_seo_title = not frontmatter.get('seo-title')
-    needs_seo_desc = not frontmatter.get('seo-description')
-    needs_seo_keywords = not frontmatter.get('seo-keywords')
+    needs_slug = not frontmatter.get("slug")
+    needs_seo_title = not frontmatter.get("seo-title")
+    needs_seo_desc = not frontmatter.get("seo-description")
+    needs_seo_keywords = not frontmatter.get("seo-keywords")

-    if not (needs_tags or needs_slug or needs_seo_title or needs_seo_desc or needs_seo_keywords):
+    if not any([needs_tags, needs_slug, needs_seo_title, needs_seo_desc, needs_seo_keywords]):
        print("  ✓ All fields already populated, skipping")
        return

-    title = frontmatter.get('title') or Path(file_path).stem
+    title = frontmatter.get("title") or file_path.stem
    updated = False

    if needs_slug:
-        slug = slugify(Path(file_path).stem)
-        frontmatter['slug'] = slug
+        slug = slugify(file_path.stem)
+        frontmatter["slug"] = slug
        print(f"  + Added slug: {slug}")
        updated = True

-    if not (needs_tags or needs_seo_title or needs_seo_desc or needs_seo_keywords):
-        with open(file_path, 'w', encoding='utf-8') as f:
-            f.write(reconstruct_markdown(frontmatter, body))
+    # If only slug was needed, skip the LLM call
+    if not any([needs_tags, needs_seo_title, needs_seo_desc, needs_seo_keywords]):
+        file_path.write_text(reconstruct_markdown(frontmatter, body), encoding="utf-8")
        print("  ✓ Updated successfully")
        return

@@ -212,11 +227,11 @@ def process_note(file_path, taxonomy, new_tag_accumulator):
        return

    if needs_tags:
-        taxonomy_tags = llm_response.get('tags_from_taxonomy') or []
-        new_suggestions = llm_response.get('new_tag_suggestions') or []
+        taxonomy_tags = llm_response.get("tags_from_taxonomy") or []
+        new_suggestions = llm_response.get("new_tag_suggestions") or []
        combined = list(dict.fromkeys(list(taxonomy_tags) + list(new_suggestions)))[:5]
        if combined:
-            frontmatter['tags'] = combined
+            frontmatter["tags"] = combined
            updated = True
            print(f"  + Added tags: {', '.join(combined)}")
            genuinely_new = [t for t in combined if t not in taxonomy and t in new_suggestions]
@@ -225,19 +240,17 @@ def process_note(file_path, taxonomy, new_tag_accumulator):
                new_tag_accumulator.update(genuinely_new)

    if needs_seo_title:
-        suffix = (llm_response.get('seo_title_suffix') or '').strip()
-        suffix = suffix.lstrip(':').strip()
-        # Strip a leading repeat of the title if the LLM included it anyway
+        suffix = (llm_response.get("seo_title_suffix") or "").strip().lstrip(":").strip()
        if suffix.lower().startswith(title.lower()):
-            suffix = suffix[len(title):].lstrip(':').strip()
+            suffix = suffix[len(title):].lstrip(":").strip()
        if suffix:
            seo_title = f"{title}: {suffix}"
-            frontmatter['seo-title'] = seo_title
+            frontmatter["seo-title"] = seo_title
            updated = True
            print(f"  + Added SEO title: {seo_title}")

    if needs_seo_desc:
-        seo_desc = (llm_response.get('seo_description') or '').strip()
+        seo_desc = (llm_response.get("seo_description") or "").strip()
        if seo_desc and not (SEO_DESC_MIN <= len(seo_desc) <= SEO_DESC_MAX):
            print(f"  ~ SEO description length {len(seo_desc)} outside {SEO_DESC_MIN}-{SEO_DESC_MAX}, re-asking")
            retry = request_description_retry(title, body[:CONTENT_CHAR_LIMIT], seo_desc)
@@ -248,30 +261,32 @@ def process_note(file_path, taxonomy, new_tag_accumulator):
            else:
                print("  ! Retry failed; using original")
        if seo_desc:
-            frontmatter['seo-description'] = seo_desc
+            frontmatter["seo-description"] = seo_desc
            updated = True
            print(f"  + Added SEO description ({len(seo_desc)} chars)")

    if needs_seo_keywords:
-        seo_keywords = list(dict.fromkeys(llm_response.get('seo_keywords') or []))
+        seo_keywords = list(dict.fromkeys(llm_response.get("seo_keywords") or []))
        if seo_keywords:
-            frontmatter['seo-keywords'] = seo_keywords
+            frontmatter["seo-keywords"] = seo_keywords
            updated = True
            print(f"  + Added {len(seo_keywords)} SEO keywords")

    if updated:
-        with open(file_path, 'w', encoding='utf-8') as f:
-            f.write(reconstruct_markdown(frontmatter, body))
+        file_path.write_text(reconstruct_markdown(frontmatter, body), encoding="utf-8")
        print("  ✓ Updated successfully")
    else:
        print("  - No updates needed")


-def main():
+# ---------------------------------------------------------------------------
+# Entry point
+# ---------------------------------------------------------------------------
+
+def main() -> None:
    taxonomy_path = Path(__file__).parent / TAXONOMY_FILE
    if not taxonomy_path.exists():
        print(f"Error: Taxonomy file not found at {taxonomy_path}")
-        print(f"Please create {TAXONOMY_FILE} in the same directory as this script")
        sys.exit(1)

    taxonomy = load_taxonomy(taxonomy_path)
@@ -281,11 +296,8 @@ def main():
    if not target_path.exists():
        print(f"Error: Notes folder not found: {target_path}")
        sys.exit(1)
-    if not target_path.is_dir():
-        print(f"Error: {target_path} is not a directory")
-        sys.exit(1)

-    md_files = sorted(target_path.rglob('*.md'))
+    md_files = sorted(target_path.rglob("*.md"))
    if not md_files:
        print(f"No markdown files found in {target_path}")
        sys.exit(0)
@@ -293,7 +305,7 @@ def main():
    print(f"Processing all markdown files under: {target_path}")
    print(f"Found {len(md_files)} markdown files\n")

-    new_tag_accumulator = set()
+    new_tag_accumulator: set[str] = set()
    for md_file in md_files:
        try:
            process_note(md_file, taxonomy, new_tag_accumulator)
@@ -304,17 +316,27 @@ def main():
    print("\n✓ Processing complete!")

    fresh = sorted(t for t in new_tag_accumulator if t not in taxonomy)
-    if fresh:
-        print(f"\nNew tags suggested during this run: {', '.join(fresh)}")
-        try:
-            answer = input("Add these to the taxonomy? [y/N]: ").strip().lower()
-        except EOFError:
-            answer = ''
-        if answer == 'y':
-            append_tags_to_taxonomy(taxonomy_path, fresh)
-            print(f"✓ Added {len(fresh)} tag(s) to {taxonomy_path.name}")
-        else:
-            print("Skipped taxonomy update.")
+    if not fresh:
+        return
+
+    print(f"\nNew tags suggested during this run: {', '.join(fresh)}")
+
+    # Non-interactive (CI): log and skip
+    if not sys.stdin.isatty():
+        print("Non-interactive environment detected — skipping taxonomy update.")
+        print("To add these manually, run the script locally and answer 'y' when prompted.")
+        return
+
+    try:
+        answer = input("Add these to the taxonomy? [y/N]: ").strip().lower()
+    except EOFError:
+        answer = ""
+
+    if answer == "y":
+        append_tags_to_taxonomy(taxonomy_path, fresh)
+        print(f"✓ Added {len(fresh)} tag(s) to {taxonomy_path.name}")
+    else:
+        print("Skipped taxonomy update.")


 if __name__ == "__main__":
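A note on the `slugify` change in this file: the removed line stripped the typographic apostrophe (U+2019) along with the straight apostrophe and backtick. The sketch below is a standalone copy of the function, not the script itself, with the character class made a parameter so the two behaviors can be compared. The constant names and the `apostrophes` parameter are assumptions added for illustration, and `\u2019` is written as an escape so the example does not depend on how the file is encoded or copied.

```python
import re

# Assumed names for illustration; the script hard-codes one character class.
WITH_CURLY = r"[\u2019'`]"   # typographic apostrophe, straight apostrophe, backtick
STRAIGHT_ONLY = r"['`]"      # what remains if the typographic apostrophe is lost


def slugify(text: str, apostrophes: str = WITH_CURLY) -> str:
    text = text.lower()
    # Drop apostrophes entirely so "don't" becomes "dont", not "don-t".
    text = re.sub(apostrophes, "", text)
    # Replace any remaining punctuation with a space.
    text = re.sub(r"[^\w\s-]", " ", text)
    # Collapse runs of spaces and hyphens into a single hyphen.
    text = re.sub(r"[-\s]+", "-", text).strip("-")
    return text


title = "Don\u2019t Panic: A Beginner's Guide"
print(slugify(title))                 # dont-panic-a-beginners-guide
print(slugify(title, STRAIGHT_ONLY))  # don-t-panic-a-beginners-guide
```

Slugs are derived from file names, so a note whose name contains a typographic apostrophe would get a different slug depending on which character class is in effect.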
@@ -1,46 +1,51 @@
-# Tag Taxonomy for Note Tagging
-# Add new tags here as the LLM suggests good ones
-
 tags:
-  # Technology & Development
-  - self-hosting
-  - linux
-  - automation
-  - ai-tools
-  - web-development
-  - infrastructure
-  - docker
-  - security
-  - privacy
-  
-  # Work & Management
-  - project-management
-  - business-analysis
-  - leadership
-  - agile
-  - team-dynamics
-  - process-improvement
-  - governance
-  
-  # Knowledge & Learning
-  - knowledge-management
-  - zettelkasten
-  - note-taking
-  - learning
-  - productivity
-  
-  # Philosophy & Spirituality
-  - buddhism
-  - eastern-philosophy
-  - meditation
-  - mindfulness
-  
-  # Literature & Writing
-  - literature
-  - postmodernism
-  - writing
-  
-  # Personal Interests
-  - plants
-  - aroids
-  - gardening
+- self-hosting
+- linux
+- automation
+- ai-tools
+- web-development
+- infrastructure
+- docker
+- security
+- privacy
+- project-management
+- business-analysis
+- leadership
+- agile
+- team-dynamics
+- process-improvement
+- governance
+- knowledge-management
+- zettelkasten
+- note-taking
+- learning
+- productivity
+- buddhism
+- eastern-philosophy
+- meditation
+- mindfulness
+- literature
+- postmodernism
+- writing
+- plants
+- aroids
+- gardening
+- memoir
+- personal-essay
+- reflection
+- family
+- parenting
+- recovery
+- mental-health
+- aging
+- relationships
+- childhood
+- identity
+- morning-routine
+- sleep-habits
+- baking
+- cooking-techniques
+- fermentation
+- food-baking
+- kitchen-hacks
+- sourdough
Author	SHA1	Message	Date
ejlewis	e792732a13	switch to using Anthropic API	2026-04-25 18:47:29 -05:00
ejlewis	fd1407e06f	Raise combined tag cap from 5 to 8	2026-04-19 22:27:40 -05:00
ejlewis	68d78fe6bd	Allow zero-taxonomy-tag output and more new tag suggestions	2026-04-19 22:24:53 -05:00
ejlewis	b4fb1283b9	Add personal-narrative cluster to tag taxonomy	2026-04-19 22:22:27 -05:00
ejlewis	ef41b6b30a	enhancements	2026-04-19 22:18:10 -05:00
ejlewis	3eff77aa1a	Add implementation plan for tag enhancement Four tasks: taxonomy append, prompt rewrite, tag cap bump, and end-to-end verification against the Morning Person note. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-19 22:15:34 -05:00
ejlewis	43d708c834	Add design spec for tag enhancement Covers taxonomy seeding with a personal-narrative cluster, prompt rewrite to stop force-fitting taxonomy tags, and raised combined tag cap. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>	2026-04-19 22:13:07 -05:00