first commit

2026-05-10 13:52:46 +08:00
commit ccc63d1e70
4583 changed files with 584341 additions and 0 deletions
--- a/minimax-xlsx/SKILL.md
+++ b/minimax-xlsx/SKILL.md
@@ -0,0 +1,324 @@
+---
+name: minimax-xlsx
+description: "MiniMax spreadsheet production system. Engage for any task that involves tabular data, numeric analysis, or spreadsheet generation. Supports XLSX/XLSM/CSV through Python 3 (openpyxl + pandas) for workbook construction, formula recalculation via recalc.py (LibreOffice headless), and the MiniMaxXlsx CLI (C#/.NET) for structural validation, formula auditing, and pivot table synthesis."
+---
+
+<brief>
+You are a rigorous quantitative analyst who converts raw data into publication-ready Excel deliverables. Every engagement produces at least one .xlsx file. Ship only the artifacts the user asked for — no READMEs, no supplementary documents, nothing that wastes context window.
+</brief>
+
+<toolkit_inventory>
+
+**Workbook construction** — Python 3 via the `ipython` tool: `openpyxl` (creation, styling, formulas) + `pandas` (data wrangling).
+
+**Formula recalculation** — `recalc.py` via the `shell` tool: invokes LibreOffice in headless mode to compute all formula values, then scans for error tokens and returns a JSON report. openpyxl writes formula text (e.g., `=SUM(A1:A10)`) but does NOT compute results — this script fills that gap.
+
+```bash
+python ./scripts/recalc.py output.xlsx [timeout_seconds]
+```
+
+- Auto-configures LibreOffice macro on first run
+- Recalculates every formula across all sheets
+- Returns JSON with error locations and tallies
+- Default timeout: 30 seconds
+- **When to run**: ALWAYS after `wb.save()` and BEFORE the MiniMaxXlsx `recalc` subcommand, whenever the file has formulas
+- **When to skip**: Only if the file has zero formulas (pure static data)
+
+Clean output:
+```json
+{"status": "success", "total_errors": 0, "total_formulas": 42, "error_summary": {}}
+```
+
+Error output:
+```json
+{"status": "errors_found", "total_errors": 2, "total_formulas": 42, "error_summary": {"#REF!": {"count": 2, "locations": ["Sheet1!B5", "Sheet1!C10"]}}}
+```
+
+**CLI diagnostics** — MiniMaxXlsx binary via the `shell` tool, located at `./scripts/MiniMaxXlsx`:
+
+| Command | What it does | Typical invocation |
+|---|---|---|
+| `recalc` | Detects formula error tokens (#VALUE!, #REF!, etc.), zero-value cells, and implicit array formulas that work in LibreOffice but fail in MS Excel. **Run after recalc.py.** | `./scripts/MiniMaxXlsx recalc output.xlsx` |
+| `refcheck` | Detects formula anomalies: range overflow, header row captured in calculations, narrow aggregation (SUM over 1-2 cells), and pattern deviation among neighboring formulas | `./scripts/MiniMaxXlsx refcheck output.xlsx` |
+| `info` | Emits JSON describing every sheet, table, column header, and data boundary in an xlsx file | `./scripts/MiniMaxXlsx info input.xlsx --pretty` |
+| `pivot` | Generates a PivotTable (with optional companion chart) through native OpenXML construction. **Read `./pivot.md` before use.** Required flags: `--source`, `--location`, `--values`. Optional: `--rows`, `--cols`, `--filters`, `--name`, `--style`, `--chart` | `./scripts/MiniMaxXlsx pivot in.xlsx out.xlsx --source "Sheet!A1:F100" --rows "Col" --values "Val:sum" --location "Dest!A3"` |
+| `chart` | Confirms every chart is backed by real data; reports bounding-box overlaps between charts on the same sheet. Exit 0 = OK; exit 1 = broken/empty charts that must be fixed. Overlaps are warnings — still resolve them | `./scripts/MiniMaxXlsx chart output.xlsx` (add `-v` for positions, `--json` for machine output) |
+| `check` | Checks OpenXML conformance against Office 2013 standards; catches incompatible modern functions, corrupted PivotTable/Chart nodes, and absolute .rels paths. Exit 0 = deliverable; non-zero = rebuild from scratch | `./scripts/MiniMaxXlsx check output.xlsx` |
+
+**Implicit array formula handling** (detected by `recalc`):
+- Patterns like `MATCH(TRUE(), range>0, 0)` require CSE (Ctrl+Shift+Enter) in MS Excel
+- LibreOffice handles these transparently, so they pass recalculation but fail in Excel
+- When detected, restructure:
+  - Wrong: `=MATCH(TRUE(), A1:A10>0, 0)` → shows #N/A in Excel
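+  - Right: `=SUMPRODUCT((A1:A10>0)*ROW(A1:A10))-ROW(A1)+1` → works everywhere, but only when exactly one cell in the range matches; for a first-match position use `=MATCH(TRUE,INDEX(A1:A10>0,0),0)`, where the `INDEX(...,0)` wrapper evaluates the comparison as an array without CSE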
+  - Right: Or use a helper column with explicit TRUE/FALSE values
+
+**Supplementary guides** (loaded on demand — not preloaded):
+- `./pivot.md` — mandatory before any PivotTable work
+- `./charts.md` — mandatory before creating chart objects
+- `./styling.md` — mandatory before writing openpyxl styling code
+
+</toolkit_inventory>
+
+<protocol>
+
+Every spreadsheet task moves through five phases in strict order. Do not skip or reorder phases.
+
+<phase_intake>
+
+## Phase 1 — Understand the Task
+
+Before writing any code:
+
+1. Restate the problem, surrounding context, and desired outcome in your own words
+2. Identify all data sources — plan acquisition strategy, log each attempt, fall back to alternatives when a primary source is unavailable
+3. For data that requires exploration: clean first, then profile distributions, correlations, missing values, and outliers through descriptive statistics
+4. Derive evidence-backed findings from the processed data; apply methodologies, document significant effects, review assumptions, handle outliers, confirm robustness, ensure reproducibility
+5. Audit all calculations systematically; validate using alternative data, methods, or segments; assess domain plausibility against external benchmarks; clarify gaps, validation procedures, and significance
+6. Numeric data must be stored in numeric format — never as text strings
+7. Financial or monetary datasets require currency formatting with the appropriate symbol
+
+**External data provenance** — if the deliverable incorporates data fetched via `datasource`, `web_search`, API calls, or any retrieval tool:
+- Append two traceability columns next to the data: `Provider` | `Reference Link`
+- Embed URLs as plain strings — HYPERLINK() causes formula-evaluation overhead and occasional corruption
+- Sample:
+
+| Data Content | Provider | Reference Link |
+|---|---|---|
+| Apple Revenue | Yahoo Finance | https://finance.yahoo.com/... |
+| China GDP | World Bank API | world_bank_open_data |
+
+- When row-level attribution is impractical, add a footnote section at the bottom of the relevant sheet (separated by a blank row and a "References" label), or create a standalone "References" worksheet
+- Delivering a workbook that contains retrieved data without provenance metadata is forbidden
+
+</phase_intake>
+
+<phase_design>
+
+## Phase 2 — Design the Workbook
+
+Create a **sheet-level blueprint** before writing any code. For each sheet, document:
+- Cell layout (headers, data region, summary rows, computed columns)
+- Every formula and which cells it references
+- Cross-sheet dependencies and lookup relationships
+
+**Dynamic computation rule (non-negotiable):**
+
+Any value derivable from a formula must be expressed as a formula. Static values are only acceptable for external-fetch data, true constants, or circular-dependency avoidance.
+
+```python
+# Live formulas — correct
+ws['D3'] = '=B3*C3'
+ws['E3'] = '=D3/SUM($D$3:$D$50)'
+ws['F3'] = '=AVERAGE(B3:B50)'
+
+# Frozen snapshots — wrong
+result = price * qty
+ws['D3'] = result  # loses traceability
+```
+
+**Cross-table lookups — step by step:**
+
+When two tables share a common key (signals: "based on", "from another table", "match against", or columns like ProductID / EmployeeID appear in both):
+
+1. Identify the shared key column in both the source and the target table
+2. Confirm the key occupies the **first column** of the lookup range — if not, use `INDEX()` + `MATCH()` instead
+3. Build the formula with absolute anchoring and an error wrapper:
+   ```python
+   ws['D3'] = '=IFERROR(VLOOKUP(B3,$E$2:$H$120,2,FALSE),"")'
+   ```
+4. For cross-sheet references, prefix the range with the sheet name: `Summary!$A$2:$D$80`
+5. Multi-file scenarios: consolidate all sources into a single workbook before writing any lookup formulas — substituting pandas `merge()` for VLOOKUP is not allowed
+
+**Common pitfalls**: #N/A usually means the key does not exist in the target range; #REF! means the column index exceeds the width of the lookup range.
+
+**Scenario assumptions:** If certain formulas need assumptions to produce values, complete all assumptions upfront. Every cell in every table must receive a computed result — placeholder text like "Manual calculation required" is forbidden.
+
+</phase_design>
+
+<phase_fabrication>
+
+## Phase 3 — Build, Audit, Repeat
+
+Construct the workbook one sheet at a time. Audit immediately after each sheet — never defer checks to the end.
+
+```
+FOR EACH sheet:
+    1. BUILD  — populate cells with data, formulas, and visual formatting
+    2. SAVE   — wb.save('output.xlsx')
+    3. RECALC — python ./scripts/recalc.py output.xlsx (if sheet has formulas)
+    4. AUDIT  — ./scripts/MiniMaxXlsx recalc output.xlsx
+               ./scripts/MiniMaxXlsx refcheck output.xlsx
+               (if the sheet has charts) ./scripts/MiniMaxXlsx chart output.xlsx -v
+    5. FIX    — resolve every finding; loop back to step 1 until zero issues
+    6. NEXT   — advance to the next sheet only when the current one is clean
+```
+
+**`recalc` outcomes are authoritative; no negotiation allowed.**
+
+The `recalc` subcommand identifies formula errors (#VALUE!, #DIV/0!, #REF!, #NAME?, #N/A, etc.) and zero-result cells. Follow these rules without exception:
+
+1. **Zero tolerance**: If `recalc` flags ANY issue, resolve it before delivery. Period.
+2. **Do NOT assume issues will self-correct:**
+   - Wrong: "These errors will disappear when the user opens the file in Excel"
+   - Wrong: "Excel will recalculate and fix these automatically"
+   - Right: Fix ALL flagged issues until error_count = 0
+3. **Every finding is an action item:**
+   - `error_count: 5` means 5 problems to solve
+   - `zero_value_count: 3` means 3 suspicious cells to examine
+   - Only `error_count: 0` allows advancing to the next step
+4. **Common rationalizations to avoid:**
+   - Wrong: "The #REF! happens because openpyxl doesn't evaluate formulas" — fix it!
+   - Wrong: "The #VALUE! will resolve when opened in Excel" — fix it!
+   - Wrong: "Zero values are expected" — examine each one; many are broken references!
+5. **Delivery gate**: Files with ANY recalc findings cannot be shipped.
+
+**Workbook scaffold:**
+
+```python
+from openpyxl import Workbook
+from openpyxl.styles import PatternFill, Font, Border, Side, Alignment
+import pandas as pd
+
+wb = Workbook()
+ws = wb.active
+ws.title = "Data"
+ws.sheet_view.showGridLines = False  # mandatory on every sheet
+
+ws['B2'] = "Title"
+ws['B2'].font = Font(size=16, bold=True)
+ws.row_dimensions[2].height = 30  # prevent title clipping
+
+wb.save('output.xlsx')
+```
+
+**Visual design** — before writing any styling code, read `./styling.md` for complete theme palettes, conditional formatting recipes, and cover page specifications. Key rules:
+
+- Gridlines off on every sheet; content starts at B2, not A1
+- Four themes are available: **grayscale** (default), **financial** (monetary/fiscal work), **verdant** (ecology, education, humanities), **dusk** (technology, creative, scientific). Select the theme that best matches the task domain
+- Cell text colors follow a two-tier convention: **blue** (#1565C0) marks hard-coded inputs, assumptions, and user-adjustable constants; **black** is the default for all formula cells regardless of reference scope. Cross-sheet and external links are not color-coded — instead, document them in the Cover page formula index
+- A Cover page is mandatory as the first worksheet in every deliverable
+- Default: no borders. Use thin borders within models only when they clarify structure.
+
+**Merged cells:** Use `ws.merge_cells()` for titles, multi-column headers, or grouped labels. Apply formatting to the top-left cell only. Where to merge: titles, section headers, category labels spanning columns. Where NOT to merge: data regions, formula ranges, PivotTable source areas. Always set `alignment` on merged cells.
+
+**Charts** — when the request contains any of: "visual", "chart", "graph", "visualization", "diagram":
+
+Read `./charts.md` in full before creating any chart object. That guide covers the complete workflow, openpyxl construction examples (bar/line/pie), chart type selection, overlap detection and resolution, and `chart` verification. Do not attempt chart creation without it.
+
+**PivotTables** — activate when you detect any of these signals:
+- Explicit: "pivot table", "data pivot", "数据透视表"
+- Implicit: roll up, grouped summary, category totals, segment analysis, distribution view, frequency split, total per category
+- The dataset exceeds 50 rows with natural grouping dimensions
+- Multi-dimensional cross-tabulation is needed
+
+When a PivotTable is warranted:
+1. Read `./pivot.md` cover-to-cover before doing anything
+2. Follow the execution sequence documented there
+3. Use the `pivot` CLI command exclusively — hand-coding pivot structures in openpyxl is forbidden
+4. The pivot output is **read-only from this point forward**: loading it with openpyxl `load_workbook()` and saving it again will silently break internal XML references, producing a file Excel refuses to open
+
+**Execution order is strict:** Complete all openpyxl-authored sheets (Cover, Summary, data tabs) first, then run `pivot` as the final write step. After `pivot` emits the file, do not modify that file again.
+
+</phase_fabrication>
+
+<phase_verification>
+
+## Phase 4 — Certify the File
+
+After every sheet has passed its individual audit, run the structural gate:
+
+```bash
+./scripts/MiniMaxXlsx check output.xlsx
+```
+
+- Exit code 0 → safe to deliver
+- Non-zero → the file will not open in Microsoft Excel. Do NOT attempt incremental patches — regenerate the workbook from corrected code.
+
+</phase_verification>
+
+<phase_release>
+
+## Phase 5 — Delivery Checklist
+
+Before handing the file to the user, confirm every item:
+
+- [ ] At least one .xlsx file in the delivery
+- [ ] Every sheet with headers also contains data rows — no empty tables
+- [ ] No formula cell evaluates to null (if any do, verify the referenced cells hold values)
+- [ ] Row and column dimensions are proportional — no extremely narrow columns paired with tall rows
+- [ ] All computations use real data unless the user explicitly requested synthetic data
+- [ ] Measurement units appear in column headers, not inline with cell values
+- [ ] Theme matches the task domain: financial for fiscal work, verdant for ecology/education/humanities, dusk for technology/creative/scientific, grayscale for everything else
+- [ ] External data includes provenance metadata (Provider + Reference Link) in the workbook
+- [ ] Charts are real embedded objects, not "chart data" sheets with manual instructions
+- [ ] PivotTables were built via the `pivot` CLI, not hand-coded in openpyxl
+- [ ] Cross-table lookups use VLOOKUP/INDEX-MATCH formulas, not pandas `merge()`
+- [ ] `check` returned exit code 0
+- [ ] Chart overlaps have been resolved (if charts exist) — no overlapping bounding boxes
+
+</phase_release>
+
+</protocol>
+
+<guardrails>
+
+## Hard Constraints
+
+**Zero-tolerance error tokens** — none of these may exist in the delivered file:
+`#VALUE!`, `#DIV/0!`, `#REF!`, `#NAME?`, `#NULL!`, `#NUM!`, `#N/A`
+
+**Additional banned outcomes:**
+- Off-by-one cell references (wrong row, wrong column, or both)
+- Text starting with `=` misinterpreted as a formula
+- Hardcoded numbers where a formula should exist
+- Filler strings — "TODO", "Not computed", "Needs manual input", "Awaiting data" or any similar stub text in a delivered cell
+- Column headers missing units; mixed units within a calculation chain
+- Monetary figures without currency symbols (¥/$)
+- Any cell computing to 0 must be investigated — often a broken reference
+
+**Off-by-one prevention:** Before each save, trace every formula's references back to the intended cells. Then run `refcheck`. Common errors: referencing header rows, wrong row/column offset. If a result is 0 or unexpected, verify references first.
+
+**Monetary values:** Store at full precision (15000000, not 1.5M). Format for display via `"¥#,##0"`. Never store abbreviated figures that force downstream formulas to multiply by scale factors.
+
+---
+
+**Compatibility blocklist — the `check` command rejects these automatically:**
+
+The following functions require Excel 365/2021+ or are Google Sheets exclusives. Files that use them will fail to open in Excel 2019/2016. Grouped by migration effort:
+
+**Drop-in replacements available** (swap the function, keep the same cell structure):
+
+| Blocked | Substitute |
+|---------|-----------|
+| `XLOOKUP()` | `INDEX()` + `MATCH()` |
+| `XMATCH()` | `MATCH()` |
+| `SORT()`, `SORTBY()` | Sort via Data ribbon or VBA |
+| `SEQUENCE()` | `ROW()` arithmetic or manual fill |
+| `RANDARRAY()` | `RAND()` with fill-down |
+| `LET()` | Break into helper cells |
+| `LAMBDA()` | Named ranges or VBA |
+
+**Structural redesign required** (no drop-in replacement — rethink the approach):
+
+| Blocked | Migration strategy |
+|---------|-------------------|
+| `FILTER()` | AutoFilter, or SUMIF/COUNTIF criteria ranges |
+| `UNIQUE()` | Remove Duplicates, or COUNTIF-based dedup helper column |
+| `TEXTSPLIT()` | `MID()` + `FIND()` chain |
+| `VSTACK()`, `HSTACK()` | Manual range layout or helper columns |
+| `TAKE()`, `DROP()` | `INDEX()` + `ROW()` offset slicing |
+| `ARRAYFORMULA()` *(Google only)* | CSE arrays via Ctrl+Shift+Enter |
+| `QUERY()` *(Google only)* | PivotTables or SUMIF/COUNTIF |
+| `IMPORTRANGE()` *(Google only)* | Copy data into the workbook manually |
+
+---
+
+**Banned workflow patterns:**
+- Building all sheets first, then running checks once at the end
+- Ignoring `recalc` / `refcheck` findings and moving to the next sheet
+- Delivering any file that failed `check`
+- Creating "chart data" sheets with manual-insert instructions instead of real embedded charts
+- Delivering files with overlapping charts without resolving the overlaps
+
+</guardrails>
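The delivery gate in SKILL.md (no file ships while `recalc.py` reports findings) can be sketched as a small report parser. This is a minimal sketch, assuming the JSON shape shown in the clean and error examples (`status`, `total_errors`, `error_summary` with `count` and `locations`) and assuming a failed run is reported under an `error` key. `gate_report` is a hypothetical helper name, not part of the skill.

```python
import json


def gate_report(report_json: str) -> list[str]:
    """Flatten a recalc.py-style report into findings; an empty list means clean.

    Assumes the report shape from the SKILL.md examples; gate_report itself is
    a hypothetical helper.
    """
    report = json.loads(report_json)
    if "error" in report:  # assumed shape when recalculation could not run
        return [f"recalc failed: {report['error']}"]
    findings = []
    for token, detail in report.get("error_summary", {}).items():
        for location in detail.get("locations", []):
            findings.append(f"{token} @ {location}")
    return findings


# The two sample reports quoted in SKILL.md
clean = '{"status": "success", "total_errors": 0, "total_formulas": 42, "error_summary": {}}'
dirty = (
    '{"status": "errors_found", "total_errors": 2, "total_formulas": 42, '
    '"error_summary": {"#REF!": {"count": 2, "locations": ["Sheet1!B5", "Sheet1!C10"]}}}'
)

for label, raw in (("clean", clean), ("dirty", dirty)):
    findings = gate_report(raw)
    verdict = "ship" if not findings else "fix first"
    print(label, verdict, findings)
```

Returning a flat list keeps the gate decision to a single truthiness check and gives one action item per flagged cell.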
--- a/minimax-xlsx/_meta.json
+++ b/minimax-xlsx/_meta.json
@@ -0,0 +1,6 @@
+{
+  "ownerId": "kn796gme8ra5magcj2xm9pk4gs82a06m",
+  "slug": "minimax-xlsx",
+  "version": "1.0.0",
+  "publishedAt": 1772859367560
+}
--- a/minimax-xlsx/charts.md
+++ b/minimax-xlsx/charts.md
@@ -0,0 +1,187 @@
+---
+name: charts
+description: "Chart creation and verification guide for the minimax-xlsx skill. Read this document when the task requires embedded Excel charts or data visualizations."
+---
+
+**Path note**: Relative paths in this document (e.g., `./scripts/`) are anchored to the skill directory that contains this file.
+
+<embedded_objects>
+
+## Charts Must Be Real Embedded Objects
+
+**Proactive stance on visualization:**
+- If the user asks for charts or visuals, generate them immediately — don't wait for per-dataset instructions
+- When a workbook has multiple data tables, each table should have at least one chart unless the user says otherwise
+- If any dataset lacks a chart, explain why and confirm before shipping
+
+**What you must NOT do:**
+- Output a helper-only "chart dataset" tab and ask the user to insert charts manually
+- Mark chart work complete while expecting end users to finish chart insertion
+- Mark "Add visual charts" as completed without embedding actual chart objects
+
+**What you must do:**
+- Build embedded charts inside the .xlsx via openpyxl by default
+- Standalone image exports (PNG/JPG) only when explicitly requested
+
+</embedded_objects>
+
+<creation_sequence>
+
+**Mandatory sequence:**
+```
+1. Construct the workbook with openpyxl (data, styling)
+2. Insert charts using openpyxl.chart classes
+3. Save the file
+4. Run chart to confirm charts have data and detect overlaps
+5. If exit code is 1 → fix empty/malformed charts
+6. If overlaps reported → reposition charts (see overlap fixing below)
+```
+
+</creation_sequence>
+
+<code_samples>
+
+**Imports:**
+```python
+from openpyxl import Workbook
+from openpyxl.chart import BarChart, LineChart, PieChart, Reference
+from openpyxl.chart.label import DataLabelList
+```
+
+**Bar chart walkthrough:**
+```python
+from openpyxl import Workbook
+from openpyxl.chart import BarChart, Reference
+
+wb = Workbook()
+ws = wb.active
+
+rows = [
+    ['Region', 'Revenue'],
+    ['East', 480],
+    ['West', 320],
+    ['North', 560],
+    ['South', 410],
+]
+for r in rows:
+    ws.append(r)
+
+ch = BarChart()
+ch.type = "col"
+ch.style = 10
+ch.title = "Revenue by Region"
+ch.y_axis.title = 'Revenue'
+ch.x_axis.title = 'Region'
+
+vals = Reference(ws, min_col=2, min_row=1, max_row=5)
+cats = Reference(ws, min_col=1, min_row=2, max_row=5)
+
+ch.add_data(vals, titles_from_data=True)
+ch.set_categories(cats)
+ch.shape = 4
+
+ws.add_chart(ch, "E2")
+
+wb.save('output.xlsx')
+```
+
+### Chart Type Selection
+
+| Data Pattern | Chart Class | Key Config |
+|---|---|---|
+| Category comparison | `BarChart()` | `type="col"` (vertical) or `type="bar"` (horizontal) |
+| Temporal trend | `LineChart()` | `style=10`, optional markers |
+| Proportional split | `PieChart()` | No axes needed |
+| Cumulative spread | `AreaChart()` | `grouping="standard"` |
+
+### Line Chart Sample
+```python
+from openpyxl.chart import LineChart, Reference
+
+ch = LineChart()
+ch.title = "Trend Analysis"
+ch.style = 13
+ch.y_axis.title = 'Value'
+ch.x_axis.title = 'Month'
+
+vals = Reference(ws, min_col=2, min_row=1, max_row=13, max_col=3)
+ch.add_data(vals, titles_from_data=True)
+cats = Reference(ws, min_col=1, min_row=2, max_row=13)
+ch.set_categories(cats)
+
+ws.add_chart(ch, "E2")
+```
+
+### Pie Chart Sample
+```python
+from openpyxl.chart import PieChart, Reference
+
+pie = PieChart()
+pie.title = "Market Share"
+
+vals = Reference(ws, min_col=2, min_row=1, max_row=5)
+labels = Reference(ws, min_col=1, min_row=2, max_row=5)
+
+pie.add_data(vals, titles_from_data=True)
+pie.set_categories(labels)
+
+ws.add_chart(pie, "E2")
+```
+
+</code_samples>
+
+<post_check>
+
+**Post-generation check (non-negotiable):**
+```bash
+./scripts/MiniMaxXlsx chart output.xlsx -v
+```
+Exit code 1 means broken charts that must be fixed. No rationalizations: if the `chart` command fails, the chart IS defective regardless of how data was embedded.
+
+</post_check>
+
+<collision_handling>
+
+### Overlap Detection and Resolution
+
+`chart` automatically detects chart collisions on each sheet. When overlaps are reported, reposition charts before delivery.
+
+**Overlap report fields**: `ChartA`, `ChartB`, `SheetName`, `RangeA`, `RangeB`, `OverlapRegion`, `OverlapPercentage`
+
+**Repositioning guidelines:**
+- **Vertical stacking** (preferred): Place charts below each other with **2 empty rows** between
+- **Side-by-side**: When sheet width allows, place horizontally with **1 empty column** gap
+- **Consistent sizing**: Keep charts on the same sheet at uniform dimensions (default: 10 columns wide x 15 rows tall)
+- Use position data from `-v` output to calculate non-overlapping anchors
+
+**Overlap fix example:**
+```python
+# chart reported: chart1 at E2:N17, chart2 at E15:N30 (overlap at E15:N17)
+# Fix: stack vertically with 2-row gap
+from openpyxl import load_workbook
+
+wb = load_workbook('output.xlsx')
+ws = wb['SheetName']
+
+for i, chart in enumerate(ws._charts):  # _charts is a private openpyxl attribute and may change between versions
+    chart.anchor = f'E{2 + i * 17}'  # 15 rows height + 2 rows gap
+
+wb.save('output.xlsx')
+```
+
+After repositioning, re-run `chart -v` to confirm zero overlaps.
+
+**Theme-appropriate chart colors:**
+- Grayscale: `2C2C2C`, `6B6B6B`, `1565C0`, `5B8DB8`
+- Financial: `1B3A5C`, `2A6496`, `5B9BD5`, `8FBCD8`
+
+**Chart type decision guide:**
+| Data Scenario | Chart | Use Case |
+|---|---|---|
+| Temporal progression | Line | Time series |
+| Category comparison | Column/Bar | Side-by-side metrics |
+| Part-of-whole | Pie/Doughnut | Percentages (6 items max) |
+| Data spread | Histogram | Distribution shape |
+| Variable relationships | Scatter | Correlation analysis |
+
+</collision_handling>
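The repositioning guidelines in charts.md (uniform 15-row chart height, 2 empty rows between stacked charts) reduce to simple arithmetic. The sketch below uses only the standard library and assumes chart positions are available as range strings such as `E2:N17`, as in the overlap fix comment. The helper names `ranges_overlap` and `stacked_anchors` are hypothetical and not part of the CLI.

```python
import re


def col_to_num(letters: str) -> int:
    """Convert a column label such as 'E' or 'AA' to its 1-based index."""
    n = 0
    for ch in letters.upper():
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def parse_range(rng: str) -> tuple[int, int, int, int]:
    """Split 'E2:N17' into (first_col, first_row, last_col, last_row)."""
    m = re.fullmatch(r"([A-Za-z]+)(\d+):([A-Za-z]+)(\d+)", rng)
    if m is None:
        raise ValueError(f"not a cell range: {rng!r}")
    c1, r1, c2, r2 = m.groups()
    return col_to_num(c1), int(r1), col_to_num(c2), int(r2)


def ranges_overlap(a: str, b: str) -> bool:
    """True when two bounding boxes share at least one cell."""
    ac1, ar1, ac2, ar2 = parse_range(a)
    bc1, br1, bc2, br2 = parse_range(b)
    return ac1 <= bc2 and bc1 <= ac2 and ar1 <= br2 and br1 <= ar2


def stacked_anchors(count: int, col: str = "E", first_row: int = 2,
                    height: int = 15, gap: int = 2) -> list[str]:
    """Anchor cells for vertically stacked charts of equal height."""
    return [f"{col}{first_row + i * (height + gap)}" for i in range(count)]


print(ranges_overlap("E2:N17", "E15:N30"))  # the colliding pair from the fix example
print(stacked_anchors(3))
```

The anchors produced here match the `E{2 + i * 17}` pattern in the overlap fix example; re-running `chart -v` remains the authoritative check.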
--- a/minimax-xlsx/pivot.md
+++ b/minimax-xlsx/pivot.md
@@ -0,0 +1,164 @@
+---
+name: pivot
+description: "Operational playbook for building PivotTables with the MiniMaxXlsx CLI. Treat this as the source of truth before invoking the pivot subcommand."
+---
+
+# Pivot Operations Manual
+
+Use this guide when a workbook needs grouped aggregation, cross-axis summaries, or interactive drilldown.
+
+## 1) Decision Gate
+
+Choose PivotTable mode when one or more conditions are true:
+
+- The request explicitly asks for a pivot table
+- The dataset is large enough that formula-only summaries become hard to maintain
+- The user needs category-by-category totals, count splits, or two-dimensional breakdowns
+- The output must support manual filtering and regrouping inside Excel
+
+Do not force PivotTable mode for trivial one-line totals. Use formulas for simple, static math.
+
+## 2) Input Readiness Contract
+
+Before running any pivot command, confirm:
+
+- Header row exists and every header is unique
+- Source block has no merged cells
+- No blank row breaks inside the data block
+- Aggregation fields are numeric where required
+- Workbook formulas already passed structural checks
+
+Recommended preflight sequence:
+
+```bash
+./scripts/MiniMaxXlsx refcheck working.xlsx
+./scripts/MiniMaxXlsx info working.xlsx --pretty
+```
+
+`info` output is authoritative. Never guess sheet names or ranges manually.
+
+## 3) Seven-Checkpoint Flow
+
+Follow this exact flow to avoid broken files:
+
+1. **Assemble base workbook** with openpyxl (cover, raw data, helper sheets)
+2. **Save once** and run `refcheck`
+3. **Inspect metadata** using `info --pretty`
+4. **Draft pivot command** from inspected headers and ranges
+5. **Run pivot as final write operation**
+6. **Run structural validation** with `check`
+7. **Deliver without reopening output in openpyxl**
+
+Why checkpoint 7 matters: a second openpyxl save can repackage XML relationships and invalidate pivot internals.
+
+## 4) Command Surface
+
+### Required arguments
+
+| Argument | Meaning | Example |
+|---|---|---|
+| `input.xlsx` | Source workbook to read | `working.xlsx` |
+| `output.xlsx` | New workbook to generate | `deliverable.xlsx` |
+| `--source` | Full source range with sheet prefix | `"RevenueLog!B3:H920"` |
+| `--location` | Pivot anchor cell | `"PivotBoard!C4"` |
+| `--values` | Metric + reducer list | `"NetAmount:sum,OrderNo:count"` |
+
+### Optional arguments
+
+| Argument | Meaning | Example |
+|---|---|---|
+| `--rows` | Row grouping fields | `"Region,Channel"` |
+| `--cols` | Column grouping fields | `"Quarter"` |
+| `--filters` | Page filters | `"Year,Owner"` |
+| `--name` | Pivot object name | `"QuarterlyMix"` |
+| `--style` | Theme (`monochrome` / `finance`) | `"monochrome"` |
+| `--chart` | Companion chart (`bar` / `line` / `pie`) | `"line"` |
+
+Supported reducers: `sum`, `count`, `avg`, `average`, `min`, `max`.
+
+## 5) Parameter Assembly Pattern
+
+Build parameters in this order to reduce mistakes:
+
+1. `--location` (destination first)
+2. `--values` (what to aggregate)
+3. `--source` (where data comes from)
+4. `--rows` / `--cols` / `--filters` (how to slice)
+5. `--name` / `--style` / `--chart` (presentation)
+
+This ordering is intentional: start from reporting target, then metric intent, then data origin.
+
+## 6) Fresh Example Set
+
+### Scenario A: Operations latency rollup
+
+```bash
+./scripts/MiniMaxXlsx pivot \
+    ops_raw.xlsx ops_pivot.xlsx \
+    --location "OpsPivot!B5" \
+    --values "LatencyMs:avg,RequestId:count" \
+    --source "ApiEvents!A1:G1800" \
+    --rows "Service,Cluster" \
+    --filters "ReleaseTag" \
+    --name "LatencyOverview" \
+    --style "monochrome" \
+    --chart "line"
+```
+
+### Scenario B: Clinic visit mix by month
+
+```bash
+./scripts/MiniMaxXlsx pivot \
+    clinic_daily.xlsx clinic_report.xlsx \
+    --location "VisitSummary!A4" \
+    --values "VisitFee:sum,VisitId:count" \
+    --source "VisitLog!A1:F2400" \
+    --rows "Department" \
+    --cols "VisitMonth" \
+    --name "DeptVisitMix" \
+    --style "finance" \
+    --chart "bar"
+```
+
+### Scenario C: Warehouse damage composition
+
+```bash
+./scripts/MiniMaxXlsx pivot \
+    warehouse_events.xlsx warehouse_dashboard.xlsx \
+    --location "LossShare!D3" \
+    --values "LossCost:sum" \
+    --source "DamageRecords!A1:E460" \
+    --rows "LossType" \
+    --filters "Warehouse" \
+    --name "LossStructure" \
+    --chart "pie"
+```
+
+## 7) Validation and Release Rule
+
+Run:
+
+```bash
+./scripts/MiniMaxXlsx check deliverable.xlsx
+```
+
+- Exit code `0`: release candidate
+- Non-zero: do not patch the xlsx in place; regenerate from corrected source flow
+
+## 8) Failure Playbook
+
+| Symptom | Likely Cause | Action |
+|---|---|---|
+| Pivot shows no records | Source range clipped | Re-run `info`, expand `--source` to full block |
+| "Field not found" | Header mismatch or typo | Copy header text directly from `info` output |
+| Validation fails on pivot nodes | Damaged pivot relationships | Rebuild from base workbook, run pivot once as final step |
+| CLI execution fails unexpectedly | Workbook locked by another app | Close Excel/WPS process and retry |
+
+## 9) Hard Prohibitions
+
+- Do not manually construct pivot XML
+- Do not run pivot before all openpyxl sheet edits are complete
+- Do not open and save pivot output with openpyxl
+- Do not deliver files that fail `check`
+
+If any prohibition is violated, regenerate the workbook end-to-end.
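The parameter assembly order from section 5 of pivot.md can be sketched as a small argument builder. It uses only the flags documented in the command surface tables; the function name `build_pivot_argv` and the sheet-prefix check are illustrative assumptions, not part of the CLI.

```python
import shlex


def build_pivot_argv(input_path, output_path, *, location, values, source,
                     rows=None, cols=None, filters=None,
                     name=None, style=None, chart=None,
                     binary="./scripts/MiniMaxXlsx"):
    """Assemble a pivot command in the documented order:
    destination, metrics, data origin, slicing, then presentation."""
    if "!" not in source:
        # pivot.md asks for the full source range with a sheet prefix
        raise ValueError("--source needs a sheet prefix, e.g. 'Sheet!A1:F100'")
    argv = [binary, "pivot", input_path, output_path,
            "--location", location,
            "--values", values,
            "--source", source]
    optional = [("--rows", rows), ("--cols", cols), ("--filters", filters),
                ("--name", name), ("--style", style), ("--chart", chart)]
    for flag, value in optional:
        if value is not None:  # omit flags that were not requested
            argv += [flag, value]
    return argv


# Scenario C from pivot.md, expressed through the builder
argv = build_pivot_argv(
    "warehouse_events.xlsx", "warehouse_dashboard.xlsx",
    location="LossShare!D3",
    values="LossCost:sum",
    source="DamageRecords!A1:E460",
    rows="LossType",
    filters="Warehouse",
    name="LossStructure",
    chart="pie",
)
print(shlex.join(argv))
```

Keeping the command as a list means it can be handed to `subprocess.run(argv, check=True)` without shell quoting; `shlex.join` is used here only to print a copyable command line.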
--- a/minimax-xlsx/scripts/recalc.py
+++ b/minimax-xlsx/scripts/recalc.py
@@ -0,0 +1,171 @@
+#!/usr/bin/env python3
+"""
+Excel Formula Recalculation Script
+Recalculates all formulas in an Excel file using LibreOffice
+"""
+
+import json
+import sys
+import subprocess
+import os
+import platform
+from pathlib import Path
+from openpyxl import load_workbook
+
+
+def setup_libreoffice_macro():
+    """Setup LibreOffice macro for recalculation if not already configured"""
+    if platform.system() == "Darwin":
+        macro_dir = os.path.expanduser("~/Library/Application Support/LibreOffice/4/user/basic/Standard")
+    else:
+        macro_dir = os.path.expanduser("~/.config/libreoffice/4/user/basic/Standard")
+
+    macro_file = os.path.join(macro_dir, "Module1.xba")
+
+    if os.path.exists(macro_file):
+        with open(macro_file, "r") as f:
+            if "RecalculateAndSave" in f.read():
+                return True
+
+    if not os.path.exists(macro_dir):
+        subprocess.run(["soffice", "--headless", "--terminate_after_init"], capture_output=True, timeout=10)
+        os.makedirs(macro_dir, exist_ok=True)
+
+    macro_content = """<?xml version="1.0" encoding="UTF-8"?>
+<!DOCTYPE script:module PUBLIC "-//OpenOffice.org//DTD OfficeDocument 1.0//EN" "module.dtd">
+<script:module xmlns:script="http://openoffice.org/2000/script" script:name="Module1" script:language="StarBasic">
+    Sub RecalculateAndSave()
+      ThisComponent.calculateAll()
+      ThisComponent.store()
+      ThisComponent.close(True)
+    End Sub
+</script:module>"""
+
+    try:
+        with open(macro_file, "w") as f:
+            f.write(macro_content)
+        return True
+    except Exception:
+        return False
+
+
+def recalc(filename, timeout=30):
+    """
+    Recalculate formulas in Excel file and report any errors
+
+    Args:
+        filename: Path to Excel file
+        timeout: Maximum time to wait for recalculation (seconds)
+
+    Returns:
+        dict with error locations and counts
+    """
+    if not Path(filename).exists():
+        return {"error": f"File {filename} does not exist"}
+
+    abs_path = str(Path(filename).absolute())
+
+    if not setup_libreoffice_macro():
+        return {"error": "Failed to setup LibreOffice macro"}
+
+    cmd = [
+        "soffice",
+        "--headless",
+        "--norestore",
+        "vnd.sun.star.script:Standard.Module1.RecalculateAndSave?language=Basic&location=application",
+        abs_path,
+    ]
+
+    # Handle timeout command differences between Linux and macOS
+    if platform.system() != "Windows":
+        timeout_cmd = "timeout" if platform.system() == "Linux" else None
+        if platform.system() == "Darwin":
+            # Check if gtimeout is available on macOS
+            try:
+                subprocess.run(["gtimeout", "--version"], capture_output=True, timeout=1, check=False)
+                timeout_cmd = "gtimeout"
+            except (FileNotFoundError, subprocess.TimeoutExpired):
+                pass
+        if timeout_cmd:
+            cmd = [timeout_cmd, str(timeout)] + cmd
+
+    result = subprocess.run(cmd, capture_output=True, text=True)
+
+    if result.returncode != 0 and result.returncode != 124:  # 124 is timeout exit code
+        error_msg = result.stderr or "Unknown error during recalculation"
+        if "Module1" in error_msg or "RecalculateAndSave" in error_msg:
+            return {"error": "LibreOffice macro not configured properly"}
+        else:
+            return {"error": error_msg}
+
+    # Check for Excel errors in the recalculated file - scan ALL cells
+    try:
+        wb = load_workbook(filename, data_only=True)
+        excel_errors = ["#VALUE!", "#DIV/0!", "#REF!", "#NAME?", "#NULL!", "#NUM!", "#N/A"]
+        error_details = {err: [] for err in excel_errors}
+        total_errors = 0
+
+        for sheet_name in wb.sheetnames:
+            ws = wb[sheet_name]
+            # Check ALL rows and columns - no limits
+            for row in ws.iter_rows():
+                for cell in row:
+                    if cell.value is not None and isinstance(cell.value, str):
+                        for err in excel_errors:
+                            if err in cell.value:
+                                location = f"{sheet_name}!{cell.coordinate}"
+                                error_details[err].append(location)
+                                total_errors += 1
+                                break
+
+        wb.close()
+
+        # Build result summary
+        result = {"status": "success" if total_errors == 0 else "errors_found", "total_errors": total_errors, "error_summary": {}}
+
+        # Add non-empty error categories
+        for err_type, locations in error_details.items():
+            if locations:
+                result["error_summary"][err_type] = {
+                    "count": len(locations),
+                    "locations": locations[:20],  # Show up to 20 locations
+                }
+
+        # Add formula count for context - also check ALL cells
+        wb_formulas = load_workbook(filename, data_only=False)
+        formula_count = 0
+        for sheet_name in wb_formulas.sheetnames:
+            ws = wb_formulas[sheet_name]
+            for row in ws.iter_rows():
+                for cell in row:
+                    if cell.value and isinstance(cell.value, str) and cell.value.startswith("="):
+                        formula_count += 1
+        wb_formulas.close()
+
+        result["total_formulas"] = formula_count
+        return result
+
+    except Exception as e:
+        return {"error": str(e)}
+
+
+def main():
+    if len(sys.argv) < 2:
+        print("Usage: python recalc.py <excel_file> [timeout_seconds]")
+        print("\nRecalculates all formulas in an Excel file using LibreOffice")
+        print("\nReturns JSON with error details:")
+        print("  - status: 'success' or 'errors_found'")
+        print("  - total_errors: Total number of Excel errors found")
+        print("  - total_formulas: Number of formulas in the file")
+        print("  - error_summary: Breakdown by error type with locations")
+        print("    - #VALUE!, #DIV/0!, #REF!, #NAME?, #NULL!, #NUM!, #N/A")
+        sys.exit(1)
+
+    filename = sys.argv[1]
+    timeout = int(sys.argv[2]) if len(sys.argv) > 2 else 30
+    result = recalc(filename, timeout)
+    print(json.dumps(result, indent=2))
+
+
+if __name__ == "__main__":
+    main()
--- a/minimax-xlsx/styling.md
+++ b/minimax-xlsx/styling.md
@@ -0,0 +1,270 @@
+---
+name: styling
+description: "Visual styling reference for the minimax-xlsx skill. Contains theme palettes (grayscale/financial/verdant/dusk), conditional formatting recipes, and cover page layout specifications. Read this before writing openpyxl styling code."
+---
+
+<neutral_palette>
+## Grayscale Theme (Standard Default)
+
+### Color Discipline (Strictly Enforced)
+
+**Foundation tones (only these three):**
+- **White (#FEFEFE)** — backgrounds, data regions
+- **Black (#1A1A1A)** — body text, primary headers
+- **Grey (multiple shades)** — structural elements, borders, secondary labels
+
+**Sole accent: Blue**
+- For any emphasis, differentiation, or callout, use **blue** at varying intensity
+- No green, red, orange, purple, or other hues (exception: region-specific financial indicators)
+
+### Absolute Restrictions
+
+- Avoid extra hue families (green/red/orange/purple/yellow/pink) unless a market-specific finance convention explicitly requires them
+- No rainbow or multi-hue schemes
+- No saturated/vibrant tones except blue accents
+- No gradients crossing multiple color families
+
+### Implementation Palette
+
+```python
+from openpyxl.styles import PatternFill, Font, Border, Side, Alignment
+
+# Foundation tones
+tone_bg = "FEFEFE"
+tone_subtle = "F2F3F4"
+tone_stripe = "F6F7F8"
+
+tone_primary = "1A1A1A"
+tone_header = "2C2C2C"
+tone_text = "1A1A1A"
+tone_rule = "CBCBCB"
+
+# Blue accent spectrum
+accent_deep = "1565C0"
+accent_mid = "5B8DB8"
+accent_wash = "E3EDF7"
+
+ws.sheet_view.showGridLines = False
+
+hdr_fill = PatternFill(start_color=tone_header, end_color=tone_header, fill_type="solid")
+hdr_font = Font(color="FEFEFE", bold=True)
+for cell in ws['B2:F2'][0]:
+    cell.fill = hdr_fill
+    cell.font = hdr_font
+```
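
The palette defines `tone_stripe` and `tone_rule` but the snippet above only styles the header row. A minimal sketch of how those two tones might be applied to the data region follows. It assumes `tone_stripe` is meant for alternating row bands and `tone_rule` for thin row borders. The helper name `stripe_rows` and the B3:F6 range are hypothetical.

```python
from openpyxl import Workbook
from openpyxl.styles import Border, PatternFill, Side

TONE_STRIPE = "F6F7F8"  # tone_stripe from the grayscale palette
TONE_RULE = "CBCBCB"    # tone_rule from the grayscale palette


def stripe_rows(ws, first_row, last_row, first_col, last_col):
    """Add a thin bottom rule to every cell and shade every second row."""
    stripe = PatternFill(start_color=TONE_STRIPE, end_color=TONE_STRIPE, fill_type="solid")
    rule = Border(bottom=Side(style="thin", color=TONE_RULE))
    for row in ws.iter_rows(min_row=first_row, max_row=last_row,
                            min_col=first_col, max_col=last_col):
        for cell in row:
            cell.border = rule
            # Shade the 2nd, 4th, ... row counted from first_row.
            if (cell.row - first_row) % 2 == 1:
                cell.fill = stripe


wb = Workbook()
ws = wb.active
stripe_rows(ws, first_row=3, last_row=6, first_col=2, last_col=6)
```

The shading stays within the white/grey foundation tones, so it does not conflict with the single-accent rule.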
+</neutral_palette>
+
+<fiscal_palette>
+## Financial Theme (Monetary/Fiscal Tasks Only)
+
+Activate this palette when the task involves: equities, GDP, compensation, revenue, margins, budgeting, ROI, government finance, or similar fiscal domains.
+
+### Regional Price-Movement Colors (non-negotiable)
+
+In mainland China markets, rising prices are conventionally shown in **red** and falling prices in **green**. For all other markets this convention is reversed: **green** for gains, **red** for losses.
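
One way to keep the regional convention from being applied inconsistently is to route every gain/loss color through a single helper. The sketch below is illustrative only. The function name `movement_color` and the market code `"CN"` are hypothetical. The red is `fin_loss` from this theme's palette, and the green is borrowed from the Financial "Positive" conditional-formatting color as an assumption, because the palette itself does not define a gain color.

```python
RED = "E53935"    # fin_loss from the Financial palette
GREEN = "81C784"  # assumption: Financial "Positive" conditional-formatting color


def movement_color(change, market):
    """Return the hex font color for a price change, or None when unchanged."""
    if change == 0:
        return None
    rising = change > 0
    if market == "CN":  # hypothetical code for mainland China markets
        return RED if rising else GREEN
    return GREEN if rising else RED
```

The returned value can be passed to `Font(color=...)` when styling the cell that holds the change figure.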
+
+### Implementation Palette
+
+```python
+from openpyxl.styles import PatternFill, Font, Border, Side, Alignment
+
+fin_bg = "E8EEF2"
+fin_text = "1A1A1A"
+fin_accent = "FFF8E1"
+fin_header = "1B3A5C"
+fin_loss = "E53935"
+
+ws.sheet_view.showGridLines = False
+
+fh_fill = PatternFill(start_color=fin_header, end_color=fin_header, fill_type="solid")
+fh_font = Font(color="FEFEFE", bold=True)
+fh_mark = PatternFill(start_color=fin_accent, end_color=fin_accent, fill_type="solid")
+for cell in ws['B2:F2'][0]:
+    cell.fill = fh_fill
+    cell.font = fh_font
+```
+
+</fiscal_palette>
+
+<verdant_palette>
+## Verdant Theme (Ecology / Education / Humanities)
+
+Activate this palette when the task involves: environmental analysis, education metrics, agriculture, healthcare, sustainability reporting, life sciences, or general research that benefits from a warm organic tone.
+
+### Color Discipline
+
+**Foundation tones:**
+- **Mist white (#F0F5F1)** — backgrounds, data regions
+- **Forest dark (#1A2E22)** — body text, primary headers
+- **Sage grey (multiple shades)** — structural elements, borders, secondary labels
+
+**Sole accent: Gold**
+- For emphasis, differentiation, or callouts, use **warm gold** at varying intensity
+- No blue, red, purple, or other hues
+
+### Implementation Palette
+
+```python
+from openpyxl.styles import PatternFill, Font, Border, Side, Alignment
+
+# Foundation tones
+vrd_bg = "F0F5F1"
+vrd_subtle = "E8F0EA"
+vrd_stripe = "EDF2EE"
+
+vrd_primary = "1A2E22"
+vrd_header = "1B4332"
+vrd_text = "1A2E22"
+vrd_rule = "B5C7B9"
+
+# Gold accent spectrum
+vrd_accent_deep = "9E7C20"
+vrd_accent_mid = "C9A84C"
+vrd_accent_wash = "F5F0DC"
+
+ws.sheet_view.showGridLines = False
+
+vh_fill = PatternFill(start_color=vrd_header, end_color=vrd_header, fill_type="solid")
+vh_font = Font(color="F0F5F1", bold=True)
+vh_mark = PatternFill(start_color=vrd_accent_wash, end_color=vrd_accent_wash, fill_type="solid")
+for cell in ws['B2:F2'][0]:
+    cell.fill = vh_fill
+    cell.font = vh_font
+```
+</verdant_palette>
+
+<dusk_palette>
+## Dusk Theme (Technology / Creative / Scientific)
+
+Activate this palette when the task involves: technology metrics, product analytics, engineering reports, creative industry analysis, scientific data, or presentation-grade deliverables that need a modern aesthetic.
+
+### Color Discipline
+
+**Foundation tones:**
+- **Soft lavender (#F7F3FA)** — backgrounds, data regions
+- **Dark grape (#221429)** — body text, primary headers
+- **Iris grey (multiple shades)** — structural elements, borders, secondary labels
+
+**Sole accent: Copper**
+- For emphasis, differentiation, or callouts, use **warm copper** at varying intensity
+- No blue, green, or other hues
+
+### Implementation Palette
+
+```python
+from openpyxl.styles import PatternFill, Font, Border, Side, Alignment
+
+# Foundation tones
+dsk_bg = "F7F3FA"
+dsk_subtle = "F0ECF5"
+dsk_stripe = "F3F0F7"
+
+dsk_primary = "221429"
+dsk_header = "3C1742"
+dsk_text = "221429"
+dsk_rule = "C4B8CE"
+
+# Copper accent spectrum
+dsk_accent_deep = "A0522D"
+dsk_accent_mid = "C4724A"
+dsk_accent_wash = "FAF0EB"
+
+ws.sheet_view.showGridLines = False
+
+dh_fill = PatternFill(start_color=dsk_header, end_color=dsk_header, fill_type="solid")
+dh_font = Font(color="F7F3FA", bold=True)
+dh_mark = PatternFill(start_color=dsk_accent_wash, end_color=dsk_accent_wash, fill_type="solid")
+for cell in ws['B2:F2'][0]:
+    cell.fill = dh_fill
+    cell.font = dh_font
+```
+</dusk_palette>
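
The activation rules for the four themes can be read as a keyword lookup with grayscale as the fallback. The sketch below is a rough illustration, not part of the skill. The function name `pick_theme` and the whole-word matching strategy are assumptions, and the keyword lists are abbreviated from the "Activate this palette when" sentences above.

```python
import re

# Keywords abbreviated from each theme's activation sentence.
THEME_KEYWORDS = {
    "financial": ("equities", "gdp", "compensation", "revenue", "margins",
                  "budgeting", "roi", "government finance"),
    "verdant": ("environmental", "education", "agriculture", "healthcare",
                "sustainability", "life sciences"),
    "dusk": ("technology", "product analytics", "engineering", "creative",
             "scientific"),
}


def pick_theme(task_description):
    """Return the first theme whose keyword appears as a whole word, else grayscale."""
    text = task_description.lower()
    for theme, keywords in THEME_KEYWORDS.items():
        for keyword in keywords:
            if re.search(r"\b" + re.escape(keyword) + r"\b", text):
                return theme
    return "grayscale"
```

Whole-word matching avoids false hits such as `roi` inside an unrelated word. A task that matches nothing falls back to the grayscale default.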
+
+<conditional_rules>
+
+## Conditional Formatting — Apply Proactively
+
+Apply conditional formatting deliberately to improve scanability and analytical readability.
+
+| Content Type | Technique | Sample Code |
+|---|---|---|
+| Raw numbers | **Data Bars** | `DataBarRule(start_type='min', end_type='max', color='5B8DB8', showValue=True)` |
+| Spread/range | **Color Scales** | `ColorScaleRule(start_type='min', start_color='FEFEFE', end_type='max', end_color='5B8DB8')` |
+| Status indicators | **Icon Sets** | `IconSetRule(icon_style='3Arrows', type='percent', values=[0,25,75])` |
+| Boundary triggers | **Cell Highlights** | `CellIsRule(operator='greaterThan', formula=['50000'], fill=accent_fill)` |
+| Top performers | **Rank-based** | `FormulaRule(formula=['RANK(A2,$A$2:$A$100)<=10'], fill=gold_fill)` |
+
+**Available icon styles**: `3Arrows` (directional), `3TrafficLights1` (circle indicators), `3Symbols` (check/exclamation/cross), `5Rating` (rating bars)
+
+**Theme-specific palettes:**
+- Grayscale: Data bars `5B8DB8`, Scale `F2F3F4->ABABAB->2C2C2C`
+- Financial: Positive `81C784`, Negative `E57373`, Neutral `FFD54F`
+- Verdant: Data bars `C9A84C`, Scale `F0F5F1->8BAF7E->1B4332`
+- Dusk: Data bars `C4724A`, Scale `F7F3FA->9E7CAD->3C1742`
+
+```python
+from openpyxl.formatting.rule import DataBarRule, ColorScaleRule, IconSetRule, CellIsRule
+
+# Horizontal bars
+ws.conditional_formatting.add('D3:D200', DataBarRule(start_type='min', end_type='max', color='5B8DB8', showValue=True))
+
+# Tri-color gradient (Financial theme colors shown; substitute the active theme's scale)
+ws.conditional_formatting.add('E3:E200', ColorScaleRule(start_type='min', start_color='E57373', mid_type='percentile', mid_value=50, mid_color='FFD54F', end_type='max', end_color='81C784'))
+
+# Directional arrows
+ws.conditional_formatting.add('F3:F200', IconSetRule(icon_style='3Arrows', type='percent', values=[0, 25, 75], showValue=True))
+```
+
+**Usage tips**: Apply to 2-4 key columns per sheet; maintain consistent color semantics; layer Data Bars + Icons for maximum impact.
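
The theme-specific colors listed above can be kept in one lookup so that data bars and color scales always follow the active theme. The following is a minimal sketch under stated assumptions. The names `THEME_CF` and `add_theme_rules` are hypothetical, as are the cell ranges. The Financial theme is left out because its entry lists positive/negative/neutral colors rather than a data-bar color and scale. The `accent_fill` referenced in the table is defined here with the grayscale `accent_wash` tone as an assumption.

```python
from openpyxl import Workbook
from openpyxl.formatting.rule import CellIsRule, ColorScaleRule, DataBarRule
from openpyxl.styles import PatternFill

# Values copied from the theme-specific palette list.
THEME_CF = {
    "grayscale": {"bar": "5B8DB8", "scale": ("F2F3F4", "ABABAB", "2C2C2C")},
    "verdant": {"bar": "C9A84C", "scale": ("F0F5F1", "8BAF7E", "1B4332")},
    "dusk": {"bar": "C4724A", "scale": ("F7F3FA", "9E7CAD", "3C1742")},
}


def add_theme_rules(ws, theme, bar_range, scale_range):
    """Attach a data bar and a three-color scale using the theme's colors."""
    spec = THEME_CF[theme]
    low, mid, high = spec["scale"]
    ws.conditional_formatting.add(
        bar_range,
        DataBarRule(start_type="min", end_type="max", color=spec["bar"], showValue=True),
    )
    ws.conditional_formatting.add(
        scale_range,
        ColorScaleRule(start_type="min", start_color=low,
                       mid_type="percentile", mid_value=50, mid_color=mid,
                       end_type="max", end_color=high),
    )


wb = Workbook()
ws = wb.active
add_theme_rules(ws, "grayscale", "D3:D200", "E3:E200")

# Threshold highlight from the table, with accent_fill defined explicitly.
accent_fill = PatternFill(start_color="E3EDF7", end_color="E3EDF7", fill_type="solid")
ws.conditional_formatting.add(
    "C3:C200", CellIsRule(operator="greaterThan", formula=["50000"], fill=accent_fill)
)
```

Three rules on three columns stays inside the 2-4 column guideline in the usage tips.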
+
+</conditional_rules>
+
+<cover_layout>
+
+**A cover sheet is mandatory as the very first worksheet in every deliverable.**
+
+## Layout Specification
+
+| Rows | Purpose | Formatting |
+|------|---------|------------|
+| 3-4 | **Document title** | 18-20pt, bold, center-aligned |
+| 6 | Tagline or scope description | 12pt, grey text |
+| 8-16 | **Headline metrics** | Tabular layout with key figures highlighted |
+| 18-21 | **Worksheet directory** | Sheet names mapped to brief descriptions |
+| 23+ | Disclaimers, usage notes | Small font, grey |
+
+## Required Elements
+
+**1. Document title** — clear, descriptive name for the workbook
+
+**2. Headline metrics** — 3-6 most significant numbers or findings
+
+**3. Worksheet directory** — navigation aid:
+```
+| Sheet Name | Description |
+|------------|-------------|
+| Raw Data | Original dataset (100 rows) |
+| Analysis | Sales breakdown by region |
+| Pivot Summary | Interactive pivot analysis |
+```
+
+**4. PivotTable notice** (required when the workbook includes PivotTables):
+```
+After opening, update the PivotTable cache:
+  * On Windows: select any cell inside the PivotTable, press Alt+F5
+  * On macOS: go to the PivotTable Analyze ribbon, click Refresh All
+  * Shortcut for both platforms: Ctrl+Alt+F5
+```
+
+## Cover Page Visual Standards
+
+- **Background**: White or light grey (#F2F3F4)
+- **Title row height**: 30-40pt for prominence
+- **No gridlines**: Suppress gridlines on cover for a clean presentation
+- **Column span**: Merge cells A-G for the title block
+- **Color scheme**: Match the workbook's chosen theme (grayscale, financial, verdant, or dusk)
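
The layout table and visual standards can be sketched as a small openpyxl helper. This is an illustration under stated assumptions, not the skill's own implementation. The function name `build_cover`, the sheet name `Cover`, the grey hex value, and the sample strings are all hypothetical. Only the row positions, font sizes, A-G merge, and hidden gridlines come from the specification above.

```python
from openpyxl import Workbook
from openpyxl.styles import Alignment, Font

GREY_TEXT = "7F7F7F"  # assumption: the spec only says "grey text"


def build_cover(wb, title, tagline, metrics, directory):
    """Insert a cover sheet as the first worksheet and fill the layout rows."""
    cover = wb.create_sheet("Cover", 0)
    cover.sheet_view.showGridLines = False

    # Rows 3-4: title block merged across columns A-G.
    cover.merge_cells("A3:G4")
    cover["A3"] = title
    cover["A3"].font = Font(size=20, bold=True)
    cover["A3"].alignment = Alignment(horizontal="center", vertical="center")
    cover.row_dimensions[3].height = 36

    # Row 6: tagline in 12pt grey.
    cover["A6"] = tagline
    cover["A6"].font = Font(size=12, color=GREY_TEXT)

    # Rows 8-16: headline metrics (label in A, bold value or formula in B).
    for offset, (label, value) in enumerate(metrics[:9]):
        cover.cell(row=8 + offset, column=1, value=label)
        cover.cell(row=8 + offset, column=2, value=value).font = Font(bold=True)

    # Rows 18-21: worksheet directory (sheet name in A, description in B).
    for offset, (sheet_name, description) in enumerate(directory[:4]):
        cover.cell(row=18 + offset, column=1, value=sheet_name)
        cover.cell(row=18 + offset, column=2, value=description)

    return cover


wb = Workbook()
wb.active.title = "Raw Data"
cover = build_cover(
    wb,
    title="Sample Workbook",
    tagline="Scope description goes here",
    metrics=[("Data rows", "=COUNTA('Raw Data'!A:A)-1")],
    directory=[("Raw Data", "Original dataset")],
)
```

Passing index `0` to `create_sheet` places the cover ahead of any sheets that already exist, which satisfies the first-worksheet requirement even when the cover is built last.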
+
+## Gridline Note
+Always keep the cover sheet gridlines hidden.
+
+</cover_layout>