doc2ai is a Claude Code plugin for converting office documents into AI-friendly text formats. It focuses on preserving source structure while removing format noise, so downstream AI agents and scripts can inspect requirements, designs, spreadsheets, and other enterprise documents more reliably.
claude plugin marketplace add https://github.com/IronRookieCoder/doc2ai
claude plugin install doc2ai
/doc2ai:docs2md input.docx
/doc2ai:docs2md input.doc -o md/
/doc2ai:docs2md docs/ --report
The docs2md skill converts .doc and .docx files into structured Markdown. It uses a two-stage pipeline:
doc/docx
-> script conversion and cleanup
-> targeted AI formatting repair
-> final Markdown
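The script stage of the pipeline above can be pictured with a minimal sketch. It assumes the stage shells out to Pandoc with GitHub-flavored Markdown output and then strips empty anchor spans. The function names, the `gfm` target, and the anchor pattern are illustrative assumptions, not the plugin's actual code, and the AI repair stage is not modeled.

```python
import re
import subprocess
from pathlib import Path

# Assumed pattern: empty anchor spans that Pandoc can emit for Word bookmarks.
EMPTY_ANCHOR = re.compile(r'<span id="[^"]*"(?: class="anchor")?></span>')


def build_pandoc_command(source: Path, target: Path) -> list[str]:
    """Build a Pandoc invocation for docx -> GitHub-flavored Markdown."""
    return ["pandoc", str(source), "-f", "docx", "-t", "gfm", "-o", str(target)]


def strip_empty_anchors(markdown: str) -> str:
    """Remove empty anchor spans, leaving all other text untouched."""
    return EMPTY_ANCHOR.sub("", markdown)


def convert(source: Path, target: Path) -> None:
    """Run Pandoc, then apply the scripted cleanup (requires Pandoc in PATH)."""
    subprocess.run(build_pandoc_command(source, target), check=True)
    cleaned = strip_empty_anchors(target.read_text(encoding="utf-8"))
    target.write_text(cleaned, encoding="utf-8")
```

Keeping the command builder and the cleanup as separate pure functions makes each step easy to check without Pandoc installed; only `convert` touches the file system.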
/doc2ai:xlsx2csv report.xlsx
/doc2ai:xlsx2csv data/ -o csv/
The xlsx2csv skill converts .xlsx files into an index CSV plus one CSV file per worksheet. It preserves the original grid layout and avoids semantic normalization.
| Skill | Command | Description |
|---|---|---|
| docs2md | /doc2ai:docs2md | Convert .doc / .docx documents into structured Markdown |
| xlsx2csv | /doc2ai:xlsx2csv | Convert .xlsx workbooks into AI-friendly CSV collections |
docs2md:
- Pandoc must be installed and available in PATH
- Python 3
- pyyaml
- WPS or a compatible local conversion environment is recommended for legacy .doc files

xlsx2csv:
- Python 3
- pandas
- python-calamine
- pyyaml
Install missing Python dependencies when needed:
pip install pandas python-calamine pyyaml
md/
└── document.md
For .doc inputs, an intermediate .docx file may be generated and retained beside the original file.
When --report is used, conversion reports are written under:
md/
└── reports/
    └── document.json
csv/
└── workbook/
    ├── workbook.csv
    ├── Sheet1.csv
    └── Sheet2.csv
The workbook-level CSV is an index file that records worksheet order, worksheet name, exported file name, and used range.
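As a rough illustration of such an index file, the sketch below writes one row per worksheet using only the standard library. The column names and the `<sheet name>.csv` naming rule are assumptions made for the example; the plugin's real header and file naming may differ.

```python
import csv
import io

# Hypothetical column names; the plugin's real index header may differ.
INDEX_COLUMNS = ["order", "sheet_name", "file_name", "used_range"]


def write_index(sheets, out):
    """Write one index row per worksheet, in workbook order."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(INDEX_COLUMNS)
    for order, (name, used_range) in enumerate(sheets, start=1):
        # Assumed naming rule: each sheet is exported as "<sheet name>.csv".
        writer.writerow([order, name, f"{name}.csv", used_range])


buffer = io.StringIO()
write_index([("Sheet1", "A1:D20"), ("Sheet2", "B2:C5")], buffer)
index_text = buffer.getvalue()
```

An agent can read this index first and then open only the sheet files it needs.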
- Preserve source content and avoid adding conclusions not present in the original file
- Prefer structural cleanup over visual layout restoration
- Keep original spreadsheet grids, including blank cells, blank rows, and blank columns
- Do not infer spreadsheet headers or normalize rows
- Remove conversion noise such as empty anchors, image remnants, Pandoc annotations, and invalid table formatting when clearly safe
- Keep suspicious content for human review instead of deleting it by default
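The grid-preservation principles above can be sketched as follows. This is a minimal illustration under stated assumptions, not the bundled script: `grid_to_csv` is a hypothetical helper, and `None` is assumed to mark a blank cell.

```python
import csv
import io


def grid_to_csv(grid):
    """Serialize a worksheet grid as-is: no header inference, no row dropping."""
    width = max((len(row) for row in grid), default=0)
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in grid:
        # None marks a blank cell; pad short rows so every line has the same width.
        cells = ["" if cell is None else cell for cell in row]
        writer.writerow(cells + [""] * (width - len(cells)))
    return out.getvalue()


grid = [
    [None, "Q1", "Q2"],   # leading blank cell is kept
    [],                   # blank row is kept
    ["Total", None, 42],  # blank cell in the middle is kept
]
csv_text = grid_to_csv(grid)
```

Because no row is treated as a header and nothing is dropped, cell positions in the CSV still line up with the coordinates of the original sheet.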
.claude-plugin/
├── plugin.json
└── marketplace.json
skills/
├── docs2md/
│   ├── SKILL.md
│   ├── config.yaml
│   ├── scripts/
│   └── references/
└── xlsx2csv/
    ├── SKILL.md
    ├── config.yaml
    └── scripts/
- Directory input is supported for both skills.
- Batch conversion preserves relative subdirectories to avoid filename collisions.
- Office temporary files starting with ~$ are skipped.
- Chinese paths and filenames are supported by the bundled scripts.
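The batch behavior described in these notes might look like the following sketch. `plan_batch` is a hypothetical helper, and the `.md` output suffix and default input suffixes are assumptions for the docs2md case; the bundled scripts may be organized differently.

```python
from pathlib import Path


def plan_batch(input_dir: Path, output_dir: Path, suffixes=(".doc", ".docx")):
    """Map each source file to an output path that mirrors its subdirectory."""
    plan = []
    for source in sorted(input_dir.rglob("*")):
        if not source.is_file() or source.suffix.lower() not in suffixes:
            continue
        if source.name.startswith("~$"):
            continue  # Office temporary file, skipped
        # Keeping the relative path means a/spec.docx and b/spec.docx
        # cannot overwrite each other in the output directory.
        relative = source.relative_to(input_dir)
        plan.append((source, output_dir / relative.with_suffix(".md")))
    return plan
```

Since `pathlib` handles Unicode paths natively, the same loop works unchanged for Chinese directory and file names.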