LocalAgent/README.md
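Mimikko-zeus 68f4f01cd7 feat: enhance requirement clarification and task management
Updated .env.example with a new chat model setting to improve conversation handling.
Expanded README.md to reflect new features, including requirement clarification, code reuse, and automatic retry.
Refactored agent.py to support multi-model interaction and added guided handling for tasks that cannot be executed locally.
Improved SandboxRunner with a task-execution success check and a workspace cleanup feature.

Extended HistoryManager to support task summary generation and batch deletion of records.
Refined UI components in chat_view.py and history_view.py to improve the user experience, including Markdown rendering and task management options.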
2026-01-07 12:35:27 +08:00


# LocalAgent - Windows Local AI Execution Assistant
A Windows-based local AI assistant that can understand natural language commands and execute file processing tasks safely in a sandboxed environment.
## Features
- **Intent Recognition**: Automatically distinguishes between chat conversations and execution tasks
- **Requirement Clarification**: Interactive Q&A to clarify vague requirements before code generation
- **Code Generation**: Generates Python code based on structured requirements
- **Safety Checks**: Multi-layer security with static analysis and LLM review
- **Sandbox Execution**: Runs generated code in an isolated environment
- **Task History**: Records all executed tasks and supports selective deletion
- **Streaming Responses**: Real-time display of LLM responses
- **Settings UI**: Easy configuration of API and models
- **Code Reuse**: Automatically finds and reuses successful code for similar tasks
- **Auto Retry**: AI-powered code fixing for failed tasks
- **Multi-Model Support**: Different models for intent recognition, chat, and code generation
## Project Structure
```
LocalAgent/
├── app/ # Main application
│ └── agent.py # Core application class
├── llm/ # LLM integration
│ ├── client.py # API client with retry support
│ └── prompts.py # Prompt templates
├── intent/ # Intent classification
│ ├── classifier.py # Intent classifier
│ └── labels.py # Intent labels
├── safety/ # Security checks
│ ├── rule_checker.py # Static rule checker
│ └── llm_reviewer.py # LLM-based code review
├── executor/ # Code execution
│ └── sandbox_runner.py # Sandbox executor
├── history/ # Task history
│ └── manager.py # History manager
├── ui/ # User interface
│ ├── chat_view.py # Chat interface
│ ├── clarify_view.py # Requirement clarification view
│ ├── task_guide_view.py # Task confirmation view
│ ├── history_view.py # History view with Markdown support
│ └── settings_view.py # Settings configuration view
├── tests/ # Unit tests
├── workspace/ # Working directory (auto-created)
│ ├── input/ # Input files
│ ├── output/ # Output files
│ ├── codes/ # Generated code
│ └── logs/ # Execution logs
├── main.py # Entry point
├── requirements.txt # Dependencies
└── .env.example # Configuration template
```
## Installation
### Prerequisites
- Python 3.10+
- Windows OS
- SiliconFlow API Key ([Get one here](https://siliconflow.cn))
### Setup
1. **Clone the repository**
```bash
git clone <repository-url>
cd LocalAgent
```
2. **Create a virtual environment** (Anaconda recommended)
```bash
conda create -n localagent python=3.10
conda activate localagent
```
3. **Install dependencies**
```bash
pip install -r requirements.txt
```
4. **Configure environment**
```bash
cp .env.example .env
# In cmd.exe, use instead: copy .env.example .env
# Edit .env and add your API key
```
5. **Run the application**
```bash
python main.py
```
## Configuration
Edit the `.env` file with your settings (or use the Settings UI in the app):
```env
# SiliconFlow API Configuration
LLM_API_URL=https://api.siliconflow.cn/v1/chat/completions
LLM_API_KEY=your_api_key_here
# Model Configuration
# Intent recognition model (small model recommended for speed)
INTENT_MODEL_NAME=Qwen/Qwen2.5-7B-Instruct
# Chat model (medium model recommended for conversation)
CHAT_MODEL_NAME=Qwen/Qwen2.5-32B-Instruct
# Code generation model (large model recommended for quality)
GENERATION_MODEL_NAME=Qwen/Qwen2.5-72B-Instruct
```
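As a rough illustration of how these settings might be consumed, the sketch below reads the five variables from a mapping (the process environment by default) and picks a model per role. `Settings`, `load_settings`, and `model_for` are hypothetical names, not taken from the LocalAgent codebase, and the sketch assumes the `.env` values have already been loaded into the environment.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    api_url: str
    api_key: str
    intent_model: str
    chat_model: str
    generation_model: str


def load_settings(env=None):
    """Build Settings from a mapping (defaults to os.environ).

    Hypothetical helper: raises KeyError if a variable is missing.
    """
    env = os.environ if env is None else env
    return Settings(
        api_url=env["LLM_API_URL"],
        api_key=env["LLM_API_KEY"],
        intent_model=env["INTENT_MODEL_NAME"],
        chat_model=env["CHAT_MODEL_NAME"],
        generation_model=env["GENERATION_MODEL_NAME"],
    )


def model_for(settings, role):
    """Return the model name for a role: 'intent', 'chat', or 'generation'."""
    return {
        "intent": settings.intent_model,
        "chat": settings.chat_model,
        "generation": settings.generation_model,
    }[role]
```

Failing fast on a missing variable is one possible design choice; falling back to defaults would be equally reasonable.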
## Usage
### Chat Mode
Type a question or start a conversation:
- "What is Python?"
- "Explain machine learning"
### Execution Mode
Describe file processing tasks:
- "Copy all files from input to output"
- "Convert all PNG images to JPG format"
- "Rename files with today's date prefix"
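For a sense of what the generated code for the first request might look like, here is a minimal sketch. It is illustrative only: the actual output depends on the model, and `copy_all_files` is a hypothetical function name.

```python
import shutil
from pathlib import Path


def copy_all_files(input_dir, output_dir):
    """Copy every regular file in input_dir (non-recursive) to output_dir.

    Returns the sorted list of copied file names.
    """
    src = Path(input_dir)
    dst = Path(output_dir)
    dst.mkdir(parents=True, exist_ok=True)
    copied = []
    for path in sorted(src.iterdir()):
        if path.is_file():
            # copy2 also preserves timestamps where the platform allows it
            shutil.copy2(path, dst / path.name)
            copied.append(path.name)
    return copied
```

A call such as `copy_all_files("workspace/input", "workspace/output")` would match the directory layout described in this README.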
### Workflow
1. Place input files in `workspace/input/`
2. Describe your task in the chat
3. **If the requirement is vague**, the system will ask clarifying questions:
   - Radio buttons for single-choice options (e.g., watermark type)
   - Checkboxes for multi-choice options (e.g., watermark positions)
   - Input fields for custom values (e.g., watermark text, opacity)
4. Review the execution plan and generated code
5. Click "Execute" to run
6. Find results in `workspace/output/`
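The "Execute" step can be pictured as launching the generated script in a child process. The sketch below assumes a timeout and a working directory set to the workspace; `run_generated_script` is a hypothetical name and this is not the project's `SandboxRunner`. It does not provide real isolation on its own; it only separates the script from the host process.

```python
import subprocess
import sys
from pathlib import Path


def run_generated_script(script_path, workspace, timeout_s=60):
    """Run script_path in a child Python process with cwd=workspace.

    Returns (succeeded, stdout, stderr). A timeout counts as a failure.
    """
    try:
        proc = subprocess.run(
            # -I starts Python in isolated mode (ignores PYTHON* variables
            # and the user site-packages directory)
            [sys.executable, "-I", str(Path(script_path).resolve())],
            cwd=str(workspace),
            capture_output=True,
            text=True,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        return False, "", f"timed out after {timeout_s}s"
    return proc.returncode == 0, proc.stdout, proc.stderr
```

Treating a non-zero exit code as failure is the simplest possible success check; a stricter check might also verify that the expected files appeared in `workspace/output/`.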
### Requirement Clarification Example
When you input a vague request like "Add watermark to images", the system will:
1. **Check completeness** - Detect missing information
2. **Ask questions** - Present interactive options:
   - Watermark type: Text / Image (radio)
   - Position: Top-left / Top-right / Bottom-left / Bottom-right / Center (checkbox)
   - Text content: [input field]
   - Opacity: [input field with default 50%]
3. **Structure requirement** - Convert answers into a complete specification
4. **Generate code** - Create code based on the structured requirement
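One way to model the watermark example is a list of typed questions whose answers are merged into a specification. The sketch below is an assumption about how this could be structured; `Question`, `WATERMARK_QUESTIONS`, and `structure_requirement` are hypothetical names, not LocalAgent's actual classes.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    key: str
    prompt: str
    kind: str  # "radio", "checkbox", or "input"
    options: list = field(default_factory=list)
    default: object = None


WATERMARK_QUESTIONS = [
    Question("type", "Watermark type", "radio", ["Text", "Image"]),
    Question("positions", "Position", "checkbox",
             ["Top-left", "Top-right", "Bottom-left", "Bottom-right", "Center"]),
    Question("text", "Text content", "input"),
    Question("opacity", "Opacity", "input", default="50%"),
]


def structure_requirement(task, questions, answers):
    """Merge answers with defaults; report which questions remain unanswered."""
    spec = {"task": task}
    missing = []
    for q in questions:
        value = answers.get(q.key, q.default)
        if value in (None, "", []):
            missing.append(q.key)
            continue
        if q.kind == "radio" and value not in q.options:
            raise ValueError(f"{q.key}: {value!r} is not one of {q.options}")
        if q.kind == "checkbox" and not set(value) <= set(q.options):
            raise ValueError(f"{q.key}: unknown option in {value!r}")
        spec[q.key] = value
    return spec, missing
```

With this shape, a non-empty `missing` list would signal that another round of questions is needed before code generation.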
## Security
LocalAgent implements multiple security layers:
1. **Hard Rules** - Blocks dangerous operations:
   - Network and process-spawning modules (socket, subprocess)
   - Code execution (eval, exec)
   - System commands (os.system, os.popen)
2. **Soft Rules** - Warns about sensitive operations:
   - File deletion
   - Network requests (requests, urllib)
3. **LLM Review** - Semantic analysis of generated code
4. **Sandbox Execution** - Isolated subprocess with limited permissions
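The hard and soft rules could be approximated by walking the abstract syntax tree of the generated code. The sketch below is a simplified assumption, not the project's `rule_checker.py`: the rule sets mirror the lists above, the specific file-deletion calls (`os.remove`, `os.unlink`, `shutil.rmtree`) are illustrative guesses, and `check_code` is a hypothetical name.

```python
import ast

# Assumed rule sets, mirroring the lists in this section.
HARD_MODULES = {"socket", "subprocess"}
HARD_CALLS = {"eval", "exec", "os.system", "os.popen"}
SOFT_MODULES = {"requests", "urllib"}
SOFT_CALLS = {"os.remove", "os.unlink", "shutil.rmtree"}  # illustrative guesses


def _dotted_name(node):
    """Return 'a.b' for Name/Attribute chains, else None."""
    if isinstance(node, ast.Name):
        return node.id
    if isinstance(node, ast.Attribute):
        base = _dotted_name(node.value)
        return f"{base}.{node.attr}" if base else None
    return None


def check_code(source):
    """Return (violations, warnings) found by walking the AST of source."""
    violations, warnings = [], []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules = [alias.name.split(".")[0] for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            modules = [(node.module or "").split(".")[0]]
        else:
            modules = []
        for mod in modules:
            if mod in HARD_MODULES:
                violations.append(f"import {mod}")
            elif mod in SOFT_MODULES:
                warnings.append(f"import {mod}")
        if isinstance(node, ast.Call):
            name = _dotted_name(node.func)
            if name in HARD_CALLS:
                violations.append(f"call {name}")
            elif name in SOFT_CALLS:
                warnings.append(f"call {name}")
    return violations, warnings
```

A name-based check like this misses aliased imports (for example `import os as o; o.system(...)`), which is one reason a static pass would be paired with the LLM review and sandbox layers.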
## Testing
Run unit tests:
```bash
python -m pytest tests/ -v
```
## Supported File Operations
The generated code can use these libraries:
**Standard Library:**
- os, sys, pathlib - Path operations
- shutil - File copy/move
- json, csv - Data formats
- zipfile, tarfile - Compression
- And more...
**Third-party Libraries:**
- Pillow - Image processing
- openpyxl - Excel files
- python-docx - Word documents
- PyPDF2 - PDF files
- chardet - Encoding detection
## License
MIT License
## Contributing
Contributions are welcome! Please feel free to submit issues and pull requests.