# AutoDev — Autonomous CLI Development Studio
AutoDev reads a project description and reference manuals, then autonomously plans, implements, compiles, tests, and debugs complete software projects using a local LLM.
No cloud APIs. No subscriptions. Runs entirely on your machine with [Ollama](https://ollama.com) or [vLLM](https://github.com/vllm-project/vllm).
## How It Works
```
description.txt + manuals/ → LLM plans the project → writes code → compiles → tests → debugs → delivers
```
1. You write a `description.txt` explaining what you want built
2. You put reference documentation in a `manuals/` folder
3. You run `autodev`
4. AutoDev reads everything, creates a development plan, and executes it step by step
5. If something fails to compile or run, it debugs itself — analyzing errors, generating fixes, and retrying
6. When done, you have a working project
You don't interact with it. You watch it work.
## Quick Start
```bash
# 1. Make sure Ollama is running with a model loaded (this opens an interactive session, so use a separate terminal)
ollama run qwen2.5-coder:14b
# 2. Set up your project folder
mkdir my-project && cd my-project
mkdir manuals
# 3. Write what you want
cat > description.txt << 'EOF'
Language: Python
Build a CLI tool that converts CSV files to JSON.
It should accept an input file and output file as arguments.
Handle errors gracefully if the input file doesn't exist.
EOF
# 4. Add any reference docs as plain text (API docs, specs, examples); PDFs are not parsed
cp csv-format-spec.txt manuals/
# 5. Run AutoDev
autodev
```
## Installation
```bash
# Clone the repository
git clone https://github.com/your-username/autodev.git
cd autodev
# Symlink into a directory that is on your PATH
ln -s "$(pwd)/autodev/autodev-cli" ~/.local/bin/autodev
# or
ln -s "$(pwd)/autodev/autodev-cli" ~/bin/autodev
# Alternatively, run directly
python -m autodev --workdir /path/to/project
```
### Requirements
- Python 3.10+
- [Ollama](https://ollama.com) or [vLLM](https://github.com/vllm-project/vllm) running locally or on your network
- No pip dependencies — uses only the Python standard library
## Configuration
Edit `autodev/config.py` to set your LLM backend:
```python
LLM_BACKEND = "ollama" # "ollama" or "vllm"
OLLAMA_URL = "http://localhost:11434" # your Ollama instance
MODEL_NAME = "qwen2.5-coder:14b" # any model Ollama serves
```
You can also override at runtime:
```bash
autodev --backend ollama --model gemma4:e4b
```
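For orientation, the settings above map onto Ollama's HTTP API roughly as follows. This is a minimal sketch using only the standard library, not AutoDev's actual `llm.py`; the helper names `build_generate_request` and `generate` are hypothetical, and streaming and retry are left out.

```python
import json
import urllib.request

# Values mirror the config example above.
OLLAMA_URL = "http://localhost:11434"
MODEL_NAME = "qwen2.5-coder:14b"


def build_generate_request(prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    body = json.dumps(
        {"model": MODEL_NAME, "prompt": prompt, "stream": False}
    ).encode("utf-8")
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/generate",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, timeout: float = 120.0) -> str:
    """Send the prompt and return the generated text (requires a running Ollama)."""
    with urllib.request.urlopen(build_generate_request(prompt), timeout=timeout) as resp:
        return json.loads(resp.read())["response"]
```

Calling `generate("Write a hello world in C")` only works when an Ollama instance is reachable at `OLLAMA_URL` and the model has been pulled.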
### Tested Models
All models were tested against the same task: plan, implement, compile, test, and debug a C "hello world" project with a Makefile. Tested on Ollama with GPU offload.
| Model | Size | Result | Speed | Notes |
|-------|------|--------|-------|-------|
| `gemma4:e4b` | ~12B | ✅ Pass | Fast | Clean run, no debug needed. Best balance of speed and quality. **Recommended.** |
| `gemma3:27b` | 27B | ✅ Pass | Slow | Works well but slow. Needed sandbox fixes during early testing. Good for complex projects. |
| `gemma4:e2b` | ~8B | ❌ Fail | Very fast | Plans OK, but setup created a directory that blocked the executable name. Could not self-correct — repeated the same failed approach 10 times. |
| `gemma3:4b` | 4B | ❌ Fail | Very fast | Steps 1-4 passed, but debugger hallucinated a nonexistent `hello.c` file and could not reason about what files actually exist on disk. |
| `qwen2.5-coder:7b` | 7B | ❌ Fail | Fast | Classified "create main.c" as setup instead of implement, so the file was never generated. Debugger could not write a valid Makefile after 10 attempts. |
**Takeaway:** In these tests, models below ~12B parameters could plan and generate simple code, but they could not self-correct when things went wrong. They repeated failed approaches, hallucinated files, and produced broken build scripts. **14B+ recommended for autonomous development**; `gemma4:e4b` (~12B) was the smallest model that passed.
## Project Structure
Your project folder needs:
```
my-project/
├── description.txt # Required — what to build
└── manuals/ # Required — reference docs (use -nomanual to skip)
    ├── api-spec.md
    └── protocol.txt
```
AutoDev creates these files as it works:
```
my-project/
├── description.txt
├── manuals/
├── plan.json # The development plan (human-readable)
├── worklog.json # Every action logged with timestamps
├── dependency.txt # External dependencies (compilers, libraries)
├── .autodev_state.json
└── ... your project files ...
```
## Features
### Autonomous Development Loop
Reads the description, understands the requirements, creates a structured plan, then executes it: setup → implement → compile → test → debug → finalize. No human input needed during execution.
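The phase ordering above can be pictured as a loop over plan steps. The following is a minimal sketch under assumptions, not AutoDev's actual orchestrator: `Step`, `run_plan`, and the `handlers` mapping are hypothetical names, and a failed step simply stops the run where the real tool would enter its debug loop.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Phase order taken from the description above.
PHASES = ["setup", "implement", "compile", "test", "debug", "finalize"]


@dataclass
class Step:
    phase: str            # one of PHASES
    description: str
    done: bool = False


def run_plan(steps: List[Step], handlers: Dict[str, Callable[[Step], bool]]) -> List[str]:
    """Run steps in phase order and return the descriptions of completed steps."""
    completed = []
    # sorted() is stable, so steps within one phase keep their plan order.
    for step in sorted(steps, key=lambda s: PHASES.index(s.phase)):
        if step.done:
            continue                      # already finished in an earlier run
        if not handlers[step.phase](step):
            break                         # stop at the first failing step
        step.done = True
        completed.append(step.description)
    return completed
```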
### Self-Debugging
When compilation or tests fail, AutoDev enters a debug loop:
- Analyzes the error and source code
- Diagnoses root cause (not just symptoms)
- Generates a fix and applies it
- Verifies the fix works
- Rolls back automatically if the fix makes things worse
- Tracks failed approaches to avoid re-applying a fix that has already failed (smaller models may still circle back to the same approach; see Tested Models)
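The debug loop described above can be sketched as follows. This is an illustration under assumptions rather than the code in `debugger.py`: `debug_loop`, `propose_fix`, and `count_errors` are hypothetical names, project files are modeled as an in-memory dict, and "makes things worse" is approximated as "does not reduce the error count".

```python
from typing import Callable, Dict, Optional, Set


def debug_loop(
    files: Dict[str, str],
    propose_fix: Callable[[Dict[str, str], str, Set[str]], Optional[Dict[str, str]]],
    count_errors: Callable[[Dict[str, str]], int],
    max_attempts: int = 10,
) -> Dict[str, str]:
    """Try fixes until the error count reaches zero or attempts run out."""
    tried: Set[str] = set()               # fingerprints of fixes already attempted
    best = dict(files)
    best_errors = count_errors(best)
    for _ in range(max_attempts):
        if best_errors == 0:
            break
        candidate = propose_fix(best, f"{best_errors} error(s)", tried)
        if candidate is None:
            break                         # nothing new to try
        fingerprint = repr(sorted(candidate.items()))
        if fingerprint in tried:
            continue                      # skip a fix that was already attempted
        tried.add(fingerprint)
        errors = count_errors(candidate)  # verify the fix
        if errors < best_errors:
            best, best_errors = candidate, errors
        # Otherwise roll back by keeping `best` unchanged.
    return best
```

Here `propose_fix` stands in for the LLM call and `count_errors` for a compile or test run.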
### Resumable Sessions
Every action is logged to `worklog.json`. If AutoDev is interrupted or fails:
```bash
# Just run it again — it picks up where it left off
autodev
```
It reads the worklog, loads the existing plan, and continues from the last incomplete step.
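Resumption can be illustrated with a small lookup over the two files. The schemas here are assumptions made for the example, since the real formats of `plan.json` and `worklog.json` are not documented in this README: the plan is taken to be `{"steps": [...]}` and each worklog entry to carry a `step` index and a `status` string. `next_incomplete_step` is a hypothetical helper.

```python
import json
from pathlib import Path
from typing import Optional


def next_incomplete_step(plan_path: Path, worklog_path: Path) -> Optional[int]:
    """Return the index of the first plan step with no 'done' entry in the worklog."""
    steps = json.loads(plan_path.read_text())["steps"]    # assumed schema
    done = set()
    if worklog_path.exists():
        for entry in json.loads(worklog_path.read_text()):
            if entry.get("status") == "done":             # assumed field names
                done.add(entry["step"])
    for index in range(len(steps)):
        if index not in done:
            return index
    return None  # every step has finished
```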
### Cycle & Hallucination Detection
Detects when the LLM is stuck in a loop (producing similar outputs repeatedly) and automatically clears stale context to break out.
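One simple way to detect "similar outputs repeatedly" is to compare consecutive model outputs with `difflib`. This is a sketch of that idea, not necessarily the method AutoDev uses; the function name `is_stuck`, the window of 3, and the 0.9 similarity threshold are all assumptions.

```python
from difflib import SequenceMatcher
from typing import List


def is_stuck(outputs: List[str], window: int = 3, threshold: float = 0.9) -> bool:
    """True if each of the last `window` outputs is a near-duplicate of the one before it."""
    if len(outputs) < window:
        return False
    recent = outputs[-window:]
    return all(
        SequenceMatcher(None, recent[i], recent[i + 1]).ratio() >= threshold
        for i in range(window - 1)
    )
```

When `is_stuck` returns `True`, a caller could drop older context before the next prompt.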
### Sandboxed Execution
- All file operations are confined to the working directory
- Shell commands are validated against a whitelist of safe tools (compilers, build tools, standard utilities)
- `sudo` and system-level commands are blocked
- Path traversal outside the working directory is prevented
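The two main checks above, a command whitelist and path confinement, can be sketched as follows. This is a simplified illustration, not the contents of `sandbox.py`: the `ALLOWED_TOOLS` set, the rejected shell metacharacters, and both function names are assumptions.

```python
import shlex
from pathlib import Path

# Hypothetical whitelist; the real list lives in AutoDev's sandbox module.
ALLOWED_TOOLS = {"gcc", "make", "python3", "ls", "cat", "mkdir"}
# Reject shell control characters so one command cannot smuggle in another.
SHELL_METACHARS = set(";&|`$<>")


def is_command_allowed(command: str) -> bool:
    """Allow a command only if its program name is on the whitelist."""
    if any(ch in SHELL_METACHARS for ch in command):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:
        return False                      # unbalanced quotes and similar
    return bool(argv) and argv[0] in ALLOWED_TOOLS


def is_path_confined(workdir: Path, candidate: str) -> bool:
    """True if `candidate`, resolved against workdir, stays inside workdir."""
    root = workdir.resolve()
    target = (root / candidate).resolve()  # collapses ".." and follows symlinks
    return target == root or root in target.parents
```

A command that passes the check is best executed as an argument list (for example `subprocess.run(shlex.split(command))`) rather than through a shell.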
### Language Agnostic
Works with any programming language the LLM knows. Tested with C, Python, and Makefiles. The LLM determines the appropriate build tools, compilers, and project structure.
### Dependency Tracking
All external dependencies (compilers, libraries, tools) are recorded in `dependency.txt` so you know exactly what the project needs.
## CLI Options
```
autodev [options]

Options:
  -nomanual                Skip reading manuals/ directory (for simple tasks)
  -web PORT                Start live web dashboard on PORT (e.g. -web 4500)
  --backend {ollama,vllm}  LLM backend (default: from config)
  --model MODEL            Model name (default: from config)
  --workdir DIR            Working directory (default: current directory)
```
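For readers who want to script around these options, the interface above could be reproduced with `argparse` along these lines. This is a sketch, not the parser in `main.py`; `build_parser` is a hypothetical name, and `None` is used here to mean "fall back to the value in `config.py`".

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts the options listed above."""
    parser = argparse.ArgumentParser(prog="autodev")
    # argparse accepts single-dash long options such as -nomanual and -web.
    parser.add_argument("-nomanual", action="store_true",
                        help="Skip reading manuals/ directory")
    parser.add_argument("-web", metavar="PORT", type=int, default=None,
                        help="Start live web dashboard on PORT")
    parser.add_argument("--backend", choices=["ollama", "vllm"], default=None,
                        help="LLM backend (default: from config)")
    parser.add_argument("--model", default=None,
                        help="Model name (default: from config)")
    parser.add_argument("--workdir", default=".",
                        help="Working directory (default: current directory)")
    return parser
```

For example, `build_parser().parse_args(["-nomanual", "-web", "4500"])` yields `nomanual=True` and `web=4500`.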
### Web Dashboard
Run `autodev -web 4500` and open `http://localhost:4500` in your browser.
The dashboard shows three panels:
- **Plan Progress** — step-by-step checklist with ✓/✗/▸ status and completion counter
- **Project Files** — clickable file tree with live content viewer
- **LLM Activity** — real-time log of all actions and model thinking (newest first)
Updates are pushed live via Server-Sent Events — no page refresh needed.
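On the wire, a Server-Sent Events update is a short text frame terminated by a blank line. The helper below shows how one such frame could be encoded; `format_sse`, the event name, and the payload fields are illustrative assumptions, not AutoDev's actual dashboard protocol.

```python
import json


def format_sse(data: dict, event: str = "message") -> bytes:
    """Encode one SSE frame: an 'event:' line, a 'data:' line, then a blank line."""
    payload = json.dumps(data)  # compact JSON stays on a single line
    return f"event: {event}\ndata: {payload}\n\n".encode("utf-8")
```

A server would write these frames to a response whose `Content-Type` is `text/event-stream`, and the browser would receive them through an `EventSource`.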
### Incremental Updates
If you change `description.txt` and restart AutoDev, it detects the change and re-plans incrementally — telling the LLM what files already exist so it builds on previous work instead of starting over.
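Change detection of this kind is commonly done by hashing the file and comparing against a stored value. The sketch below assumes the hash is kept in `.autodev_state.json` under a `description_sha256` key; that key and the function name are hypothetical, and the real state file may be organized differently.

```python
import hashlib
import json
from pathlib import Path


def description_changed(workdir: Path, state_file: str = ".autodev_state.json") -> bool:
    """Compare description.txt's SHA-256 with the hash saved by the previous run.

    Returns True on the first run as well, since no hash has been stored yet.
    """
    digest = hashlib.sha256((workdir / "description.txt").read_bytes()).hexdigest()
    state_path = workdir / state_file
    state = json.loads(state_path.read_text()) if state_path.exists() else {}
    changed = state.get("description_sha256") != digest   # assumed key name
    state["description_sha256"] = digest
    state_path.write_text(json.dumps(state))              # remember for next run
    return changed
```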
## Architecture
```
autodev/
├── config.py # LLM backend settings, timeouts, expert system prompt
├── llm.py # Ollama + vLLM communication with streaming and retry
├── context.py # Token-aware context window with relevance scoring
├── planner.py # Reads description + manuals, creates development plan
├── executor.py # Code generation, file writing, compilation
├── debugger.py # Error analysis, fix generation, rollback
├── sandbox.py # Whitelist-based command validation, path confinement
├── logger.py # Action logging to console and persistent worklog
├── dependency.py # Dependency tracking
├── resume.py # State persistence and session resumption
├── main.py # CLI orchestrator
└── autodev-cli # Symlink-friendly entry point
```
## How the Description Should Be Written
Be specific. Every sentence is treated as a requirement.
**Good:**
```
Language: C
Build a TCP echo server that listens on port 8080.
It should handle multiple clients using fork().
Include proper signal handling for SIGCHLD to avoid zombies.
Include a Makefile with 'all' and 'clean' targets.
The server should log connections to stderr.
```
**Too vague:**
```
Make a server program.
```
## Limitations
- Quality depends heavily on the model; in the tests above, larger models produced better results
- No interactive mode — you can't guide it mid-run (by design)
- Files in `manuals/` are read as plain text only (no PDF extraction)
- Token counting is estimated, not exact
- The LLM may occasionally produce code that compiles but doesn't meet all requirements
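
To make the token-estimation limitation concrete, here is a sketch of a character-based estimate and a context trim built on it. The ratio of about four characters per token is a common rule of thumb rather than AutoDev's documented value, both function names are hypothetical, and the trim keeps the newest messages instead of applying the relevance scoring that `context.py` is described as using.

```python
import math
from typing import List


def estimate_tokens(text: str, chars_per_token: float = 4.0) -> int:
    """Rough token estimate from character count (not an exact tokenizer)."""
    return math.ceil(len(text) / chars_per_token)


def fit_to_budget(messages: List[str], budget: int) -> List[str]:
    """Keep the most recent messages whose estimated total fits the budget."""
    kept: List[str] = []
    used = 0
    for message in reversed(messages):    # walk from newest to oldest
        cost = estimate_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    return list(reversed(kept))           # restore chronological order
```

Because the estimate can undercount, leaving some headroom below the model's real context limit is a reasonable precaution.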