commit 3637f9d9df1bdfa960c17ce273f12229ee6fd437
Author: wangqifan <wangqifan@zhiyun.com>
Date:   Fri Dec 19 16:21:52 2025 +0800

    first commit

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..e218cdf
--- /dev/null
+++ b/README.md
@@ -0,0 +1,116 @@
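+# autodemo-win (teach-by-demonstration automation prototype)
+
+An MIT-licensed Python 3.10+ prototype covering the end-to-end flow of recording → event/multimodal storage → LLM-induced DSL → UI automation execution, targeting Windows 10/11.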
+
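+## Feature overview
+- Recording layer: pynput captures mouse clicks (moves are not recorded), buffered keyboard text (flushed as a text_input event after 800 ms without input), and window focus changes; UIA hit-test and foreground-window information; a shallow control-tree summary (depth <= 3).
+- Multimodal capture: screen recording to video.mp4 via ffmpeg (preferred) or mss+opencv; screenshots of key events in frames/; crops around the mouse and the hit control in frames_crops/; UIA selector plus tree snapshots in ui_snapshots/.
+- Data layer: a unified pydantic schema; each record in events.jsonl carries a high-precision ts, the video offset, window/mouse/UIA information, frame paths, and so on; manifest.json records resolution, fps, start/end times, and directories.
+- Induction layer: the `LLMClient` abstraction supports a text-only mode (default `DummyLLM`) and a multimodal mode (key frames attached as base64 when `OPENAI_API_KEY` is configured), and outputs a DSL that strictly conforms to `dsl_schema.json`.
+- DSL: YAML/JSON, supporting `steps/params/assertions/retry_policy/waits` plus `if/else` and `for_each`.
+- Execution layer: built on `uiautomation`; supports click/type/set_value/assert_exists/wait_for, waiting with retries, a dry-run mode that prints actions, and a window-title whitelist safeguard.
+- CLI: three subcommands, `record` / `infer` / `run`; pytest covers minimal validation.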
+
+## Directory structure
+```
+requirements.txt
+autodemo/
+  __init__.py
+  __main__.py
+  schema.py
+  screen_recorder.py
+  recorder.py
+  llm.py
+  dsl.py
+  executor.py
+  cli.py
+tests/
+  test_schema.py
+  test_dummy_llm.py
+  test_executor_dry.py
+```
+
+## Installation
+```bash
+pip install -r requirements.txt
+```
+
+## Quick start
+1) Record a demonstration (press F9 to stop):
+```bash
+python -m autodemo record --out sessions --hotkey F9 --fps 12 --screen 0
+```
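+This creates `sessions/<session_id>/`:
+- `manifest.json`: resolution, fps, start/end times, and the subdirectories.
+- `video.mp4`: screen recording of the whole session.
+- `events.jsonl`: one event per line (ts/event_type/window/mouse/text/uia/frame_paths/ui_snapshot/video_time_offset_ms).
+- `frames/`: screenshots of key events.
+- `frames_crops/`: crops of the area around the mouse and of the hit control (when available).
+- `ui_snapshots/`: UIA selectors and shallow control-tree snapshots.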
+
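+2) Induce the DSL (text-only or multimodal; multimodal requires the `OPENAI_API_KEY` environment variable; the defaults are `OPENAI_BASE_URL=https://api.wgetai.com/v1` and `model=gpt-5.1-high`):
+```bash
+# Example: induce directly from an existing recording directory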
+python -m autodemo infer --session-dir "E:\project\audoWin\sessions\26acb7e8-2317-4a44-8094-20fef3312d91" --out dsl.json
+```
+Optional arguments:
+- `--api-key` / `OPENAI_API_KEY`: LLM key for multimodal mode
+- `--base-url` / `OPENAI_BASE_URL`: proxy/relay endpoint (default https://api.wgetai.com/v1)
+- `--model`: model name (default gpt-5.1-high)
+
+3) Execute the DSL (protected by a window-title whitelist; running with dry-run first is recommended):
+```bash
+python -m autodemo run --dsl flow.yaml --allow-title "记事本|Notepad" --dry-run
+```
+Remove `--dry-run` to execute for real.
+
+### Parameter override example
+```bash
+python -m autodemo run --dsl flow.yaml --allow-title "记事本|Notepad" --params "{\"text\": \"hello\"}"
+```
+
+## DSL field example (YAML)
+```yaml
+params:
+  text: "sample parameter"
+steps:
+  - action: click
+    target: {AutomationId: "15", ControlType: "Edit"}
+  - action: type
+    target: {AutomationId: "15"}
+    text: "{{text}}"
+  - if_condition: need_confirm
+    steps:
+      - action: click
+        target: {Name: "确定"}  # the "OK" button on a Chinese-locale UI
+    else_steps:
+      - action: click
+        target: {Name: "取消"}  # the "Cancel" button on a Chinese-locale UI
+  - for_each: items
+    steps:
+      - action: type
+        target: {ClassName: "Edit"}
+        text: "{{item}}"
+assertions:
+  - "input box is not empty"
+retry_policy: {max_attempts: 2, interval: 1.0}
+waits: {appear: 5.0, disappear: 5.0}
+```
+
+## Tests
+```bash
+pytest -q
+```
+
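+## Components
+- `recorder.py`: pynput event capture + UIA hit-test/tree snapshots + mss/ffmpeg screen recording + key-frame screenshots and crops + persistence of events/manifest.
+- `screen_recorder.py`: screen-recording wrapper; prefers ffmpeg (gdigrab) and falls back to mss+opencv.
+- `llm.py`: the `LLMClient` interface and `DummyLLM`, a simple rule-based generator; `render_prompt` can be used to plug in a real LLM.
+- `dsl.py`: loading and saving DSL YAML.
+- `executor.py`: the uiautomation executor, with waits, retries, dry-run, and the safety whitelist.
+- `cli.py`: command-line entry point with the `record`/`infer`/`run` subcommands.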
+
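+## Known limitations
+- Recording relies on global hooks (pynput); if administrator privileges are required, you need to arrange that yourself.
+- DummyLLM is for demonstration only; real induction requires connecting an external LLM.
+- The executor locates controls by shallow condition matching; complex UIs need extended matching strategies or richer control paths.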