opencode: move android stuff to android-ui skill
This commit is contained in:
@@ -86,6 +86,77 @@ in
|
||||
xdg.configFile."opencode/plugins/opencode-claude-bridge.js".source =
|
||||
"${opencode-claude-bridge}/lib/opencode-claude-bridge/dist/index.js";
|
||||
|
||||
xdg.configFile."opencode/skills/android-ui.md".text = ''
|
||||
---
|
||||
name: android-ui
|
||||
description: "Android UI automation via ADB - use for any Android device interaction, UI testing, screenshot analysis, element coordinate lookup, and gesture automation."
|
||||
---
|
||||
|
||||
# Android UI Interaction Workflow
|
||||
|
||||
## 1. Taking Screenshots
|
||||
```
|
||||
adb exec-out screencap -p > /tmp/screen.png
|
||||
```
|
||||
Captures the current screen state as a PNG image.
|
||||
|
||||
## 2. Analyzing Screenshots
|
||||
Delegate screenshot analysis to an explore agent rather than analyzing images directly:
|
||||
```
|
||||
mcp_task(subagent_type="explore", prompt="Analyze /tmp/screen.png. What screen is this? What elements are visible?")
|
||||
```
|
||||
The agent describes the UI, identifies elements, and estimates Y coordinates.
|
||||
|
||||
## 3. Getting Precise Element Coordinates
|
||||
UI Automator dump - extracts the full UI hierarchy as XML:
|
||||
```
|
||||
adb shell uiautomator dump /sdcard/ui.xml && adb pull /sdcard/ui.xml /tmp/ui.xml
|
||||
```
|
||||
Then grep for specific elements:
|
||||
```sh
|
||||
# Find by text
|
||||
grep -oP 'text="Login".*?bounds="[^"]*"' /tmp/ui.xml
|
||||
# Find by class
|
||||
grep -oP 'class="android.widget.EditText".*?bounds="[^"]*"' /tmp/ui.xml
|
||||
```
|
||||
Bounds format: `[left,top][right,bottom]` — tap center: `((left+right)/2, (top+bottom)/2)`
|
||||
|
||||
## 4. Tapping Elements
|
||||
```
|
||||
adb shell input tap X Y
|
||||
```
|
||||
Where X, Y are pixel coordinates from the bounds.
|
||||
|
||||
## 5. Text Input
|
||||
```
|
||||
adb shell input text "some_text"
|
||||
```
|
||||
Note: Special characters need escaping (`\!`, `\;`, etc.)
|
||||
|
||||
## 6. Other Gestures
|
||||
```sh
|
||||
# Swipe/scroll
|
||||
adb shell input swipe startX startY endX endY duration_ms
|
||||
# Key events
|
||||
adb shell input keyevent KEYCODE_BACK
|
||||
adb shell input keyevent KEYCODE_ENTER
|
||||
```
|
||||
|
||||
## 7. WebView Limitation
|
||||
- UI Automator can see WebView content if accessibility is enabled
|
||||
- Touch events on iframe content (like Cloudflare Turnstile) often fail due to cross-origin isolation
|
||||
- Form fields in WebViews work if you get exact bounds from the UI dump
|
||||
|
||||
## Typical Flow
|
||||
1. Take screenshot → analyze with explore agent (get rough layout)
|
||||
2. Dump UI hierarchy → grep for exact element bounds
|
||||
- NEVER ASSUME COORDINATES. You must ALWAYS check first.
|
||||
- Do this before ANY tap action as elements on the screen may have changed.
|
||||
3. Calculate center coordinates from bounds
|
||||
4. Tap/interact
|
||||
5. Wait → screenshot → verify result
|
||||
'';
|
||||
|
||||
xdg.configFile."opencode/skills/playwright.md".text =
|
||||
let
|
||||
browsers = pkgs.playwright-driver.browsers;
|
||||
@@ -140,56 +211,6 @@ in
|
||||
## Nix
|
||||
For using `nix build` append `-L` to get better visibility into the logs.
|
||||
If you get an error that a file can't be found, always try to `git add` the file before trying other troubleshooting steps.
|
||||
|
||||
|
||||
## Android UI Interaction Workflow Summary
|
||||
1. Taking Screenshots
|
||||
adb exec-out screencap -p > /tmp/screen.png
|
||||
Captures the current screen state as a PNG image.
|
||||
|
||||
2. Analyzing Screenshots
|
||||
I delegate screenshot analysis to an explore agent rather than analyzing images directly:
|
||||
mcp_task(subagent_type="explore", prompt="Analyze /tmp/screen.png. What screen is this? What elements are visible?")
|
||||
The agent describes the UI, identifies elements, and estimates Y coordinates.
|
||||
|
||||
3. Getting Precise Element Coordinates
|
||||
UI Automator dump - extracts the full UI hierarchy as XML:
|
||||
adb shell uiautomator dump /sdcard/ui.xml && adb pull /sdcard/ui.xml /tmp/ui.xml
|
||||
Then grep for specific elements:
|
||||
# Find by text
|
||||
grep -oP 'text="Login".*?bounds="[^"]*"' /tmp/ui.xml
|
||||
# Find by class
|
||||
grep -oP 'class="android.widget.EditText".*?bounds="[^"]*"' /tmp/ui.xml
|
||||
Bounds format: [left,top][right,bottom] → tap center: ((left+right)/2, (top+bottom)/2)
|
||||
|
||||
4. Tapping Elements
|
||||
adb shell input tap X Y
|
||||
Where X, Y are pixel coordinates from the bounds.
|
||||
|
||||
5. Text Input
|
||||
adb shell input text "some_text"
|
||||
Note: Special characters need escaping (\!, \;, etc.)
|
||||
|
||||
6. Other Gestures
|
||||
# Swipe/scroll
|
||||
adb shell input swipe startX startY endX endY duration_ms
|
||||
# Key events
|
||||
adb shell input keyevent KEYCODE_BACK
|
||||
adb shell input keyevent KEYCODE_ENTER
|
||||
|
||||
7. WebView Limitation
|
||||
- UI Automator can see WebView content if accessibility is enabled
|
||||
- Touch events on iframe content (like Cloudflare Turnstile) often fail due to cross-origin isolation
|
||||
- Form fields in WebViews work if you get exact bounds from the UI dump
|
||||
|
||||
Typical Flow
|
||||
1. Take screenshot → analyze with explore agent (get rough layout)
|
||||
2. Dump UI hierarchy → grep for exact element bounds
|
||||
- NEVER ASSUME COORDINATES. You must ALWAYS check first.
|
||||
- Do this before ANY tap action as elements on the screen may of changed.
|
||||
3. Calculate center coordinates from bounds
|
||||
4. Tap/interact
|
||||
5. Wait → screenshot → verify result
|
||||
'';
|
||||
settings = {
|
||||
theme = "opencode";
|
||||
|
||||
Reference in New Issue
Block a user