
/tinyfish-web-agent

by simantak-dabhade · View on GitHub

Use the TinyFish/Mino web agent to scrape websites, extract structured data, and automate browser actions using natural language. Use when you need to extract data from websites or handle bot-protected sites.

TinyFish Web Agent

Requires: MINO_API_KEY environment variable

Best Practices

  1. Specify JSON format: Always describe the exact structure you want returned
  2. Parallel calls: When extracting from multiple independent sites, make separate parallel calls instead of combining into one prompt

Basic Extract/Scrape

Extract data from a page. Specify the JSON structure you want:

import requests
import json
import os

# Start an automation run; the endpoint streams progress as server-sent events (SSE).
response = requests.post(
    "https://mino.ai/v1/automation/run-sse",
    headers={
        "X-API-Key": os.environ["MINO_API_KEY"],
        "Content-Type": "application/json",
    },
    json={
        "url": "https://example.com",
        "goal": "Extract product info as JSON: {\"name\": str, \"price\": str, \"in_stock\": bool}",
    },
    stream=True,
)

# Read the SSE stream; the final COMPLETE event carries the extracted data.
for line in response.iter_lines():
    if line:
        line_str = line.decode("utf-8")
        if line_str.startswith("data: "):
            event = json.loads(line_str[6:])  # strip the "data: " prefix
            if event.get("type") == "COMPLETE" and event.get("status") == "COMPLETED":
                print(json.dumps(event["resultJson"], indent=2))

Multiple Items

Extract lists of data with explicit structure:

json={
    "url": "https://example.com/products",
    "goal": "Extract all products as JSON array: [{\"name\": str, \"price\": str, \"url\": str}]",
}

Stealth Mode

For bot-protected sites:

json={
    "url": "https://protected-site.com",
    "goal": "Extract product data as JSON: {\"name\": str, \"price\": str, \"description\": str}",
    "browser_profile": "stealth",
}

Proxy

Route through specific country:

json={
    "url": "https://geo-restricted-site.com",
    "goal": "Extract pricing data as JSON: {\"item\": str, \"price\": str, \"currency\": str}",
    "browser_profile": "stealth",
    "proxy_config": {
        "enabled": True,
        "country_code": "US",
    },
}

Output

Results are in event["resultJson"] when event["type"] == "COMPLETE" and event["status"] == "COMPLETED".
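The event-parsing logic can be factored into a small helper so each script doesn't repeat the SSE loop. This is a sketch, not part of the Mino API; the function name parse_sse_result is illustrative, and it assumes the byte-string lines produced by response.iter_lines():

```python
import json

def parse_sse_result(lines):
    """Return resultJson from the first COMPLETE/COMPLETED event, or None.

    `lines` is an iterable of byte strings, as yielded by
    requests' response.iter_lines().
    """
    for line in lines:
        if not line:
            continue  # skip SSE keep-alive blank lines
        line_str = line.decode("utf-8")
        if line_str.startswith("data: "):
            event = json.loads(line_str[6:])  # strip the "data: " prefix
            if event.get("type") == "COMPLETE" and event.get("status") == "COMPLETED":
                return event.get("resultJson")
    return None
```

With this in place, the basic example reduces to result = parse_sse_result(response.iter_lines()).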

Parallel Extraction

When extracting from multiple independent sources, make separate parallel API calls instead of combining into one prompt:

Good - Parallel calls:

# Compare pizza prices - run these simultaneously
call_1 = extract("https://pizzahut.com", "Extract pizza prices as JSON: [{\"name\": str, \"price\": str}]")
call_2 = extract("https://dominos.com", "Extract pizza prices as JSON: [{\"name\": str, \"price\": str}]")

Bad - Single combined call:

# Don't do this - less reliable and slower
extract("https://pizzahut.com", "Extract prices from Pizza Hut and also go to Dominos...")

Each independent extraction task should be its own API call. This is faster (parallel execution) and more reliable.
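One way to run those independent calls concurrently from Python is a thread pool, since each call is I/O-bound. This is a sketch: run_parallel and extract_fn are illustrative names, and extract_fn stands in for whatever wrapper you write around the run-sse endpoint:

```python
from concurrent.futures import ThreadPoolExecutor

def run_parallel(extract_fn, tasks):
    """Run extract_fn(url, goal) concurrently for each (url, goal) pair.

    Returns results in the same order as `tasks`.
    """
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(extract_fn, url, goal) for url, goal in tasks]
        return [f.result() for f in futures]

# Example shape of a call (extract_fn would POST to run-sse and parse the stream):
# results = run_parallel(extract, [
#     ("https://pizzahut.com", 'Extract pizza prices as JSON: [{"name": str, "price": str}]'),
#     ("https://dominos.com", 'Extract pizza prices as JSON: [{"name": str, "price": str}]'),
# ])
```

Because each task is a separate API call, one slow or failing site does not block or corrupt the others.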
