10 changes: 4 additions & 6 deletions README.md
@@ -105,7 +105,7 @@ Extract structured data from any URL using AI. [docs](https://docs.scrapegraphai
```bash
just-scrape smart-scraper <url> -p <prompt> # Extract data with AI
just-scrape smart-scraper <url> -p <prompt> --schema <json> # Enforce output schema
-just-scrape smart-scraper <url> -p <prompt> --scrolls <n> # Infinite scroll (0-100)
+just-scrape smart-scraper <url> -p <prompt> --scrolls <n>    # Infinite scroll (0-100)
just-scrape smart-scraper <url> -p <prompt> --pages <n> # Multi-page (1-100)
just-scrape smart-scraper <url> -p <prompt> --stealth # Anti-bot bypass (+4 credits)
just-scrape smart-scraper <url> -p <prompt> --cookies <json> --headers <json>
@@ -125,7 +125,7 @@ just-scrape smart-scraper https://news.example.com -p "Get all article headlines

# Scrape a JS-heavy SPA behind anti-bot protection
just-scrape smart-scraper https://app.example.com/dashboard -p "Extract user stats" \
-  --render-js --stealth
+  --stealth
```

## Search Scraper
@@ -164,7 +164,6 @@ Convert any webpage to clean markdown. [docs](https://docs.scrapegraphai.com/ser

```bash
just-scrape markdownify <url> # Convert to markdown
-just-scrape markdownify <url> --render-js # JS rendering (+1 credit)
just-scrape markdownify <url> --stealth # Anti-bot bypass (+4 credits)
just-scrape markdownify <url> --headers <json> # Custom headers
```
@@ -176,7 +175,7 @@ just-scrape markdownify <url> --headers <json> # Custom headers
just-scrape markdownify https://blog.example.com/my-article

# Convert a JS-rendered page behind Cloudflare
-just-scrape markdownify https://protected.example.com --render-js --stealth
+just-scrape markdownify https://protected.example.com --stealth

# Pipe markdown to a file
just-scrape markdownify https://docs.example.com/api --json | jq -r '.result' > api-docs.md
@@ -196,7 +195,7 @@ just-scrape crawl <url> --no-extraction --max-pages <n> # Markdown only (2 cr
just-scrape crawl <url> -p <prompt> --schema <json> # Enforce output schema
just-scrape crawl <url> -p <prompt> --rules <json> # Crawl rules (include_paths, same_domain)
just-scrape crawl <url> -p <prompt> --no-sitemap # Skip sitemap discovery
-just-scrape crawl <url> -p <prompt> --render-js --stealth # JS + anti-bot
+just-scrape crawl <url> -p <prompt> --stealth # Anti-bot bypass
```

### Examples
@@ -242,7 +241,6 @@ Get raw HTML content from a URL. [docs](https://docs.scrapegraphai.com/services/

```bash
just-scrape scrape <url> # Raw HTML
-just-scrape scrape <url> --render-js # JS rendering (+1 credit)
just-scrape scrape <url> --stealth # Anti-bot bypass (+4 credits)
just-scrape scrape <url> --branding # Extract branding (+2 credits)
just-scrape scrape <url> --country-code <iso> # Geo-targeting
2 changes: 1 addition & 1 deletion package.json
@@ -1,6 +1,6 @@
{
"name": "just-scrape",
"version": "0.1.4",
"version": "0.1.6",
"description": "ScrapeGraph AI CLI tool",
"type": "module",
"main": "dist/cli.mjs",
13 changes: 4 additions & 9 deletions skills/just-scrape/SKILL.md
@@ -47,7 +47,6 @@ API key resolution order: `SGAI_API_KEY` env var → `.env` file → `~/.scrapeg
All commands support `--json` for machine-readable output (suppresses banner, spinners, prompts).

Scraping commands share these optional flags:
-- `--render-js` — render JavaScript (+1 credit)
- `--stealth` — bypass anti-bot detection (+4 credits)
- `--headers <json>` — custom HTTP headers as JSON string
- `--schema <json>` — enforce output JSON schema
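
The shared flags above compose on a single request, and `--json` makes the result pipeable. A hedged sketch (the URL, prompt, schema, and `jq` filter are illustrative placeholders, not taken from this PR):

```bash
# Shared flags combined on one request; --json suppresses the banner and
# spinners so output can be piped. All values below are placeholders.
just-scrape smart-scraper https://example.com -p "Extract product names" \
  --stealth \
  --schema '{"type":"object","properties":{"names":{"type":"array","items":{"type":"string"}}}}' \
  --json | jq '.result'
```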
@@ -63,7 +62,6 @@ just-scrape smart-scraper <url> -p <prompt>
just-scrape smart-scraper <url> -p <prompt> --schema <json>
just-scrape smart-scraper <url> -p <prompt> --scrolls <n> # infinite scroll (0-100)
just-scrape smart-scraper <url> -p <prompt> --pages <n> # multi-page (1-100)
-just-scrape smart-scraper <url> -p <prompt> --render-js # JS rendering (+1 credit)
just-scrape smart-scraper <url> -p <prompt> --stealth # anti-bot (+4 credits)
just-scrape smart-scraper <url> -p <prompt> --cookies <json> --headers <json>
just-scrape smart-scraper <url> -p <prompt> --plain-text
@@ -80,7 +78,7 @@ just-scrape smart-scraper https://news.example.com -p "Get headlines and dates"

# JS-heavy SPA behind anti-bot
just-scrape smart-scraper https://app.example.com/dashboard -p "Extract user stats" \
-  --render-js --stealth
+  --stealth
```

### Search Scraper
@@ -113,14 +111,13 @@ Convert any webpage to clean markdown.

```bash
just-scrape markdownify <url>
-just-scrape markdownify <url> --render-js # +1 credit
just-scrape markdownify <url> --stealth # +4 credits
just-scrape markdownify <url> --headers <json>
```

```bash
just-scrape markdownify https://blog.example.com/my-article
-just-scrape markdownify https://protected.example.com --render-js --stealth
+just-scrape markdownify https://protected.example.com --stealth
just-scrape markdownify https://docs.example.com/api --json | jq -r '.result' > api-docs.md
```

@@ -136,7 +133,7 @@ just-scrape crawl <url> --no-extraction --max-pages <n> # markdown only (2 cre
just-scrape crawl <url> -p <prompt> --schema <json>
just-scrape crawl <url> -p <prompt> --rules <json> # include_paths, same_domain
just-scrape crawl <url> -p <prompt> --no-sitemap
-just-scrape crawl <url> -p <prompt> --render-js --stealth
+just-scrape crawl <url> -p <prompt> --stealth
```

```bash
@@ -157,7 +154,6 @@ Get raw HTML content from a URL.

```bash
just-scrape scrape <url>
-just-scrape scrape <url> --render-js # +1 credit
just-scrape scrape <url> --stealth # +4 credits
just-scrape scrape <url> --branding # extract logos/colors/fonts (+2 credits)
just-scrape scrape <url> --country-code <iso>
@@ -269,7 +265,7 @@ done

```bash
# JS-heavy SPA behind Cloudflare
-just-scrape smart-scraper https://protected.example.com -p "Extract data" --render-js --stealth
+just-scrape smart-scraper https://protected.example.com -p "Extract data" --stealth

# With custom cookies/headers
just-scrape smart-scraper https://example.com -p "Extract data" \
@@ -280,7 +276,6 @@ just-scrape smart-scraper https://example.com -p "Extract data" \

| Feature | Extra Credits |
|---|---|
-| `--render-js` | +1 per page |
| `--stealth` | +4 per request |
| `--branding` (scrape only) | +2 |
| `search-scraper` extraction | 10 per request |
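
As a worked example of the surcharge arithmetic (the base per-request cost is assumed, not stated in this diff):

```bash
# One scrape request with both surcharges applied:
#   --stealth   +4 credits
#   --branding  +2 credits
#   => base request cost + 6 extra credits in total
just-scrape scrape https://example.com --stealth --branding
```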
2 changes: 0 additions & 2 deletions src/commands/crawl.ts
@@ -28,7 +28,6 @@ export default defineCommand({
schema: { type: "string", description: "Output JSON schema (as JSON string)" },
rules: { type: "string", description: "Crawl rules as JSON object string" },
"no-sitemap": { type: "boolean", description: "Disable sitemap-based URL discovery" },
"render-js": { type: "boolean", description: "Enable heavy JS rendering (+1 credit/page)" },
stealth: { type: "boolean", description: "Bypass bot detection (+4 credits)" },
json: { type: "boolean", description: "Output raw JSON (pipeable)" },
},
@@ -46,7 +45,6 @@ export default defineCommand({
if (args.schema) params.schema = JSON.parse(args.schema);
if (args.rules) params.rules = JSON.parse(args.rules);
if (args["no-sitemap"]) params.sitemap = false;
if (args["render-js"]) params.render_heavy_js = true;
if (args.stealth) params.stealth = true;

out.start("Crawling");
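As the hunk shows, `--rules` is passed through `JSON.parse` before reaching the API. A hedged sketch of what a caller might supply (the keys mirror the `include_paths`/`same_domain` hint in the README hunk above; the URL and prompt are placeholders):

```bash
# Hypothetical crawl rules passed as a JSON object string.
just-scrape crawl https://docs.example.com -p "Extract API endpoints" \
  --rules '{"include_paths": ["/api/"], "same_domain": true}'
```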
2 changes: 0 additions & 2 deletions src/commands/markdownify.ts
@@ -14,7 +14,6 @@ export default defineCommand({
description: "Website URL to convert",
required: true,
},
"render-js": { type: "boolean", description: "Enable heavy JS rendering (+1 credit)" },
stealth: { type: "boolean", description: "Bypass bot detection (+4 credits)" },
headers: { type: "string", description: "Custom headers as JSON object string" },
json: { type: "boolean", description: "Output raw JSON (pipeable)" },
@@ -28,7 +27,6 @@ export default defineCommand({
website_url: args.url,
};

if (args["render-js"]) params.render_heavy_js = true;
if (args.stealth) params.stealth = true;
if (args.headers) params.headers = JSON.parse(args.headers);

2 changes: 0 additions & 2 deletions src/commands/scrape.ts
@@ -14,7 +14,6 @@ export default defineCommand({
description: "Website URL to scrape",
required: true,
},
"render-js": { type: "boolean", description: "Enable heavy JS rendering (+1 credit)" },
stealth: { type: "boolean", description: "Bypass bot detection (+4 credits)" },
branding: { type: "boolean", description: "Extract branding info (+2 credits)" },
"country-code": { type: "string", description: "ISO country code for geo-targeting" },
@@ -27,7 +26,6 @@ export default defineCommand({

const params: scrapegraphai.ScrapeParams = { website_url: args.url };

if (args["render-js"]) params.render_heavy_js = true;
if (args.stealth) params.stealth = true;
if (args.branding) params.branding = true;
if (args["country-code"]) params.country_code = args["country-code"];
2 changes: 0 additions & 2 deletions src/commands/smart-scraper.ts
@@ -23,7 +23,6 @@ export default defineCommand({
schema: { type: "string", description: "Output JSON schema (as JSON string)" },
scrolls: { type: "string", description: "Number of infinite scrolls (0-100)" },
pages: { type: "string", description: "Total pages to scrape (1-100)" },
"render-js": { type: "boolean", description: "Enable heavy JS rendering (+1 credit)" },
stealth: { type: "boolean", description: "Bypass bot detection (+4 credits)" },
cookies: { type: "string", description: "Cookies as JSON object string" },
headers: { type: "string", description: "Custom headers as JSON object string" },
@@ -43,7 +42,6 @@ export default defineCommand({
if (args.schema) params.output_schema = JSON.parse(args.schema);
if (args.scrolls) params.number_of_scrolls = Number(args.scrolls);
if (args.pages) params.total_pages = Number(args.pages);
if (args["render-js"]) params.render_heavy_js = true;
if (args.stealth) params.stealth = true;
if (args.cookies) params.cookies = JSON.parse(args.cookies);
if (args.headers) params.headers = JSON.parse(args.headers);
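Both `cookies` and `headers` are likewise `JSON.parse`d from their flag values, so they must be well-formed JSON object strings. A hedged example (cookie and header names are placeholders):

```bash
# Hypothetical cookies/headers; both flags take JSON object strings.
just-scrape smart-scraper https://example.com -p "Extract data" \
  --cookies '{"session_id": "abc123"}' \
  --headers '{"User-Agent": "Mozilla/5.0"}'
```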
4 changes: 0 additions & 4 deletions src/lib/schemas.ts
@@ -11,7 +11,6 @@ export const SmartScraperSchema = z.object({
output_schema: jsonObject.optional(),
number_of_scrolls: z.number().int().min(0).max(100).optional(),
total_pages: z.number().int().min(1).max(100).optional(),
-render_heavy_js: z.boolean().optional(),
stealth: z.boolean().optional(),
cookies: jsonStringObject.optional(),
headers: jsonStringObject.optional(),
@@ -31,7 +30,6 @@ export const SearchScraperSchema = z.object({

export const MarkdownifySchema = z.object({
website_url: z.string().url(),
-render_heavy_js: z.boolean().optional(),
stealth: z.boolean().optional(),
headers: jsonStringObject.optional(),
webhook_url: z.string().url().optional(),
@@ -46,7 +44,6 @@ export const CrawlSchema = z.object({
schema: jsonObject.optional(),
rules: jsonObject.optional(),
sitemap: z.boolean().optional(),
-render_heavy_js: z.boolean().optional(),
stealth: z.boolean().optional(),
webhook_url: z.string().url().optional(),
});
@@ -62,7 +59,6 @@ export const SitemapSchema = z.object({

export const ScrapeSchema = z.object({
website_url: z.string().url(),
-render_heavy_js: z.boolean().optional(),
stealth: z.boolean().optional(),
branding: z.boolean().optional(),
country_code: z.string().optional(),
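
Taken together, the change is mechanical for callers: any invocation that previously passed `--render-js` now simply drops the flag, since `render_heavy_js` is removed from every schema above. A before/after sketch with a placeholder URL:

```bash
# Before this PR (0.1.4): JS rendering was a separate +1-credit flag
just-scrape markdownify https://protected.example.com --render-js --stealth

# After this PR (0.1.6): drop --render-js; the remaining flags are unchanged
just-scrape markdownify https://protected.example.com --stealth
```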