Add CSS background image extraction feature (Issue #1691) by YxmMyth · Pull Request #1702 · unclecode/crawl4ai

YxmMyth · 2026-01-10T15:39:36Z

This commit adds support for extracting CSS background images during crawling, addressing issue #1691 where background images were being skipped.

Changes

New Files

crawl4ai/js_snippet/extract_css_backgrounds.js: JavaScript snippet that runs in the browser and extracts background images from computed styles

Modified Files

crawl4ai/models.py:
- Added css_images field to Media class
- Added css_images_data field to AsyncCrawlResponse
crawl4ai/async_configs.py:
- Added CSS background image configuration parameters to CrawlerRunConfig:
  - extract_css_images (bool, default False)
  - css_image_min_width (int, default 100)
  - css_image_min_height (int, default 100)
  - css_image_score_threshold (int, default 2)
  - css_exclude_repeating (bool, default True)
crawl4ai/content_scraping_strategy.py:
- Added process_css_background_images() method
- Integrated CSS image extraction into _process_element()
- Added css_images to media dictionary
crawl4ai/async_crawler_strategy.py:
- Added JavaScript execution in _crawl_web() to extract CSS backgrounds
- Included css_images_data in AsyncCrawlResponse
crawl4ai/async_webcrawler.py:
- Modified aprocess_html() to accept and pass css_images_data
- Added Dict type import
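
To make the interaction of the new parameters concrete, here is a minimal sketch of how the size, repeat, and score settings might combine into a single filter. It is not the PR's implementation: `CssImageOptions`, `keep_candidate`, and the candidate keys (`width`, `height`, `repeat`, `score`) are hypothetical names chosen for illustration; only the parameter names and defaults come from the list above.

```python
from dataclasses import dataclass


@dataclass
class CssImageOptions:
    """Hypothetical stand-in for the new CrawlerRunConfig fields and defaults."""
    extract_css_images: bool = False
    css_image_min_width: int = 100
    css_image_min_height: int = 100
    css_image_score_threshold: int = 2
    css_exclude_repeating: bool = True


def keep_candidate(candidate: dict, opts: CssImageOptions) -> bool:
    """Return True if a candidate passes the size, repeat, and score filters.

    The candidate keys used here are assumptions, not the PR's data format.
    """
    if candidate["width"] < opts.css_image_min_width:
        return False
    if candidate["height"] < opts.css_image_min_height:
        return False
    # Assumption: anything other than "no-repeat" counts as a tiled pattern.
    repeat = candidate.get("repeat", "no-repeat")
    if opts.css_exclude_repeating and repeat != "no-repeat":
        return False
    return candidate.get("score", 0) >= opts.css_image_score_threshold


opts = CssImageOptions(extract_css_images=True)
candidates = [
    {"src": "hero.jpg", "width": 1200, "height": 400, "repeat": "no-repeat", "score": 3},
    {"src": "dot.png", "width": 16, "height": 16, "repeat": "repeat", "score": 0},
    {"src": "tile.png", "width": 300, "height": 300, "repeat": "repeat-x", "score": 4},
]
kept = [c["src"] for c in candidates if keep_candidate(c, opts)]
print(kept)
```

In this sketch only the large, non-repeating image survives; the small dot fails the size check and the tiled image is excluded because `css_exclude_repeating` is on.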

Features

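Extracts background images from both inline styles and stylesheets
Uses window.getComputedStyle() so the extracted value reflects the style the browser actually applies
Filters out small elements (css_image_min_width, css_image_min_height) and repeating patterns (css_exclude_repeating)
Scoring system based on element size and properties, with a configurable threshold (css_image_score_threshold)
Opt-in (disabled by default) for backward compatibility
Results are stored separately in media.css_images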
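
A computed `background-image` value is a string that may contain several `url(...)` entries mixed with gradients, or just `none`. As a rough illustration of the parsing step, the sketch below pulls the URLs out of such a string in Python. The helper name `background_urls` and the decision to skip `data:` URIs are assumptions made for this example; the PR performs its extraction in JavaScript and may handle these cases differently.

```python
import re

# Matches url(...) with optional single or double quotes around the target.
_URL_RE = re.compile(r"""url\(\s*(['"]?)(.*?)\1\s*\)""")


def background_urls(computed_value: str) -> list[str]:
    """Return the URLs found in a computed background-image value.

    Hypothetical helper: skips inline data: URIs (an assumption) and
    returns an empty list for values such as "none" or pure gradients.
    """
    urls = []
    for _quote, target in _URL_RE.findall(computed_value):
        if target and not target.startswith("data:"):
            urls.append(target)
    return urls


print(background_urls('linear-gradient(red, blue), url("https://example.com/hero.jpg")'))
print(background_urls("none"))
```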

Usage

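import asyncio
from crawl4ai import AsyncWebCrawler, CrawlerRunConfig

async def main():
    # The new options are CrawlerRunConfig parameters (see Modified Files)
    config = CrawlerRunConfig(
        extract_css_images=True,
        css_image_min_width=100,
        css_image_min_height=100,
    )
    async with AsyncWebCrawler() as crawler:
        result = await crawler.arun(url="https://example.com", config=config)
    css_images = result.media.get('css_images', [])

asyncio.run(main())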

Closes #1691

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

YxmMyth wants to merge 1 commit into unclecode:main from YxmMyth:feature/css-background-image-extraction
