diff --git a/CHANGELOG.md b/CHANGELOG.md index 86c3ca3..724cacd 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -5,6 +5,24 @@ All notable changes to this project will be documented in this file. The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## [2.2.3] - 2026-02-07 + +### Fixed + +#### Detection Gap: File Paths with Spaces in cached_grep + +- **Issue:** All security patterns using `cached_grep` (including `unsanitized-superglobal-read`) failed to detect violations in files with spaces in their paths (e.g., `/Users/name/Local Sites/project/file.php`) +- **Root Cause:** The `cached_grep` function used `xargs` without null-delimited input, causing it to split file paths on whitespace. When scanning `/Users/name/Local Sites/...`, xargs would split this into `/Users/name/Local` and `Sites/...`, causing grep to fail silently with "No such file or directory" +- **Impact:** Complete detection failure for any WordPress installation in directories with spaces (common with Local by Flywheel, MAMP, and other local dev tools) +- **Fix:** Modified `cached_grep` to use `tr '\n' '\0' | xargs -0` for null-delimited input, ensuring file paths with spaces are handled correctly +- **Files Changed:** + - `dist/bin/check-performance.sh` (line 3354): Changed `cat "$PHP_FILE_LIST" | xargs grep` to `tr '\n' '\0' < "$PHP_FILE_LIST" | xargs -0 grep` +- **Affected Patterns:** All patterns using `cached_grep` (44+ patterns including unsanitized superglobal reads, SQL injection detection, admin capability checks, etc.) +- **Lessons Learned:** + - Always test with file paths containing spaces, especially for tools targeting local WordPress development + - `xargs` default behavior (splitting on whitespace) is unsafe for file paths; always use `-0` with null-delimited input + - Silent failures in grep pipelines can mask critical bugs; consider adding validation checks for empty results in critical detection paths + ## [2.2.2] - 2026-02-07 ### Added diff --git a/PROJECT/2-WORKING/DETECTION-GAP-UNSANITIZED-SUPERGLOBAL.md b/PROJECT/2-WORKING/DETECTION-GAP-UNSANITIZED-SUPERGLOBAL.md new file mode 100644 index 0000000..e5ac4ea --- /dev/null +++ b/PROJECT/2-WORKING/DETECTION-GAP-UNSANITIZED-SUPERGLOBAL.md @@ -0,0 +1,117 @@ +# Detection Gap: Unsanitized Superglobal Read + +**Created:** 2026-01-29 +**Status:** In Progress +**Priority:** HIGH +**Assigned Version:** v2.2.3 + +## Problem Statement + +WPCC reported **"Unsanitized superglobal read ($_GET/$_POST)" as PASSED** when the codebase contains raw `$_POST` usage without sanitization. + +**User-Reported Cases:** +1. **Line 660**: `$keyword = $_POST['keyword'];` - No isset(), no sanitization +2. **Line 930**: `$selected_value = isset( $_GET['wholesale_sales_rep'] ) ? $_GET['wholesale_sales_rep'] : '';` - isset() present but no sanitization + +**Scanned:** Universal Child Theme 2024 (~12K LOC) +**WPCC Version:** 2.0.14 + +## Current Implementation Analysis + +### Pattern JSON (`dist/patterns/unsanitized-superglobal-read.json`) +- **Detection type**: `"direct"` +- **Search pattern**: `\\$_(GET|POST|REQUEST)\\[` +- **Exclude patterns**: Includes `"isset\\("` and `"empty\\("` in the JSON + +### Actual Implementation (`check-performance.sh` lines 3607-3648) + +**Phase 1: Initial grep** (lines 3614-3623) +```bash +UNSANITIZED_MATCHES=$(cached_grep --include=*.php -E '\$_(GET|POST|REQUEST)\[' | \ + grep -v 'sanitize_' | \ + grep -v 'esc_' | \ + grep -v 'absint' | \ + grep -v 'intval' | \ + grep -v 'floatval' | \ + grep -v 'wc_clean' | \ + grep -v 'wp_unslash' | \ + grep -v '\$allowed_keys' | \ + grep -v '//.*\$_' || true) +``` + +**Phase 2: isset/empty filter** (lines 3628-3648) +```bash +# If isset/empty is present AND superglobal appears only once, skip it +if echo "$code" | grep -q 'isset\|empty'; then + if [ "$superglobal_count" -eq 1 ]; then + continue # Skip - likely just existence check + fi +fi +``` + +## Expected Behavior vs Actual + +### Line 660: `$keyword = $_POST['keyword'];` + +**Expected:** +- Initial grep: ✅ MATCH (has `$_POST[`) +- Sanitization filter: ✅ PASS (no sanitize_ functions) +- isset/empty filter: ✅ PASS (no isset/empty, so doesn't skip) +- **Result: SHOULD FLAG** ✅ + +**Actual (user reports):** +- **Result: NOT FLAGGED** ❌ + +### Line 930: `$selected_value = isset( $_GET['wholesale_sales_rep'] ) ? $_GET['wholesale_sales_rep'] : '';` + +**Expected:** +- Initial grep: ✅ MATCH (has `$_GET[`) +- Sanitization filter: ✅ PASS (no sanitize_ functions) +- isset/empty filter: + - Has isset: YES + - Superglobal count: 2 (appears twice in ternary) + - Count > 1, so does NOT skip +- **Result: SHOULD FLAG** ✅ + +**Actual (user reports):** +- **Result: NOT FLAGGED** ❌ + +## Hypothesis: Why Are They Being Missed? + +### Possible Causes + +1. **Comment filter** (line 3662): Line might be in a comment block +2. **HTML/REST config filter** (line 3667): Line might match HTML form pattern +3. **Suppression rule** (line 3671): File might have suppression comment +4. **Pattern JSON exclude_patterns**: The JSON has `"isset\\("` which might be applied differently +5. **Bug in isset/empty logic**: Logic might not be working as expected + +## Next Steps + +- [ ] Get actual file path from user to examine line 660 in context +- [ ] Check if line is in comment block or has suppression +- [ ] Verify isset/empty filter logic with actual test cases +- [ ] Determine if pattern JSON exclude_patterns are overriding the script logic +- [ ] Create fix based on root cause + +## Proposed Solution (Pending Root Cause) + +### Option A: Fix grep logic +Update the isset/empty filter to properly handle ternary operators + +### Option B: Create validator script +Similar to `wc-coupon-thankyou-context-validator.sh`, create context-aware validation + +### Option C: Update pattern JSON +Remove `"isset\\("` and `"empty\\("` from exclude_patterns in JSON + +## Test Fixtures Created + +- `temp/test-unsanitized-assignment.php` - User-reported scenarios +- `temp/test-isset-filter.sh` - Logic testing script (needs debugging) + +## Related Files + +- `dist/patterns/unsanitized-superglobal-read.json` +- `dist/bin/check-performance.sh` (lines 3607-3750) + diff --git a/dist/PATTERN-LIBRARY.json b/dist/PATTERN-LIBRARY.json index 7b5aefc..62c3c30 100644 --- a/dist/PATTERN-LIBRARY.json +++ b/dist/PATTERN-LIBRARY.json @@ -1,6 +1,6 @@ { "version": "1.0.0", - "generated": "2026-01-27T22:31:23Z", + "generated": "2026-02-07T05:07:35Z", "summary": { "total_patterns": 55, "enabled": 55, @@ -804,8 +804,8 @@ "category": "reliability", "severity": "HIGH", "title": "Coupon logic in WooCommerce thank-you/order-received context", - "description": "Detects coupon-related operations (apply_coupon, remove_coupon, WC_Coupon instantiation) in WooCommerce thank-you or order-received contexts. This is a reliability anti-pattern because the order is already complete and coupon operations should not be performed post-checkout.", - "detection_type": "direct", + "description": "Detects coupon-related operations (apply_coupon, remove_coupon, WC_Coupon instantiation) in WooCommerce thank-you or order-received contexts. Uses context-aware validation to distinguish between problematic thank-you hooks and safe checkout hooks. Ignores commented-out code and functions hooked to safe contexts like woocommerce_checkout_order_processed.", + "detection_type": "validated", "pattern_type": "php", "mitigation_detection": false, "heuristic": false, diff --git a/dist/bin/check-performance.sh b/dist/bin/check-performance.sh index e2b59c8..b571f8c 100755 --- a/dist/bin/check-performance.sh +++ b/dist/bin/check-performance.sh @@ -1,7 +1,7 @@ #!/usr/bin/env bash # # WP Code Check by Hypercart - Performance Analysis Script -# Version: 2.2.2 +# Version: 2.2.3 # # Fast, zero-dependency WordPress performance analyzer # Catches critical issues before they crash your site @@ -3342,6 +3342,11 @@ cached_grep() { fi done + # DEBUG: Uncomment to troubleshoot cached_grep issues (v2.2.3) + # [ "${DEBUG_CACHED_GREP:-}" = "1" ] && echo "[DEBUG cached_grep] Using xargs mode with $PHP_FILE_COUNT files" >&2 + # [ "${DEBUG_CACHED_GREP:-}" = "1" ] && echo "[DEBUG cached_grep] Pattern: $pattern" >&2 + # [ "${DEBUG_CACHED_GREP:-}" = "1" ] && echo "[DEBUG cached_grep] Args: ${grep_args[*]}" >&2 + # If we have a cached PHP file list, use it; otherwise fall back to # recursive grep on the original paths. This lets JS/Node-only repos # (no PHP files) still be scanned safely without depending on the @@ -3351,7 +3356,10 @@ cached_grep() { elif [ "$PHP_FILE_COUNT" -gt 1 ] && [ -n "$PHP_FILE_LIST" ] && [ -f "$PHP_FILE_LIST" ]; then # Use cached file list with xargs for parallel processing # -Hn adds filename and line number (like -rHn but without recursion) - cat "$PHP_FILE_LIST" | xargs grep -Hn "${grep_args[@]}" "$pattern" 2>/dev/null || true + # FIX v2.2.3: Use null-delimited input (tr '\n' '\0') with xargs -0 + # to handle file paths with spaces (e.g., "/Users/name/Local Sites/...") + # Without this, xargs splits on whitespace and grep fails to find files + tr '\n' '\0' < "$PHP_FILE_LIST" | xargs -0 grep -Hn "${grep_args[@]}" "$pattern" 2>/dev/null || true else # No PHP cache (e.g., JS-only project). Fall back to recursive grep. grep -rHn "${grep_args[@]}" "$pattern" "$PATHS" 2>/dev/null || true @@ -3611,6 +3619,13 @@ UNSANITIZED_VISIBLE="" # PERFORMANCE: Use cached file list instead of grep -r # NOTE: Restrict to PHP files explicitly; in JS-only repos the fallback path in # cached_grep will otherwise recurse into documentation and non-PHP assets. + +# DEBUG: Uncomment to troubleshoot unsanitized superglobal detection (v2.2.3) +# [ "${DEBUG_UNSANITIZED:-}" = "1" ] && echo "[DEBUG] Starting unsanitized superglobal detection..." >&2 +# [ "${DEBUG_UNSANITIZED:-}" = "1" ] && echo "[DEBUG] PATHS variable: $PATHS" >&2 +# [ "${DEBUG_UNSANITIZED:-}" = "1" ] && echo "[DEBUG] PHP_FILE_COUNT: $PHP_FILE_COUNT" >&2 +# [ "${DEBUG_UNSANITIZED:-}" = "1" ] && echo "[DEBUG] PHP_FILE_LIST: $PHP_FILE_LIST" >&2 + UNSANITIZED_MATCHES=$(cached_grep --include=*.php -E '\$_(GET|POST|REQUEST)\[' | \ grep -v 'sanitize_' | \ grep -v 'esc_' | \ @@ -3622,6 +3637,10 @@ UNSANITIZED_MATCHES=$(cached_grep --include=*.php -E '\$_(GET|POST|REQUEST)\[' | grep -v '\$allowed_keys' | \ grep -v '//.*\$_' || true) +# DEBUG: Uncomment to see initial grep results (v2.2.3) +# [ "${DEBUG_UNSANITIZED:-}" = "1" ] && echo "[DEBUG] AFTER initial grep + sanitization filters:" >&2 +# [ "${DEBUG_UNSANITIZED:-}" = "1" ] && echo "[DEBUG] Total matches: $(echo "$UNSANITIZED_MATCHES" | grep -c . || echo 0)" >&2 + # Now filter out lines where isset/empty is used ONLY to check existence (not followed by usage) # Pattern: isset($_GET['x']) ) { ... } with no further $_GET['x'] on the same line # This is a more nuanced filter that allows isset() checks but catches isset() + direct usage @@ -3647,6 +3666,10 @@ UNSANITIZED_MATCHES=$(echo "$UNSANITIZED_MATCHES" | while IFS= read -r line; do echo "$line" done || true) +# DEBUG: Uncomment to see results after isset/empty filter (v2.2.3) +# [ "${DEBUG_UNSANITIZED:-}" = "1" ] && echo "[DEBUG] AFTER isset/empty filter:" >&2 +# [ "${DEBUG_UNSANITIZED:-}" = "1" ] && echo "[DEBUG] Line count: $(echo "$UNSANITIZED_MATCHES" | grep -c . || echo 0)" >&2 + if [ -n "$UNSANITIZED_MATCHES" ]; then while IFS= read -r match; do [ -z "$match" ] && continue