@tgourdel
Collaborator

Adding Vision RAG notebook

@tgourdel requested a review from a team as a code owner on December 12, 2025, 00:03
"created_at": msg_date,
}
db.messages.insert_one(prompt_msg)
logger.info(f"Saved generated prompt for entry {entry_id}")

Semgrep identified an issue in your code:
Detected a logger that logs user input without properly neutralizing the output. The log message could contain characters like \r and \n, allowing an attacker to forge log entries or inject malicious content into the logs. Use proper input validation and/or output encoding to prevent log entries from being forged.
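
To make the finding concrete, here is a minimal sketch of how a newline in a logged value can forge a second log entry. The logger name, the log format, and the attacker-supplied `entry_id` value are all hypothetical and chosen only for illustration; they are not taken from the application under review.

```python
import io
import logging

# Capture log output in memory so the forged line is easy to inspect.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))

demo_logger = logging.getLogger("log_injection_demo")  # hypothetical logger name
demo_logger.addHandler(handler)
demo_logger.setLevel(logging.INFO)
demo_logger.propagate = False

# Hypothetical attacker-controlled value containing an embedded newline.
entry_id = "abc123\nINFO Deleted entry 999 by admin"
demo_logger.info(f"Saved generated prompt for entry {entry_id}")

# One logging call now produces two lines; the second looks like a real entry.
lines = stream.getvalue().splitlines()
for line in lines:
    print(line)
```

A reader of the resulting log cannot tell from the text alone that the second line was never emitted by the application.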

Dataflow graph (apps/interactive-journal/backend/app/routers/routes.py at commit 1f2d3dc7ffb1518f8946e78f57dd43badfaa1f04):
  Source: [Line: 184] entry_id
  Sink: [Line: 212] f"Saved generated prompt for entry {entry_id}"

To resolve this comment:

✨ Commit Assistant fix suggestion

Suggested change
- logger.info(f"Saved generated prompt for entry {entry_id}")
+ # Log only the first 16 characters of entry_id to minimize risk of log injection and exposure
+ logger.info(f"Saved generated prompt for entry {entry_id[:16] if entry_id else '[no_id]'}")
View step-by-step instructions
  1. Avoid including the raw, unsanitized value of entry_id (which comes from user input) directly in log messages.
  2. Update the log entry to log only a truncated, sanitized, or masked version of the entry ID. For example, use logger.info(f"Saved generated prompt for entry {entry_id[:16]}") to limit exposure of the input.
  3. Alternatively, if you don’t want to expose any part of the entry ID, remove it from the log or use a generic placeholder: logger.info("Saved generated prompt for entry [redacted]").
  4. If the entry ID must be recorded for tracing, validate that it matches the expected format (for example, a MongoDB ObjectId) before logging. You could use: from bson import ObjectId; if ObjectId.is_valid(entry_id): logger.info(f"Saved generated prompt for entry {entry_id}") else: logger.info("Saved generated prompt for entry [invalid_id]")

This helps prevent log injection attacks and accidental exposure of sensitive, user-controlled values.
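
The validate-before-logging idea in step 4 can be sketched as below. This version deliberately swaps `bson.ObjectId.is_valid` for a standard-library regular expression so that it runs without PyMongo installed; it assumes entry IDs are 24-character hexadecimal ObjectId strings, which the conversation suggests but does not confirm. The helper names `entry_id_for_log` and `log_prompt_saved` are hypothetical.

```python
import logging
import re

logger = logging.getLogger(__name__)

# Assumption: entry IDs are MongoDB ObjectId hex strings (24 hex characters).
_OBJECT_ID_HEX = re.compile(r"[0-9a-fA-F]{24}")


def entry_id_for_log(entry_id: object) -> str:
    """Return the ID only if it has the expected shape, else a placeholder."""
    # fullmatch rejects any extra characters, including a trailing newline.
    if isinstance(entry_id, str) and _OBJECT_ID_HEX.fullmatch(entry_id):
        return entry_id
    return "[invalid_id]"


def log_prompt_saved(entry_id: object) -> None:
    """Hypothetical wrapper around the log call flagged in the finding."""
    logger.info("Saved generated prompt for entry %s", entry_id_for_log(entry_id))
```

Because a value that passes the check can contain only hexadecimal digits, it cannot carry CR or LF characters into the log, which truncation alone does not guarantee.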

💬 Ignore this finding

Reply with Semgrep commands to ignore this finding.

  • /fp <comment> for false positive
  • /ar <comment> for acceptable risk
  • /other <comment> for all other reasons

Alternatively, triage in Semgrep AppSec Platform to ignore the finding created by tainted-log-injection-stdlib-fastapi.

🛟 Help? Slack #semgrep-help or go/semgrep-help.

Resolution Options:

  • Fix the code
  • Reply /fp $reason (if security gap doesn’t exist)
  • Reply /ar $reason (if gap is valid but intentional; add mitigations/monitoring)
  • Reply /other $reason (e.g., test-only)

You can view more details about this finding in the Semgrep AppSec Platform.

def search_entries(q: str, version: int = 1):
    """Search entries using vector search, grouped by entry."""
    db = get_database()
    logger.info(f"Searching entries with query: {q[:50]}... (version={version})")

Semgrep identified an issue in your code:
Detected a logger that logs user input without properly neutralizing the output. The log message could contain characters like \r and \n, allowing an attacker to forge log entries or inject malicious content into the logs. Use proper input validation and/or output encoding to prevent log entries from being forged.

Dataflow graph (apps/interactive-journal/backend/app/routers/routes.py at commit 1f2d3dc7ffb1518f8946e78f57dd43badfaa1f04):
  Source: [Line: 99] version
  Sink: [Line: 102] f"Searching entries with query: {q[:50]}... (version={version})"

To resolve this comment:

✨ Commit Assistant fix suggestion

Suggested change
- logger.info(f"Searching entries with query: {q[:50]}... (version={version})")
+ # Sanitize user input to prevent log injection
+ clean_q = q.replace('\n', ' ').replace('\r', ' ')[:50]
+ logger.info(f"Searching entries with query: {clean_q}... (version={version})")
View step-by-step instructions
  1. Avoid logging user-provided search queries directly using f-strings, as this may include malicious content in your logs.
  2. If you need to record user input for debugging, log a masked or length-limited version, and always use structured logging with separate fields. For example, use logger.info("Searching entries", extra={"query": q[:50], "version": version}).
  3. Alternatively, if you must include the value in the log message, sanitize the input by replacing newlines and special characters. For example: clean_q = q.replace('\n', ' ').replace('\r', ' ')[:50] then log: logger.info(f"Searching entries with query: {clean_q}... (version={version})").
  4. Make sure to never log the full query if it may contain sensitive or user-controlled data to minimize risk of log injection.

Structured logging lets log processors or SIEM tools better handle and filter potentially unsafe or multiline user data.
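
The structured-logging and sanitizing steps above can be combined into one small sketch. The helper names `sanitize_for_log` and `log_search` and the logger name are hypothetical, and replacing every non-printable character (not only CR and LF) is an assumption that goes slightly beyond the suggested fix.

```python
import logging

logger = logging.getLogger("journal.search")  # hypothetical logger name


def sanitize_for_log(value: str, max_len: int = 50) -> str:
    """Truncate, then replace non-printable characters (including CR/LF) with spaces."""
    return "".join(ch if ch.isprintable() else " " for ch in value[:max_len])


def log_search(q: str, version: int) -> None:
    """Log a constant message and attach user input as separate fields."""
    logger.info(
        "Searching entries",
        extra={"query": sanitize_for_log(q), "version": version},
    )
```

Note that the standard library's default formatter does not print `extra` fields; they appear in the output only if the configured formatter (for example a JSON formatter) references them, so the logging configuration may need a matching change.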

