Add Vision RAG notebook #171

tgourdel · 2025-12-12T00:03:49Z

Adding Vision RAG notebook

semgrep-code-mongodb · 2025-12-30T21:31:46Z

apps/interactive-journal/backend/app/routers/routes.py

+        "created_at": msg_date,
+    }
+    db.messages.insert_one(prompt_msg)
+    logger.info(f"Saved generated prompt for entry {entry_id}")


Semgrep identified an issue in your code:
Detected a logger that logs user input without properly neutralizing the output. The log message could contain characters such as line feed (\n) and carriage return (\r), allowing an attacker to forge log entries or insert malicious content into the logs. Use proper input validation and/or output encoding to prevent log entries from being forged.
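
To illustrate the finding, here is a minimal sketch of how a line break in a user-supplied value can produce a forged entry. It assumes a plain-text formatter; the logger name `log_injection_demo` and the attacker string are hypothetical and not taken from the PR.

```python
import io
import logging

# Hypothetical logger writing to an in-memory stream so the output can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
demo_logger = logging.getLogger("log_injection_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.addHandler(handler)
demo_logger.propagate = False

# Attacker-controlled value containing a line break followed by a fake entry.
entry_id = "abc123\nINFO Deleted all entries for user admin"
demo_logger.info(f"Saved generated prompt for entry {entry_id}")

# A single logging call now yields two lines; the second looks like a real entry.
lines = stream.getvalue().splitlines()
print(lines)
```

Because the line break here sits within the first few characters, truncating the value alone would not remove it.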

Dataflow graph (apps/interactive-journal/backend/app/routers/routes.py at commit 1f2d3dc7ffb1518f8946e78f57dd43badfaa1f04)

Source: [Line: 184] entry_id
Trace: [Line: 184] entry_id
Sink: [Line: 212] f"Saved generated prompt for entry {entry_id}"

To resolve this comment:

✨ Commit Assistant fix suggestion

Suggested change

-    logger.info(f"Saved generated prompt for entry {entry_id}")
+    # Log only the first 16 characters of entry_id to minimize risk of log injection and exposure
+    logger.info(f"Saved generated prompt for entry {entry_id[:16] if entry_id else '[no_id]'}")

View step-by-step instructions

Avoid including the raw, unsanitized value of entry_id (which comes from user input) directly in log messages.

Update the log entry to log only a truncated, sanitized, or masked version of the entry ID. For example, use logger.info(f"Saved generated prompt for entry {entry_id[:16]}") to limit exposure of the input.

Alternatively, if you don’t want to expose any part of the entry ID, remove it from the log or use a generic placeholder: logger.info("Saved generated prompt for entry [redacted]").

If the entry ID must be recorded for tracing, validate that it matches the expected format (for example, a MongoDB ObjectId) before logging. You could use: from bson import ObjectId; if ObjectId.is_valid(entry_id): logger.info(f"Saved generated prompt for entry {entry_id}") else: logger.info("Saved generated prompt for entry [invalid_id]")

This helps prevent log injection attacks and accidental exposure of sensitive, user-controlled values.
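
The format-validation step above can be sketched as runnable code. This is only an illustration. It assumes entry IDs are MongoDB ObjectIds rendered as 24 hexadecimal characters, and it uses a standard-library regex as a stand-in for `ObjectId.is_valid` so that it runs without `bson` installed. The helpers `loggable_entry_id` and `log_saved_prompt` are hypothetical names, not part of the PR.

```python
import logging
import re

logger = logging.getLogger(__name__)

# Assumption: entry IDs are ObjectIds written as 24 hexadecimal characters.
# This regex approximates bson's ObjectId.is_valid for string input only.
_OBJECT_ID_RE = re.compile(r"[0-9a-fA-F]{24}")


def loggable_entry_id(entry_id):
    """Return entry_id if it looks like an ObjectId, else a fixed placeholder."""
    # fullmatch rejects any extra characters, including a trailing newline.
    if isinstance(entry_id, str) and _OBJECT_ID_RE.fullmatch(entry_id):
        return entry_id
    return "[invalid_id]"


def log_saved_prompt(entry_id):
    # Hypothetical wrapper around the log call flagged in routes.py.
    logger.info("Saved generated prompt for entry %s", loggable_entry_id(entry_id))
```

With this check, a value containing line breaks never reaches the log; it is replaced by the placeholder before formatting.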

💬 Ignore this finding

Reply with Semgrep commands to ignore this finding.

/fp <comment> for false positive

/ar <comment> for acceptable risk

/other <comment> for all other reasons

Alternatively, triage in Semgrep AppSec Platform to ignore the finding created by tainted-log-injection-stdlib-fastapi.

🛟 Help? Slack #semgrep-help or go/semgrep-help.

Resolution Options:

Fix the code

Reply /fp $reason (if security gap doesn’t exist)

Reply /ar $reason (if gap is valid but intentional; add mitigations/monitoring)

Reply /other $reason (e.g., test-only)

You can view more details about this finding in the Semgrep AppSec Platform.

semgrep-code-mongodb · 2025-12-30T21:31:47Z

apps/interactive-journal/backend/app/routers/routes.py

+def search_entries(q: str, version: int = 1):
+    """Search entries using vector search, grouped by entry."""
+    db = get_database()
+    logger.info(f"Searching entries with query: {q[:50]}... (version={version})")


Semgrep identified an issue in your code:
Detected a logger that logs user input without properly neutralizing the output. The log message could contain characters such as line feed (\n) and carriage return (\r), allowing an attacker to forge log entries or insert malicious content into the logs. Use proper input validation and/or output encoding to prevent log entries from being forged.

Dataflow graph (apps/interactive-journal/backend/app/routers/routes.py at commit 1f2d3dc7ffb1518f8946e78f57dd43badfaa1f04)

Source: [Line: 99] version
Trace: [Line: 99] version
Sink: [Line: 102] f"Searching entries with query: {q[:50]}... (version={version})"

To resolve this comment:

✨ Commit Assistant fix suggestion

Suggested change

-    logger.info(f"Searching entries with query: {q[:50]}... (version={version})")
+    # Sanitize user input to prevent log injection
+    clean_q = q.replace('\n', ' ').replace('\r', ' ')[:50]
+    logger.info(f"Searching entries with query: {clean_q}... (version={version})")

View step-by-step instructions

Avoid logging user-provided search queries directly using f-strings, as this may include malicious content in your logs.

If you need to record user input for debugging, log a masked or length-limited version, and always use structured logging with separate fields. For example, use logger.info("Searching entries", extra={"query": q[:50], "version": version}).

Alternatively, if you must include the value in the log message, sanitize the input by replacing newlines and special characters. For example: clean_q = q.replace('\n', ' ').replace('\r', ' ')[:50] then log: logger.info(f"Searching entries with query: {clean_q}... (version={version})").

Make sure to never log the full query if it may contain sensitive or user-controlled data to minimize risk of log injection.

Structured logging lets log processors or SIEM tools better handle and filter potentially unsafe or multiline user data.
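
The two suggestions above, sanitizing the query and passing it as separate fields, can be combined in a short sketch. The logger name `search_demo`, the helpers `clean_query` and `log_search`, and the formatter layout are assumptions made for illustration. Note that values passed via `extra` only appear in the output if the formatter references them.

```python
import io
import logging


def clean_query(q, max_len=50):
    """Replace line breaks with spaces, then truncate (hypothetical helper)."""
    return q.replace("\n", " ").replace("\r", " ")[:max_len]


# In-memory stream so the formatted output can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
# The formatter pulls the extra fields in explicitly; %r quotes the query value.
handler.setFormatter(
    logging.Formatter("%(levelname)s %(message)s query=%(query)r version=%(version)s")
)
search_logger = logging.getLogger("search_demo")
search_logger.setLevel(logging.INFO)
search_logger.addHandler(handler)
search_logger.propagate = False


def log_search(q, version=1):
    # The message is constant; user input travels only in separate fields.
    search_logger.info(
        "Searching entries", extra={"query": clean_query(q), "version": version}
    )


log_search("cats\nINFO forged entry", version=2)
output = stream.getvalue()
print(output)
```

Keeping the message constant and the user input in named fields makes it easier for a log processor to filter or escape the untrusted part.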


Add Vision RAG notebook

40f5acd

tgourdel requested a review from a team as a code owner December 12, 2025 00:03

tgourdel added 2 commits December 11, 2025 22:52

Remove unused imports

78296b7

Switch to Claude's vision-capable LLM

108aea7

semgrep-code-mongodb bot reviewed Dec 30, 2025

tgourdel added 2 commits December 30, 2025 13:37

Fix pre-commit tests

b3aefb9

Use sonnet-4.5

88c87fe


