Add Vision RAG notebook #171
base: main
Conversation
    "created_at": msg_date,
}
db.messages.insert_one(prompt_msg)
logger.info(f"Saved generated prompt for entry {entry_id}")
Semgrep identified an issue in your code:
Detected a logger that logs user input without properly neutralizing the output. The log message could contain characters like \n and \r, which could let an attacker forge log entries or include malicious content in the logs. Use proper input validation and/or output encoding to prevent log entries from being forged.
Dataflow graph
flowchart LR
classDef invis fill:white, stroke: none
classDef default fill:#e7f5ff, color:#1c7fd6, stroke: none
subgraph File0["<b>apps/interactive-journal/backend/app/routers/routes.py</b>"]
direction LR
%% Source
subgraph Source
direction LR
v0["<a href=https://github.com/mongodb-developer/GenAI-Showcase/blob/1f2d3dc7ffb1518f8946e78f57dd43badfaa1f04/apps/interactive-journal/backend/app/routers/routes.py#L184 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 184] entry_id</a>"]
end
%% Intermediate
subgraph Traces0[Traces]
direction TB
v2["<a href=https://github.com/mongodb-developer/GenAI-Showcase/blob/1f2d3dc7ffb1518f8946e78f57dd43badfaa1f04/apps/interactive-journal/backend/app/routers/routes.py#L184 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 184] entry_id</a>"]
end
%% Sink
subgraph Sink
direction LR
v1["<a href=https://github.com/mongodb-developer/GenAI-Showcase/blob/1f2d3dc7ffb1518f8946e78f57dd43badfaa1f04/apps/interactive-journal/backend/app/routers/routes.py#L212 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 212] f"Saved generated prompt for entry {entry_id}"</a>"]
end
end
%% Class Assignment
Source:::invis
Sink:::invis
Traces0:::invis
File0:::invis
%% Connections
Source --> Traces0
Traces0 --> Sink
To resolve this comment:
✨ Commit Assistant fix suggestion
- logger.info(f"Saved generated prompt for entry {entry_id}")
+ # Log only the first 16 characters of entry_id to minimize risk of log injection and exposure
+ logger.info(f"Saved generated prompt for entry {entry_id[:16] if entry_id else '[no_id]'}")
View step-by-step instructions
- Avoid including the raw, unsanitized value of entry_id (which comes from user input) directly in log messages.
- Update the log entry to log only a truncated, sanitized, or masked version of the entry ID. For example, use logger.info(f"Saved generated prompt for entry {entry_id[:16]}") to limit exposure of the input.
- Alternatively, if you don’t want to expose any part of the entry ID, remove it from the log or use a generic placeholder: logger.info("Saved generated prompt for entry [redacted]").
- If the entry ID must be recorded for tracing, validate that it matches the expected format (for example, a MongoDB ObjectId) before logging. You could use:
    from bson import ObjectId
    if ObjectId.is_valid(entry_id):
        logger.info(f"Saved generated prompt for entry {entry_id}")
    else:
        logger.info("Saved generated prompt for entry [invalid_id]")
This helps prevent log injection attacks and accidental exposure of sensitive, user-controlled values.
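The format-validation option in the last step can be sketched with the standard library alone. This is a minimal sketch, assuming entry IDs arrive as strings: `safe_entry_id`, `log_prompt_saved`, and the logger name are hypothetical, and the regular expression only approximates `ObjectId.is_valid` for the 24-character hex form.

```python
import logging
import re

logger = logging.getLogger("journal")  # hypothetical logger name

# Assumption: a valid entry ID is the 24-character hex form of a MongoDB ObjectId.
_OBJECT_ID_RE = re.compile(r"[0-9a-fA-F]{24}")


def safe_entry_id(entry_id: object) -> str:
    """Return entry_id if it looks like an ObjectId string, else a placeholder."""
    # fullmatch rejects anything with extra characters, including newlines.
    if isinstance(entry_id, str) and _OBJECT_ID_RE.fullmatch(entry_id):
        return entry_id
    return "[invalid_id]"


def log_prompt_saved(entry_id: object) -> None:
    # Pass the value as a logging argument rather than formatting it into the message.
    logger.info("Saved generated prompt for entry %s", safe_entry_id(entry_id))
```

Because only hex characters can pass the check, a value containing line breaks or forged log text is replaced by the placeholder before it reaches the logger.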
💬 Ignore this finding
Reply with Semgrep commands to ignore this finding.
- /fp <comment> for false positive
- /ar <comment> for acceptable risk
- /other <comment> for all other reasons
Alternatively, triage in Semgrep AppSec Platform to ignore the finding created by tainted-log-injection-stdlib-fastapi.
🛟 Help? Slack #semgrep-help or go/semgrep-help.
Resolution Options:
- Fix the code
- Reply /fp $reason (if security gap doesn’t exist)
- Reply /ar $reason (if gap is valid but intentional; add mitigations/monitoring)
- Reply /other $reason (e.g., test-only)
You can view more details about this finding in the Semgrep AppSec Platform.
def search_entries(q: str, version: int = 1):
    """Search entries using vector search, grouped by entry."""
    db = get_database()
    logger.info(f"Searching entries with query: {q[:50]}... (version={version})")
Semgrep identified an issue in your code:
Detected a logger that logs user input without properly neutralizing the output. The log message could contain characters like \n and \r, which could let an attacker forge log entries or include malicious content in the logs. Use proper input validation and/or output encoding to prevent log entries from being forged.
Dataflow graph
flowchart LR
classDef invis fill:white, stroke: none
classDef default fill:#e7f5ff, color:#1c7fd6, stroke: none
subgraph File0["<b>apps/interactive-journal/backend/app/routers/routes.py</b>"]
direction LR
%% Source
subgraph Source
direction LR
v0["<a href=https://github.com/mongodb-developer/GenAI-Showcase/blob/1f2d3dc7ffb1518f8946e78f57dd43badfaa1f04/apps/interactive-journal/backend/app/routers/routes.py#L99 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 99] version</a>"]
end
%% Intermediate
subgraph Traces0[Traces]
direction TB
v2["<a href=https://github.com/mongodb-developer/GenAI-Showcase/blob/1f2d3dc7ffb1518f8946e78f57dd43badfaa1f04/apps/interactive-journal/backend/app/routers/routes.py#L99 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 99] version</a>"]
end
%% Sink
subgraph Sink
direction LR
v1["<a href=https://github.com/mongodb-developer/GenAI-Showcase/blob/1f2d3dc7ffb1518f8946e78f57dd43badfaa1f04/apps/interactive-journal/backend/app/routers/routes.py#L102 target=_blank style='text-decoration:none; color:#1c7fd6'>[Line: 102] f"Searching entries with query: {q[:50]}... (version={version})"</a>"]
end
end
%% Class Assignment
Source:::invis
Sink:::invis
Traces0:::invis
File0:::invis
%% Connections
Source --> Traces0
Traces0 --> Sink
To resolve this comment:
✨ Commit Assistant fix suggestion
- logger.info(f"Searching entries with query: {q[:50]}... (version={version})")
+ # Sanitize user input to prevent log injection
+ clean_q = q.replace('\n', ' ').replace('\r', ' ')[:50]
+ logger.info(f"Searching entries with query: {clean_q}... (version={version})")
View step-by-step instructions
- Avoid logging user-provided search queries directly using f-strings, as this may include malicious content in your logs.
- If you need to record user input for debugging, log a masked or length-limited version, and always use structured logging with separate fields. For example, use logger.info("Searching entries", extra={"query": q[:50], "version": version}).
- Alternatively, if you must include the value in the log message, sanitize the input by replacing newlines and special characters. For example: clean_q = q.replace('\n', ' ').replace('\r', ' ')[:50], then log: logger.info(f"Searching entries with query: {clean_q}... (version={version})").
- Make sure to never log the full query if it may contain sensitive or user-controlled data to minimize risk of log injection.
Structured logging lets log processors or SIEM tools better handle and filter potentially unsafe or multiline user data.
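The two options above (sanitizing the value and logging it as a separate field) can be combined. The following is a minimal sketch under stated assumptions: `sanitize_for_log`, `log_search`, and the logger name are hypothetical helpers, not part of the PR, and replacing every non-printable character with a space is one possible neutralization policy among several.

```python
import logging

logger = logging.getLogger("journal.search")  # hypothetical logger name


def sanitize_for_log(value: str, max_len: int = 50) -> str:
    """Replace non-printable characters (including \\r and \\n) with spaces, then truncate."""
    cleaned = "".join(ch if ch.isprintable() else " " for ch in value)
    return cleaned[:max_len]


def log_search(q: str, version: int) -> None:
    # The message is a fixed string; user-controlled data travels only in
    # separate fields attached to the log record via `extra`.
    logger.info(
        "Searching entries",
        extra={"query": sanitize_for_log(q), "version": int(version)},
    )
```

Fields passed through `extra` become attributes on the log record, so they show up in the output only if the configured formatter or handler includes them.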
Adding Vision RAG notebook