-
Notifications
You must be signed in to change notification settings - Fork 1.4k
Update os from v3.14.2 and fix nt,posix,_os #6958
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughSocket timeout APIs now accept optional float args via the VM; float→Duration conversion tightened; runtime warnings added for bool used as file descriptors; ScandirIterator gains a destructor and readinto added; link and spawn/path handling moved to FsPath/OsPath with a new _create_environ; Windows env dedupe; SIGBREAK added on Windows. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
Code has been automatically formatted The code in this PR has been formatted using:
git pull origin os-posix |
fcf7097 to
5d6d72f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/vm/src/convert/try_from.rs (1)
127-137:⚠️ Potential issue | 🟠 MajorAdd is_finite() check and range validation to prevent silent overflow on NaN/Inf.
Casting NaN/Inf to
u64silently saturates (NaN→0, Inf→u64::MAX), and large values aboveu64::MAXsilently cap atu64::MAX. The code should validate that the float is finite and within range before conversion. Python's timedelta rejects these cases with an error.Additionally, while the fractional part alone produces at most 999_999_999 nanos, a defensive check for
nanos >= 1_000_000_000before callingDuration::newprevents any potential panic ifsecssaturates tou64::MAX.🛠️ Suggested fix
let f = float.to_f64(); + if !f.is_finite() { + return Err(vm.new_value_error("value out of range")); + } if f < 0.0 { return Err(vm.new_value_error("negative duration")); } + if f > u64::MAX as f64 { + return Err(vm.new_value_error("value out of range")); + } // Convert float to Duration using floor rounding (_PyTime_ROUND_FLOOR) - let secs = f.trunc() as u64; + let mut secs = f.trunc() as u64; let frac = f.fract(); // Use floor to round down the nanoseconds - let nanos = (frac * 1_000_000_000.0).floor() as u32; - Ok(Self::new(secs, nanos)) + let mut nanos = (frac * 1_000_000_000.0).floor() as u32; + if nanos == 1_000_000_000 { + if secs == u64::MAX { + return Err(vm.new_value_error("value out of range")); + } + secs += 1; + nanos = 0; + } + Ok(Self::new(secs, nanos))
🤖 Fix all issues with AI agents
In `@crates/stdlib/src/socket.rs`:
- Around line 2204-2222: In settimeout (and the analogous setdefaulttimeout)
reject non-finite or excessively large floats before storing/using them: check
that the converted float is finite (f.is_finite()), non‑negative, and does not
exceed the maximum representable seconds for Duration::from_secs_f64 (e.g.
compare against u64::MAX as f64 or otherwise use a safe upper bound), and return
a ValueError via vm.new_value_error when the check fails; update the same
validations around timeout.store and any code paths that later call
Duration::from_secs_f64 so you never pass NaN/Inf or out‑of‑range values to
Duration::from_secs_f64.
In `@crates/vm/src/stdlib/os.rs`:
- Around line 1376-1385: The non-Unix branch handling of follow_symlinks (in the
match on follow_symlinks.into_option() that computes src_path) currently treats
None the same as Some(false); change it so the default None behaves like
Some(true) to match the Unix implementation: when follow_symlinks is Some(true)
or None, call fs::canonicalize(&src.path).unwrap_or_else(|_|
PathBuf::from(src.path.clone())), otherwise (Some(false)) use
PathBuf::from(src.path.clone()); update the match arms around
follow_symlinks.into_option(), src_path, and canonicalize accordingly.
🧹 Nitpick comments (2)
crates/vm/src/stdlib/nt.rs (1)
186-195: Consider reusingenviron()to keep_create_environin sync.
This avoids drift if the filtering logic changes in one place.crates/vm/src/stdlib/posix.rs (1)
456-467: Consider extracting shared logic to reduce duplication.The
_create_environfunction duplicates the logic from theenvironattribute (lines 443-454). While this may be intentional to match CPython's API, consider extracting the shared logic into a helper function.♻️ Optional refactor to reduce duplication
+fn create_environ_dict(vm: &VirtualMachine) -> PyDictRef { + use rustpython_common::os::ffi::OsStringExt; + + let environ = vm.ctx.new_dict(); + for (key, value) in env::vars_os() { + let key: PyObjectRef = vm.ctx.new_bytes(key.into_vec()).into(); + let value: PyObjectRef = vm.ctx.new_bytes(value.into_vec()).into(); + environ.set_item(&*key, value, vm).unwrap(); + } + environ +} + #[pyattr] fn environ(vm: &VirtualMachine) -> PyDictRef { - use rustpython_common::os::ffi::OsStringExt; - - let environ = vm.ctx.new_dict(); - for (key, value) in env::vars_os() { - let key: PyObjectRef = vm.ctx.new_bytes(key.into_vec()).into(); - let value: PyObjectRef = vm.ctx.new_bytes(value.into_vec()).into(); - environ.set_item(&*key, value, vm).unwrap(); - } - - environ + create_environ_dict(vm) } #[pyfunction] fn _create_environ(vm: &VirtualMachine) -> PyDictRef { - use rustpython_common::os::ffi::OsStringExt; - - let environ = vm.ctx.new_dict(); - for (key, value) in env::vars_os() { - let key: PyObjectRef = vm.ctx.new_bytes(key.into_vec()).into(); - let value: PyObjectRef = vm.ctx.new_bytes(value.into_vec()).into(); - environ.set_item(&*key, value, vm).unwrap(); - } - environ + create_environ_dict(vm) }
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/vm/src/convert/try_from.rs (1)
125-137:⚠️ Potential issue | 🟠 MajorHandle NaN/∞ and overflow in float→Duration conversion.
as u64on non‑finite or huge floats can silently saturate in Rust (e.g., NaN → 0, ∞ → u64::MAX), which turns invalid inputs into valid durations. This needs explicitis_finite()and range checks before casting, matching Python's behavior of raising OverflowError.💡 Suggested fix
if let Some(float) = obj.downcast_ref::<PyFloat>() { let f = float.to_f64(); + if !f.is_finite() { + return Err(vm.new_value_error("non-finite duration")); + } if f < 0.0 { return Err(vm.new_value_error("negative duration")); } // Convert float to Duration using floor rounding (_PyTime_ROUND_FLOOR) - let secs = f.trunc() as u64; - let frac = f.fract(); + let secs_f = f.trunc(); + if secs_f > u64::MAX as f64 { + return Err(vm.new_value_error("duration is too large")); + } + let secs = secs_f as u64; + let frac = f - secs_f; // Use floor to round down the nanoseconds let nanos = (frac * 1_000_000_000.0).floor() as u32; Ok(Self::new(secs, nanos))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In `@crates/vm/src/convert/try_from.rs`:
- Around line 140-145: The conversion can produce nanos == 1_000_000_000 due to
float precision and Duration::new (Self::new) will panic; after computing secs
and nanos from f (and frac), detect if nanos >= 1_000_000_000 and, if so,
increment secs by 1 and set nanos to 0 (or clamp nanos to 999_999_999 and adjust
secs accordingly) before calling Self::new(secs, nanos) to guarantee nanos <
1_000_000_000 and avoid panic.
In `@crates/vm/src/stdlib/nt.rs`:
- Around line 986-1001: The env handling in spawnve incorrectly accepts
path-like objects via FsPath::try_from_path_like for environment keys/values;
change the logic around FsPath::try_from_path_like(key, ...) and
FsPath::try_from_path_like(value, ...) so that spawnve only accepts native
strings/bytes: convert the incoming Py objects to Rust strings/bytes directly
(fail with a TypeError/ValueError if they are not str/bytes) and build env_str
from those string representations before converting to WideCString (preserve the
existing validation of '=' in key_str and the env_strings push). Apply the same
change to the second spawnve implementation later in the file to ensure
keys/values are not accepted via __fspath__.
🧹 Nitpick comments (8)
crates/vm/src/convert/try_from.rs (2)
132-134: Inconsistent error message ownership between float and int paths.Line 133 uses
.to_owned()for the error message while line 150 uses a string literal directly. This inconsistency suggests either unnecessary allocation on line 133 or a potential compile error on line 150.♻️ Suggested fix for consistency
Either remove
.to_owned()from line 133 ifnew_value_erroraccepts&str:if f < 0.0 { - return Err(vm.new_value_error("negative duration".to_owned())); + return Err(vm.new_value_error("negative duration")); }Or add
.to_owned()to line 150 for consistency:if bigint.sign() == Sign::Minus { - return Err(vm.new_value_error("negative duration")); + return Err(vm.new_value_error("negative duration".to_owned())); }Also applies to: 149-151
153-156: Consider aligning error message with float path for consistency.The integer overflow error uses "value out of range" while the float path uses "timestamp too large to convert to C PyTime_t". Consider using consistent error messages for the same logical error (value exceeding valid range).
♻️ Suggested fix for message consistency
let sec = bigint .to_u64() - .ok_or_else(|| vm.new_value_error("value out of range"))?; + .ok_or_else(|| vm.new_overflow_error("timestamp too large to convert to C PyTime_t".to_owned()))?;crates/vm/src/stdlib/signal.rs (1)
208-222: Consider extractingVALID_SIGNALSto a module-level constant to avoid duplication.This array is duplicated in
raise_signal()at lines 505-513. Extracting it to a single#[cfg(windows)]constant would reduce maintenance burden and ensure consistency.♻️ Proposed refactor
Add a module-level constant:
#[cfg(windows)] const WINDOWS_VALID_SIGNALS: &[i32] = &[ libc::SIGINT, libc::SIGILL, libc::SIGFPE, libc::SIGSEGV, libc::SIGTERM, SIGBREAK, libc::SIGABRT, ];Then reference
WINDOWS_VALID_SIGNALSin bothsignal()andraise_signal():#[cfg(windows)] { - const VALID_SIGNALS: &[i32] = &[ - libc::SIGINT, - libc::SIGILL, - libc::SIGFPE, - libc::SIGSEGV, - libc::SIGTERM, - SIGBREAK, - libc::SIGABRT, - ]; - if !VALID_SIGNALS.contains(&signalnum) { + if !WINDOWS_VALID_SIGNALS.contains(&signalnum) { return Err(vm.new_value_error(format!("signal number {} out of range", signalnum))); } }crates/stdlib/src/socket.rs (1)
2204-2226: Validation logic correctly rejects invalid timeout values.The implementation properly handles NaN, negative values, and infinity. The validation order is correct (NaN first with its specific message, then the combined negative/non-finite check).
Consider extracting the timeout validation into a helper to avoid duplication with
setdefaulttimeout:♻️ Optional: Extract shared validation logic
fn validate_timeout(f: f64, vm: &VirtualMachine) -> PyResult<f64> { if f.is_nan() { return Err(vm.new_value_error("Invalid value NaN (not a number)".to_owned())); } if f < 0.0 || !f.is_finite() { return Err(vm.new_value_error("Timeout value out of range".to_owned())); } Ok(f) }This could then be used by both
settimeoutandsetdefaulttimeout.crates/vm/src/ospath.rs (1)
84-95: Consider reusingwarn_if_bool_fdfromcrate::stdlib::os.This warning logic duplicates the
warn_if_bool_fdfunction incrates/vm/src/stdlib/os.rs(lines 122-136). Both perform the same check and emit the same warning message. Consider calling the shared helper to reduce duplication and ensure consistency.♻️ Proposed refactor to reuse the shared helper
// Handle fd (before __fspath__ check, like CPython) if let Some(int) = obj.try_index_opt(vm) { - // Warn if bool is used as a file descriptor - if obj - .class() - .is(crate::builtins::bool_::PyBool::static_type()) - { - crate::stdlib::warnings::warn( - vm.ctx.exceptions.runtime_warning, - "bool is used as a file descriptor".to_owned(), - 1, - vm, - )?; - } + crate::stdlib::os::warn_if_bool_fd(&obj, vm)?; let fd = int?.try_to_primitive(vm)?;Note: This requires
warn_if_bool_fdto be madepub(crate)if it isn't already.crates/vm/src/stdlib/os.rs (1)
1725-1728: Clarify the warning exception handling.The condition
e.fast_isinstance(vm.ctx.exceptions.warning)is checking if the exception is a Warning subclass. However,warn_if_bool_fdraises aRuntimeWarningviawarnings.warn(), which typically doesn't raise an exception unless warnings are configured to raise.If the intent is to propagate warnings-as-errors, this check is correct. Consider adding a brief comment to clarify the purpose.
📝 Proposed comment for clarity
match path.clone().try_into_value::<crt_fd::Borrowed<'_>>(vm) { Ok(fd) => return ftruncate(fd, length).map_err(|e| e.into_pyexception(vm)), + // Propagate warnings-as-errors (e.g., RuntimeWarning from warn_if_bool_fd) Err(e) if e.fast_isinstance(vm.ctx.exceptions.warning) => return Err(e), Err(_) => {} }crates/vm/src/stdlib/posix.rs (2)
296-307: Minor inconsistency: use importedwarn_if_bool_fdfor consistency.Line 298 uses the full crate path
crate::stdlib::os::warn_if_bool_fd, but the function is already imported at line 38. Line 518 correctly uses justwarn_if_bool_fd. Consider using the imported name here for consistency.Suggested change
fn try_from_object(vm: &VirtualMachine, obj: PyObjectRef) -> PyResult<Self> { - crate::stdlib::os::warn_if_bool_fd(&obj, vm)?; + warn_if_bool_fd(&obj, vm)?; let fd = i32::try_from_object(vm, obj)?;
464-475: Consider eliminating code duplication withenviron.
_create_environ(lines 464-475) has identical implementation to theenvironattribute function (lines 450-462). Both iterate overenv::vars_os(), convert keys and values to bytes, and insert them into a new dict.Suggested refactor
#[pyfunction] fn _create_environ(vm: &VirtualMachine) -> PyDictRef { - use rustpython_common::os::ffi::OsStringExt; - - let environ = vm.ctx.new_dict(); - for (key, value) in env::vars_os() { - let key: PyObjectRef = vm.ctx.new_bytes(key.into_vec()).into(); - let value: PyObjectRef = vm.ctx.new_bytes(value.into_vec()).into(); - environ.set_item(&*key, value, vm).unwrap(); - } - environ + environ(vm) }
| let key = FsPath::try_from_path_like(key, true, vm)?; | ||
| let value = FsPath::try_from_path_like(value, true, vm)?; | ||
| let key_str = key.to_string_lossy(); | ||
| let value_str = value.to_string_lossy(); | ||
|
|
||
| // Validate: no null characters in key or value | ||
| if key_str.contains('\0') || value_str.contains('\0') { | ||
| return Err(vm.new_value_error("embedded null character")); | ||
| } | ||
| // Validate: no '=' in key (search from index 1 because on Windows | ||
| // starting '=' is allowed for defining hidden environment variables) | ||
| if key_str.get(1..).is_some_and(|s| s.contains('=')) { | ||
| return Err(vm.new_value_error("illegal environment variable name")); | ||
| } | ||
|
|
||
| let env_str = format!("{}={}", key_str, value_str); | ||
| env_strings.push(make_widestring(&env_str)?); | ||
| env_strings.push( | ||
| widestring::WideCString::from_os_str(&*std::ffi::OsString::from(env_str)) | ||
| .map_err(|err| err.to_pyexception(vm))?, | ||
| ); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# First, let's understand the context - look at the complete function containing this code
sed -n '950,1010p' crates/vm/src/stdlib/nt.rsRepository: RustPython/RustPython
Length of output: 2259
🏁 Script executed:
#!/bin/bash
# Check the FsPath implementation and what try_from_path_like does
rg -n "impl.*FsPath" -A 20 | head -100Repository: RustPython/RustPython
Length of output: 2855
🏁 Script executed:
#!/bin/bash
# Search for how environment variables are typically handled in CPython
# Let's check if there are comments or documentation about env handling
rg -n "environment" crates/vm/src/stdlib/nt.rs -B 2 -A 2 | head -80Repository: RustPython/RustPython
Length of output: 1594
🏁 Script executed:
#!/bin/bash
# Look at the function signature to understand what function this code is in
sed -n '900,1010p' crates/vm/src/stdlib/nt.rs | grep -n "fn\|pub\|spawnv"Repository: RustPython/RustPython
Length of output: 685
🌐 Web query:
CPython spawnve environment variables type requirements
💡 Result:
In CPython, os.spawnve(mode, path, args, env) requires env to be a mapping (typically a dict) that defines the child process environment (it replaces the current process environment). [1]
Type requirements: the env mapping’s keys and values must be str (text). Non-str keys/values are considered invalid and will cause the call to fail (documented as returning 127). [1]
Reference example: passing os.environ (which is a str→str mapping) is the typical pattern. [1]
Sources: Python os module docs (spawn* section). [1]
[1] (docs.python.org)
Citations:
🏁 Script executed:
#!/bin/bash
# Check if there are test cases for spawnve environment handling
find . -name "*.py" -type f | xargs grep -l "spawnve" 2>/dev/null | head -5Repository: RustPython/RustPython
Length of output: 98
🏁 Script executed:
#!/bin/bash
# Look for any tests or documentation in the RustPython repo about spawnve
fd -e py -e rs | xargs grep -l "spawnve" 2>/dev/nullRepository: RustPython/RustPython
Length of output: 148
🏁 Script executed:
#!/bin/bash
# Check what argv does for comparison - if it uses FsPath too
sed -n '960,975p' crates/vm/src/stdlib/nt.rsRepository: RustPython/RustPython
Length of output: 594
🏁 Script executed:
#!/bin/bash
# Check if there's a separate non-FsPath version for environment handling elsewhere
rg -n "to_string_lossy" crates/vm/src/stdlib/nt.rs -B 5 -A 2Repository: RustPython/RustPython
Length of output: 1003
Remove os.PathLike support for environment variable keys and values in spawnve().
CPython's os.spawnve() only accepts str for environment dictionary keys and values. The current implementation using FsPath::try_from_path_like() incorrectly accepts path-like objects (via __fspath__), which is non-standard behavior.
For environment variables at lines 986–987, use direct string/bytes conversion instead of FsPath::try_from_path_like(). This deviation from CPython's strict str-only requirement may cause unexpected behavior when path-like objects are passed.
Note: argv elements appropriately use FsPath since the command path can be path-like; only the environment dict handling needs correction. A similar issue likely exists in the second spawnve implementation further down in the file.
🤖 Prompt for AI Agents
In `@crates/vm/src/stdlib/nt.rs` around lines 986 - 1001, The env handling in
spawnve incorrectly accepts path-like objects via FsPath::try_from_path_like for
environment keys/values; change the logic around FsPath::try_from_path_like(key,
...) and FsPath::try_from_path_like(value, ...) so that spawnve only accepts
native strings/bytes: convert the incoming Py objects to Rust strings/bytes
directly (fail with a TypeError/ValueError if they are not str/bytes) and build
env_str from those string representations before converting to WideCString
(preserve the existing validation of '=' in key_str and the env_strings push).
Apply the same change to the second spawnve implementation later in the file to
ensure keys/values are not accepted via __fspath__.
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: [x] lib: cpython/Lib/os.py dependencies:
dependent tests: (160 tests)
[x] lib: cpython/Lib/genericpath.py dependencies:
dependent tests: (16 tests)
[x] test: cpython/Lib/test/test_posix.py (TODO: 5) dependencies: dependent tests: (no tests depend on posix) [x] lib: cpython/Lib/signal.py dependencies:
dependent tests: (67 tests)
[ ] lib: cpython/Lib/subprocess.py dependencies:
dependent tests: (52 tests)
Legend:
|
Summary by CodeRabbit
New Features
Improvements