Fix typing, typevar, genericalias, and symboltable#7091
Fix typing, typevar, genericalias, and symboltable#7091youknowone merged 1 commit intoRustPython:mainfrom
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughSymbolTable gains optional per-scope Changes
Sequence DiagramsequenceDiagram
participant Compiler as Compiler (compile.rs)
participant SymTable as SymbolTable (symboltable.rs)
participant Resolver as maybe_mangle_name
Compiler->>SymTable: enter_type_param_block(for_class)
SymTable->>SymTable: init/propagate mangled_names (Option\<Set\>)
Compiler->>SymTable: register_name(name, usage)
alt usage == TypeParam
SymTable->>SymTable: add name to mangled_names
end
Compiler->>Resolver: maybe_mangle_name(class_name, mangled_names, name)
alt mangled_names is Some and contains name
Resolver->>Resolver: call mangle_name(class_name, name)
Resolver-->>Compiler: return mangled name
else
Resolver-->>Compiler: return original name
end
Compiler->>SymTable: insert/lookup resolved name
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
No actionable comments were generated in the recent review. 🎉 Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
07fbdd0 to
cf798e7
Compare
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/codegen/src/symboltable.rs (1)
1279-1322:⚠️ Potential issue | 🔴 CriticalBase class names are mangled incorrectly due to
class_namebeing set when scanning bases (regression).In the old code,
self.class_name = prev_classwas restored immediately afterself.leave_scope()and before scanning bases. The new code moved this restoration to after base scanning, leavingclass_nameset to the current class during base expression evaluation.This breaks name mangling for nested classes with private base names:
Non-generic class: Bases are scanned with
class_name = "Child"andmangled_names = None, so__Privategets mangled as_Child__Private. CPython mangles it as_Outer__Private(bases are evaluated in the enclosing scope).Generic class: Bases are scanned with
class_name = "Child"andmangled_names = Some({T}). Since__Privateis not in the set,maybe_mangle_namereturns it unmangled. CPython would mangle it as_Outer__Private.Root cause: The refactored code moved
self.class_name = prev_class(restoration of the enclosing class name) to line 1320, after bases are scanned (lines 1310–1315), whereas the old code restored it before scanning bases.Fix: Restore
self.class_nametoprev_class(the enclosing class name) immediately afterself.leave_scope()and before scanning base arguments.
🧹 Nitpick comments (2)
crates/vm/src/builtins/genericalias.rs (1)
737-766: Duplicated logic withunpack_typevartuplesintyping.rs.This TypeVarTuple unpacking logic (check
downcast_ref::<TypeVarTuple>, wrap withUnpack.__getitem__) is essentially identical tounpack_typevartuplesincrates/vm/src/stdlib/typing.rs(lines 469–490). Consider extracting a shared helper to avoid maintaining both copies.Additionally,
new_args(line 738) is allocated even whenneeds_unpackisfalseand theelsebranch returnsparamswithout using it. Move the allocation inside theif needs_unpackblock.Suggested refactor
- // Unpack TypeVarTuples: wrap each TypeVarTuple in Unpack[...] - let mut new_args = Vec::with_capacity(params.len()); - let mut needs_unpack = false; - for param in params.iter() { - if param - .downcast_ref::<crate::stdlib::typing::TypeVarTuple>() - .is_some() - { - needs_unpack = true; - break; - } - } - - let args = if needs_unpack { - let unpack_cls = typing_module.get_attr("Unpack", vm)?; - for param in params.iter() { - if param - .downcast_ref::<crate::stdlib::typing::TypeVarTuple>() - .is_some() - { - let unpacked = vm.call_method(&unpack_cls, "__getitem__", (param.clone(),))?; - new_args.push(unpacked); - } else { - new_args.push(param.clone()); - } - } - PyTuple::new_ref(new_args, &vm.ctx) - } else { - params - }; + // Unpack TypeVarTuples: wrap each TypeVarTuple in Unpack[...] + let has_tvt = params + .iter() + .any(|p| p.downcast_ref::<crate::stdlib::typing::TypeVarTuple>().is_some()); + + let args = if has_tvt { + let unpack_cls = typing_module.get_attr("Unpack", vm)?; + let new_args: Vec<PyObjectRef> = params + .iter() + .map(|p| { + if p.downcast_ref::<crate::stdlib::typing::TypeVarTuple>().is_some() { + vm.call_method(&unpack_cls, "__getitem__", (p.clone(),)) + } else { + Ok(p.clone()) + } + }) + .collect::<PyResult<_>>()?; + PyTuple::new_ref(new_args, &vm.ctx) + } else { + params + };This also aligns the style with the existing
unpack_typevartuplesincrates/vm/src/stdlib/typing.rs(line 479–488), which uses the iterator/map/collect pattern.crates/vm/src/stdlib/typevar.rs (1)
670-685: ParamSpec and TypeVar repr implementations are identical — consider extracting a shared helper.The variance-prefix formatting logic (lines 674–684) is a direct copy of TypeVar's
repr_str(lines 273–283). A small shared function likeformat_variance_prefix(name, infer_variance, covariant, contravariant)would eliminate the duplication.
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/codegen/src/symboltable.rs (1)
2631-2662:⚠️ Potential issue | 🟠 Major
maybe_mangle_namelogic is correct, butmangle_nameimplementation diverges from existing code.The
mangle_nameinsymboltable.rs(line 2640) usesstrip_prefix(|c| c == '_')to strip only one leading underscore, while the existingmangle_nameincrates/vm/src/builtins/type.rs(line 2258) usestrim_start_matches('_')to strip all leading underscores. This inconsistency was introduced in this PR (commit 3343b1c), not pre-existing. The comment insymboltable.rssays "strip leading underscore" (singular), whiletype.rssays "Strip leading underscores" (plural), indicating intent divergence. CPython strips all leading underscores during name mangling. Both implementations should be consistent;symboltable.rsshould usetrim_start_matches('_')to matchtype.rsand CPython semantics.
🧹 Nitpick comments (2)
crates/vm/src/builtins/genericalias.rs (1)
737-759: Eliminate duplicate TypeVarTuple-unpacking logic by reusingunpack_typevartuples.The logic at lines 737–759 duplicates the exact same TypeVarTuple detection and wrapping pattern already implemented in
crate::stdlib::typing::unpack_typevartuples()(lines 470–490). The only syntactic difference isdowncast_ref::<…>().is_some()vsdowncastable::<…>(), which are equivalent.To maintain a single source of truth, make
unpack_typevartuplespub(crate)intyping.rsand call it here instead:- let has_tvt = params - .iter() - .any(|p| p.downcast_ref::<crate::stdlib::typing::TypeVarTuple>().is_some()); - - let args = if has_tvt { - let unpack_cls = typing_module.get_attr("Unpack", vm)?; - let new_args: Vec<PyObjectRef> = params - .iter() - .map(|p| { - if p.downcast_ref::<crate::stdlib::typing::TypeVarTuple>() - .is_some() - { - vm.call_method(&unpack_cls, "__getitem__", (p.clone(),)) - } else { - Ok(p.clone()) - } - }) - .collect::<PyResult<_>>()?; - PyTuple::new_ref(new_args, &vm.ctx) - } else { - params - }; + let args = crate::stdlib::typing::unpack_typevartuples(¶ms, vm)?;crates/codegen/src/symboltable.rs (1)
1279-1322:class_namesave/restore is correct for nested classes and the generic path.One minor observation: when
type_params.is_some(), line 1300 re-setsclass_nameto the same value already set at line 1288. This is harmless (and needed for the non-generic path), but you could guard the second set withif type_params.is_none()if you want to avoid the redundant allocation.Optional: avoid redundant allocation when type_params is present
- // Set class_name for mangling in class body - self.class_name = Some(name.to_string()); + // Set class_name for mangling in class body (already set for generic path) + if type_params.is_none() { + self.class_name = Some(name.to_string()); + }
|
Code has been automatically formatted The code in this PR has been formatted using:
git pull origin typing |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/codegen/src/symboltable.rs (1)
1279-1321:⚠️ Potential issue | 🟠 MajorBase classes incorrectly mangled with current class name in non-generic classes.
After
leave_scope()at line 1305 (exiting the class body),self.class_nameremains set to the current class name (e.g.,"__Inner"). When bases and keywords are scanned at lines 1309–1313, they are evaluated in the enclosing scope's symbol table, but name mangling uses the wrong class context.For example, in:
class Outer: class __Inner(__Base): # __Base should mangle with "Outer", not "__Inner" passThe base class
__Basewould incorrectly be tracked as mangled with__Inner, producing_Inner__Baseinstead of_Outer__Base.For generic classes this is not an issue since bases are scanned within the type_param scope (before line 1316's
leave_scope). The fix is to restoreself.class_nametoprev_classbefore scanning bases in the non-generic case:Proposed fix
self.leave_scope(); self.in_conditional_block = saved_in_conditional; + // Restore class_name to enclosing class for base/keyword scanning in non-generic case + if type_params.is_none() { + self.class_name = prev_class.clone(); + } // Bases/keywords are scanned in type_param scope (if generic) - // class_name is still set, so mangling works via mangled_names filtering + // or enclosing scope (if non-generic) if let Some(arguments) = arguments {
🤖 Fix all issues with AI agents
In `@crates/codegen/src/symboltable.rs`:
- Around line 2638-2639: After trimming leading underscores from class_name (let
class_name = class_name.trim_start_matches('_');), add a check for an empty
result and skip mangling if empty: if class_name.is_empty() { return
original_name.to_string(); } so that classes made only of underscores (e.g.,
"__") do not trigger private-name mangling; update the code path that builds the
mangled identifier to return the unmodified identifier when class_name is empty.
🧹 Nitpick comments (1)
crates/vm/src/types/slot.rs (1)
536-545: Double MRO lookup for__iter__in the non-None path.When
__iter__exists and is notNone,cls.get_attr()(line 539) performs an MRO lookup, and thenvm.call_special_method()(line 544) performs a second MRO lookup for the same__iter__. This is functionally correct but redundant.Since this wrapper is only invoked for heap types with a Python-defined
__iter__, the overhead is likely negligible in practice. Consider caching the result if profiling ever shows this as a concern.♻️ Optional: reuse the looked-up attribute to avoid the second MRO walk
fn iter_wrapper(zelf: PyObjectRef, vm: &VirtualMachine) -> PyResult { // slot_tp_iter: if __iter__ is None, the type is explicitly not iterable let cls = zelf.class(); let iter_attr = cls.get_attr(identifier!(vm, __iter__)); match iter_attr { Some(attr) if vm.is_none(&attr) => { Err(vm.new_type_error(format!("'{}' object is not iterable", cls.name()))) } - _ => vm.call_special_method(&zelf, identifier!(vm, __iter__), ()), + Some(attr) => attr.call((zelf,), vm), + None => vm.call_special_method(&zelf, identifier!(vm, __iter__), ()), } }
| // Strip leading underscores from class name | ||
| let class_name = class_name.trim_start_matches('_'); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# First, check the actual code at lines 2638-2661 in symboltable.rs
sed -n '2630,2670p' crates/codegen/src/symboltable.rsRepository: RustPython/RustPython
Length of output: 1120
🏁 Script executed:
#!/bin/bash
# Search for mangle_name usage and test cases
rg -n "mangle_name|mangle" crates/codegen/src/symboltable.rs | head -30
# Check for test files related to name mangling
fd -e rs -e py | xargs grep -l "mangle" | head -10Repository: RustPython/RustPython
Length of output: 1805
🏁 Script executed:
#!/bin/bash
# Search for test files in the codegen directory
fd -e rs crates/codegen/src/ | xargs grep -l "test.*mangle\|mangle.*test" 2>/dev/null
# Search for any tests that use double-underscore names (which trigger mangling)
rg -n "__.*__" crates/codegen/src/ | grep -i "test\|def\|class" | head -20
# Check if there are any Python test files related to name mangling
fd test crates/codegen/ -type fRepository: RustPython/RustPython
Length of output: 2865
🏁 Script executed:
#!/bin/bash
# Search for Python test files that might test name mangling
find . -path ./target -prune -o -name "*.py" -type f -exec grep -l "^class _" {} \; | head -10
# Check if there are any tests for name mangling behavior
rg -i "name.*mangle\|mangle.*test" Lib/test/ | head -20
# Also check the Python standard library for all-underscore class names
rg "^class _[_]+[^_]" --type py | head -10Repository: RustPython/RustPython
Length of output: 5821
🏁 Script executed:
#!/bin/bash
# Check the context around the all-underscore class definitions
grep -B2 -A10 "^class __:" Lib/test/pyclbr_input.py
echo "---"
grep -B2 -A10 "^class ___:" Lib/test/pyclbr_input.py
echo "---"
grep -B2 -A10 "^class ___(object):" Lib/test/test_copyreg.pyRepository: RustPython/RustPython
Length of output: 772
🏁 Script executed:
#!/bin/bash
# Create a simple test script to verify CPython's behavior
python3 << 'EOF'
# Test CPython's name mangling with all-underscore class names
class __:
def __foo(self):
return "foo"
class ___:
def __bar(self):
return "bar"
# Check what the actual method names are after mangling
import inspect
# For class __
cls1 = __
methods1 = [m for m in dir(cls1) if not m.startswith('_') or m.startswith('__')]
print("Class __ methods:", [m for m in dir(cls1) if 'foo' in m])
# For class ___
cls2 = ___
print("Class ___ methods:", [m for m in dir(cls2) if 'bar' in m])
# The key question: are the names mangled or not?
# If they're mangled, they'd have the class name prefix
# If not mangled, they'd stay as __foo and __bar
EOFRepository: RustPython/RustPython
Length of output: 121
🏁 Script executed:
#!/bin/bash
# Search for tests that use all-underscore class names with double-underscore methods
rg -n "class __|class ___" extra_tests/ Lib/test/ --type py
# Check if pyclbr_input.py is actually tested
rg -n "pyclbr_input" . --type py | head -10
# Check if there are specific tests for name mangling behavior
rg -n "mangle|__.*__" extra_tests/ --type py | grep -i "class\|def" | head -20Repository: RustPython/RustPython
Length of output: 3837
Add empty-string check after stripping leading underscores from class name.
When class_name consists entirely of underscores (e.g., "__" or "___"), trim_start_matches('_') produces an empty string. CPython's mangling logic skips mangling in this case, but the current code proceeds to produce an incorrect mangled name.
For example, class __: with a method def __foo(self): should remain __foo (unmangled), not become ___foo.
Proposed fix
// Strip leading underscores from class name
let class_name = class_name.trim_start_matches('_');
+ if class_name.is_empty() {
+ return name.into();
+ }
let mut ret = String::with_capacity(1 + class_name.len() + name.len());🤖 Prompt for AI Agents
In `@crates/codegen/src/symboltable.rs` around lines 2638 - 2639, After trimming
leading underscores from class_name (let class_name =
class_name.trim_start_matches('_');), add a check for an empty result and skip
mangling if empty: if class_name.is_empty() { return original_name.to_string();
} so that classes made only of underscores (e.g., "__") do not trigger
private-name mangling; update the code path that builds the mangled identifier
to return the unmodified identifier when class_name is empty.
- TypeVar/ParamSpec repr: use infer_variance flag - ParamSpec: add type_check on bound argument - ParamSpecArgs/Kwargs: use equality instead of identity - NoDefault: change to IMMUTABLETYPE flag - subscript_generic: wrap TypeVarTuple in Unpack - symboltable: selective name mangling in type param scopes - symboltable: fix double default scanning for non-generic fns - Unmark 4 passing tests in test_type_params
📦 Library DependenciesThe following Lib/ modules were modified. Here are their dependencies: [ ] lib: cpython/Lib/concurrent dependencies:
dependent tests: (11 tests)
[x] lib: cpython/Lib/typing.py dependencies:
dependent tests: (14 tests)
Legend:
|
Summary by CodeRabbit
Refactor
Chores