Fix NodeId->HirId lookups by carrying structural context#1445
Conversation
fw-immunant
left a comment
There was a problem hiding this comment.
LGTM overall, couple comments inline.
- record child-slot paths, statement indices, and owner node IDs in AST span maps via the new child_slot vocabulary and NodeContextKey - teach SpanToHirMapper/HirMap to mirror that context, populate a context_to_hir_map, and fall back to it before the span-only mapping refactor: align struct field key trees with bitfield helpers
| } | ||
|
|
||
| #[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)] | ||
| pub enum SpanNodeKind { |
There was a problem hiding this comment.
Is this deprecated now? Could we replace it with your new context key?
There was a problem hiding this comment.
We probably could, but the way I wrote this was to augment the usage of this so it may be a good chunk of work -- which I'm not opposed to but I don't think it'd be a simple substitution.
| pub span_to_node_id_map: FxHashMap<NodeSpan, NodeId>, | ||
| /// Extended mapping: (NodeSpan, NodeContextKey) -> NodeId | ||
| /// This provides structural disambiguation for macro-generated nodes | ||
| pub context_to_node_id_map: FxHashMap<(NodeSpan, NodeContextKey), NodeId>, |
There was a problem hiding this comment.
Could these have replaced the old maps altogether?
There was a problem hiding this comment.
span_to_node_id_map used by the HIR->AST mapping doesn't use the context key to disambiguate at all, and I haven't tested it with that disambiguation, so it's not clear that it can be replaced with it yet. We could at least see if the context key could replace the span key in the AST->HIR mapping
| maps: AstSpanMaps, | ||
| /// Tracks block/owner/child-slot stacks while walking the AST | ||
| ctx: StructuralContext<NodeId>, | ||
| /// Depth of attribute visitation (>0 means we're inside an attribute) |
There was a problem hiding this comment.
Are attributes nested? What does that look like in Rust code?
There was a problem hiding this comment.
In libxml the most common case is cfg_attr:
#![cfg_attr(feature = "xchecks", plugin(c2rust_xcheck_plugin(config_file = "../xchecks/libxml2_rust.yaml",
djb2_names_file = "../xchecks/djb2_names_libxml2.yaml")))]
| return Some(*hir_id); | ||
| } | ||
| } | ||
| return self.span_to_hir_map.get(node_span).copied(); |
There was a problem hiding this comment.
Is there a case where a span would be in the old map but not the new context one?
There was a problem hiding this comment.
There should always be at least one in the new context map, but there may be more since the span is ambiguous
c2rust-refactor would fail due to invalid AST node <-> HIR mappings. This was caused by insufficient disambiguation due to all c2rust-bitfield-derived expanded code sharing the same span with call_site. This PR addresses the aforemented issue by threading structural context through the AST and HIR walkers so NodeId-HirId lookups stay precise even when spans collide.
child_slot,NodeContextKey, andStructuralContexthelpers to record hierarchical positions for AST nodes; mirrored the same traversal on the HIR side and stitched the maps together duringHirMapconstructionLeafnodes that conflicted with the HIR/type view where the same values remained wrapped (e.g.[u8; 3]->Node([Leaf(u8)])). We now detect those single-child wrappers, peel just enough layers to expose the matching field list, and otherwise panic with a targeted message if the arity still disagrees.I tried going down another path with more-specific spans, such as using quote_spanned!(field.field_name.span()), using fresh spans with Span::def_site().located_at(old_span), and using attribute metadata (with meta_name_value.lit.span(), but kept running into issues and so abandoned that path.