Skip to content

Fix NodeId->HirId lookups by carrying structural context#1445

Merged
oinoom merged 1 commit into
masterfrom
test.fix.map
Nov 5, 2025
Merged

Fix NodeId->HirId lookups by carrying structural context#1445
oinoom merged 1 commit into
masterfrom
test.fix.map

Conversation

@oinoom
Copy link
Copy Markdown
Contributor

@oinoom oinoom commented Nov 3, 2025

c2rust-refactor would fail due to invalid AST node <-> HIR mappings. This was caused by insufficient disambiguation due to all c2rust-bitfield-derived expanded code sharing the same span with call_site. This PR addresses the aforemented issue by threading structural context through the AST and HIR walkers so NodeId-HirId lookups stay precise even when spans collide.

  • Added child_slot, NodeContextKey, and StructuralContext helpers to record hierarchical positions for AST nodes; mirrored the same traversal on the HIR side and stitched the maps together during HirMap construction
  • Collapsed wrapper nodes derived accessors and helper shims would erase array/tuple shells on the expression side, leaving Leaf nodes that conflicted with the HIR/type view where the same values remained wrapped (e.g. [u8; 3] -> Node([Leaf(u8)])). We now detect those single-child wrappers, peel just enough layers to expose the matching field list, and otherwise panic with a targeted message if the arity still disagrees.

I tried going down another path with more-specific spans, such as using quote_spanned!(field.field_name.span()), using fresh spans with Span::def_site().located_at(old_span), and using attribute metadata (with meta_name_value.lit.span(), but kept running into issues and so abandoned that path.

Comment thread c2rust-refactor/src/transform/literals.rs
Comment thread c2rust-refactor/src/context.rs Outdated
Copy link
Copy Markdown
Contributor

@fw-immunant fw-immunant left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall, couple comments inline.

- record child-slot paths, statement indices, and owner node
IDs in AST span maps via the new child_slot vocabulary and
NodeContextKey
- teach SpanToHirMapper/HirMap to mirror that
context, populate a context_to_hir_map, and fall back to it
before the span-only mapping

refactor: align struct field key trees with bitfield helpers
@oinoom oinoom merged commit 621fc16 into master Nov 5, 2025
5 checks passed
}

#[derive(Debug, Clone, Copy, PartialEq, Eq, Hash)]
pub enum SpanNodeKind {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this deprecated now? Could we replace it with your new context key?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably could, but the way I wrote this was to augment the usage of this so it may be a good chunk of work -- which I'm not opposed to but I don't think it'd be a simple substitution.

pub span_to_node_id_map: FxHashMap<NodeSpan, NodeId>,
/// Extended mapping: (NodeSpan, NodeContextKey) -> NodeId
/// This provides structural disambiguation for macro-generated nodes
pub context_to_node_id_map: FxHashMap<(NodeSpan, NodeContextKey), NodeId>,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could these have replaced the old maps altogether?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

span_to_node_id_map used by the HIR->AST mapping doesn't use the context key to disambiguate at all, and I haven't tested it with that disambiguation, so it's not clear that it can be replaced with it yet. We could at least see if the context key could replace the span key in the AST->HIR mapping

maps: AstSpanMaps,
/// Tracks block/owner/child-slot stacks while walking the AST
ctx: StructuralContext<NodeId>,
/// Depth of attribute visitation (>0 means we're inside an attribute)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are attributes nested? What does that look like in Rust code?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In libxml the most common case is cfg_attr:

#![cfg_attr(feature = "xchecks", plugin(c2rust_xcheck_plugin(config_file = "../xchecks/libxml2_rust.yaml",
                                                             djb2_names_file = "../xchecks/djb2_names_libxml2.yaml")))]

return Some(*hir_id);
}
}
return self.span_to_hir_map.get(node_span).copied();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a case where a span would be in the old map but not the new context one?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should always be at least one in the new context map, but there may be more since the span is ambiguous

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants