A MoonBit parsing toolkit for two user workflows: composing parsers with nom-like combinators, or turning .g4 grammars into generated parseable packages.
- Rust nom-shaped combinators: `tag`, `take_while`, `many0`, `many1`, `alt`, `pair`, `preceded`, `terminated`, `delimited`, `opt`, `verify`, `map`, `map_res`, `separated_list0`/`separated_list1`, and more
- ANTLR CLI workflow: run `parkit generate antlr ./hello.g4`, import the generated package, and call `@hello.parse(...)`
- Evidence-driven compatibility: repository-managed fixtures, snapshots, suite catalogs, and CI-backed contract tests
- Cross-language validation: Rust nom reference tests and benchmarks keep the combinator surface honest
- MoonBit-native workflow: `make ci` drives format, static checks, and tests across MoonBit and Rust reference code
The original parkit surface: a nom-like combinator toolkit for directly composing parsers in MoonBit.
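For readers new to the nom style, the core idea is that a parser is a function from input to either `(remaining input, output)` or an error, and combinators are functions that build such parsers. The following is a minimal std-only Rust sketch of that idea. It is neither parkit's implementation nor nom itself; the `PResult` alias and the `String` error type are assumptions made for illustration, and only the names `tag`, `digit1`, and `many0` are taken from this README.

```rust
// Hypothetical sketch of nom-style parsers; not parkit's or nom's actual code.
// A parser returns Ok((remaining_input, output)) or Err(message).
type PResult<'a, O> = Result<(&'a str, O), String>;

/// Match an exact literal at the start of the input.
fn tag<'a>(literal: &'static str) -> impl Fn(&'a str) -> PResult<'a, &'a str> {
    move |input: &'a str| match input.strip_prefix(literal) {
        Some(rest) => Ok((rest, &input[..literal.len()])),
        None => Err(format!("expected {:?}", literal)),
    }
}

/// Match one or more ASCII decimal digits.
fn digit1(input: &str) -> PResult<'_, &str> {
    let end = input
        .find(|c: char| !c.is_ascii_digit())
        .unwrap_or(input.len());
    if end == 0 {
        Err("expected at least one digit".to_string())
    } else {
        Ok((&input[end..], &input[..end]))
    }
}

/// Run `parser` zero or more times, collecting the outputs.
fn many0<'a, O>(
    parser: impl Fn(&'a str) -> PResult<'a, O>,
) -> impl Fn(&'a str) -> PResult<'a, Vec<O>> {
    move |mut input: &'a str| {
        let mut items = Vec::new();
        while let Ok((rest, item)) = parser(input) {
            // Stop if the inner parser made no progress, to avoid looping forever.
            if rest.len() == input.len() {
                break;
            }
            input = rest;
            items.push(item);
        }
        Ok((input, items))
    }
}
```

Under these definitions, `tag("hello")("hello world")` evaluates to `Ok((" world", "hello"))` and `many0(digit1)("123 456")` to `Ok((" 456", vec!["123"]))`, the same remaining-input-first shape used in the MoonBit examples in this README.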
The user-facing flow is straightforward: write a .g4, run parkit generate antlr, import the generated package, and call parse in normal MoonBit code.
See the package guide: src/antlr/READMD.md
src/
nom/ — the parkit parser combinator library (cogna-dev/parkit/nom)
antlr/ — frontend, runtime, CST, typed CST, fixtures, and specs
examples/
antlr/ — generated-package ANTLR sample built with `parkit generate antlr --out`
json/ — JSON parser sample built with parkit (not published to mooncakes)
benchmark/ — MoonBit benchmarks (not published to mooncakes)
reference/
nom-json/ — Rust nom reference JSON parser (cross-language validation + benchmarks)
// Tag: match an exact string
let p = @nom.tag("hello")
assert_eq!(p.parse("hello world"), Ok((" world", "hello")))
// many0: repeat zero or more times
let nums = @nom.many0(@nom.digit1())
assert_eq!(nums.parse("123 456"), Ok((" 456", ["123"])))
// JSON parsing
let json = @json.parse("{\"key\": 42}")
// => Ok(Object([("key", Number(42.0))]))

The shortest path on the ANTLR side is:
- Write a normal `.g4` grammar.
- Run `parkit generate antlr ./hello.g4 --out ./src/hello`.
- Import the generated package with an alias such as `@hello`.
- Call `@hello.parse(...)` and use typed fields such as `fields.id_token.lexeme`.
The full end-to-end guide lives in src/antlr/READMD.md. A checked-in sample project lives in examples/antlr/app/main.mbt and its grammar source in examples/antlr/grammar/hello.g4. The deeper design and contract documents stay indexed in src/antlr/spec/README.md.
moon test --manifest-path moon.work

make format # rewrite MoonBit and Rust formatting in place
make format-check # verify formatting without changing files
make lint # MoonBit static checks plus Rust compile checks
make test # MoonBit workspace tests plus Rust reference tests
make ci # format-check, lint, then test

# MoonBit benchmarks (native target)
moon bench --manifest-path moon.work --target native
# Rust criterion benchmarks
cargo bench --manifest-path reference/nom-json/Cargo.toml

Benchmarks run on ubuntu-latest. MoonBit targets the native backend; Rust uses criterion.rs with a native release build. Both use the same 3-object JSON array input.
| Benchmark | MoonBit (native) | Rust nom (native) |
|---|---|---|
| `json_parse` (3-object array) | 1.17 ms | 5.53 µs |
Note: The roughly 200× gap (1.17 ms vs. 5.53 µs) reflects the overhead of MoonBit's runtime and memory model compared to Rust's zero-cost abstractions, not algorithmic differences. Both parsers implement the same combinator logic and produce identical results.
MoonBit native benchmark details:
| Benchmark | Mean | σ | Range |
|---|---|---|---|
| `json_parse` (3-object array) | 1.17 ms | ±6.29 µs | 1.16 ms … 1.19 ms |
| `tag` | 0.21 µs | ±0.00 µs | 0.21 µs … 0.21 µs |
| `take_while1_digits` | 0.43 µs | ±0.00 µs | 0.43 µs … 0.43 µs |
| `separated_list0` | 2.49 µs | ±0.02 µs | 2.47 µs … 2.54 µs |
reference/nom-json/ contains an equivalent JSON parser built with Rust
nom 8. The test suite in
reference/nom-json/src/lib.rs mirrors every case in
examples/json/json_test.mbt, ensuring both implementations produce the same
result for every input.
cargo test --manifest-path reference/nom-json/Cargo.toml

| Combinator | Description |
|---|---|
| `tag(s)` | Match exact literal string |
| `satisfy(pred)` | Match single char satisfying predicate |
| `char_(c)` | Match exact character |
| `take_while(pred)` | Take chars while predicate holds (0+) |
| `take_while1(pred)` | Take chars while predicate holds (1+) |
| `take(n)` | Take exactly n characters |
| `take_while_m_n(m,n,pred)` | Take m..n chars while predicate holds |
| `many0(p)` | Run parser 0 or more times |
| `many1(p)` | Run parser 1 or more times |
| `alt(parsers)` | Try each parser; return first success |
| `pair(a, b)` | Run two parsers, return tuple |
| `tuple3(a, b, c)` | Run three parsers, return tuple |
| `preceded(skip, keep)` | Skip prefix, keep result |
| `terminated(keep, skip)` | Keep result, skip suffix |
| `delimited(open, p, close)` | Keep middle between delimiters |
| `separated_list0(sep, p)` | 0+ items separated by sep |
| `separated_list1(sep, p)` | 1+ items separated by sep |
| `opt(p)` | Optional: Some(v) or None |
| `map(p, f)` | Transform output |
| `map_res(p, f)` | Fallibly transform output |
| `verify(p, pred)` | Check output satisfies predicate |
| `ws(p)` | Skip leading whitespace, then run p |
| `context(msg, p)` | Add context to error messages |
| `eof()` | Match end of input |
| `success(v)` | Always succeed with value |
| `fail(msg)` | Always fail |
| `double()` | Parse floating-point number |
| `int()` | Parse signed integer |
| `uint()` | Parse unsigned integer |
| `digit0/1()` | Parse decimal digits |
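
Several of the combinators in the table are meant to be nested: `delimited` around `separated_list0` around a `map_res`-converted token is a typical shape for bracketed lists. The sketch below shows that composition in std-only Rust, using the argument orders from the table. It is an illustration under assumptions, not parkit's code: `PResult`, the `String` error type, and the `int_list` helper are hypothetical, and real parkit or nom signatures may differ.

```rust
// Hypothetical std-only sketch of combinator composition; not parkit's or nom's code.
type PResult<'a, O> = Result<(&'a str, O), String>;

/// Match one exact character.
fn char_<'a>(expected: char) -> impl Fn(&'a str) -> PResult<'a, char> {
    move |input: &'a str| match input.chars().next() {
        Some(c) if c == expected => Ok((&input[c.len_utf8()..], c)),
        _ => Err(format!("expected {:?}", expected)),
    }
}

/// Take one or more characters while `pred` holds.
fn take_while1<'a>(pred: impl Fn(char) -> bool) -> impl Fn(&'a str) -> PResult<'a, &'a str> {
    move |input: &'a str| {
        let end = input.find(|c: char| !pred(c)).unwrap_or(input.len());
        if end == 0 {
            Err("take_while1: no matching characters".to_string())
        } else {
            Ok((&input[end..], &input[..end]))
        }
    }
}

/// Run `parser`, then apply a fallible conversion to its output.
fn map_res<'a, A, B, E: std::fmt::Display>(
    parser: impl Fn(&'a str) -> PResult<'a, A>,
    f: impl Fn(A) -> Result<B, E>,
) -> impl Fn(&'a str) -> PResult<'a, B> {
    move |input: &'a str| {
        let (rest, value) = parser(input)?;
        f(value).map(|out| (rest, out)).map_err(|e| e.to_string())
    }
}

/// Zero or more `item`s separated by `sep`; a dangling separator is not consumed.
fn separated_list0<'a, S, O>(
    sep: impl Fn(&'a str) -> PResult<'a, S>,
    item: impl Fn(&'a str) -> PResult<'a, O>,
) -> impl Fn(&'a str) -> PResult<'a, Vec<O>> {
    move |input: &'a str| {
        let mut items = Vec::new();
        let (mut rest, first) = match item(input) {
            Ok(ok) => ok,
            Err(_) => return Ok((input, items)), // an empty list is still a success
        };
        items.push(first);
        while let Ok((after_sep, _)) = sep(rest) {
            match item(after_sep) {
                Ok((after_item, value)) => {
                    items.push(value);
                    rest = after_item;
                }
                Err(_) => break, // leave the separator unconsumed
            }
        }
        Ok((rest, items))
    }
}

/// Keep the middle result between an opening and a closing parser.
fn delimited<'a, A, O, B>(
    open: impl Fn(&'a str) -> PResult<'a, A>,
    inner: impl Fn(&'a str) -> PResult<'a, O>,
    close: impl Fn(&'a str) -> PResult<'a, B>,
) -> impl Fn(&'a str) -> PResult<'a, O> {
    move |input: &'a str| {
        let (rest, _) = open(input)?;
        let (rest, value) = inner(rest)?;
        let (rest, _) = close(rest)?;
        Ok((rest, value))
    }
}

/// Hypothetical helper: parse a list such as "[1,22,333]" into numbers.
fn int_list<'a>() -> impl Fn(&'a str) -> PResult<'a, Vec<u32>> {
    let number = map_res(take_while1(|c| c.is_ascii_digit()), |digits: &str| {
        digits.parse::<u32>()
    });
    delimited(char_('['), separated_list0(char_(','), number), char_(']'))
}
```

In this sketch `int_list()("[1,22,333] tail")` yields `Ok((" tail", vec![1, 22, 333]))`, while `"[1,]"` is rejected because the dangling comma is left for the closing `]` parser to trip over. Whether parkit treats trailing separators the same way is not stated in this README.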
- ECMA-404 JSON specification
- nom — Rust parser combinator library
- criterion.rs — Rust benchmarking library
- MoonBit benchmarks documentation
Apache-2.0