How do we compare?...
https://github.com/dyu/ffi-overhead#results-500m-calls

I found this interesting (even though it's beyond my ken...)
"For those wondering why luajit is faster than C/C++/Rust, that's probably because it does direct calls to the function, while C/C++/Rust go through the PLT (procedure linkage table)."

cheers -ben