this post was submitted on 09 Mar 2024

This is an EBNF grammar for ANSI C (C99), and it contains almost every rule. It may be missing some things; please tell me if you notice anything missing.

I am writing a C compiler with my own backend and, hopefully, my own frontend in OCaml, which is why I wrote this grammar. I have also written an AWK grammar, but it's not uploaded anywhere. Tell me if you want it.

Thanks.

[email protected] 1 point 8 months ago (1 child)

"I am currently writing a C compiler, with my own backend (and hopefully, frontend) in OCaml."

But why write your own C frontend? It's much more of a pain than people imagine. I maintain a C frontend implemented in OCaml (the project itself goes back 25 years) and it's still not on par with GCC or Clang.

For any other language, sure, but C has so many "wonderful" features, starting with the lexer hack. Your grammar conveniently overlooks this issue but it's something you'll have to deal with to actually implement it. So it simply won't be as nice as theory suggests.
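For anyone who hasn't run into the lexer hack: a statement like `T * x;` declares a pointer `x` if `T` is a typedef name, and is a multiplication expression otherwise, so the lexer needs feedback from the parser's symbol table to pick the token kind. Here is a minimal sketch of that idea in Python (not OCaml, just to keep it short). The names `tokenize`, `typedef_names`, and the token kinds are made up for illustration, and a real frontend would also have to handle scoping and shadowing of typedef names.

```python
import re

# Identifiers, integer literals, or any single non-space character.
# finditer() skips the whitespace that none of the alternatives match.
TOKEN_RE = re.compile(r"([A-Za-z_]\w*)|(\d+)|(\S)")

# Tiny keyword subset, enough for the demonstration (an assumption, not C99's full list).
KEYWORDS = {"typedef", "int", "char", "void"}

def tokenize(source, typedef_names):
    """Classify tokens, consulting the set of typedef names seen so far.

    `typedef_names` is the feedback channel from parser to lexer: in a real
    compiler the parser would add names to it while processing typedef
    declarations.
    """
    tokens = []
    for match in TOKEN_RE.finditer(source):
        ident, number, punct = match.groups()
        if ident is not None:
            if ident in KEYWORDS:
                tokens.append(("KEYWORD", ident))
            elif ident in typedef_names:
                tokens.append(("TYPE_NAME", ident))
            else:
                tokens.append(("IDENTIFIER", ident))
        elif number is not None:
            tokens.append(("NUMBER", number))
        else:
            tokens.append(("PUNCT", punct))
    return tokens

statement = "T * x;"
print(tokenize(statement, set()))   # T is an ordinary identifier: reads as a multiplication
print(tokenize(statement, {"T"}))   # T is a type name: reads as a pointer declaration
```

The same character sequence produces two different token streams, which is exactly what a pure EBNF grammar cannot express on its own.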

[email protected] 1 point 8 months ago

You're right, yeah. Hand-implementing lexers and parsers is kind of 'inane'. I'm not saying it's stupid; for a small grammar it makes sense. But for a big grammar, just use a PEG generator or Yacc/Lex. Rust has LALRPOP and Java has ANTLR. There's truly no need to implement a parser from scratch, yet people on the internet really seem to think that using lexer and parser generators 'limits' them. There are some hacks involved in most Lex/Yacc or PEG specs, but in the end people should keep in mind that LR parsers MUST be generated!
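To make the "small grammar" case concrete, here is a rough sketch of a hand-written recursive-descent parser, with one function per rule. The three-rule grammar and the name `parse_expression` are assumptions picked for illustration; nothing here comes from the C grammar in the post.

```python
import re

# Grammar (EBNF), invented for this example:
#   expr   = term   { ("+" | "-") term } ;
#   term   = factor { ("*" | "/") factor } ;
#   factor = NUMBER | "(" expr ")" ;

def parse_expression(text):
    """Parse and evaluate `text`, using one nested function per grammar rule."""
    # Note: findall silently drops characters that are not part of any token.
    tokens = re.findall(r"\d+|[-+*/()]", text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = peek()
        pos += 1
        return token

    def expr():
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():
        value = factor()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= factor()
            else:
                value //= factor()  # integer division keeps the sketch simple
        return value

    def factor():
        token = advance()
        if token == "(":
            value = expr()
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token: {token!r}")

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"trailing input: {peek()!r}")
    return result

print(parse_expression("(1 + 2) * 3"))
```

This stays readable at three rules. At the size of the C grammar, every rule becomes another function to keep in sync by hand, which is where a generator starts to pay off.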

Maybe implement the scanner? Even that is kinda stupid. Unless you do what Rob Pike says: https://www.youtube.com/watch?v=HxaD_trXwRE
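Assuming the linked video is Rob Pike's "Lexical Scanning in Go" talk, the idea there is a scanner built from state functions: each state is a function that consumes some input and returns the next state function. Below is a loose Python adaptation of that pattern, not Pike's code. The talk sends tokens over a Go channel; this sketch just appends them to a list, and all names (`Lexer`, `lex_start`, `scan`, the token kinds) are invented here.

```python
class Lexer:
    """Holds the input and scan positions shared by all state functions."""

    def __init__(self, text):
        self.text = text
        self.start = 0    # start of the token currently being scanned
        self.pos = 0      # current read position
        self.tokens = []

    def emit(self, kind):
        """Record the text between start and pos as a token of the given kind."""
        self.tokens.append((kind, self.text[self.start:self.pos]))
        self.start = self.pos

    def peek(self):
        """Return the next character, or "" at end of input."""
        return self.text[self.pos] if self.pos < len(self.text) else ""

def lex_start(lx):
    ch = lx.peek()
    if ch == "":
        return None                 # end of input: stop the machine
    if ch.isspace():
        lx.pos += 1
        lx.start = lx.pos           # skip whitespace
        return lex_start
    if ch.isdigit():
        return lex_number
    if ch.isalpha() or ch == "_":
        return lex_identifier
    lx.pos += 1
    lx.emit("PUNCT")                # anything else is one punctuation character
    return lex_start

def lex_number(lx):
    while lx.peek().isdigit():
        lx.pos += 1
    lx.emit("NUMBER")
    return lex_start

def lex_identifier(lx):
    while lx.peek().isalnum() or lx.peek() == "_":
        lx.pos += 1
    lx.emit("IDENTIFIER")
    return lex_start

def scan(text):
    """Run the state machine until a state function returns None."""
    lx = Lexer(text)
    state = lex_start
    while state is not None:
        state = state(lx)
    return lx.tokens

print(scan("x1 = 42;"))
```

The driver loop never changes; adding a new token kind means adding one state function and one branch in `lex_start`.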