3
atheist
175d

Abstract Syntax Trees are the bane of my existence.

Comments
  • 1
  • 2
    ..and yet, they're the core mechanic of nearly every compiler and interpreter.
  • 1
    I'm writing a docstring linter for Python, which requires parsing Python into an AST and extracting the docstrings. Those use reStructuredText markup, so I'm now trying to parse *that* into another AST. The tools for both are of mixed quality...
  • 0
    @djsumdog they work insanely well, but not the easiest thing to interact with...
  • 1
    Wtf with downvoting this? The guy is venting. Must be getting a lot of redditors in here.
  • 2
    Ah, the *python AST* is the bane of your existence. If it's any consolation, I'm pretty sure everyone who's parsing Python is suffering from it just as much. Utterly deranged grammar. There are natural languages that are easier to parse with a program than Python.
  • 0
    @lorentz I have to say, astroid (pylint's custom AST parser) is insanely good for my purpose. I gave up trying to use Python's built-in ast library, it's just a headache. I've got all the docstrings extracted and am now trying to parse reStructuredText. Which, honestly... is worse...

    There's the docstring_parser library, which is insanely noncompliant with the standard.

    There's docutils, which probably _is_ standards compliant but has zero type annotations and at best mediocre documentation for parsing.

    There's Sphinx, which basically defines the standard. But I looked at the code, and it's ugly as hell and overly complicated. The standard itself can fit on a page or two. So... I'm currently just writing a simple AST parser for it myself. Or at least one that parses the structural markup.

    This whole project started because I wanted something that would lint class attribute docstrings, the ones declared as string literals on the line after the attribute, and I couldn't find anything. "How hard could that be?!" Ahhh... So naive...
  • 0
    How are docstrings nodes? My lexer skips all that. Or maybe they filter it at the parser level. You don't want a comment going through your interpreter, I guess.
  • 0
    I would've used a regex to find function names and just copied the docstring from there. Maybe you could even capture it with a match group.
  • 1
    @retoor I lex comments, then in the parsing stage keep them if they're at item level and discard them if they're inside expressions. That way you can look up the docstrings associated with a constant or namespace, but they don't complicate the expression level.
  • 1
    @retoor in the built-in test runner, a test is a constant with the comment --[| test |]--
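
For anyone curious what the "attribute docstring" case from the thread looks like in practice, here is a minimal sketch using Python's standard `ast` module (not astroid, and not the poster's actual linter). The sample `Config` class and the helper name `attribute_docstrings` are made up for illustration. It assumes an attribute docstring is a bare string literal sitting directly after an assignment in a class body, and it makes no attempt to parse the reStructuredText inside.

```python
import ast
from typing import Dict, Optional

# Hypothetical sample input: two documented attributes and one undocumented.
SOURCE = '''
class Config:
    """Application settings."""

    timeout: int = 30
    """Seconds to wait before giving up."""

    retries = 3

    name = "svc"
    """Service name."""
'''


def attribute_docstrings(source: str) -> Dict[str, Dict[str, Optional[str]]]:
    """Map each class name to {attribute name: docstring, or None if missing}."""
    result: Dict[str, Dict[str, Optional[str]]] = {}
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ClassDef):
            continue
        attrs: Dict[str, Optional[str]] = {}
        body = node.body
        # Pair every statement with the one that follows it (None for the last).
        for stmt, following in zip(body, body[1:] + [None]):
            if isinstance(stmt, ast.Assign):
                names = [t.id for t in stmt.targets if isinstance(t, ast.Name)]
            elif isinstance(stmt, ast.AnnAssign) and isinstance(stmt.target, ast.Name):
                names = [stmt.target.id]
            else:
                continue  # not an attribute assignment (e.g. the class docstring)
            doc = None
            # A bare string expression right after the assignment is treated
            # as that attribute's docstring.
            if (
                isinstance(following, ast.Expr)
                and isinstance(following.value, ast.Constant)
                and isinstance(following.value.value, str)
            ):
                doc = following.value.value
            for name in names:
                attrs[name] = doc
        result[node.name] = attrs
    return result


if __name__ == "__main__":
    for cls, attrs in attribute_docstrings(SOURCE).items():
        for name, doc in attrs.items():
            print(f"{cls}.{name}: {'ok' if doc else 'missing docstring'}")
```

A linter built on this would flag every attribute whose value is `None`. The sketch also shows why a regex gets awkward here: the docstring belongs to whichever statement precedes it, which is easy to read off the tree and hard to match in raw text.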