this post was submitted on 05 Jul 2024
401 points (95.9% liked)

Programmer Humor

Post funny things about programming here! (Or just rant about your favourite programming language.)

top 33 comments
[–] dactylotheca@suppo.fi 76 points 4 months ago (3 children)

Some people, when confronted with a problem, think "I know, I'll use regular expressions." Now they have two problems.

[–] qaz@lemmy.world 26 points 4 months ago (1 children)

Regex really isn't that bad when using named capture groups.

[–] dactylotheca@suppo.fi 17 points 4 months ago (3 children)

Oh yeah, they definitely have uses, but there's a real tendency for people to go a bit crazy with them. Complex regexen aren't exactly readable, there are all kinds of fun performance gotchas, and sometimes other tools or algorithms are more suitable for the task. Sometimes people also try to use them to e.g. parse HTML, because they don't know that it is literally impossible to use regular expressions to parse languages that aren't regular.
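
A minimal sketch of the "not regular" point, assuming a toy language of nested parentheses rather than real HTML: matching arbitrarily deep nesting needs a counter (or a stack), which a strictly regular expression does not have. The function name `isBalanced` is made up for illustration.

```javascript
// Hypothetical helper: checks that parentheses are balanced to any depth.
// A strictly regular expression cannot do this, because it has no memory
// for how many "(" are still open; a simple counter can.
function isBalanced(text) {
  let depth = 0;
  for (const ch of text) {
    if (ch === '(') depth += 1;
    if (ch === ')') depth -= 1;
    if (depth < 0) return false; // closed more than were opened
  }
  return depth === 0;
}

console.log(isBalanced('(a(b)c)')); // true
console.log(isBalanced('(a(b)c'));  // false
```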

[–] frezik@midwest.social 12 points 4 months ago (1 children)

It's entirely possible to parse HTML in PCRE. You shouldn't, but it is possible. PCRE's pattern language stopped being strictly regular a long time ago and is entirely capable of doing it.

https://stackoverflow.com/a/4234491/830741

[–] dactylotheca@suppo.fi 7 points 4 months ago* (last edited 4 months ago)

Oh yeah, extensions that make them non-regular can definitely make it possible, but just because it's now somewhat possible with some regex engines doesn't mean it's a good idea.

[–] FooBarrington@lemmy.world 5 points 4 months ago (1 children)

I once wrote a JS decompiler (de-bundler?) using ~150 regexes for step-wise transformations. Worked surprisingly well!

[–] Azzk1kr@feddit.nl 4 points 4 months ago (1 children)

What eldritch beast was summoned as a result?

[–] FooBarrington@lemmy.world 2 points 4 months ago

Well... No new ones, at least? Though it was around that time that I started hearing whispers in the night... "You can use WASM to ship Client-Side PHP"

[–] bleistift2@sopuli.xyz 3 points 4 months ago

"it is literally impossible to use regular expressions to parse languages that aren’t regular"

It’s impossible to parse the whole syntax tree, but that doesn’t mean you can’t get the subset you’re interested in.
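
A rough sketch of what "getting the subset you're interested in" can look like, here pulling link targets out of a small HTML string. The sample markup and the pattern are assumptions made up for this example; it is not a parser and will miss plenty of valid HTML.

```javascript
// Assumed sample input; real-world HTML is far messier than this.
const html = '<p><a href="https://example.com/a">A</a> and <a class="x" href="/b">B</a></p>';

// Pulls double-quoted href values out of <a> tags. This is a sketch, not a parser:
// it ignores single-quoted or unquoted attributes, comments, and scripts.
const hrefPattern = /<a\b[^>]*?\bhref="(?<url>[^"]*)"/g;

const urls = [...html.matchAll(hrefPattern)].map((m) => m.groups.url);
console.log(urls); // [ 'https://example.com/a', '/b' ]
```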

[–] Mbourgon@lemmy.world 3 points 4 months ago

Jwz’s 2nd law!

[–] MashedTech@lemmy.world 2 points 4 months ago

I learned regex once and now it just works. The only problem for me is using macOS, where the regex flavors aren't consistent. But once I sort that out, it's smooth sailing.

[–] AnarchistArtificer@slrpnk.net 47 points 4 months ago (1 children)

Regex feels distinctly eldritch to me. Like, a lot of computing knowledge feels like magic, but regex feels like the kind of magic you get by consorting with dark forces.

[–] TunaCowboy@lemmy.world 43 points 4 months ago (2 children)

"regex feels like the kind of magic you get by consorting with dark forces"

AKA reading the manual.

[–] Tangent5280@lemmy.world 5 points 4 months ago

I'm a good Christian boy, that's why I refuse to read the manual.

[–] Cysioland@lemmygrad.ml 3 points 4 months ago

Or studying computer science and learning about finite state machines

[–] whodatdair@lemmy.blahaj.zone 45 points 4 months ago (3 children)

Blasphemy, that’s not regex that’s just fancy grep

[–] VegOwOtenks@lemmy.world 10 points 4 months ago* (last edited 4 months ago) (1 children)

I don't actually know whether POSIX grep would support named groups :o

[–] qaz@lemmy.world 1 points 4 months ago (1 children)

Don't you have to use the -P flag?

[–] OmnislashIsACloudApp@lemmy.world 2 points 4 months ago (1 children)
[–] qaz@lemmy.world 2 points 4 months ago

Yes, but Perl mode has more features.

[–] kubica@fedia.io 3 points 4 months ago

I don't fully disagree but you are walking on a fine line...

[–] PotatoesFall@discuss.tchncs.de 2 points 4 months ago

any idea what the re in grep stands for?

[–] yogthos@lemmy.ml 20 points 4 months ago (1 children)

I really like this approach for doing non-trivial regex: https://github.com/VerbalExpressions

const tester = VerEx()
    .startOfLine()
    .then('http')
    .maybe('s')
    .then('://')
    .maybe('www.')
    .anythingBut(' ')
    .endOfLine();
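
For comparison, a hand-written regex that aims to match the same URLs as the chain above. It is an approximation written for this comparison, not the library's generated output, and the sample strings are made up.

```javascript
// Hand-written approximation of the VerEx chain above:
// start of line, "http", optional "s", "://", optional "www.",
// then anything except a space, up to the end of the line.
const urlPattern = /^https?:\/\/(?:www\.)?[^ ]*$/;

console.log(urlPattern.test('https://www.google.com'));  // true
console.log(urlPattern.test('http://example.org/path')); // true
console.log(urlPattern.test('https://has a space.com')); // false
console.log(urlPattern.test('ftp://example.org'));       // false
```
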
[–] frezik@midwest.social 2 points 4 months ago* (last edited 4 months ago) (1 children)

I don't. It may look less like line noise, but it doesn't unravel the underlying complexity of what it does. It's just wordier without being helpful.

https://www.wumpus-cave.net/post/2022/06/2022-06-06-how-to-write-regexes-that-are-almost-readable/index.html

Edit: also, these alternative syntaxes tend to make some easy cases easy, but they have no idea what to do with more complicated cases. Try making nested capture groups with these, for instance. It gets messy fast.
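
One way to keep nested capture groups manageable without a wrapper library, sketched in JavaScript: build the pattern from small named pieces. The piece names and the date format are assumptions chosen for illustration.

```javascript
// Each piece is a small, separately readable regex. The names are illustrative.
const year = /(?<year>\d{4})/;
const month = /(?<month>\d{2})/;
const day = /(?<day>\d{2})/;

// A nested capture: the outer "date" group wraps the three inner groups.
const isoDate = new RegExp(
  `^(?<date>${year.source}-${month.source}-${day.source})$`
);

const match = isoDate.exec('2024-07-05');
console.log(match.groups.date);  // 2024-07-05
console.log(match.groups.year);  // 2024
console.log(match.groups.month); // 07
console.log(match.groups.day);   // 05
```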

[–] JoeyJoeJoeJr@lemmy.ml 7 points 4 months ago (1 children)

"it doesn't unravel the underlying complexity of what it does... these alternative syntaxes tend to make some easy cases easy, but they have no idea what to do with more complicated cases"

This can be said of any higher-level language or API. There is always a cost to abstraction. Binary -> Assembly -> C -> Python. As you go up that chain, many things get easier, but some things become impossible. You always have the option to drop down, though, and these regex tools are no different. Software development, sysops, devops, etc. are full of compromises like this.

[–] yogthos@lemmy.ml 2 points 4 months ago

Exactly, at the end of the day it's about using the right tool for the job. Code that's clear and declarative is easier to maintain, so it makes sense to default to it, but nothing stops you from using low level constructs if you really need to.

[–] itsathursday@lemmy.world 16 points 4 months ago

Named groups are nice, but can I please define a group more than once? Maybe I want to group my data and consolidate values in a logical way, without you complaining that I have already used that group name. I know I did, I’m the one telling you. Now capture it twice!
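
A possible workaround where the engine rejects a repeated name, sketched in JavaScript: give each alternative its own group name and consolidate afterwards. The names `isoYear`, `usYear`, and `extractYear` are hypothetical, as are the two date layouts.

```javascript
// Two date layouts; the year should land in one place whichever matched.
// Distinct names (isoYear, usYear) are used because reusing one name is
// what many engines reject.
const datePattern = /^(?:(?<isoYear>\d{4})-\d{2}-\d{2}|\d{2}\/\d{2}\/(?<usYear>\d{4}))$/;

function extractYear(text) {
  const match = datePattern.exec(text);
  if (match === null) return null;
  // Unmatched named groups are undefined, so ?? picks whichever one matched.
  return match.groups.isoYear ?? match.groups.usYear;
}

console.log(extractYear('2024-07-05')); // 2024
console.log(extractYear('07/05/2024')); // 2024
console.log(extractYear('not a date')); // null
```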

[–] jaybone@lemmy.world 8 points 4 months ago (2 children)

Can you actually name capture groups, or does this mean referring to them by number?

[–] VegOwOtenks@lemmy.world 25 points 4 months ago (1 children)

You can use backreferences (\1, \2, etc.), but you can also give the groups names explicitly.
It looks like this: (?<name>inner-regex)
Some flavors support it; Kotlin's apparently doesn't.
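
A short sketch of both forms in JavaScript, which is one of the flavors that supports the (?<name>...) syntax. The patterns and sample strings are made up for illustration.

```javascript
// Numbered backreference: \1 must repeat whatever group 1 captured.
const doubledWord = /\b(\w+) \1\b/;
console.log(doubledWord.test('it is is fine')); // true

// Named groups with the (?<name>...) syntax; \k<name> is the named backreference.
const quoted = /(?<quote>["'])(?<body>.*?)\k<quote>/;
const match = quoted.exec(`say 'hello' now`);
console.log(match.groups.body);  // hello
console.log(match.groups.quote); // '
```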

[–] jaybone@lemmy.world 2 points 4 months ago

TIL thanks!

[–] mormund@feddit.org 1 points 4 months ago (1 children)

In modern languages you can name them with labels as well, yes. Not sure about the syntax right now; something like (?label:...), I think.

[–] qaz@lemmy.world 4 points 4 months ago

It's (?<NAME>...) and those are the named capture groups referred to in the post.

[–] neidu2@feddit.nl 7 points 4 months ago

I don't see the problem. But that's probably because my go-to language is Perl.