petition to trim all strings by default

Ranter

kiki

35506

Comments

5

Lensflare

17686

1d

what about whitespace devs?
7

tosensei

8492

1d

@Lensflare trim those as well
4

12bitfloat

9677

1d

Please no, that's some PHP type stuff

And we do not want PHP type stuff
4

lorentz

15299

22h

Absolutely not. No pre-trimming, no case-insensitive storage, no length limits, no grapheme normalization. Text is hard and we suck at it. By default, strings are an optimized sequence of 32-bit code points that you can split and compare, where inequality guarantees that two strings can't have been created the same way (NOT that they represent different text), and less-than is an arbitrary total ordering for algorithms (NOT alphabetical order).

For all other purposes you should use a Unicode grapheme library with a specific locale. Many programming languages also choose to provide broken operations that only work on English and partially on some Latin text, because they're made by Americans with deadlines.
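
To illustrate the ordering point, here is a minimal Python sketch. The word list is made up for illustration, and it assumes Python's built-in string comparison, which goes by code point value:

```python
# Hypothetical word list, chosen only for illustration.
words = ["apple", "Zebra", "\u00e9clair"]  # "\u00e9" is a precomposed é

# Python's default string comparison orders by code point value.
by_code_point = sorted(words)

# 'Z' (U+005A) < 'a' (U+0061) < 'é' (U+00E9), so "Zebra" sorts first
# and "éclair" sorts last. That is a valid total order for algorithms,
# but not what a dictionary would call alphabetical.
print(by_code_point)
```

Getting a human-friendly order out of this needs locale-aware collation, which is outside what the sketch shows.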
1

lorentz

15299

21h

Too late to edit, but even I was too permissive up there: splitting strings by code points is actually incorrect. For transmission and storage, count bytes; for display, count graphemes. I meant to talk about a "contains" check, but since equality is meaningless, so is "contains". So instead I'll point out that you can split graphemes and normalize strings without specifying a locale, and on this level both "starts with" and normalized equality are meaningful operations.
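
A rough Python sketch of the difference, using only the standard `unicodedata` module. The sample strings and the `nfc` helper are made up for illustration, and the standard library does not segment graphemes, so the prefix check here works on normalized code points only:

```python
import unicodedata

def nfc(s: str) -> str:
    """Hypothetical helper: normalize to NFC, no locale involved."""
    return unicodedata.normalize("NFC", s)

composed = "caf\u00e9"     # é as a single code point
decomposed = "cafe\u0301"  # e followed by a combining acute accent

# Same text on screen, different sizes on the wire and in memory.
composed_bytes = len(composed.encode("utf-8"))      # bytes for storage
decomposed_bytes = len(decomposed.encode("utf-8"))
composed_points = len(composed)                     # code points
decomposed_points = len(decomposed)

# Raw equality only says the strings were built differently.
raw_equal = composed == decomposed

# Normalized equality is the meaningful comparison.
normalized_equal = nfc(composed) == nfc(decomposed)

# "Starts with" after normalizing both sides.
has_prefix = nfc("cafe\u0301 au lait").startswith(nfc("caf\u00e9"))

print(composed_bytes, decomposed_bytes, composed_points, decomposed_points)
print(raw_equal, normalized_equal, has_prefix)
```

Counting graphemes for display would need a segmentation library on top of this; which one to use is left open here.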
0

jestdotty

5888

21h

@tosensei hey look an entitled karen
1

tosensei

8492

7h

@jestdotty you don't have to announce yourself, you know? you're not that significant.
