this post was submitted on 22 Sep 2023
Lemmy Support
Support / questions about Lemmy.
These are programming details that the user should not have to think about.
OP noted: "Don't get me started on how this messes up linux commands and scripts"
If you're running linux commands and scripts, you're not a normal user and should know this already. :) It's only been the standard for 30 years or so.
"Standard" is that the character you input stays the same character and isn't transformed. The fact that it's being shown as &amp; means that either OP is using a client which isn't correctly decoding HTML, or it's being double sanitised as &amp;amp;. Neither of those is "standard", it's a bug.
Not in HTML. Never has worked that way.
Special reserved characters have always been handled this way because you don't want to accidentally interpret something the wrong way.
Same for URL encoding. You upload "Clever Name.PDF" to a website and it generates a URL of "Clever%20Name.PDF" because spaces aren't valid in URLs. %20 is the code for a space.
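To make that concrete, here is a minimal sketch using Python's standard `urllib.parse` module. It only illustrates percent-encoding in general and is not the code any particular website runs; the filename is the one from the example above.

```python
from urllib.parse import quote, unquote

filename = "Clever Name.PDF"

# Percent-encode characters that are not allowed in a URL path.
encoded = quote(filename)

# Decoding reverses the step and gives back the original characters.
decoded = unquote(encoded)

print(encoded)  # Clever%20Name.PDF
print(decoded)  # Clever Name.PDF
```

The encoded form is what travels in the URL; the decoded form is what a person should see.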
But when you see it on your screen it's supposed to be converted back into the actual character, otherwise it would be pointless. So if you're seeing &amp; then the website is messing up.
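A rough sketch of how that can go wrong, using Python's standard `html` module. This is an assumption about the general failure mode (escaping twice, decoding once), not Lemmy's actual code, and the sample command is made up for illustration.

```python
import html

typed = "cat a.txt && echo done"  # hypothetical text a user typed

# Correct flow: escape once for storage/transport, unescape once for display.
stored = html.escape(typed)
shown_ok = html.unescape(stored)

# Buggy flow: the text is escaped a second time somewhere along the way,
# so a single unescape on display leaves entity codes visible.
double = html.escape(stored)
shown_bug = html.unescape(double)

print(shown_ok)   # cat a.txt && echo done
print(shown_bug)  # cat a.txt &amp;&amp; echo done
```

The same visible result also appears when the text is escaped once and the client never decodes it at all.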
What in the gosh darn condescending non sequitur is that? I have a special kind of dislike for people who, rather than promoting learning for anyone and everyone at any stage, choose to ridicule people for having missed some trivial detail that has about as much in common with Bash as COBOL does (basically nothing). Web scripting is, unsurprisingly, its own skill, and it's very, surpassingly, extremely, stupendously, and obviously conceivable that someone could have years of Bash experience but only recently started puttering around with scripting for things like API access or HTML parsing. But you should know this already. :)
Text encoding is SUPER basic and anyone looking to get involved in Linux or scripting absolutely should know that stuff FIRST.
Source: I was teaching Linux 23 years ago before it was cool.
Here's a good primer:
ASCII:
https://www.techtarget.com/whatis/definition/ASCII-American-Standard-Code-for-Information-Interchange
URL encoding:
https://www.w3schools.com/tags/ref_urlencode.ASP
Entity encoding:
https://www.w3schools.com/html/html_entities.asp
Really, ALL of the W3Schools stuff is just fantastic. Anyone remotely interested in this stuff should start at the beginning there and work up.
I struggled hard to just post a link the other day because it has an ampersand in it, and it was being replaced.
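For anyone wondering why that breaks a link, here is a small sketch in Python. The URL and the `query_keys` helper are made up for illustration; the point is only that an `&` turned into `&amp;` changes what the query string means.

```python
import html
from urllib.parse import urlsplit

link = "https://example.com/search?q=lemmy&page=2"  # hypothetical link
mangled = html.escape(link)  # what the link looks like after entity escaping


def query_keys(url):
    """Return the parameter names found in the URL's query string."""
    query = urlsplit(url).query
    return [pair.split("=", 1)[0] for pair in query.split("&")]


print(mangled)              # https://example.com/search?q=lemmy&amp;page=2
print(query_keys(link))     # ['q', 'page']
print(query_keys(mangled))  # ['q', 'amp;page']
```

The server on the other end is asked for a parameter called `amp;page` instead of `page`, which is why the pasted link stops working unless the text is decoded before it is used.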