Why Git is hard : programming

[–] [email protected] 59 points 1 year ago* (last edited 1 year ago) (5 children)

I disagree, hard.

I disagree with the general conclusion - I think it's very easy to understand*: each repo has a graph of commits. Each commit includes the diff and metadata (like parent commits). There is a difference between your repo seeing the state of another repo (fetch) and copying commits from another repo into your repo (merge; pull is just a combination of fetch and merge). Tags are pointers to specific commits, branches are pointers to specific commits that get updated when you add a child commit to this commit. That's a rather small set of very clear concepts for such a complex problem.

I also disagree with a lot of the reasoning. Like "If a commit has the same content but a different parent, it’s NOT the same commit" is not an "alien concept". When I apply the same change to different parents, I end up with different versions. Which would be kinda bad for a Version Control System.
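
To make that concrete, here's a toy Python sketch. It is not Git's real object format: `commit_id` and the placeholder parent IDs are made up for illustration. The only point is that if the ID is a hash over the snapshot, the parents and the message, then the same content on a different parent has to get a different ID.

```python
import hashlib


def commit_id(tree, parents, message):
    """Toy commit ID (assumption: a simplified stand-in for Git's real hashing)."""
    lines = [f"tree {tree}"] + [f"parent {p}" for p in parents] + ["", message]
    return hashlib.sha1("\n".join(lines).encode()).hexdigest()


# One snapshot, identified by a hash of its (pretend) contents.
tree = hashlib.sha1(b"same file contents").hexdigest()

# The same snapshot and message committed on top of two different parents.
on_main = commit_id(tree, ["a" * 40], "Fix typo")
on_feature = commit_id(tree, ["b" * 40], "Fix typo")

print(on_main != on_feature)  # True: the parent is part of the commit's identity
```

Same inputs always give the same ID, so the IDs only differ because the history behind them differs.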

"This in turn means that you need to be comfortable and fluent in a branching many-worlds cosmology" - yes, if you need to handle different versions, you need to switch between them. That's the complexity of what you're doing, not the tool. And I like that Git is not trying to hide things that I need to know to understand what's happening.

"distinguish between changes and snapshots that have the same intent and content but which are completely non-interchangeable and imply entirely different flows of historical events" How do you even end up in a situation like that? Anyway, sounds like you should be able to merge them without conflicts, if they are in fact completely interchangeable?

"The natural mental model is that names denote global identity." Why should another repo care, which names I use? How would you even synchronize naming across different repos without adding complexity, e.g. if two devs created a branch "experimental" or "playground". Why on earth should they be treated as the same branch?

"Git uses the cached remote content, but that’s likely out of date" I actually agree that this can lead to some errors and confusion. But automation exists - you can just fetch every x minutes.

"Branches aren't quite branches, they're more like little bookmark go-karts." A dev describing what basically is just a pointer in this way leads to the suspicion that it might not be Git's mental model that is alien.

"My favorite version of this is when the novice has followed someone's dodgy advice to set pull.rebase = true" Maybe don't do stupid stuff you don't understand? We know what fetch is, we know what merge is. Pull is basically fetch & merge.

""Pull" presents the illusion that you can just ask Git to make everything okay for you" Just... what? The rest of the sentence doesn't really fix this error in expectations.

* except the CLI of course, but I can use GUI tools for most tasks

[–] [email protected] 18 points 1 year ago* (last edited 1 year ago) (1 children)

I also disagree with a lot of the reasoning. Like “If a commit has the same content but a different parent, it’s NOT the same commit” is not an “alien concept”. When I apply the same change to different parents, I end up with different versions. Which would be kinda bad for a Version Control System.

It's also intuitive, it's how frames in video compression work, too. And in fact if you have two kids that look virtually identical but are from different families, they are very clearly not the same person. Context matters, most people more-than-intuitively understand that.

“Git uses the cached remote content, but that’s likely out of date” I actually agree that this can lead to some errors and confusion. But automation exists - you can just fetch every x minutes.

Yeah, and never mind that virtually any tool does that for you. So this is a long-solved problem.

[–] [email protected] 5 points 1 year ago

I'd expect a developer to understand that. A stack trace works the exact same way.

[–] [email protected] 14 points 1 year ago* (last edited 1 year ago) (1 children)

Hot take: Git is hard for people who do not know how to read documentation.

The Git book is very easy to read and only takes a couple of hours to read the most significant chapters. That's how I learnt it myself.

Git is meant for developers, i.e. people who are supposed to be good at looking up online how stuff works.

[–] [email protected] 15 points 1 year ago

developers, i.e. people who are supposed to be good at looking up online how stuff works.

How I wish this were true.

[–] [email protected] 9 points 1 year ago (1 children)

Each commit includes the diff and metadata (like parent commits).

Commits don't store diffs, so you're wrong from the start here.

Hence why people say "git is hard"

[–] [email protected] 4 points 1 year ago (1 children)

Yeah, you're right, technically it's not a "diff", it's the changed files.

I don't think this technical detail has any consequences for the general mental model of Git though - as evidenced by the fact that I have been using Git for years without knowing this detail, and without any problems.

[–] [email protected] 2 points 1 year ago

It's all the files. Content-addressable storage means that they might not take up any more space. Smart checkout means they might not require disk operations. But it's the whole tree.
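
A rough Python sketch of the idea, as a toy model only (`ObjectStore` and `snapshot` are hypothetical names, and real Git hashes a header plus the content and also stores tree objects): every snapshot lists every file, but identical content is stored once.

```python
import hashlib


class ObjectStore:
    """Toy content-addressable store: the key is a hash of the content."""

    def __init__(self):
        self.objects = {}

    def put(self, data):
        oid = hashlib.sha1(data).hexdigest()
        self.objects[oid] = data  # storing identical content again is a no-op
        return oid


def snapshot(store, files):
    """Record the whole tree: every path maps to the ID of its content."""
    return {path: store.put(data) for path, data in files.items()}


store = ObjectStore()
main_c = b"int main(void) { return 0; }\n"
first = snapshot(store, {"README": b"hello\n", "main.c": main_c})
second = snapshot(store, {"README": b"hello, world\n", "main.c": main_c})

print(len(second))                           # 2: the snapshot still lists both files
print(first["main.c"] == second["main.c"])   # True: unchanged file, same object
print(len(store.objects))                    # 3: two READMEs plus one shared main.c
```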

[–] [email protected] 8 points 1 year ago (2 children)

One problem, I think, is that git names are kinda bad. A git branch is just a pointer to a commit, it really doesn't correspond to what we'd naturally think of as a branch in the context of a physical tree or even in a graph.
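
A minimal sketch of what I mean, in Python. `ToyRepo` is a made-up toy model, not how Git is implemented: a branch is just an entry in a name-to-commit table that gets moved when you commit, and a tag is the same kind of entry that stays put.

```python
class ToyRepo:
    """Hypothetical toy model: commits are IDs, branches and tags are name -> ID maps."""

    def __init__(self):
        self.parents = {}               # commit ID -> list of parent IDs
        self.branches = {"main": None}  # branch name -> commit ID (None before first commit)
        self.tags = {}                  # tag name -> commit ID
        self.head = "main"              # name of the checked-out branch
        self._count = 0

    def commit(self):
        self._count += 1
        cid = f"c{self._count}"
        tip = self.branches[self.head]
        self.parents[cid] = [] if tip is None else [tip]
        self.branches[self.head] = cid  # only the checked-out branch pointer moves
        return cid


repo = ToyRepo()
c1 = repo.commit()
repo.tags["v1.0"] = c1         # a tag: a pointer that never moves
repo.branches["feature"] = c1  # a new branch: one more pointer at the same commit
repo.head = "feature"
c2 = repo.commit()

print(repo.branches)  # {'main': 'c1', 'feature': 'c2'}
print(repo.tags)      # {'v1.0': 'c1'}
```

Nothing in the commits themselves says which "branch" they belong to; that information lives only in the pointer table.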

That's a bit problematic for explaining git to programming newbies, because grokking pointers is famously one of the stumbling blocks people have, along with recursion. Front-end web developers who never learned C might not really grok pointers due to never really having to deal with them much.

Some other version control systems like mercurial have both a branch in a more intuitive sense (commits have a branch as a bit of metadata), as well as pointers to commits (mercurial, for example, calls them bookmarks).

As an aside, there's a few version control systems like darcs where instead of the first-class concept being snapshots, it's diffs. There's no separate cherrypick command in darcs, it's just one way you can use the regular commands.

[–] [email protected] 3 points 1 year ago* (last edited 1 year ago)

Each commit includes the diff

It doesn't. ☺

[–] [email protected] 41 points 1 year ago (4 children)

I totally disagree. Git is not hard. The way people learn git is hard. Most developers learn a couple of commands and believe they know git, but they don't. Most teachers teach to use those commands and some more advanced commands, but this does not help to understand git. Learning commands sucks. It is like a cargo cult: you just do something similar to what others do and expect the same result, but you don't understand how it works and why sometimes it does not do what you expect.

To understand git, you don't need to learn commands. Commands are simple and you can always consult a man page to know how to do something if you understand how it should work. You only need to learn core concepts first, but nobody does. The reference git book is "Pro Git" and it perfectly explains how git works, but you need to start reading from the last chapter, 10 Git Internals. The concepts described there are very simple, but nobody starts learning git with them, almost nobody teaches them in the beginning of classes. That's why git seems so hard.

[–] [email protected] 4 points 1 year ago (1 children)

Ahhhhh, that’s why! I should’ve known to read from the end, not the beginning, lmao. Jokes aside, thanks for the advice, I’ll try it out :)

[–] [email protected] 3 points 1 year ago

Came here to say the same thing. The git book is an afternoon's reading. It's well worth the time - even if you think you know git.

People complain about the UX of the cli tool (perhaps rightly) but it's honestly little different from the rest of the unix cli experience: ad hoc, arbitrary, inconsistent.

What's important is a solid mental model and the vocabulary of primitive and compound operations built with it. How you spell it in the cli is just a thing you learn as you go.

[–] [email protected] 3 points 1 year ago

I agree, the teaching is wrong. I always teach it visually. That seems to do the trick

[–] [email protected] 18 points 1 year ago (2 children)

In this thread - tons of smart people thinking that the tools we use to replace "make a backup of a file on a server somewhere" should require entire reference books, as if that's normal.

Saying "it's a graph of commits" makes no sense to a layperson. Hell, the word "diff" makes no sense. Requiring training to get something right is acceptable, but "using a VCS" is a tiny, tiny part of the job, not the whole job. I mean, even most of the commenters on this thread are getting small things wrong (and some are handwaving it away saying "oh that small detail doesn't matter").

Look, git is hard. It's learnable, but it's hard. The concepts are medium hard to understand, and the way it does things is unique and designed for distributed, asynchronous work - which are usually hard problems to solve.

[–] [email protected] 8 points 1 year ago* (last edited 1 year ago)

While I agree 100% with your main point,

"it’s a graph of commits” makes no sense to a layperson

You're probably putting your standards too low. Every coder should know what a graph is, the basic concept at least. If you can understand fizzbuzz you can understand graphs too.

the word “diff” makes no sense

diff is short for difference. And that basically explains it

[–] [email protected] 6 points 1 year ago

Saying “it’s a graph of commits” makes no sense to a layperson.

Sure, but git is aimed at programmers, who should have learned graph theory in university. It was part of the very first course I had as an undergraduate many years ago.

Git is definitely hard though for almost all the reasons in the article, perhaps other reasons too. But not understanding what a DAG is shouldn't be one of them, for the intended target audience.

[–] [email protected] 15 points 1 year ago (3 children)

My favorite version of this is when the novice has followed someone's dodgy advice to set pull.rebase = true, then they pull a shared branch that they're collaborating on, into which their coworker has just merged origin/main. Instant Sorcerer's Apprentice-scale chaos!

Why are you doing that? Don't do that.

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago)

And anyway... it's trivial to fix. If you still have the commit ID of the tip of the branch before the pull, go back to that. If not, look it up in the reflog. If that's too much of a hassle, list the commits you only have locally, stash any changes, reset to the origin/the_branch and cherry-pick your commits again and/or apply the stash.

I really embraced git once I understood that whatever I did locally, it's most of the time relatively easy to recover from cock-ups. And it's really difficult to lose work from the moment you've added it to a (local) commit or stashed it.

I do understand that git is daunting however, and there is plenty where I think the defaults are bad. Too often I've seen merge commits where someone merged the remote of a branch into the local copy of the same branch, or even did this on main. And once this stuff gets pushed it's nigh impossible to go back.

[–] [email protected] 12 points 1 year ago* (last edited 1 year ago) (4 children)

A much simpler solution: don't use the git CLI. And in my professional life I don't know a single person who does. The shortcomings of git have long been abstracted away and as problematic as the CLI is, it's now just an internal library of the tools we actually use.

Also the git pull criticism is weird. Yeah, it exists on paper, and yeah, once in a blue moon there's a conflict after a pull with rebase, but... this doesn't even begin to dent the oodles of time saved from just doing Ctrl+T in IntelliJ and being up to date with no further input. Why waste 20 minutes 40x-100x a day instead of 45 minutes once every 3-6 months? Especially this case:

My favorite version of this is when the novice has followed someone's dodgy advice to set pull.rebase = true, then they pull a shared branch that they're collaborating on, into which their coworker has just merged origin/main. Instant Sorcerer's Apprentice-scale chaos!

I'm sorry, but are you collaborating or competing on a shared branch? If it is a collaborative effort, maybe just talk about it? And in fact, unless the other person is an utter asshole, they'll have done so before merging in the new changes from main. That's not even to mention that in 99.95% of cases or so, that exact scenario is perfectly fine and gets resolved without any issues whatsoever and no user input necessary. Bringing us once again to the situation where you save a moderate amount of time multiple times a day by always just pulling.

(edit)
Don't get me wrong, all of this criticism is of course valid. But it feels like a very arcane case, as no project should be able to produce these issues frequently unless there's some underlying problem in either the mode of collaboration or the structure of the project in the first place. The usage of git has long been abstracted away, and the tools handle virtually any and all edge cases, including making merging far smarter than if you were to use the CLI.

[–] [email protected] 22 points 1 year ago (2 children)

I’ve used the git cli exclusively for more than a decade, professionally. I guess it varies wildly by team, but CLIs are the only unambiguous way to communicate instructions, both for humans and computers. That being said, I still don’t mess around with rebase for anything, and I do use a gui diff tool for merge conflict resolution. Practically everything you need to do with git can be done with like 10 commands (I’m actually being generous here, including reset, stash, and tag).

[–] [email protected] 8 points 1 year ago (1 children)

That being said, I still don’t mess around with rebase for anything

Rebasing has a worse reputation than it deserves. It's something you just get used to, just like Git itself when you started using it. There are a couple of strategies to make it easier and less anxiety-inducing:

Before starting a rebase of a long branch, create a backup branch pointing at the current tip. That way, in case you seriously mess up, you can just delete the rebasing branch and rename the backup branch to restore everything (you can usually get away with git rebase --abort; this is just added safety). Even after a successful rebase, you can keep the backup branch around as a faithful record of the actual development history.
Do only one (or at most two) operations in a single rebase. Do this over multiple rebases to get what you want.

After a while, rebasing becomes as simple as committing or merging.

[–] [email protected] 2 points 1 year ago

Rebasing and merge conflicts are the top ways that git can turn into a mess. I know that rebasing could (in some circumstances) make merge conflicts less of an issue, but I just mostly think the value of "commit grooming" is overrated. I don't want to argue about this, if you like doing it, go ahead.

[–] [email protected] 5 points 1 year ago (1 children)

I had to check and make sure I didn't type the comment above because it sounds exactly like me.

All UIs do things slightly differently, the CLI is always exactly the same... Everywhere. UI for non trivial conflict resolution? Definitely. For everything else, CLI.

And, I'm also reticent to use rebase unless I have to. Gimme that good ole FF :)

[–] [email protected] 16 points 1 year ago

don’t use the git cli. In my professional life I don’t know a single person who does

I do, I find it much simpler than using the GUIs

[–] [email protected] 2 points 1 year ago (4 children)

I usually use a gui but I know plenty of colleagues who exclusively use cli. I've never understood if it's an ego thing or what but it's an incredibly popular way to use git

[–] [email protected] 7 points 1 year ago (1 children)

I exclusively use CLI, it's not ego at all, I simply find typing what I want to be quicker than clicking buttons. I've written a bunch of aliases to automate my common workflows.

When I need to help a colleague who's made a mess of something, I can easily give them the command to fix it rather than finding the right options in their GUI of choice and it's often because of some broken abstraction in the GUI they got into the mess in the first place.

[–] [email protected] 3 points 1 year ago (1 children)

Yeah, there are totally commands I use daily, but the visualization involved in looking at the log and available branches (which is a constant use case) is much easier in a GUI for me. In fact I'd go as far as saying logs, diffs, and branching in the CLI are nigh unusable. The buttons I click while in the GUI (like fetch/pull/commit) are largely used because at that point (after finding and checking out the right branch, etc.) it would be slower to switch back to the CLI.

I only mentioned ego because I've seen multiple junior devs struggling with the command line resist using a gui even when it solves a specific problem they are having quite easily. To each their own though.

[–] [email protected] 3 points 1 year ago (2 children)

I use the CLI for simple commands, especially if helping someone on another PC and I don't have access to my preferred tool, but I honestly don't get people who use it religiously and never even try tools with GUIs. The convenience of being able to easily see the commit history, scroll through it, have a right click context menu or ability to just click it and see file changes (and then right click those files for additional options), is just something I can't abandon. Nowadays even the aliasing can be replicated in those tools if they support creation of custom commands so even that is a moot point - with some setup you can be as fast as with a CLI.

[–] [email protected] 3 points 1 year ago (1 children)

I do have a huge ego, but I'll claim it's a total coincidence that I use CLI git.

My main reason to use mostly CLI is the better error messages I get when something goes wrong.

My secondary reason is that my preferred GUI tools for git didn't use to support operations I do often, such as 'cherry-pick' and 'rebase'. I think that is mostly solved now, but my habits change slowly and I'm used to the CLI.

[–] [email protected] 2 points 1 year ago (1 children)

Haha fair. My experience though, is that cherry picking is easily done in gui and I've honestly never attempted on cli because it only takes me three clicks in Fork

[–] [email protected] 1 points 1 year ago

Yeah. Cherry pick was the killer feature for CLI back when I was forming habits. Seems like it's built into most tools now, which is really nice.

[–] [email protected] 1 points 1 year ago (1 children)

Which to me is just wild unless you're doing something you wouldn't want to use an IDE for - and that's not actually that many professional things, if I'm being honest. But if you use an IDE, then it's far easier, faster and importantly doesn't take you out of your mental flow to just use the built-in git abstraction of that IDE.

[–] [email protected] 1 points 1 year ago

There are things that my GUI of choice lacks, so I occasionally type out a command, although I did also bind a couple of commands to GUI buttons, so there's that.

[–] [email protected] 11 points 1 year ago (5 children)

I honestly don’t get why folks dislike rebase. I use it constantly, especially to squash commits so that my pull requests are a single commit that can be reverted easily.

[–] [email protected] 10 points 1 year ago* (last edited 1 year ago)

It's also kinda annoying to have a history full of "merge" commits polluting the commit messages and an entwined mix of parallel branches crossing each other at every merge all over the timeline. Rebasing makes things so much cleaner, keeping the branches separate until a proper merge is needed once the branch is ready.

[–] [email protected] 7 points 1 year ago

I use rebase when I'm working in a dev branch. If someone else has pushed changes to the main branch, rebasing the dev branch on top of main is a way to do the hard work of resolving merge conflicts up front. Then I can rerun tests and make sure everything still works with changes from the main branch. And finally, when it is time to merge my dev branch to main, it's a simple fast-forward.
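
Here's a small Python sketch of why the final merge turns into a fast-forward. It's a toy model: the commit names and the `is_ancestor` helper are made up, and a real rebase replays the changes as brand-new commits, which I'm faking here with d1' and d2'. The rule it illustrates is that main can fast-forward to dev exactly when main's tip is already an ancestor of dev's tip.

```python
def is_ancestor(parents, ancestor, commit):
    """Walk parent links from `commit` and report whether `ancestor` is reachable."""
    stack, seen = [commit], set()
    while stack:
        current = stack.pop()
        if current == ancestor:
            return True
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents.get(current, []))
    return False


# Before the rebase: main (tip m1) and dev (tip d2) have diverged from commit b.
before = {"a": [], "b": ["a"], "m1": ["b"], "d1": ["b"], "d2": ["d1"]}

# After rebasing dev onto main: the replayed commits d1' and d2' sit on top of m1.
after = {"a": [], "b": ["a"], "m1": ["b"], "d1'": ["m1"], "d2'": ["d1'"]}

print(is_ancestor(before, "m1", "d2"))   # False: a real merge would be needed
print(is_ancestor(after, "m1", "d2'"))   # True: main can simply fast-forward to d2'
```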

[–] [email protected] 2 points 1 year ago (1 children)

Because rebase is fraught with peril, if you also push rebased branches upstream and someone else works off that branch.

If you stick to the rule of only using rebase on local branches that have never been pushed upstream, it's an awesome tool. If you don't, you're eventually going to cause someone to have a bad day.

[–] [email protected] 2 points 1 year ago (1 children)

Yeah, basically anything that rewrites already-pushed history and is then (force-)pushed is bound to create problems (unless it's a solo dev only ever coding on a single device, who uses the remote repo as a mere backup solution).

[–] [email protected] 2 points 1 year ago

Yep. I work exclusively in forks, and all my work is done on my machine, rebased, squashed and then pushed to my fork for a PR. No commits from main are ever touched in my rebase. It’s such a clean workflow for me.

[–] [email protected] 11 points 1 year ago* (last edited 1 year ago) (1 children)

git gets easier once you get the basic idea that branches are homeomorphic endofunctors mapping submanifolds of a Hilbert space.

(source)

Edit: but to actually have content in this comment, I'm not sure the mental model is the problem. It's not that alien that a good explanation wouldn't help, but it took a long time for git to start paying any sort of attention to "human readability." It was and still is in a way "aggressively technical" and often felt like it purposefully wanted to keep anybody but the most UNIX-bearded kernel hackers from using it. The man pages were rarely helpful unless you already understood git, the options were very unintuitively named, etc etc. And considering Linus' personality, I'm not exactly surprised.

With a little bit of more thought on how to make it more usable right from the start, I'm not sure it'd have such a reputation as it has now. The reason why I think this endofunctor joke is so funny is that that sort of explanation to "simplify" git wouldn't have been at all out of place – followed by the UNIX beards scoffing at the poor lusers who didn't understand their obviously clear description of what git branches are.

[–] [email protected] 3 points 11 months ago

Reminds me of the old joke that monads are easy to understand: you just have to realize that a monad is just a monoid in the category of endofunctors.

[–] [email protected] 8 points 1 year ago* (last edited 1 year ago)

I might be suffering from Stockholm syndrome here, but my preferred ways of working with git are the CLI and the fugitive Vim plugin, which is a fairly thin wrapper around the CLI. It takes a middle-ground approach between hiding the magic and forcing you to learn it, which I suppose can be confusing for beginners when you work collaboratively and something happens that forces you to go beyond pull/add/commit/push.

[–] [email protected] 6 points 1 year ago

In my (admittedly limited) experience, mercurial is much more intuitive than git. I really dislike that git branches are only tags on the heads and completely ephemeral. It favours creating a single clean history instead of preserving what actually happened.

[–] [email protected] 6 points 1 year ago

Git Koans

[–] [email protected] 4 points 1 year ago* (last edited 1 year ago) (1 children)

I only stick with these:

pull
add
commit
push

Easy.

[–] [email protected] 1 points 1 year ago

Merge is love merge is life, get the hell out of here with that rebase witchcraft.

[–] [email protected] 3 points 1 year ago

LazyGit is a thing ❤️🙌

[–] [email protected] 1 points 1 year ago* (last edited 1 year ago) (1 children)

Isn't It Obvious That C Programmers Wrote Git
