
### How to use git

Ok, I'm sure most of the people reading this feel they already know how to use git, and it's more than likely that you *do*. In January, I'll be doing a showcase of Sourcehut and related git-y things in one of our internal dev meetings at work. As it's something I'm interested in, I thought I'd create a written version for here too. It'll essentially be a non-interactive draft version of what I'll be doing in the meeting.

This assumes that you know what git is and have done at least the very basics of it (push/pull on single-contributor projects with 1 branch hosted on some common site like GitHub or whatever), and also assumes you've sent an email before.

### What is Sourcehut

Sourcehut is a collection of micro-services for hosting git or mercurial repositories and for developing and maintaining software. It includes the repo hosting itself, CI systems, issue tracking, wikis, and mailing lists. It's built around the principles of email-driven contribution (we'll get to that later), so the "website" bits of it are actually very minimal and don't rely on any JavaScript, which is a wonderful change from all the bloated 4GiB turds you get when you try and visit GitHub, GitLab, or TFS etc. Seriously though, sourcehut is HTML, a couple of style sheets, and an authentication cookie.

It's licensed under the AGPL, so it respects the 4 essential freedoms, and the licensing combined with the business approach of its creator (Drew DeVault, who we'll see a lot more of as we go) makes the whole thing pretty resilient to the kind of surveillance capitalism you see with GitHub and GitLab that ultimately led me to evacuate both platforms. You can also self-host it, so if you like sourcehut but for whatever reason don't trust the people hosting sr.ht, you're not forced into anything and can just host your own. Because it's free software, you can audit the code you're running and verify that any binary version you download hasn't been tampered with.

Look mum, no malware!

When you log into sourcehut and go to git.sr.ht, you'll see something like this:

Main page of git.sr.ht

If I open a repo, then I get this set of views for exploring the project:

Repo page, Tree tab, and Log tab

The other tabs don't have any interesting info on them for this repo because it's just a single-branch, single-maintainer project at the moment.

Personally, I don't like the light theme, but you interact with the web UI pretty rarely so I've never bothered to write any custom CSS for Stylus.

As you see on the main page for the repo, there are 2 clone links. The https one is read-only and is intended for external contributors, who don't have permission to push to the main repo. If you're a direct contributor with write access to the repository, you clone and push over SSH instead, which is not only more secure (as you're not sending your password to the server), but also just simpler to work with.

### Builds

Besides your common repo exploration, bug trackers, and mailing lists, builds.sr.ht is your build system. Unlike the convoluted and platform-locking "solutions" GitLab et al try and provide, the system on sourcehut is a tremendously simple yaml file. As an extra feature, if you have a file named .build.yml in the root of your repository, when you push to origin, a build is started automatically for you. Here's the build manifest for my O repo:

    image: archlinux
    packages:
      - dlang
    sources:
      - https://git.sr.ht/~otheb/o
    tasks:
      - build: |
          cd o
          ./make.sh

The push command will display a link to the new build being started which I can click on and see happening:

A build in progress

In the event the build fails, the VM that ran it is kept alive for 10 minutes before being destroyed. The page for the failed build gives me the information I need to SSH into that VM and see exactly what went wrong, such as examining log files or making sure any files are in the right places.

Notice at the top of a failed build
    $ ssh -t builds@azusa.runners.sr.ht connect 119898
    Enter passphrase for key '/home/olie/.ssh/id_rsa':
    Connected to build job #119898 (failed): https://builds.sr.ht/jobs/~otheb/119898
    Your VM will be terminated 4 hours from now, or when you log out.
    [build@archlinux ~]$ cd o
    [build@archlinux o]$ ls
    LICENCE README buildtask.d extract infotask.d lalr lexer lib main.d make.sh parser semantic testproject
    [build@archlinux o]$

If you look closely, you'll notice that the content of the task in my build manifest is just a list of commands to run. The idea is you just break up your manifest into a set of tasks, where each task is essentially just its own little shell script. This means that the manifests are trivial to write.
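To make the one-task-one-script idea a bit more concrete, here's a rough sketch that writes a two-task manifest into a scratch directory. The `test` task and `./test.sh` are hypothetical additions purely for illustration and aren't part of my actual repo. As far as I understand it, each task runs as its own script, which is why the sketch repeats the `cd o` in both.

```shell
# Write an example manifest into a throwaway directory.
demo=$(mktemp -d)
cat > "$demo/.build.yml" <<'EOF'
image: archlinux
packages:
  - dlang
sources:
  - https://git.sr.ht/~otheb/o
tasks:
  - build: |
      cd o
      ./make.sh
  # Hypothetical second task: test.sh does not exist in the real repo.
  - test: |
      cd o
      ./test.sh
EOF

# Show what was written.
cat "$demo/.build.yml"
```

Splitting things up like this mostly pays off in the build log, where each task gets reported separately.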

### Bug tracker and mailing lists

Sourcehut also has an issue tracker and mailing lists. Because all these systems are small and distinct, there isn't a 1-to-1 mapping between any of these systems. You can have as many issue trackers and mailing lists as you want, and they don't even have to be related to a repository. In most cases though, you'll have your repo and have an associated bug tracker and mailing list, which is your "standard" setup for projects like this.

Let's take aerc as an example project. aerc is an email client that we'll be seeing later, that just so happens to be made by Drew DeVault from earlier (busy guy). It has a mailing list and a bug tracker. Unlike what you might see on GitHub, the bug tracker is *just* for confirmed bugs. Everything else goes on the mailing list. The idea is that if it's on the bug tracker, then it's a thing that absolutely definitely needs work to be done at some point. The service is called todo for that reason. Questions, feedback, and patches (the analogue to PRs) all go on the mailing list. This is where we get onto the actual workflow being used with these systems.

### Git and email

Git isn't designed to be used with GitHub or anything like it. It already has systems built into it for managing contributions and everything so you don't need any horrible flashy "web UIs" to use it. If you ever want to contribute to say Vim, Git itself, or the Linux kernel, then you'll go to the website for the project and see something like this (using Linux as an example):

git.kernel.org

This particular front end is called cgit and it's really simple. It even displays fairly comfortably in a text-based browser like Lynx:

git.kernel.org in Lynx

There are 2 different processes you follow depending on whether you're a contributor or a maintainer. The big key thing is that YOU DO NOT NEED AN ACCOUNT ON ANYTHING TO CONTRIBUTE. This is what makes this system so much better than GitHub et al because you're not locked into any platform at all. This is a particularly big advantage for people self-hosting. Gitea is a pretty popular self-hosted git system, but because it follows the same broken approach as everything else, hosting Gitea is just locking yourself away and making it unrealistic for anyone else to try and contribute. With sourcehut or cgit, that's not a problem. If I see a cool project that I want to contribute to, then the ideal is that I just clone the repo, make my changes, then send them in as an email. No accounts, and none of the security headaches that come with them.

### Contributing

If you're a contributor, you first clone a repository using the https link. On your local copy, you can make all the changes you want. You can also create a distinct repository, add it as a remote, and maintain a fork. As you make your changes, you can pull in changes from upstream so your copy doesn't lag behind. When you're happy with your changes, you send them to the project's mailing list as a patch. Drew DeVault (unsurprisingly) has made an interactive tutorial for contributing to projects like this.

The first 2 steps are just about configuration. You just need to set up git so it can send email over SMTP (IMAP/POP not needed!). Then we can clone the example repository:

    $ git clone https://git.sr.ht/~sircmpwn/email-test-drive
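For reference, the configuration part mostly comes down to a handful of `sendemail.*` settings. This is only a sketch: the server, user, and port values are placeholders you'd swap for your own provider's details, and it writes to a scratch repository so it won't touch your real config.

```shell
# Scratch repository so the demo doesn't touch your real config.
demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"

# Placeholder values: swap in your own mail provider's details.
git config sendemail.smtpserver mail.example.org
git config sendemail.smtpuser you@example.org
git config sendemail.smtpencryption tls
git config sendemail.smtpserverport 587

# Print everything git send-email will pick up from this repo.
git config --get-regexp '^sendemail\.'
```

In real use you'd most likely add `--global` to each `git config` call so the settings apply to every repository.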

Next, we can make our contribution to the project:

    $ echo "I'm about to try git send-email!" > blah
    $ git add blah
    $ git commit -m "Demonstrate that I can use git send-email!"

If we had write access to the repository, then what we'd do here is git push, if we were on GitHub then we'd open up a browser and throw our RAM out the window creating a pull request, but we're not doing either of those, so we stay nice and comfy in a terminal and use git send-email:

    $ git send-email HEAD^ --to="~sircmpwn/email-test-drive@lists.sr.ht"

Yes, ~sircmpwn/email-test-drive@lists.sr.ht *is* a real and valid email address. If we open our email client, we should see we have a response:

    Hi Blah!

    Thanks for the patch! Needs a minor fix, though:

    > diff --git a/blah b/blah
    > -%<-
    > +I'm about to try git send-email!

    This statement is no longer correct - you have already tried it, and
    succeeded! Can you change this to the following:

    I have successfully used git send-email!

    After you make the change, you can edit your commit like so:

    git add blah
    git commit --amend

    Then send along a v2 of your first patch:

    git send-email -v2 \
      --to="~sircmpwn/email-test-drive@lists.sr.ht" \
      HEAD^

    If you aren't sure what to do, please shoot an email to
    sir@cmpwn.com asking for help.

Makes sense, so let's make that change and follow the instructions. We use --amend on our commit to combine it with our first one. Git's commit history is mutable and that fact should be made use of as I'll explain further down.

So we use git to send another email, using -v2 to show that this is the second version of our contribution. We can also use the --annotate flag to allow us to add what's called "timely commentary". This won't turn up in the commit message - it's just for the email - and its purpose is to explain what makes this version different/better than the previous. Where commit messages are more a description of what the change means and does, timely commentary is a description of the motivation for the change or any miscellaneous information that's only really helpful for the maintainer looking over your contribution at the time. So if we send off the new email, wait a bit, then check our emails:

Thanks for the updated patch! This one looks good. Great work :)

Great! You just contributed to a software project... *sort of*.
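If you want to see what a v2 looks like before anything gets mailed, `git format-patch` takes the same `-v2` option and writes the would-be email to a file instead. A minimal sketch in a scratch repository, with a made-up identity and an extra initial commit so that `HEAD^` exists:

```shell
set -eu
# Throwaway identity so the demo works on a machine with no git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"
echo "readme" > README
git add README
git commit -q -m "Initial commit"

# First attempt at the contribution.
echo "I'm about to try git send-email!" > blah
git add blah
git commit -q -m "Demonstrate that I can use git send-email!"

# The requested fix, folded into the same commit with --amend.
echo "I have successfully used git send-email!" > blah
git add blah
git commit -q --amend --no-edit

# Render the second version as an email, without sending anything.
git format-patch -v2 --stdout HEAD^ > v2.patch
grep '^Subject:' v2.patch
```

The subject line picks up a `[PATCH v2]` prefix, which is what tells the maintainer (and their mail filters) that this replaces the earlier version.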

Say you want to send multiple commits though. If your change is a bit more substantial and sending in a single massive commit isn't a good idea, you can hand a range of commits to git send-email. The revision you pass in is the point to start *after*: HEAD^ is the parent of the most recent commit, so it sends just that one commit. If you passed in HEAD~3 instead, this would send in the 3 most recent commits as a thread of emails, one per commit. My favourite in this case is @{push}, which points at the remote branch you'd be pushing to, i.e. the last commit that's already been pushed. In practice, this essentially means you're sending in all the commits you have made to your local copy, which is more-often-than-not what you want to be doing.
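One way to check what a revision argument selects before mailing anything is `git format-patch`, which accepts the same revision arguments as git send-email and writes one patch file per commit. A small sketch in a scratch repository (the file names and commit messages are made up):

```shell
set -eu
# Throwaway identity so the demo works on a machine with no git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"
for n in 1 2 3 4; do
    echo "change $n" > "file$n"
    git add "file$n"
    git commit -q -m "Change $n"
done

# HEAD^ selects everything after HEAD's parent: a single patch.
git format-patch -q -o one HEAD^
# HEAD~3 selects everything after the third ancestor: three patches.
git format-patch -q -o three HEAD~3

ls one three
```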

If you've received feedback on your contribution and gone back to make changes, but don't feel that the changes are substantial enough to warrant a v2, then you can send your email as a reply. You'll need the message ID of the email you want to reply to, which is in the message headers. You then pass this ID in with the --in-reply-to flag.
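To see what the flag actually does to the outgoing mail, here's a sketch using `git format-patch`, which understands the same `--in-reply-to` option. The Message-ID is a made-up placeholder; in practice you'd copy the real one out of the headers of the email you're answering.

```shell
set -eu
# Throwaway identity so the demo works on a machine with no git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"
echo "readme" > README
git add README
git commit -q -m "Initial commit"
echo "small follow-up" > blah
git add blah
git commit -q -m "Address review feedback"

# Hypothetical Message-ID, standing in for the one from the real email.
msgid="<20200101120000.12345-1-dev@example.org>"

# Render the patch as a reply to that message, without sending anything.
git format-patch --stdout --in-reply-to="$msgid" HEAD^ > reply.patch
grep -E '^(In-Reply-To|References):' reply.patch
```

Those two headers are what mail clients use to slot your message into the existing thread.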

Personally, I recommend setting git config --global sendemail.annotate yes, which will open your editor for annotating on every email, as adding notes to your patches is always going to be helpful.

### But email is terrible

Is what some of you may be thinking. It's slow and convoluted and messages are easily buried. In reality, the problem isn't email - it's us. There are 2 significant things we seriously fail at with email, and fixing those two alone would immediately have a substantial positive effect on the usability of email.

The first is HTML email. HTML email was a terrible mistake. You can change the appearance of links so they appear to go to different places which is great for phishing attacks and terrible for you. You can embed remote content that causes your client to make a request for the image, allowing the exact moment you open the email to be logged, which is great for marketers and terrible for you. HTML is an incredibly large and complicated standard and it certainly wasn't designed for emails. Web browsers are incredibly complicated, and introducing that to an email client brings every vulnerability with it - great for attackers, terrible for you. To top it off, not all email clients will display HTML email. Crippling your means of communication just so you can have a pretty logo in your email signature isn't worth it.

Secondly, many clients default to a system called "top posting". You'll be familiar with the look of them where you'll open an email to see the response, but then also see the entire email history in the same email, which is not only utterly useless, but also a terrible mess. Just about every client will show you emails in a thread, so you can look at what was sent previously and what is being replied to. Your email only needs the content of your own message in it. Another thing you're totally allowed to do is modify the content of the email you're replying to. You don't need to have an answer at the top, then the whole question - life story and all - at the bottom. You're allowed to cut out all the fat, quote the little bits that are actually relevant to your response, and put each quote in the middle of your message, right above the part that answers it. Your emails become much shorter, more readable, and far easier to work with. This is called "bottom posting" (or "interleaved posting" when your replies are woven between the quotes), and if everyone did it then email would be a much more pleasant thing to use - especially if everyone *also* sent plain text email.

It's also worth noting that a lot of mailing lists for software projects will simply reject HTML emails. Most clients - while arguably bugged in that they default to terrible settings - can be configured to send emails properly, and there's information about that on this site, which was created by you know exactly who by now.

### Maintaining

Ok, back to git. On the other end of the contributor is the maintainer. As the maintainer, you are the proxy to your software project. If someone sends in a contribution, you are who ultimately decides at what point it gets merged into your upstream repository. So say you get a patch email. You read through it. You're at one of 2 conclusions:

The first conclusion is that it's a good patch and you're happy with it. What you can do is simply pass the verbatim content of the email into git am, which will apply the commits to your repository. From there you can check that things build and tests pass. You can then reply to the email with any normal "thanks"-type message and push the commits to your repo.

The second conclusion is that it's not a good patch and could use some work. You can send back a reply explaining what's not great and let the contributor get back to you later on with a better patch.
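For the happy path, here's a rough end-to-end sketch of the maintainer's side. Everything lives in a scratch directory, and the `upstream` and `contrib` repositories, the identity, and the commit are all made up for the demo. The file produced by `git format-patch` stands in for the email that git send-email would have delivered.

```shell
set -eu
# Throwaway identity so the demo works on a machine with no git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

work=$(mktemp -d)

# The maintainer's repository, with one existing commit.
git init -q "$work/upstream"
cd "$work/upstream"
echo "hello" > README
git add README
git commit -q -m "Initial commit"

# A contributor clones it, commits, and renders the commit as an email.
git clone -q "$work/upstream" "$work/contrib"
cd "$work/contrib"
echo "I have successfully used git send-email!" > blah
git add blah
git commit -q -m "Add blah"
git format-patch --stdout HEAD^ > "$work/patch.mbox"

# The maintainer feeds the raw email to git am, which recreates the commit.
cd "$work/upstream"
git am -q < "$work/patch.mbox"
git log --oneline
```

The author, date, and commit message all come from the email itself, so the contributor still gets the credit in the history.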

It's here where having a good email client makes a big difference. My client of choice - aerc - has a built-in pipe command, so I can switch to the directory of a project, then pipe the email into git am straight from the client and the patches are applied to my local copy. I don't even need to close my email client or work with any external tool. aerc also has email filters that can conditionally pass the content of an email through a script to modify its content before it is displayed. For example, emails with a subject beginning with [PATCH are passed through an awk script that adds highlighting to diffs to make patch emails much easier to read.

### Git rebase

The last thing I want to talk about is git rebase. As I mentioned earlier, the git commit history is mutable, and this should be made use of. The commit history isn't supposed to be a pedantic retelling of your every success and failure. Ideally, you want it to be a clear description of each change that is made to a project. Mistakes are not changes, which is why things like the --amend flag on git commit and git rebase as a whole exist.

The overarching idea is that you will be making changes locally on your machine. Those changes are going to be a mixture of improvements, mistakes, changed approaches, and refactorings. Most of this doesn't matter. What matters is the A, the B, and the difference between the two - not the journey you took to get there.

The most important command for tidying up history like this is git rebase -i, which - when given a certain commit - will open up your editor for you to interactively rebase all commits since the given one. That looks a bit like this:

    pick 749871a Did part of the thing.
    pick bbf269a Did another part of the thing.
    pick 6eb8984 Fixed typos.
    pick 5313b79 Finished!

There's a guide in comments below this that explains the different commands you can apply to certain commits. The basic idea is you change the command word at the start and you can reorder them too. In this particular case, we want to merge the first 2 commits, so we replace the pick on the second line with squash to meld the two commits into one. The third one isn't really a meaningful change, but more just fixing typos. We *could* use squash here, but the fixup command is more appropriate. The only difference is the commit message is dropped as we don't need it here. The typos being fixed isn't meaningful information, and the merged commits will hide that the typos were ever made. Lastly, let's squash the last commit down, so we'll have 1 commit left over at the end, and that one commit will be our change as one tidy thing, rather than 4 messy things.

You'll then notice that your editor switches to a new view that looks a lot like when you're writing a commit, and that's because you are. The content is pre-filled with all commits you've squashed together so you can edit it easily. This is your opportunity to put in as much helpful information as possible to explain exactly what the change is and how it works.
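If you'd like to try that exact squash/fixup/squash edit without risking a real project, here's a sketch that replays it in a scratch repository. To keep it non-interactive, it swaps the editor for a `sed` command via `GIT_SEQUENCE_EDITOR` and accepts the pre-filled commit message with `GIT_EDITOR=true`. That's purely a trick for the demo; normally you'd do both steps by hand in your editor.

```shell
set -eu
# Throwaway identity so the demo works on a machine with no git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

demo=$(mktemp -d)
git init -q "$demo"
cd "$demo"
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

# Recreate the four messy commits from the todo list.
for msg in "Did part of the thing." "Did another part of the thing." \
           "Fixed typos." "Finished!"; do
    echo "$msg" >> thing.txt
    git add thing.txt
    git commit -q -m "$msg"
done

# Stand-ins for the editor: rewrite lines 2-4 of the todo list, then
# accept the combined commit message unchanged.
export GIT_SEQUENCE_EDITOR="sed -i.bak -e '2s/^pick/squash/' -e '3s/^pick/fixup/' -e '4s/^pick/squash/'"
export GIT_EDITOR=true
git rebase -q -i HEAD~4

# One tidy commit on top of the initial one.
git log --oneline
```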

It's also worth adding a note about writing commit messages. The ideal is that you have a short heading - like a title - then leave a blank line and write a short paragraph or two about what's been done. The more information, the better. You're not always going to remember to do this while working on code, which is why git rebase is so great. You can harmlessly write a hasty commit message with no info in it, then when you stop to review what you've done, you can interactively rebase your work, merge similar things, mark little typos as fixups, and for the commits that you don't feel you gave enough info on, you can use the reword command and it will open up your editor to modify the commit message.

My favourite command for doing this is git rebase -i '@{push}', which will interactively rebase all commits you haven't yet pushed. Rebasing local commits is great and makes everything easier for everyone, but the thing to avoid is rebasing commits you've already pushed, because if you push one set of commits, then push a rebased set of commits after someone else has pulled the first set, that person is not going to have fun trying to fix all their conflicting commits as they essentially have 2 different histories that land in the same place.
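To get a feel for what `@{push}` covers before rebasing against it, you can list the commits it selects. A sketch with a scratch "remote" on the local disk (the repository names and commits are made up, and it assumes git's default push settings):

```shell
set -eu
# Throwaway identity so the demo works on a machine with no git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git remote add origin "$work/remote.git"

# One commit that gets pushed...
echo "one" > file
git add file
git commit -q -m "Pushed commit"
git push -q -u origin HEAD

# ...and two that only exist locally.
echo "two" >> file
git commit -q -a -m "Local commit 1"
echo "three" >> file
git commit -q -a -m "Local commit 2"

# Everything after the push destination, i.e. the unpushed commits.
git log --oneline '@{push}..'
```

Only the two local commits are listed, and those are exactly the ones `git rebase -i '@{push}'` would offer up for editing.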

Often, when you pull from a repo you're making changes to, there will be conflicts, because - under the hood - what happens when you pull is your current latest commit is merged with the latest commit from the origin. Your local commits on a branch are like a phantom branch in parallel with the origin, so when you pull, we have to merge the two. You can alternatively tell git to rebase on pull rather than merge. This means that you can resolve conflicts tidily and without generating merge commits, which are often unneeded. Instead, you're more combining your two sets of commits in a way that makes everything line up without needing an extra commit on the end. Set this option with git config --global pull.rebase true. The only difference you'll notice is that when you get dropped out to resolve a conflict, once you've fixed it, you want to run git rebase --continue, which will continue the automatic rebasing process. Because the commits are being combined cumulatively rather than being merged at the end, you may be dropped out to fix things multiple times, but the result is that your commit history ends up cleaner.
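Here's a sketch of the rebase-on-pull behaviour with two made-up clones, `alice` and `bob`, sharing a scratch remote. It sets `pull.rebase` per repository rather than with `--global` so it doesn't change your own settings, and the two changes touch different files, so nothing conflicts.

```shell
set -eu
# Throwaway identity so the demo works on a machine with no git config.
export GIT_AUTHOR_NAME="Example Dev" GIT_AUTHOR_EMAIL="dev@example.org"
export GIT_COMMITTER_NAME="Example Dev" GIT_COMMITTER_EMAIL="dev@example.org"

work=$(mktemp -d)
git init -q --bare "$work/remote.git"

# First copy: creates the base commit and pushes it.
git init -q "$work/alice"
cd "$work/alice"
git remote add origin "$work/remote.git"
echo "base" > base.txt
git add base.txt
git commit -q -m "Base"
git push -q -u origin HEAD

# Second copy: cloned before the remote moves on.
git clone -q "$work/remote.git" "$work/bob"

# The remote gains a commit...
echo "from alice" > alice.txt
git add alice.txt
git commit -q -m "Alice's change"
git push -q

# ...while the second copy gains a different one locally.
cd "$work/bob"
echo "from bob" > bob.txt
git add bob.txt
git commit -q -m "Bob's change"

# Pull with rebase: the local commit is replayed on top of the new upstream.
git config pull.rebase true
git pull -q
git log --oneline
```

The log comes out as a straight line with the local commit on top and no merge commit in sight.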

There's more information on things you can do at this site, which you already know the creator of without me needing to tell you.

### Extras

This is a lot of git commands, and typing them out can sometimes be quite tedious. Shell aliases come to the rescue here and mean you can make a lot of these commands much shorter. I use fish as my shell, which has what it calls "abbreviations", which are a bit fancier than aliases in that they expand out when you press space so you can modify them if needed. As of writing, these are the abbreviations I have for git commands (copied from my fish config):

    abbr gs git status
    abbr gc git commit
    abbr gca git commit --amend
    abbr ga git add
    abbr gaa git add -A
    abbr gse git send-email HEAD^ --to=\"\"
    abbr gd git diff
    abbr gr git rebase
    abbr gra git rebase -i \'@\{push\}\'
    abbr grc git rebase --continue
    abbr gP git push
    abbr gb git branch
    abbr gba git branch -a
    abbr gp git pull
    abbr gh git checkout
    abbr gl git log \'@\{push\}..\'
    abbr gla git log
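If you're on bash or another POSIX-ish shell rather than fish, plain aliases get you most of the way there. This is just a sketch of a few of the above translated, assuming nothing else on your system already uses these names:

```shell
# A handful of the abbreviations above as ordinary shell aliases.
alias gs='git status'
alias gc='git commit'
alias gca='git commit --amend'
alias gra="git rebase -i '@{push}'"
alias grc='git rebase --continue'
alias gl="git log '@{push}..'"

# List what's defined.
alias
```

Unlike fish abbreviations, these don't expand as you type, so you can't tweak the command before running it. You'd put them in your shell's startup file to make them stick.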

### Conclusion

That's about everything I have to cover. Git is a much bigger set of tools than I think a lot of people realise, and learning to use it "as intended" can teach you a tremendous amount - even if you don't end up using it exactly as such. Personally, I think this is a great way of using git and want to learn as much as possible as I continue to explore it. It's an interesting case where "the future is the past". We shouldn't be putting all our expertise into working with locked down and centralised systems like GitHub, we should be moving back to the original system which was decentralised and - most importantly - *accessible*.
