Since the two teams don’t play very often, it would be nice to get a notification when one goes online so the other can join. The server has a log file that we can inspect, so I came up with this little script that emails the other team when we log in, and vice versa. I also made the same for logging off. This is scripting 101, but most people I know are programmers and don’t necessarily dabble in bash.
tail -F /srv/minecraft-server/server.log |
grep --line-buffered 'adymitruk .* logged in' |
while read line
do
echo "Join me if you can." |
mail -s "I just logged in to Minecraft" yourfriend@gmail.com
done &
The minecraft log makes it easy to take actions according to what happens in the game. A line gets written saying who logged in and who logged out. Tailing this log and then grepping for those lines, we can send an email. Here’s how you can set up your server to send via gmail.
This command will give you the last 10 lines of a file by default; you can change this if you like. The -F option will make this command never return: it keeps monitoring the file for changes, even if the file is rotated or temporarily ceases to be readable.
This is probably the best known of Linux commands. We keep only the lines that match the regex we provided - in this case, the log entry written when I log in. You can make this more robust by matching the format more strictly: the Minecraft log also records chat conversations, so at this point a user could spam you with email by chatting text that happens to match the pattern.
Grep will also buffer its output before passing it along. This is why we provide the --line-buffered option: it limits the buffering to a single line. If you omit it, grep will appear to be stuck, and you will only get an email once the log grows enough to flush the buffer.
We make a never-ending while loop to keep consuming the lines piped from grep. You can enter the whole thing on a single line if you like, but you would need semicolons in place of the line breaks: one after read line (before the do) and one after the mail command (before the done).
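For example, here is the same loop as a single line. So that this sketch terminates and is safe to run, printf stands in for tail -F and echo stands in for the mail command; the sample log line is made up for illustration.

```shell
# One-line form of the loop; printf replaces tail -F and echo replaces
# mail so the pipeline terminates. The log line is a fabricated sample.
printf '%s\n' '2012-07-16 12:00:00 [INFO] adymitruk [/1.2.3.4] logged in' |
grep --line-buffered 'adymitruk .* logged in' |
while read line; do echo "Join me if you can."; done   # prints: Join me if you can.
```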
You can email from the command line by piping the message into the mail command. This is what we have done here. Multiple recipients can be listed as well. In this case we just have one.
The & at the end tells the shell to run everything in a background process, freeing up the command line for more instructions. More importantly, the process will be running when you log out of the server and log back in again. If this was a more important process, I would run it as a service instead.
We can see these processes with
ps -Af | grep line-buffered -B 1 -A 1
I’m looking for text that is unique and was used to launch the 2nd process in our pipe chain. Since it is the 2nd one, we also want the 1st and 3rd processes. They are almost certainly spawned right after one another and listed consecutively, so we can ask grep to return one line before and one line after each matched line (-B 1 -A 1). We will also match the grep instance issuing this very command, but we’ll ignore that:
adam 21847 1 0 Jul16 ? 00:00:00 tail -F /srv/minecraft-server/server.log
adam 21848 1 0 Jul16 ? 00:00:00 grep --color=auto --line-buffered adymitruk .* logged in
adam 21849 1 0 Jul16 ? 00:00:00 -bash
adam 21853 1 0 Jul16 ? 00:00:00 tail -F /srv/minecraft-server/server.log
adam 21854 1 0 Jul16 ? 00:00:00 grep --color=auto --line-buffered adymitruk lost connection
adam 21855 1 0 Jul16 ? 00:00:00 -bash
adam 21856 1 0 Jul16 ? 00:00:00 tail -F /srv/minecraft-server/server.log
adam 21857 1 0 Jul16 ? 00:00:00 grep --color=auto --line-buffered friend .* logged in
adam 21858 1 0 Jul16 ? 00:00:00 -bash
adam 21862 1 0 Jul16 ? 00:00:00 tail -F /srv/minecraft-server/server.log
adam 21863 1 0 Jul16 ? 00:00:00 grep --color=auto --line-buffered friend lost connection
adam 21864 1 0 Jul16 ? 00:00:00 -bash
If we don’t want these to run anymore, we can issue the kill command followed by the process id - it’s the 2nd column.
Linux is a wonderful operating system and has matured quite a lot. It can be a lot of fun too. Especially now that Steam is going to be rolling out on a Linux platform, we should be seeing more games where we can do fun things like this.
In Git, filtering by author name is easy. Most people simply use the name of the committer they are interested in. However, it’s a little more powerful than that: the --author option on git log is actually interpreted as a regex. So when looking for commits by “Adam Dymitruk”, it’s easier to just type git log --author="Adam", or use the last name if there are multiple contributors with the same first name.
You can also match on multiple authors by supplying the regex pattern. So to list commits by Jonathan or Adam, you can do this:
git log --author='\(Adam\)\|\(Jon\)'
However, it’s tricky to exclude commits by a particular author or set of authors using regular expressions, as noted here. Instead, turn to bash and piping: you can exclude commits authored by Adam like this:
git log --format='%H %an' | # list every commit hash followed by the author name
grep -v Adam |              # match the name but keep the lines that *don't* contain it
cut -d ' ' -f1 |            # extract just the first field - the commit hash
xargs -n1 git log -1        # call git log on each commit, stopping after 1 commit
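To see what the first three stages do, you can run them against a couple of fabricated “hash author” lines (the hashes and the second author here are made up):

```shell
# Feed fabricated "<hash> <author>" lines through the same filter:
# grep -v drops Adam's commits and cut keeps only the hash column.
printf '%s\n' 'aaa1111 Adam Dymitruk' 'bbb2222 Jonathan Smith' |
grep -v Adam |
cut -d ' ' -f1   # prints: bbb2222
```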
A limitation of this is that some log options you would want, such as --graph, are not available due to the mechanics of calling git log multiple times.
The cut command treats spaces as delimiters and returns only the first field, which is the SHA-1 of the commit.
If you want to exclude commits committed (but not necessarily authored) by Adam, you can replace %an with %cn. This has the same effect as using git log --committer=Adam instead of --author in the first example, but for exclusions.
Don’t be afraid to split your piped commands onto multiple lines. As long as a line ends with a pipe, bash knows there is more and will prompt for the next line. You can keep doing this until you have written what you want, or paste a multiline snippet from an example online. When you recall it from your shell history, it comes back as one line, with the proper semicolons inserted if you used while loops or other flow control.
Last week I was lucky enough to present and attend the Norwegian Developer Conference in Oslo. This was a wonderful event with many excellent presentations and post-conference get-togethers. The highlight of the conference for me was the announcement that Continuous Tests, aka Mighty Moose, is now free! If you’ve been keeping up with the conference on Twitter, you may have noticed the controversy that the Azure announcement caused. I also didn’t like the use of profanity in the keynote and the repeated mentions of Steve Jobs, but that’s a small part. Aral had me in stitches with all the usability issues (or the lack of usability) found in our world. My criticism of those 2 things caused Aral to block me on Twitter - I guess some people have thin skin. Don’t let the Azure slip-up take away from an excellent conference. Download all the presentations and watch them.
My presentation was about the previous post on this blog – a lesson in branch-per-feature as experienced by one company. I’ll be reworking the presentation to be less wordy and will post the slides here. The talk is very similar to the one I gave at Vancouver Techfest in late April. I was amazed at the number of people that attended. Presenting at the same time as me was Dan North and I really wish I could have been there for his talk. Beforehand he and I were joking about how we wanted to take our audience to other talks that we thought were better than our own!
As with any conference that I attend, the true benefit is the ability to sit with the attendees and speakers alike over dinner or drinks. I had a chance to catch up with Greg Young, Michael Feathers, Udi Dahan, Dan North and many others. I met a number of people I knew already from Twitter, such as Rob Conery, Rob Ashton, Ashic and Krzyshtof Kozmic, to name a few. I also had the pleasure of meeting and connecting with a number of great Swedes, Danes and Norwegians. The discussions were really insightful and full of content you just couldn’t get from a presentation.
On my last day there I took an opportunity to see a little bit of the city and included seeing the Botanical Gardens, Museum of Natural History, Museum of Geology and the Vigeland Statue Park. The botanical gardens and museums were to the east of central station while Vigeland park was to the west. This showed a great contrast between the poor and the well-off in Oslo as you walked the streets.
Following the methodology defined below is the most effective way to leverage the power of Distributed Version Control Systems - specifically Git. This work is the result of an in-depth analysis of Continuous Integration and the notion of responsible Continuous Delivery. The inherent risks that de facto CI and CD introduce are mitigated by what others now refer to as “The Dymitruk Model”.
Old-school branch-per-feature meant branches that were large and long-lived, to avoid integrating because it was a pain. This was a vicious cycle, as each feature diverged further and further from the other features and the mainline. Features should be as atomic as possible, and your development process should abide by the Open/Closed Principle. Features should be small.
You can see that the branches have a couple of commits each. We start with the end in mind - failing tests - and implement the feature in the following commit. This would be the minimum number of commits to expect on a typical feature; real features won’t typically be that small.
They should be integrated into an integration branch almost as often as you commit to them. This gives feedback immediately. You have some sort of CI running off of the integration branch to tell you if your changes are not adversely affecting other work. This gives you the immediate feedback of trunk-based integration while keeping your work organized and malleable.
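A minimal sketch of that integration step, using the kind of branch names this post's examples use (dev and FTR-X feature branches - the exact names here are illustrative); --no-ff forces a real merge commit, which is what produces the "Merge branch 'ftr-1' into dev" entries mentioned below:

```shell
# Merge a feature into the shared integration branch. Branch names are
# illustrative; --no-ff keeps an explicit merge commit in the history.
git checkout dev
git merge --no-ff --no-edit ftr-1
```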
You can follow the lines here but it’s not easy as you don’t get “swim lanes” for your branches. You can see each time that we merged a feature into dev by finding the “Merge branch ‘FTR-X’ into dev”. We can use git show-branch to see what commits exist in what branches:
Just a note: we usually prefix commit messages with FTR-X, which corresponds to a ticket in Jira. This way you know at a glance what feature a commit belongs to.
Or at least avoid them. A back-merge is where you want to use something from the integration branch to help you get your work done on the feature. This is a smell that you don’t have independent stories. A reasonable middle-ground is cherry-picking. A successfully cherry-picked commit will not cause you issues when you merge in the future.
Keeping a feature branch filled with commits that only attain what the feature is supposed to do will make working with it much more flexible. An understanding of the DAG (directed acyclic graph) that makes up Git history will make this easy to understand.
This should not even be a contentious point anymore. We all know how important tight feedback loops are. There should not be a QA department with QA employees. QA should be a hat worn by different people (or the same people) at different times.
Knowing what your Acceptance Criteria are and how you will prove them from the start is integral to getting a lot of things gelling - including a successful branch-per-feature regimen.
A proper DSL a la Ubiquitous Language (see Domain Driven Design) is at the heart of this. The tool that best communicates across the Product Owner, Regression/Specification Testing and Behaviour Driven Development feedback is currently StoryTeller. One thing it offers that no other tool does is communicating to the person writing the Acceptance Tests what the system is capable of doing, with the smallest amount of friction caused by technology. You simply pick what you want to do by clicking on links, filling out text boxes and selecting from drop-downs. There is no guessing as to how a tool might interpret your free-form text with its regex and English-parsing goodness. More on this in a future post.
A feature passes QA only if it has been integrated with all the other features that are completed. This brings us to a very lightweight branch called QA or RC (release candidate). Once a feature is finished, it gets integrated into the RC branch, and TeamCity or whatever CI tool you have makes a release build. Once this build is tested, this feature - or any other - can be thrown back to development should it fail. Your CI tool/process will mark this with an incremented Release Candidate tag.
You can see that the incomplete feature 4 is not yet part of the release candidate branch.
Here you can list all that’s been merged into the release candidate.
There will be conflicts when you merge. This is a fact of life when work is done by more than one person. When you integrate often from your feature to the integration branch, the conflicts you solve should be remembered. This is done by git’s rerere, but it could be simulated in other systems with little effort. The key is to set up a way of sharing these recorded conflict resolutions with the rest of the team.
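rerere is not turned on by default. Enabling it is a one-time bit of configuration per repository (these are standard git config keys; add --global to apply it everywhere):

```shell
# Record conflict resolutions and reuse them on repeat conflicts.
git config rerere.enabled true
# Optionally let git also stage paths it resolved from a recorded resolution.
git config rerere.autoupdate true
```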
Now anyone who tries to integrate that feature and hits that conflict will not have to resolve it again. No dev is required to put together a build. Right now this is a manual share if needed; I should have the script that does this behind the scenes published in about a week. If you want to do it yourself, look no further than the .git/rr-cache folder in your repository. Simple synchronization between all devs is the bare minimum that is needed. Currently this lives in its own branch with an independent root. Wrapping the git command to intercept fetch, pull and push makes it easy to update the rerere cache. Any git command can be made to look at an alternate folder for the work tree and the repository itself.
Notice that the conflict has been marked and we resolve it by just rewriting the file to make both branches agree.
Here, since we had rerere enabled, git records how we resolved this particular conflict. This will help us when we get to the release candidate branch and other people’s work later on.
When we examine the .git folder, we can see how the resolution is stored. Just like with blobs, the pre-conflict image determines the SHA-1 hash that names the directory the conflict resolution files are stored under. The content of these files shows just how simple even advanced features such as rerere are when we look at how they are implemented.
Git now reuses our previous resolution when we had that conflict on the dev branch. It does not make this an automatic commit of the merge - just in case. We examine the file that was conflicted and are fine with it and go ahead and commit the merge.
I’ll share the script that I’m writing once it’s finished. The resolutions get shared across a “resolutions” branch in the same repository.
This might sound counter-intuitive. But at the end of an iteration, a feature that you thought was done may turn out not to work, as the last bits of testing on the build as a whole make releasing a no-go. Anyone should be able to take that feature out and release anyway.
So the trick is not to take the feature out of the build. You make a build with the problem feature omitted. You can integrate that feature in the next iteration when there is time. Releasing a build should be painless.
Don’t make a build to test out of the integration branch. Instead, make a separate branch that can be reset relentlessly, and tag release candidates on it. Reset this branch to the start commit of your iteration and merge in all the features you know work.
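A sketch of what rebuilding that branch can look like; the names rc, iteration-start, ftr-1 and ftr-2 are assumptions for illustration, not something prescribed here:

```shell
# Rebuild the release candidate branch from the iteration's start point,
# merging only the features that are a go. All names are hypothetical.
git checkout rc
git reset --hard iteration-start   # throw away every previous merge
git merge --no-edit ftr-1 ftr-2    # redo the merges; rerere replays resolutions
git tag rc-2                       # mark the new release candidate
```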
No conflicts. Remember rerere and the like? Anyone should be able to do this if you followed the practices here. That is why the hard work of resolving conflicts needs to be shared.
The key is that we “threw away” all previous merges and have to redo them. But because our conflict resolutions are remembered, this is now a trivial matter. And if in doubt: we haven’t really thrown them away; they are still there to reference or use. Git’s pick-axe functionality makes it really easy to find certain code changes if you don’t know where to look.
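The pick-axe is the -S option to git log: it lists only the commits whose diff added or removed the given string. The search term 'sendEmail' below is an arbitrary example, not from the post:

```shell
# List commits whose diff added or removed the string 'sendEmail'.
git log -S 'sendEmail' --oneline
```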
Here we have decided that feature 3 is no good and we want to make a build without it:
Now we can see that feature 3 is not part of this release candidate:
You will quickly notice how painful it is to keep shared code synchronized across your many applications. This is usually handled with git submodules. If you don’t have explicit contracts between the shared libraries and your client applications, there is going to be a lot of work to ensure you have the right version deployed. Adhering to OCP (the open/closed principle) is the only way out of this, and it buys you the ability to do a rolling deployment. The preferred way is to have a submodule that contains all the messages needed to communicate in the system, with server-side hooks enforcing that existing messages are never modified or deleted - only new messages may be added.
Your work is as organized as is possible. Whether you elect to do this off of a certain point in time on the integration branch, release candidate branch or you started a feature branch for it, you have a way of tracking that work and can apply it as a merge, rebase or manual patch to another point if necessary. If there are large changes, we can do that work before we start on other features after a release. There is no magic bullet for refactoring across the board. Organized work and history help this - it doesn’t hinder it.
Any hardships that you may encounter will be tempered by the fact that you are relentlessly sharing your conflict resolutions and continuously integrating. PROPER BRANCH-PER-FEATURE RELIES ON RELENTLESS CONTINUOUS INTEGRATION
There are exceptions. You’re a giant company. You need to enable a feature for a small subset of early adopter users. This is now an explicit feature that’s important to business. That’s where we all want to be but most of us are not.
Having to make architectural changes because you can’t effectively organize your work is a process smell plain and simple. Some teams are not mature enough and this may be an OK solution temporarily.
Most of this way of working started from the excellent post called “A Successful Git Branching Model”. The important addition to this process is the idea that you start all features in an iteration from a common point. This would be what you released for the last one. This drives home the granular, atomic, flexible nature that features must exhibit for us to deliver to business in the most effective way. Git flow allows commits to be done on dev branches. This workflow does not allow that.
The other key difference is no back-merge into the feature. Otherwise, you will not be able to exclude this feature later in the iteration.
Not having a snapshot-based history will hurt, as branching is effectively copying another branch, which is slow. Not having a record of where a branch originated makes merging difficult, as you don’t have a base against which to compare how each side of development has changed. There are many other issues with relying on a connected-only tool to support this way of working with everyone.
The release candidate builds represent production-ready deployable packages. This is the most responsible way of doing Continuous Deployment. We have the latest completed and tested features at our disposal. We don’t ship any code that does not belong to a feature that has passed QA and all other tests. Having this option puts you in the best position as an IT department. Business has the power to deploy whenever it wants to - and has the peace of mind that nothing is half-baked, to the best of anyone’s knowledge.
This is the first post using Octopress (I have been editing it though to get some things working). So far it’s awesome. I’ll have more to show soon. Look for: