Monthly Archives: July 2015

Git Best Practices

  1. Commit often
  2. All is not Lost
    1. git log -g
    2. git fsck --unreachable
    3. git stash list
  3. Backups
    1. Although a clone is a backup it does not include git configs, working directory/index, non-standard refs, or dangling objects.
  4. Once you push, don’t change history.
  5. Choose a Workflow
  6. Logically divide into repositories
  7. Useful commit messages
  8. Stay up to date
    1. Rebasing
    2. git pull --rebase
    3. git merge --no-ff
  9. Maintenance
    1. git fsck
    2. git gc --aggressive
    3. git remote update --prune
    4. git stash list
  10. Enforce standards
    1. Regression tests
    2. Compilation tests
    3. Syntax/link checkers
    4. Commit message analysis
  11. Useful Tools
    1. gitolite
    2. gitslave
    3. gerrit
  12. Integrate with external tools
  13. Always name your stashes
  14. Protect against history rewriting

Git Workflow, Branching Strategy, and Release Management

This is based on the renowned Gitflow Workflow by Vincent Driessen.

Git repositories aren’t technically centralized anywhere, but you could refer to one as the “source of truth.” Typically this is called “origin,” but development teams working on a giant feature together could define their own Git remote to share work with teammates before pushing to origin.

Typically when you’re coding, you’re either building a new feature or fixing a bug. When both need to happen at once, that’s when things get interesting.

Typically you’ll have two main immortal branches. These are the branches changes are pushed to (git push):

  • master
    • Contains an abridged version of the project
    • HEAD should always be production-ready
    • Make sure you tag each commit with a version number.
  • development
    • Contains the entire history of the project
    • HEAD contains latest changes for next release (nightly builds)

Once source code in the development branch is stable, it is merged into master and tagged with a release number to represent a new production release.

It’s good practice to pull with git pull --rebase to avoid useless merge commits. You could set pull.rebase to true in .gitconfig to have this done automatically.

It’s also good practice to initiate a pull request before merging into development/master.

I recommend using this extension to make your life easier:

Version Naming Conventions

  • [Major version].[Minor version].[Build Number]
  • [Year].[Month].[Day].[Build]
  • Major.Minor.Hotfix
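
As a rough illustration of the Major.Minor.Hotfix scheme, here is a minimal Python sketch of parsing and bumping a version number. The Version class and its bump method are hypothetical names made up for this example, and the rule that bumping a field resets the fields to its right is an assumed convention, not something Git enforces.

```python
from typing import NamedTuple


class Version(NamedTuple):
    """A Major.Minor.Hotfix version number (hypothetical helper)."""

    major: int
    minor: int
    hotfix: int = 0

    @classmethod
    def parse(cls, text: str) -> "Version":
        # Accept "1.3" as shorthand for "1.3.0".
        parts = [int(p) for p in text.split(".")]
        if not 2 <= len(parts) <= 3:
            raise ValueError(f"expected Major.Minor[.Hotfix], got {text!r}")
        return cls(*parts)

    def bump(self, part: str) -> "Version":
        # Assumed convention: bumping a field resets every field to its right.
        if part == "major":
            return Version(self.major + 1, 0, 0)
        if part == "minor":
            return Version(self.major, self.minor + 1, 0)
        if part == "hotfix":
            return Version(self.major, self.minor, self.hotfix + 1)
        raise ValueError(f"unknown part: {part!r}")

    def __str__(self) -> str:
        return f"{self.major}.{self.minor}.{self.hotfix}"
```

Because the fields are plain integers in a tuple, versions compare numerically, so 1.10.0 sorts after 1.3.1, which a plain string comparison would get wrong.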

Support Branches

These branches are typically broken into the following:

Topic/Feature branches

  • Flow
    • Should only exist in the developer’s local repository, not on origin.
    • Branch off from development branch
    • Must be merged back into the development branch
  • Create
    • git checkout -b newfeature123 development
  • Merge
    • git checkout development
    • git merge --no-ff newfeature123
      • This does create an extra merge commit, but the trade-off is worth it. We don’t fast-forward because we want to keep historical information about the feature branch and all the commits that make up the feature. This also makes it easier to revert a whole feature.
    • git branch -d newfeature123
    • git push origin development

Release branches

These branches are created once the development branch has all the required features for the expected release. Release branches are great because they allow one team to polish the release before it gets merged into master while other teams work on features for a future release. They also give an easy view of the release preparation. Once the branch is created, no additional features are allowed. These are allowed:

  • bug fixes
  • documentation generation
  • other release-oriented tasks

A release branch primarily prepares code for a production release. It allows for last-minute changes, minor bug fixes, and prepping the version number, build dates, etc. The perfect time to branch off from development into a release branch is when all features targeted for the next release are merged into the development branch. A version number is assigned to the release only when the release branch is created.

  • Flow
    • Could branch off of the development branch
    • Merges back into the development and master branches
    • Naming convention: release-*
  • Create
    • git checkout -b release-1.3 development
    • git commit -am "Bump version number to 1.3"
  • Merge
    • git checkout master
    • git merge --no-ff release-1.3
    • git tag -a 1.3 -m "Release v1.3" (sign with -s or -u <key>)
      • When merging to master, always tag.
    • git push --tags
      • Keep in mind annotated tags aren’t tied to a specific branch; they are objects of their own with metadata such as the commit, tagger, date, and message. Lightweight tags are basically pointers to a commit hash.
    • git checkout development
    • git merge --no-ff release-1.3
      • This may introduce a merge conflict because of the version change; if so, fix it and commit again.
    • git branch -d release-1.3

Hotfix branches

These are the only branches that could branch from master. When there’s an emergency fix that needs to take place, you need to branch off of the current production tag that’s on the master branch. This way developers could continue to work on the development branch while others could work on a critical bug fix without affecting them (development could be unstable at the time the fix is needed). In a way this is very similar to the Release branches.

  • Flow
    • Branches off of master
    • Merges back into development and master
    • Naming convention: hotfix-*
  • Create (Assuming fix is on version 1.3)
    • git checkout -b hotfix-1.3.1 master
    • bump the version
    • git commit -a -m "Bump version number to 1.3.1"
    • fix bug
    • git commit -m "Fix bug on production"
  • Merge
    • git checkout master
    • git merge --no-ff hotfix-1.3.1
    • git tag -a 1.3.1 -m "Hotfix v1.3.1" (sign with -s or -u <key>)
      • When merging to master, always tag.
    • git push --tags
    • git checkout development
    • git merge --no-ff hotfix-1.3.1
      • Keep in mind that if a release branch exists, the hotfix should be merged into that branch instead of development. Only merge into development as well if the bug is a show stopper.
    • git branch -d hotfix-1.3.1

Git Relative Refs

You could use relative refs to move branches around. e.g.,  git branch -f master HEAD~3, etc. 

The “^” Operator

Goes up to the first parent. e.g., git checkout master^

The “~[integer]” Operator

The integer specifies how many generations to ascend, following the first parent each time. e.g., git checkout HEAD~4 goes up 4 parents from HEAD.
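
To make the two operators concrete, here is a small Python sketch that resolves them against a toy history. The PARENTS table and the resolve function are hypothetical, invented for illustration; real Git resolves revisions against its object database. The sketch also handles ^n (the n-th parent of a merge), which Git supports in addition to the plain ^ described above.

```python
import re

# Hypothetical toy history: each commit maps to its parents, first parent first.
# "m" is a merge commit whose parents are "c" (first) and "x" (second).
PARENTS = {
    "a": [],
    "b": ["a"],
    "c": ["b"],
    "x": ["a"],
    "m": ["c", "x"],
}


def resolve(rev, parents=PARENTS):
    """Resolve a revision such as 'm^', 'm~2' or 'm^2~1' to a commit name."""
    base = re.match(r"[^~^]+", rev)
    if base is None:
        raise ValueError(f"no starting commit in {rev!r}")
    commit = base.group()
    for op, digits in re.findall(r"([~^])(\d*)", rev[base.end():]):
        n = int(digits) if digits else 1
        if n == 0:
            continue  # ^0 and ~0 both mean the commit itself
        if op == "~":
            # ~n: follow the first parent n times
            for _ in range(n):
                commit = parents[commit][0]
        else:
            # ^n: pick the n-th parent (plain ^ is the first parent)
            commit = parents[commit][n - 1]
    return commit
```

With this table, m^ and m~1 both land on c, while m^2 selects the second parent x. Asking for a parent that does not exist raises an IndexError, where Git would report an unknown revision.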

Git Tools

git rebase

This allows you to copy commits from one branch onto another commit or branch. You’re taking all the changes that were committed on one branch and replaying them on another. This creates a nice linear sequence of commits: work that happened in parallel looks as if it were done sequentially (a merge, by contrast, joins the two endpoints). This especially helps when you’re contributing to a project, since all the maintainers have to do is a fast-forward. The caveat is that every commit on the branch is replayed, even ones that were only used for debugging.

If you use the interactive version (-i), you could pick and reorder the commits you want. This is great if you don’t quite know which commits you want to apply and would like to see some metadata on them.
git rebase -i HEAD~3

Rebase also allows you to edit previous commit messages (reword), combine multiple commits (squash), split commits (edit), and delete commits (drop).

Don’t rebase commits that have been pushed publicly, as others may have already based work on them. If it’s necessary, make sure everyone pulls with git pull --rebase.

If you want to replay changes from a topic branch that’s branched off another topic branch, you could use git rebase --onto master topic1 topic2, which replays onto master only the commits on topic2 that are not also on topic1.
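
The selection that --onto performs can be sketched in Python as "commits reachable from topic2 but not from topic1." The HISTORY table and both function names below are hypothetical, and the sketch assumes a linear topic branch (it follows first parents only), which is a simplification of what Git does.

```python
# Hypothetical toy history: commit -> list of parents.
# master is at "b", topic1 is at "d" (branched from "b"),
# topic2 is at "f" (branched from "d").
HISTORY = {
    "a": [],
    "b": ["a"],
    "c": ["b"],
    "d": ["c"],
    "e": ["d"],
    "f": ["e"],
}


def ancestors(commit, parents):
    """Return the commit itself plus everything reachable from it."""
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def commits_to_replay(upstream, branch, parents):
    """Commits on `branch` that are not reachable from `upstream`, oldest first."""
    excluded = ancestors(upstream, parents)
    picked = []
    current = branch
    while current is not None and current not in excluded:
        picked.append(current)
        # Simplification: assume a linear branch and follow the first parent.
        current = parents[current][0] if parents[current] else None
    return picked[::-1]
```

Here commits_to_replay("d", "f", HISTORY) selects only e and f, the commits unique to topic2, which are the ones that would be replayed on top of master.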

vs. Merge

Merge will give you a record of exactly what happened during the entire history. Rebase will clean up that story so that it’s easier for developers to see how the project was created. Rebase locally to clean up the story before you push. Never rebase after you’ve pushed.

git cherry-pick

You could specify which commits you want to apply to your current HEAD. You have to know which commit hashes you want. For example, if you have 3 commits above the master branch, you could apply just the last commit by running git cherry-pick [hash of 3rd commit]. This way, only the 3rd commit gets applied to the master.

git reset

Moves a branch reference backwards in time to a specific commit. It’s as if the later commits never happened. This works for local branches but not for remote branches that others use (for those, use revert instead).

git revert

This creates a new commit with the reversed changes. You could push these changes out to others.

git reflog

This log is updated when the HEAD gets updated. This usually happens with:

  • Switching branches
  • Pulling in new changes
  • Adding new commits
  • Rewrites to history

This command will show you the history of your local repository. If you performed a destructive command like reset and want to recover commits, you could find the hash of the lost commit with git reflog.


You could always use git reset --hard [commit hash] to undo accidental changes.

git fsck

This is Git’s file system checker, which verifies the connectivity and validity of the objects in the database. When references like branches are deleted, the objects they pointed to (e.g., commit objects) usually aren’t deleted right away; they just become unreachable.

This is useful (as with reflog) to recover deleted branches, but especially useful with recovering remote branches.


git stash

Keep in mind that when you modify or add files without committing, those changes “propagate” to other branches if you switch. When you’re working on a branch and you’re not quite finished but need to move on to another branch, git stash will take all the changes in your working tree and index and stash them away. You’ll run git stash apply to “unstash” the changes. Keep in mind they are applied to the current branch.

When you perform a stash, Git stores it as a merge commit. Git keeps track of the state of both the index (staging area) and the working tree, since the two could contain different changes to the same file. So essentially there are two commits when you stash: one for the index and one for the working tree, with the working-tree commit having HEAD and the index commit as its parents. With these two commits, Git is able to “unstash” your changes.


git describe

All this command does is show you the most recent annotated tag that is reachable from a commit. This may be useful for build and release scripts, and also to find out in which version a change was introduced.

This command takes a reference or commit hash and responds with the most recent tag. If there were commits after the tag, the tag is followed by the number of commits since the tag and the letter “g” plus the abbreviated hash of the commit, in the form [tag]-[number of commits]-g[hash].
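
For build scripts, that output can be split back into its parts. The following Python sketch assumes the plain [tag]-[count]-g[hash] form only (no -dirty suffix or other options); parse_describe and Described are hypothetical names for this example.

```python
import re
from typing import NamedTuple, Optional


class Described(NamedTuple):
    tag: str
    commits_since_tag: int
    abbrev_hash: Optional[str]


# The tag itself may contain hyphens, so anchor the pattern on the trailing
# "-<count>-g<hex>" suffix rather than splitting on "-".
_SUFFIX = re.compile(r"^(?P<tag>.+)-(?P<count>\d+)-g(?P<hash>[0-9a-f]+)$")


def parse_describe(output: str) -> Described:
    """Split `git describe` output into tag, commit count and short hash."""
    text = output.strip()
    match = _SUFFIX.match(text)
    if match is None:
        # No suffix: the commit is exactly at the tag.
        return Described(text, 0, None)
    return Described(match["tag"], int(match["count"]), match["hash"])
```

A release script could, for instance, refuse to publish when commits_since_tag is not zero, since that means the checked-out commit is not the tagged one.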


git rev-parse

Use this to find out which commit object a tag or branch points to.


git bisect

This command is great when you want to find out which commit introduced a breaking change. bisect performs a binary search to help you pinpoint which commit is bad. All you have to do is tell it a commit that’s bad and a commit that’s good, and it’ll automatically check out commits in between and have you test them. Once you’re finished, run git bisect reset.
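
The search itself is an ordinary binary search over the commits between the known-good and known-bad endpoints. Here is a minimal Python sketch of the idea, assuming a linear history; first_bad_commit and the is_bad callback (standing in for your manual test) are hypothetical names.

```python
def first_bad_commit(commits, is_bad):
    """Find the first bad commit in a linear history.

    `commits` is ordered oldest to newest; commits[0] is known good and
    commits[-1] is known bad. `is_bad` stands in for manually testing a
    checked-out commit. Returns the culprit and the commits that were tested.
    """
    good, bad = 0, len(commits) - 1
    tested = []
    while bad - good > 1:
        mid = (good + bad) // 2
        tested.append(commits[mid])
        if is_bad(commits[mid]):
            bad = mid   # the breakage is at mid or earlier
        else:
            good = mid  # the breakage came after mid
    return commits[bad], tested
```

Each test halves the remaining range, so even a span of a thousand commits needs only about ten checkouts.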


A Git Primer – Under the Hood

These notes assume you’re familiar with the basic functions of Git.

The Git repository exists entirely in a single “.git” directory in your project root. Objects in Git are identified by hashes.
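
To show where those hashes come from, here is a short Python sketch that computes the ID Git assigns to a file’s contents. It assumes the default SHA-1 object format; git_blob_hash is a hypothetical name, but the result should match what git hash-object prints for the same bytes.

```python
import hashlib


def git_blob_hash(data: bytes) -> str:
    """Return the object ID Git gives a blob with these contents (SHA-1 repos)."""
    # Git hashes a small header ("blob <size>" followed by a NUL byte) plus the
    # raw contents, so the ID depends only on the contents, never the file name.
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    return hashlib.sha1(header + data).hexdigest()
```

Because the file name is not part of the hash, two files with identical contents share a single blob; the names live in the tree objects described next.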

  • Blobs – Binary representation of a file.
  • Tree objects – Similar to directories. Points to blobs and other tree objects.
    • Similar to a directory with a list of files (blobs) and other tree objects (sub-directories)
    • The root tree object has the “big picture.” It is basically a snapshot of your repository.
    • For example, suppose the root directory of your project contains a file (test-file.txt) and a directory (test-dir), and test-dir has a file called test-code.c:
      • Tree 831da3
        • test-dir (Tree 92ads31)
        • test-file.txt (Blob 12asd391)
      • Tree 92ads31
        • test-code.c (Blob 931bac3e)
  • Commit objects – Pointer to a single tree object
    • Commits are basically lightweight snapshots of the code at that point in time. Objects are stored compressed, and when Git packs them it stores similar objects as deltas to keep the repository compact.
    • Contains (run “git show --format=raw”)
      • Hash of the root tree object at the time of commit
      • Hash of parent commits (history of commits)
      • Author’s name/email
      • Committer’s name/email
      • Commit message
    • In the above example, the commit object has a hash of FF450. You typically only need the first few characters of the hash (at least 4, as long as they’re unambiguous) to find the object.
  • Tag objects – Points to a single commit object, contains metadata.
  • References – Maps to a hash so you don’t have to memorize hashes. Points to a single object. Typically a Commit or Tag object.
    • .git/refs/heads
      • The “master” branch is a ref. The file contains the hash.
      • git show --oneline master
      • git rev-parse master
    • The reference HEAD normally points to the current branch rather than directly to a commit.
      • cat .git/HEAD will show that it points to refs/heads/master.
      • When HEAD points directly to a commit object, you will be in a “detached HEAD state,” which means you’re not on a branch.
  • Branches
    • They are basically pointers to a specific commit. That’s it.
    • Branches are just references. You could find them under .git/refs/heads/[branch name]
    • A new branch initially points to the current commit object.
    • When you make a commit on a branch, Git simply moves the current branch (the ref named in .git/HEAD) to point to the newly created commit object.
    • Local Branches
    • Remote-tracking Branches
  • Tags
    • Tags are primarily used to label revisions at specific commit points. Unlike branches, they are meant to be immutable references.
    • You shouldn’t change or delete a tag once it has been released publicly.
    • .git/refs/tags
    • By default, tags created with ‘git tag [name of tag]’ are lightweight tags. A lightweight tag is only a reference to the commit object. Run ‘git cat-file -p [name of tag]’ to see that it’s not a tag object.
    • Annotated tags, on the other hand, could be created with ‘git tag -a -m “[message]” [name of tag]’
      • Have their own author information. Who created the tag?
      • This is a tag object
      • Contains pointer to commit
      • Tag message
      • Timestamped to help keep track of release dates
      • Information about the tagger
      • Can be signed with a GPG key to guard against spoofed commits or email addresses
    • Use git show to see what’s contained within a tag.
  • Merging
    • We create branches when we need to add a new feature or fix a bug.
    • The way Git handles merges behind the scenes isn’t intuitive.

      Running the above you’ll have three branches: master, new-feature, and  bug-fix.
    • Now let’s try to merge the bug-fix branch into the master.
    • Fast-forward means that master is a direct ancestor of bug-fix, so there is nothing to merge. Git will simply move the master pointer up the tree to bug-fix.
    • Now let’s try to merge the new-feature branch into the master.
    • We’ll notice here that Git didn’t perform a fast-forward. The reason is that master is no longer an ancestor of new-feature: after the bug-fix merge, master contains commits that new-feature does not, so the two histories have diverged.
    • The merge creates a new commit object. This commit object has two parents. This is called a merge commit. This is particularly useful in code review. When a developer pushes a feature branch out for review, the reviewer could create the merge commit which contains metadata on the reviewers and useful comments.
    • If you want only fast-forward merges, you could rebase the branch before the merge.
  • Remotes
    • Recall that Git stores the entire repository under the .git directory. The entire history could be traversed and a snapshot of the project could be built.
    • A Git remote is just another Git repository.
    • Git only needs a working tree to find out which changes have been made since the last commit.
    • A bare repository contains only the project’s history, with no working tree. This is where collaborators push to and pull from.
      • The master branch doesn’t exist yet.
      • bare is set to “true”
      • You cannot add files to a bare repository. It is meant to be cloned, pulled, and pushed to/from.
  • Clone
    • .git/config has a few extra lines.
      • remote “origin”
        • “origin” is the default name given to a repository’s main remote.
        • fetch = which references should be retrieved when performing a “git fetch”
        • url = the URL of the repository
      • branch “master”
        • Configuration for remote-tracking branch
  • Push

    • The master branch now exists.
    • Specify the location of the repository you want to push to (origin) and the branch (master).
    • Generally you don’t want to run a naked “git push” because you may push all remote-tracking branches. There usually isn’t a problem with this but you run into the chance of pushing changes you don’t want other collaborators to pull.
    • Push updated the remote’s master to point to afd2 and sent the afd2 commit object along with any tree or blob objects related to that change.
  • Remote-Tracking Branches
    • From .git/config, a remote-tracking branch is the line following [branch “master”]
    • This means the configuration is for the current local master branch.
    • The “remote” and “merge” settings specify that when this branch is fetched, it should fetch the master branch from the origin remote. A local copy of the remote branch is stored.
  • Fetch
    • git fetch updates your remote-tracking branches under refs/remotes/origin; it doesn’t change your local branches under refs/heads
    • [remote “origin”]
      • This is a mapping of remote references to local references. All branches at the remote origin are placed locally in refs/remotes/origin/*
    • fetch does not create a local branch because you may not want it in your local repository.
    • If you want to create a local branch, run “git checkout new-feature”

      • Git will automatically create the new-feature reference (which points to the  same commit as the remote new-feature branch).
      • Git will also create a remote-tracking branch entry under .git/config
  • Pull
    • Very similar to git clone. git pull <remote> <branch>
      • git fetch <remote>
      • Uses .git/FETCH_HEAD to figure out if <branch> has a remote-tracking branch that should be merged.
      • git merge if required
    • Git overwrites the contents of FETCH_HEAD every time you run “git fetch”
    • To demonstrate this, clone the repository and modify the README file on the new-feature branch. After you’ve finished, go back to the original clone.

      • Here we perform a manual “git pull.” By looking at .git/refs/heads/new-feature and FETCH_HEAD, we know there were changes in the new-feature branch. We simply perform a merge on FETCH_HEAD to apply the changes.
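
Going back to the fetch mapping under [remote “origin”], here is a small Python sketch of how a refspec such as +refs/heads/*:refs/remotes/origin/* turns a remote reference into a local one. map_ref is a hypothetical name, and the sketch handles only a single * wildcard, which is a simplification of Git’s own matching.

```python
def map_ref(refspec, remote_ref):
    """Map a remote ref to its local name under a fetch refspec.

    Returns None when the refspec does not cover `remote_ref`.
    """
    # A leading "+" only means non-fast-forward updates are allowed.
    spec = refspec[1:] if refspec.startswith("+") else refspec
    src, dst = spec.split(":", 1)
    if "*" not in src:
        return dst if remote_ref == src else None
    prefix, suffix = src.split("*", 1)
    if len(remote_ref) < len(prefix) + len(suffix):
        return None
    if not (remote_ref.startswith(prefix) and remote_ref.endswith(suffix)):
        return None
    # Whatever the "*" matched on the remote side is substituted on the local side.
    matched = remote_ref[len(prefix):len(remote_ref) - len(suffix)]
    return dst.replace("*", matched, 1)
```

Under this mapping the remote’s refs/heads/new-feature becomes refs/remotes/origin/new-feature locally, while tags are left alone because they live outside refs/heads.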



What is the Proper Data Type for Storing UUIDs in MySQL?

While creating a schema for SNOMED to ICD-10 mapping, I came across the requirement of storing a UUID (128-bit unsigned integer) as the ‘id’ field. 128-bits = 16 bytes so we’ll use BINARY(16).

In order to store the UUID properly, we need to remove the dashes and convert the hexadecimal representation to its binary form. This could be done in a single step with: UNHEX(REPLACE(id, '-', '')). In my case, I’m loading it from a file so it’s done this way:

Of course, when retrieving the ‘id’ field, you’ll need HEX(id) to get the hexadecimal representation back.
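
If the conversion has to happen in application code instead of SQL, the same round trip can be sketched in Python. The function names are hypothetical and the UUID used in the usage note is a made-up sample value, not one from the SNOMED data.

```python
import uuid


def uuid_to_bin(text: str) -> bytes:
    """Mirror UNHEX(REPLACE(id, '-', '')): drop the dashes, decode the hex."""
    raw = bytes.fromhex(text.replace("-", ""))
    if len(raw) != 16:
        raise ValueError(f"expected 16 bytes, got {len(raw)}")
    return raw


def bin_to_hex(raw: bytes) -> str:
    """Mirror HEX(id): uppercase hexadecimal without dashes."""
    return raw.hex().upper()


def bin_to_uuid(raw: bytes) -> str:
    """Rebuild the dashed, lowercase form from the 16 stored bytes."""
    return str(uuid.UUID(bytes=raw))
```

For the sample value 123e4567-e89b-12d3-a456-426614174000, uuid_to_bin yields the 16 bytes that would go into the BINARY(16) column, and bin_to_uuid turns them back into the dashed form for display.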