Difference between revisions of "Using Git with OpenEMR"

From OpenEMR Project Wiki

Revision as of 07:46, 26 August 2011

About Repositories

Currently we have two semi-official Git repositories for OpenEMR. The first is hosted by GitHub; the second is hosted by Gitorious. Both are maintained by Brady Miller and updated from CVS every half-hour.

OpenEMR git Howto Instructions For DUMMIES

Git For Dummies: This is a quick, practical OpenEMR walkthrough by Brady with the goal of getting new developers up and working with Git as quickly as possible. It is not meant to serve as a substitute for the more extensive documentation below.

Git for Computer Scientists is Git documentation for those developers that are interested in a higher level overview of git's available functionality.

Other Documentation

This is not meant to be an exhaustive guide, at all. It's more like a walk-through of one possible Git workflow. The Git project has produced some fairly high-quality documentation, so you should be able to find answers to your questions there. The first place to check is the documentation included with a Git installation; it may not all make sense initially, depending on what documentation you are reading, but things really start to click if you stretch yourself a little bit and experiment.

E.g. Listing and summary of important Git commands

git --help

E.g. All Git commands, including some you'll never use, and some that might not be part of the official Git distribution

git --help -a

E.g. Lots of details about how to use the "pull" command

git pull --help

In addition to the "built-in" documentation, there's also a good tutorial or two, a user's manual, a comprehensive guide, and lots of people write about Git on the Internet.

Getting The Code

First, you need to determine the URL of the repository. Normally, you will want to use the "git" protocol, as it takes less bandwidth (and time) on both ends. If you are behind a restrictive firewall, you might need to use the "http" protocol instead. Either one is read-only on both repositories. Once you have the URL you can use Git's "clone" command to fetch the entire development history.

GitHub using HTTP:

git clone http://github.com/openemr/openemr.git OpenEMR

Gitorious using Git protocol:

git clone git://gitorious.org/openemr/openemr.git OpenEMR

Either of these commands will give you an "OpenEMR" directory that contains the whole development history and metadata inside its ".git" subdirectory. It will configure a Git "remote" named "origin" that is associated with the URL you provided. You will have at least the remote branches "origin/master" and "origin/rel-320". You will have a local branch "master" that points to the same commit object as "origin/master" and your local "master" branch will be checked out into the working directory ("OpenEMR"). Your "master" branch will also be set to automatically merge in changes from "origin/master".

Checking Out Earlier Releases

Git is also useful for working against the released versions of OpenEMR. To check out the released 3.2.0 branch you have to perform the above clone, and then check out a copy of OpenEMR corresponding to the tag v3_2_0 (found via the git tag -l command). It is highly recommended you then start a new branch immediately, so that you can save changes.

git branch my_3_2_0_version v3_2_0 # Create a new branch based on the tag
git checkout my_3_2_0_version   # Checkout the branch just created

We also have a rel-320 branch, which is the 3.2.0 tarball, plus released patches and any unreleased, back-ported patches. You can work directly on your own rel-320 branch, but that has the same disadvantages as working directly on your master branch. Again, we recommend you create another branch for working on it.

git checkout -b my_3_2_version origin/rel-320 # Create a new branch based on rel-320 in the repository we cloned and check it out
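The tag-then-branch step can be tried without touching a real clone. The following is only a sketch in a throwaway repository: the tag name v3_2_0 is reused as a stand-in, and the file name, commit messages, and identity settings are made up for illustration.

```shell
set -e
# Throwaway repository; nothing here touches a real OpenEMR clone.
scratch=$(mktemp -d)
git init -q "$scratch/demo"
cd "$scratch/demo"
git symbolic-ref HEAD refs/heads/master      # make sure the first branch is "master"
git config user.name "Example Dev"           # hypothetical identity
git config user.email "dev@example.invalid"

echo "release" > version.txt                 # hypothetical file
git add version.txt
git commit -q -m "Release commit"
git tag v3_2_0                               # stand-in for the real release tag

echo "later work" >> version.txt
git commit -q -a -m "Post-release work on master"

git tag -l                                   # find the tag
git branch my_3_2_0_version v3_2_0           # new branch based on the tag
git checkout -q my_3_2_0_version             # check it out

branch_tip=$(git rev-parse HEAD)
tag_commit=$(git rev-parse 'v3_2_0^{commit}')
current_branch=$(git symbolic-ref --short HEAD)
echo "on $current_branch at $branch_tip"
```

The working directory now holds the tagged state, and new commits land on my_3_2_0_version instead of being left dangling.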

Making Changes

Branching

Unlike some other VCSes, branching is a trivial operation in Git, so it is recommended you do all your work on a branch. Later on, you can use the branch name for various operations. It will also make things a bit simpler since your branch will be somewhat isolated from changes in other branches until you want to spend the time on integration.

E.g. Making a branch and checking it out

git branch new-feature master
git checkout new-feature

E.g. Doing a checkout of a brand-new branch

git checkout -b bug-283749 master

Either way, this will create a new branch that points to the same commit as "master", update the "HEAD" symref so that Git knows what branch to update with new commits, and update your working directory to the current state of that branch. If you have uncommitted changes, Git will attempt to preserve them by moving them onto the new branch, but that can fail, so it is best to do this on a "clean" tree.
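To see what a fresh branch actually changes, you can inspect the "HEAD" symref and the two branch tips. This is a rough sketch in a scratch repository; the file name and identity settings are assumptions for the demo only.

```shell
set -e
scratch=$(mktemp -d)
git init -q "$scratch/demo"
cd "$scratch/demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"           # hypothetical identity
git config user.email "dev@example.invalid"

echo "hello" > notes.txt                     # hypothetical file
git add notes.txt
git commit -q -m "First commit"

git checkout -q -b bug-283749 master         # create the branch and switch to it

head_ref=$(git symbolic-ref HEAD)            # which branch new commits will update
master_sha=$(git rev-parse master)
branch_sha=$(git rev-parse bug-283749)
echo "HEAD -> $head_ref"
echo "master=$master_sha branch=$branch_sha"
```

Both names resolve to the same commit until you commit on one of them; only "HEAD" has moved.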

Status

Git's "status" command will let you know what branch you're currently working on, and also the state of your changes to that branch, if any. There's also a Git "diff" command if you want some more detailed output on what changes you've made.

E.g. Brief status

git status

E.g. Differences between working directory and the index.

git diff

Identifying a Specific Version

There are some situations when you need to download a specific version of OpenEMR, e.g. when you're reproducing a bug someone else has reported, or creating an OpenEMR instance that's supposed to exactly match another instance.

To get the SHA of the tip of your current branch, run the following command:

git rev-parse HEAD

E.g. Get the SHA of any branch

git rev-parse branch-name

This will return a hexadecimal string that uniquely identifies that commit object. We can use this as a "version number". At the time we're writing this page, the version number for the master branch on the Gitorious mirror is "d1ace83dd2144b5c86fe84bb6333073e55664ae3". While the tip of a branch can move from one commit to another, these SHA values always refer to a specific object, so they can be recorded and used at a later time.

Note: If a branch is deleted, you may not be able to recover commits that only existed in that branch. The same is true if a branch is rebased, a process for "rewriting history" that generates different commits for the branch, and thus different SHA values.
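A small experiment shows the difference between a branch name, which moves, and a recorded SHA, which does not. This sketch uses a scratch repository with made-up commit messages and file names.

```shell
set -e
scratch=$(mktemp -d)
git init -q "$scratch/demo"
cd "$scratch/demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"           # hypothetical identity
git config user.email "dev@example.invalid"

echo "one" > file.txt                        # hypothetical file
git add file.txt
git commit -q -m "First commit"
recorded=$(git rev-parse HEAD)               # write this down as the "version number"

echo "two" >> file.txt
git commit -q -a -m "Second commit"
tip=$(git rev-parse master)                  # the branch tip has moved on

# The recorded SHA still names the first commit.
recorded_subject=$(git log -1 --format=%s "$recorded")
echo "recorded=$recorded ($recorded_subject)"
echo "tip=$tip"
```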

Branching from a Specific Version

If you have a specific version you need to start work from, instead of the current tip of a branch, you can always start your work there. You'll need the SHA value of the commit object you want to start from.

E.g. Starting from the Gitorious mirror's "master" when this article was written.

git branch reproduce-bug-from-dave d1ace83dd2144b5c86fe84bb6333073e55664ae3
git checkout reproduce-bug-from-dave

Now, SHA values are rather long hexadecimal strings that aren't easy to deal with. Luckily, Git lets you abbreviate a SHA value to just enough characters to make it unique in the current repository. When recording a SHA outside of Git, be sure to keep the whole thing; what is unique today might not be unique tomorrow. A Git branch or tag that you don't change (social policy) is probably better than copy-and-paste anyway.

E.g. Abbreviated version of above

git checkout -b reproduce-bug-from-dave d1ace83

As I write this, 7 characters is enough to uniquely identify any commit in the OpenEMR Gitorious mirror, plus my local stuff. This will increase, but slowly and logarithmically.
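If you'd rather not count characters by hand, Git can produce and expand abbreviations itself. A minimal sketch, again in a scratch repository with made-up contents:

```shell
set -e
scratch=$(mktemp -d)
git init -q "$scratch/demo"
cd "$scratch/demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"           # hypothetical identity
git config user.email "dev@example.invalid"

echo "one" > file.txt                        # hypothetical file
git add file.txt
git commit -q -m "First commit"

full=$(git rev-parse HEAD)                   # the full SHA, safe to record anywhere
short=$(git rev-parse --short HEAD)          # shortest prefix Git considers unambiguous here
expanded=$(git rev-parse --verify "$short")  # Git expands the prefix back to the full SHA

git checkout -q -b reproduce-bug-from-dave "$short"
echo "full=$full short=$short"
```

The length of the short form depends on the repository and the Git version, so treat it as a convenience for typing, not as something to record.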

"Staging" and Committing

Now, you can go editing code willy-nilly. Git's "status" command will keep track of your changes by comparing the working directory to the known state of the current branch. Eventually, you'll get to a nice stopping place and want to commit that to your local repository. You don't have to worry too much about what you commit to your local repository; there are tools for cleaning that up later if need be. The first step is letting Git know that you want to include certain changes in your commit.

E.g. "Staging" a single file's changes to be committed

git add path/to/a-file-i-touched

E.g. "Staging" most outstanding changes, including file additions, but not deletions (Git 2.0 and later also stage deletions under the given path)

git add .

As you add files, you can see the output of Git's "status" command update indicating what files will be committed. You can also notice the output of Git's "diff" command get smaller (!). That's because the default behavior of Git's "diff" is to output the differences between the working tree and the "staging area" (a.k.a. the index). To see what you've already scheduled to commit, you can change the arguments of Git's "diff".

E.g. Viewing staged changes between the index and HEAD.

git diff --cached

E.g. Viewing all changes (staged and unstaged) between the working directory and HEAD.

git diff HEAD

Once you are happy with what you've scheduled for a commit, updating your branch is trivial with Git's "commit" command.

E.g. Take staged changes and make a new commit.

git commit -m 'This is my message.  A message is required.'

The commit command captures the index (the "staging area") and creates a new commit object. It then updates the reference (branch) that the "HEAD" symref points to, by pointing that branch at the newly created commit. Your working tree remains unchanged, so any changes you didn't stage are untouched and still unstaged.
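The relationship between the working directory, the index, and HEAD is easiest to see with two changed files and only one of them staged. This is a sketch in a scratch repository; a.txt and b.txt are made-up file names.

```shell
set -e
scratch=$(mktemp -d)
git init -q "$scratch/demo"
cd "$scratch/demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"           # hypothetical identity
git config user.email "dev@example.invalid"

printf 'one\n' > a.txt                       # hypothetical files
printf 'one\n' > b.txt
git add a.txt b.txt
git commit -q -m "Initial commit"

printf 'two\n' >> a.txt                      # change both files...
printf 'two\n' >> b.txt
git add a.txt                                # ...but stage only a.txt

unstaged=$(git diff --name-only)             # working directory vs. index
staged=$(git diff --cached --name-only)      # index vs. HEAD
both_count=$(git diff --name-only HEAD | wc -l)   # working directory vs. HEAD

git commit -q -m "Change a.txt only"         # commits the index, nothing else
after_commit=$(git status --porcelain)       # b.txt is still modified and unstaged
echo "unstaged=$unstaged staged=$staged left over:$after_commit"
```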

Getting Updates

When you want new updates, Git's "fetch" command will go out to the network and retrieve them, but it doesn't touch your local branches at all. You provide it with a remote name, and it will use that name to determine the repository URL and update all your existing remote branches.

E.g.

git fetch origin

When you have updates in one branch and want to combine them into your current branch, you want Git's "merge" command. By default, this uses a recursive 3-way merge algorithm which can resolve many conflicts automatically. Git uses the history data stored in the repository to determine the 3rd, "base" commit to use in the algorithm.

E.g.

git merge origin/master
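The split between "fetch" and "merge" can be checked without a network by letting a second local repository play the part of the remote. In this sketch the directory names "upstream" and "work" are assumptions for the demo; "upstream" stands in for the GitHub or Gitorious mirror.

```shell
set -e
scratch=$(mktemp -d)
cd "$scratch"

# "upstream" plays the role of the remote repository.
git init -q upstream
cd upstream
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"           # hypothetical identity
git config user.email "dev@example.invalid"
echo "one" > file.txt                        # hypothetical file
git add file.txt
git commit -q -m "First upstream commit"
cd ..

git clone -q upstream work                   # sets up the "origin" remote

# Somebody else commits upstream after we cloned.
cd upstream
echo "two" >> file.txt
git commit -q -a -m "Second upstream commit"
cd ../work

before=$(git rev-parse master)
git fetch -q origin                          # updates origin/master only
after_fetch=$(git rev-parse master)
remote_tip=$(git rev-parse origin/master)

git merge -q origin/master                   # now the local branch moves
after_merge=$(git rev-parse master)
echo "before=$before after_fetch=$after_fetch after_merge=$after_merge"
```

Because the local "master" had no commits of its own, this merge is a simple fast-forward; with local commits Git would create a merge commit instead.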

When there have been new branches created in one of the repositories you track (or there might have been) you can automatically create the new tracking branches using Git's "remote" command.

E.g.

git remote update origin

Above is a mention that "master" is automatically configured to track "origin/master". Because this is set up we can use Git's "pull" command to combine a "fetch" and a "merge". This is probably the easiest way to start integrating your changes with anything else that has happened to the project in the meantime.

E.g. Change to "master", fetch changes from the remote it tracks, and merge those changes in.

git checkout master
git pull

Sharing Your Changes

For this, we'll need to move beyond your local repository. You'll have to set up a repository that allows (at least) read-only access to other people. There are a lot of ways to go about that, but probably the easiest is to get a GitHub or Gitorious account and then use the fork (GitHub) or clone (Gitorious) button to create a repository. You'll also want to go through the GitHub/Gitorious documentation on generating (if needed) and registering an SSH key for read-write access to your repository.

(If for some reason you are averse to setting up a public Git repository, see below for an alternate way to share changes. It really doesn't take that much work to set up a public Git repository, though.)

Once all that is done, we can get back to your local repository. First we'll set up a new remote, using the read-write URL of the publicly available repository.

E.g. My GitHub repository

git remote add GitHub git@github.com:stephen-smith/openemr.git

This sets up a remote named "GitHub" (use whatever name you want) attached to the given URL, creates remote branches for all the branches in that repository, creates tags for all the tags in that repository, and fetches all the objects in those tags/branches and stores them in your local repository (if you don't have them already).

Then, we use Git's push command to update the remote repository with information.

E.g.

git push GitHub local-branch:remote-branch

This takes our local branch "local-branch" and sets the remote branch "remote-branch" to point to the same commit and transfers any objects the remote end needs. If the branches have the same name you can leave out the ":remote-branch". If you leave out the "local-branch" (but not the colon), the remote branch is deleted. If the remote branch exists, it is only updated if the update is "fast-forward"; you can force an update by prefixing the refspec ("local-branch:remote-branch") with a "+". If this is a one-off push, you can use a URL instead of a remote name. Basically, there are a lot of options. There are also a number of ways to shorten this command; if you are interested, read and understand the documentation for Git's "push" command.
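You can rehearse the refspec forms against a local bare repository before pushing anywhere public. This is only a sketch: the remote name "public" and the path public.git are assumptions standing in for your GitHub or Gitorious repository.

```shell
set -e
scratch=$(mktemp -d)
cd "$scratch"
git init -q --bare public.git                # stand-in for the public repository

git init -q work
cd work
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"           # hypothetical identity
git config user.email "dev@example.invalid"
echo "one" > file.txt                        # hypothetical file
git add file.txt
git commit -q -m "First commit"

git remote add public "$scratch/public.git"  # hypothetical remote name
git checkout -q -b local-branch

# local-branch:remote-branch creates or updates the remote branch.
git push -q public local-branch:remote-branch
pushed=$(git --git-dir="$scratch/public.git" rev-parse refs/heads/remote-branch)
local_tip=$(git rev-parse local-branch)

# An empty left-hand side deletes the remote branch.
git push -q public :remote-branch
if git --git-dir="$scratch/public.git" rev-parse -q --verify refs/heads/remote-branch >/dev/null
then still_there=yes
else still_there=no
fi
echo "pushed=$pushed still_there=$still_there"
```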

E.g. If you set up everything right, things can get really simple:

git checkout my-feature
git pull
# Hack, fiddle, test, fix, etc.
git commit
# Repeat above a few times.
git rebase -i remote/my-feature # Cleanup
git push

If you want to get a review or have the code mainlined, you'll need to make sure someone with CVS access sees your branch. Brady and Stephen are on GitHub and Gitorious, so a pull request (GitHub) or merge request (Gitorious) can get their attention individually. You can also use email or the SourceForge forums to ask for attention for your branch. Just make sure to include both a (at least) read-only URL for your public repository and the name of the branch your changes are on.

Submitting Changes

The official way to submit patches is to use the SourceForge "Patches" tracker and post messages to the SourceForge "Developers" forum. If you are tracking your changes in Git, it is relatively simple to prepare a patch set for the tracker. Git's format-patch command will automatically write out each commit to a separate patch file. Then you can tar those up and attach the file.

An example of how to prepare patches:

git checkout feature-branch
git format-patch -o /tmp origin/master
tar cjf Feature.tbz --remove-files /tmp/*.patch

Of course, branch and file names change based on what one is doing. Descriptive names save time.
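To check what format-patch will produce before attaching anything to the tracker, you can run it in a scratch repository first. In this sketch the local "master" branch stands in for "origin/master", and the branch name, file name, and commit messages are made up; the archive step is left out.

```shell
set -e
scratch=$(mktemp -d)
git init -q "$scratch/demo"
cd "$scratch/demo"
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"           # hypothetical identity
git config user.email "dev@example.invalid"

echo "base" > file.txt                       # hypothetical file
git add file.txt
git commit -q -m "Base commit"

git checkout -q -b feature-branch master
echo "part one" >> file.txt
git commit -q -a -m "Add feature part one"
echo "part two" >> file.txt
git commit -q -a -m "Add feature part two"

# One patch file per commit that is on feature-branch but not on master.
patch_dir=$(mktemp -d)
git format-patch -q -o "$patch_dir" master
patch_count=$(ls "$patch_dir"/*.patch | wc -l)
ls "$patch_dir"
```

The files are numbered in commit order and named after each commit's subject line, which is one more reason to write descriptive commit messages.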