Using Git with OpenEMR
Revision as of 01:57, 26 August 2011
Unofficial
Currently we have two semi-official Git repositories for OpenEMR. The first is hosted by GitHub; the second is hosted by Gitorious. Both are maintained by Brady Miller and updated from CVS every half-hour. The official repository is still our CVS repository on SourceForge. There is a possibility that we could move to Git for our official repository, but there are a number of issues to resolve before that can happen. Update: we have since moved to Git on SourceForge, so this section needs to be updated.
OpenEMR git Howto Instructions For DUMMIES
git For Dummies: This is a quick, practical OpenEMR walk-through by Brady with the goal of getting new developers up and working with Git as quickly as possible. It is not meant to serve as a substitute for the more extensive documentation below.
Git for Computer Scientists is Git documentation for those developers that aren't "Dummies".
Other Documentation
This is not meant to be an exhaustive guide, at all. It's more like a walk-through of one possible Git workflow. The Git project has produced some fairly high-quality documentation, so you should be able to find answers to your questions there. The first place to check is the documentation included with a Git installation; it may not all make sense initially, depending on what documentation you are reading, but things really start to click if you stretch yourself a little bit and experiment.
E.g. Listing and summary of important Git commands
git --help
E.g. All Git commands, including some you'll never use, and some that might not be part of the official Git distribution
git --help -a
E.g. Lots of details about how to use the "pull" command
git pull --help
In addition to the "built-in" documentation, there's also a good tutorial or two, a user's manual, a comprehensive guide, and lots of people write about Git on the Internet.
Getting The Code
Using either repository, you will first determine the URL of the repository. Normally, you will want to use the "git" protocol, as it takes less bandwidth (and time) on both ends. If you are behind a restrictive firewall, you might need to use the "http" protocol instead. Either one is read-only on both repositories. Once you have the URL you can use Git's "clone" command to fetch the entire development history.
E.g. GitHub using HTTP
git clone http://github.com/openemr/openemr.git OpenEMR
E.g. Gitorious using Git protocol
git clone git://gitorious.org/openemr/openemr.git OpenEMR
Either way, this will give you an "OpenEMR" directory that contains the whole development history and metadata inside its ".git" subdirectory. It will configure a Git "remote" named "origin" that is associated with the URL you provided. You will have remote branches "origin/master" and "origin/rel-320". You will have a local branch "master" that points to the same commit object as "origin/master" and your local "master" branch will be checked out into the working directory ("OpenEMR"). Your "master" branch will also be set to automatically merge in changes from "origin/master".
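The result of a clone described above can be inspected with a few read-only commands. The following is a minimal sketch, not the project's own procedure: it uses a throwaway local repository as a hypothetical stand-in for the GitHub/Gitorious URL (so it needs no network), and the "demo" identity, the "upstream" directory, and the README file are assumptions made for illustration.

```shell
set -eu
# Hypothetical identity so commits work without any global Git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Stand-in for the upstream repository (assumption: replaces the real URL).
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is "master"
echo "hello" > README
git add README
git commit -q -m "Initial commit"
git branch rel-320                        # mimic the release branch

# Clone it the same way you would clone the real URL.
git clone -q "$work/upstream" "$work/OpenEMR"
cd "$work/OpenEMR"

git remote -v     # shows "origin" and the URL it is associated with
git branch -a     # shows local "master" plus the remote branches

current_branch=$(git symbolic-ref --short HEAD)
local_tip=$(git rev-parse master)
remote_tip=$(git rev-parse origin/master)
tracked_remote=$(git config branch.master.remote)
echo "on $current_branch, merging from $tracked_remote"
```

With the real repositories, only the clone URL changes; the inspection commands are the same.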
Checking Out Earlier Releases
Git is also useful for working against the 3.2.0 released version of OpenEMR. To check out the released 3.2.0 branch you have to perform the above clone, and then check out a copy of OpenEMR corresponding to the tag v3_2_0 (found via the git tag -l command). It is highly recommended you then start a new branch immediately, so that you can save changes.
git branch my_3_2_0_version v3_2_0 # Create a new branch based on the tag
git checkout my_3_2_0_version # Check out the branch just created
We also have a rel-320 branch, which is the 3.2.0 tarball, plus released patches and any unreleased, back-ported patches. You can work directly on your own rel-320 branch, but that has the same disadvantages as working directly on your master branch. Again, we recommend you create another branch for working on it.
git checkout -b my_3_2_version origin/rel-320 # Create a new branch based on rel-320 in the repository we cloned and check it out
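To see that a branch created from a tag really starts at the tagged release, you can compare the two commit IDs. This is a small sketch in a scratch repository; the "demo" identity, the version.txt file, and the commit messages are hypothetical, and only the tag and branch names come from the text above.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master

# Hypothetical history: one "release" commit, then later development.
echo "3.2.0" > version.txt
git add version.txt
git commit -q -m "Release 3.2.0"
git tag v3_2_0
echo "development" > version.txt
git commit -q -a -m "Development after the release"

git tag -l                                  # lists v3_2_0
git branch my_3_2_0_version v3_2_0          # new branch based on the tag
git checkout -q my_3_2_0_version            # check it out

branch_tip=$(git rev-parse HEAD)
tag_commit=$(git rev-parse 'v3_2_0^{commit}')
contents=$(cat version.txt)
echo "branch at $branch_tip, file says $contents"
```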
Making Changes
Branching
Unlike some other VCSes, branching is a trivial operation in Git, so it is recommended you do all your work on a branch. Later on, you can use the branch name for various operations. It will also make things a bit simpler since your branch will be somewhat isolated from changes in other branches until you want to spend the time on integration.
E.g. Making a branch and checking it out
git branch new-feature master
git checkout new-feature
E.g. Doing a checkout of a brand-new branch
git checkout -b bug-283749 master
Either way, this will create a new branch that points to the same commit as "master", update the "HEAD" symref so that Git knows what branch to update with new commits, and update your working directory to the current state of that branch. If you have uncommitted changes, Git will attempt to preserve them by moving them onto the new branch, but that can fail, so it is best to do this on a "clean" tree.
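The "HEAD" symref mentioned above can be read directly. A minimal sketch in a scratch repository (the "demo" identity and the file name are hypothetical; the branch name is the one used in the example above):

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
echo "code" > file.txt
git add file.txt
git commit -q -m "Initial commit"

git checkout -q -b bug-283749 master

head_ref=$(git symbolic-ref HEAD)        # which branch new commits will update
new_tip=$(git rev-parse bug-283749)
master_tip=$(git rev-parse master)
echo "HEAD -> $head_ref"
```

Right after the checkout, both branches point at the same commit; they only diverge once you commit on one of them.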
Status
Git's "status" command will let you know what branch you're currently working on, and also the state of your changes to that branch, if any. There's also a Git "diff" command if you want some more detailed output on what changes you've made.
E.g. Brief status
git status
E.g. Differences between working directory and the index.
git diff
Identifying a Specific Version
There are some situations when you need to download a specific version of OpenEMR, e.g. when you're reproducing a bug someone else has reported, or creating an OpenEMR instance that's supposed to exactly match another instance.
To get the SHA of the tip of your current branch, run the following command:
git rev-parse HEAD
E.g. Get the SHA of any branch
git rev-parse branch-name
This will return a hexadecimal string that uniquely identifies that commit object. We can use this as a "version number". At the time we're writing this page, the version number for the master branch on the Gitorious mirror is "d1ace83dd2144b5c86fe84bb6333073e55664ae3". While the tip of a branch can move from one commit to another, these SHA values always refer to a specific object, so they can be recorded and used at a later time.
Note: If a branch is deleted, you may not be able to recover commits that only existed in that branch. The same is true if a branch is rebased, a process for "rewriting history" that generates different commits for the branch, and thus different SHA values.
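The difference between a moving branch tip and a fixed SHA can be shown in a scratch repository. This is only an illustrative sketch; the "demo" identity, file name, and commit messages are hypothetical.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
echo "one" > file.txt
git add file.txt
git commit -q -m "First commit"

recorded=$(git rev-parse HEAD)      # write this value down as the "version number"

echo "two" >> file.txt
git commit -q -a -m "Second commit"

tip=$(git rev-parse master)         # the branch tip has moved on
kind=$(git cat-file -t "$recorded") # the recorded SHA still names the old commit
echo "recorded $recorded is a $kind; master is now $tip"
```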
Branching from a Specific Version
If you have a specific version you need to start work from, instead of the current tip of a branch, you can always start your work there instead. You'll need the SHA value of the commit object you want to start from.
E.g. Starting from the Gitorious mirror's "master" when this article was written.
git branch reproduce-bug-from-dave d1ace83dd2144b5c86fe84bb6333073e55664ae3
git checkout reproduce-bug-from-dave
Now, SHA values are rather long hexadecimal strings that aren't easy to deal with. Luckily, Git lets you abbreviate a SHA value to just enough characters to make it unique in the current repository. When recording a SHA outside of Git, be sure to keep the whole thing; what is unique today might not be unique tomorrow. A Git branch or tag that you don't change (social policy) is probably better than copy-and-paste anyway.
E.g. Abbreviated version of above
git checkout -b reproduce-bug-from-dave d1ace83
As I write this, 7 characters is enough to uniquely identify any commit in the OpenEMR Gitorious mirror, plus my local stuff. This will increase, but slowly and logarithmically.
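Git can produce an abbreviation for you and expand it back to the full value. A short sketch in a scratch repository (the "demo" identity and file name are hypothetical):

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
echo "code" > file.txt
git add file.txt
git commit -q -m "Initial commit"

full=$(git rev-parse HEAD)
short=$(git rev-parse --short=7 HEAD)   # at least 7 characters, more if needed for uniqueness
expanded=$(git rev-parse "$short")      # Git resolves the abbreviation again
echo "$short expands to $expanded"
```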
"Staging" and Committing
Now you can go editing code willy-nilly. Git's "status" command will keep track of your changes by comparing the working directory to the known state of the current branch. Eventually, you'll get to a nice stopping place and want to commit that to your local repository. You don't have to worry too much about what you commit to your local repository; there are tools for cleaning that up later if need be. The first step is letting Git know that you want to include certain changes in your commit.
E.g. "Staging" a single file's changes to be committed
git add path/to/a-file-i-touched
E.g. "Staging" most outstanding changes, including file additions, but not deletions (Git 2.0 and later also stage deletions under the given path)
git add .
As you add files, you can see the output of Git's "status" command update indicating what files will be committed. You can also notice the output of Git's "diff" command get smaller (!). That's because the default behavior of Git's "diff" is to output the differences between the working tree and the "staging area" (a.k.a. the index). To see what you've already scheduled to commit, you can change the arguments of Git's "diff".
E.g. Viewing staged changes between the index and HEAD.
git diff --cached
E.g. Viewing all changes (staged and unstaged) between the working directory and HEAD.
git diff HEAD
Once you are happy with what you've scheduled for a commit, updating your branch is trivial with Git's "commit" command.
E.g. Take staged changes and make a new commit.
git commit -m 'This is my message. A message is required.'
The commit command captures the index (the "staging area") and creates a new commit object. It then updates the reference (branch) that the "HEAD" symref points to, by pointing that branch at the newly created commit. Your working tree remains unchanged, so any changes you didn't stage are untouched and still unstaged.
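The relationship between the working tree, the index, and HEAD can be traced with the three "diff" variants. This is a sketch in a scratch repository; the "demo" identity and the a.txt/b.txt files are hypothetical.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
echo "a" > a.txt
echo "b" > b.txt
git add a.txt b.txt
git commit -q -m "Initial commit"

# Edit both files, but stage only one of them.
echo "change" >> a.txt
echo "change" >> b.txt
git add a.txt

unstaged=$(git diff --name-only)            # working tree vs. index
staged=$(git diff --cached --name-only)     # index vs. HEAD
git diff HEAD --name-only                   # working tree vs. HEAD: lists both files

git commit -q -m "Change a.txt only"

left_over=$(git diff --name-only)           # the unstaged edit is still there
staged_after=$(git diff --cached --name-only)
echo "still uncommitted: $left_over"
```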
Getting Updates
When you want new updates, Git's "fetch" command will go out to the network and retrieve them, but it doesn't touch your local branches at all. You provide it with a remote name, and it will use that name to determine the repository URL and update all your existing remote branches.
E.g.
git fetch origin
When you have updates in one branch and want to combine them into your current branch, you want Git's "merge" command. By default, this uses a recursive 3-way merge algorithm which can resolve many conflicts automatically. Git uses the history data stored in the repository to determine the 3rd, "base" commit to use in the algorithm.
E.g.
git merge origin/master
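The "base" commit that the merge uses can be asked for explicitly with Git's "merge-base" command. The sketch below uses two local branches in a scratch repository instead of "origin/master"; the "demo" identity, branch name "feature", and file names are hypothetical.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
echo "base" > base.txt
git add base.txt
git commit -q -m "Common ancestor"
ancestor=$(git rev-parse HEAD)

# Two branches that diverge from the common ancestor.
git checkout -q -b feature
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Work on feature"

git checkout -q master
echo "master" > master.txt
git add master.txt
git commit -q -m "Work on master"

merge_base=$(git merge-base master feature)   # the 3rd, "base" commit
git merge -q -m "Merge feature into master" feature
echo "merged using base $merge_base"
```

The changes here do not overlap, so the merge completes without conflicts and both new files end up on "master".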
When there have been new branches created in one of the repositories you track (or there might have been) you can automatically create the new tracking branches using Git's "remote" command.
E.g.
git remote update origin
As mentioned above, "master" is automatically configured to track "origin/master". Because this is set up, we can use Git's "pull" command to combine a "fetch" and a "merge". This is probably the easiest way to start integrating your changes with anything else that has happened to the project in the meantime.
E.g. Change to "master", fetch changes from the remote it tracks, and merge those changes in.
git checkout master
git pull
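The difference between "fetch" and "pull" shows up in which references move. This sketch uses a local directory as a hypothetical stand-in for the remote repository; the "demo" identity, directory names, and file contents are assumptions.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Stand-in upstream repository and a clone of it.
git init -q "$work/upstream"
cd "$work/upstream"
git symbolic-ref HEAD refs/heads/master
echo "one" > file.txt
git add file.txt
git commit -q -m "First commit"
git clone -q "$work/upstream" "$work/clone"

# Somebody else adds a commit upstream.
echo "two" >> file.txt
git commit -q -a -m "Second commit"
upstream_tip=$(git rev-parse master)

cd "$work/clone"
before=$(git rev-parse master)

git fetch -q origin
local_after_fetch=$(git rev-parse master)          # unchanged by fetch
remote_after_fetch=$(git rev-parse origin/master)  # updated by fetch

git checkout -q master
git pull -q                                        # fetch + merge into "master"
local_after_pull=$(git rev-parse master)
echo "master moved from $before to $local_after_pull"
```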
Sharing Your Changes
For this, we'll need to move beyond your local repository. You'll have to set up a repository that allows (at least) read-only access to other persons. There are a lot of ways to go about that, but probably the easiest is to get a GitHub or Gitorious account and then use the fork (GitHub) or clone (Gitorious) button to create a repository. You'll also want to go through the GitHub/Gitorious documentation on generating (if needed) and registering an SSH key for read-write access to your repository.
(If for some reason you are averse to setting up a public Git repository, see below for an alternate way to share changes. It really doesn't take that much work to set up a public Git repository, though.)
Once all that is done, we can get back to your local repository. First we'll set up a new remote, using the read-write URL of the publicly available repository.
E.g. My GitHub repository
git remote add GitHub git@github.com:stephen-smith/openemr.git
This sets up a remote named "GitHub" (use whatever name you want) attached to the given URL. By itself, "git remote add" only records the name and URL; running "git fetch GitHub" afterwards (or passing "-f" to "git remote add") creates remote branches for all the branches in that repository, creates tags for the tags in that repository, and fetches all the objects in those tags/branches and stores them in your local repository (if you don't have them already).
Then, we use Git's push command to update the remote repository with information.
E.g.
git push GitHub local-branch:remote-branch
This takes our local branch "local-branch" and sets the remote branch "remote-branch" to point to the same commit and transfers any objects the remote end needs. If the branches have the same name you can leave out the ":remote-branch". If you leave out the "local-branch" (but not the colon), the remote branch is deleted. If the remote branch exists, it is only updated if the update is "fast-forward"; you can force an update by prefixing the refspec ("local-branch:remote-branch") with a "+". If this is a one-off push, you can use a URL instead of a remote name. Basically, there are a lot of options. There are also a number of ways to shorten this command; if you are interested, read and understand the documentation for Git's "push" command.
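The refspec forms described above can be tried safely against a throwaway bare repository. In this sketch the bare repository "public.git", the remote name "public", and the "demo" identity are hypothetical stand-ins for your own GitHub or Gitorious repository.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)

# Stand-in for the publicly available repository.
git init -q --bare "$work/public.git"

git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/master
echo "code" > file.txt
git add file.txt
git commit -q -m "Initial commit"
git checkout -q -b local-branch

git remote add public "$work/public.git"

# Create (or fast-forward) "remote-branch" so it points at "local-branch".
git push -q public local-branch:remote-branch
local_tip=$(git rev-parse local-branch)
pushed_tip=$(git --git-dir="$work/public.git" rev-parse refs/heads/remote-branch)

# Empty left-hand side (colon kept): delete the remote branch.
git push -q public :remote-branch
if git --git-dir="$work/public.git" rev-parse -q --verify refs/heads/remote-branch >/dev/null
then still_there=yes
else still_there=no
fi
echo "remote-branch still exists: $still_there"
```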
E.g. If you set up everything right, things can get really simple:
git checkout my-feature
git pull
# Hack, fiddle, test, fix, etc.
git commit
# Repeat above a few times.
git rebase -i remote/my-feature # Cleanup
git push
If you want to get a review or have the code mainlined, you'll need to make sure someone with CVS access sees your branch. Brady and Stephen are on GitHub and Gitorious, so a pull request (GitHub) or merge request (Gitorious) can get their attention individually. You can also use email or the SourceForge forums to ask for attention for your branch; just make sure to include both a URL for your public repository (read-only access is enough) and the name of the branch your changes are on.
Official
The official way to submit patches is to use the SourceForge "Patches" tracker and post messages to the SourceForge "Developers" forum. If you are tracking your changes in Git, it is relatively simple to prepare a patch set for the tracker. Git's format-patch command will automatically write out each commit to a separate patch file. Then you can tar those up and attach the file.
E.g. My command line for preparing patches
git checkout feature-branch
git format-patch -o /tmp origin/master
tar cjf Feature.tbz --remove-files /tmp/*.patch
Of course, branch and file names change based on what I'm doing. Descriptive names save time.
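To check what "format-patch" will produce before attaching anything to the tracker, you can run it in a scratch repository. This sketch compares against a local "master" rather than "origin/master" and writes to a temporary directory; the "demo" identity, file names, and commit messages are hypothetical, and the archiving step is left out.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
git init -q "$work/repo"
cd "$work/repo"
git symbolic-ref HEAD refs/heads/master
echo "base" > base.txt
git add base.txt
git commit -q -m "Initial commit"

# Two commits on a feature branch.
git checkout -q -b feature-branch
echo "first" > first.txt
git add first.txt
git commit -q -m "Add first feature file"
echo "second" > second.txt
git add second.txt
git commit -q -m "Add second feature file"

# One patch file per commit that is on feature-branch but not on master.
out="$work/patches"
mkdir "$out"
git format-patch -q -o "$out" master

set -- "$out"/*.patch
patch_count=$#
ls "$out"
```

Each file is named after its position in the series and its commit subject, which is one more reason descriptive commit messages save time.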