How to Collaborate with Git¶
Git is a powerful tool for version control of software and other plain-text information. However, Git alone is not ideal for enabling and facilitating collaboration between many users working on the same research software project. Be sure to check the Important Note on Terms if you aren't familiar with Git.
If you are here because you need to know how to get software from GitHub or our GitLab instance, please see Obtaining Software below.
If you are here because you need a place to collaborate with others on a software project, please see Collaborating below.
Important Note on Terms¶
On most pages in this documentation, we use "local" to refer to your laptop or desktop computer, and "remote" to refer to Cheaha or a Cloud.rc virtual machine.
When dealing with Git and repository hosting services like GitHub and GitLab we use remote to refer to repositories that are on a repository hosting site like GitHub or GitLab, and local to refer to repositories that are not on a repository hosting site.
To summarize:
- Git, GitHub, GitLab context
  - local: the repository on a computer where you work on your code (laptop, desktop, Cheaha)
  - remote: a remote server or service that stores code (GitHub, GitLab, etc.)
- Cheaha and Cloud.rc context
  - local: the machine (laptop, desktop) you use to access Cheaha or the Cloud.rc VM
  - remote: Cheaha or the Cloud.rc VM
For Obtaining Software¶
Cloning from GitHub¶
To do anything with GitHub, you will first need to navigate to their website https://github.com and create an account.
To clone a repository, be sure you have the repository URL. Then, using git at a terminal, clone the repository using whatever settings are appropriate. GitHub repository pages look something like the page for this documentation, shown below.
You may also use the "Code" button on the page to see instructions for cloning the repository.
More in-depth instructions, including for SSH cloning, are provided at the official documentation.
Cloning from GitLab¶
To do anything with our GitLab instance, you will first need to create an account. Please see our GitLab Account Management page.
To clone a repository, be sure you have the repository URL. Then, using git at a terminal, clone the repository using whatever settings are appropriate. Be sure to append .git to the end of the repository URL or the clone will not be successful. For example, if the URL is https://gitlab.rc.uab.edu/user/repository then you will clone https://gitlab.rc.uab.edu/user/repository.git. GitLab repository pages look like the example shown below.
You may also use the "Clone" button on the page to see instructions for cloning the repository.
More in-depth instructions, including for SSH cloning, are provided at the official documentation.
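The .git suffix rule can be sketched in shell. This is a minimal illustration, not an official procedure: the variable names repo_page_url and clone_url are hypothetical, and the URL is the placeholder from the example above, so the sketch prints the clone command instead of running it.

```shell
# Placeholder URL taken from the example above; substitute your repository's page URL.
repo_page_url="https://gitlab.rc.uab.edu/user/repository"

# Append .git only when the URL does not already end with it.
case "$repo_page_url" in
  *.git) clone_url="$repo_page_url" ;;
  *)     clone_url="${repo_page_url}.git" ;;
esac

# Print the command to run at a terminal once the URL points at a real repository.
echo "git clone $clone_url"
```

Running the printed command at a terminal creates a folder named after the repository in the current directory.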
For Collaborating¶
GitHub and GitLab can both be used for software project management, and have helpful tools to facilitate group collaboration within projects and across multiple projects.
Both services use organizations to manage projects across a team of people: GitHub docs page, GitLab docs page. Within a GitHub organization, people and repositories can be arranged into teams. GitLab allows arrangement of people and repositories with projects.
An important feature, used extensively for this documentation's GitHub repository, is the issue tracker. Both GitHub and GitLab have per-repository issue trackers. Collaborators can create and manage issues, label them, and resolve them.
How do I Choose Between GitHub and GitLab?¶
- Want to collaborate publicly and outside UAB? Consider using GitHub.
- Want your project private or internal to UAB? Consider using our GitLab instance.
It is possible to collaborate publicly using GitLab, but there may be additional challenges. While external collaborators can see a public GitLab repository on our instance, they can't make any changes or create issues without a XIAS Account.
It is possible to collaborate privately using GitHub with no additional hurdles, but if your project contains sensitive or protected information of any kind, it should not be posted to GitHub, even in private repositories. Please consult with us via Support before
Good Practice for Organizing a Lab Space¶
Below is a bulleted list of good practices for organizing a lab space. Each bullet is followed by links to relevant GitHub and GitLab documentation pages, as appropriate.
- Have an organization for your lab space. GitHub GitLab
- For each software project, create a repository within your organization. GitHub, GitLab
- By default, organization members will have access at their assigned role level. These can be changed by managing roles and using teams effectively, if needed. For smaller labs this is often not necessary.
- The created repository is the central one for the organization and should not be changed directly.
- For every individual, including owners and admins, work should be performed on a personal fork of the repository and then merged by submitting pull/merge requests.
- Forks are copies of repositories made as a snapshot at the moment they are created. From that point on they are independent repositories with some features to facilitate collaborative workflows. GitHub GitLab
- Pull/merge requests allow individuals to contribute to a central repository. They allow reviewers to check the changes to ensure code quality, and to provide reviews or request changes. They are the primary means of controlling how code changes over time, and who is allowed to make those changes. GitHub GitLab
- See the Fork-Pull/Merge Request Workflow Section for more details on this valuable method of change management.
The Fork-Pull/Merge Request Workflow¶
The Fork-Pull/Merge Request workflow is a central concept to effective collaboration on individual repositories. It allows code owners and admins to effectively control how code changes, while giving accountability and credit to code maintainers and programmers. Every person working on a project has an effective means of working independently while being able to pull their changes together in a central location. It also neatly ties into issue tracking, which is discussed in the Issue Tracking Section.
The workflow assumes a central repository already exists within an organization on either GitHub or GitLab. The workflow is written from the point of view of a new programmer who wants to work on the repository. The programmer must have a local machine where they will do their work and it must have Git installed.
- One-time setup
- Workflow
- Decide on a set of changes to make. Good practice is to work on only one conceptual unit at a time: one feature, one bug fix, or one documentation page. Prefer fixing bugs before adding features.
- Synchronize your downstream fork with the upstream fork to minimize risk of merge conflicts. GitHub GitLab
- Pull the downstream fork main branch to your local clone main branch.
- Create a working branch for intended changes. Give it a short, descriptive name like feature-add-button or fix-broken-link.
- Checkout the working branch.
- Make changes to the code on the local machine using your preferred editor. Make small units of change at a time and try not to commit too much at once, but make sure your changes don't break the code. There is an art to this that comes with practice, but don't be afraid of trying.
- Commit those changes to the working branch. Keep making changes and committing until the set of changes is complete.
- When all needed changes have been made, push the working branch to your fork.
- Create a pull/merge request from the downstream working branch to the upstream main branch. GitHub GitLab
- Wait for reviews, make any requested changes, and, if all goes well, your request will be merged.
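The local Git steps of the workflow above can be sketched as follows. This is a hedged illustration under stated assumptions: a bare repository named fork.git stands in for your fork on GitHub or GitLab so the sketch runs offline, the user name, email, and file contents are placeholders, and Git 2.28 or newer is assumed for --initial-branch. Synchronizing the fork and opening the pull/merge request still happen on the hosting site.

```shell
set -eu
workdir="$(mktemp -d)"
cd "$workdir"

# --- Stand-in setup (hypothetical): fork.git plays the role of your fork on the hosting site ---
git init --quiet --bare --initial-branch=main fork.git
git init --quiet --initial-branch=main seed
git -C seed -c user.name="Example User" -c user.email="user@example.com" \
    commit --quiet --allow-empty -m "Initial commit"
git -C seed push --quiet "$workdir/fork.git" main

# --- One-time: clone your fork to the local machine ---
git clone --quiet fork.git repository
cd repository
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# --- Each set of changes ---
git pull --quiet origin main               # bring the local main branch up to date with the fork
git checkout --quiet -b fix-broken-link    # create and check out the working branch
echo "See https://github.com for details." > README.md
git add README.md
git commit --quiet -m "Fix broken link in README"
git push --quiet origin fix-broken-link    # publish the working branch to the fork
```

After the push, the working branch exists on the fork and can be selected as the source of a pull/merge request.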
Sometimes merging will be blocked because of a merge conflict. One programmer may make changes to code being worked on by another, and the two changes come into conflict. If this occurs, below are some steps that may help resolve the issue. In some cases, conflict resolution is straightforward, but in other cases thought will be necessary to disentangle what code should be kept, what should be discarded, and what should be modified.
- The downstream programmer should try synchronizing their fork, pulling it to their local main branch, and merging the main branch into their working branch. The conflict may still occur on their local machine, but they will be able to more easily see and test the effects of various conflict resolution attempts.
- Use a three-way diff program or editor which will let you see both sets of conflicting code, and facilitate making changes and selections. VSCode has a built-in three-way merge editor.
- Be sure everyone is using the same formatting rules in their editors. Sometimes spurious conflicts can occur as a result of inconsistent formatting.
- To minimize risk of conflict, don't have more than one programmer work on the same section of code if possible.
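Merging the main branch into a working branch, hitting a conflict, and recording a resolution might look like the sketch below. It is only an illustration: the repository, branch name, file, and identity are all hypothetical, the second commit on main stands in for changes pulled from a synchronized fork, and Git 2.28 or newer is assumed for --initial-branch.

```shell
set -eu
workdir="$(mktemp -d)"
cd "$workdir"
git init --quiet --initial-branch=main repository
cd repository
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

echo "greeting = hello" > settings.txt
git add settings.txt
git commit --quiet -m "Add settings"

# The working branch changes the line one way...
git checkout --quiet -b fix-greeting
echo "greeting = hi" > settings.txt
git commit --quiet -am "Shorten greeting"

# ...while main (standing in for freshly pulled upstream changes) changes it another way.
git checkout --quiet main
echo "greeting = good morning" > settings.txt
git commit --quiet -am "Make greeting more formal"

# Merge main into the working branch; the conflict now appears on the local machine.
git checkout --quiet fix-greeting
git merge main || echo "Merge stopped with conflicts"
git diff --name-only --diff-filter=U       # list the files that still have conflicts

# After deciding what to keep (by hand or in a merge editor), record the resolution.
echo "greeting = good morning, hi" > settings.txt
git add settings.txt
git commit --quiet -m "Merge main into fix-greeting"
```

Once the merge commit is pushed to the fork, the pull/merge request updates and the conflict should no longer block it.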
Effective Issue Tracking¶
Effective use of issue tracking can greatly reduce cognitive load and simplify code management. It gives a central location where users and maintainers can report bugs, make feature requests, and ask for clarifications on usage and documentation. These issues are tracked over time, can be labeled and organized, and closed and reopened. GitHub GitLab
The typical issue lifecycle, at a high level, is something like below.
- Create an issue. GitHub GitLab
- Ask for clarifications and discuss as needed.
- Use the Fork-Pull/Merge Request Workflow to resolve the issue. In the Pull Request description, put the text Fixes #... where ... should be replaced by the issue's number. When the request is merged, the issue will automatically be linked to the request and closed.
Common Scenarios¶
Uploading an Existing Code Folder¶
The process for this has a few intricate steps that may be unfamiliar even to regular users of git, and has a few pitfalls.
- Use git init in the top-level code folder on the local machine, if it is not already a git repository. If it already is a repository, be sure the primary branch is called main. Use git branch -m <oldname> main to rename it if needed.
- Create a repository on the remote server. GitHub, GitLab
- Use git remote add origin <url> to add the remote URL to the local repository with the name origin.
- Verify the URL is correct with git remote -v. Fix it with git remote set-url origin <url> if needed.
- Checkout the main branch with git checkout main.
- Use git pull origin main --allow-unrelated-histories to combine the main branches of the remote and local repository, within your local repository.
- Use git push origin main to push the combined histories to the remote repository.
- Be sure to verify the repository looks good at the GitHub/GitLab repository page (depending on which you used).
Note
--allow-unrelated-histories is necessary because Git considers the remote repository to be a completely distinct entity from the local repository. Their histories are unrelated.
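The upload steps above can be sketched end to end. This is an illustration under assumptions, not a definitive procedure: a bare repository named remote.git stands in for a repository created on GitHub or GitLab with an initial README, the code folder, file names, and identity are hypothetical, and Git 2.28 or newer is assumed for --initial-branch. The extra --no-rebase and --no-edit flags are additions for this sketch: recent Git versions refuse to pull divergent branches until told whether to merge or rebase, and --no-edit accepts the default merge message without opening an editor.

```shell
set -eu
workdir="$(mktemp -d)"
cd "$workdir"

# --- Stand-in setup (hypothetical): remote.git plays the role of the new hosted repository ---
git init --quiet --bare --initial-branch=main remote.git
git init --quiet --initial-branch=main seed
echo "# Project" > seed/README.md
git -C seed add README.md
git -C seed -c user.name="Example User" -c user.email="user@example.com" \
    commit --quiet -m "Initial commit"
git -C seed push --quiet "$workdir/remote.git" main

# --- The existing code folder on the local machine ---
mkdir code
echo 'print("hello")' > code/analysis.py
cd code
git init --quiet --initial-branch=main
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"
git add analysis.py
git commit --quiet -m "Add existing analysis code"

# --- Connect the folder to the remote and combine the histories ---
git remote add origin "$workdir/remote.git"   # use the real repository URL here
git remote -v                                 # verify the URL
git checkout --quiet main
git pull --quiet --no-rebase --no-edit origin main --allow-unrelated-histories
git push --quiet origin main
```

After the push, the remote main branch contains both the README created on the hosting site and the existing code.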
HTTPS vs SSH Access¶
For most beginners using Git, GitHub and GitLab, HTTPS (hypertext transfer protocol secure) is probably sufficient for accessing repositories in the early stages. It is the default mode of accessing GitHub and GitLab when using Git at the command line. HTTPS is less secure than SSH (secure shell). We recommend learning to use SSH as soon as possible to minimize security risks. Below are links to GitHub and GitLab documentation for using SSH.
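Switching an existing clone from HTTPS to SSH only changes the remote URL stored in the local repository. The sketch below is a hedged illustration: the HTTPS URL is the placeholder used earlier on this page, and the SSH URL assumes the conventional git@host:user/repository.git form, so copy the actual SSH URL from the repository page before using it. No network access happens here, and an SSH key must already be registered with the service before pushes or pulls over SSH will work.

```shell
set -eu
workdir="$(mktemp -d)"
cd "$workdir"
git init --quiet repository
cd repository

# Start with an HTTPS remote (placeholder URL).
git remote add origin https://gitlab.rc.uab.edu/user/repository.git
git remote -v

# Point the same remote name at the SSH form of the URL (assumed form).
git remote set-url origin git@gitlab.rc.uab.edu:user/repository.git
git remote get-url origin
```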