Background: I've recently inherited a set of projects at my company and I'm trying to sort out some fundamental issues with how they've been handled. Namely, the previous developers (who are no longer with the company) were not using any form of source control, made little documentation, and didn't really have any good development processes in place.
So now I've got three servers' worth of projects (development, staging, production), consisting mostly of websites, applications and tools built for the third-party applications and APIs we use, down to stores of SQL scripts and other things. My first thought was to get all of this into Git before changes and fixes are made, but I'm having a difficult time figuring out the best way to do it.
A lot of previous development was done directly on the production servers, which has created a divide between each server's code base. It's not immediately clear where all the differences lie: I'm seeing bug fixes on the production side that aren't carried over to development/staging, as well as new features on the development server that haven't been moved up towards staging/production.
Question: What would be the best way for me to organize and move these into Git? How would I structure my repos/branches to accommodate the differences in the code?
I've considered continuing development from clones of the production server code and keeping the development/staging code bases as historical reference. Would this potentially be a point to start with, considering I don't know anything about the dev/staging code anyway? I could simply create repos of the production servers for each website, tool, script set, etc., create branches for the existing dev/staging code, and any new development would branch from the production server's code base. Does this make sense?
- So all of the developers left before you started? – Ewan, Jan 25, 2018 at 18:42
- Yes; it was only three developers on this particular set of projects, though they had been working on this stuff for quite a few years. I was told they left abruptly and I was brought in to start picking up the pieces of what they left behind. – user9268966, Jan 25, 2018 at 19:01
- Have a look at "nvie.com/posts/a-successful-git-branching-model"; it is a model often used. – Patrick Mevzek, Jan 25, 2018 at 20:10
- @RobertHarvey And? I'm using the same model on "one guy" software development (me), and the important point is the setup with branches such as: master, dev(elop), feature-X, hotfix-Y. This works irrespective of the number of people and repositories. – Patrick Mevzek, Jan 25, 2018 at 20:27
- @RobertHarvey as I said: often used, obviously not a solution for 100% of use cases, but it is at least useful to read before deciding which model to use. And there were previous developers, so the lone guy may not be always alone... :-) – Patrick Mevzek, Jan 25, 2018 at 20:34
3 Answers
Push the production stuff into the master branch of a new repo. Create a develop branch from that, and then merge the staging server into it. You may wind up with conflicts that need to be resolved. Once those are resolved, create another feature_branch from develop and merge the development server into it. Resolve any conflicts that arise.
This leaves you with 3 branches, which represent your production, staging, and development environments. Production -> master, staging -> develop, development -> feature_branch. All development is thus done on feature_branches and only merged in to the develop branch when the feature is done, tested, and stable. Since it's stable, it can be used as staging. Cut a release branch from develop when you're ready to release, tie up any loose ends, merge that into master, and then you have your new production build.
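As a rough sketch of that cycle (feature branch into develop, release branch into master), something like the following could be run in a throwaway directory. The branch names feature_login and release_1.1, the file names, and the commit messages are made up for illustration; the final merge back into develop is the usual Gitflow step rather than something spelled out above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway repository so the sketch is safe to run anywhere.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is named master
git config user.name "Example Dev"        # placeholder identity for the demo commits
git config user.email "dev@example.invalid"

# master mirrors production; develop mirrors staging.
echo "v1" > app.txt
git add -A
git commit -q -m "Import production code"
git branch develop

# New work happens on a feature branch cut from develop (hypothetical name).
git checkout -q -b feature_login develop
echo "login page" > login.txt
git add -A
git commit -q -m "Add login page"

# Once the feature is done and tested, merge it into develop (staging).
git checkout -q develop
git merge -q --no-ff -m "Merge feature_login into develop" feature_login

# Cut a release branch, tie up loose ends, then merge into master.
git checkout -q -b release_1.1 develop
echo "v1.1" > app.txt
git commit -q -am "Bump version for release"
git checkout -q master
git merge -q --no-ff -m "Release 1.1" release_1.1

# Usual Gitflow step (assumption, not stated above): carry release fixes back to develop.
git checkout -q develop
git merge -q --no-ff -m "Merge release_1.1 back into develop" release_1.1
```

The --no-ff flag is a matter of taste; it keeps a merge commit for each feature and release so the history shows where each one landed.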
One of your first orders of business after getting this set up should be to merge the feature_branch back into develop*, and then develop back into master. Bear in mind that the feature_branch may contain untested code and features, so exercise caution when merging it into develop and then master. Once that is done, all branches should contain the same code, and any development that was done on the production server is now ported back into the development "server".
In this model, each project would be in its own repo, and that repo would have a master and develop branch, plus feature_branches for any work being done.
EDIT, to address comments: Yes, this is Gitflow.
This strategy (or Gitflow in general) keeps the existing 3-level system (production, staging, development) with a clear merge path from development on up to production. Importing the codebases this way also allows the branches to be synced up while maintaining the status quo in production, at least until the merges can be tested. This accomplishes a few goals: it gets the code into source control, gets the different codebases synced up and merged (so there are no longer bug fixes in production that are missing from development), and provides a nice process to use going forward (a process that is well defined and used by a lot of people / teams / companies). If the OP finds that Gitflow isn't well suited to his projects / teams / company as he uses it or as the company grows, then it's easy to change later on. The critical point is that everything is in source control and development is being done on the right branch.
*You may wish to cut another feature branch and remove any obvious new features, and merge that branch into develop (and then into master). This keeps you from having to test new features on top of all the other tests you'll be doing.
- Sounds like GitFlow. – Robert Harvey, Jan 25, 2018 at 21:33
- This is a bit of a cargo cult answer. How would gitflow specifically help solve the stated problem in the question? – Mr Cochese, Jan 26, 2018 at 10:33
- @MrCochese see my edit – mmathis, Jan 26, 2018 at 13:29
- At first, your answer seemed like just an explanation of Gitflow which wasn't what I was looking for, but your edit added the much needed context to really answer the question at hand. I won't be going with Gitflow since I don't think it's appropriate for the situation, however I appreciate the logic behind the idea and the thoroughness of it. I'd suggest adding more of your thought process to answers in the future to provide that context as I mentioned before. – user9268966, Jan 29, 2018 at 16:39
I'm going to recommend the staging code as the best baseline for your initial import. That's because there are changes in production that aren't in staging, due to the hot fixes, but far fewer if any changes in staging that aren't in production. Likewise there are changes in development that aren't in staging, due to the new features, but likely far fewer if any changes in staging that aren't in development.
Note, you do not want staging to be your baseline after your initial import. This is just a temporary situation due to changes not being previously tracked. Branch operations go much more smoothly if you are adding changes rather than removing them. After your initial import, switch to whatever branching model suits your needs best going forward.
So, check your staging code into a staging branch, then do a git checkout -b master staging to create your master branch, and check your production code into there. Then do a git checkout -b development staging to create your development branch, and check your development code into there.
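A minimal sketch of that import, assuming the code from each server has already been copied into local directories. The directory names, file contents, and the import_snapshot helper are all made up for the demo; only the two git checkout -b commands come from the steps above.

```shell
#!/usr/bin/env bash
set -euo pipefail

work="$(mktemp -d)"

# Stand-ins for code copied off the three servers (hypothetical contents).
mkdir "$work/staging" "$work/production" "$work/development"
printf 'line 1\nline 2\nline 3\n' > "$work/staging/site.txt"
printf 'line 1 hotfixed\nline 2\nline 3\n' > "$work/production/site.txt"
printf 'line 1\nline 2\nline 3\n' > "$work/development/site.txt"
echo "new feature" > "$work/development/feature.txt"

# Hypothetical helper: replace the working tree with one server's snapshot.
import_snapshot() {
  local src="$1" msg="$2"
  # Remove everything except .git so files deleted on that server disappear too.
  find . -mindepth 1 -maxdepth 1 ! -name .git -exec rm -rf {} +
  cp -R "$src"/. .
  git add -A
  git commit -q -m "$msg"
}

repo="$work/repo"
mkdir "$repo"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/staging   # first commit lands on a branch named staging
git config user.name "Example Dev"         # placeholder identity for the demo commits
git config user.email "dev@example.invalid"

# Staging is the baseline.
import_snapshot "$work/staging" "Import staging code"

# Production and development each become one commit on top of staging.
git checkout -q -b master staging
import_snapshot "$work/production" "Import production code"

git checkout -q -b development staging
import_snapshot "$work/development" "Import development code"

# Show the resulting shape: two branches forking from the staging commit.
git log --oneline --graph --all
```

Because both master and development fork from the same staging commit, each branch records only what that server added relative to staging, which is what makes the later merge workable.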
Now check out your development branch and merge master into it. This will let you solve the likely huge amount of merge conflicts while still maintaining master as a record of what's actually in production. development now contains all the changes from every environment. You can now switch to whatever branching model suits you best.
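The merge step might look roughly like this. The repository is rebuilt inline with made-up file contents so the script runs on its own, and the example is deliberately small enough that the merge succeeds without conflicts; on real code you should expect the conflict branch to be hit.

```shell
#!/usr/bin/env bash
set -euo pipefail

repo="$(mktemp -d)"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/staging
git config user.name "Example Dev"         # placeholder identity for the demo commits
git config user.email "dev@example.invalid"

# Minimal stand-in for the three imported branches (hypothetical contents).
printf 'line 1\nline 2\nline 3\n' > site.txt
git add -A
git commit -q -m "Import staging code"

git checkout -q -b master staging
printf 'line 1 hotfixed\nline 2\nline 3\n' > site.txt
git commit -q -am "Import production code"

git checkout -q -b development staging
echo "new feature" > feature.txt
git add -A
git commit -q -m "Import development code"

# Merge production into development; master itself is left untouched.
git checkout -q development
if ! git merge --no-ff -m "Merge production fixes into development" master; then
  # Conflicts stop the merge. List the affected files, fix them by hand,
  # then finish with: git add <file> && git commit
  git diff --name-only --diff-filter=U
  exit 1
fi
```

After this, development holds both the production hot fix and the new feature, while master still matches what is running in production.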
It's a good idea to have the history. I would create the repository (or one for each product) from the most stable environment. Create branches or diffs for the others.
At a high level:
- Create a new repo
- From a production-based working copy: add all, commit, and push
- Checkout master to a new directory
- For each additional environment XYZ
  - Create branch Archive-XYZ
  - Replace everything with XYZ source (except .git)
  - add all, commit, and push
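The steps above could be scripted along these lines. The directory layout and file contents are placeholders, and the push steps are left out because the demo has no remote; with a real remote you would add a git push after each commit.

```shell
#!/usr/bin/env bash
set -euo pipefail

work="$(mktemp -d)"

# Stand-ins for code copied off each server (hypothetical contents).
mkdir "$work/production" "$work/staging" "$work/development"
echo "prod" > "$work/production/site.txt"
echo "stage" > "$work/staging/site.txt"
echo "dev" > "$work/development/site.txt"
echo "extra" > "$work/development/feature.txt"

repo="$work/repo"
mkdir "$repo"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"         # placeholder identity for the demo commits
git config user.email "dev@example.invalid"

# master starts from the production code.
cp -R "$work/production"/. .
git add -A
git commit -q -m "Import production code"

# One archive branch per additional environment.
for env in staging development; do
  git checkout -q -b "Archive-$env" master
  # Replace everything except .git with that environment's source.
  find . -mindepth 1 -maxdepth 1 ! -name .git -exec rm -rf {} +
  cp -R "$work/$env"/. .
  git add -A
  git commit -q -m "Archive $env code"
done

# Back on master, compare production against any archived environment.
git checkout -q master
git diff --stat master Archive-development
```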
Alternatively, if you're skeptical of the value of this, git diff > XYZ.diff instead of actually committing and pushing, and archive the diffs.
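A sketch of the diff-only variant follows, again with placeholder directories and contents. One deviation worth flagging: plain git diff skips untracked files, so files that exist only in the other environment would be missing from the patch. This sketch stages the changes and uses git diff --cached instead, then resets; that is my variation, not part of the suggestion above.

```shell
#!/usr/bin/env bash
set -euo pipefail

work="$(mktemp -d)"

# Stand-ins for code copied off each server (hypothetical contents).
mkdir "$work/production" "$work/staging" "$work/development"
echo "prod" > "$work/production/site.txt"
echo "stage" > "$work/staging/site.txt"
echo "dev" > "$work/development/site.txt"
echo "extra" > "$work/development/feature.txt"

repo="$work/repo"
mkdir "$repo"
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "Example Dev"         # placeholder identity for the demo commit
git config user.email "dev@example.invalid"

cp -R "$work/production"/. .
git add -A
git commit -q -m "Import production code"

for env in staging development; do
  # Swap in that environment's source without committing it.
  find . -mindepth 1 -maxdepth 1 ! -name .git -exec rm -rf {} +
  cp -R "$work/$env"/. .
  git add -A                              # stage so new and deleted files are included
  git diff --cached > "$work/$env.diff"   # archive the patch outside the repo
  git reset -q --hard                     # restore the production working copy
done

ls "$work"/*.diff
```

Each .diff file can later be inspected or applied with git apply if you decide you want those changes after all.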
Either way, you should end in a state where you can easily compare the code you have running in each environment, which you can use to settle on a single starting point for each project. And, if something breaks, you'll theoretically be able to compare your changes against any of the three environments.