Moving Git Repository Files Preserving Change History
- Updated on 4 Sep 2024
Sometimes, you realize that a common library is being used in a single project but resides in its own Git repository. At other times, you may decide to merge a few Git repositories into one to reduce maintenance issues. There are many other similar scenarios that involve moving a portion of one Git repository into another. A common problem that unites all these cases is how to preserve the Git history for the moved files.
The problem can be described as follows: given two Git repositories, source and target, the task is to move content from source to target, either wholly or partially, while preserving the Git change history.
How to preserve change history
The main idea, as described in this source of wisdom, is to create a patch file that includes all commits for the files you want to move. Then, apply that patch to a target repository, as shown in the code snippet below.
# Go into the source git repo and create a patch.
git log --pretty=email --reverse --binary --full-index \
--patch-with-stat --first-parent -m \
-- <path-to-file-or-directory> > history.patch
# Go into the target git repo and apply the patch.
git am --committer-date-is-author-date \
< <path-to-source-repo-with-patch>/history.patch
Options used in the commands above:
- --pretty=email: prints each commit in the email format, which git am requires to apply the patch.
- --reverse: lists commits oldest first, so that git am replays them in their original order.
- --binary: includes binary diffs in the patch file.
- --full-index: shows the full pre- and post-image blob object names on the "index" line when generating patch output.
- --patch-with-stat: generates a patch with diff statistics; it is a synonym for -p --stat.
- --first-parent: follows only the first parent commit when encountering a merge commit.
- -m: shows the diff for each commit, including merge commits. Without this option, the individual changes introduced by merge commits are not shown.
- -- <path-to-file-or-directory>: limits the output to commits that touch the specified file or directory.
- --committer-date-is-author-date: an option of git am that sets the committer date to the author date, preserving the original timestamps of the changes.
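To see the two commands working together before trying them on real repositories, here is a minimal end-to-end sketch on throwaway repositories. Everything in it is an assumption made for illustration: the temporary directory, the lib/util.txt file, the commit messages, and the Demo identity are hypothetical and do not come from the repositories used later in this post.

```shell
#!/bin/sh
# Throwaway end-to-end run. All paths, file names, and the identity
# below are hypothetical and exist only for this illustration.
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

# Source repository: two commits touching lib/.
git init -q "$work/source"
cd "$work/source"
mkdir lib
echo "v1" > lib/util.txt
git add lib
git commit -q -m "add util"
echo "v2" > lib/util.txt
git commit -q -am "update util"

# Target repository: one unrelated commit.
git init -q "$work/target"
cd "$work/target"
echo "target" > README.md
git add README.md
git commit -q -m "init target"

# Export the history of lib/ as a series of email-formatted patches.
cd "$work/source"
git log --pretty=email --reverse --binary --full-index \
    --patch-with-stat --first-parent -m \
    -- lib > "$work/history.patch"

# Replay the series on top of the target history.
cd "$work/target"
git am --committer-date-is-author-date < "$work/history.patch"

# The target should now hold its own commit plus the two replayed ones.
total=$(git rev-list --count HEAD)
echo "commits in target: $total"
```

The patch file is written outside both repositories here only so that it does not show up as an untracked file; keeping it inside the source repository, as in the commands above, works just as well.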
The exact commands vary slightly depending on where the files should end up in the target repository, but the approach stays almost the same. To gain hands-on experience, let’s use two Git repositories and explore the main possible scenarios for moving files along with their change history.
Setting the stage
I have created two Git repositories with their initial history. Here’s the structure of the source-git-repo repository:
source-git-repo
└── app
    ├── docs
    │   └── HOWTO.txt
    └── src
        ├── main
        │   └── java
        │       └── App.java
        └── test
            └── java
                └── AppTest.java
And here’s the structure of the target-git-repo repository:
target-git-repo
├── README.md
└── app
    └── src
        └── main
            ├── cpp
            │   └── app.cpp
            └── headers
                └── app.h
Moving files into the same directory
Let’s move source-git-repo/app/src into target-git-repo/app/src.
Create a patch for the directory to be moved:
cd source-git-repo
git log --pretty=email --reverse --binary --full-index \
--patch-with-stat --first-parent -m \
-- app/src > history.patch
Now, go to the target repository and apply the patch:
cd ../target-git-repo
git am --committer-date-is-author-date < ../source-git-repo/history.patch
# Output of the previous command:
Applying: source-git-repo -> add simple App
Applying: source-git-repo -> remove redundant comments
Applying: source-git-repo -> add tests
Applying: source-git-repo -> remove java comments from tests
Applying: source-git-repo -> move to default package
Applying: source-git-repo -> fix compilation errors
If you check the target repository’s structure, you’ll see that the new files have appeared in the expected locations:
target-git-repo
├── README.md
└── app
    └── src
        ├── main
        │   ├── cpp
        │   │   └── app.cpp
        │   ├── headers
        │   │   └── app.h
        │   └── java
        │       └── App.java
        └── test
            └── java
                └── AppTest.java
Let’s also check the Git history for the target repository before and after the patch is applied:
# Before patch
git log --pretty=oneline
238ebaae28d50f0a5598ea6c82a2f10f47552a66 (HEAD -> main) target-git-repo -> remove gradle and simplify repo
a00279e8ab44f3623e9b79f8915ffe1592709e38 target-git-repo -> init cpp app skeleton
# After patch
git log --pretty=oneline
21d445aab6ed57cda334021edac818f9d694d078 (HEAD -> main) source-git-repo -> fix compilation errors
16509a5d165e42637d4d34705a3d90f7087d2016 source-git-repo -> move to default package
051f615172de04db74869882d39183c8f2a5d2fd source-git-repo -> remove java comments from tests
47b06a0a1df7cd252fa6d7904224081a40a82ced source-git-repo -> add tests
c24f2d1faa637781d5e171904d3c78764dc1cee7 source-git-repo -> remove redundant comments
46220bc67b4a7121b396cd61e5650ca74862e776 source-git-repo -> add simple App
238ebaae28d50f0a5598ea6c82a2f10f47552a66 target-git-repo -> remove gradle and simplify repo
a00279e8ab44f3623e9b79f8915ffe1592709e38 target-git-repo -> init cpp app skeleton
You can see that the commits from the source repository were applied on top of the existing history. Therefore, we have successfully moved the files and preserved their change history in the target repository.
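Commit hashes change when the history is replayed, so comparing hashes will not confirm that the history survived. One way to check is to compare author, author date, and subject between the two repositories. The sketch below does that on throwaway repositories. The file content, the fixed author date, and the Demo identity are assumptions chosen for illustration, not values from the repositories above.

```shell
#!/bin/sh
# Checks that author, author date, and subject survive the move.
# All names, dates, and paths here are hypothetical.
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/source"
cd "$work/source"
mkdir -p app/src
echo "class App {}" > app/src/App.java
git add app
# A fixed author date in the past makes preservation easy to spot.
GIT_AUTHOR_DATE="2020-01-02 03:04:05 +0000" git commit -q -m "add simple App"
git log --pretty=email --reverse --binary --full-index \
    --patch-with-stat --first-parent -m \
    -- app/src > "$work/history.patch"

git init -q "$work/target"
cd "$work/target"
echo "target" > README.md
git add README.md
git commit -q -m "init target"
git am -q --committer-date-is-author-date < "$work/history.patch"

# %an = author name, %at = author timestamp, %ct = committer timestamp,
# %s = subject. Timestamps are compared as Unix seconds to avoid
# time zone formatting differences.
src_meta=$(git -C "$work/source" log -1 --format='%an|%at|%s')
dst_meta=$(git log -1 --format='%an|%at|%s')
dst_authored=$(git log -1 --format='%at')
dst_committed=$(git log -1 --format='%ct')

echo "source: $src_meta"
echo "target: $dst_meta"
echo "target committer timestamp: $dst_committed"
```

If the two metadata lines match and the committer timestamp equals the author timestamp, the replayed commit carries the original authorship and date, which is the effect --committer-date-is-author-date is meant to have.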
Moving files into a different directory
Now let’s move source-git-repo/app/* into target-git-repo/app-java.
The app directory itself should not appear inside app-java; instead, all subdirectories and files from app should sit directly under app-java.
Let’s create a patch for this scenario:
cd source-git-repo
git log --pretty=email --reverse --binary --full-index \
--patch-with-stat --first-parent -m \
-- app > history.patch
Now, to apply it to the target repository and properly organize the new files, you need to use two additional options, -p2 and --directory:
cd ../target-git-repo
git am -p2 --committer-date-is-author-date --directory app-java \
< ../source-git-repo/history.patch
# Output of the previous command:
Applying: source-git-repo -> add simple App
Applying: source-git-repo -> remove redundant comments
Applying: source-git-repo -> add build.gradle
Applying: source-git-repo -> add tests
Applying: source-git-repo -> remove java comments from tests
Applying: source-git-repo -> move to default package
Applying: source-git-repo -> fix compilation errors
Applying: source-git-repo -> fix build errors
Applying: source-git-repo -> simplify repo
Here, the -p2 option strips two leading path components from every file path in the patch: the a/ (or b/) prefix that Git adds to paths in a diff, and the app directory. The --directory option then places the patched files into the app-java subdirectory of the target repository.
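The path rewriting is easier to follow on a tiny example. The sketch below uses throwaway repositories with a single hypothetical file, app/docs/HOWTO.txt, and a hypothetical Demo identity. It prints the path as recorded in the patch and then lists where the file lands after git am.

```shell
#!/bin/sh
# Shows how -p2 and --directory rewrite paths while applying a patch.
# Repository locations, file content, and identity are hypothetical.
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/source"
cd "$work/source"
mkdir -p app/docs
echo "how to" > app/docs/HOWTO.txt
git add app
git commit -q -m "add docs"
git log --pretty=email --reverse --binary --full-index \
    --patch-with-stat --first-parent -m \
    -- app > "$work/history.patch"

# The patch still records the original location, prefixed with a/ and b/.
grep '^diff --git' "$work/history.patch"

git init -q "$work/target"
cd "$work/target"
echo "target" > README.md
git add README.md
git commit -q -m "init target"

# -p2 drops "a/" and "app/", leaving docs/HOWTO.txt;
# --directory then prepends "app-java/".
git am -q -p2 --committer-date-is-author-date \
    --directory=app-java < "$work/history.patch"

moved=$(git ls-files)
echo "$moved"
```

To strip more directory levels, increase the number accordingly: for example, -p3 would also drop the docs component from the path in this sketch.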
Take a look at the resulting structure of the target repository:
target-git-repo
├── README.md
├── app
│   └── src
│       └── main
│           ├── cpp
│           │   └── app.cpp
│           └── headers
│               └── app.h
└── app-java
    ├── docs
    │   └── HOWTO.txt
    └── src
        ├── main
        │   └── java
        │       └── App.java
        └── test
            └── java
                └── AppTest.java
As expected, all the content from source-git-repo/app is now available in target-git-repo/app-java.
The Git history contains all changes related to the moved files:
git log --pretty=oneline
a770f378bced672290fc62da66145a9e3b073f37 (HEAD -> main) source-git-repo -> simplify repo
b16dd23121d56e9d0703a2ec5628f2c743207500 source-git-repo -> fix build errors
ae07d962403660b493d9a6454b0f8d0813015fb4 source-git-repo -> fix compilation errors
a2d1fe162ec9009f218031a3f49b9087683c6046 source-git-repo -> move to default package
fda2ef3b4d1c559f0ebb39eda3f61c7f62c68d81 source-git-repo -> remove java comments from tests
fd158342bc14c25db7aa7faba881e555012102a7 source-git-repo -> add tests
c7e626fafd5ad552f8f4ca0b0a038e8753498d89 source-git-repo -> add build.gradle
058a0331d922654b0c5209c3b3e7c227226a763f source-git-repo -> remove redundant comments
6b85c90fbfd5ae978946e117b01188b99a487a86 source-git-repo -> add simple App
238ebaae28d50f0a5598ea6c82a2f10f47552a66 target-git-repo -> remove gradle and simplify repo
a00279e8ab44f3623e9b79f8915ffe1592709e38 target-git-repo -> init cpp app skeleton
Moving a directory while excluding specific files or subdirectories
Let’s say you want to move source-git-repo/app into target-git-repo/app, but you don’t want source-git-repo/app/src/test to be included in the target repository.
This can be achieved by adding an exclude pathspec during patch creation, as follows:
cd source-git-repo
git log --pretty=email --reverse --binary --full-index \
--patch-with-stat --first-parent -m \
-- app ':!app/src/test' > history.patch
Here, -- app ':!app/src/test' means: include all changes related to the app directory, but exclude all changes under the app/src/test path.
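Before generating the patch, it can help to preview which commits the pathspec selects. The sketch below does that on a throwaway repository with two hypothetical commits, one touching only app/src/main and one touching only app/src/test. The file names, commit messages, and Demo identity are assumptions for illustration. It also uses ':(exclude)', the long form of the ':!' shorthand.

```shell
#!/bin/sh
# Previews which commits an exclude pathspec selects.
# File names, commit messages, and identity are hypothetical.
set -eu
work=$(mktemp -d)
export GIT_AUTHOR_NAME=Demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=Demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q "$work/source"
cd "$work/source"
mkdir -p app/src/main app/src/test
echo "class App {}" > app/src/main/App.java
git add app
git commit -q -m "add simple App"
echo "class AppTest {}" > app/src/test/AppTest.java
git add app
git commit -q -m "add tests"

# Short form: everything under app except app/src/test.
included=$(git log --reverse --first-parent --format=%s \
    -- app ':!app/src/test')

# Long form of the same exclusion.
included_long=$(git log --reverse --first-parent --format=%s \
    -- app ':(exclude)app/src/test')

echo "$included"
```

Only commits that touch at least one non-excluded path are listed. A commit that changes files both inside and outside app/src/test is still listed, but its diff in the patch covers only the non-excluded files.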
Apply the patch:
cd ../target-git-repo
git am --committer-date-is-author-date \
< ../source-git-repo/history.patch
The structure of the target repository includes everything from source-git-repo/app, excluding all files and subdirectories under source-git-repo/app/src/test:
target-git-repo
├── README.md
└── app
    ├── docs
    │   └── HOWTO.txt
    └── src
        └── main
            ├── cpp
            │   └── app.cpp
            ├── headers
            │   └── app.h
            └── java
                └── App.java
Summary
The approach to move files while preserving their change history, as described in this post, is one of many. It is not ideal and has its own drawbacks:
- It may be challenging to create a patch for files with a very large change history.
- The change history for moved files is applied on top of the existing history in the target repository. This can result in a lengthy list of commits related to the moved files on top of the current history, making it challenging to navigate the current history.
However, it gets the job done and can be considered a good starting point. If you know of a better approach, I would be eager to learn more. Please share in the comments below.