Combining Repositories

posted by Steve Losh on November 17, 2009

Thomas’ last tip about decomposing repositories made me want to write about the inverse: combining two separate repositories into a single repository.

Let’s say you’re working on a project. The code lives in project while the documentation lives in docs. You might decide that it would be nice to have them in the same repository, so that anyone who clones your project’s code gets the documentation too.

You could just create a new repository and copy all the data into it, but that wouldn’t let you keep your nice history of changesets. Let’s take a look at how to combine these separate repositories.

Create the New, Combined Repository

First we’ll create the new repository where everything is going to live:

$ ls
total 24
drwxr-xr-x  5 sjl   170B Nov 17 19:56 docs
drwxr-xr-x  4 sjl   136B Nov 17 19:58 project

$ mv project project-code

$ hg init project

$ ls
total 24
drwxr-xr-x  5 sjl   170B Nov 17 19:56 docs
drwxr-xr-x  3 sjl   102B Nov 17 20:04 project
drwxr-xr-x  4 sjl   136B Nov 17 19:58 project-code

We’ve moved project to project-code to get it out of the way for the moment.

Prepare Each Repository

We probably don’t want to just dump everything into the root folder of the new repository, so let’s move things around a bit. First we’ll move everything in the project repository into a src/ directory:

$ cd project-code

$ mkdir src

$ mv * src
mv: rename src to src/src: Invalid argument

$ ls
total 0
drwxr-xr-x  3 sjl   102B Nov 17 20:08 src

$ hg addremove --similarity 100
removing myproject.py
adding src/myproject.py
recording removal of myproject.py as rename to src/myproject.py (100% similar)

$ hg commit -m 'Move the code into the src/ directory.'

Now we’ll do the same for the docs repository:

$ cd ../docs

$ mkdir docs

$ mv * docs
mv: rename docs to docs/docs: Invalid argument

$ ls
total 0
drwxr-xr-x  4 sjl   136B Nov 17 20:11 docs

$ hg addremove --similarity 100
removing LICENSE
removing README
adding docs/LICENSE
adding docs/README
recording removal of LICENSE as rename to docs/LICENSE (100% similar)
recording removal of README as rename to docs/README (100% similar)

$ hg commit -m 'Move the documentation into the docs/ directory.'
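The mv errors above are harmless: the * glob also matches the freshly created target directory, so mv tries (and fails) to move it into itself. If the noise bothers you, here is a minimal sketch of a quieter move. The move_into helper is a hypothetical name, not something Mercurial provides, and the demo runs in a scratch directory. Like mv *, it skips dotfiles, so .hg stays where it is (and so would a .hgignore, which you would need to move by hand).

```shell
# Hypothetical helper: move every top-level entry into a target
# directory, skipping the target itself so mv never complains.
move_into() {
    target="$1"
    mkdir -p "$target"
    for entry in *; do
        # Skip the target directory so it is not moved into itself.
        [ "$entry" = "$target" ] && continue
        # Skip the literal '*' the glob leaves behind in an empty directory.
        [ -e "$entry" ] || continue
        mv -- "$entry" "$target"/
    done
}

# Demo in a scratch directory so nothing real is touched.
demo=$(mktemp -d)
cd "$demo"
touch LICENSE README
move_into docs
ls docs
```

In a real repository you would still follow this with hg addremove --similarity 100 so that Mercurial records the moves as renames.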

Pull Both Repositories

Now that we’ve adjusted the structure to our liking, we need to pull both repositories into project:

$ cd ..

$ ls
total 24
drwxr-xr-x  4 sjl   136B Nov 17 20:12 docs
drwxr-xr-x  3 sjl   102B Nov 17 20:04 project
drwxr-xr-x  4 sjl   136B Nov 17 20:08 project-code

$ cd project

$ ls

$ hg pull --update ../project-code
pulling from ../project-code
requesting all changes
adding changesets
adding manifests
adding file changes
added 4 changesets with 4 changes to 2 files
1 files updated, 0 files merged, 0 files removed, 0 files unresolved

$ ls
total 0
drwxr-xr-x  3 sjl   102B Nov 17 20:15 src

$ hg pull --force --update ../docs
pulling from ../docs
searching for changes
warning: repository is unrelated
adding changesets
adding manifests
adding file changes
added 3 changesets with 4 changes to 4 files (+1 heads)
not updating, since new heads added
(run 'hg heads' to see heads, 'hg merge' to merge)

$ ls
total 0
drwxr-xr-x  3 sjl   102B Nov 17 20:15 src

Notice that we used the --force flag with the second hg pull. Without it, Mercurial refuses to pull from an unrelated repository. The flag tells Mercurial: “It’s okay, I know what I’m doing, I really do want to combine these repositories.”

Merge the Repositories

Take a look at the output of the last ls command. See how there’s still just the src/ directory? We need to merge the two repositories together to get a truly “combined” repository.

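To make this a bit clearer, let’s look at the graph of our new repository (hg glog comes from the graphlog extension; newer versions of Mercurial show the same graph with hg log -G):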

$ hg glog
o  6 Move the documentation into the docs/ directory. (7 minutes ago by Steve Losh) tip
|
o  5 Add a LICENSE file. (22 minutes ago by Steve Losh)
|
o  4 Add a README file. (22 minutes ago by Steve Losh)

@  3 Move the code into the src/ directory. (10 minutes ago by Steve Losh)
|
o  2 Fix a bug. (21 minutes ago by Steve Losh)
|
o  1 Implement some basic functionality. (21 minutes ago by Steve Losh)
|
o  0 Start the project. (21 minutes ago by Steve Losh)

See how we have two separate graphs? Changesets 0 to 3 are linked, and changesets 4 to 6 are linked. Now we need to merge these two graphs together. This should be trivial because we’ve organized the folder structure beforehand so there won’t be any conflicts:

$ hg update 6
2 files updated, 0 files merged, 1 files removed, 0 files unresolved

$ hg merge 3
1 files updated, 0 files merged, 0 files removed, 0 files unresolved
(branch merge, don't forget to commit)

$ hg commit -m 'Combine the source and docs repositories.'
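The merge is only trivial if no path exists in both repositories. For larger projects, a quick check along these lines can confirm that before you pull anything. This is a rough sketch using plain find and comm rather than Mercurial itself: list_files and overlapping_paths are hypothetical helper names, and the check looks at files on disk (ignoring .hg), not at what Mercurial actually tracks.

```shell
# Hypothetical helper: list files under a root, relative to it, ignoring .hg.
list_files() {
    ( cd "$1" && find . -path ./.hg -prune -o -type f -print | LC_ALL=C sort )
}

# Hypothetical helper: print every relative path present in both trees.
overlapping_paths() {
    first=$(mktemp)
    second=$(mktemp)
    list_files "$1" > "$first"
    list_files "$2" > "$second"
    # comm -12 keeps only the lines common to both sorted lists.
    LC_ALL=C comm -12 "$first" "$second"
    rm -f "$first" "$second"
}

# Demo with stand-ins for the reorganised repositories from this post.
trees=$(mktemp -d)
mkdir -p "$trees/project-code/src" "$trees/docs/docs"
touch "$trees/project-code/src/myproject.py"
touch "$trees/docs/docs/LICENSE" "$trees/docs/docs/README"

overlap=$(overlapping_paths "$trees/project-code" "$trees/docs")
if [ -z "$overlap" ]; then
    echo "no overlapping paths"
else
    echo "paths present in both trees:"
    echo "$overlap"
fi
```

An empty result means the two trees never touch the same path, which is what makes the merge conflict-free.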

Now take a look at the graph:

$ hg glog
@    7 Combine the source and docs repositories. (37 seconds ago by Steve Losh) tip
|\
| o  6 Move the documentation into the docs/ directory. (11 minutes ago by Steve Losh)
| |
| o  5 Add a LICENSE file. (26 minutes ago by Steve Losh)
| |
| o  4 Add a README file. (27 minutes ago by Steve Losh)
|
o  3 Move the code into the src/ directory. (15 minutes ago by Steve Losh)
|
o  2 Fix a bug. (25 minutes ago by Steve Losh)
|
o  1 Implement some basic functionality. (25 minutes ago by Steve Losh)
|
o  0 Start the project. (26 minutes ago by Steve Losh)

$ ls
total 0
drwxr-xr-x  4 sjl   136B Nov 17 20:22 docs
drwxr-xr-x  3 sjl   102B Nov 17 20:22 src

Our two separate repositories are now nicely merged into one, with the changesets intact! Now we can delete the old repositories and push the new one to our public server for people to use.

This is especially nice because we haven’t changed the changeset hashes, which means that we can easily merge changesets from people that are still using the old, separate repositories.
