In our company we have a huge code base (more than 100,000 files), so we keep it in several git repositories. The result is a forest of repositories plus one super repository on top that contains nothing but submodule references.
The idea is to use the super repository purely as convenience glue and to update it automatically whenever a developer updates any submodule.
I have experimented with the post-receive hook and ended up with the following implementation (it uses git plumbing so that it can modify the bare repository directly):
#!/bin/bash -e
UPDATED_BRANCHES="^(master|develop)$"
UPDATED_REPOS="^submodules/.+$"
# determine which branch was modified (only the first pushed ref is examined)
read -r REV_OLD REV_NEW FULL_REF
BRANCH=${FULL_REF##refs/heads/}
if [[ "${BRANCH}" =~ ${UPDATED_BRANCHES} ]] && [[ "${GL_REPO}" =~ ${UPDATED_REPOS} ]];
then
    # the branch in the super repository has the same name as the pushed branch
    SUPERBRANCH=$FULL_REF
    SUBMODULE_NAME=${GL_REPO##submodules/}
    # clean the submodule repo related environment
    unset $(git rev-parse --local-env-vars)
    # move to the super repository (SUPERREPO_DIR is expected to be set in the environment)
    cd "$SUPERREPO_DIR"
    echo "Automatically updating the '$SUBMODULE_NAME' reference in the super repository..."
    # modify the index - replace the submodule reference hash
    # (the mode of a submodule entry is 160000; the path is anchored with $ so that
    # 'foo' does not also match 'foobar'; \t requires GNU sed)
    git ls-tree "$SUPERBRANCH" | \
    sed "s/^\([0-7]*\) commit \([0-9a-f]*\)\t$SUBMODULE_NAME\$/\1 commit $REV_NEW\t$SUBMODULE_NAME/" | \
    git update-index --index-info
    # write the tree containing the modified index
    TREE_NEW=$(git write-tree)
    COMMIT_OLD=$(git show-ref --hash "$SUPERBRANCH")
    # write the tree to a new commit and use the current commit as its parent
    COMMIT_NEW=$(echo "Auto-update submodule: $SUBMODULE_NAME" | git commit-tree "$TREE_NEW" -p "$COMMIT_OLD")
    # update the branch reference
    git update-ref "$SUPERBRANCH" "$COMMIT_NEW"
    # shall we also update the HEAD?
    # git symbolic-ref HEAD $SUPERBRANCH
fi
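One thing I am unsure about is the `sed` step: the submodule name is interpolated into a regular expression, so a name containing special characters, or a name that is a prefix of another submodule's name, could match the wrong entry. A variant I have been considering compares the path exactly with `awk` instead. This is only a sketch; the function name `rewrite_gitlink` and the sample listing (hashes and paths) are made up for illustration.

```shell
#!/bin/bash
# Sketch: rewrite one submodule (gitlink) entry in `git ls-tree` output,
# matching the path exactly instead of via a regular expression.

# Replace the commit hash of the gitlink whose path equals $1 with the hash $2.
# Reads ls-tree lines on stdin and writes the (possibly modified) lines to stdout.
rewrite_gitlink() {
    awk -v name="$1" -v rev="$2" '
        BEGIN { FS = OFS = "\t" }      # ls-tree puts a tab between metadata and path
        $2 == name && $1 ~ /^[0-7]+ commit / {
            split($1, meta, " ")       # meta[1] is the mode (160000 for a gitlink)
            $1 = meta[1] " commit " rev
        }
        { print }
    '
}

# Made-up sample listing; the hashes are placeholders, not real objects.
OLD=aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
NEW=bbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb
sample_listing() {
    printf '100644 blob %s\t.gitmodules\n' "$OLD"
    printf '160000 commit %s\tlibfoo\n' "$OLD"
    printf '160000 commit %s\tlibfoobar\n' "$OLD"
}

# Only the 'libfoo' entry should receive the new hash.
sample_listing | rewrite_gitlink libfoo "$NEW"
```

In the hook this would presumably become `git ls-tree "$SUPERBRANCH" | rewrite_gitlink "$SUBMODULE_NAME" "$REV_NEW" | git update-index --index-info`, but I have not tried it on the real server.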
Now the questions are:
- Is it a good idea at all to use a git hook to modify a repository other than the one that triggered the event?
- Is the hook implementation OK? (It seems to work on my machine, but I have no prior experience with git plumbing, so I may have overlooked something.)
- I guess race conditions are possible if two or more submodules are updated simultaneously. Can that be prevented somehow, e.g. with a lock file? (We are using gitolite as the access layer.)
- Would it be better to modify a clone of the super repository and then push, as opposed to modifying the bare super repository directly?
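For the lock-file idea, what I have in mind is roughly the following. It is only a sketch: it assumes the `flock` utility is available on the server, and the lock path as well as the names `with_superrepo_lock` and `update_superrepo` are made up for illustration.

```shell
#!/bin/bash
# Sketch: serialize super-repository updates with flock(1).

# Hypothetical lock location; any path writable by the hook user would do.
LOCK_FILE="${TMPDIR:-/tmp}/superrepo-update.lock"

# Run the given command while holding an exclusive lock on $LOCK_FILE.
# The lock is attached to file descriptor 9 and is released when the
# subshell exits, whether or not the command succeeded.
with_superrepo_lock() {
    (
        flock -x 9 || exit 1   # blocks until the lock is available
        "$@"
    ) 9>"$LOCK_FILE"
}

# Stand-in for the read-modify-write sequence on the super repository
# (ls-tree, update-index, write-tree, commit-tree, update-ref).
update_superrepo() {
    echo "updating $1"
}

with_superrepo_lock update_superrepo libfoo
```

The whole sequence from `git ls-tree` to `git update-ref` would have to run inside the lock, since a second hook could otherwise read the old branch tip and overwrite the first update.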
Thanks in advance.