Old post, but I faced the same problem recently.
The regex ^(https|git)(:\/\/|@)([^\/:]+)[\/:]([^\/:]+)\/(.+)(.git)*$ works for the three types of URL (git://, https:// and git@).
#!/bin/bash
# url="git://github.com/some-user/my-repo.git"
# url="https://github.com/some-user/my-repo.git"
url="git@github.com:some-user/my-repo.git"
re="^(https|git)(:\/\/|@)([^\/:]+)[\/:]([^\/:]+)\/(.+)(.git)*$"
if [[ $url =~ $re ]]; then
    protocol=${BASH_REMATCH[1]}
    separator=${BASH_REMATCH[2]}
    hostname=${BASH_REMATCH[3]}
    user=${BASH_REMATCH[4]}
    # (.+) is greedy and also captures a trailing ".git", so strip it here.
    repo=${BASH_REMATCH[5]%.git}
fi
Explanation (see it in action on regex101):
^ matches the start of a string
(https|git) matches and captures the string https or git
(:\/\/|@) matches and captures the characters :// or @
([^\/:]+) matches and captures one or more characters that are neither / nor :
[\/:] matches one character that is / or :
([^\/:]+) matches and captures one or more characters that are neither / nor :, yet again
\/ matches the character /
(.+) matches and captures one character or more
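(.git)* is meant to match an optional .git suffix at the end, but because (.+) is greedy the suffix is actually captured by the previous group (the unescaped . also matches any character, not just a dot)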
$ matches the end of a string
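To check the pattern against all three URL forms at once, here is a minimal sketch. The function name parse_git_url is my own (hypothetical), and I've assumed it's fine to drop the unneeded backslashes and the (.git)* group, stripping the suffix with parameter expansion instead:

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script):
# prints "protocol hostname user repo" for a matching URL, returns 1 otherwise.
parse_git_url() {
    local url=$1
    # Same structure as the regex above, without the escaped slashes.
    local re='^(https|git)(://|@)([^/:]+)[/:]([^/:]+)/(.+)$'
    if [[ $url =~ $re ]]; then
        # The last group is greedy, so remove a trailing ".git" if there is one.
        local repo=${BASH_REMATCH[5]%.git}
        printf '%s %s %s %s\n' \
            "${BASH_REMATCH[1]}" "${BASH_REMATCH[3]}" "${BASH_REMATCH[4]}" "$repo"
    else
        return 1
    fi
}

for url in \
    "git://github.com/some-user/my-repo.git" \
    "https://github.com/some-user/my-repo.git" \
    "git@github.com:some-user/my-repo.git"; do
    parse_git_url "$url"
done
```

Each of the three URLs should print github.com, some-user and my-repo after its protocol, with the .git suffix removed.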
This is far from perfect, since something like https@github.com:some-user/my-repo.git would also match, but I think it's good enough for extraction.
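If that false positive matters, one possible tightening is to tie the separator to the protocol by using two patterns. This is only a sketch under my own assumptions: the name parse_git_url_strict is hypothetical, the SSH user is assumed to always be git, and ports in the host part are not handled.

```shell
#!/bin/bash
# Hypothetical stricter variant (an illustration, not the regex from above):
# prints "protocol hostname user repo", returns 1 for anything else.
parse_git_url_strict() {
    local url=$1
    local url_re='^(https|git)://([^/:]+)/([^/:]+)/(.+)$'  # scheme://host/user/repo
    local ssh_re='^(git)@([^/:]+):([^/:]+)/(.+)$'          # git@host:user/repo
    # BASH_REMATCH holds the groups of whichever pattern matched.
    if [[ $url =~ $url_re || $url =~ $ssh_re ]]; then
        printf '%s %s %s %s\n' \
            "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" \
            "${BASH_REMATCH[3]}" "${BASH_REMATCH[4]%.git}"
    else
        return 1
    fi
}

parse_git_url_strict "git@github.com:some-user/my-repo.git"
parse_git_url_strict "https@github.com:some-user/my-repo.git" || echo "rejected"
```

The trade-off is two patterns to maintain instead of one, in exchange for rejecting mixed forms such as https@host:user/repo.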