Old post, but I faced the same problem recently.
The regex ^(https|git)(:\/\/|@)([^\/:]+)[\/:]([^\/:]+)\/(.+)(.git)*$
works for all three URL forms (git://, https:// and git@host:path).
#!/bin/bash
# Uncomment one of the alternatives to test the other URL forms.
# url="git://github.com/some-user/my-repo.git"
# url="https://github.com/some-user/my-repo.git"
url="git@github.com:some-user/my-repo.git"
re="^(https|git)(:\/\/|@)([^\/:]+)[\/:]([^\/:]+)\/(.+)(.git)*$"
if [[ $url =~ $re ]]; then
    protocol=${BASH_REMATCH[1]}
    separator=${BASH_REMATCH[2]}
    hostname=${BASH_REMATCH[3]}
    user=${BASH_REMATCH[4]}
    # (.+) is greedy and also captures a trailing ".git", so strip it here.
    repo=${BASH_REMATCH[5]%.git}
fi
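If you need this in more than one place, here is a minimal sketch that wraps the same regex in a function. The name parse_git_url and the space-separated output format are my own choices for illustration, not anything standard.

```shell
#!/bin/bash
# parse_git_url is a hypothetical helper name used only for this example.
# Prints "protocol hostname user repo" and returns 0 on a match, returns 1 otherwise.
parse_git_url() {
    local url=$1
    local re='^(https|git)(:\/\/|@)([^\/:]+)[\/:]([^\/:]+)\/(.+)(.git)*$'
    if [[ $url =~ $re ]]; then
        # Strip a trailing ".git", since the greedy (.+) keeps it in group 5.
        printf '%s %s %s %s\n' \
            "${BASH_REMATCH[1]}" "${BASH_REMATCH[3]}" \
            "${BASH_REMATCH[4]}" "${BASH_REMATCH[5]%.git}"
    else
        return 1
    fi
}

# Run the helper over the three URL forms from the script above.
for url in \
    "git://github.com/some-user/my-repo.git" \
    "https://github.com/some-user/my-repo.git" \
    "git@github.com:some-user/my-repo.git"; do
    parse_git_url "$url"
done
```

Each of the three URLs should yield the same hostname, user and repo, with only the protocol field differing.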
Explanation (see it in action on regex101):
^ matches the start of the string
(https|git) matches and captures https or git
(:\/\/|@) matches and captures :// or @
([^\/:]+) matches and captures one or more characters that are neither / nor :
[\/:] matches a single character that is / or :
([^\/:]+) matches and captures, yet again, one or more characters that are neither / nor :
\/ matches the character /
(.+) matches and captures one or more characters (it is greedy, so it also takes a trailing .git)
(.git)* matches an optional .git suffix at the end (the unescaped . matches any character)
$ matches the end of the string
This is far from perfect, as something like https@github.com:some-user/my-repo.git
would also match, but I think it's good enough for extraction.
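If that looseness matters to you, one option is to split the pattern in two so each scheme is tied to its own separator. This is only a sketch under my own assumptions: the name parse_git_url_strict is made up, and the scp-style form is reported with protocol "git" to stay in line with what the single regex gives.

```shell
#!/bin/bash
# parse_git_url_strict is a hypothetical helper name used only for this example.
# Prints "protocol hostname user repo" on a match, returns 1 otherwise.
parse_git_url_strict() {
    local url=$1
    # scheme://host/user/repo, where only "://" may follow https or git
    local url_re='^(https|git)://([^/:]+)/([^/:]+)/(.+)$'
    # git@host:user/repo, where only "git@" is accepted and ":" must follow the host
    local scp_re='^git@([^/:]+):([^/:]+)/(.+)$'
    if [[ $url =~ $url_re ]]; then
        printf '%s %s %s %s\n' "${BASH_REMATCH[1]}" "${BASH_REMATCH[2]}" \
            "${BASH_REMATCH[3]}" "${BASH_REMATCH[4]%.git}"
    elif [[ $url =~ $scp_re ]]; then
        printf '%s %s %s %s\n' "git" "${BASH_REMATCH[1]}" \
            "${BASH_REMATCH[2]}" "${BASH_REMATCH[3]%.git}"
    else
        return 1
    fi
}

parse_git_url_strict "git@github.com:some-user/my-repo.git"
parse_git_url_strict "https@github.com:some-user/my-repo.git" || echo "rejected"
```

The trade-off is two patterns instead of one, so the capture group numbers differ between the two branches.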