I worked on this problem for an email scanning system, and I can say that the lexical properties of a URL carry very little signal about maliciousness, especially under the constraints you are imposing.
It's true that malicious URLs often "look random", but that impression comes mostly from familiarity: your experience has turned "imgur.com/gallery/lBKRZ" into "harmless image server gallery", while "is1.ecds.girfc.com/ljbm17vkel" looks scarily nonsensical... until you learn that it is Image Server 1 on the East Coast Data Store for Getty Images Royalty Free Collection.
It is possible to assign heuristic scores based solely on the URL string, but in practice the weight given to the URL tends to be so small that it fades into inconsequentiality next to content heuristics. For instance, take this URL:
super-zakonym.ru
What's the alarming part of this URL? The mix of English and Russian? The fact that it translates to "Super Legit"? The fact that the Russian is misspelled?
Or is it simply that it is on the .ru TLD?
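
To make the weighting point concrete, here is a minimal Python sketch of a URL-only lexical score folded into a combined score. Everything in it is an assumption made up for illustration: the feature set, the caps, the SUSPECT_TLDS placeholder, and the URL_WEIGHT / CONTENT_WEIGHT split are hypothetical, not values from the system I worked on, and content_score stands in for whatever content heuristics you already have.

```python
import math
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical weights, picked only to illustrate the point; not tuned on data.
URL_WEIGHT = 0.05
CONTENT_WEIGHT = 0.95
SUSPECT_TLDS = {"ru"}  # assumption: a placeholder set, not a real blocklist


def shannon_entropy(text: str) -> float:
    """Bits per character of the string; 0.0 for an empty string."""
    if not text:
        return 0.0
    total = len(text)
    counts = Counter(text)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def lexical_features(url: str) -> dict:
    """Extract a few purely lexical features from a URL string."""
    # urlsplit only populates the hostname when a scheme or '//' is present.
    if "//" not in url:
        url = "//" + url
    host = urlsplit(url).hostname or ""
    return {
        "host_entropy": shannon_entropy(host),
        "hyphens_in_host": host.count("-"),
        "subdomain_depth": max(host.count(".") - 1, 0),
        "suspect_tld": host.rsplit(".", 1)[-1] in SUSPECT_TLDS,
    }


def lexical_score(features: dict) -> float:
    """Map the features to a score in [0, 1] using arbitrary caps."""
    return (
        0.4 * min(features["host_entropy"] / 4.0, 1.0)
        + 0.2 * min(features["hyphens_in_host"] / 3.0, 1.0)
        + 0.2 * min(features["subdomain_depth"] / 3.0, 1.0)
        + 0.2 * (1.0 if features["suspect_tld"] else 0.0)
    )


def combined_score(url: str, content_score: float) -> float:
    """Blend the URL score with a content score supplied by the caller."""
    return URL_WEIGHT * lexical_score(lexical_features(url)) + CONTENT_WEIGHT * content_score


if __name__ == "__main__":
    for url in ("imgur.com/gallery/lBKRZ",
                "is1.ecds.girfc.com/ljbm17vkel",
                "super-zakonym.ru"):
        print(url, lexical_features(url), round(combined_score(url, 0.0), 4))
```

With a split like this, the URL can never move the combined score by more than URL_WEIGHT, so the verdict is effectively decided by the content heuristics, which is how things tended to shake out in practice.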