Pattern Finder - Find Repeated Pattern in String

5

Introduction

This is a little challenge with the goal of identifying repeated pattern "blocks" in a string, and outputting them with their frequency of occurence from left to right.

A friend asked me this question and I realized it wasn't as simple as tossing what you've seen into a hash table and popping them off if previously seen.

Example Input and Output

Input:

"TTBBPPPBPBPBPBBBBPBPBPBB"

The idea:

T T | B B | P P P | BP BP BP | B B B B | PB PB PB | B

Output:

2T, 2B, 3P, 3BP, 4B, 3PB, B

However any output format that shows the number of repetitions, and respective repeated substrings, are allowed.

Is it possible to do this with reasonable time complexity?

I'm curious as to the standard, legible approach to this kind of question, but as this is code golf - shortest code wins!

Cheers!

R. Gosman

Posted 2019-01-04T20:21:09.210

Reputation: 51

Question was closed 2019-01-04T21:52:22.070

1Welcome to PPCG!. This seems a good challenge though you need to specify what is the winning criteria. – Luis felipe De jesus Munoz – 2019-01-04T20:24:17.347

Thanks @LuisfelipeDejesusMunoz ! I've set the criteria to be shortest code wins! But I'm hoping to also see understandable code, as I'm trying to learn from this experience myself :) – R. Gosman – 2019-01-04T20:34:52.130

1@R.Gosman Welcome to PPCG, where people abuse code to the fullest extent to golf it! Seriously though, please don't consider PPCG as an example of good practices. :P Also, is this challenge restricted to Python? – Erik the Outgolfer – 2019-01-04T20:36:05.277

4What if several outputs are possible? For instance, what is the expected result for "ABABCABCBC"? – Arnauld – 2019-01-04T20:44:35.093

7...also what about AAABBBAAABBB is it [A3, B3, A3, B3], [AAABBB2], or [[A3, B3]2]? – Jonathan Allan – 2019-01-04T20:52:40.920

As defined, the easiest way to implement this is to just output the number of repetitions of each character. Is the goal to minimize the number of separate blocks? – fəˈnɛtɪk – 2019-01-04T21:02:43.357

1

I've voted to close as unclear, but I think this is a good challenge! As you're new here, you probably haven't heard of The Sandbox, but it's pretty useful for refining challenges to fit our expectations, and for preventing the comments section from blowing up.

– Giuseppe – 2019-01-04T21:21:21.413

Must the number appear at the beginning of each string in the output or can it come at the end? How should we handle inputs that contain digits? (E.G., should 22 output 22?) – Shaggy – 2019-01-04T22:10:45.953

Welcome to PPCG. My suggestion would be that you move this question to the sandbox as suggested by @Giuseppe and temporarily delete it from the main site. There you will be able to get help from other community members on how to make your question into a good challenge and then repost it when you think it is ready. Have fun here anyway. – ElPedro – 2019-01-04T23:00:51.120

Potential duplicate of Detecting loops in in in strings

– Laikoni – 2019-01-05T12:34:32.893

No answers