Google search result short summary

-5

Intro

When you search on Google, each result is shown with a short sample of text from the matching webpage.

For example, if you search for "Madonna greatest vinyl", Google shows a one-line link and, below it, a short excerpt from the matching webpage:

Madonna Greatest Hits Records, LPs, Vinyl and CDs
Madonna - Greatest Hits Volume 2, Madonna, Greatest Hits ... vinyl Is Fully Restored To As Near New Condition As Possible. Shipping & Multiple Order D..

Task

Imagine you work for Google and have to write a program or function that takes:

  • a string containing many words (the webpage content)
  • a list of search words (at least 3)

and returns the shortest excerpt of the given string (webpage) that contains all the search words.

Example

Given this webpage content:

This document describes Session Initiation Protocol (SIP), an application-layer
 control (signaling) protocol for creating, modifying, and terminating
 sessions with one or more participants. These sessions include 
 Internet telephone calls, multimedia distribution, and multimedia conferences.

and these search words:

calls, sessions, internet

the program should return:

sessions include Internet telephone calls
This is the shortest substring containing all 3 search words. One other substring also contains them, "sessions with one or more participants. These sessions include Internet telephone calls", but it is longer, so it is discarded.

Rules

  • If the string is empty, return an empty string
  • If not all search words are found in the given string, return an empty string
  • The search ignores letter case
  • At least 3 words must be specified for the search
  • The returned string may contain the search words in a different order than specified

Challenge

Write the fastest code. It's for Google, right? Remember that repeated string comparison is very expensive.
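
One possible way to avoid repeated string comparisons is a sliding window over the page's words. The sketch below is only an illustration under stated assumptions, not a reference solution: the function name `shortest_excerpt` is made up here, words are assumed to be runs of `\w` characters, and "shortest" is assumed to mean fewest characters. The page is tokenized once, and each word is then visited at most twice.

```python
import re
from collections import Counter


def shortest_excerpt(text, words):
    """Return the shortest slice of `text` that contains every word in `words`.

    Assumptions (not stated in the challenge): a word is a run of \\w
    characters, and "shortest" is measured in characters.
    """
    needed = {w.lower() for w in words}
    if not text or not needed:
        return ""

    # Tokenize once, keeping each word's character span in the original text.
    tokens = [(m.group().lower(), m.start(), m.end())
              for m in re.finditer(r"\w+", text)]

    counts = Counter()      # occurrences of needed words inside the window
    missing = len(needed)   # needed words not yet inside the window
    best = None             # (start, end) character span of the best window
    left = 0

    for word, _, end in tokens:
        if word in needed:
            counts[word] += 1
            if counts[word] == 1:
                missing -= 1
        # Shrink from the left while the window still covers every word.
        while missing == 0:
            left_word, left_start, _ = tokens[left]
            if best is None or end - left_start < best[1] - best[0]:
                best = (left_start, end)
            if left_word in needed:
                counts[left_word] -= 1
                if counts[left_word] == 0:
                    missing += 1
            left += 1

    return text[best[0]:best[1]] if best else ""


page = ("This document describes Session Initiation Protocol (SIP), an "
        "application-layer control (signaling) protocol for creating, "
        "modifying, and terminating sessions with one or more participants. "
        "These sessions include Internet telephone calls, multimedia "
        "distribution, and multimedia conferences.")
print(shortest_excerpt(page, ["calls", "sessions", "internet"]))
```

Lowercasing each token once and comparing against a set keeps the per-word cost constant, so the running time grows roughly linearly with the number of words on the page.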

Filip Stachowiak

Posted 2017-02-23T12:18:24.390

Reputation: 1

Question was closed 2017-02-23T12:42:07.627

1Google doesn't always show a sample. Next time please post to the sandbox first. – fəˈnɛtɪk – 2017-02-23T12:19:46.763

1This looks like it could be a good challenge, but you need to be more specific about the task, e.g. by defining possible inputs and outputs. Will there always be one text and three search words? Are the search words arbitrary strings? What should be returned if not all search words are present in the text? – Laikoni – 2017-02-23T12:41:58.693

So... what's the winning criteria? – James – 2017-02-23T19:46:42.813

I could maybe require to do it in one loop, but that would be a hint. – Filip Stachowiak – 2017-02-25T13:57:32.120

Answers

1

05AB1E, 13 bytes

Œévyl²l#åPiyq

Explanation:

Œé            # Get substrings sorted by shortest first
  vyl         # For each substring (in lowercase)...
     ²l#      # Split the searched text (in lowercase) on spaces
        åP    # Check if each word of the searched text is in the substring
          iyq # If so, print the substring and terminate the program

Try it online!

It will only work for one line of input.
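
For readers who do not know 05AB1E, the approach described in the explanation can be sketched roughly in Python. This is a loose rendering of the idea, not a translation of the program: the function name `shortest_excerpt_bruteforce` is invented here, and the search words are assumed to arrive as one space-separated string.

```python
def shortest_excerpt_bruteforce(text, query):
    """Try every substring of `text`, shortest first, and return the first
    one that contains every space-separated word of `query`.

    Matching is case-insensitive and works on raw substrings, so a search
    word may also match inside a longer word.
    """
    words = query.lower().split()
    n = len(text)
    for length in range(1, n + 1):             # shortest substrings first
        for start in range(n - length + 1):    # every start position
            candidate = text[start:start + length]
            lowered = candidate.lower()
            if all(word in lowered for word in words):
                return candidate
    return ""


sample = ("Sessions with participants. These sessions include "
          "Internet telephone calls, multimedia")
print(shortest_excerpt_bruteforce(sample, "calls sessions internet"))
```

A string of n characters has n(n+1)/2 non-empty contiguous substrings, so this checks a quadratic number of candidates. That is fine for golfing but slow for long pages.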

Okx

Posted 2017-02-23T12:18:24.390

Reputation: 15 025

It's very inefficient. Do you know how many substrings there will be if there are n words? I guess n!. And generating all of them will be time-consuming. – Filip Stachowiak – 2017-02-25T13:19:40.023

@FilipStachowiak This is [tag:code-golf], right? Not [tag:fastest-code]. I'm not entirely sure because there is no tag on your question for the winning criterion. – Okx – 2017-02-25T13:23:01.680

Actually I removed the code-golf tag when I understood its meaning. There are no special winning criteria for this task. I think it's interesting as-is. – Filip Stachowiak – 2017-02-25T13:56:46.570

@FilipStachowiak that's exactly why it's closed as unclear what you're asking. The rules of PPCG state that every challenge must have an objective winning criterion. – Okx – 2017-02-25T14:07:50.050

Great! I'll figure sth out. – Filip Stachowiak – 2017-02-25T14:10:56.143