Substring with Concatenation of All Words
Problem Description
You are given a string s and an array of strings words, where all the words have the same length. Find every starting index of a substring of s that is a concatenation of all the words, each used exactly once, in any order, with no intervening characters. The indices may be returned in any order.
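To make the definition concrete, here is a minimal sketch of the test a single candidate substring must pass. The helper name is_concatenation is hypothetical (it is not part of the solution below), and the sketch assumes every word has the same length.

```python
from collections import Counter
from typing import List


def is_concatenation(window: str, words: List[str]) -> bool:
    """Return True if `window` is all of `words` joined in some order."""
    word_len = len(words[0])  # assumption: all words share one length
    if len(window) != word_len * len(words):
        return False
    # Cut the window into consecutive word-sized chunks.
    chunks = [window[i:i + word_len] for i in range(0, len(window), word_len)]
    # Order does not matter, so compare the two multisets of words.
    return Counter(chunks) == Counter(words)


print(is_concatenation("barfoo", ["foo", "bar"]))  # True
print(is_concatenation("foobar", ["foo", "bar"]))  # True
print(is_concatenation("foothe", ["foo", "bar"]))  # False
```

Comparing multisets rather than sets matters when words contains duplicates: a word listed twice must appear exactly twice in the window.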
Examples
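Input: s = "barfoothefoobarman", words = ["foo","bar"]
Output: [0,9]
Explanation: The substrings starting at indices 0 and 9 are "barfoo" and "foobar".
Input: s = "wordgoodgoodgoodbestword", words = ["word","good","best","word"]
Output: []
Explanation: No 16-character substring of s contains "word" twice and "good" and "best" once each.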
Constraints
1 <= s.length <= 10^4
s consists of lower-case English letters.
1 <= words.length <= 5000
1 <= words[i].length <= 30
words[i] consists of lower-case English letters.
Approach to Solve
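Because every word has the same length, any valid substring has a fixed length of len(words) * len(words[0]). Slide a window of that length across s. At each start index, split the window into word-sized chunks and compare their counts against a frequency map built from words.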
Code Implementation
from typing import List


class Solution:
    def findSubstring(self, s: str, words: List[str]) -> List[int]:
        if not s or not words:
            return []
        word_len = len(words[0])  # all words share the same length
        word_count = len(words)
        total_len = word_len * word_count
        # How many times each word must appear in a valid window.
        word_freq = {}
        for word in words:
            word_freq[word] = word_freq.get(word, 0) + 1
        result = []
        for i in range(len(s) - total_len + 1):
            seen = {}
            for j in range(word_count):
                start = i + j * word_len
                word = s[start:start + word_len]
                if word in word_freq:
                    seen[word] = seen.get(word, 0) + 1
                    if seen[word] > word_freq[word]:
                        break  # word used more often than allowed
                else:
                    break  # chunk is not one of the words
            else:
                # Inner loop finished without a break: every chunk matched.
                result.append(i)
        return result
Explanation
This solution slides a fixed-length window over s and checks whether each window is a concatenation of all the words. Here's a detailed explanation of the algorithm:
Handle empty input: If s or words is empty, return an empty list.
Count the words: Build word_freq, a map from each word to the number of times it appears in words, so that duplicate words are handled correctly.
Iterate through s: Try every start index i from 0 to len(s) - total_len, where total_len = word_len * word_count is the length of any valid substring.
Check the window: Split s[i:i + total_len] into word_count chunks of length word_len and count each chunk in seen. Stop early if a chunk is not in word_freq or appears more often than word_freq allows.
Record the index: If all word_count chunks pass, the inner loop ends without a break, its else clause runs, and i is appended to result.
Because each window holds exactly word_count chunks and no chunk exceeds its allowed count, every word is used exactly once, so every recorded index starts a valid concatenation.
Complexity
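- Time Complexity: O(n * m * k), where n is the length of s, m is the number of words, and k is the length of each word. Each of the up to n start positions slices and hashes up to m chunks of length k.
- Space Complexity: O(m * k), since word_freq and seen each hold at most m distinct words of length k (O(m) map entries).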
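The time bound can be reduced with a different technique from the solution above: a true sliding window run once per alignment offset 0..k-1, which reuses counts between neighbouring windows instead of rebuilding them. The following is a sketch under the same assumption that all words share one length; the function name find_substring_by_offset is hypothetical. Each offset scans s once, so the total work is roughly O(n * k).

```python
from collections import Counter
from typing import List


def find_substring_by_offset(s: str, words: List[str]) -> List[int]:
    if not s or not words:
        return []
    k = len(words[0])      # assumption: all words share this length
    m = len(words)
    need = Counter(words)  # required count of each word
    result = []
    for offset in range(k):
        left = offset      # start of the current window
        seen = Counter()   # counts of words inside the window
        used = 0           # number of words inside the window
        for right in range(offset, len(s) - k + 1, k):
            word = s[right:right + k]
            if word not in need:
                # An unknown chunk invalidates every window containing it.
                seen.clear()
                used = 0
                left = right + k
                continue
            seen[word] += 1
            used += 1
            # Shrink from the left until this word is no longer over-used.
            while seen[word] > need[word]:
                seen[s[left:left + k]] -= 1
                used -= 1
                left += k
            if used == m:
                result.append(left)
                # Drop the leftmost word so the window can slide on.
                seen[s[left:left + k]] -= 1
                used -= 1
                left += k
    return sorted(result)


print(find_substring_by_offset("barfoothefoobarman", ["foo", "bar"]))  # [0, 9]
```

The trade-off is extra bookkeeping: the window must be shrunk or reset explicitly, whereas the per-index check above is easier to verify by inspection.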