Problem
You are building a news collection app that fetches articles from various sources and presents them in a summarized form to users, so they can quickly grasp the main points without having to read the full article. One important feature of your app could be to remove duplicate words or phrases from these summaries while still retaining as much information as possible.
Example 1:
Input: s = “xyzxyzyy” Output: 3 Explanation: The answer is “xyz”, with the length of 3.
Example 2:
Input: s = “qqqqq” Output: 1 Explanation: The answer is “q”, with the length of 1.
Example 3:
Input: s = “xauxauu” Output: 3 Explanation: The answer is “xau”, with the length of 3. Notice that the answer must be a substring, “xau” is a subsequence and not a substring.
Constraints:
0 <= s.length <= 5 * 104
s
consists of English letters, digits, symbols and spaces.
Java Solution
class Solution {
public int lengthOfLongestSubstring(String s) {
int ans = 0;
int[] count = new int[128];
for (int l = 0, r = 0; r < s.length(); ++r) {
++count[s.charAt(r)];
while (count[s.charAt(r)] > 1)
--count[s.charAt(l++)];
ans = Math.max(ans, r - l + 1);
}
return ans;
}
}
The algorithm uses a sliding window with two pointers, left and right, to iterate through the string. It also uses a set to store the unique characters in the current window.
- Initialize left and right pointers to the start of the string, and maxLength to 0.
- Check if the character at the right index is in the set.
- If it’s not in the set, add the character to the set, update maxLength, and move the right pointer forward.
- If it’s in the set, remove the character at the left index from the set, and move the left pointer forward.
- Repeat step 2 until the right pointer reaches the end of the string.
- Return maxLength.
The algorithm runs in O(n) time, where n is the length of the input string.
C++ Solution
class Solution {
public:
int lengthOfLongestSubstring(string s) {
int ans = 0;
vector<int> count(128);
for (int l = 0, r = 0; r < s.length(); ++r) {
++count[s[r]];
while (count[s[r]] > 1)
--count[s[l++]];
ans = max(ans, r - l + 1);
}
return ans;
}
};
The algorithm uses a sliding window with two pointers, left and right, to iterate through the string. It also uses a set to store the unique characters in the current window.
- Initialize left and right pointers to the start of the string, and maxLength to 0.
- Check if the character at the right index is in the set.
- If it’s not in the set, add the character to the set, update maxLength, and move the right pointer forward.
- If it’s in the set, remove the character at the left index from the set, and move the left pointer forward.
- Repeat step 2 until the right pointer reaches the end of the string.
- Return maxLength.
The algorithm runs in O(n) time, where n is the length of the input string.
Python Solution
class Solution:
def lengthOfLongestSubstring(self, s: str) -> int:
ans = 0
count = collections.Counter()
l = 0
for r, c in enumerate(s):
count[c] += 1
while count[c] > 1:
count[s[l]] -= 1
l += 1
ans = max(ans, r - l + 1)
return ans
The algorithm uses a sliding window with two pointers, left and right, to iterate through the string. It also uses a set to store the unique characters in the current window.
- Initialize left and right pointers to the start of the string, and maxLength to 0.
- Check if the character at the right index is in the set.
- If it’s not in the set, add the character to the set, update maxLength, and move the right pointer forward.
- If it’s in the set, remove the character at the left index from the set, and move the left pointer forward.
- Repeat step 2 until the right pointer reaches the end of the string.
- Return maxLength.
The algorithm runs in O(n) time, where n is the length of the input string.
JavaScript Solution
function lengthOfLongestSubstring(s) {
let left = 0, right = 0, maxLength = 0;
const characters = new Set();
while (right < s.length) {
if (!characters.has(s.charAt(right))) {
characters.add(s.charAt(right));
maxLength = Math.max(maxLength, right - left + 1);
right++;
} else {
characters.delete(s.charAt(left));
left++;
}
}
return maxLength;
}
The algorithm uses a sliding window with two pointers, left and right, to iterate through the string. It also uses a set to store the unique characters in the current window.
- Initialize left and right pointers to the start of the string, and maxLength to 0.
- Check if the character at the right index is in the set.
- If it’s not in the set, add the character to the set, update maxLength, and move the right pointer forward.
- If it’s in the set, remove the character at the left index from the set, and move the left pointer forward.
- Repeat step 2 until the right pointer reaches the end of the string.
- Return maxLength.
The algorithm runs in O(n) time, where n is the length of the input string.
For example, consider an article summary about a new scientific discovery: “A team of scientists has discovered a new form of matter that exhibits unusual properties. This groundbreaking research was conducted at the Large Hadron Collider, which is located near Geneva, Switzerland. The team’s findings suggest that this newfound matter could revolutionize our understanding of subatomic particles and their interactions.”
Your text summarization algorithm should process this summary and output an optimized version like: “A team of scientists has discovered a new form of matter with unusual properties. This groundbreaking research was conducted at the Large Hadron Collider near Geneva, Switzerland.” In this example, redundant words (like “a”, “scientists”) have been removed while preserving essential information.
In essence, the real-world problem is about creating concise and readable summaries without losing critical content by using algorithms to find the longest possible sequences of unique words in the given text. This mirrors the original algorithmic problem, where the goal was to identify the longest substring without repeating characters within a string.