Longest Substring Without Repeating Characters

 

01: Introduction

If you are learning Data Structures and Algorithms (DSA), you will almost certainly come across the problem β€”
Longest Substring Without Repeating Characters. It is one of the most popular problems
asked in coding interviews at companies like Google, Amazon, Microsoft, and many others.

At first, this problem might sound confusing. But once you understand the idea of a
sliding window, it becomes very straightforward. This guide will walk you through
everything β€” what the problem is, how to think about it, how to write clean Python code,
and what to expect in an interview.

πŸ’¬ Why This Problem Matters

This problem teaches you the sliding window technique β€” a powerful tool for solving many string and
array problems efficiently. Mastering it will help you solve dozens of related interview questions.

02: Problem Statement

πŸ“Œ Official Problem Statement

Given a string s, find the length of the longest substring
without repeating characters
.

What does “substring” mean?

A substring is a continuous sequence of characters within a string.
For example, in the string "abcde", the substrings include
"abc", "bcd", "de", and so on.
A single character like "a" is also a valid substring.

What does “without repeating characters” mean?

It means every character in the substring appears only once.
For example, "abc" has no repeating characters, but "abca"
has 'a' repeating β€” so it is not valid.

Let’s Understand with an Example

Take the string: "abcabcbb"

  • Starting from index 0: "abc" β†’ all unique β†’ length 3 βœ…
  • Continue: "abca" β†’ 'a' repeats β†’ stop
  • Slide the window: "bca" β†’ all unique β†’ length 3 βœ…
  • Continue sliding… the longest we find is still 3

So the answer is 3.

πŸ”‘ Core Idea

We want to find the longest window (a range inside the string)
where no character repeats. We slide this window from left to right β€”
this is exactly the sliding window technique.

03: Input & Output Format

πŸ“₯ Input
  • A single string s
  • Can contain letters, digits, symbols, or spaces
  • Length can be from 0 to 50,000
  • String may be empty
πŸ“€ Output
  • A single integer
  • Represents the length of the longest substring without repeating characters
  • If string is empty, output is 0
πŸ“‹ Sample

Input: "pwwkew"
Output: 3
Reason: "wke" is the longest substring without repeating characters.

04: Flowchart

Here is a visual diagram of the sliding window algorithm used to solve this problem:

Longest Substring Without Repeating Characters flowchart diagram

 

05: Pseudocode

Before writing the actual Python code, here is the pseudocode.
Pseudocode is like plain English instructions written in a code-like format.
It helps you understand the logic without worrying about syntax.

FUNCTION lengthOfLongestSubstring(s):
// Initialize a set to track characters in current window
char_set ← empty set
left ← 0
max_len ← 0FOR right FROM 0 TO length(s) – 1:// If current character is already in the window, shrink from left
WHILE s[right] IS IN char_set:
remove s[left] FROM char_set
left ← left + 1// Add current character to the set
add s[right] TO char_set// Update maximum length
max_len ← MAX(max_len, right – left + 1)

RETURN max_len
END FUNCTION

🧠 Key Observations from Pseudocode
  • left and right are two pointers defining our current window.
  • The set tells us instantly if a character is already in the window.
  • We only move left forward when we find a repeated character.
  • max_len keeps track of the best window size we’ve found so far.

06: Python Program

Here is the complete Python solution using the sliding window technique:


def length_of_longest_substring(s):
    """
    Find the length of the longest substring
    without repeating characters.

    Args:
        s (str): Input string

    Returns:
        int: Length of the longest valid substring
    """

    # A set to store characters in the current window
    char_set = set()

    # Left pointer of our sliding window
    left = 0

    # Variable to track the best (longest) length found
    max_len = 0

    # Right pointer moves across the string one step at a time
    for right in range(len(s)):

        # If s[right] is already in our window, shrink the window
        # from the left until the duplicate is removed
        while s[right] in char_set:
            char_set.remove(s[left])
            left += 1

        # Now s[right] is safe to add β€” no duplicates in current window
        char_set.add(s[right])

        # Update max_len if the current window is larger
        # Window size = right - left + 1
        max_len = max(max_len, right - left + 1)

    return max_len


# ============================================
# Test the function with different examples
# ============================================

test_cases = [
    ("abcabcbb", 3),
    ("bbbbb",    1),
    ("pwwkew",   3),
    ("",         0),
    ("abcdef",   6),
]

for s, expected in test_cases:
    result = length_of_longest_substring(s)
    status = "βœ… PASS" if result == expected else "❌ FAIL"
    print(f"{status} | Input: {repr(s):15} | Output: {result} | Expected: {expected}")

Expected Program Output

Output
βœ… PASS | Input: 'abcabcbb'     | Output: 3 | Expected: 3
βœ… PASS | Input: 'bbbbb'        | Output: 1 | Expected: 1
βœ… PASS | Input: 'pwwkew'       | Output: 3 | Expected: 3
βœ… PASS | Input: ''             | Output: 0 | Expected: 0
βœ… PASS | Input: 'abcdef'       | Output: 6 | Expected: 6

08: Explanation of the Program

Let’s break down exactly how the code works, step by step.
We’ll trace through the string "abcabcbb" to see it in action.

Step 1 β€” Setting Up

We create a set called char_set.
A set is a collection that stores unique items and lets us check if something is inside it
in constant time O(1).

We also create two variables: left = 0 (start of our window) and
max_len = 0 (best answer so far).

Step 2 β€” Expanding the Window

The right pointer moves from the beginning of the string to the end,
one character at a time.
At each step, we check if s[right] is already in our set.

  • If it is NOT in the set β†’ we add it and expand the window.
  • If it IS in the set β†’ we shrink from the left until the duplicate is removed.

Step 3 β€” Shrinking the Window

When a repeat is found, we remove s[left] from the set and
move left one step to the right.
We keep doing this until the duplicate character is gone.

Step 4 β€” Updating the Answer

After each valid addition, the window size is right - left + 1.
We compare this with our current max_len and keep the larger value.

Dry Run on “abcabcbb”

Trace
String:  a  b  c  a  b  c  b  b
Index:   0  1  2  3  4  5  6  7

right=0: s[0]='a' not in {} β†’ add β†’ set={'a'}  β†’ window="a"   β†’ max_len=1
right=1: s[1]='b' not in {'a'} β†’ add β†’ set={'a','b'} β†’ window="ab" β†’ max_len=2
right=2: s[2]='c' not in {'a','b'} β†’ add β†’ set={'a','b','c'} β†’ max_len=3
right=3: s[3]='a' IS in set β†’
         remove s[0]='a', left=1 β†’ set={'b','c'}
         now add 'a' β†’ set={'b','c','a'} β†’ window="bca" β†’ max_len=3
right=4: s[4]='b' IS in set β†’
         remove s[1]='b', left=2 β†’ set={'c','a'}
         add 'b' β†’ window="cab" β†’ max_len=3
right=5: s[5]='c' IS in set β†’
         remove s[2]='c', left=3 β†’ set={'a','b'}
         add 'c' β†’ window="abc" β†’ max_len=3
right=6: s[6]='b' IS in set β†’
         remove s[3]='a', left=4 β†’ set={'b','c'}
         'b' still in set, remove s[4]='b', left=5 β†’ set={'c'}
         add 'b' β†’ window="cb" β†’ max_len=3
right=7: s[7]='b' IS in set β†’
         remove s[5]='c', left=6 β†’ set={'b'}
         'b' still in set, remove s[6]='b', left=7 β†’ set={}
         add 'b' β†’ window="b" β†’ max_len=3

Final answer: 3
βœ… Why This Works

At every moment, the window between left and right
contains only unique characters. By tracking the maximum size this window ever reaches,
we find our answer without checking every possible substring β€” making it very efficient.

09: Time & Space Complexity

⏱ Time Complexity
O(n)
Each character is visited at most twice β€” once when right
moves over it, and once when left moves past it.
So the total work is proportional to n (string length).
πŸ’Ύ Space Complexity
O(k)
The set stores at most k unique characters, where k is the
size of the character set (e.g., 26 for lowercase English letters, 128 for ASCII).
In practice, this is O(1) because the alphabet is fixed.
πŸ“Š Why Not Use a Brute Force Approach?

A brute force solution would check every possible substring β€” that’s
O(nΒ²) or O(nΒ³) time. For a string of length 50,000,
this would be extremely slow. The sliding window approach solves it in
linear O(n) time, making it far superior for large inputs.

10: Test Cases

Test Case 1 β€” Normal String with Repeats
Input
“abcabcbb”
Expected Output
3

Explanation: The longest substrings without repeats are "abc",
"bca", or "cab" β€” each of length 3.

Test Case 2 β€” All Same Characters
Input
“bbbbb”
Expected Output
1

Explanation: Every character is 'b', so the longest
substring without repeating is just a single "b", giving length 1.

Test Case 3 β€” Partial Repeats
Input
“pwwkew”
Expected Output
3

Explanation: The substring "wke" (at indices 2–4)
has all unique characters and is the longest with length 3.

Test Case 4 β€” Empty String (Edge Case)
Input
“”
Expected Output
0

Explanation: There are no characters at all,
so the answer is 0. Always handle edge cases!

Test Case 5 β€” All Unique Characters
Input
“abcdef”
Expected Output
6

Explanation: Every character is unique,
so the entire string is the longest substring. Length = 6.

11: Common Interview Questions

When this problem (or similar ones) come up in technical interviews,
Interviewers often follow up with these questions. Be ready for them!

1. What is the brute force approach and why is it slow?
Brute force means checking every possible substring and then testing whether it contains duplicate characters. That is slow because the number of substrings is already large, and checking each one again makes it worse.
2. Explain the sliding window technique in your own words.
Sliding window means keeping a moving range of characters that is currently valid. In this problem, the window always contains characters with no repeats.
When a new character causes a duplicate, you do not restart from scratch. You move the left side forward until the duplicate is removed.
3. What data structure did you use and why a set instead of a list?
I used a set because I need fast membership checking: β€œIs this character already in the current window?” A set does that in average O(1) time.
A list is a bad choice here because checking char in list takes O(n) time, which makes the whole solution slower
4. What is the time and space complexity of your solution?
  • Time complexity: O(n)
  • Space complexity: O(k), where k is the number of unique characters in the window, worst case O(n)
5. How would you handle Unicode characters or special symbols?
In Python, this solution already works for Unicode and special symbols because strings are Unicode-aware and a set can store any character. The algorithm does not care whether the character is a letter, emoji, or symbol.
6. Can you optimize further using a HashMap to jump the left pointer directly?
Yes. That is a better version. Instead of removing characters one by one from the left, store the last seen index of each character in a HashMap/dictionary.
When you see a repeated character, you jump left directly to last_seen[char] + 1
7. What happens if the input string is empty or has only one character?
  • If the string is empty, the answer is 0 because there is no substring.
  • If the string has one character, the answer is 1 because that single character is a valid substring with no repeats.
8. Can you return the actual substring, not just its length?
Yes. You just need to track where the best window starts, not only its length.
Whenever you update max_len, also save the current window start index.
9. How does this relate to the “minimum window substring” problem?

Both use the sliding window idea, but the goal is different.

  • Longest substring without repeating characters:
    You want the largest valid window where all characters are unique.
  • Minimum window substring:
    You want the smallest window that contains all required characters.
10. What is the difference between a substring and a subsequence?
A substring is continuous. A subsequence can skip characters.
πŸ’‘ Bonus β€” HashMap Optimization

Instead of a set, you can use a dictionary (HashMap) to store
the last index where each character appeared. This lets you jump left
directly to the right position instead of moving it one step at a time,
which can reduce constant factors in practice.

12: Conclusion

The Longest Substring Without Repeating Characters problem is a
fantastic entry point into the world of the sliding window technique.
Once you understand why the two-pointer approach works β€” and how a set gives you
instant duplicate detection β€” the solution feels elegant and natural.

Here’s a quick recap of what you learned today:

  • βœ… A substring is a continuous slice of a string.
  • βœ… We use two pointers (left and right) to define a sliding window.
  • βœ… A set helps us check for duplicates in O(1) time.
  • βœ… We shrink the window from the left whenever a repeat is found.
  • βœ… The overall time complexity is O(n) β€” very efficient.
  • βœ… This same pattern applies to many other string and array problems.

Practice this problem a few times until the sliding window logic feels automatic.
Then try variations like finding the actual substring, or solving it with a HashMap.
You’ll be surprised how often this pattern shows up in real interviews!

13: Frequently Asked Questions

What is the longest substring without repeating characters problem?
β–Ά

Given a string, you need to find the length of the longest contiguous part
(substring) where every character appears only once β€” no duplicates allowed.
For example, in "abcabcbb", the answer is 3 because "abc"
is the longest unique-character substring.

What is the sliding window technique?
β–Ά

The sliding window is a technique where you maintain a “window” β€” a range
defined by two pointers (left and right) β€” and slide
it across the data. You expand it to include new elements and shrink it when
a condition is violated. It converts an O(nΒ²) brute force into a much faster O(n) solution.

Why use a set instead of a list for tracking characters?
β–Ά

Checking if an element exists in a set takes O(1) time on average,
while checking in a list takes O(n) time. Since we check for
duplicates at every step of our loop, using a set makes the whole algorithm
significantly faster for large strings.

Is this the same as “find all unique substrings”?
β–Ά

No. This problem only asks for the length of the
longest substring where all characters are unique.
It does not ask you to find all possible unique substrings (which would be
a much harder and more expensive problem).

What is the time complexity of the sliding window solution?
β–Ά

The time complexity is O(n), where n is the length of the string.
Even though there is a while loop inside the for loop, each character is added
and removed from the set at most once. So the total operations are bounded by 2n.

Can this problem be solved using a HashMap instead of a set?
β–Ά

Yes! Using a dictionary (HashMap) that maps each character to its last
seen index allows the left pointer to jump directly to the correct position
instead of moving one step at a time. This is slightly more optimized in
practice, though both approaches have the same O(n) time complexity.

Is this problem asked in real coding interviews?
β–Ά

Absolutely. This is LeetCode problem #3 and is considered a
Medium difficulty problem. It is frequently asked at companies
like Google, Amazon, Facebook (Meta), Microsoft, Adobe, and many others.
It’s also a great problem for understanding the sliding window pattern,
which applies to many other interview questions.

What is the difference between a substring and a subsequence?
β–Ά

A substring is a contiguous sequence of characters β€”
all characters must appear next to each other in the original string.
A subsequence can skip characters β€” the relative order
must be maintained, but characters don’t need to be adjacent.
This problem deals with substrings only.

 

Leave a Comment