LeetCode - Longest Palindromic Substring | Elixir Solution
An Elixir solution for LeetCode's Longest Palindromic Substring problem with detailed explanation and time complexity analysis.
This time around we have a different type of problem, one involving strings/binaries and palindromes which are words that spell the same words in reverse: noon
.
Likey many of these problems, they seem simple at first glance. As you dig into them however, you find that improving performance over a brute-force approach takes some careful consideration.
Problem
Given a string s
, return the longest palindromic substring in s
.
Example 1:
Input: s = "babad"
Output: "bab"
Explanation: "aba" is also a valid answer.
Example 2:
Input: s = "cbbd"
Output: "bb"
Constraints:
1 <= s.length <= 1000
s
consist of only digits and English letters.
Elixir Solution
TLDR;
- Convert String to Map for efficient random access (vs Linked List).
- Iterate over each char to
expand_range
(Expand Around Center). expand_range
expands while palindrome is found, returns currentlongest
when not found.expand_range
is called twice, once forodd
and once foreven
length substrings.
Time Complexity: O(n^2)
Source: https://github.com/tmartin8080/leet
Check Palindrome
Let's start by discussing a few ways to determine if a string is a palindrome. The first can be accomplished by using two pointers and working toward the center:
1. Outside Pointers
Create pointers for the start and end:
Then work toward to the center. If the strings at low
| high
positions match, continue inward by incrementing low + 1
and decrementing high - 1
.
Code Example:
When the two pointers pass each other low > high
then we know the string is a Palindrome.
2. Expand Around Center
The second approach is to start at the center:
Then work outward, and at each step check that the characters match:
Once we have checked all character pairs and go beyond the 0
index on the left, the string is a palindrome.
Even-Length Strings
This approach has a quirk though. When the string length is odd
, we can take the center and move outward, but when the length of the string is even
, there isn't a single center value. To account for this, we can compare two center-most values:
Then work outward:
With even length strings, once you find the middle characters working outward is the same. Let's look at a code example for this:
Code Example:
The benefits of either approach above is that there are n / 2
steps when the string is even
and n / 2 + 1
when the string length is odd
.
Approach - Explanation
Now onto the problem which is finding the longest substring that's a palindrome.
A brute-force approach might be to find all the possible substrings, check if they're palindromes and take the longest. We're going to skip this approach because it's too obviously flawed.
We can instead use Expand Around Center by traversing the string and for each character, expand outward until it's no longer a palindrome. Once for a single character and once for the current character and the next. This allows us to check for odd length palindromes and even length palindromes which we already know how to do.
After we've traversed the entire string return the longest one found.
First Iteration - odd
Using cbbd
as an example, we start at index 0
and check if the character is the center of an odd length palindrome:
Expanding here fails on the first iteration because there's no value on the left. So c
is our current longest palindrome substring.
First Iteration - even
Next, we check if characters at index 0
and 0 + 1
is the center of and even length palindrome:
This fails right away because c
and b
don't match. The current longest palindrome remains c
. We increase the current index
by one and continue to the next.
Second Iteration - odd
Checking for an odd length palindrome:
This fails, and so again, we check index
and index + 1
for an even length palindrome:
This time, the two center characters form a palindrome but expanding outward fails. We update the longest palindrome to: bb
.
Again, at this point we would update the current index
and continue to the end.
bb
.Let's skip the third and fourth iteration and jump into the implementation.
Approach - Implementation
As an aside, I bench-marked 3 different implementations for random-access on a 1000 character string:
String.at/2
:array.get/3
Map.get/2
And here are some crude bench-marking results:
Strings are charlists which are singly linked lists so random access has to start at index 0
and traverse to the given index
.
Let's start with a map datastructure to improve performance over random access on strings.
Setup
Now that we've selected maps as our workhorse datastructure, we start by guarding against invalid data
def longest_palindrome(s) when s == "" or is_nil(s), do: ""
And initialize data to feed into some recursive functions. For this example, let's imagine we're using babad
as the String input:
First we build a map of the string with the index as the key and the character or codepoint as the value: %{0 => "a"}
and return max index as well.
String.codepoints/1
keeps accents, etc together: è
whereas String.graphemes/1
would split them up into separate characters.Arguments
The first argument passed to find_longest/3
is a tuple of {s, chars, n}
which is the input string, the map representation of the string, and the right-most index of the map.
# s = "babad", chars: %{0 => "b", ...}, n = 4
find_longest({s, chars, n}, current_index, longest)
The second argument is the current index that we're expanding around.
The third argument passed to find_longest/3
is a tuple structure we're using to represent the longest palindrome found. This contains a left and right index of the string:
# left = 0, right = o
longest = {left, right}
find_longest({s, chars, n}, index, longest)
Base Case
Next we define the base case as:
When the current index is greater than the max available index n
we've traversed the entire string.
Here's the base case function:
Once we've gone beyond the right-most index, we slice the string using left
| right
indices we found while expanding around each character.
Main Recursion
Next, we have our main recursive function that will expand around each character twice. The first expand_range/4
call checks for odd-length palindromes and the second checks for even-length palindromes:
We do this to catch all available palindromes while we're visiting an index.
Expand Around Center
We initialized the longest palindrome as {0, 0}
and each call to these functions will either leave the longest
range alone, or update it to the left
| right
indices of a newly discovered palindrome.
The odd-length call is handled first when the index is passed twice:
All single characters are considered palindromes so we expand outward and pass the updated values along to the next function head:
First we check if we're till within the boundaries using guard clauses. Then we check if the value at index left
equals the value at index right
.
If the characters match, we call update_longest/2
to determine the longer of the two palindromes or index ranges:
This will determine which value is longer: the current longest or the newly found palindrome.
Our result will be "bab", but if we reversed this function's logic to keep the most recent found, our result would be "aba".
After updating the temporary storage longest
this function calls itself again to continue expanding:
expand_range(data, left - 1, right + 1, new_longest)
In call other cases, we return the current longest index range value: {left, right}
.
Returning Result
After traversing the entire string, we slice the value from the longest range:
defp find_longest({s, _chars, n}, index, {left, right}) when index > n do
String.slice(s, left..right)
end
Which will return "bab" in our example.
Complexity Analysis
- Convert string to map:
n steps
- Traverse the string expanding twice:
2n
Since expanding a palindrome around its center could take O(n)
time, the overall complexity is: O(n^2)
Wrapping Up
These code problems have been helpful in terms of developing patterns and analyzing performance characteristics of different Data Structures.
By working through some of these problems and solutions I'm realizing that there are distinct ideas involving Functional Data Structures based on immutability and there's much to understand about them.
The Erlang/Elixir communities have provided an immensely solid foundation for developing fast, reliable code, but as we have seen through a few of these solutions there are cases where a deeper understanding of how data structures are implemented can inform better decisions about system design.
Thanks for reading!
-- Troy
Twitter: @tmartin8080 | ElixirForum