Amortized Time Complexity in Elixir
Certain computational problems require a more nuanced understanding of time complexity, leading to the concept of "Amortized Time Complexity".
When assessing the efficiency of algorithms in Elixir (or any other language, for that matter), time complexity plays a vital role. However, what we commonly see in many time complexity discussions are the worst-case scenarios—these are expressed in Big O notation. Yet, certain computational problems require a more nuanced understanding of time complexity, leading to the concept of "Amortized Time Complexity".
Basics of Time Complexity
Before diving into the concept of amortized time complexity, we need to lay a groundwork understanding of time complexity itself. Time complexity measures the time an algorithm takes to run as a function of the size of its input. This helps in predicting the execution time and how it scales when the input size grows.
For example, a linear algorithm O(n)
grows directly in proportion to its input size, whereas a logarithmic algorithm O(log n)
grows more slowly. By understanding time complexity, developers can make informed decisions about their code to ensure it's as efficient as possible.
The Concept of Amortized Time Complexity
While average and worst-case time complexities provide valuable insights into an algorithm's efficiency, they don't always give the complete picture. This is where amortized time complexity comes into play.
Amortized time complexity is a way to measure the time complexity where an expensive operation can alter the state in a way that future operations are less expensive. Essentially, it is an average time per operation, over a worst-case sequence of operations. So, the amortized time per operation is the total time for all operations divided by the number of operations.
In other words, even if a single operation might have a high cost in terms of time or space complexity, if it leads to many other operations being much more efficient, the amortized cost might still be low.
Amortized Time Complexity with List Append
Elixir, as a functional language, has immutability ingrained in its core, which means that data cannot be changed once it's created. This leads to a specific pattern of data structure manipulation, particularly with lists, which are singly linked lists in Elixir.
When we want to add an element to a list, we could append it to the end of the list. This operation has a time complexity of O(n)
, as the full list would need to traversed to append the element:
list = [1, 2, 3, 4, 5]
list = list ++ [6] # Appends 6 to the end of the list
This can become inefficient for large lists and frequent append operations.
Optimization Using Prepend Operation
Elixir, however, provides an alternative. We could prepend an element to the list with a time complexity of O(1)
, which is much more efficient. Prepending simply adds a new head to the list without needing to traverse it:
list = [1, 2, 3, 4, 5]
list = [6 | list] # Prepends 6 to the start of the list
This makes adding elements efficient, but it changes the order of elements, which might not be what we want.
Benchee
Here are some simple benchmarks for 10 and 10_000 element lists:
Comparison:
10-prepend 3.46 M
10-append 2.24 M - 1.54x slower +156.51 ns
---
Comparison:
10_000-prepend 5.20 K
10_000-append 0.00480 K - 1083.26x slower +208.04 ms
Amortized Cost with List Reversal
To regain the original order of elements (with the new element at the end), we would need to reverse the list. The reversal operation has a time complexity of O(n)
:
Now, the time complexity of appending an element to the list using this approach is the total time complexity divided by the number of operations (n
append operations). That is, O(n) / n = O(1)
.
So, in an amortized sense, each append operation is O(1)
. The occasional expensive reverse operation, O(n)
, gets amortized over the numerous cheap prepend operations, O(1)
, effectively becoming a constant-time operation when averaged out over the sequence of operations.
This is an example of how understanding amortized time complexity can help us write more efficient code. By using amortized analysis, we can find efficient ways to perform operations that might be inefficient if done in a straightforward way.
Map.put and Hash Maps
The Map.put
operation in Elixir is generally considered a constant-time operation O(1)
, but this is an amortized time complexity. Under the hood, Elixir uses Hash Maps to implement map data structures.
In a Hash Map, when the load factor of the map (i.e., the number of items in the map divided by the total slots available) exceeds a specific threshold, the map increases its size to accommodate more elements and reduce collision probability. This resizing operation involves rehashing all the keys in the map and allocating them to new slots, which is an O(n)
operation.
Amortized Cost with Map Resize and Rehash
Let's say we start with an empty map and perform n
Map.put
operations. Most Map.put
operations are O(1)
, but occasionally, we perform an expensive O(n)
operation when the map needs to be resized and rehashed to prevent collisions.
To calculate the total time complexity, we need to account for all Map.put
operations and all resizing operations.
For simplicity's sake, let's assume the map resizes every time it doubles in size. This means the resizing operation was performed when the map had size 1, 2, 4, 8, 16, ..., n
. Each of these operations took a time of 1, 2, 4, 8, ..., n
respectively.
Therefore, the total time taken for n
insertions and all resizing operations is:
1 + 2 + 4 + 8 + ... + n + n = 2n - 1 = O(n)
So, when we calculate the amortized time complexity, it's (O(n) / n) = O(1)
.
This shows us that, on average, each Map.put
operation is O(1)
in an amortized sense. The occasional expensive O(n)
resizing operation gets amortized across many O(1)
Map.put
operations, and as such, each Map.put
operation can be considered to run in constant time when considering a sequence of operations.
Conclusion
Understanding amortized time complexity can give us a more nuanced view of the performance characteristics of our programs. It enables us to take into account both the regular "everyday" cost of operations and the occasional "expensive" operations that can affect overall performance. With this knowledge, we can make more informed decisions when designing algorithms and data structures in Elixir, and improve the efficiency and performance of our code.
Book
I'm in the process of writing a book for Elixir/Erlang developers dive deeper into functional data structures.
Subscribe to the free newsletter for updates and progress!
-- Troy
Twitter: @tmartin8080 | ElixirForum