Merge K Sorted Arrays
#include <bits/stdc++.h> // Includes standard libraries like vector, iostream, queue, algorithm.using namespace std; // Use the standard namespace.
// Custom Node class to store the element's value along with its source array and column index.// This is crucial for knowing which element to fetch next from which array.class Node {public: int data; // The value of the element. int row; // The index of the array (row) this element came from. int col; // The index of the element within its array (column).
// Constructor to initialize a Node object. Node(int data, int row, int col) { this->data = data; this->row = row; this->col = col; }};
// Custom comparator class for the priority_queue.// This defines how Node pointers should be ordered in the heap.// For a MIN HEAP, we want the operator() to return true if 'a' has LOWER priority than 'b',// meaning 'a's data is GREATER than 'b's data. This ensures smallest elements bubble to the top.class compare {public: bool operator () (Node* a, Node* b) { return a->data > b->data; // True if 'a' is "greater" (lower priority), so 'b' goes before 'a'. }};
// Function to merge K sorted arrays into a single sorted array.// 'kArrays': A vector of vectors, where each inner vector is a sorted array.// 'K': The number of sorted arrays.// Time Complexity: O(N log K), where N is the total number of elements across all arrays.// Space Complexity: O(N + K) (O(N) for result, O(K) for heap).vector<int> mergeSortedArrays(vector<vector<int> > kArrays, int K) { // Step 1: Initialize a Min Heap. // The heap will store Node pointers, using our custom 'compare' class for ordering. priority_queue<Node*, vector<Node*>, compare> minHeap;
// Insert the first element of each of the K arrays into the minHeap. // Each insertion takes O(log K) time. Total O(K log K) for this loop. for(int row = 0; row < K; row++) { // Create a new Node for the first element of the current array. Node* temp = new Node(kArrays[row][0], row, 0); // Element value, array index, column index. minHeap.push(temp); // Push the node pointer into the heap. }
vector<int> ans; // Vector to store the final merged sorted array.
// Step 2: Extract elements from the heap and replenish with next elements. // This loop runs N times (where N is the total number of elements). while(minHeap.size() > 0) { // Get the node with the smallest element (which is at the top of the minHeap). Node* temp = minHeap.top();
// Add this smallest element to our result array. ans.push_back(temp->data); minHeap.pop(); // Remove the element from the heap.
// Get the row and column of the extracted element. int row = temp->row; int col = temp->col;
// IMPORTANT: The Node 'temp' was dynamically allocated with 'new'. // In a real-world scenario, you would typically 'delete temp;' here // if not using smart pointers, to prevent memory leaks. // For competitive programming, this is often overlooked as the program terminates.
// Check if there is a next element in the same array (row). // If 'col+1' is within the bounds of that array's size. if(col + 1 < kArrays[row].size()) { // Create a new Node for the next element from the same array. Node* next = new Node(kArrays[row][col+1], row, col+1); minHeap.push(next); // Push this new element into the heap. (O(log K) time) } }
// Return the final merged sorted array. return ans;}
int main() { vector< vector<int> > kArrays; // Vector of vectors to store the K sorted arrays. int K; // Number of sorted arrays.
cout << "Enter the value of K (number of arrays) : "; cin >> K;
// Loop to input each of the K sorted arrays. for(int i = 0; i < K; i++) { vector<int> curr; // Temporary vector to store elements of the current array.
cout << "Enter the elements of array " << i + 1 << " (enter -1 to stop) : "; int temp_input; cin >> temp_input; // Using a do-while loop to read elements until -1 is entered. // This ensures at least one element is attempted to be read. do { if (temp_input != -1) { // Only push if it's not the sentinel -1 curr.push_back(temp_input); } cin >> temp_input; // Read the next element } while(temp_input != -1); // Continue as long as -1 is not read
kArrays.push_back(curr); // Add the current array to kArrays. }
// Call the function to merge the K sorted arrays. vector<int> mergedArray = mergeSortedArrays(kArrays, K);
// Print the merged sorted array. cout << "Merged array : "; for(auto i : mergedArray) { cout << i << " "; } cout << endl;
return 0;}
/*Example Usage:
Input K: 3Array 1: 1 5 9 -1Array 2: 2 3 8 -1Array 3: 4 6 7 -1
Step-by-step Trace:
Initial Min Heap:- Node(1, row=0, col=0) from Array 0- Node(2, row=1, col=0) from Array 1- Node(4, row=2, col=0) from Array 2Heap: { (1,0,0), (2,1,0), (4,2,0) } (top is 1)ans: []
Loop 1:- temp = (1,0,0). ans.push_back(1). ans: [1]- pop (1,0,0).- row=0, col=0. kArrays[0] has next element (5).- push Node(5,0,1).Heap: { (2,1,0), (4,2,0), (5,0,1) } (top is 2)
Loop 2:- temp = (2,1,0). ans.push_back(2). ans: [1, 2]- pop (2,1,0).- row=1, col=0. kArrays[1] has next element (3).- push Node(3,1,1).Heap: { (3,1,1), (4,2,0), (5,0,1) } (top is 3)
Loop 3:- temp = (3,1,1). ans.push_back(3). ans: [1, 2, 3]- pop (3,1,1).- row=1, col=1. kArrays[1] has next element (8).- push Node(8,1,2).Heap: { (4,2,0), (5,0,1), (8,1,2) } (top is 4)
Loop 4:- temp = (4,2,0). ans.push_back(4). ans: [1, 2, 3, 4]- pop (4,2,0).- row=2, col=0. kArrays[2] has next element (6).- push Node(6,2,1).Heap: { (5,0,1), (6,2,1), (8,1,2) } (top is 5)
Loop 5:- temp = (5,0,1). ans.push_back(5). ans: [1, 2, 3, 4, 5]- pop (5,0,1).- row=0, col=1. kArrays[0] has next element (9).- push Node(9,0,2).Heap: { (6,2,1), (8,1,2), (9,0,2) } (top is 6)
Loop 6:- temp = (6,2,1). ans.push_back(6). ans: [1, 2, 3, 4, 5, 6]- pop (6,2,1).- row=2, col=1. kArrays[2] has next element (7).- push Node(7,2,2).Heap: { (7,2,2), (8,1,2), (9,0,2) } (top is 7)
Loop 7:- temp = (7,2,2). ans.push_back(7). ans: [1, 2, 3, 4, 5, 6, 7]- pop (7,2,2).- row=2, col=2. kArrays[2] no more elements.Heap: { (8,1,2), (9,0,2) } (top is 8)
Loop 8:- temp = (8,1,2). ans.push_back(8). ans: [1, 2, 3, 4, 5, 6, 7, 8]- pop (8,1,2).- row=1, col=2. kArrays[1] no more elements.Heap: { (9,0,2) } (top is 9)
Loop 9:- temp = (9,0,2). ans.push_back(9). ans: [1, 2, 3, 4, 5, 6, 7, 8, 9]- pop (9,0,2).- row=0, col=2. kArrays[0] no more elements.Heap: {} (empty)
Loop terminates.Output: Merged array : 1 2 3 4 5 6 7 8 9*/
1. Problem Statement
- Goal: Given
K
sorted arrays, merge them into a single sorted array. - Example:
arr1 = {1, 5, 9}
,arr2 = {2, 3, 8}
,arr3 = {4, 6, 7}
- Merged sorted array:
{1, 2, 3, 4, 5, 6, 7, 8, 9}
- Merged sorted array:
2. Challenge & Approach: Efficiently Picking the Next Element
- Challenge: How do we efficiently find the absolute smallest element among the
K
arrays at each step? - Solution: Min Heap (Priority Queue)
- Core Idea: We use a Min Heap to keep track of the smallest element currently available from each of the
K
arrays. - How it works:
- Initialize the Min Heap by adding the first element of each of the
K
arrays into it. - Repeatedly extract the minimum element from the heap. This element is the next one in our merged sorted array.
- After extracting an element, if the array it came from still has more elements, add the next element from that same array into the heap. This ensures the heap always contains
K
(or fewer, if some arrays are exhausted) candidates for the next smallest element.
- Initialize the Min Heap by adding the first element of each of the
- Core Idea: We use a Min Heap to keep track of the smallest element currently available from each of the
3. Custom Node
Class and compare
Functor
- Why
Node
?: When an element is extracted from the heap, we not only need itsdata
(value) but also need to know which array it came from (row
) and its position within that array (col
). This information is crucial to fetch the next element from the correct array. Node
Structure:data
(the element value),row
(index of the array),col
(index within that array).- Why
compare
?: Thepriority_queue
by default usesstd::less
for basic types (making it a Max Heap). When storing custom objects or pointers (Node*
), we need to provide a custom comparison logic.- For a Min Heap, the
operator()
in thecompare
class should returntrue
ifa
should have lower priority thanb
(i.e.,a
’s data is greater thanb
’s data). This ensures smaller values float to the top.
- For a Min Heap, the
4. mergeSortedArrays(vector<vector<int>> kArrays, int K)
Logic
- Initialization (
Step 1
):- Create a
minHeap
ofNode*
using thecompare
functor. - For each of the
K
input arrays:- Create a
new Node
for its first element (kArrays[row][0]
), storing itsrow
and0
forcol
. push
thisNode*
into theminHeap
.
- Create a
- Create a
- Extraction & Replenishment (
Step 2
):- Create an empty
ans
vector to store the final sorted result. - Loop
while(minHeap.size() > 0)
:- Extract Minimum: Get the
Node* temp
fromminHeap.top()
. Thistemp->data
is the smallest element available globally. - Add
temp->data
toans
. minHeap.pop()
: Removetemp
from the heap.- Identify Source: Get
row
andcol
fromtemp
. - Fetch Next: If there’s a next element in the same array (
col + 1 < kArrays[row].size()
):- Create a
new Node
forkArrays[row][col+1]
, with the samerow
butcol+1
. push
thisnew Node*
into theminHeap
.
- Create a
- Extract Minimum: Get the
- Create an empty
- Return: The
ans
vector.
5. Complexity Analysis
- Let
N
be the total number of elements across allK
arrays. (N = sum of lengths of all kArrays
). - Time Complexity:
O(N log K)
- Initial Population:
K
insertions into the heap. Eachpush
takesO(log K)
. TotalO(K log K)
. - Main Loop: The loop runs
N
times (because each of theN
elements is extracted exactly once). In each iteration:minHeap.top()
isO(1)
.ans.push_back()
isO(1)
amortized.minHeap.pop()
isO(log K)
.minHeap.push()
isO(log K)
.- So, each iteration is
O(log K)
.
- Total:
O(K log K + N log K) = O(N log K)
(sinceN >= K
).
- Initial Population:
- Space Complexity:
O(N + K)
O(N)
for theans
vector (to store the final merged array).O(K)
for theminHeap
(it stores at mostK
elements/nodes simultaneously).- Memory Management Note: The code uses
new Node(...)
. For each element pushed into the heap, a newNode
object is dynamically allocated. These objects are not explicitlydelete
d in the provided code. In C++, this would lead to a memory leak. In competitive programming,new
is often used withoutdelete
if the program terminates quickly and memory limits are not strict. For robust code, consider usingstd::unique_ptr
with custom deleter,std::shared_ptr
, or manualdelete
when a node is no longer needed (e.g., afterminHeap.pop()
and before returning from the function, though careful management is needed to ensure nodes added toans
are not deleted prematurely).