Saturday, June 25, 2011

Write the code/algorithm to find the k-th Smallest Element in the Union of Two Sorted Arrays.

Given two sorted arrays A and B of size m and n respectively, find the k-th smallest element in the union of A and B. You can assume that there are no duplicate elements.

I would have to admit that this problem is pretty tricky to solve. Like most difficult problems, it requires some pretty clever observations to solve in a neat way.

The trivial way, O(m+n):
Merge both arrays and the k-th smallest element could be accessed directly. Merging would require extra space of O(m+n). The linear run time is pretty good, but could we improve it even further?
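
As a rough illustration of the merge approach, here is a minimal C sketch. The function name kth_by_merge is made up for this example, k is taken to be 1-based, and the sketch assumes 1 <= k <= m+n.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not from the original post): k-th smallest (1-based)
   of two sorted arrays, found by merging them into a temporary buffer. */
int kth_by_merge(const int *A, int m, const int *B, int n, int k)
{
    assert(k >= 1 && k <= m + n);
    int *merged = malloc((size_t)(m + n) * sizeof *merged);
    assert(merged != NULL);               /* sketch only: no graceful out-of-memory path */

    int i = 0, j = 0, t = 0;
    while (i < m && j < n)                /* standard merge step */
        merged[t++] = (A[i] <= B[j]) ? A[i++] : B[j++];
    while (i < m) merged[t++] = A[i++];   /* copy what is left of A */
    while (j < n) merged[t++] = B[j++];   /* copy what is left of B */

    int result = merged[k - 1];           /* k is 1-based */
    free(merged);
    return result;
}
```

For A = {5, 7, 9, 20} and B = {10, 12, 21, 27, 35, 50}, kth_by_merge(A, 4, B, 6, 4) returns 10.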

A better way, O(k):
There is an improvement on the above method, thanks to readers who suggested this. Using two pointers, you can traverse both arrays without actually merging them, and thus without the extra space. The pointers are initialized to point to the heads of A and B respectively, and the pointer at the smaller of the two elements is incremented one step. The k-th smallest is obtained by traversing a total of k steps. This algorithm is very similar to finding the intersection of two sorted arrays.

static int findKthSmallest(int[] A, int[] B, int k)
{
    // k is 1-based; -1 signals that k is out of range
    if (k < 1 || A.length + B.length < k) return -1;
    int a_offset = 0, b_offset = 0;
    int result = -1;
    // step past the smaller head element k times; the last one passed is the answer
    for (int step = 0; step < k; step++) {
        if (b_offset == B.length || (a_offset < A.length && A[a_offset] <= B[b_offset]))
            result = A[a_offset++];
        else
            result = B[b_offset++];
    }
    return result;
}


The best solution, but non-trivial, O(lg m + lg n):
Although the above solution is an improvement in both run time and space complexity, it only works well for small values of k. When k is close to m+n it is still linear. Could we improve the run time further?

The above logarithmic complexity gives us one important hint. Binary search is a great example of achieving logarithmic complexity by halving its search space in each iteration. Therefore, to achieve the complexity of O(lg m + lg n), we must halve the search space of A and B in each iteration.

We approach this tricky problem by comparing the middle elements of A and B, which we identify as Ai and Bj. If Ai is between Bj-1 and Bj, we have just found the (i+j+1)-th smallest element. Why? Because exactly i elements of A (A0 to Ai-1) and exactly j elements of B (B0 to Bj-1) are smaller than Ai, so i+j elements precede it in the union. Therefore, if we choose i and j such that i+j = k-1, we are able to find the k-th smallest element. This is an important invariant that we must maintain for the correctness of this algorithm.
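
The observation above can be checked by brute force. The following is a small C sketch under stated assumptions: the helper names count_smaller and fits_between are invented for this example, and B[-1] and B[n] are treated as -INF and +INF.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical helper: number of elements in A and B that are smaller than x. */
int count_smaller(const int *A, int m, const int *B, int n, int x)
{
    int count = 0;
    for (int t = 0; t < m; t++) if (A[t] < x) count++;
    for (int t = 0; t < n; t++) if (B[t] < x) count++;
    return count;
}

/* Hypothetical helper: returns 1 when B[j-1] < A[i] < B[j],
   with B[-1] read as -INF and B[n] read as +INF. */
int fits_between(const int *A, int i, const int *B, int n, int j)
{
    assert(i >= 0 && j >= 0 && j <= n);
    int Bj_1 = (j == 0) ? INT_MIN : B[j - 1];
    int Bj   = (j == n) ? INT_MAX : B[j];
    return Bj_1 < A[i] && A[i] < Bj;
}
```

With A = {5, 7, 9, 20} and B = {10, 12, 21, 27, 35, 50}, A[3] = 20 fits between B[1] = 12 and B[2] = 21, and count_smaller reports 3 + 2 = 5 smaller elements, which makes 20 the 6th smallest (i+j+1 with i = 3, j = 2).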

The idea is this: since the two arrays may not have the same length, let us
divide the (k-1) smallest elements proportionally between them.
Let i index into array A:
i = m/(m+n) * (k-1), rounded down [since we have to divide k-1 elements between the two]
j = (k-1) - i
Then try to fit A[i] between B[j-1] and B[j]. If these three are not in ascending
order, try to fit B[j] between A[i-1] and A[i].
If either check succeeds, we have found the k-th smallest element. Otherwise,
check which of A[i] and B[j] is smaller. If A[i] is the smaller one, then we
can discard A[0] to A[i] for the next iteration:
k becomes k-i-1 and m becomes m-i-1, i.e. we now have only m-i-1+n elements,
out of which we have to find the (k-i-1)-th smallest. If B[j] is the smaller
one, discard B[0] to B[j] in the same way. The iteration goes on until we
find our k-th smallest element.
Consider 2 arrays
A={5,7,9,20}; length of A: m=4
B={10,12,21,27,35,50}; length of B: n=6
let K be 4
i=4/10*3=1; A[1]=7;
j=3-1=2; B[2]=21;
B[1]=12 A[1]=7 B[2]=21 [not in asc order]
A[0]=5 B[2]=21 A[1]=7 [not in asc order]
so now,
k=k-i-1 =4-1-1=2
m=m-i-1=4-1-1=2
n=6
A={9,20}; length of A: m=2
B={10,12,21,27,35,50}; length of B: n=6
i=2/8*1=0; A[0]=9;
j=1-0=1; B[1]=12;
(actually A[-1] is just for understanding)
B[0]=10 A[0]=9 B[1]=12 [not in asc order]
A[-1]=-INF B[1]=12 A[0]=9 [not in asc order]
now,
k=k-i-1=2-0-1=1;
m=m-i-1=2-0-1=1;
n=6;
A={20}; length of A: m=1
B={10,12,21,27,35,50}; length of B: n=6
i=1/7*0=0; A[0]=20;
j=0-0=0; B[0]=10;
(actually A[-1] and B[-1] are just for understanding)
B[-1]=-INF A[0]=20 B[0]=10 [not in asc order]
A[-1]=-INF B[0]=10 A[0]=20 [in asc order]
We got the Kth(4th) smallest element which is 10.
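
The walk-through above can be condensed into a loop. This is only a sketch under stated assumptions: the name kth_iterative is chosen here, there are no duplicates, and no element equals INT_MIN or INT_MAX, which stand in for -INF and +INF. Unlike the walk-through, it also truncates the other array (B to its first j elements when a prefix of A is dropped, and vice versa), since those elements cannot be the answer.

```c
#include <assert.h>
#include <limits.h>

/* Hypothetical loop version of the proportional-split search.
   Returns the k-th smallest (1-based) element of sorted A and B. */
int kth_iterative(const int *A, int m, const int *B, int n, int k)
{
    assert(m >= 0 && n >= 0 && k >= 1 && k <= m + n);
    for (;;) {
        /* split the k-1 smaller elements proportionally: i + j = k-1 */
        int i = (int)((double)m / (m + n) * (k - 1));
        int j = (k - 1) - i;

        /* sentinels: index -1 reads as -INF, index m (or n) as +INF */
        int Ai_1 = (i == 0) ? INT_MIN : A[i - 1];
        int Bj_1 = (j == 0) ? INT_MIN : B[j - 1];
        int Ai   = (i == m) ? INT_MAX : A[i];
        int Bj   = (j == n) ? INT_MAX : B[j];

        if (Bj_1 < Ai && Ai < Bj) return Ai;   /* A[i] fits between B[j-1] and B[j] */
        if (Ai_1 < Bj && Bj < Ai) return Bj;   /* B[j] fits between A[i-1] and A[i] */

        if (Ai < Bj) {
            /* drop A[0..i]; keep only B[0..j-1] */
            A += i + 1; m -= i + 1; n = j; k -= i + 1;
        } else {
            /* drop B[0..j]; keep only A[0..i-1] */
            B += j + 1; n -= j + 1; m = i; k -= j + 1;
        }
    }
}
```

Calling kth_iterative(A, 4, B, 6, 4) with the arrays from the example returns 10, matching the result above.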

#include <assert.h>
#include <limits.h>

int findKthSmallest(int A[], int m, int B[], int n, int k) {
  assert(m >= 0); assert(n >= 0); assert(k > 0); assert(k <= m+n);

  int i = (int)((double)m / (m+n) * (k-1));
  int j = (k-1) - i;

  assert(i >= 0); assert(j >= 0); assert(i <= m); assert(j <= n);
  // invariant: i + j = k-1
  // Note: A[-1] = -INF and A[m] = +INF to maintain invariant
  int Ai_1 = ((i == 0) ? INT_MIN : A[i-1]);
  int Bj_1 = ((j == 0) ? INT_MIN : B[j-1]);
  int Ai   = ((i == m) ? INT_MAX : A[i]);
  int Bj   = ((j == n) ? INT_MAX : B[j]);

  if (Bj_1 < Ai && Ai < Bj)
    return Ai;
  else if (Ai_1 < Bj && Bj < Ai)
    return Bj;

  assert((Ai > Bj && Ai_1 > Bj) ||
         (Ai < Bj && Ai < Bj_1));

  // if none of the cases above, then it is either:
  if (Ai < Bj)
    // exclude Ai and below portion
    // exclude Bj and above portion
    return findKthSmallest(A+i+1, m-i-1, B, j, k-i-1);
  else /* Bj < Ai */
    // exclude Ai and above portion
    // exclude Bj and below portion
    return findKthSmallest(A, i, B+j+1, n-j-1, k-j-1);
}

Time Complexity: O(lg m + lg n)
Space Complexity: O(1) extra space, not counting the recursion stack
Run Here: http://ideone.com/SkaAI
Source: http://www.ihas1337code.com/2011/01/find-k-th-smallest-element-in-union-of.html

Another Algorithm & Solution Given By My Friend Dhumanshu

Algorithm
You have two arrays a and b and a given k. The logic is to compare the (k/2)-th element of the first array with the (k/2)-th element of the second array, because the total number of elements under consideration is then k/2 + k/2 = k, and the k-th element is what we have to find. Take care of the case when k is odd: in that case compare the (k/2)-th element of the first array with the (k/2 + 1)-th element of the second array.

Now if a[k/2] > b[k/2] but a[k/2] < b[k/2 + 1], then sorting the first k/2 elements of both arrays together (k elements in total) would put a[k/2] last, so it is our required answer. If this fails, check b against a in the same manner.

If that fails too, we have to expand our set of elements under consideration (earlier we took k/2 from each array). Check whether a[k/2] > b[k/2] and a[k/2] is also > b[k/2 + 1]. If so, we have to look at the left side of array a and the right side of array b,
so call recursively with array a between (0, k/2 - 1) and array b between (k/2 + 1, b.length).
If that fails, do the same with the roles of a and b swapped.

This is the idea behind the algorithm, but you have to take care of special cases. For example, if all elements of one array fall outside the set, you are left with one array, so do a plain lookup on that leftover array to find the k-th element.

Working Code

#include <stdio.h>

/* k-th smallest (1-based) of the single sorted range a[l..h], or -1 if out of range */
int ssearch(int *a,int l,int h,int k)
{
    if(l + k - 1 > h)
        return -1;
    else
        return a[l+k-1];
}


/* k-th smallest of a[la..ra] and b[lb..rb] combined (despite the name) */
int kthlargest(int *a,int *b,int la,int ra,int lb,int rb,int k)
{
    //check extremes in case one array expires
    if(la>ra)
        return ssearch(b,lb,rb,k);
    if(lb>rb)
        return ssearch(a,la,ra,k);
    if(k==1)
        return a[la]>b[lb] ? b[lb] : a[la];

    //get optimum mida and midb: k/2 elements from a, the other k - k/2 from b
    int mida = la + k/2 - 1,midb = lb + k/2 - 1 + k%2;

    //if one array is too short, take the shortfall from the other
    if(midb>rb)
    {
        mida += midb-rb;
        midb = rb;
    }
    else if(mida>ra)
    {
        midb += mida-ra;
        mida = ra;
    }
    if(mida>ra || midb>rb) //k exceeds the total number of elements
        return -1;

    //either way
    if(b[midb]>=a[mida])
    {
        if(mida==ra || a[mida+1]>=b[midb])
            return b[midb];
        else //drop a[la..mida] (all among the k smallest) and b[midb..rb]
            return kthlargest(a,b,mida+1,ra,lb,midb-1,k-mida-1+la);
    }
    else
    {
        if(midb==rb || a[mida]<=b[midb+1])
            return a[mida];
        else //drop b[lb..midb] (all among the k smallest) and a[mida..ra]
            return kthlargest(a,b,la,mida-1,midb+1,rb,k-midb-1+lb);
    }
}

int main()
{
    int a[]={4,8,12,18,25,33,56};
    int b[]={1,2,3,6,17,18,25,26,32,89};
    int na=sizeof(a)/sizeof(int), nb=sizeof(b)/sizeof(int);
    int k;
    for(k=1;k<=na+nb;k++)
        printf("k th smallest element is %d\n",kthlargest(a,b,0,na-1,0,nb-1,k));
    return 0;
}
