Wednesday, June 22, 2011

Detect 1% Duplicate Entry in a File of 13 Million Entries Efficiently.

We have a text file with 13 Million entries, all numbers in the range from 1.000.000 to 100.000.000. Each line has only one number. The file is unordered and we have about 1% duplicate entries. Remove the duplicates.

As We have a Huge Data obvious things that should come into mind is using bit array or bit set where ith bit represent the position of number in array of size 13* 2^20

before start solving question i suggest you people to check basics of bit operations here

#include
#include
using namespace std;
int main()
{
int i=0,j=0,k=0;
int n=18;
n|=1<<3;//Set the ith bit here i=3
printf( " %d ", n);
n&=~(1<<3);//clear ith bit
printf( " %d ", n);
n^=1<<1;//toggle ith bit
printf( " %d ", n);
i=n&(1<<4)?1:0;//check ith (i=4)bit set or not
printf( " %d ", i);
return 0;
}


Data Structure:Bit Array

Algorithm:
1.set ith bit n bit arry when reading number 1 by one from file .
2.if ith bit corrosponding to number is already set then number is duplicate .
& we are done.

as we are using bit array here so When using an array of bytes to represent set of bits, i.e., a bit array or bitset, the index of the byte in the array associated with a bit n can be calculated using division:

ith index / CHAR_BIT (number represented by 32 bit or 64 bit etc)

where n is the index of the given bit and CHAR_BIT gives the number of bits in a C char.

"this will give position of byte element in array elements"

The index of the bit within the byte indexed by the above can be calculated via a modulo operation:

n % CHAR_BIT

"so this will give us position of bit inside byte."

so then we can write the following method to find 1% duplicates

bool setBit(int a[], int i)
{
int index = i / 8; //position number in array
int offset = i % 8;//position of bit in byte
if (a[index] & (1<< offset))//check bit is set or not
return false;//when ith bit is already set
a[index] |= 1< return true;
}

Time Complexity O(N) size of file
Feel Free to Comment or Optmize the Solution

Given a 1D Array (linear) , you have to convert it in to 2D array & then find sum of each column.you have given number of rows K & number of elemnets N in array.

You are given a 1D array of integers, such as: int[] array = [3,4,7,2,2,6,0,9];
Suppose you need to treat this array as a 2D table with a given number of rows.
You want to sum the columns of the table.One value of numRows is 4..in that case the resultant array would look like

what if numRows==4?
3 4
7 2
2 6
0 9
—-
12 21

Data Structure : 2D Array


Algorithm:
1.As we have given Number of elements N in array & number of rows k to make 2D
Array we can easily find out number of column C=N/k
2.Then using two for loop down the 2D array 1st loop run for coumns & inner loop
work for rows.(A Tricky Mind Needed here although so Simple
).as in inner loop
we will increment by value of column we get.

Working Code:

#include
#include
using namespace std;
int main()
{
int i=0,j=0,k=0,sum=0;
int r=3;//row
int a[]={3,4,7,2,2,6,0,9,5,8,11};
int n=sizeof(a)/sizeof(int);
j=n/r;//column

while(k {
for(i=k;i {
sum=sum+a[i];
}
cout< sum=0;
k++;
}

return 0;
}

Time Complexity O(column*row)=O(N) see above relation C=N/r
Space Complexity O(1)
Run Here https://ideone.com/RWjEf

Tuesday, June 21, 2011

Given ‘n’ no of sets and each set is of a variable size. Find out the cross product of these ‘n’ of sets.

For eg. {1,2} , {3,4} , {5,6,7} these are the three sets.The cross product of these sets would be as below.
{1,3,5},{1,3,6},{1,3,7},{1,4,5},{1,4,6},{1,4,7},{2,3,5},{2,3,6},{2,3,7},{2,4,5},
{2,4,6},{2,4,7}.


Data Structure: Binary Tree

Algorithm:


Working Code:






Time Complexity
Space Complexity
Run Here

Given an array of integers, find out number of ways in which you can select increasing subsequences of length k(k<=n).

eg array is 1 4 6 2 5 & k=3
Output: 3
Sequences: 1 4 6, 1 4 5, 1 2 5


Question is a variant of longest increasing sub sequence problem(LIS Problem) or longest consecutive elements

See http://en.wikipedia.org/wiki/Longest_increasing_subsequence

Data Structure: Array
Algorithm:LIS


Working Code:By Manu

import java.io.*

class LISOfLengthK
{

private static void lisOfLengthK(Integer [] inputArray, int k) {
int n = inputArray.length, temp, count;
Integer [] best = new Integer[n];
Integer [] last = new Integer[n];
Integer [] outputArray = new Integer[k];
for(int i=0;i inputArray[i]) {
best[j] = best[i]+1;
last[j] = i;
if(best[j] == k) {
temp = j;
count =0;
outputArray[count++] = inputArray[temp];
while(last[temp] != -1) {
outputArray[count++] = inputArray[last[temp]];
temp = last[temp];
}
for(int index=k-1;index>=0;index--)
System.out.print(outputArray[index] + " ");
System.out.println();
}
}
}
}
}

public static void main(String[] args) throws NumberFormatException, IOException {
BufferedReader br = new BufferedReader(new InputStreamReader(System.in));
int n = Integer.parseInt(br.readLine());
Integer [] inputArray = new Integer[n];
for(int i=0;i inputArray[i] = Integer.parseInt(br.readLine());
int k = Integer.parseInt(br.readLine());
lisOfGivenLength(inputArray, k);
}

}


Time Complexity O(N^2)
Space Complexity O(1)
https://ideone.com/Jbtw0

WAP Convert Set of Nodes of Binary Tree where each node contains two value i, j & i value follow BST property , j value follow Heap Property, Build BST form these Nodes.

A rooted binary tree with keys in its nodes has the binary search tree
property (BST property) if, for every node, the keys in its left
subtree are smaller than its own key, and the keys in its right
subtree are larger than its own key. It has the heap property if, for
every node, the keys of its children are all smaller than its own key.
You are given a set of n binary tree nodes that each contain an
integer i and an integer j. No two i values are equal and no two j
values are equal. We must assemble the nodes into a single binary tree
where the i values obey the BST property and the j values obey the
heap property. If you pay attention only to the second key in each
node, the tree looks like a heap, and if you pay attention only to the
first key in each node, it looks like a binary search tree.Describe a
recursive algorithm for assembling such a tree

Indirectly They Asked to Implement Treap

Data Structure: Treap( Heap + Binary Search Tree)

Algorithm:
1.Sort the nodes in decreasing order of their j values and then simply build the
2.BST according to the i values starting from the first node in the obtained sorted
list. The first node will be the root. For example, let the given nodes (i,j
pairs) be:

so our structure of node will looks like this

struct node
{
int i;
int j;
struct node *left;
struct node *right;
};



For Example:
(12,6) (18,25) (19,10) (17,5) (19,10) i.e. 5 nodes
nodes after sorting in decreasing order according to j values:
(18,25) (16,11) (19,10) (12,6) (17,5)
So the tree would be something like this:


(18,25)
|
| |
(16,11) (19,10)
|
| |
(12,6) (17,5)


Working Code:


#include
#include



struct node
{
int i;
int j;
struct node *left;
struct node *right;
};


void mySwap(struct node **n1, struct node **n2)
{
struct node *tmp;
tmp = *n1;
*n1 = *n2;
*n2 = tmp;
}


struct node *myInsert(struct node *root, struct node *nodeToInsert)
{
if(root == NULL)
{
return nodeToInsert;
}
else
{
if(nodeToInsert->i <= root->i)
{
root->left = myInsert(root->left,nodeToInsert);
}
else
{
root->right = myInsert(root->right,nodeToInsert);
}
return root;
}
}


void myFunc(struct node *arr[], struct node **resultTree, int size)
{
if(!size)
return;
int ind;
int maxInd = 0;
for(ind=0;indj > arr[maxInd]->j)
maxInd = ind;
*resultTree = myInsert(*resultTree,arr[maxInd]);//inserting node with maximum j value
mySwap(&arr[maxInd],&arr[size-1]);
myFunc(arr,resultTree,size-1);
}


int main()
{
int n;
struct node *arr[100];
scanf("%d",&n);
int ind;
int ival,jval;
struct node *result = NULL;
for(ind=0;indi = ival;
arr[ind]->j = jval;
arr[ind]->left = NULL;
arr[ind]->right = NULL;
}
myFunc(arr,&result,n);
//levelOrderTraversal(result);
//PreOrderTraversal(result);
for(ind=0;ind free(arr[ind]);
return 0;
}

Solution Given BY Vipul

Time Complexity O(N^2)
Space Complexity O(logn)
Run Here https://ideone.com/o7dfU

WAP to Find Two Leaf Nodes in Binary Tree such that F(x,y) is maximum

Given a binary tree, find 2 leaf nodes say X and Y such that F(X,Y) is maximum where F(X,Y) = sum of nodes in the path from root to X + sum of nodes in the path from root to Y - sum of nodes in the common path from root to first common ancestor of the Nodes X and Y


Data Structure: Binary Tree

Algorithm:


Working Code:




Time Complexity
Space Complexity

WAP to Find Number of Pairs of Intersecting Disc ,Given Array of N Integers .

Given an array A of N integers we draw N discs in a 2D plane, such that i-th disc has center in (0,i) and a radius A[i]. We say that k-th disc and j-th disc intersect, if k! =j and k-th and j-th discs have at least one common point.

Write a function

class Solution { public int number_of_disc_intersections(int[] A); }

which given an array A describing N discs as explained above, returns the number of pairs of intersecting discs. For example, given N=6 and

A[0]=1 A[1]=5 A[2]=2 A[3]=1 A[4]=4 A[5]=0

there are 11 pairs of intersecting discs:

0th and 1st
0th and 2nd
0th and 4th
1st and 2nd
1st and 3rd
1st and 4th
1st and 5th
2nd and 3rd
2nd and 4th
3rd and 4th
4th and 5th

so the function should return 11.

The function should return -1 if the number of intersecting pairs exceeds 10,000,000. The function may assume that N does not exceed 10,000,000

Hint & Algorithms: Use HashTable , Linear Probing Based, in Case of Collision increment Corresponding Counter,if i am correct ,coding part is very easy, still if get time, will surely put running code here as usual :) Happy Coding

Time Complexity O(N) Where N is Size of Array
Space Complexity O(N)

WAP to Find equilibrium index in Unsorted Array of Size N, Efficiently

The equilibrium index of a sequence is an index such that the sum of elements at lower indexes is equal to the sum of elements at higher indexes. For example, in a sequence A:

A[0]=-7 A[1]=1 A[2]=5 A[3]=2 A[4]=-4 A[5]=3 A[6]=0

3 is an equilibrium index, because:
A[0]+A[1]+A[2]=A[4]+A[5]+A[6]

6 is also an equilibrium index, because:
A[0]+A[1]+A[2]+A[3]+A[4]+A[5]=0

(The sum of zero elements is zero) 7 is not an equilibrium index - because it is not a valid index of sequence A. If you still have doubts, here is a precise definition: The integer k is an equilibrium index of a sequence A[0],A[1]..,A[n-1] if and only if 0<= k and sum(A[0..(k-1)])=sum(A[(k+1)..(n-1)]). Assume the sum of zero elements is equal to zero.

Write a function
int equi(int A[], int n)

that, given a sequence, returns its equilibrium index (any) or -1 if no equilibrium index exists. Assume that the sequence may be very long.

The problem can be solved by using various approaches, the most common being simply to follow the equilibrium definition:

int equi ( int A[], int n ) {
int k, m, lsum, rsum;
for(k = 0; k < n; ++k) {
lsum = 0; rsum = 0;
for(m = 0; m < k; ++m) lsum += A[m];
for(m = k + 1; m < n; ++m) rsum += A[m];
if (lsum == rsum) return k;
}
return -1;
}

Time Complexity o(N^2)
Space Complexity O(1)
Run Here https://ideone.com/fThsV

Unfortunately, this approach has two disadvantages:

* it fails on large input data sets, since the time complexity is O(n2)
* it fails on large input values (for example if the input array contains values like MIN/MAX_INT) due to the arithmetic overflows

The solution analysis will detect such problems in submitted code:

Optimized Solution

We can fix the first problem with a better algorithm, and the second problem with a better data-type (for example, using long long type instead of int for sum computations). The key observation for better running time is to update the left/right sums in constant time instead of recomputing them from the scratch.

int equi(int arr[], int n) {
if (n==0) return -1;
long long sum = 0;
int i;
for(i=0;i
long long sum_left = 0;
for(i=0;i long long sum_right = sum - sum_left - (long long) arr[i];
if (sum_left == sum_right) return i;
sum_left += (long long) arr[i];
}
return -1;
}


Using this solution, you can obtain a perfect answer:

To return all equilibrium index we can print instead of returning the single index

Time Complexity O(N)
Space Complexity O(1)
Run Here https://ideone.com/OKM8d

Construct Binary Tree From Ancester Matrix Efficiently

You are given a matrix M[n, n] for a tree with n nodes. In the given matrix M, M[i, j] is true iff node j is an ancestor of node i. Construct the tree from the given matrix.

For example, you are given below matrix.

1 2 3 4
1 0 1 1 0

2 0 0 1 0

3 0 0 0 0

4 0 1 1 0

You need to construct the below tree. In the constructed tree ancestor relationship should be correct. A node can come on left or right of its parent as you cannot determine this information from the ancestor matrix

Node numbers used in matrix are in bracket
5(3)
/
/
10(2)
/ \
/ \
12(1) 13(4)


Data Structure Used: 2-D Array , Binary Tree

Algorithm


Time Complexity
Space Complexity

Given an array and an integer k, find the maximum for each and every contiguous sub array of size k.

Sample Input :
1 2 3 1 4 5 2 3 6
3 [ value of k ]

Sample Output :
3
3
4
5
5
5
6

This Question is Asked By Google Indirectly for Maximum in window or subarray of size k in array of size n, yes it is sliding window problem, as window slides we need to find out the maximum in each window. is it ??

Input: A long array A[], and a window width w
Output: An array B[], B[i] is the maximum value of from A[i] to A[i+w-1]
Requirement: Find a good optimal way to get B[i]

Solution: It Can Solved By Two Ways in This Question Data Structure Plays Important Role

The obvious solution with run time complexity of O(nw) is which is not efficient enough. Every time the window is moved, we have to search for the maximum from w elements in the window. where w is size of window & n is size of array

1st Method(Naive Approach)

Data Structure Used : Array
Algorithm: A.for all i=0 to n-w+1 (we should have at-least w elements in window)
B.for all j=i to i+w (keep incrementing windows size form left to right)
C find maximum inn each window & print it or put in array (Auxiliary)

#include

void printMaxInSlidingWindows(int a[],int n,int w)
{

int max=0;
int i=0,j=0;

for (i = 0; i max)
}
max=a[j];

}

}
printf( " %d ", max);
}

}

int main()
{
int a[]={1,3,-1,-3,5,3,6,7};
printMaxInSlidingWindows(a,8,3);
}

TC O(nw)
SC O(1)
Run Here http://ideone.com/7o3Ta


2nd Method

Data Structure Used: Queue(More Efficient)

We need a data structure where we can store the candidates for maximum value in the window and discard the element, which are outside the boundary of window. For this, we need a data structure in which we can edit at both the ends, front and back. Deque is a perfect candidate for this problem.

We are trying to find a way in which, we need not search for maximum element among all in the window. We will make sure that the largest element in the window would always appear in the front of the queue.
While traversing the array in forward direction if we find a window where element A[i] > A[j] and i > j, we can surely say that A[j], will not be the maximum element for this and succeeding windows. So there is no need of storing j in the queue and we can discard A[j] forever.
For example, if the current queue has the elements: [4 13 9], and a new element in the window has the element 15. Now, we can empty the queue without considering elements 4, 13, and 9, and insert only element 15 into the queue.

Every time, we move to a new window, we will be getting a new element and leave an old element. We should take care of:
Popping elements outside the window from queue front.
Popping elements that are less than new element from the queue.
Push new element in the queue as per above discussion.

Note:Optimization Done In Done so that we can Find The Maximum of Each Window in O(1)

Here Is Tunning Code

import java.util.*;

class Maximumin_SlidingWindow
{

public static void main(String ar[])
{

Integer a[]=new Integer[]{1,3,-1,-3,5,3,6,7};
int w=3,i=0;
int size=a.length;
Integer b[]=new Integer[size-w+1];

maxSlidingWindow(a,size,w,b);

for(i=0;iQ=ar[0];

//Initilize deque Q for first window
for (int i = 0; i < w; i++) { while (!Q.isEmpty() && A[i] >= A[Q.peekLast()])
Q.pollLast();
Q.offerLast(i);
}

for (int i = w; i < n; i++) { B[i-w] = A[Q.peekFirst()]; //update Q for new window while (!Q.isEmpty() && A[i] >= A[Q.peekLast()])
Q.pollLast();

//Pop older element outside window from Q
while (!Q.isEmpty() && Q.peekFirst() <= i-w)
Q.pollFirst();

//Insert current element in Q
Q.offerLast(i);
}
B[n-w] = A[Q.peekFirst()];
}

}

TC O(n)n Since Eacj Array Element is Inserted & deleted At-Most Once
SC O(1)
Run Here http://ideone.com/KLIpO
http://ideone.com/TftYg