Thursday, August 4, 2011

Give a data-structure which will guarantee O(log n) time per operation.

Complete binary tree as an efficient data-structure

You are given an array of size n(n being a power of two). All the entries of the array are initialized to zero. You have to perform a sequence of the following online operations :

* (i) Add(i,x) which adds x to the entry A[i].
* (ii) Report sum(i,j) = sum of the entries in the array from indices i to j for any 0 < i < j <= n.

It can be seen easily that we can perform the first operation in O(1) time whereas the second operation may cost O(n) in worst case. Your objective is to perform these operations efficiently. Give a data-structure which will guarantee O(log n) time per operation. (The title of the problem is a hint).

Merge the two Binary Search Trees in time O(log m + log n)

You are given two height balanced binary search trees T and T', storing m and n elements respectively. Every element of tree T is smaller than every element of tree T'. Every node u also stores height of the subtree rooted at it. Using this extra information how can you merge the two trees in time O(log m + log n) (preserving both the height balance and the order)?

Determine whether the kth largest element of the heap is greater than x or not.

Consider a binary heap containing n numbers (the root stores the greatest number). You are given a positive integer k < n and a number x . You have to determine whether the kth largest element of the heap is greater than x or not. Your algorithm must take O(k) time. You may use O(k) extra storage.

Searching for a friend

You are standing at a crossing from where there emerge four roads extending to infinity. Your friend is somewhere on one of the four roads. You do not know on which road he is and how far he is from you. You have to walk to your friend and the total distance traveled by you must be at most a constant times the actual distance of your friend from you. In terminology of algorithms, you should traverse O(d) distance, where d is the distance of your friend from you.

Searching For Celebrity

Celebrity is a person whom everybody knows but he knows nobody. You have gone to a party. There are total n persons in the party. Your job is to find the celebrity in the party. You can ask questions of the form Does Mr. X know Mr. Y ?. You will get a binary answer for each such question asked. Find the celebrity by asking only O(n) questions.

You are given n real numbers in an array. A number in the array is called a decimal dominant if it occurs more than n/10 times in the array. Give an O(n) time algorithm to determine if the given array has a decimal dominant.

Generating lexial predecessor of a k-subset of set of n elements

Wednesday, August 3, 2011

Given a file containing 4,300,000,000 integers, how can you find one that appears at least twice

A Theoritical Approach From Programming Pearls

Binary search find an element that occurs at least twice by recursively searching the subinterval that contains more than half of the integers. My original solution did not guarantee that the number of integers is halved in each iteration, so the worst case run time of its log2 n passes was proportional to n·log n. Jim Saxe reduced that to linear time by observing that the search can avoid carrying too many duplicates.

When his search knows that a duplicate must be in a current range of m integers, it will only store m+1 integers on its current work tape;

If more integers would have gone on the tape, his program discards them. Although his method frequently ignores input variables, its strategy is conservative enough to ensure that it finds at least one duplicate.

The algorithm that Bentley is talking about works by repeatedly halving
the candidate range in which the duplicate element must lie. Initially
this range is 0..2^32-1. On each pass it throws away the half of the
range that contains fewer data values and it also throws away the data
lying in that half. Eventually the range decreases to a single value,
which must be a duplicate because the remaining data has at least two
elements. The problem that Bentley notes is that the data may still
have 4 billion elements at this stage! The final improvement he
mentions is that you can throw away data _inside_ the candidate range
as long as you keep enough data around to ensure that at least one
duplicate occurs in it. This is equal to the size of the current
candidate range plus 1!


Another Similer Approach I Found

Create a bit array of length 2^32 bits (initialize to zero), that would be about 512MB and will fit into RAM on any modern machine.

Start reading the file, int by int, check bit with the same index as the value of the int, if the bit is set you have found a duplicate, if it is zero, set to one and proceed with the next int from the file.

The trick is to find a suitable data structure and algorithm. In this case everything fits into RAM with a suitable data structure and a simple and efficient algorithm can be used.
If the numbers are int64 you need to find a suitable sorting strategy or make multiple passes, depending on how much additional storage you have available.

The Pigeonhole Principle -- If you have N pigeons in M pigeonholes, and N>M, there are at least 2 pigeons in a hole. The set of 32-bit integers are our 2^32 pigeonholes, the 4.3 billion numbers in our file are the pigeons. Since 4.3x10^9 > 2^32, we know there are duplicates.

You can apply this principle to test if a duplicate we're looking for is in a subset of the numbers at the cost of reading the whole file, without loading more than a little at a time into RAM-- just count the number of times you see a number in your test range, and compare to the total number of integers in that range. For example, to check for a duplicate between 1,000,000 and 2,000,000 inclusive:

int pigeons = 0;
int pigeonholes = 2000000 - 1000000 + 1; // include both fenceposts
for (each number N in file) {
if ( N >= 1000000 && N <= 2000000 ) { pigeons++ } } if (pigeons > pigeonholes) {
// one of the duplicates is between 1,000,000 and 2,000,000
// try again with a narrower range
}

Picking how big of range(s) to check vs. how many times you want to read 16GB of data is up to you :)

As far as a general algorithm category goes, this is a combinatorics (math about counting) problem.

Given an array[] of integers (+ive , -ive, 0) find the subarray which gives the largest product.

Data Structure Used: Array

Algorithm & Expliantion

Lets us take an array = { 2,-25,4,5,-3,-5}

We take 3 variables P , N , Val

P=1 , N=0 , Val=0

first value is 2 , A[0] which is +ve . So we multiply both P & N with A[0] .
P=2 , N=0

now V[1] = -25 -ve .

We multiply P with -V[1] & N with -V[1] .

P=50
N=0

As V[1] is -ve we swap P & N .
P=0 , N=50

if V[i] == 0
We initialise P=1 , N=0

if( P < 1 ) /* analogous to kadane's algo's if( sumSoFar < 0 ) */ { P=1; N=0; } at every step val = max( val , P ) We proceed in the same fashion till the end . What we are trying to do is to maintain max Positive product so far & max -ve product so far . Hence when we encounter a -ve value we multiply both with - V[i] & then swap as -ve * -ve = +ve -ve * +ve = -ve Working Code #include

int swap(int *a,int *b)
{
int t;
t=*a;
*a=*b;
*b=t;
}

int main()
{
int a[]={6,-1,2,-33,4,15,-7,28,-9,-10};
int maxp=1,maxn=0;
int i=0;
for(i=0;i<10;i++) { if(a[i]>0)
{
maxp*=a[i];
maxn*=a[i];
}
else
{
maxp*=-a[i];
maxn*=-a[i];
swap(&maxp,&maxn);
}


}

printf("%d",maxp);
}

Time Complexity O(N)
Space Conmplexity O(1)
Run Here https://ideone.com/VaKgk

Design An Algorithm to Exhibit Did You Mean Feature of Google Search Engine

When people search at Google, they often type two words without space. For example, instead of typing "binary tree", someone can type "binarytree". Given an input, what is a good way of finding if it is a combination of two valid words e.g. In Directly You have Add Space between two such or if are using some data structure , whene words would have been inserted , you would have mainted a boolen varible to defined the end of word (eod)


An Exanmple Will be Let's say you have a phrase without any spaces - eg. "thisisawesome". Given a dictionary, how would you add spaces in this string?





When people search at Google, they often type two words without space. For example, instead of typing "binary tree", someone can type "binarytree". Given an input, what is a good way of finding if it is a combination of two valid words e.g. In Directly You have Add Space between two such or if are using some data structure , where words would have been inserted , you would have maintained a boolean variable to defined the end of word (eod)
An Example Will be Let's say you have a phrase without any spaces - eg. "thisisawesome". Given a dictionary, how would you add spaces in this string?



        Trie trie = new Trie();
        trie.insert("i");
        trie.insert("this");
        trie.insert("saw");
        trie.insert("is");
        trie.insert("a");
        trie.insert("some");
        trie.insert("awesome");
        String inputString = "thisisawesome";
        String outputString = "";
      
        int left = 0;
        int right = 1;
        while ( right <= inputString.length()) 
        {
            {
                String str = inputString.substring(left,right);
                boolean found = trie.containsKey(str);
                if ( found) 
                {
                    outputString += str;
                    outputString += " ";
                    left += str.length();
                    right = left + 1;
                }
                 else
                 {
                    right++;
                }
            }
        }
      
   

Another Good Working Code  You Can Refer Avinash Solution


Time Complexity O(K) k is length of input string