Example: [10, 4, -2, 7, 8, -1, -5] Maximum average of continuous elements 7+8=15/2 =7.5
Saturday, December 10, 2011
Construct a BST where each node has 3 pointers instead of 2. the structure is struct node { int data; struct node *left; struct node *right; struct node *inordersuccessor; } write a code to add nodes in a binary search tree . inordersuccessor pointing to inorder successor.
Labels:Data
Amazon Interview
Thursday, December 8, 2011
Reverse the direction of each pointers of Linked List which haveing next & random pointer with it.
Consider a Linked List with each Node, in addition to having a 'next'
pointer also has a 'random' pointer. The 'random' pointer points to
some random other Node on the linked list. It may also point to NULL. To
simplify things, no two 'random' pointers will point to the same node,
but more than 1 Node's random pointer can point to NULL.
Now we are required to reverse the direction of all the pointers (both the 'next' and 'random') of the Linked list. The constraint is the solution MUST be O(1) space complexity (A constant number of new nodes can be created but not proportional to the length of the list)
Now we are required to reverse the direction of all the pointers (both the 'next' and 'random') of the Linked list. The constraint is the solution MUST be O(1) space complexity (A constant number of new nodes can be created but not proportional to the length of the list)
Labels:Data
Amazon Interview
Wednesday, December 7, 2011
Find The Largest Binary Search Tree in GIven Tree . Its Should be the Subtree.
Very Faq. Asked Question All Tech Giants :) Will Post The Solution Soon :) Meanwhile You Can Try & Post the approach .
Prerequisite : Most Efficient Solution to Check if Given Tree is BST or not ? !st Try it or Search Archives :)
Basic Idea :
Traverse the tree. From each node to the parent, return the following
set of values.
1) If BST, size of the current BST or -1 if the tree is not.
2) Minval & Maxval of the subtree and maxbstsize seen so far (
probably using references)
So in each node check the following:
set of values.
1) If BST, size of the current BST or -1 if the tree is not.
2) Minval & Maxval of the subtree and maxbstsize seen so far (
probably using references)
So in each node check the following:
If( leftmax < node->data && node->data < rightmin) // Means it is a BST { new size = rightsize+leftsize+1 update size if greater than max return size } else { return -1; }
Working Code:
/* Largest Binary search tree inside a binary tree -- O(n)*/
#include <stdlib.h>
#include <stdio.h>
using namespace std;
struct node
{
int data;
struct node* left;
struct node* right;
};
struct node* insert_bst(struct node* root, int num){
if(root == NULL){
root = (struct node*)malloc(sizeof(struct node));
root->data = num;
root->left = NULL;
root->right = NULL;
return root;
}else{
if(num == root->data){
return root;
}
if(num > root->data){
root->right=insert_bst(root->right,num);
}else{
root->left=insert_bst(root->left,num);
}
}
return root;
}
void inorder_traverse(struct node* root){
if(root == NULL) return;
inorder_traverse(root->left);
printf("%d ",root->data);
inorder_traverse(root->right);
}
struct node* bst_root = NULL;
int getmaxbst(struct node* root, int& subtreemin, int &subtreemax, int& max)
{
if(root == NULL) return 0;
int leftsubtreemin = -32767, rightsubtreemin = -32767;
int leftsubtreemax = 32767, rightsubtreemax = 32767;
int x,y;
x = getmaxbst(root->left, leftsubtreemin, leftsubtreemax, max);
y = getmaxbst(root->right, rightsubtreemin, rightsubtreemax, max);
if(x==-1 || y ==-1)
return -1;
if(x==0) { leftsubtreemax = root->data; leftsubtreemin = root->data;}
if(y==0) { rightsubtreemin = root->data; rightsubtreemax = root->data;}
if(root->data < leftsubtreemax ||
root->data > rightsubtreemin){
return -1;
}
subtreemin = leftsubtreemin;
subtreemax = rightsubtreemax;
if(x+y+1 > max){
max = x+y+1;
bst_root = root;
}
return x+y+1;
}
int main()
{
struct node* root=NULL;
root=insert_bst(root,5);
root=insert_bst(root,3);
root=insert_bst(root,9);
root=insert_bst(root,7);
root=insert_bst(root,4);
root=insert_bst(root,1);
root=insert_bst(root,14);
root=insert_bst(root,11);
root->data = 0;
int a,b,c,max=-32767;
c = getmaxbst(root,a,b,max);
printf("\nmax is %d\n",max);
inorder_traverse(bst_root);
return 1;
}
TC O(N)
SC O(1)
Labels:Data
Adobe Question
,
Amazon Interview
,
Google Interview
,
Trees
Wednesday, November 30, 2011
Tuesday, November 29, 2011
Given a number 123456789 and two opearators + and *. You have also given value k , You have to find all the such expressions that evaluates and value is equal to the given value.
Given a number 123456789 and two opearators + and
*. You can use this two operators as many times u want. But you cant
change the sequence of the number given there. The evaluated value is
2097.
e.g. 1+2+345*6+7+8+9=2097
You have to find all the such expressions that evaluates and value is
equal to the given value. You can use concatenation of numbers like 345 is concatenated there.
e.g. some more example
1+2+345*6+7+8+9 = 2097
12*3*45+6*78+9 = 2097
12*34+5*6*7*8+9 = 2097
e.g. 1+2+345*6+7+8+9=2097
You have to find all the such expressions that evaluates and value is
equal to the given value. You can use concatenation of numbers like 345 is concatenated there.
e.g. some more example
1+2+345*6+7+8+9 = 2097
12*3*45+6*78+9 = 2097
12*34+5*6*7*8+9 = 2097
Given an image that is represented by Nx1000 matrix of binary numbers. 1 represents black(image ink) and 0 represents white(blank).,Return all the positions of the pixels where you break the page and the number of pages, so that the image can be printed in the minimum number of pages.
Given an image that is represented by Nx1000
matrix of binary numbers. 1 represents black(image ink) and 0 represents
white(blank).
The page breaks are applied in two ways:
1.)Find the row with all the white pixels.
(But this selection should be efficient as we want to print in minimum no. of pages.
For example: if we get a white line on 200th row, 600th row and 900th row, we should choose 900th line to break page).
2.) If no such row exists, break on the 1000th line.
Return all the positions of the pixels where you break the page and the number of pages, so that the image can be printed in the minimum number of pages.
The page breaks are applied in two ways:
1.)Find the row with all the white pixels.
(But this selection should be efficient as we want to print in minimum no. of pages.
For example: if we get a white line on 200th row, 600th row and 900th row, we should choose 900th line to break page).
2.) If no such row exists, break on the 1000th line.
Return all the positions of the pixels where you break the page and the number of pages, so that the image can be printed in the minimum number of pages.
Devise an algorithm for maximizing the sum of the randomly selected elements from the k subarrays. Basically means that we will want to split the array in such way such that the sum of all the expected values for the elements selected from each subarray is maximum.
You have an array with *n* elements. The elements are either 0 or 1. You
want to *split the array into kcontiguous subarrays*. The size of each
subarray can vary between ceil(n/2k) and floor(3n/2k). You can assume that
k << n. After you split the array into k subarrays. One element of each
subarray will be randomly selected.
Devise an algorithm for maximizing the sum of the randomly selected
elements from the k subarrays. Basically means that we will want to split
the array in such way such that the sum of all the expected values for the
elements selected from each subarray is maximum.
You can assume that n is a power of 2.
Example:
Array: [0,0,1,1,0,0,1,1,0,1,1,0]
n = 12
k = 3
Size of subarrays can be: 2,3,4,5,6
Possible subarrays [0,0,1] [1,0,0,1] [1,0,1,1,0]
Expected Value of the sum of the elements randomly selected from the
subarrays: 1/3 + 2/4 + 3/5 = 43/30 ~ 1.4333333
Optimal split: [0,0,1,1,0,0][1,1][0,1,1,0]
Expected value of optimal split: 1/3 + 1 + 1/2 = 11/6 ~ 1.83333333
Source -> http://stackoverflow.com/ questions/8189334/google- combinatorial-optimization- interview-problm
want to *split the array into kcontiguous subarrays*. The size of each
subarray can vary between ceil(n/2k) and floor(3n/2k). You can assume that
k << n. After you split the array into k subarrays. One element of each
subarray will be randomly selected.
Devise an algorithm for maximizing the sum of the randomly selected
elements from the k subarrays. Basically means that we will want to split
the array in such way such that the sum of all the expected values for the
elements selected from each subarray is maximum.
You can assume that n is a power of 2.
Example:
Array: [0,0,1,1,0,0,1,1,0,1,1,0]
n = 12
k = 3
Size of subarrays can be: 2,3,4,5,6
Possible subarrays [0,0,1] [1,0,0,1] [1,0,1,1,0]
Expected Value of the sum of the elements randomly selected from the
subarrays: 1/3 + 2/4 + 3/5 = 43/30 ~ 1.4333333
Optimal split: [0,0,1,1,0,0][1,1][0,1,1,0]
Expected value of optimal split: 1/3 + 1 + 1/2 = 11/6 ~ 1.83333333
Source -> http://stackoverflow.com/
Labels:Data
Facebook Interview
,
Google Interview
Saturday, November 26, 2011
SMS Problem
1 - NULL, 2 - ABC, 3 - DEF, 4 - GHI, 5 - JKL, 6 - MON, 7 - PQRS, 8 - TUV, 9 - WXYZ, * - <Space>, # - <Break>
We must convert the numbers to text.
Eg
I/P - O/P
22 - B
23 - AD
223 - BD
22#2 - BA (# breaks the cycle)
3#33 - DE
2222 - 2
2222#2 - 2A
22222 - A (cycle must wrap around)
222222 - B
We must convert the numbers to text.
Eg
I/P - O/P
22 - B
23 - AD
223 - BD
22#2 - BA (# breaks the cycle)
3#33 - DE
2222 - 2
2222#2 - 2A
22222 - A (cycle must wrap around)
222222 - B
Labels:Data
EPIC System
Convert Binary Tree into Doubly Linked List Such That Each Node of Doubly Linked List Contains The Vertical Sum of Binary Tree . We Can't Modify The Orginal Linked List
P.S. Its Extension of Vertical Sum & BT to DLL Problem . Will Take Some Time to Think About The Solution .
Please Search Previous Threads for above two problems .Although I Will Try to Pust Exact Working Code ASAP i will finish it .
Please Search Previous Threads for above two problems .Although I Will Try to Pust Exact Working Code ASAP i will finish it .
Sunday, November 20, 2011
Devise a small memory streaming algorithm to determine if |A| = 1.
For a stream of insertions and deletions, recall that x[j] = #insertions - #deletions of element j.
Given such a stream satisfying x[j] >= 0 for all elements j, let
A = { j : x[j] > 0 }
Determining whether A is empty is easy: just check if the sum of all x[j]'s equals 0 (which is easily doable in a stream).
Problem: devise a small memory streaming algorithm to determine if |A| = 1.
Extensions: What about higher sizes of A? What if the promise is not satisfied and we define A to be the set of j's with x[j] not equal to 0.
Labels:Data
Google Interview
Saturday, November 19, 2011
Given a string consisting of a,b and c's, we can perform the following operation: Take any two adjacent distinct characters and replace it with the third character. For example, if 'a' and 'c' are adjacent, they can replaced with 'b'. What is the smallest string which can result by applying this operation repeatedly?
Labels:Data
Amazon Interview
Given a string of lowercase characters, reorder them such that the same characters are at least distance d from each other.
Input: { a, b, b }, distance = 2
Output: { b, a, b }
Labels:Data
Google Interview
Thursday, November 17, 2011
Why 11/11/11 Is Mathematically Amazing
You All Know Some Times Back We Had Data 11/11/11, is a once-in-a-century occurrence, adding to a November has been a very fun month for recreational mathematicians.
Last week, a rare eight-digit palindrome date — 11/02/2011, which reads the same frontward and backward — was found to have other mathematical qualities that made it a once-in-10,000-years date.Aziz Inan, a professor of electrical engineering at the University of Portland, Oregon, crunched the numbers and found that when the date was expressed as a number, 11,022,011, it has very special properties.
"It is the product of 7 squared times 11 cubed times 13 squared. That is impressive because those are three consecutive prime numbers. No other palindrome date up to A.D. 10,000 is like that," Inan said. "Not only that, if you write it out as 72 x 113 x 132, you'll notice that even the superscript power numbers — 232 — are a palindrome."
A once-in-10,000-years date is hard to top, but 11/11/11 is no slouch. Some people believe that the date 11/11/11 is a mystical day of good luck, or that 11/11/11 is a good day to make money. Inan explained that when one looks closely at the date, it too has some interesting mathematical properties.
After today, 11/11/11 will next occur 100 years down the road, on Nov. 11, 2111. Interestingly, in 2111, 11/11/11 will be followed by an eight-digit palindrome day, 11/12/2111, which is quite exciting for palindrome fans like Inan.
If you consider today's date as a number — 111,111 — you can run some additional fun math tricks, Inan explained. 111,111 can be obtained from its largest prime number factor, 37, like so: First, subtract 37 from its reverse (73) and you get 36. (Inan added that 36 is equal to the square of the sum of the digits in 111,111.)
Then, split 36 into three consecutive numbers that add up to 36 (11, 12 and 13). Then, multiply 11, 13, 37 and the reverse of 12 (21). And what comes out? You guessed it: 111,111.
Source www.lifeslittlemysteries.com/
Labels:Data
Yahoo Interview
Tuesday, November 15, 2011
Given Table of Cities , Find Minimum Number of Hopes Required to Fly From One to Other City
There is a very Primitive Database and it has a table say "Travel". The content of table is as follows:
Source | Dest
--------------
Sea | LA
LA | FL
LA | MA
FL | Sea
Sea | FL
--------------
Sea | LA
LA | FL
LA | MA
FL | Sea
Sea | FL
The ask is to find out all routes between (Sea) to (FL) with mininum hop.
the Result would be:
1. Sea -> FL
2. Sea -> LA - > FL
the Result would be:
1. Sea -> FL
2. Sea -> LA - > FL
You have to write a Middle tier function to achieve above result. You can assume there is DBAPI that return the Destination city if you provide the source city.
Labels:Data
Amazon Interview
,
DirectI Interview
,
Dynamic Programming
,
Facebook Interview
,
Google Interview
,
Microsoft Interview
,
Recursion
Monday, October 31, 2011
An array is given consisting of real integers , can hav repeatative elements , Now u need to find K partitions whose union will hav all the elements of the array , but the intersection of any pair of cannot hav K elements in common.
Example:-
Array- 1 1 2 2 3 1 4 3 5
For K=3 the partition are :-
(1 2 3) (1 5 2) (1 4 3)
cann't be :- (1 2 3)(3 1 2)(1 4 5)
Array- 1 1 2 2 3 1 4 3 5
For K=3 the partition are :-
(1 2 3) (1 5 2) (1 4 3)
cann't be :- (1 2 3)(3 1 2)(1 4 5)
Suppose u have a square matrix, where every cell is filled with 0 or 1 . U need to find the maximum subsquare such that all four borders are filled with all 1s.
Ex:-
1 0 0 1 1 0
1 0 1 1 1 0
0 0 1 0 1 1
0 1 1 1 1 0
1 0 0 1 1 1
Here the maximum square (3X3) possible is from the TOP LEFT (2 3) TO
BOTTOM RIGHT (4 5) .
1 0 0 1 1 0
1 0 1 1 1 0
0 0 1 0 1 1
0 1 1 1 1 0
1 0 0 1 1 1
Here the maximum square (3X3) possible is from the TOP LEFT (2 3) TO
BOTTOM RIGHT (4 5) .
Thursday, October 20, 2011
Given two bst, print the output that would come if two bsts are merged and then an inorder traversal is performed. Of, course you can't merge trees here.
PS:Heard From One of The My Friend :
Labels:Data
Amazon Interview
,
Google Interview
Given an integer N, there are 2^N binary codewords of length N. All of them are taken and sorted to create a sequence: Sort by the number of 1-bits in the codeword bit-string. If there is a tie, break tie so that the smaller number comes first in the output sequence
Example and Testcases :You would be given 2 integers N (<=10) and K (1 <= K <= 2^N)
Output the codeword with index K into this sequence
The input is from stdin and output to stdout
Sample Testcases:
Input #00:
3 5
Output #00:
011
Explanation:
For N = 3, the sequence consists of {000,001,010,100,011,101,110,
Output the codeword with index K into this sequence
The input is from stdin and output to stdout
Sample Testcases:
Input #00:
3 5
Output #00:
011
Explanation:
For N = 3, the sequence consists of {000,001,010,100,011,101,110,
111}. The 5th indexed number would be 011
Input #01:
7 127
Output #01:
1111110
Input #01:
7 127
Output #01:
1111110
Labels:Data
Facebook Interview
An array of integers is given and you have to find the largest possible integer by concatenating all elements:
Example:
array: 87 36 52
answer: 875236
array: 87 9 52
answer: 98752
|
Provide an efficient algorithm to solve the question "Can I buy N items?
A shop sells an item in packets of 6, 9, and 17. A customer can buy any number of packets, but a packet cannot be broken up. Provide an efficient algorithm to solve the question "Can I buy N items?". For example, is it possible to buy 29 items? What about 19? Your algorithm simply needs to return true or false.
Labels:Data
Coin Change
,
DirectI Interview
,
Dynamic Programming
,
Facebook Interview
,
Google Interview
,
Knapsack
Design Data Structure For Students-Courses Realtionship
In a university, students can enroll in different courses. A student may enroll for more than one course. Both students and courses can be identified by IDs given to them. Design a data structure to store students, courses, and the student-course relationships. You can use arrays, lists, stacks, trees, graphs, etc. or come up with your own data structures. Give the running times, in Big O notation, for the following operations for your data structure and justify the answers: a) Return all students in a list. b) Return all courses in a list. c) Return all courses in a list for a given student. d) Return all students in a list for a given course.
Labels:Data
Amazon Interview
,
Data Structures
,
DirectI Interview
,
Facebook Interview
,
Google Interview
Thursday, October 13, 2011
Determine a user’s influence on the system by determining how many users he is responsible for
You Know Invitation Based System. on Google+, Gmail,Yahoo,Facebook etc...users can add their friends by emailing a . We want to determine a user’s influence on the system by determining how many users he is responsible for. A user’s influence is calculated by giving him 1 point for every user he’s added in addition to the sum of the influence scores of each user that he’s added.
Example: User 0 adds User 1 and User 2. User 1 adds User 3.
User 0’s influence score is 3. (He added two users and one of them added added a third user.)
User 1's is 1.
User 2’s is 0.
User 3’s is 0.
P.S.:One of The Question Asked By Google M.V. From Me. :)
User 1's is 1.
User 2’s is 0.
User 3’s is 0.
P.S.:One of The Question Asked By Google M.V. From Me. :)
The above scenario is represented by the following input file. Line i is user ID i and position j within the line is marked with an X if user ID iadded user ID j. Both row and column indicies are 0-based:
OXXO OOOX OOOO OOOO
Sunday, October 9, 2011
How Google Won User's Heart :: Google Auto-Suggestion/Auto-Completion Algorithm Exposed
How Auto Suggestion Works e.g. How Google Won User's Heart
Some Useful Links
http://en.wikipedia.org/wiki/Autocomplete
http://igoro.com/archive/efficient-auto-complete-with-a-ternary-search-tree/
http://stevedaskam.wordpress.com/2009/05/17/autocomplete-data-structures/
http://sujitpal.blogspot.com/2007/02/three-autocomplete-implementations.html
http://stackoverflow.com/questions/73095/how-to-implement-a-simple-auto-complete-functionality
http://code.google.com/p/autocomplete-server/
http://suggesttree.sourceforge.net/
http://www.quora.com/What-is-the-best-open-source-solution-for-implementing-fast-auto-complete
http://www.phpfreaks.com/forums/index.php?topic=240881.0
http://dhruvbird.blogspot.com/2010/09/very-fast-approach-to-search.html
You've seen search engines suggest queries when you begin typing the first few letters of your search string. This is being done by Duck Duck Go as well as Google (to name a few). This is typically done by maintaining a list of past queries and/or important strings that the search engine thinks are worthy of being suggested to a user that is trying to find something similar. These suggestions are effective only if the search engine spits them out very fast since these should show up on the screen before the user has finished typing what he/she wanted to type. Hence the speed with which these suggestions are made is very critical to the usefulness of this feature.
Let us consider a situation (and a possible way of approaching this problem) in which when a user enters the first few letters of a search query, he/she is presented with some suggestions that have as their prefix, the string that the user has typed. Furthermore, these suggestions should be ordered by some score that is associated with each such suggestion.
Approach-1:
Our first attempt at solving this would probably involve keeping the initial list of suggestions sorted in lexicographic order so that a simple binary search can give us the 2 ends of the list of strings that serve as candidate suggestions. These are all the strings that have the user's search query as a prefix. We now need to sort all these candidates by their associated score in non-increasing order and return the first 6 (say). We will always return a very small subset (say 6) of the candidates because it is not feasible to show all candidates since the user's screen is of bounded size and we don't want to overload the user with too many options. The user will get better suggestions as he/she types in more letters into the query input box.
We immediately notice that if the candidate list (for small query prefixes say of length 3) is large (a few thousand), then we will be spending a lot of time sorting these candidates by their associated score. The cost of sorting is O(n log n) since the candidate list may be as large as the original list in the worst case. Hence, this is the total cost of the approch. Apache's solr uses this approach. Even if we keep the scores bound within a certain range and use bucket sort, the cost is still going to be O(n). We should definitely try to do better than this.
Approach-2:
One way of speeding things up is to use a Trie and store (pointers or references to) the top 6 suggestions at or below that node in the node itself. This idea is mentioned here. This results in O(m) query time, where m is the length of the prefix (or user's search query).
However, this results in too much wasted space because:
We can mitigate (1) by using Radix(or Patricia) Trees instead of Tries.
Approach-3:
There are also other approaches to auto-completion such as prefix expansion that are using by systems such asredis. However, these approaches use up memory proportional to the square of the size of each suggestion (string). The easy way to get around this is to store all the suggestions as a linear string (buffer) and represent each suggestion as an (index,offset) pair into that buffer. For example, suppose you have the strings:
Then your buffer would look like this:
hello worldhell breaks losewhiteboardblackboard
And the 4 strings are actually represented as:
(0,11), (11,16), (27,10), (37,10)
Similarly, each prefix of the suggestion "whiteboard" is:
Approach-4:
We can do better by using Segment (or Interval) Trees. The idea is to keep the suggestions sorted (as in approach-1), but have an additional data structure called a Segment Tree which allows us to perform range searches very quickly. You can query a range of elements in Segment Tree very efficiently. Typically queries such as min, max, sum, etc... on ranges in a segment tree can be answered in O(log n) where n is the number of leaf nodes in the Segment Tree. So, once we have the 2 ends of the candidate list, we perform a range search to get the element with the highest score in the list of candidates. Once we get this node, we insert this range (with the maximum score in that range as the key) into the priority queue. The top element in the queue is popped and split at the location where the element with the highest score occurs and the scores for the 2 resulting ranges are computed and pushed back into the priority queue. This continues till we have popped 6 elements from the priority queue. It is easy to see that we will have never considered more than 2k ranges (here k = 6).
Hence, the complexity of the whole process is the sum of:
Giving a total complexity of:
O(log n) + O(2k log n)
which is: O(k log n)
Let us consider a situation (and a possible way of approaching this problem) in which when a user enters the first few letters of a search query, he/she is presented with some suggestions that have as their prefix, the string that the user has typed. Furthermore, these suggestions should be ordered by some score that is associated with each such suggestion.
Approach-1:
Our first attempt at solving this would probably involve keeping the initial list of suggestions sorted in lexicographic order so that a simple binary search can give us the 2 ends of the list of strings that serve as candidate suggestions. These are all the strings that have the user's search query as a prefix. We now need to sort all these candidates by their associated score in non-increasing order and return the first 6 (say). We will always return a very small subset (say 6) of the candidates because it is not feasible to show all candidates since the user's screen is of bounded size and we don't want to overload the user with too many options. The user will get better suggestions as he/she types in more letters into the query input box.
We immediately notice that if the candidate list (for small query prefixes say of length 3) is large (a few thousand), then we will be spending a lot of time sorting these candidates by their associated score. The cost of sorting is O(n log n) since the candidate list may be as large as the original list in the worst case. Hence, this is the total cost of the approch. Apache's solr uses this approach. Even if we keep the scores bound within a certain range and use bucket sort, the cost is still going to be O(n). We should definitely try to do better than this.
Approach-2:
One way of speeding things up is to use a Trie and store (pointers or references to) the top 6 suggestions at or below that node in the node itself. This idea is mentioned here. This results in O(m) query time, where m is the length of the prefix (or user's search query).
However, this results in too much wasted space because:
- Tries are wasteful of space and
- You need to store (pointers or references to) 6 suggestions at each node which results in a lot of redundancy of data
We can mitigate (1) by using Radix(or Patricia) Trees instead of Tries.
Approach-3:
There are also other approaches to auto-completion such as prefix expansion that are using by systems such asredis. However, these approaches use up memory proportional to the square of the size of each suggestion (string). The easy way to get around this is to store all the suggestions as a linear string (buffer) and represent each suggestion as an (index,offset) pair into that buffer. For example, suppose you have the strings:
- hello world
- hell breaks lose
- whiteboard
- blackboard
Then your buffer would look like this:
hello worldhell breaks losewhiteboardblackboard
And the 4 strings are actually represented as:
(0,11), (11,16), (27,10), (37,10)
Similarly, each prefix of the suggestion "whiteboard" is:
- w => (27,1)
- wh => (27,2)
- whi => (27,3)
- whit => (27,4)
- white => (27,5)
- whiteb => (27,6)
- and so on...
Approach-4:
We can do better by using Segment (or Interval) Trees. The idea is to keep the suggestions sorted (as in approach-1), but have an additional data structure called a Segment Tree which allows us to perform range searches very quickly. You can query a range of elements in Segment Tree very efficiently. Typically queries such as min, max, sum, etc... on ranges in a segment tree can be answered in O(log n) where n is the number of leaf nodes in the Segment Tree. So, once we have the 2 ends of the candidate list, we perform a range search to get the element with the highest score in the list of candidates. Once we get this node, we insert this range (with the maximum score in that range as the key) into the priority queue. The top element in the queue is popped and split at the location where the element with the highest score occurs and the scores for the 2 resulting ranges are computed and pushed back into the priority queue. This continues till we have popped 6 elements from the priority queue. It is easy to see that we will have never considered more than 2k ranges (here k = 6).
Hence, the complexity of the whole process is the sum of:
- The complexity for the range calculation: O(log n) (omitting prefix match cost) and
- The complexity for a range search on a Segment Tree performed 2k times: O(2k log n) (since the candidate list can be at most 'n' in length)
Giving a total complexity of:
O(log n) + O(2k log n)
which is: O(k log n)
Some Useful Links
http://en.wikipedia.org/wiki/Autocomplete
http://igoro.com/archive/efficient-auto-complete-with-a-ternary-search-tree/
http://stevedaskam.wordpress.com/2009/05/17/autocomplete-data-structures/
http://sujitpal.blogspot.com/2007/02/three-autocomplete-implementations.html
http://stackoverflow.com/questions/73095/how-to-implement-a-simple-auto-complete-functionality
http://code.google.com/p/autocomplete-server/
http://suggesttree.sourceforge.net/
http://www.quora.com/What-is-the-best-open-source-solution-for-implementing-fast-auto-complete
http://www.phpfreaks.com/forums/index.php?topic=240881.0
http://dhruvbird.blogspot.com/2010/09/very-fast-approach-to-search.html
Friday, September 16, 2011
Wednesday, September 14, 2011
Use Google Chrome For Better View of Contents on Blog :)
Dear Readers,
I am getting regular suggestion & useful comments from many geeks to re-arrange the question according to type, i will try to do this asap i get time as i have to edit HTML Code & also some people are saying that contents are good but unable to view them. for this problem try Google Chrome Until I change the GUI & edit Html Code.
Thanks For Giving Suggestion to Solving Problems Efficiently
Happy Coding
Shashank !!!
Monday, September 5, 2011
Find the smallest number expressible as the sum of cubes of two numbers in two different ways. Indeed, 10^3 + 9^3 =12^3+1^3=1729 , Do It Efficiently
Problem:
A mathematician G. H. Hardy was on his way to visit his collaborator S. Ramanujan an who was in the hospital. Hardy remarked to Ramanujan that he traveled in taxi cab number 1729 which seemed a dull one and he hoped it was not a bad omen. To this, Ramanujan replied that 1729 was a very interesting number-it was the smallest number expressible as the sum of cubes of two numbers in two different ways. Indeed,
103 + 93 =123 + 1 3 二1729.
Hint: This problem is very similar to another very popular problem that is asked in interviews. You are given a η ×η matrix in which both rows and columns are sorted in ascending order and you are supposed
to find a given number in the matrix.
Algorithm:
In this case, we are essentially looking for an implicit matrix A such that A(i,j) =i^3十j^3. In our case, the matrix will have n^1/3 rows and columns and this matrix of size n^1/3 * n^1/3 we try to search k=i^3+j^3 isn't it ans algorithms for searching for a number in such a matrix that are linear in the number of rows.
One approach is to start by comparing x to An,1. If x = An ,l , stop.Otherwise, there are two cases:
- x < A1,n in which case x is less than element at Column n at 1,n position. which mean we can escape whole row itself . we decrement column pointer to left.e.g. previous column.
- x > A1,n , in which case if x > a[1][n] is greater than it will be greater then all elements in Row 1.. we increment the row pointer to next row.
Efficient Solution:
Here is the Pseudo Code For The Same.
int IsNumberEqualstoSumeTwoCubes(int n)
{
int row=Ceil(Pow(n,1/3));// Pow Has Complexity of O(logn) Ceil(1729)=577
int column=row;
int i=0;int j=column;
while(i<row && j>=0)
{
int m=i*i*i+j*j*j;
if(m==n)
return 1; // print i^3 and j^3 these are two such numbers
else if ( m < n)
i++;
else
j--;
}
return 0; //such number DNE
}
Complexity Analysis:
In either case, we have a matrix with η fewer elements to search. In each iteration, we remove a row or a column, which means we inspect 2η-1 elements.
We claim that our algorithm that solves the matrix search problem will have to compare x with each of the 2n-l elements shown (i.e the diagonal elements (in case of column element matching )and the elements immediately below them (in case of row elements matching )and this is obvious just draw a diagram in which both row & column are in sorted order & try to run the above algorithm ).
Call these elemmts the # elements..
Comparing x with any other elements in matrix does not eliminate the any of the # elements.we can easily Proof this by Contradiction:Suppose X algorithm doesn't compare x with any of the above # elements.
then we could make that element x ( instead of x-1 got to prev. column or x+1 go to next row.).hence we will get wrong result.
Above Algorithm Really Requires Deep Understanding of Young_tableau :)
Time Complexity O(row+column)=O(N^1/3+N^1/3)=O(2*N^1/3)=O(N^1/3)
Space Complexity O(1)
Labels:Data
Facebook Interview
,
Google Interview
,
Hardy Number
Friday, September 2, 2011
Get Rid Off Suffix Tree !!!!! An Online Compilation of Helpful Tutorial
Excellent Study Material For Understanding Algorithms & implementing Suffix Tree
6.http://apps.topcoder.com/forums/?module=RevisionHistory&messageID=1171511
7.http://www.stanford.edu/class/cs97si/suffix-array.pdf
Application of Suffix Tree
1.Longest repeated substring problem:
The longest repeated substring problem is finding the longest substring of a string that occurs at least twice. This problem can be solved in linear time and space by building a suffix tree for
the string, and finding the deepest internal node in the tree. The
string spelled by the edges from the root to such a node is a longest
repeated substring. The problem of finding the longest substring with
at least k occurrences
can be found by first preprocessing the tree to count the number of
leaf descendants for each internal node, and then finding the deepest
node with at least k descendants.
By from Sartaj Sahni
Find the longest substring ofS
that appears at leastm > 1
times. This query can be answered inO(|S|)
time in the following way:
(a) Traverse the suffix tree labeling the branch nodes with the sum of the label lengths from the root and also with the number of information nodes in the subtrie. (b) Traverse the suffix tree visiting branch nodes with information node count>= m
. Determine the visited branch node with longest label length.
Note that step (a) needs to be done only once. Following this, we can do step (b) for as many values ofm
as is desired. Also, note that whenm = 2
we can avoid determining the number of information nodes in subtries. In a compressed trie, every subtrie rooted at a branch node has at least two information nodes in it.
Note that step (a) needs to be done only once. Following this, we can do step (b) for as many values ofm
as is desired. Also, note that whenm = 2
we can avoid determining the number of information nodes in subtries. In a compressed trie, every subtrie rooted at a branch node has at least two information nodes in it.
2. Pattern Searching/Sub-String Searching
Searching for a substring,pat[1..m]
, intxt[1..n]
, can be solved in O(m) time (after the suffix tree fortxt
has been built in O(n) time).
3. Longest Repeated Substring
4. Longest Common Subsequence
5. Longest Palindrome
http://www.cs.helsinki.fi/u/ukkonen/SuffixT1withFigs.pdf
Add a special ``end of string'' character, e.g. `$', to i.e., `issi' in the case of `mississippi'. The longest repeated substring can be found in O(n) time using a suffix tree.
The longest common substring of two strings,
Equivalently, one can build a (basic) suffix tree for the string
(Try it using the HTML FORM above.)
txt[1..n]
and build a suffix tree; the longest repeated substring oftxt[1..n]
is indicated by the deepest fork node in the suffix tree, where depth is measured by the number of characters traversed from the root, The longest common substring of two strings,
txt1
and txt2
, can be found by building a generalized suffix tree for txt1
andtxt2
: Each node is marked to indicate if it represents a suffix of txt1
or txt2
or both. The deepest node marked for both txt1
and txt2
represents the longest common substring.Equivalently, one can build a (basic) suffix tree for the string
txt1$txt2#
, where `$' is a special terminator for txt1
and `#' is a special terminator for txt2
. The longest common substring is indicated by the deepest fork node that has both `...$...' and `...#...' (no $) beneath it.(Try it using the HTML FORM above.)
Note that the `longest common substring problem' is different to the `longest common subsequence problem' which is closely related to the `edit-distance problem': An instance of a subsequence can have gaps where it appears in
txt1
and in txt2
, but an instance of a substring cannot have gaps.A palindrome is a string, P, such that P=reverse(P). e.g. `abba'=reverse(`abba'). e.g. `ississi' is the longest palindrome in `mississippi'. The longest palindrome of
txt[1..n]
can be found in O(n) time, e.g. by building the suffix tree for txt$reverse(txt)#
or by building the generalized suffix tree for txt
and reverse(txt)
.(Try it.)
Others
--
Some Awesome Algorithms/Codes For Suffix Tree Implementation
Sunday, August 28, 2011
Given an array A[] and a integer num. Find four no.s in the array whose sum is equal to given num.
Algorithm :
Naive Algorithm Will O(N^4) Use Four Loop & keep Checking for all four elements.its so Naive ,easily understood-able..
Algorithm:
Little Efficient O(N^3).
Sort the input array. O(nlogn)
Then take 4 pointers p1,p2,p3,p4 where
0<=p1<=n-3 1st loop
p1+1<=p2<=n-2 1st inner loop
in loop set p3=p1+ & p4=n-1
( e.g. Take 3 Pointer Pointing to three successive indexes from starting & 4th Pointer pointing to n-1th position. )
Now use same algo we used to check for two elements in sorted array whose sum equals to given element or not 3rd Inner Loop
Most Efficient O(N^2)
Algorithm:
Create an array of all possible PAIR sums..that would be done in O(n^2)
Sort the Above array of size n^2 that can be done in.O(n^2log(n)) .
Then Search this array for two pairs..that sum to the required value..this can be done by maintaining two pointer One at the starting index..and other at the highest index..and moving them accordingly..(if sum of pair exceeds given value..move up highest value pointer..else move down..lowest value pointer) .it Will take O(n)
Total Time Complexity Will be O(n^2logn) +O(nlogn) +O(n)=O(n^2logn).
Space Complexity O(N^2).
Naive Algorithm Will O(N^4) Use Four Loop & keep Checking for all four elements.its so Naive ,easily understood-able..
Algorithm:
Little Efficient O(N^3).
Sort the input array. O(nlogn)
Then take 4 pointers p1,p2,p3,p4 where
0<=p1<=n-3 1st loop
p1+1<=p2<=n-2 1st inner loop
in loop set p3=p1+ & p4=n-1
( e.g. Take 3 Pointer Pointing to three successive indexes from starting & 4th Pointer pointing to n-1th position. )
Now use same algo we used to check for two elements in sorted array whose sum equals to given element or not 3rd Inner Loop
Most Efficient O(N^2)
Algorithm:
Create an array of all possible PAIR sums..that would be done in O(n^2)
Sort the Above array of size n^2 that can be done in.O(n^2log(n)) .
Then Search this array for two pairs..that sum to the required value..this can be done by maintaining two pointer One at the starting index..and other at the highest index..and moving them accordingly..(if sum of pair exceeds given value..move up highest value pointer..else move down..lowest value pointer) .it Will take O(n)
Total Time Complexity Will be O(n^2logn) +O(nlogn) +O(n)=O(n^2logn).
Space Complexity O(N^2).
Labels:Data
Amazon Interview
find the minimum number of platforms so that all the buses can be placed as per their schedule.
At a bus-station, you have time-table for buses arrival and departure. You need to find the minimum number of platforms so that all the buses can be placed as per their schedule.
Example
Now modify the array as put 1 where you see A and -1 where you see D.
So new array will be like this:
1 1 -1 1 1 -1 -1 -1
And finally make a cumulative array out of this:
1 2 1 2 3 2 1 0
Your solution will be the maximum value in this array. Here it is 3.
I think that code for this will not be complex so I am skipping that part.
The complexity of this solution depends on the complexity of sorting.
Also we don not need to create a cumulative array or an array with 1 and -1;
you just need a counter (cnt) initialized at '0'. Whenever, you find an 'A' in
arrival-departure array, increment cnt by 1. Compare it with maximum value (max);
if it is greater than max, make max equal to cnt. If you get a 'D' in arrival-departure
array, decrement cnt by 1. At the end, return 'max'.
Example
Bus Arrival Departure
BusA 0900 hrs 0930 hrs
BusB 0915 hrs 1300 hrs
BusC 1030 hrs 1100 hrs
BusD 1045 hrs 1145 hrs
Algorithm
Its simple dynamic programming question that calculate the
number of buses at station at any time(when a bus comes or
leaves). Maximum number in that pool will be nothing but
the maximum number of buses at the bus-station at any time
,which is same as max number of platforms required.
So first sort
all the arrival(A) and departure(D) time in an int array.
Please save the corresponding arrival or departure in the
array also.Either you can use a particular bit for this
purpose or make a structure. After sorting our array will
look like this:
0900 0915 1930 1030 1045 1100 1145 1300
A A D A A D D D
Now modify the array as put 1 where you see A and -1 where you see D.
So new array will be like this:
1 1 -1 1 1 -1 -1 -1
And finally make a cumulative array out of this:
1 2 1 2 3 2 1 0
Your solution will be the maximum value in this array. Here it is 3.
I think that code for this will not be complex so I am skipping that part.
The complexity of this solution depends on the complexity of sorting.
Also we don not need to create a cumulative array or an array with 1 and -1;
you just need a counter (cnt) initialized at '0'. Whenever, you find an 'A' in
arrival-departure array, increment cnt by 1. Compare it with maximum value (max);
if it is greater than max, make max equal to cnt. If you get a 'D' in arrival-departure
array, decrement cnt by 1. At the end, return 'max'.
Time Compelxity O(nlogn)
Space Complexity O(1)
Labels:Data
Adobe Question
,
Amazon Interview
Thursday, August 25, 2011
find at what time the maximum number of people will be in the party .
Found This Question on One of the Forum :D & posting here as thought it seems to be something interesting :) There is a list containing the checkin and checkout time of every person in a party . The checkin time is in ascending order while the checkout is random .
Eg:
Check_in Check_out
Person 1 8.00 9.00
Person 2 8.15 8.30
Person 3 8.30 9.20
and so on ...Now , give an optimized solution to find at what time the maximum number of people will be in the party .
Algorithm
It has the same Algorithm That We have been used to solve Bus-Station Problem :)
Algorithm
It has the same Algorithm That We have been used to solve Bus-Station Problem :)
Time Compelxity O(nlogn)
Space Complexity O(1)
Labels:Data
Adobe Question
Subscribe to:
Posts
(
Atom
)