Friday, February 3, 2012

We are interested to find out the number of Distinct customer visited at-least p distinct pages within exactly T days.

Suppose, Amazon have a Logging system Like This:
They Track all logs daily basis, stored in separate log file.Log contains a collection of tuples of Customer ID and Page ID. The length of Customer ID and Page ID is L1 and L2. Suppose We have a log of D-days , and size of each log fine can be Order of n , where n be the number of customer.
In a most generalized situation, you can assume that a customer can visit the same page multiple times in a day or any number of days.
We are interested to find out the number of Distinct customer visited at-least p distinct pages within exactly T days.
Propose a Data Structure to be use to solve this problem efficiently . Design an Algorithm to solve this problem and Find out the complexity of the algorithm.

No comments :