As we have a huge amount of data, the obvious approach is to use a bit array (bit set), where the ith bit records whether the number i has already appeared, using a bit array of size 13 * 2^20.
Before solving the question, I suggest reviewing the basics of bit operations first:
#include <cstdio>
#include <iostream>
using namespace std;
int main()
{
    int i = 0;
    int n = 18;
    n |= 1 << 3;                  // set the ith bit (here i = 3)
    printf(" %d ", n);
    n &= ~(1 << 3);               // clear the ith bit (i = 3)
    printf(" %d ", n);
    n ^= 1 << 1;                  // toggle the ith bit (i = 1)
    printf(" %d ", n);
    i = n & (1 << 4) ? 1 : 0;     // check whether the ith bit (i = 4) is set
    printf(" %d ", i);
    return 0;
}
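With n = 18, this should print 26, 18, 16 and 1: bit 3 is set and then cleared, bit 1 is toggled from 1 to 0, and bit 4 of the result (16) is found to be set.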
Data Structure: Bit Array
Algorithm:
1. Read the numbers from the file one by one and set the ith bit of the bit array for each number i.
2. If the bit corresponding to a number is already set, that number is a duplicate and we are done.
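Note on memory (my own estimate, assuming one bit per possible value): a bit array covering 13 * 2^20 values takes 13 * 2^20 / 8 bytes, i.e. roughly 1.7 MB, which easily fits in memory even though the input file itself is huge.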
Since we are using an array of bytes to represent the set of bits, i.e., a bit array or bitset, the index of the byte associated with a bit n can be calculated using division:
n / CHAR_BIT
where n is the index of the given bit and CHAR_BIT (from <limits.h>) is the number of bits in a C char, normally 8.
This gives the position of the byte within the array.
The index of the bit within that byte is calculated with a modulo operation:
n % CHAR_BIT
This gives the position of the bit inside the byte.
So we can write the following method, which sets the bit for a number and reports whether it was already set (i.e., whether the number is a duplicate):
bool setBit(unsigned char a[], int i)
{
    int index = i / CHAR_BIT;       // position of the byte in the array
    int offset = i % CHAR_BIT;      // position of the bit inside that byte
    if (a[index] & (1 << offset))   // bit already set: i is a duplicate
        return false;
    a[index] |= 1 << offset;       // mark i as seen
    return true;
}
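A minimal driver using setBit could look like the sketch below. The file name numbers.txt, the MAX_VALUE bound and the calloc-based allocation are my own assumptions for illustration, not part of the original solution.

#include <cstdio>
#include <cstdlib>
#include <climits>

bool setBit(unsigned char a[], int i);   // defined above

int main()
{
    const int MAX_VALUE = 13 * (1 << 20);   // assumed upper bound on the input numbers
    // one bit per possible value, zero-initialized
    unsigned char *bits = (unsigned char *)calloc(MAX_VALUE / CHAR_BIT + 1, 1);

    FILE *fp = fopen("numbers.txt", "r");   // hypothetical input file of integers
    int x;
    while (fp && fscanf(fp, "%d", &x) == 1)
    {
        if (!setBit(bits, x))               // bit was already set => duplicate found
            printf("duplicate: %d\n", x);
    }
    if (fp)
        fclose(fp);
    free(bits);
    return 0;
}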
Time Complexity: O(N), where N is the number of values in the file.
Feel free to comment or optimize the solution.