Is it used to merge two HLLCollector objects?
For example, say we have one dimension, Location, and an HLL metric specified on person id at ingestion time:
Time bucket t1, Location L1, (hll) <- say 2 people, A and B, were there
Time bucket t2, Location L1, (hll) <- say 2 people, A and C, were there
If I ask for unique people across time buckets t1 and t2 where Location = L1, I should get 3 (A, B and C), i.e. some sort of merge of the two HLL objects will be done. Correct?
So I was wondering whether the fold method is used for that purpose.
I figured it out. Here is some sample code:
// JDK imports; the import for HyperLogLogCollector depends on your Druid version.
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;

String id1 = "id-1";
String id2 = "id-2";
MessageDigest md5 = MessageDigest.getInstance("MD5");

// collector1 sees two distinct ids
HyperLogLogCollector collector1 = HyperLogLogCollector.makeLatestCollector();
collector1.add(md5.digest(id1.getBytes(StandardCharsets.UTF_8)));
collector1.add(md5.digest(id2.getBytes(StandardCharsets.UTF_8)));
System.out.println("collector 1 cardinality : " + collector1.estimateCardinality()); // roughly 2 (it is an estimate)

// collector2 sees only id1
HyperLogLogCollector collector2 = HyperLogLogCollector.makeLatestCollector();
collector2.add(md5.digest(id1.getBytes(StandardCharsets.UTF_8)));
System.out.println("collector 2 cardinality : " + collector2.estimateCardinality()); // roughly 1

// Now merge the two HLL structures using the fold method.
collector2.fold(collector1.toByteBuffer());
System.out.println("After merge collector 2 cardinality : " + collector2.estimateCardinality()); // roughly 2, since id1 is counted once
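
To see why a merge like this gives the right answer for the A,B / A,C case from the question, here is a rough toy sketch. It is not Druid's HyperLogLogCollector and makes no claim about how Druid stores its registers; ToyHll, its register count and its estimate formula are assumptions made up for illustration. It only shows the general HyperLogLog idea that merging two sketches means keeping the larger value in each register, which gives exactly the sketch you would have built from the union of the inputs.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Toy HyperLogLog for illustration only; NOT Druid's HyperLogLogCollector.
final class ToyHll {
    private static final int P = 10;        // number of index bits (assumed)
    private static final int M = 1 << P;    // 1024 registers
    private final byte[] registers = new byte[M];

    // Hash the id with MD5 and use the first 8 bytes as a 64-bit value.
    private static long hash64(String id) {
        try {
            byte[] d = MessageDigest.getInstance("MD5")
                    .digest(id.getBytes(StandardCharsets.UTF_8));
            long h = 0;
            for (int i = 0; i < 8; i++) {
                h = (h << 8) | (d[i] & 0xFF);
            }
            return h;
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    void add(String id) {
        long h = hash64(id);
        int idx = (int) (h >>> (64 - P));   // top P bits pick the register
        long rest = h << P;                 // remaining bits, left-aligned
        int rank = Math.min(Long.numberOfLeadingZeros(rest), 64 - P) + 1;
        if (rank > registers[idx]) {
            registers[idx] = (byte) rank;
        }
    }

    // Merge: keep the larger value in every register.
    void fold(ToyHll other) {
        for (int i = 0; i < M; i++) {
            if (other.registers[i] > registers[i]) {
                registers[i] = other.registers[i];
            }
        }
    }

    double estimate() {
        double sum = 0;
        int zeros = 0;
        for (byte r : registers) {
            sum += Math.pow(2.0, -r);
            if (r == 0) {
                zeros++;
            }
        }
        double alpha = 0.7213 / (1.0 + 1.079 / M);
        double raw = alpha * M * M / sum;
        // Small-range correction (linear counting) while many registers are empty.
        if (raw <= 2.5 * M && zeros > 0) {
            return M * Math.log((double) M / zeros);
        }
        return raw;
    }

    boolean sameRegisters(ToyHll other) {
        return Arrays.equals(registers, other.registers);
    }

    public static void main(String[] args) {
        ToyHll t1 = new ToyHll();           // time bucket t1: A, B
        t1.add("A");
        t1.add("B");
        ToyHll t2 = new ToyHll();           // time bucket t2: A, C
        t2.add("A");
        t2.add("C");
        t1.fold(t2);                        // merge t2 into t1

        ToyHll direct = new ToyHll();       // sketch built from A, B, C directly
        direct.add("A");
        direct.add("B");
        direct.add("C");

        System.out.println("merged estimate : " + t1.estimate());
        System.out.println("same as direct sketch : " + t1.sameRegisters(direct)); // true
    }
}
```

The merged sketch is register-for-register identical to one built from A, B and C directly, so person A showing up in both time buckets is not counted twice.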