Panini stickers follow-up

Written on 30 January 2014, 11:23pm

The previous post about Panini stickers worked through some mathematical formulas. Its two main conclusions were about the probability of getting a duplicate and the number of distinct stickers you can expect. That was the mathematical approach to the problem.
Below – the geeky one 🙂

1. Duplicates probability

In a Panini pack of 17 stickers (out of 192 possible stickers), there is about a 50% chance of getting at least one duplicate.

The geeky way:
– generate a random array of ‘n’ integers in the range [1,192]
– check whether the array contains at least one duplicate
– repeat this a number of times to get a reliable view.
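
How reliable the view is depends on how many times the experiment is repeated. As a rough guide, and assuming the runs are independent, a simulated percentage based on N runs has a standard error of about sqrt(p(1-p)/N). The helper below is a hypothetical sketch (the name simulation_standard_error is mine, and it needs PHP 7 or later for the type hints); it is not part of the scripts at the end of the post.

```php
<?php
// Approximate standard error, in percentage points, of a probability
// estimated from $repeat independent simulation runs.
// $p is the (estimated) true probability, between 0 and 1.
function simulation_standard_error(float $p, int $repeat): float {
    return 100 * sqrt($p * (1 - $p) / $repeat);
}

// Around p = 0.5, compare 100, 1000 and 10000 repetitions.
foreach ([100, 1000, 10000] as $repeat) {
    printf("%d runs - +/- %.2f points\n", $repeat, simulation_standard_error(0.5, $repeat));
}
```

With 1000 repetitions and a probability near 50%, the noise is roughly one and a half percentage points, so small wobbles between neighbouring rows of the results are to be expected.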

Results (PHP code at the end of the post):

Number of stickers - Probability of duplicate
10 - 20.47% 
11 - 25.8% 
12 - 31.2% 
13 - 37.13% 
14 - 40.6% 
15 - 45.47% 
16 - 47% 
17 - 53.4%
18 - 58.4% 
19 - 63.27% 
20 - 66.53% 
21 - 69.87% 
22 - 74.53% 
23 - 76.53% 
24 - 80.27% 
25 - 82.33% 
26 - 85.47% 
27 - 86.27% 
28 - 87.87% 
29 - 89.93% 
30 - 91.67% 
31 - 93.73% 
32 - 94.4% 
33 - 94.87% 
34 - 96.07% 
35 - 96.53% 
36 - 97.13% 
37 - 97.47% 
38 - 97.6% 
39 - 98% 
40 - 98.33% 
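
As a cross-check on the simulated numbers, the exact value can be computed with the classic birthday-problem product. This is a minimal sketch that assumes every sticker is equally likely and that the stickers in a pack are drawn independently; the function name exact_duplicate_probability is hypothetical and the type hints need PHP 7 or later.

```php
<?php
// Exact probability that $n stickers drawn from $d equally likely designs
// contain at least one duplicate: 1 - (d/d) * ((d-1)/d) * ... * ((d-n+1)/d).
function exact_duplicate_probability(int $n, int $d): float {
    if ($n > $d) {
        return 1.0; // more stickers than designs: a duplicate is certain
    }
    $allDistinct = 1.0;
    for ($i = 0; $i < $n; $i++) {
        // The (i+1)-th sticker must avoid the i designs already seen.
        $allDistinct *= ($d - $i) / $d;
    }
    return 1.0 - $allDistinct;
}

foreach ([16, 17, 18] as $n) {
    printf("%d - %.2f%%\n", $n, 100 * exact_duplicate_probability($n, 192));
}
```

Under these assumptions 16 stickers stay just below 50% and 17 is the first pack size above it, which agrees with the simulated rows within their noise.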

2. Distinct probability

Buying 250 stickers (out of 192 possible stickers) will probably leave you with about 139 distinct stickers.

The geeky way:
– generate n random integers in the range [1,192], with n>192
– count how many elements are unique (array_unique)
– repeat this a number of times to get a reliable view.

Results (PHP code at the end of the post):

Number of stickers bought - Distinct stickers (average, rounded)
200 - 125 
250 - 140 
300 - 152 
350 - 161 
400 - 168 
450 - 174 
500 - 178 
550 - 181 
600 - 183 
650 - 185 
700 - 187 
750 - 188 
800 - 189 
850 - 190 
900 - 190 
950 - 191 
1000 - 191 
1050 - 191 
1100 - 192 
1150 - 192 
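
The simulated averages can be compared with the standard expected-value formula d * (1 - ((d-1)/d)^n). The sketch below assumes equally likely, independent stickers; it may or may not be the exact form used in the previous post, and the function name expected_distinct is hypothetical (PHP 7 or later for the type hints).

```php
<?php
// Expected number of distinct designs after buying $n stickers, when each
// sticker is drawn independently and uniformly from $d designs.
// A given design is missed by all $n stickers with probability ((d-1)/d)^n.
function expected_distinct(int $n, int $d): float {
    return $d * (1.0 - pow(($d - 1) / $d, $n));
}

foreach ([200, 250, 500, 1150] as $n) {
    printf("%d - %.2f\n", $n, expected_distinct($n, 192));
}
```

For 250 stickers this gives a value just under 140, which fits both the "about 139" estimate and the simulated 140.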

PHP code:


//duplicates.php
include('functions.php');

$tick = microtime(true);
_e("Here we go...");

const REPEAT = 1000;
const TOTAL = 192;

for($k=10;$k<=40;$k++) {
	$dupes = 0;
	for($i=1;$i<=REPEAT;$i++) {
		$a = generate_random_array($k,TOTAL);
		if(array_has_dupes($a))
			$dupes++;
	}
	$percent = round(100 * $dupes / REPEAT,2);
	_e("$k - $percent%");
}
$tock = microtime(true);
$elapsed = round($tock - $tick,2);
_e("Done in $elapsed seconds!");

//distinct.php
include('functions.php');

set_time_limit(60);

$tick = microtime(true);
_e("Here we go...");

const REPEAT = 100;
const TOTAL = 192;

for($k=200;$k<=1150;$k+=50) {
	$c = 0;
	for($i=1;$i<=REPEAT;$i++) {
		$a = generate_random_array($k,TOTAL);
		$c += count(array_unique($a));
	}
	// Average number of distinct stickers over all runs (a count, not a percentage)
	$average = round($c/REPEAT);
	_e("$k - $average");
}
$tock = microtime(true);
$elapsed = round($tock - $tick,2);
_e("Done in $elapsed seconds!");

//functions.php
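// Returns true if the array contains at least one repeated value.
function array_has_dupes($array) {
	return count($array) !== count(array_unique($array));
}

// Builds an array of $n random integers, each drawn from [1, $d] with rand().
function generate_random_array($n, $d){
	$array = [];
	for($i=1;$i<=$n;$i++) {
		$array[] = rand(1,$d);
	}
	return $array;
}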

function _e($str) {
	echo "$str <br />";
}

Next I should probably plot these results…
Later edit: I did 🙂
