## Panini stickers follow up

Written on **30 January 2014, 11:23pm**

**Tagged with:** geek, maths, php, probabilities

The previous post about Panini stickers worked through some mathematical formulas. Its two main conclusions concerned the **duplicates probability** and the **distinct probability**. That was the mathematical approach to the problem.

Below – the geeky one 🙂

### 1. Duplicates probability

In a Panini pack of 17 stickers (out of 192 possible stickers), there is roughly a 50% chance of getting at least one duplicate.

The geeky way:

– generate a random array of ‘n’ integers in the range [1,192]

– check whether the array contains at least one duplicate

– repeat this many times to get a reliable estimate.

Results (PHP code at the end of the post):

```
Number of stickers - Probability of duplicate
10 - 20.47%
11 - 25.8%
12 - 31.2%
13 - 37.13%
14 - 40.6%
15 - 45.47%
16 - 47%
17 - 53.4%
18 - 58.4%
19 - 63.27%
20 - 66.53%
21 - 69.87%
22 - 74.53%
23 - 76.53%
24 - 80.27%
25 - 82.33%
26 - 85.47%
27 - 86.27%
28 - 87.87%
29 - 89.93%
30 - 91.67%
31 - 93.73%
32 - 94.4%
33 - 94.87%
34 - 96.07%
35 - 96.53%
36 - 97.13%
37 - 97.47%
38 - 97.6%
39 - 98%
40 - 98.33%
```
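
As a cross-check on the simulation, the same probability can be computed exactly with the usual birthday-problem product: the chance that all n stickers differ is (192/192) · (191/192) · … · ((192 − n + 1)/192), and the duplicate probability is one minus that. Below is a minimal sketch, assuming every sticker is equally likely and each one is drawn independently. The function name `exact_duplicate_probability` is made up for this illustration and is not part of the scripts in this post.

```php
<?php
// Hypothetical helper: exact probability that $n stickers drawn uniformly,
// with replacement, from $total possible stickers contain at least one duplicate.
function exact_duplicate_probability(int $n, int $total = 192): float
{
    $allDistinct = 1.0;
    for ($i = 0; $i < $n; $i++) {
        // The ($i + 1)-th sticker must avoid the $i stickers already drawn.
        $allDistinct *= ($total - $i) / $total;
    }
    return 1.0 - $allDistinct;
}

// Print the exact values around the 50% mark, in the same format as the table.
foreach ([16, 17, 18] as $n) {
    printf("%d - %.2f%%\n", $n, 100 * exact_duplicate_probability($n));
}
```

Under these assumptions, 17 is the first size at which the exact value passes 50% (it comes out just under 52%), which is in line with the simulated table.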

### 2. Distinct probability

Buying 250 stickers (out of 192 possible stickers) will probably leave you with about 139 distinct stickers.

The geeky way:

– generate n random numbers in the range [1,192], with n > 192

– count how many elements are unique (`array_unique`)

– repeat this many times to get a reliable estimate.

Results (PHP code at the end of the post):

```
Number of stickers - Distinct stickers
200 - 125
250 - 140
300 - 152
350 - 161
400 - 168
450 - 174
500 - 178
550 - 181
600 - 183
650 - 185
700 - 187
750 - 188
800 - 189
850 - 190
900 - 190
950 - 191
1000 - 191
1050 - 191
1100 - 192
1150 - 192
```
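
The simulated averages can be compared against a closed-form expectation. A given sticker is still missing after n purchases with probability (191/192)^n, so the expected number of distinct stickers is 192 · (1 − (191/192)^n). The sketch below assumes uniform, independent draws; `expected_distinct` is a hypothetical name used only for this illustration.

```php
<?php
// Hypothetical helper: expected number of distinct stickers after buying $n
// stickers, each drawn uniformly and independently from $total possibilities.
function expected_distinct(int $n, int $total = 192): float
{
    // Probability that one particular sticker never shows up in $n draws.
    $missing = (($total - 1) / $total) ** $n;
    // Linearity of expectation: sum the "seen" probability over all stickers.
    return $total * (1.0 - $missing);
}

// Print a few values in the same format as the table.
foreach ([250, 500, 1000] as $n) {
    printf("%d - %.2f\n", $n, expected_distinct($n));
}
```

For n = 250 this evaluates to roughly 139.96, in line with the figure quoted above and the simulated 140. With only 100 repetitions per row, the simulated averages can land a sticker away from the formula.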

PHP code:

```
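<?php
//duplicates.php
// Estimates, for each size $k, how often $k random stickers contain a duplicate.
require_once 'functions.php';

const REPEAT = 1000; // simulated packs per size
const TOTAL = 192;   // number of possible stickers

$tick = microtime(true);
_e("Here we go...");
for ($k = 10; $k <= 40; $k++) {
    $dupes = 0;
    for ($i = 1; $i <= REPEAT; $i++) {
        $a = generate_random_array($k, TOTAL);
        if (array_has_dupes($a)) {
            $dupes++;
        }
    }
    // Share of simulated packs with at least one duplicate, as a percentage.
    $percent = round(100 * $dupes / REPEAT, 2);
    _e("$k - $percent%");
}
$elapsed = round(microtime(true) - $tick, 2);
_e("Done in $elapsed seconds!");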
```

```
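<?php
//distinct.php
// Estimates the average number of distinct stickers among $k random ones.
require_once 'functions.php';
set_time_limit(60);

const REPEAT = 100; // simulated purchases per size
const TOTAL = 192;  // number of possible stickers

$tick = microtime(true);
_e("Here we go...");
for ($k = 200; $k <= 1150; $k += 50) {
    $c = 0; // running total of distinct counts
    for ($i = 1; $i <= REPEAT; $i++) {
        $a = generate_random_array($k, TOTAL);
        $c += count(array_unique($a));
    }
    // Mean distinct count over all runs, rounded to a whole sticker.
    $average = round($c / REPEAT);
    _e("$k - $average");
}
$elapsed = round(microtime(true) - $tick, 2);
_e("Done in $elapsed seconds!");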
```

```
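<?php
//functions.php

// True if at least one value appears more than once in $array.
function array_has_dupes($array) {
    return count($array) !== count(array_unique($array));
}

// Returns $n random integers drawn from [1, $d]; repeats are allowed.
function generate_random_array($n, $d) {
    $array = [];
    for ($i = 1; $i <= $n; $i++) {
        // mt_rand() instead of rand(): better generator on older PHP versions.
        $array[] = mt_rand(1, $d);
    }
    return $array;
}

// Echoes a line followed by an HTML line break.
function _e($str) {
    echo "$str <br />";
}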
```

Next I should probably plot these results…

**Later edit:** I did 🙂

*Written by*
**Dorin Moise**
