# Another approach to ships with gold and silver

Find this notebook on the web at
<a class="quarto-xref" href="https://resampling-stats.github.io/edition-3-python/more_sampling_tools.html#nte-gold_silver_booleans">Note <span>10.3</span></a>.

This notebook is a variation on the problem with gold and silver chests
in ships. It shows how we can count and tally the results at the end,
rather than in the trial itself.

Notice that the first part of the code is identical to the first
approach to this problem. There are two key differences — see the
comments for an explanation.

In [None]:
import numpy as np
rnd = np.random.default_rng()

In [None]:
# The 3 buckets, each representing two chests on a ship.
# As before.
bucket1 = ['Gold', 'Gold']  # Chests in first ship.
bucket2 = ['Gold',  'Silver']  # Chests in second ship.
bucket3 = ['Silver', 'Silver']  # Chests in third ship.

In [None]:
# Here is where the difference starts.  We are now going to fill in
# the result for the first chest _and_ the result for the second chest.
#
# Later we will fill in all these values, so the string we put here
# does not matter.

# Whether the first chest was Gold or Silver.
first_chests = np.repeat(['To be announced'], 10000)
# Whether the second chest was Gold or Silver.
second_chests = np.repeat(['To be announced'], 10000)

for i in range(10000):
    # Select a ship at random from the three ships.
    # As before.
    ship_no = rnd.choice([1, 2, 3])
    # Get the chests from this ship.
    # As before.
    if ship_no == 1:
        bucket = bucket1
    if ship_no == 2:
        bucket = bucket2
    if ship_no == 3:
        bucket = bucket3

    # As before.
    shuffled = rnd.permuted(bucket)

    # Here is the big difference - we store the result for the first and second
    # chests.
    first_chests[i] = shuffled[0]
    second_chests[i] = shuffled[1]

# End loop, go back to beginning.

# We will do the calculation we need in the next cell.  For now
# just display the first 10 values.
ten_first_chests = first_chests[:10]
print('The first 10 values of "first_chests":', ten_first_chests)

In [None]:
ten_second_chests = second_chests[:10]
print('The first 10 values of "second_chests":', ten_second_chests)

In this variant, we recorded the type of the first chest for each trial
(“Gold” or “Silver”), and the type of the second chest
(“Gold” or “Silver”).

**We would like to count the number of times there was “Gold” in the
first chest *and* “Gold” in the second.**

## 10.6 Combining Boolean arrays

We can do the count we need by *combining* the Boolean arrays with the
`&` operator. `&` combines Boolean arrays with a *logical and*. *Logical
and* is a rule for combining two Boolean values, where the rule is: the
result is `True` only if the first value is `True` *and* the second value is
`True`.

Here we use the `&` *operator* to combine some Boolean values on the
left and right of the operator:

In [None]:
True & True   # Both are True, so result is True

In [None]:
True & False   # At least one of the values is False, so result is False

In [None]:
False & True   # At least one of the values is False, so result is False

In [None]:
False & False   # At least one (in fact both) are False, result is False.

<div __quarto_custom="true" __quarto_custom_context="Block" __quarto_custom_id="69" __quarto_custom_type="Callout">
<div __quarto_custom_scaffold="true">

`&` and `and` in Python

</div>
<div __quarto_custom_scaffold="true">

In fact Python has another operation to apply this *logical and*
operation to values — the `and` operator:</div></div>

In [None]:
print(True and True)

In [None]:
print(True and False)

In [None]:
print(False and True)

In [None]:
print(False and False)

You will see this `and` operator often in Python code, but it does not
combine NumPy *arrays* element by element; with arrays of more than one
element it raises an error. We will therefore use the similar `&`
operator, which does work on arrays.
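
To make the difference concrete, here is a minimal sketch using two small made-up Boolean arrays (`a` and `b` are illustrative names, not part of the simulation). It assumes only that NumPy is installed. `&` gives one answer per element, whereas `and` raises a `ValueError` for arrays with more than one element.

```python
import numpy as np

# Two small made-up Boolean arrays, for illustration only.
a = np.array([True, True, False, False])
b = np.array([True, False, True, False])

# `&` works elementwise on arrays.
and_result = a & b
print(and_result)

# `and` needs a single True or False value from each side, so it
# fails for arrays with more than one element.
try:
    a and b
    and_failed = False
except ValueError as err:
    and_failed = True
    print('ValueError:', err)
```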




Above you saw that the `==` operator (as in `== 'Gold'`), when applied
to arrays, asks the question of every element in the array.

First make the Boolean arrays.

In [None]:
ten_first_gold = (ten_first_chests == 'Gold')
print("Ten first == 'Gold'", ten_first_gold)

In [None]:
ten_second_gold = (ten_second_chests == 'Gold')
print("Ten second == 'Gold'", ten_second_gold)

Now let us use `&` to combine Boolean arrays:

In [None]:
ten_both = (ten_first_gold & ten_second_gold)
ten_both

Notice that Python does the comparison *elementwise* — element by
element.

You saw that when we did `second_chests == 'Gold'` this had the effect
of asking the `== 'Gold'` question of *each element*, so there will be
one answer per element in `second_chests`. In that case there was an
array to the *left* of `==` and a single value to the *right*. We were
comparing an array to a value.

Here we are asking the `&` question of `ten_first_gold` and
`ten_second_gold`. Here there is an array to the *left* and an array to
the *right*. We are asking the `&` question 10 times, and the first
question we are asking is:

In [None]:
# First question, giving first element of result.
(ten_first_gold[0] & ten_second_gold[0])

The second question is:

In [None]:
# Second question, giving second element of result.
(ten_first_gold[1] & ten_second_gold[1])

and so on. We have 10 elements on *each side*, and 10 answers, giving
an array (`ten_both`) of 10 elements. Each element in `ten_both` is the
answer to the `&` question for the elements at the corresponding
positions in `ten_first_gold` and `ten_second_gold`.
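
As a rough sketch of what "one question per position" means, the cell below spells out the elementwise `&` as an explicit loop and checks that it matches the array version. The arrays `example_first_gold` and `example_second_gold` are made-up stand-ins for `ten_first_gold` and `ten_second_gold`, so that the cell can run on its own.

```python
import numpy as np

# Made-up Boolean arrays standing in for `ten_first_gold` and
# `ten_second_gold`; the values are for illustration only.
example_first_gold = np.array([True, False, True, True, False])
example_second_gold = np.array([True, True, False, True, False])

# Ask the `&` question once per position, storing each answer.
n = len(example_first_gold)
looped_both = np.zeros(n, dtype=bool)
for i in range(n):
    looped_both[i] = example_first_gold[i] & example_second_gold[i]

# The array version asks all the questions in one go.
array_both = example_first_gold & example_second_gold

print(looped_both)
print(array_both)
# Check that the two approaches give the same answers.
print(np.all(looped_both == array_both))
```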

We could also create the Boolean arrays and do the `&` operation all in
one step, like this:

In [None]:
ten_both = (ten_first_chests == 'Gold') & (ten_second_chests == 'Gold')
ten_both

<div __quarto_custom="true" __quarto_custom_context="Block" __quarto_custom_id="70" __quarto_custom_type="Callout">
<div __quarto_custom_scaffold="true">

Parentheses, arrays and comparisons

</div>
<div __quarto_custom_scaffold="true">

Again you will notice the round brackets (parentheses) around
`(ten_first_chests == 'Gold')` and `(ten_second_chests == 'Gold')`.
Above, you saw us recommend you always use parentheses around Boolean
expressions like this. The parentheses make the code easier to read,
but be careful: in this case, we actually *need* the parentheses to
make Python do what we want; see the footnote for more detail.[^1]

</div>
</div>
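
The need for the parentheses can be seen in a small sketch with made-up arrays (`example_first` and `example_second` are illustrative names). The exact wording of the error message can differ between NumPy versions, so the code only catches the error and prints whatever message it carries.

```python
import numpy as np

# Small made-up arrays, for illustration only.
example_first = np.array(['Gold', 'Silver', 'Gold'])
example_second = np.array(['Gold', 'Gold', 'Silver'])

# With parentheses: compare first, then combine with `&`.
with_parens = (example_first == 'Gold') & (example_second == 'Gold')
print(with_parens)

# Without parentheses, Python tries `'Gold' & example_second` first,
# which is not a meaningful operation, so we get an error.
try:
    example_first == 'Gold' & example_second == 'Gold'
    got_error = False
except TypeError as err:
    got_error = True
    print('TypeError:', err)
```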

Remember, we wanted the answer to the question: how many trials had
“Gold” in the first chest *and* “Gold” in the second. We can answer that
question for the first 10 trials with `np.sum`:

In [None]:
n_ten_both = np.sum(ten_both)
n_ten_both
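
`np.sum` acts as a count here because, when summing, NumPy treats each `True` as 1 and each `False` as 0. Here is a minimal sketch with a made-up Boolean array (`example_both` is an illustrative name); `np.count_nonzero` is shown as an alternative that gives the same count.

```python
import numpy as np

# A made-up Boolean array, for illustration only.
example_both = np.array([True, False, False, True, True])

# Summing counts the True values, because True counts as 1.
n_true = np.sum(example_both)
print(n_true)

# An alternative that gives the same count.
n_nonzero = np.count_nonzero(example_both)
print(n_nonzero)
```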

We can answer the same question for *all* the trials, in the same way:

In [None]:
first_gold = (first_chests == 'Gold')
second_gold = (second_chests == 'Gold')
n_both_gold = np.sum(first_gold & second_gold)
n_both_gold

We could also do the same calculation all in one line:

In [None]:
# Notice the parentheses - we need these - see above.
n_both_gold = np.sum((first_chests == 'Gold') & (second_chests == 'Gold'))
n_both_gold

We can then count all the ships where the first chest was gold:

In [None]:
n_first_gold = np.sum(first_chests == 'Gold')
n_first_gold

The final calculation is the proportion of second chests that are gold,
given the first chest was also gold:

In [None]:
p_g_given_g = n_both_gold / n_first_gold
p_g_given_g

Of course we won’t get exactly the same results from the two
simulations, in the same way that we won’t get exactly the same results
from any two runs of the same simulation, because of the random values
we are using. But the logic for the two simulations is the same, and we
are doing many trials (10,000), so the results will be very similar.
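
As an optional cross-check on the simulation (not part of the original notebook), we can list every equally likely outcome and count directly. This sketch assumes, as the simulation does, that each ship is equally likely to be chosen and that either chest on the chosen ship is equally likely to be opened first. The names `outcomes` and `p_exact` are illustrative.

```python
from fractions import Fraction

# The three ships, as in the simulation.
ships = [['Gold', 'Gold'], ['Gold', 'Silver'], ['Silver', 'Silver']]

# For each ship, either chest can come first, giving 6 equally
# likely (first chest, second chest) outcomes.
outcomes = []
for ship in ships:
    outcomes.append((ship[0], ship[1]))
    outcomes.append((ship[1], ship[0]))

# Count outcomes with Gold first, and with Gold in both positions.
n_first_gold_exact = sum(first == 'Gold' for first, second in outcomes)
n_both_gold_exact = sum(
    first == 'Gold' and second == 'Gold' for first, second in outcomes)

# Proportion of Gold second chests, given a Gold first chest.
p_exact = Fraction(n_both_gold_exact, n_first_gold_exact)
print(p_exact)
```

The simulated `p_g_given_g` should land close to this value, though not exactly on it.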

[^1]: We warned that we need parentheses around our `&` expressions to
    get the result we want. We would add the parentheses in any case, as
    good practice, but here we also *need* the parentheses in
    `(ten_first_chests == 'Gold') & (ten_second_chests == 'Gold')`.
    Remember *operator precedence*; for example, the multiply operator
    `*` has *higher precedence* than the operator `+`, so `3 + 5 * 2` is
    equal to `3 + (5 * 2)` = 13. If we want to do addition before
    multiplication, we use parentheses to tell Python the order it
    should use: `(3 + 5) * 2` = 16.

    The same applies for the two operators `==` and `&` here. In fact
    `&` has a higher precedence than `==`. This means that, if we write
    the expression without parentheses, as
    `ten_first_chests == 'Gold' & ten_second_chests == 'Gold'`, then, because
    of operator precedence, Python takes this to mean
    `ten_first_chests == ('Gold' & ten_second_chests) == 'Gold'`. Python
    does not know what to do with `'Gold' & ten_second_chests` and
    generates an error of form
    `'bitwise_and' not supported for the input types`. The error tells
    you that Python does not know how to apply `&` (`'bitwise_and'`) to
    the string `'Gold'` and the array `ten_second_chests`.

    This is the same error you would get for running the code
    `'Gold' & ten_second_chests` on its own.

    The point to take away is that, when you are using `&` to combine
    Boolean arrays in Python, remember operator precedence, and, when in
    doubt, put parentheses around the expressions on either side of `&`,
    as here.