This coursework investigates the birthday paradox. It does so through a systematic analysis of data collected in a personal research project, and it evaluates whether that data conforms to the probability theory of shared birthdays. The theory concerns any group of people chosen at random and asks how likely it is that at least two of them have their birthday on the same day. By the pigeonhole principle, the probability of two people sharing a birthday reaches 100% once a group has 366 members, assuming a 365-day year that ignores 29 February. For smaller groups, the probabilities discussed here hold only under the idealised assumption that each day of the year is equally likely to be a birthday.

The research involved seven groups of students, each of whom was asked for their date of birth. The respondents were assured that the data they gave would be treated as private and confidential, so they felt comfortable revealing their birthdays. The first group in the senior category comprised forty-nine students of diverse ethnic and racial backgrounds, randomly drawn from Y10. The analysis of their birthdays gave a probability of 97% that at least two individuals in the group shared a birthday. The second group, fifty students from a different year group (Y11), also gave a probability of 97%. The third group, forty-two students from Y12, gave a probability of 80%. The fourth and fifth groups, with twenty-one and fifty-four participants respectively, yielded 50% and 97%. Finally, the sixth and seventh groups, with 52 and 70 students, gave probabilities of 98% and 99% respectively. These results formed the basis of the theoretical analysis of the birthday paradox (Shirky, Clay 2008).

Analysis of the Data

The analysis of the data involved computing the approximate probability that at least two individuals in a group of N people share a birthday. The entire analysis assumes that each day of the year has an equal chance of being a birthday. This is only an approximation, since real-life birthdays are not uniformly distributed. Consider a group of 20 persons, and let P(A) be the probability that at least two of them share a birthday. The probability P(A’) that no two members of the group share a birthday is then 1 - P(A). P(A’) can be described as a sequence of 20 events, one per person, and the chance of all of them occurring equals the product of the individual probabilities of each event. For the group of twenty people, it is calculated in the following manner:

P(1) * P(2) * P(3) * P(4) * … * P(20).

Each event corresponds to one member of the group not having the same birthday as any person in the group who has been previously analyzed. For event number one, P(1), nobody falls into the category of “previously analyzed”. The probability of person number one not sharing a birthday with a previously analyzed member is therefore 100%, that is, 365/365 (Z. E. Schnabel 1938).

Person number two is compared with the one person previously analyzed. The probability that the birthday of person number two falls on a different date from that of person number one is 364/365, because one of the 365 days has already been taken by person number one. The same reasoning applies to person number three, for whom two days are already occupied by persons one and two, giving 363/365. Continuing up to person number twenty, for whom nineteen days are already occupied, the probability of not sharing a birthday with any individual previously analyzed is 346/365. P(A’) equals the product of these individual probabilities up to that of person number twenty:

P(A’) = 365/365 * 364/365 * 363/365 * … * 346/365.

This gives P(A’) ≈ 0.589, so P(A) = 1 - 0.589 = 0.411, or 41.1% for the group of twenty individuals. Continuing the computation up to person number twenty-three gives a probability of 50.7%, and for a group of fifty individuals it gives about 97% (Shirky, Clay 2008).
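The product above can be checked numerically. The following is a minimal Python sketch, assuming a 365-day year in which every day is equally likely to be a birthday; the function name `prob_shared_birthday` is illustrative and does not come from the cited sources.

```python
def prob_shared_birthday(n, days=365):
    """Probability that at least two of n people share a birthday,
    assuming `days` equally likely birthdays (an idealisation)."""
    p_all_different = 1.0
    # Person k + 1 must avoid the k birthdays already taken.
    for k in range(n):
        p_all_different *= (days - k) / days
    return 1.0 - p_all_different


if __name__ == "__main__":
    for group_size in (20, 23, 50):
        print(group_size, round(prob_shared_birthday(group_size), 3))
```

For group sizes of 20, 23 and 50 this yields roughly 0.411, 0.507 and 0.970, in line with the figures quoted above.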

The analysis can be made easier by generalizing the calculation to a group of N people, where P(N) is the probability that at least two individuals in the group share a birthday. P(N) is rarely computed directly. It is simpler to first calculate P(N’), the probability that all N birthdays are different. According to the pigeonhole principle, P(N’) becomes zero, and P(N) therefore becomes 100%, whenever N rises above 365. For smaller groups, P(N’) can be written compactly with the factorial operator, reflecting the idea that all birthdays are different only if the second individual avoids the birthday of the first, the third avoids the birthdays of the first two, and so on. The event that at least two people in the group of N share a birthday is complementary to the event that all members have different birthdays. Accordingly, P(N) = 1 - P(N’).
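Written out, the generalization takes the following form. This is the standard textbook expression under the assumption of 365 equally likely birthdays; the index k is introduced here only for illustration.

```latex
P(N') = \prod_{k=0}^{N-1} \frac{365 - k}{365}
      = \frac{365!}{365^{N}\,(365 - N)!},
\qquad
P(N) = 1 - P(N'), \qquad 1 \le N \le 365 .
```

For N > 365 the factor with k = 365 is zero, so P(N’) = 0 and P(N) = 1.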

Take the Y10 group of the senior category as an example, where the first student has her birthday on August 30, 1997. Nobody has been “previously analyzed” before her, so her probability of not sharing a birthday is 365/365. For the second student, one of the 365 days is already taken, so the probability that her birthday differs is 364/365, and the running product becomes 365/365 * 364/365. Each subsequent calculation takes a larger number of “previously analyzed” individuals into account. The trend continues up to the forty-ninth student, for whom forty-eight days are already occupied, giving the product 365/365 * 364/365 * 363/365 * … * 317/365. This product, the probability that no two students share a birthday, gets steadily smaller as more of the 365 days of the year are occupied by “previously analyzed” individuals. Subtracting it from 1 gives the probability of a shared birthday, which is roughly 97% for this group (Z. E. Schnabel 1938).

For group F1 of the junior category, which comprised sixty students, the analysis is the same, only a little more involved. For the second individual in the group, the running product is 365/365 * 364/365. This is still very close to 1, since almost every day of the year remains free. The product then falls with each additional student, so that after the fourth person it is 365/365 * 364/365 * 363/365 * 362/365. The probability that nobody in the whole group shares a birthday is the product of all sixty factors, ending in 306/365, and the probability of at least one shared birthday is 1 minus that product (Shirky, Clay 2008).

Evaluation of the Analysis

The probability values obtained from the experimental data agree closely with the theoretical values. For instance, theory gives a probability of about 50% for a group of 23 individuals. Our fourth group was slightly smaller, with 21 students, but the closeness of the two values suggests that our calculations are accurate. Furthermore, our calculation for group number two, which had fifty students, gave a result of 96%, only a slight deviation from the theoretical value of 97%. In addition, the analysis of group number seven, which was made up of seventy students, gave a value of 99%. Even though the calculations did not match the theory to the decimal place, the results remain within acceptable limits (Z. E. Schnabel 1938).
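One way to cross-check the theoretical values is a simple simulation. The Python sketch below is an illustration only: it assumes 365 equally likely birthdays, and the function name `simulate_shared_birthday`, the trial count and the seed are hypothetical choices, not part of the original study.

```python
import random


def simulate_shared_birthday(group_size, trials=20000, days=365, seed=0):
    """Estimate the chance that a random group contains a shared birthday.

    Birthdays are drawn uniformly from `days` days (an idealisation).
    """
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    hits = 0
    for _ in range(trials):
        birthdays = [rng.randrange(days) for _ in range(group_size)]
        # A duplicate exists if the set is smaller than the list.
        if len(set(birthdays)) < group_size:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    for size in (21, 50, 70):
        print(size, simulate_shared_birthday(size))
```

The estimates should land near the theoretical probabilities for each group size, up to sampling noise that shrinks as the number of trials grows.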

Conclusion

The birthday problem remains an excellent illustration of how to determine the probability that two individuals within a group share a birthday. Students around the world use it to predict the chances of shared birthdays. It has also attracted the attention of several mathematicians, who have developed it considerably. Further mathematical research could explore how this principle can be applied in the industrial or economic sectors (D. E. Knuth 1973).