SrGrace’s 10 Days of Statistics Challenges-wiki

Collection of solutions for the 10 Days of Statistics challenges at HackerRank.

Contents

Day 0


####Day 0: Mean, Median, and Mode
Task:
Given an array, X, of N integers, calculate and print the respective mean, median, and mode on separate lines. If your array contains more than one modal value, choose the numerically smallest one.

Solution
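
As a minimal sketch (not necessarily the linked solution), the three statistics can be computed with the standard library. The function name `mean_median_mode` and the sample data are illustrative assumptions.

```python
from collections import Counter
from statistics import mean, median


def mean_median_mode(values):
    """Return (mean, median, smallest mode) for a non-empty list of numbers."""
    counts = Counter(values)
    highest = max(counts.values())
    # Several values may share the highest count; the task asks for the smallest.
    mode = min(v for v, c in counts.items() if c == highest)
    return mean(values), median(values), mode


# Illustrative data, not taken from the challenge.
print(mean_median_mode([3, 1, 2, 2, 3, 5]))
```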

####Day 0: Weighted Mean
Task:
Given an array, X, of N integers and an array, W, representing the respective weights of X’s elements, calculate and print the weighted mean of X’s elements. Your answer should be rounded to a scale of 1 decimal place.

Solution
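
A possible sketch of the weighted mean, sum(Xi * Wi) / sum(Wi). The helper name `weighted_mean` and the sample arrays are assumptions for illustration.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(x * w) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(x * w for x, w in zip(values, weights)) / sum(weights)


# Illustrative data: (10*1 + 40*2 + 30*3) / (1 + 2 + 3) = 180 / 6
print(round(weighted_mean([10, 40, 30], [1, 2, 3]), 1))  # 30.0
```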

Day 1


####Day 1: Quartiles
Task:
Given an array, X, of N integers, calculate the respective first quartile (Q1), second quartile (Q2), and third quartile (Q3). It is guaranteed that Q1, Q2, and Q3 are integers.

Solution
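
One way to sketch this, assuming the "median of halves" convention (the overall median is left out of both halves when N is odd) and at least two elements. The name `quartiles` and the sample data are illustrative.

```python
from statistics import median


def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves convention."""
    data = sorted(values)
    half = len(data) // 2
    lower = data[:half]    # elements below the median position
    upper = data[-half:]   # elements above the median position
    return median(lower), median(data), median(upper)


# Illustrative data: sorted it is 3 5 7 8 | 12 | 13 14 18 21
print(quartiles([3, 7, 8, 5, 12, 14, 21, 13, 18]))
```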

####Day 1: Interquartile Range
Task:
The interquartile range of an array is the difference between its first (Q1) and third (Q3) quartiles (i.e., Q3-Q1).
Given an array, X, of N integers and an array, F, representing the respective frequencies of X’s elements, construct a data set, S, where each Xi occurs at frequency Fi. Then calculate and print S’s interquartile range, rounded to a scale of 1 decimal place (i.e., 12.3 format).

Solution
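
A rough sketch of the two steps: expand the values by their frequencies, then take Q3 - Q1. It assumes the same median-of-halves quartile convention as the Quartiles challenge; `interquartile_range` is an illustrative name.

```python
from statistics import median


def interquartile_range(values, freqs):
    """Expand (value, frequency) pairs into a data set and return Q3 - Q1."""
    data = sorted(x for x, f in zip(values, freqs) for _ in range(f))
    half = len(data) // 2
    q1 = median(data[:half])
    q3 = median(data[-half:])
    return float(q3 - q1)


# Illustrative data: expands to 6 6 8 | 8 8 12, so Q1 = 6 and Q3 = 8
print(round(interquartile_range([6, 12, 8], [2, 1, 3]), 1))  # 2.0
```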

####Day 1: Standard Deviation
Task:
Given an array, X, of N integers, calculate and print the standard deviation. Your answer should be in decimal form, rounded to a scale of 1 decimal place (i.e., 12.3 format). An error margin of ±0.1 will be tolerated for the standard deviation.

Solution
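
A short sketch, assuming the population form of the standard deviation (divide by N, not N - 1). The name `population_std` and the data are illustrative.

```python
from math import sqrt


def population_std(values):
    """Population standard deviation: sqrt(sum((x - mean)^2) / N)."""
    n = len(values)
    mu = sum(values) / n
    return sqrt(sum((x - mu) ** 2 for x in values) / n)


# Illustrative data: mean 5, squared deviations sum to 32, 32 / 8 = 4
print(round(population_std([2, 4, 4, 4, 5, 5, 7, 9]), 1))  # 2.0
```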

Day 2


####Day 2: Basic Probability
Task:
In a single toss of 2 fair (evenly-weighted) six-sided dice, find the probability that their sum will be at most 9.

Ans: 30/36 = 5/6

####Day 2: More Dice
Task:
In a single toss of 2 fair (evenly-weighted) six-sided dice, find the probability that the values rolled by each die will be different and the two dice have a sum of 6.

Ans: 4/36 = 1/9
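
Both dice answers above can be checked by enumerating the 36 equally likely outcomes. This is only a verification sketch; `dice_probability` is an illustrative helper name.

```python
from fractions import Fraction
from itertools import product


def dice_probability(event):
    """Probability that event(a, b) holds for two fair six-sided dice."""
    outcomes = list(product(range(1, 7), repeat=2))
    favourable = sum(1 for a, b in outcomes if event(a, b))
    return Fraction(favourable, len(outcomes))


print(dice_probability(lambda a, b: a + b <= 9))             # 5/6
print(dice_probability(lambda a, b: a != b and a + b == 6))  # 1/9
```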

####Day 2: Compound Event Probability
Task:
There are 3 urns labeled X, Y, and Z.

Urn X contains 4 red balls and 3 black balls.
Urn Y contains 5 red balls and 4 black balls.
Urn Z contains 4 red balls and 4 black balls.

One ball is drawn from each of the 3 urns. What is the probability that, of the 3 balls drawn, 2 are red and 1 is black?

Ans:

Urn X has a 4/7 probability of giving a red ball. 
Urn Y has a 5/9 probability of giving a red ball. 
Urn Z has a 1/2 probability of giving a red ball. 

Urn X has a 3/7 probability of giving a black ball.
Urn Y has a 4/9 probability of giving a black ball. 
Urn Z has a 1/2 probability of giving a black ball. 

=>P(2 red, 1 black) 
= P(Red Red Black) + P(Red Black Red) + P(Black Red Red) 
= (4/7)(5/9)(1/2) + (4/7)(4/9)(1/2) + (3/7)(5/9)(1/2)
= 20/126 + 16/126 + 15/126 
= 51/126 
= 17/42  
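
The sum above can be double-checked with exact fractions. A small sketch that uses the per-urn red probabilities from the answer; `P_RED` and `prob_exactly_k_red` are illustrative names.

```python
from fractions import Fraction
from itertools import product

# Probability of drawing red from each urn, as stated in the answer.
P_RED = {"X": Fraction(4, 7), "Y": Fraction(5, 9), "Z": Fraction(1, 2)}


def prob_exactly_k_red(k):
    """Sum the probabilities of every colour pattern with exactly k reds."""
    total = Fraction(0)
    for colours in product((True, False), repeat=len(P_RED)):
        if sum(colours) != k:
            continue
        p = Fraction(1)
        for is_red, p_red in zip(colours, P_RED.values()):
            p *= p_red if is_red else 1 - p_red
        total += p
    return total


print(prob_exactly_k_red(2))  # 17/42
```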


Day 3


####Day 3: Conditional Probability
Task:
Suppose a family has 2 children, one of which is a boy. What is the probability that both children are boys?

Ans:

The approach is to reduce the sample space so that it contains only the outcomes where at least one child is a boy:
S(boy) = {BB, GB, BG}. If we let E be the event in which both children are boys (so E = {BB}),
we can find the probability as a fraction of the reduced sample space:
                    P(E) = |E|/|S| = 1/3.


####Day 3: Cards of the Same Suit
Task:
You draw 2 cards from a standard 52-card deck without replacing them. What is the probability that both cards are of the same suit?

Ans:

The first card drawn will be from any of the 4 suits and there will be 51 cards left in the deck, 
only 12 of which match the drawn card's suit. The probability of the second card being of the same suit is:
                    P(E) = 12/51 = 4/17.


####Day 3: Drawing Marbles
Task:
A bag contains 3 red marbles and 4 blue marbles. Then, 2 marbles are drawn from the bag, at random, without replacement. If the first marble drawn is red, what is the probability that the second marble is blue?

Ans:

Since 1 red marble has already been drawn, 6 marbles are left in the bag, 
4 of which are blue. The probability is P(E) = 4/6 = 2/3.


Day 4


####Day 4: Binomial Distribution I
Task:
The ratio of boys to girls for babies born in Russia is 1.09 : 1. If there is 1 child born per birth, what proportion of Russian families with exactly 6 children will have at least 3 boys?

Solution
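
A sketch of the calculation, assuming each birth is an independent trial with P(boy) = 1.09 / (1.09 + 1). The helper `binomial_pmf` is an illustrative name; `math.comb` needs Python 3.8 or later.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(exactly k successes in n independent trials with success probability p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


p_boy = 1.09 / (1.09 + 1)
at_least_3 = sum(binomial_pmf(k, 6, p_boy) for k in range(3, 7))
print(round(at_least_3, 3))  # 0.696
```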

####Day 4: Binomial Distribution II
Task:
A manufacturer of metal pistons finds that, on average, 12% of the pistons they manufacture are rejected because they are incorrectly sized. What is the probability that a batch of 10 pistons will contain:

  1) No more than 2 rejects?
  2) At least 2 rejects?

Solution
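
Both parts reduce to sums of binomial terms with n = 10 and p = 0.12. A minimal sketch; the names below are illustrative, and `math.comb` needs Python 3.8 or later.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(exactly k rejects among n pistons with reject probability p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


n, p = 10, 0.12
no_more_than_2 = sum(binomial_pmf(k, n, p) for k in range(0, 3))
at_least_2 = 1 - sum(binomial_pmf(k, n, p) for k in range(0, 2))
print(round(no_more_than_2, 3))  # 0.891
print(round(at_least_2, 3))      # 0.342
```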

####Day 4: Geometric Distribution I
Task:
The probability that a machine produces a defective product is 1/3. What is the probability that the 1st defect is found during the 5th inspection?

Solution
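
For a geometric distribution, the first defect on inspection n means n - 1 good products followed by one defect. A small sketch with exact fractions; `geometric_pmf` is an illustrative name.

```python
from fractions import Fraction


def geometric_pmf(n, p):
    """P(first success happens on trial n) = (1 - p)^(n - 1) * p."""
    return (1 - p) ** (n - 1) * p


answer = geometric_pmf(5, Fraction(1, 3))
print(answer, round(float(answer), 3))  # 16/243 0.066
```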

####Day 4: Geometric Distribution II
Task:
The probability that a machine produces a defective product is 1/3. What is the probability that the 1st defect is found during the first 5 inspections?

Solution
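
Here it is easier to use the complement: no defect is found only if all 5 inspections pass. A sketch with exact fractions; `geometric_cdf` is an illustrative name.

```python
from fractions import Fraction


def geometric_cdf(n, p):
    """P(first success happens within the first n trials) = 1 - (1 - p)^n."""
    return 1 - (1 - p) ** n


answer = geometric_cdf(5, Fraction(1, 3))
print(answer, round(float(answer), 3))  # 211/243 0.868
```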

Day 5


####Day 5: Poisson Distribution I
Task:
A random variable, X, follows Poisson distribution with mean of 2.5. Find the probability with which the random variable X is equal to 5.

Solution
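
A minimal sketch of the Poisson probability P(X = k) = lambda^k * e^(-lambda) / k!. The name `poisson_pmf` is illustrative.

```python
from math import exp, factorial


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson random variable with mean lam."""
    return lam ** k * exp(-lam) / factorial(k)


print(round(poisson_pmf(5, 2.5), 3))  # 0.067
```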

####Day 5: Poisson Distribution II
Task:
The manager of an industrial plant is planning to buy a machine of either type A or type B. For each day’s operation:

The number of repairs, X, that machine A needs is a Poisson random variable with mean 0.88. The daily cost of operating
A is Ca = 160 + 40X^2.
The number of repairs, Y, that machine B needs is a Poisson random variable with mean 1.55. The daily cost of operating
B is Cb = 128 + 40Y^2.

Assume that the repairs take a negligible amount of time and the machines are maintained nightly to ensure that they operate like new at the start of each day. Find and print the expected daily cost for each machine.

Solution
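
One way to approach this uses the fact that a Poisson variable has variance equal to its mean, so E[X^2] = Var(X) + E[X]^2 = lambda + lambda^2. The sketch below applies that identity; `expected_cost` is an illustrative name.

```python
def expected_cost(base, coeff, lam):
    """E[base + coeff * X^2] for Poisson X, using E[X^2] = lam + lam^2."""
    return base + coeff * (lam + lam ** 2)


print(round(expected_cost(160, 40, 0.88), 3))  # machine A: 226.176
print(round(expected_cost(128, 40, 1.55), 3))  # machine B: 286.1
```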

####Day 5: Normal Distribution I
Task:
In a certain plant, the time taken to assemble a car is a random variable, X, having a normal distribution with a mean of 20 hours and a standard deviation of 2 hours. What is the probability that a car can be assembled at this plant in:

Less than 19.5 hours?
Between 20 and 22 hours?

Solution
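
Both questions can be answered with the normal cumulative distribution function, which can be written with `math.erf`. A sketch; `normal_cdf` is an illustrative helper name.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal distribution with mean mu and std sigma."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


less_than_19_5 = normal_cdf(19.5, 20, 2)
between_20_and_22 = normal_cdf(22, 20, 2) - normal_cdf(20, 20, 2)
print(round(less_than_19_5, 3))     # 0.401
print(round(between_20_and_22, 3))  # 0.341
```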

####Day 5: Normal Distribution II
Task:
The final grades for a Physics exam taken by a large group of students have a mean of 70 and a standard deviation of 10. If we can approximate the distribution of these grades by a normal distribution, what percentage of the students:

Scored higher than 80 (i.e., have a grade > 80)?
Passed the test (i.e., have a grade >= 60)?
Failed the test (i.e., have a grade < 60)?

Find and print the answer to each question on a new line, rounded to a scale of 2 decimal places.

Solution

Day 6


####Day 6: The Central Limit Theorem I
Task:
A large elevator can transport a maximum of 9800 pounds. Suppose a load of cargo containing 49 boxes must be transported via the elevator. The box weight of this type of cargo follows a distribution with a mean of 205 pounds and a standard deviation of 15 pounds. Based on this information, what is the probability that all 49 boxes can be safely loaded into the freight elevator and transported?

Solution
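
By the central limit theorem, the total weight of n boxes is approximately normal with mean n * mu and standard deviation sqrt(n) * sigma. A sketch under that approximation; the variable names are illustrative.

```python
from math import erf, sqrt


def normal_cdf(x, mu, sigma):
    """P(X <= x) for a normal distribution with mean mu and std sigma."""
    return 0.5 * (1 + erf((x - mu) / (sigma * sqrt(2))))


n, mu, sigma, capacity = 49, 205, 15, 9800
total_mu = n * mu               # 10045 pounds
total_sigma = sqrt(n) * sigma   # 105 pounds
p_safe = normal_cdf(capacity, total_mu, total_sigma)
print(round(p_safe, 4))  # 0.0098
```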

####Day 6: The Central Limit Theorem II
Task:
The number of tickets purchased by each student for the University X vs. University Y football game follows a distribution that has a mean of 2.4 and a standard deviation of 2.0.

A few hours before the game starts, 100 eager students line up to purchase last-minute tickets. If there are only 250 tickets left, what is the probability that all 100 students will be able to purchase tickets?

Solution

####Day 6: The Central Limit Theorem III
Task:
You have a sample of 100 values from a population with mean 500 and with standard deviation 80. Compute the interval that covers the middle 95% of the distribution of the sample mean; in other words, compute A and B such that P(A < x < B) = 0.95. Use the value of z = 1.96. Note that z is the z-score.

Solution
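
The sample mean has standard deviation sigma / sqrt(n), so the interval is mu ± z * sigma / sqrt(n). A short sketch; `mean_interval` is an illustrative name.

```python
from math import sqrt


def mean_interval(mu, sigma, n, z):
    """Return (A, B) = mu -/+ z * sigma / sqrt(n)."""
    margin = z * sigma / sqrt(n)
    return mu - margin, mu + margin


a, b = mean_interval(500, 80, 100, 1.96)
print(round(a, 2), round(b, 2))  # 484.32 515.68
```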

Day 7


####Day 7: Pearson Correlation Coefficient I
Task:
Given two n-element data sets, X and Y, calculate the value of the Pearson correlation coefficient.

Solution
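
A sketch of the Pearson coefficient as the covariance divided by the product of the standard deviations (the 1/n factors cancel). The name `pearson` and the sample data are illustrative.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length data sets."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Illustrative data: a perfectly decreasing linear relationship.
print(round(pearson([1, 2, 3], [3, 2, 1]), 3))  # -1.0
```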

####Day 7: Spearman’s Rank Correlation Coefficient
Task:
Given two n-element data sets, X and Y, calculate the value of Spearman’s rank correlation coefficient.

Solution
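
A sketch that assumes no value is repeated within X or within Y, in which case the coefficient simplifies to 1 - 6 * sum(d^2) / (n * (n^2 - 1)), where d is the difference between paired ranks. The names `ranks` and `spearman` are illustrative.

```python
def ranks(values):
    """Rank of each value (1 = smallest); assumes all values are distinct."""
    position = {v: i + 1 for i, v in enumerate(sorted(values))}
    return [position[v] for v in values]


def spearman(xs, ys):
    """Spearman's rank correlation for data sets without ties."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


# Illustrative data: ranks differ in two positions by 1 each.
print(round(spearman([10, 20, 30, 40], [1, 3, 2, 4]), 3))  # 0.8
```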

Day 8


####Day 8: Least Square Regression Line
Task:
A group of five students enrolls in Statistics immediately after taking a Math aptitude test. Each student’s Math aptitude test score, x, and Statistics course grade, y, can be expressed as the following list of (x, y) points:
If a student scored an 80 on the Math aptitude test, what grade would we expect them to achieve in Statistics? Determine the equation of the best-fit line using the least squares method, then compute and print the value of y when x = 80.

Solution
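
A sketch of the least squares fit y = a + b * x, with b = sum((x - mean_x)(y - mean_y)) / sum((x - mean_x)^2) and a = mean_y - b * mean_x. The data points below are hypothetical, not the challenge's input, and `fit_line` is an illustrative name.

```python
def fit_line(xs, ys):
    """Return (a, b) for the least squares line y = a + b * x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    a = my - b * mx
    return a, b


# Hypothetical scores that lie exactly on y = x + 5.
a, b = fit_line([50, 60, 70, 90], [55, 65, 75, 95])
print(round(a + b * 80, 3))  # 85.0
```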

####Day 8: Pearson Correlation Coefficient II
Task:
The regression line of y on x is 3x + 4y + 8 = 0, and the regression line of x on y is 4x + 3y + 7 = 0. What is the value of the Pearson correlation coefficient?

Solution
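
One way to reason about this: the product of the two regression slopes equals r^2, and r has the same sign as the slopes. The sketch below reads each slope off the line equations; the helper names are illustrative.

```python
from math import copysign, sqrt


def slope_y_on_x(a, b):
    """For a*x + b*y + c = 0 solved for y: y = -(a/b)*x - c/b."""
    return -a / b


def slope_x_on_y(a, b):
    """For a*x + b*y + c = 0 solved for x: x = -(b/a)*y - c/a."""
    return -b / a


b_yx = slope_y_on_x(3, 4)  # from 3x + 4y + 8 = 0
b_xy = slope_x_on_y(4, 3)  # from 4x + 3y + 7 = 0
r = copysign(sqrt(b_yx * b_xy), b_yx)
print(r)  # -0.75
```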

Day 9


####Day 9: Multiple Linear Regression
Task:
Andrea has a simple equation: Y = a + b1·f1 + b2·f2 + … + bm·fm, for (m+1) real constants (a, b1, b2, …, bm). We can say that the value of Y depends on m features. Andrea studies this equation for n different feature sets (f1, f2, …, fm) and records each respective value of Y. If she has q new feature sets, can you help Andrea find the value of Y for each of the sets?

Note: You are not expected to account for bias and variance trade-offs.

Solution



NOTE: PRs and Stars are always welcome :)