In this presentation on the Cumulative Distribution Function (CDF), we will talk about the cumulative distribution function, a statistical function that gives the total probability at or below a selected point. We will see how to find the CDF and implement it using Python. The topics covered in this CDF in Probability presentation are:
1. What is a Cumulative Distribution Function?
2. Case Study on Iris Dataset
3. Cumulative Distribution Function with Python
What’s in it for you?
• What is Cumulative Distribution Function?
• Case Study on IRIS Flower Dataset
• Cumulative Distribution Function with Python
What is Cumulative Distribution Function?
What is Cumulative Distribution Function? The cumulative distribution function (CDF) describes the probability distribution of a random variable. It can be used for a discrete, continuous or mixed variable. Consider the curve below, a probability density function f(x) defined on the interval from a to b.
[Figure: plot of f(x) against x, with a and b marked on the x-axis]
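What is Cumulative Distribution Function? Consider a point c within the interval (a, b). The cumulative distribution function of a random variable X evaluated at c is the probability that X takes a value less than or equal to c. It is found by accumulating the total probability up to c, which is the area under f(x) between a and c. When f(x) is constant over (a, b), as the formula on the slide assumes, that area is a rectangle:
P(X <= c) = (c - a) f(x)
[Figure: plot of f(x) against x, with a, c and b marked on the x-axis]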
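The rectangle formula P(X <= c) = (c - a) f(x) can be checked with a few lines of Python. This is a minimal sketch that assumes X is uniformly distributed on (a, b), so that f(x) is the constant 1 / (b - a). The function name `uniform_cdf` and the example numbers are illustrative and do not come from the slides.

```python
def uniform_cdf(c, a, b):
    """P(X <= c) for X uniformly distributed on the interval (a, b)."""
    if b <= a:
        raise ValueError("require a < b")
    if c <= a:
        return 0.0               # no probability lies below a
    if c >= b:
        return 1.0               # all probability lies at or below b
    density = 1.0 / (b - a)      # constant height f(x) of the rectangle
    return (c - a) * density     # area of the rectangle from a to c


# Example: a = 2, b = 6, c = 3 covers a quarter of the interval.
print(uniform_cdf(3.0, 2.0, 6.0))  # 0.25
```

The two early returns clamp the result to 0 and 1 outside the interval, which is what keeps the function a valid CDF everywhere.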
What is Cumulative Distribution Function? In general, the cumulative distribution function of a random variable X with probability density function f(x) is the probability accumulated up to x, P(X <= x). It is obtained by integrating the PDF from the lower end of its range up to x:
$F_X(x) = P(X \le x) = \int_{-\infty}^{x} f(t)\,dt$
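To make "integrating the PDF" concrete, the sketch below approximates a CDF numerically with the trapezoid rule and compares it with a known closed form. The standard normal distribution is used purely as an example (it is an assumption, not taken from the slides), and the helper names, the lower cut-off of -10 and the step count are illustrative choices.

```python
import math


def normal_pdf(t):
    """Density of the standard normal distribution."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def cdf_by_integration(pdf, x, lower=-10.0, steps=20000):
    """Approximate F(x) by integrating pdf from `lower` to x (trapezoid rule).

    `lower` stands in for minus infinity; it must be far enough into the
    tail that the probability below it is negligible.
    """
    if x <= lower:
        return 0.0
    h = (x - lower) / steps
    total = 0.5 * (pdf(lower) + pdf(x))   # end points get half weight
    for i in range(1, steps):
        total += pdf(lower + i * h)       # interior points get full weight
    return total * h


def normal_cdf_exact(x):
    """Closed-form standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


approx = cdf_by_integration(normal_pdf, 1.0)
exact = normal_cdf_exact(1.0)
print(round(approx, 4), round(exact, 4))  # 0.8413 0.8413
```

Any other density can be passed as `pdf`, provided `lower` is chosen so that almost no probability lies below it.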
What is Cumulative Distribution Function?
Probability Density Function: A function that defines the relationship between a continuous random variable and its probability is called a Probability Density Function (PDF).
Continuous Random Variable: A continuous random variable is one that can take infinitely many different values within a range, e.g. rainfall in a month.
What is Cumulative Distribution Function?
Discrete Random Variable: A variable that can only take distinct, countable values within a certain range, e.g. the outcome of rolling a die.
Mixed Random Variable: A random variable that is a mixture of discrete and continuous components is called a mixed random variable.
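For a discrete variable the integral becomes a running sum of probabilities. A minimal sketch for the die example, assuming a fair six-sided die (the fairness assumption and the variable names are illustrative):

```python
from fractions import Fraction
from itertools import accumulate

faces = [1, 2, 3, 4, 5, 6]
pmf = [Fraction(1, 6)] * 6       # P(X = k) for a fair die (assumed)
cdf = list(accumulate(pmf))      # running total gives P(X <= k)

for face, p in zip(faces, cdf):
    print(face, p)
# 1 1/6
# 2 1/3
# 3 1/2
# 4 2/3
# 5 5/6
# 6 1
```

Using `Fraction` keeps the probabilities exact, so the last value is exactly 1 without any rounding error.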
Case Study In this case study, we will be working with the Iris dataset. It contains 50 data points for each of three different species of iris. Four features are measured: petal length, petal width, sepal length, and sepal width.
Case Study We first estimate the PDF of each feature by plotting its histogram for the three species. We can see that petal length has the least overlap between the three species, and that the data is approximately normally distributed.
Case Study
We can conclude that:
• Petal length < 1.9 is almost certainly ‘Setosa’
• 3.2 < petal length < 5 has a 95% chance of being ‘Versicolor’
• Petal length > 5 has a 90% chance of being ‘Virginica’
We find the CDF of the dataset using the petal length feature. This can be done by integrating over the PDF.
[Figure: PDF and CDF of petal length for the 3 species of iris flowers]
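One way to go from a histogram to a CDF, as described above, is to turn the bin counts into per-bin probabilities and then take their running sum. The sketch below is a standard-library approximation of that idea, not the presentation's own code. The function name `histogram_pdf_cdf` is hypothetical, and the sample values are made-up numbers chosen to keep the arithmetic easy to follow. They are not Iris measurements.

```python
def histogram_pdf_cdf(values, bins=10):
    """Return (bin_edges, pdf, cdf) estimated from a histogram.

    pdf[i] is the fraction of values falling in bin i, and cdf[i] is the
    running total of pdf up to and including bin i.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("values must not all be equal")
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum value belongs to the last bin, hence the min().
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    n = len(values)
    pdf = [c / n for c in counts]
    cdf = []
    running = 0.0
    for p in pdf:
        running += p                 # discrete analogue of integrating
        cdf.append(running)
    edges = [lo + i * width for i in range(bins + 1)]
    return edges, pdf, cdf


# Made-up illustrative values, NOT real Iris petal lengths.
toy_lengths = [1.0, 1.5, 2.0, 4.0, 4.5, 5.0, 5.5, 6.0]
edges, pdf, cdf = histogram_pdf_cdf(toy_lengths, bins=5)
print(pdf)  # [0.25, 0.125, 0.0, 0.25, 0.375]
print(cdf)  # [0.25, 0.375, 0.375, 0.625, 1.0]
```

If scikit-learn is installed, the real petal lengths could be passed in instead, for example `load_iris().data[:, 2]` from `sklearn.datasets`, filtered by species with the `target` array. Reading the CDF at a threshold such as 5 then gives the fraction of flowers of that species at or below that petal length.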
Join us to learn more! simplilearn.com