Distribution Analysis In Machine Learning
Distribution Analysis
A sample of data forms a distribution, and by far the most well-known distribution is the Gaussian distribution, often called the Normal distribution.
A distribution is simply a collection of data, or scores, on a variable. Usually, these scores are arranged in order from smallest to largest so that they can be presented graphically.
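As a minimal sketch of this idea, the snippet below draws a hypothetical sample of scores from a Gaussian distribution, arranges them from smallest to largest, and checks how many fall within one standard deviation of the mean. The mean of 50, standard deviation of 10, sample size and seed are illustrative assumptions, not values from a real dataset.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical data: 10,000 scores drawn from a Gaussian (mean 50, std 10).
scores = [random.gauss(50, 10) for _ in range(10_000)]

# Arrange the scores from smallest to largest.
ordered = sorted(scores)

mean = statistics.fmean(ordered)
std = statistics.stdev(ordered)

# For Gaussian data, roughly 68% of scores lie within one standard
# deviation of the mean.
within_one_std = sum(mean - std <= s <= mean + std for s in ordered) / len(ordered)

print(f"min={ordered[0]:.1f}, max={ordered[-1]:.1f}")
print(f"mean={mean:.1f}, std={std:.1f}, within 1 std={within_one_std:.1%}")
```

Once sorted, the same list can be passed to any plotting tool to present the distribution graphically.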
Types of Distribution
Gaussian Distribution
Probability Distribution

Normal Distribution

Lognormal Distribution

Gamma Distribution

Weibull Distribution
Student’s t-Distribution
Chi-Squared Distribution
And others
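To get a feel for how these distributions differ, one option is to draw samples from each and compare simple summaries. The sketch below uses only Python's standard `random` module. All parameter values (shape, scale, degrees of freedom) are arbitrary assumptions chosen for illustration. Because the standard library has no direct chi-squared or Student's t sampler, the helper functions `chi_squared` and `student_t` are hypothetical names that build them from their textbook definitions.

```python
import math
import random
import statistics

random.seed(1)
N = 20_000   # samples per distribution (illustrative)
DOF = 5      # hypothetical degrees of freedom for t and chi-squared


def chi_squared(k):
    # A chi-squared variable with k degrees of freedom is a gamma
    # variable with shape k/2 and scale 2.
    return random.gammavariate(k / 2, 2)


def student_t(k):
    # Student's t: a standard normal divided by sqrt(chi-squared / k).
    return random.gauss(0, 1) / math.sqrt(chi_squared(k) / k)


samplers = {
    "normal": lambda: random.gauss(0, 1),
    "lognormal": lambda: random.lognormvariate(0, 0.5),
    "gamma": lambda: random.gammavariate(2, 1.5),       # shape 2, scale 1.5
    "weibull": lambda: random.weibullvariate(1, 1.5),   # scale 1, shape 1.5
    "student_t": lambda: student_t(DOF),
    "chi_squared": lambda: chi_squared(DOF),
}

samples = {name: [draw() for _ in range(N)] for name, draw in samplers.items()}

for name, values in samples.items():
    print(f"{name:12s} mean={statistics.fmean(values):7.3f} "
          f"median={statistics.median(values):7.3f}")
```

A gap between the mean and the median in this output is a quick hint that a distribution is not symmetric.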
Skewed Distribution
Left Skewed Distribution
When data points cluster on the right side of the distribution, the tail is longer on the left side. This is the property of a Left Skewed Distribution. The tail is longer in the negative direction, so we also call it a Negatively Skewed Distribution.
Right Skewed Distribution
When data points cluster on the left side of the distribution, the tail is longer on the right side. This is the property of a Right Skewed Distribution. Here, the tail is longer in the positive direction, so we also call it a Positively Skewed Distribution.
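The direction of skew can also be checked numerically. The sketch below computes one common measure of skewness, the average cubed standardized deviation, which is positive for a right-skewed sample and negative for a left-skewed one. The function name `skewness` and the sample data are illustrative assumptions: an exponential sample stands in for right-skewed data, and its mirror image for left-skewed data.

```python
import random
import statistics


def skewness(values):
    """Average cubed deviation from the mean, in units of standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return statistics.fmean([((v - mean) / std) ** 3 for v in values])


random.seed(2)

# Hypothetical samples for illustration.
right_skewed = [random.expovariate(1.0) for _ in range(10_000)]  # long right tail
left_skewed = [-v for v in right_skewed]                         # mirrored: long left tail
symmetric = [random.gauss(0, 1) for _ in range(10_000)]

for name, data in [("right", right_skewed), ("left", left_skewed), ("symmetric", symmetric)]:
    print(f"{name:10s} skewness={skewness(data):6.2f} "
          f"mean={statistics.fmean(data):6.2f} median={statistics.median(data):6.2f}")
```

In this example the mean is pulled toward the long tail, so it sits above the median for the right-skewed sample and below it for the left-skewed one.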
Visualization Techniques Used in Distribution Analysis
Histogram

A histogram visualizes the distribution of data over a continuous interval.

Each bar in a histogram represents the tabulated frequency for one interval (bin).

In simple words, the height of each bar represents the frequency for the respective bin (interval).
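The binning behind a histogram can be sketched in a few lines of plain Python. The function `histogram` below is a hypothetical helper written for illustration, not a library API: it splits the range into equal-width bins and counts how many values land in each one. The sample data and the choice of 10 bins are assumptions.

```python
import random


def histogram(values, bins, low, high):
    """Count the values falling into each of `bins` equal-width intervals on [low, high]."""
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        if low <= v <= high:
            # A value equal to `high` is placed in the last bin.
            index = min(int((v - low) / width), bins - 1)
            counts[index] += 1
    edges = [low + i * width for i in range(bins + 1)]
    return counts, edges


random.seed(3)
data = [random.gauss(0, 1) for _ in range(1_000)]  # hypothetical sample

counts, edges = histogram(data, bins=10, low=min(data), high=max(data))

# Text rendering: the length of each bar reflects the frequency of its bin.
for left, right, count in zip(edges, edges[1:], counts):
    print(f"[{left:6.2f}, {right:6.2f}) {'#' * (count // 10)} {count}")
```

Calling `histogram` again with a different `bins` value shows how strongly the picture depends on the bin choice.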
KDE Plots
Histogram results can vary wildly if you set different numbers of bins or simply change the start and end values of a bin. To overcome this, we can make use of a density plot.
A density plot is a smoothed, continuous version of a histogram estimated from the data. The most common form of estimation is known as kernel density estimation (KDE). In this method, a continuous curve (the kernel) is drawn at every individual data point, and all of these curves are then added together to make a single smooth density estimate.
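The procedure described above can be sketched directly: place a Gaussian kernel on every data point, add the kernels together, and scale so the result integrates to one. The names `gaussian_kernel` and `kde` and the bandwidth of 0.4 are assumptions made for this illustration. In practice the bandwidth is usually chosen by a rule of thumb or by cross-validation, and plotting libraries provide ready-made KDE functions.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


def kde(data, bandwidth):
    """Return a function that evaluates the kernel density estimate at x."""
    n = len(data)

    def density(x):
        # One kernel per data point, summed and normalized.
        return sum(gaussian_kernel((x - xi) / bandwidth) for xi in data) / (n * bandwidth)

    return density


random.seed(4)
data = [random.gauss(0, 1) for _ in range(200)]  # hypothetical sample
density = kde(data, bandwidth=0.4)               # bandwidth is an assumed value

# Evaluate on a grid from -6 to 6 and check that the curve integrates to about 1.
step = 0.05
grid = [-6 + i * step for i in range(241)]
area = sum(density(x) for x in grid) * step

print(f"density at 0: {density(0.0):.3f}")
print(f"approximate area under the curve: {area:.3f}")
```

A smaller bandwidth gives a bumpier curve that follows individual points, while a larger one gives a smoother curve that may hide detail.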