Distribution Analysis In Machine Learning

Distribution Analysis

A sample of data will form a distribution, and by far the most well-known distribution is the Gaussian distribution, often called the Normal distribution.

A distribution is simply a collection of data, or scores, on a variable. Usually, these scores are arranged in order from smallest to largest so that they can be presented graphically.

Types of Distribution

Gaussian Distribution

Probability Distribution

  • Normal Distribution

  • Lognormal Distribution

  • Gamma Distribution

  • Weibull Distribution

Student’s t-Distribution

Chi-Squared Distribution

And others

Skewed Distribution

Left Skewed Distribution

When data points cluster on the right side of the distribution, the tail is longer on the left side. This is the defining property of a left-skewed distribution. Because the tail extends in the negative direction, it is also called a negatively skewed distribution.

Right Skewed Distribution

When data points cluster on the left side of the distribution, the tail is longer on the right side. This is the defining property of a right-skewed distribution. Because the tail extends in the positive direction, it is also called a positively skewed distribution.
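The direction of skew can also be checked numerically. The sketch below is one possible way to do it, using the moment-based skewness coefficient (third central moment divided by the 1.5 power of the second). The function name `skewness` and the sample data are illustrative assumptions, not part of the original text.

```python
import random
import statistics


def skewness(values):
    """Moment-based skewness: m3 / m2**1.5, using population moments."""
    mean = statistics.fmean(values)
    n = len(values)
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    return m3 / m2 ** 1.5


# Illustrative data (assumed): exponential draws cluster near zero
# and have a long tail to the right.
rng = random.Random(0)
right_skewed = [rng.expovariate(1.0) for _ in range(5000)]

# Mirroring the data moves the long tail to the left.
left_skewed = [-x for x in right_skewed]

# Evenly spaced values around the mean have no skew.
symmetric = [1, 2, 3, 4, 5]

print("right-skewed:", skewness(right_skewed))  # positive value
print("left-skewed: ", skewness(left_skewed))   # negative value
print("symmetric:   ", skewness(symmetric))     # zero
```

A positive result indicates a right (positive) skew and a negative result a left (negative) skew, matching the two definitions above.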

Visualization Techniques Used in Distribution Analysis

Histogram

  • A Histogram visualizes the distribution of data over a continuous interval

  • Each bar in a histogram represents the tabulated frequency at each interval/bin

  • In simple words, height represents the frequency for the respective bin (interval)
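The tabulation behind a histogram can be sketched in a few lines. This is a minimal illustration that assumes equal-width bins with the last bin including its right edge. The helper name `histogram_counts` and the `scores` list are hypothetical.

```python
def histogram_counts(values, num_bins, low, high):
    """Count how many values fall into each of num_bins equal-width bins."""
    width = (high - low) / num_bins
    counts = [0] * num_bins
    for x in values:
        if low <= x <= high:
            # Clamp so that x == high lands in the last bin.
            idx = min(int((x - low) / width), num_bins - 1)
            counts[idx] += 1
    return counts


# Illustrative scores (assumed data).
scores = [1, 2, 2, 3, 3, 3, 4, 4, 5, 9]
counts = histogram_counts(scores, num_bins=4, low=0, high=10)

# Text rendering: bar length stands in for bar height.
for i, count in enumerate(counts):
    left = i * 2.5
    print(f"[{left:4.1f}, {left + 2.5:4.1f})  {'#' * count}")
```

Each row corresponds to one bin, and the number of `#` characters is the frequency for that bin.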

KDE Plots

Histogram results can vary widely if you change the number of bins or simply shift the start and end values of the bins. To reduce this sensitivity, we can estimate a density function instead.
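A small sketch of this sensitivity, using made-up data: the same six values look peaked with one set of bin edges and flat with another. The helper `bin_counts` and the data are assumptions chosen for illustration.

```python
def bin_counts(values, low, width, num_bins):
    """Count values in num_bins bins of the given width, starting at low."""
    counts = [0] * num_bins
    for x in values:
        idx = int((x - low) / width)
        if 0 <= idx < num_bins:
            counts[idx] += 1
    return counts


# Illustrative data (assumed): pairs of values close to 2, 4 and 6.
data = [1.9, 2.1, 3.9, 4.1, 5.9, 6.1]

# Bins of width 2 starting at 0: the bin edges split each pair.
bins_from_zero = bin_counts(data, low=0, width=2, num_bins=4)

# Same width, but starting at 1: each pair falls into a single bin.
bins_from_one = bin_counts(data, low=1, width=2, num_bins=3)

print(bins_from_zero)  # [1, 2, 2, 1]
print(bins_from_one)   # [2, 2, 2]
```

The data did not change between the two calls; only the bin start moved.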

A density plot is a smoothed, continuous version of a histogram estimated from the data. The most common form of estimation is known as kernel density estimation (KDE). In this method, a continuous curve (the kernel) is drawn at every individual data point, and all of these curves are then added together to produce a single smooth density estimate.
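The "one curve per data point, then add them up" idea can be written out directly. The sketch below assumes a Gaussian kernel and a hand-picked bandwidth of 0.5. Both are illustrative choices, as are the names `gaussian_kernel` and `kde` and the sample data.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, used here as the kernel (an assumed choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2 * math.pi)


def kde(data, x, bandwidth):
    """Density estimate at x: the average of one kernel per data point."""
    n = len(data)
    total = sum(gaussian_kernel((x - xi) / bandwidth) for xi in data)
    return total / (n * bandwidth)


# Illustrative data (assumed): four nearby points and one farther away.
data = [1.0, 1.5, 2.0, 2.2, 5.0]

# Evaluate the estimate on a grid from -3.0 to 10.0 in steps of 0.1.
step = 0.1
grid = [i * step for i in range(-30, 101)]
density = [kde(data, x, bandwidth=0.5) for x in grid]

# A density should integrate to roughly 1 (simple Riemann sum here).
area = sum(density) * step
print("approximate area under the curve:", round(area, 3))
```

Plotting `density` against `grid` would give the smooth curve described above. A larger bandwidth produces a smoother curve, and a smaller one follows the individual points more closely.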