Skewness
Skewness and kurtosis describe two additional features of a data set, apart from its central location (mean) and dispersion. To understand the inherent nature of a given data set, we have to measure four basic features: the average, the variance, the skewness and the kurtosis.
Skewness means the lack of symmetry of a data set. It can be easily observed from the frequency curve. If we plot the frequency curve of the data and draw a reference line at the value of the mode, and the parts of the curve on either side of the line are mirror images of each other, the data is called symmetric; otherwise it is called skewed.
Positive Skewness: Skewness is said to be positive when the tail of the curve of the frequency distribution elongates more on the right. Also, skewness is positive if mean, median and mode of the frequency distribution satisfy the following condition:
Mean>Median>Mode
Negative Skewness: Skewness is said to be negative when the tail of the curve of the frequency distribution elongates more on the left. Also, skewness is negative if mean, median and mode of the frequency distribution satisfy the condition
Mean<Median<Mode
If the curve of the frequency distribution is symmetrical, then skewness is zero. In this case, we have the relation
Mean=Median=Mode
Figures of the symmetrical, positively skewed and negatively skewed distributions are given below:
The object of studying skewness is to estimate the direction in which, and the extent to which, the curve of the frequency distribution departs from a symmetrical distribution.
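The orderings of mean, median and mode described above can be checked on small samples. The following is a minimal sketch using Python's standard `statistics` module; the function name `skew_direction` and the three sample lists are made up for illustration, and the ordering test is only a rule of thumb for single-peaked data.

```python
from statistics import mean, median, mode

def skew_direction(data):
    """Classify the skew direction from the ordering of mean, median and mode."""
    m, md, mo = mean(data), median(data), mode(data)
    if m > md > mo:
        return "positive"
    if m < md < mo:
        return "negative"
    if m == md == mo:
        return "symmetric"
    return "undetermined"  # the three averages do not follow a clean ordering

# Illustrative samples (not from the text)
right_tailed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]   # long tail on the right
left_tailed = [-x for x in right_tailed]          # mirror image: tail on the left
symmetric = [1, 2, 3, 3, 3, 4, 5]

for name, sample in [("right_tailed", right_tailed),
                     ("left_tailed", left_tailed),
                     ("symmetric", symmetric)]:
    print(name, skew_direction(sample))
```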
Measures of Skewness
Absolute measure
The absolute measures of skewness are given by the following formulas:
Absolute skewness = Mean - Mode, or Mean - Median
Absolute skewness based on quartiles:
$$=(Q_3-M_d)-(M_d-Q_1)$$
$$=Q_3+Q_1-2M_d$$
The first measure of skewness is based on the fact that, in a skewed distribution, the mean, median and mode do not coincide. The second measure is based on the fact that, in a skewed distribution, the median does not lie midway between the lower and upper quartiles.
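The absolute measures above can be sketched as a small helper. The function name and the sample values are hypothetical, chosen only so that the arithmetic is easy to follow.

```python
def absolute_skewness(mean, median, mode, q1, q3):
    """Return the three absolute measures of skewness as a dictionary."""
    return {
        "mean_minus_mode": mean - mode,
        "mean_minus_median": mean - median,
        # (Q3 - Md) - (Md - Q1), which simplifies to Q3 + Q1 - 2*Md
        "quartile": (q3 - median) - (median - q1),
    }

# Hypothetical summary values for a right-skewed data set
example = absolute_skewness(mean=52, median=50, mode=46, q1=40, q3=64)
print(example)
```

All three values carry the unit of the data, so they cannot be compared across data sets measured in different units.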
Relative measure of skewness
The coefficient of skewness based on the mean, mode and standard deviation is known as Karl Pearson’s coefficient of skewness:
$$S_k=\frac{Mean-Mode}{\sigma}$$
This formula can be used for fairly symmetric data. However, if the given data is moderately skewed, we use Pearson’s empirical relation, Mean - Mode = 3(Mean - Median), which gives:
$$S_k(P)=\frac{3(Mean-Median)}{\sigma}$$
Karl Pearson’s coefficient of skewness lies between -3 and +3.
The coefficient of skewness based on quartiles is known as Bowley’s coefficient of skewness and is given by
$$S_k(B)=\frac{Q_3+Q_1-2M_d}{Q_3-Q_1}$$
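Bowley’s coefficient of skewness is better suited to distributions with open-ended classes, because it depends only on the quartiles. Bowley’s coefficient of skewness lies between -1 and +1.
For both coefficients of skewness, we have the following conclusions: if the coefficient is zero, the distribution is symmetrical; if it is positive, the distribution is positively skewed; and if it is negative, the distribution is negatively skewed.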
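The relative measures can be sketched in the same way. The function names below are illustrative, the median-based form uses the standard relation Mean - Mode = 3(Mean - Median), and the input values are hypothetical.

```python
def pearson_mode(mean, mode, sd):
    """Karl Pearson's coefficient based on the mode: (Mean - Mode) / sigma."""
    return (mean - mode) / sd

def pearson_median(mean, median, sd):
    """Karl Pearson's coefficient based on the median: 3(Mean - Median) / sigma."""
    return 3 * (mean - median) / sd

def bowley(q1, median, q3):
    """Bowley's coefficient: (Q3 + Q1 - 2*Md) / (Q3 - Q1)."""
    return (q3 + q1 - 2 * median) / (q3 - q1)

# Hypothetical summary values
print(pearson_mode(mean=52, mode=46, sd=10))
print(pearson_median(mean=52, median=50, sd=10))
print(bowley(q1=40, median=50, q3=64))
```

Unlike the absolute measures, these coefficients are unit-free, so they can be compared across data sets.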
Example: The IQ scores of 120 students of a class are given below:
IQ score | 50-60 | 60-70 | 70-80 | 80-90 | 90-100 | 100-110 | 110-120 | 120-130 | 130-140 |
No. of students | 5 | 8 | 10 | 18 | 25 | 21 | 19 | 10 | 4 |
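Calculate the following measures of skewness: the absolute measures based on the median, the mode and the quartiles, Karl Pearson’s coefficient and Bowley’s coefficient.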
Solution:
To calculate these measures of skewness, we first calculate the mean, median, mode, standard deviation and quartiles using the following table:
IQ score | No. of students (f) | C.f. | Mid value X | fX | fX^{2} |
50-60 | 5 | 5 | 55 | 275 | 15125 |
60-70 | 8 | 13 | 65 | 520 | 33800 |
70-80 | 10 | 23 | 75 | 750 | 56250 |
80-90 | 18 | 41 | 85 | 1530 | 130050 |
90-100 | 25 | 66 | 95 | 2375 | 225625 |
100-110 | 21 | 87 | 105 | 2205 | 231525 |
110-120 | 19 | 106 | 115 | 2185 | 251275 |
120-130 | 10 | 116 | 125 | 1250 | 156250 |
130-140 | 4 | 120 | 135 | 540 | 72900 |
Sum | 120 | | | 11630 | 1172800 |
$$Mean(\overline{x})=\frac{\Sigma\;fX}{N}=\frac{11630}{120}=96.92$$
$$Variance=\frac{\Sigma\;fX^2}{N}-\bigg(\frac{\Sigma\;fX}{N}\bigg)^2=\frac{1172800}{120}-(96.92)^2=380.49$$
$$Standard\;deviation\;=\;\sqrt{variance}=\sqrt{380.49}=19.50$$
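The column sums, mean, variance and standard deviation can be reproduced from the mid values and frequencies of the table. This is a minimal sketch; the variable names are illustrative.

```python
from math import sqrt

# Mid values and frequencies of the nine IQ classes from the table
mids = [55, 65, 75, 85, 95, 105, 115, 125, 135]
freqs = [5, 8, 10, 18, 25, 21, 19, 10, 4]

n = sum(freqs)                                          # total frequency N
sum_fx = sum(f * x for f, x in zip(freqs, mids))        # sum of fX
sum_fx2 = sum(f * x * x for f, x in zip(freqs, mids))   # sum of fX^2

mean = sum_fx / n
variance = sum_fx2 / n - mean ** 2
sd = sqrt(variance)

print(n, sum_fx, sum_fx2)
print(round(mean, 2), round(variance, 2), round(sd, 2))
```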
$$First\;quartile=Q_1=L+\frac{\frac{N}{4}-c.f.}{f}\times\;h=80+(30-23)\times\;\frac{10}{18}=83.89$$
$$Second\;quartile=Q_2=M_d=L+\frac{\frac{N}{2}-c.f.}{f}\times\;h=90+(60-41)\times\;\frac{10}{25}=97.60$$
$$Third\;quartile=Q_3=L+\frac{\frac{3N}{4}-c.f.}{f}\times\;h=110+(90-87)\times\;\frac{10}{19}=111.58$$
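The quartile formula for grouped data, L + (pN - c.f.)/f × h, can be sketched as one function that locates the class containing the required position and interpolates inside it. The function name `grouped_quantile` is illustrative, and the sketch assumes equal class widths and 0 < p < 1.

```python
def grouped_quantile(lowers, freqs, width, p):
    """Interpolated quantile for grouped data with equal class widths."""
    target = p * sum(freqs)       # position p*N in the ordered data
    cum = 0                       # cumulative frequency before the current class
    for lower, f in zip(lowers, freqs):
        if cum + f >= target:     # the quantile falls in this class
            return lower + (target - cum) / f * width
        cum += f
    raise ValueError("p must lie between 0 and 1")

# Lower class limits and frequencies of the IQ table
lowers = [50, 60, 70, 80, 90, 100, 110, 120, 130]
freqs = [5, 8, 10, 18, 25, 21, 19, 10, 4]

q1 = grouped_quantile(lowers, freqs, 10, 0.25)
q2 = grouped_quantile(lowers, freqs, 10, 0.50)
q3 = grouped_quantile(lowers, freqs, 10, 0.75)
print(round(q1, 2), round(q2, 2), round(q3, 2))
```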
Also,
$$Mode=M_0=L+\frac{f_1-f_0}{2f_1-f_0-f_2}\times\;h=90+\frac{10\times\;(25-18)}{2\times\;25-18-21}=96.36$$
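The mode formula can be sketched in the same style: take the class with the highest frequency as the modal class, with f1 its frequency and f0, f2 the frequencies of the classes just before and after it. The function name is illustrative, and treating a missing neighbouring class as frequency 0 is an assumption of this sketch.

```python
def grouped_mode(lowers, freqs, width):
    """Mode of grouped data: L + (f1 - f0) / (2*f1 - f0 - f2) * h."""
    i = max(range(len(freqs)), key=lambda k: freqs[k])   # index of the modal class
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0                    # assumed 0 if no class before
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0       # assumed 0 if no class after
    return lowers[i] + (f1 - f0) / (2 * f1 - f0 - f2) * width

lowers = [50, 60, 70, 80, 90, 100, 110, 120, 130]
freqs = [5, 8, 10, 18, 25, 21, 19, 10, 4]

mode_value = grouped_mode(lowers, freqs, 10)
print(round(mode_value, 2))
```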
Hence,
$$S_k(Based\;on\;median)=Mean-Median=96.92-97.60=-0.68$$
$$S_k(Based\;on\;mode)=Mean-Mode=96.92-96.36=0.56$$
$$S_k(Based\;on\;quartiles)=(Q_3-Q_2)-(Q_2-Q_1)=(111.58-97.60)-(97.60-83.89)=0.27$$
$$S_k(Karl\;Pearson)=\frac{Mean-Mode}{\sigma}=\frac{96.92-96.36}{19.50}=0.029$$
$$S_k(Bowley)=\frac{Q_3+Q_1-2M_d}{Q_3-Q_1}=\frac{(111.58+83.89)-2\times\;97.60}{111.58-83.89}=\frac{0.27}{27.69}=0.010$$
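As an arithmetic check, the rounded summary values obtained in the solution can be substituted into the five formulas. This is a sketch only; the variable names are illustrative and the inputs are the two-decimal values from the working above.

```python
# Rounded summary values from the solution
mean, median, mode, sd = 96.92, 97.60, 96.36, 19.50
q1, q3 = 83.89, 111.58

sk_median = mean - median                       # absolute, based on median
sk_mode = mean - mode                           # absolute, based on mode
sk_quartile = (q3 - median) - (median - q1)     # absolute, based on quartiles
sk_pearson = (mean - mode) / sd                 # Karl Pearson's coefficient
sk_bowley = (q3 + q1 - 2 * median) / (q3 - q1)  # Bowley's coefficient

for name, value in [("median", sk_median), ("mode", sk_mode),
                    ("quartile", sk_quartile), ("Pearson", sk_pearson),
                    ("Bowley", sk_bowley)]:
    print(f"{name}: {value:.3f}")
```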
Reference taken from:
(Basic Mathematics Grade XII, A Foundation of Mathematics Volume II, and Wikipedia)