Notes on Skewness with Example | Grade 12 > Mathematics > Dispersion, correlation and regression | KULLABS.COM

Skewness with Example

Skewness

Skewness and kurtosis are two additional features of a data set, apart from central location (mean) and dispersion. To understand the inherent nature of a given data set, we have to measure these four basic features: the average, variance, skewness and kurtosis.

Skewness means the lack of symmetry of a data set. It can easily be observed from the frequency curve. If we draw the frequency curve of the data and a reference line at the value of the mode, and the curve has the same shape on either side of that line, the data is called symmetric; otherwise it is skewed.

Positive Skewness: Skewness is said to be positive when the tail of the curve of the frequency distribution elongates more on the right. Also, skewness is positive if mean, median and mode of the frequency distribution satisfy the following condition:

Mean > Median > Mode

Negative Skewness: Skewness is said to be negative when the tail of the curve of the frequency distribution elongates more on the left. Also, skewness is negative if mean, median and mode of the frequency distribution satisfy the condition

Mean<Median<Mode

If the curve of the frequency distribution is symmetrical, then skewness is zero. In this case, we have the relation

Mean=Median=Mode

The figures of the symmetrical, positively skewed and negatively skewed distributions are shown below:

[Figure: skewness frequency graph. Source: www.safaribooksonline.com]

The object of studying skewness is to estimate the direction in which, and the extent to which, the curve of the frequency distribution departs from a symmetrical distribution.

Measures of Skewness

Absolute measure

The absolute measures of skewness are given by the following formulas:

  • The first measure of skewness

= Mean – Mode or Mean - Median

  • The second measure of skewness

$$=(Q_3-M_d)-(M_d-Q_1)$$

$$=Q_3+Q_1-2M_d$$

The first measure of skewness is based on the fact that in a skewed distribution the mean, median and mode do not coincide. The second measure is based on the fact that in a skewed distribution the median does not lie midway between the lower and upper quartiles.
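As a minimal sketch, the absolute measures above can be computed from summary values as follows. The function name `absolute_skewness` and the numbers passed to it are hypothetical, chosen only for illustration.

```python
def absolute_skewness(mean, median, mode, q1, q3):
    """Return the absolute measures of skewness as a dictionary."""
    return {
        "mean_minus_mode": mean - mode,
        "mean_minus_median": mean - median,
        # (Q3 - Md) - (Md - Q1) simplifies to Q3 + Q1 - 2*Md
        "quartile": q3 + q1 - 2 * median,
    }


# Hypothetical summary values, not taken from any real data set.
measures = absolute_skewness(mean=50, median=48, mode=44, q1=40, q3=58)
print(measures)
```

Because these measures carry the units of the data, they cannot be compared across data sets measured in different units, which is why relative measures are also used.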

Relative measure of skewness

  • Karl Pearson’s coefficient of skewness

The coefficient of skewness based on the mean, mode and standard deviation is known as Karl Pearson’s coefficient of skewness:

$$S_k=\frac{Mean-Mode}{\sigma}$$

This formula can be used for fairly symmetric data. However, if the given data is moderately skewed, or the mode is ill-defined, we use Pearson’s empirical relation Mean − Mode = 3(Mean − Median), which gives:

$$S_k(P)=\frac{3(Mean-Median)}{\sigma}$$

Karl Pearson’s coefficient of skewness lies between -3 and +3.
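The two forms of Karl Pearson’s coefficient can be sketched in Python as below. The function names and the input values are hypothetical; the inputs are chosen so that the empirical relation Mode = 3 Median − 2 Mean holds exactly, in which case both forms agree.

```python
def pearson_skewness_mode(mean, mode, sd):
    """Karl Pearson's coefficient based on the mode: (Mean - Mode) / sigma."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (mean - mode) / sd


def pearson_skewness_median(mean, median, sd):
    """Karl Pearson's coefficient based on the median: 3(Mean - Median) / sigma."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return 3 * (mean - median) / sd


# Hypothetical values: mode = 3*48 - 2*50 = 44, so the two forms coincide.
sk_mode = pearson_skewness_mode(mean=50, mode=44, sd=10)
sk_median = pearson_skewness_median(mean=50, median=48, sd=10)
print(sk_mode, sk_median)
```

For real data the two values generally differ, since the empirical relation holds only approximately.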

  • Bowley’s coefficient of skewness

The coefficient of skewness based on quartiles is known as Bowley’s coefficient of skewness and is given by

$$S_k(B)=\frac{Q_3+Q_1-2M_d}{Q_3-Q_1}$$

Bowley’s coefficient of skewness is better suited when the given distribution has open-ended classes, because it needs only the quartiles and not the mean or standard deviation. Bowley’s coefficient of skewness lies between -1 and +1.

For both coefficients of skewness, we have the following conclusions

  • If \(S_k=0\), the distribution is symmetrical.
  • If \(S_k>0\), the distribution is positively skewed.
  • If \(S_k<0\), the distribution is negatively skewed.
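Bowley’s coefficient and the sign rule above can be sketched as follows. The names `bowley_skewness` and `describe_skewness`, and the quartile values used, are hypothetical and serve only as an illustration.

```python
def bowley_skewness(q1, median, q3):
    """Bowley's coefficient: (Q3 + Q1 - 2*Md) / (Q3 - Q1)."""
    iqr = q3 - q1
    if iqr <= 0:
        raise ValueError("Q3 must be greater than Q1")
    return (q3 + q1 - 2 * median) / iqr


def describe_skewness(coefficient):
    """Translate the sign of a skewness coefficient into words."""
    if coefficient > 0:
        return "positively skewed"
    if coefficient < 0:
        return "negatively skewed"
    return "symmetrical"


# Hypothetical quartiles: the median lies closer to Q1 than to Q3.
sk_b = bowley_skewness(q1=40, median=48, q3=58)
print(round(sk_b, 3), describe_skewness(sk_b))
```

The same `describe_skewness` rule applies to Karl Pearson’s coefficient, since only the sign is used.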

Example: The IQ scores of 120 students of a class are given below:

| IQ score | 50-60 | 60-70 | 70-80 | 80-90 | 90-100 | 100-110 | 110-120 | 120-130 | 130-140 |
|---|---|---|---|---|---|---|---|---|---|
| No. of students | 5 | 8 | 10 | 18 | 25 | 21 | 19 | 10 | 4 |

Calculate the following types of skewness:

  • Based on median
  • Based on mode
  • Based on quartiles
  • Based on Karl Pearson’s definition
  • Based on Bowley’s definition

Solution:

To calculate the above types of skewness, we first calculate the mean, median, mode, standard deviation and quartiles using the following table:

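| IQ score | No. of students (f) | C.f. | Mid value (X) | fX | fX² |
|---|---|---|---|---|---|
| 50-60 | 5 | 5 | 55 | 275 | 15125 |
| 60-70 | 8 | 13 | 65 | 520 | 33800 |
| 70-80 | 10 | 23 | 75 | 750 | 56250 |
| 80-90 | 18 | 41 | 85 | 1530 | 130050 |
| 90-100 | 25 | 66 | 95 | 2375 | 225625 |
| 100-110 | 21 | 87 | 105 | 2205 | 231525 |
| 110-120 | 19 | 106 | 115 | 2185 | 251275 |
| 120-130 | 10 | 116 | 125 | 1250 | 156250 |
| 130-140 | 4 | 120 | 135 | 540 | 72900 |
| Sum | 120 | | | 11630 | 1172800 |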

$$Mean(\overline{x})=\frac{\Sigma\;fX}{N}=\frac{11630}{120}=96.92$$

$$Variance=\frac{\Sigma\;fX^2}{N}-\bigg(\frac{\Sigma\;fX}{N}\bigg)^2=\frac{1172800}{120}-\bigg(\frac{11630}{120}\bigg)^2=380.49$$

$$Standard\;deviation\;=\;\sqrt{variance}=\sqrt{380.49}=19.51$$
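The mean, variance and standard deviation of the grouped data can be checked with a short Python sketch. The variable names are illustrative; the class limits and frequencies are those of the example, and each class is assumed to be represented by its mid value.

```python
from math import sqrt

# Class lower limits, class width and frequencies from the example.
lower_bounds = [50, 60, 70, 80, 90, 100, 110, 120, 130]
freqs = [5, 8, 10, 18, 25, 21, 19, 10, 4]
width = 10

# Mid value X of each class.
mids = [lb + width / 2 for lb in lower_bounds]

n = sum(freqs)                                          # N
sum_fx = sum(f * x for f, x in zip(freqs, mids))        # sum of fX
sum_fx2 = sum(f * x * x for f, x in zip(freqs, mids))   # sum of fX^2

mean = sum_fx / n
variance = sum_fx2 / n - mean ** 2   # population variance of grouped data
sd = sqrt(variance)

print(n, sum_fx, sum_fx2)
print(round(mean, 2), round(variance, 2), round(sd, 2))
```

Squaring the unrounded mean matters here: using the rounded value 96.92 in the variance formula shifts the result noticeably in the first decimal place.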


$$First\;Quartile=Q_1=L+\frac{\frac{N}{4}-c.f.}{f}\times\;h=80+(30-23)\times\;\frac{10}{18}=83.89$$

$$Second\;Quartile=Q_2=M_d=L+\frac{\frac{N}{2}-c.f.}{f}\times\;h=90+(60-41)\times\;\frac{10}{25}=97.60$$

$$Third\;Quartile=Q_3=L+\frac{\frac{3N}{4}-c.f.}{f}\times\;h=110+(90-87)\times\;\frac{10}{19}=111.58$$
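The quartile formula interpolates linearly inside the class that contains the required cumulative frequency. A minimal sketch, assuming equal class widths and using a hypothetical helper named `grouped_quantile`:

```python
def grouped_quantile(lower_bounds, freqs, width, p):
    """Quantile of grouped data: L + (p*N - c.f.) / f * h.

    c.f. is the cumulative frequency of the classes before the one
    that contains the p*N-th observation.
    """
    if not 0 < p <= 1:
        raise ValueError("p must satisfy 0 < p <= 1")
    target = p * sum(freqs)
    cum_before = 0
    for lower, f in zip(lower_bounds, freqs):
        if cum_before + f >= target:
            return lower + (target - cum_before) / f * width
        cum_before += f
    raise ValueError("no class contains the requested quantile")


# Class lower limits and frequencies from the example.
lower_bounds = [50, 60, 70, 80, 90, 100, 110, 120, 130]
freqs = [5, 8, 10, 18, 25, 21, 19, 10, 4]

q1 = grouped_quantile(lower_bounds, freqs, 10, 0.25)
q2 = grouped_quantile(lower_bounds, freqs, 10, 0.50)
q3 = grouped_quantile(lower_bounds, freqs, 10, 0.75)
print(round(q1, 2), round(q2, 2), round(q3, 2))
```

For the third quartile the 90th observation falls in the class 110-120, because the cumulative frequency up to 110 is only 87.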

Also,

$$Mode=M_0=L+\frac{f_1-f_0}{2f_1-f_0-f_2}\times\;h=90+\frac{10\times(25-18)}{2\times\;25-18-21}=96.36$$
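The mode formula can be sketched in the same way. The helper `grouped_mode` is hypothetical and assumes equal class widths and a single class with the highest frequency (here 90-100, with \(f_1=25\), \(f_0=18\) and \(f_2=21\)).

```python
def grouped_mode(lower_bounds, freqs, width):
    """Mode of grouped data: L + (f1 - f0) / (2*f1 - f0 - f2) * h."""
    # Index of the modal class (first class with the highest frequency).
    i = max(range(len(freqs)), key=freqs.__getitem__)
    f1 = freqs[i]
    f0 = freqs[i - 1] if i > 0 else 0               # class before the modal class
    f2 = freqs[i + 1] if i < len(freqs) - 1 else 0  # class after the modal class
    denominator = 2 * f1 - f0 - f2
    if denominator == 0:
        raise ValueError("mode is not defined by this formula for these data")
    return lower_bounds[i] + (f1 - f0) / denominator * width


# Class lower limits and frequencies from the example.
lower_bounds = [50, 60, 70, 80, 90, 100, 110, 120, 130]
freqs = [5, 8, 10, 18, 25, 21, 19, 10, 4]

mode = grouped_mode(lower_bounds, freqs, 10)
print(round(mode, 2))
```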

Hence,

\(S_k\) (based on median) = Mean − Median = 96.92 − 97.60 = −0.68

\(S_k\) (based on mode) = Mean − Mode = 96.92 − 96.36 = 0.56

\(S_k\) (based on quartiles) = \((Q_3-Q_2)-(Q_2-Q_1)\) = (111.58 − 97.60) − (97.60 − 83.89) = 0.27

$$S_k(Karl\;Pearson)=\frac{Mean-Mode}{\sigma}=\frac{96.92-96.36}{19.51}=0.029$$

$$S_k(Bowley)=\frac{Q_3+Q_1-2M_d}{Q_3-Q_1}=\frac{(111.58+83.89)-2\times\;97.60}{111.58-83.89}=0.0098$$
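As a rough arithmetic check, the five measures can be recomputed from the rounded summary values of the example: mean 96.92, median 97.60, mode 96.36, quartiles 83.89 and 111.58, and a standard deviation of about 19.51. The dictionary keys are illustrative labels only.

```python
# Rounded summary values of the IQ example.
mean, median, mode = 96.92, 97.60, 96.36
q1, q3 = 83.89, 111.58
sd = 19.51

results = {
    "mean - median": mean - median,
    "mean - mode": mean - mode,
    "quartile (absolute)": q3 + q1 - 2 * median,
    "Karl Pearson": (mean - mode) / sd,
    "Bowley": (q3 + q1 - 2 * median) / (q3 - q1),
}

for name, value in results.items():
    print(f"{name}: {value:.4f}")
```

This gives roughly −0.68, 0.56, 0.27, 0.029 and 0.0098. All the values are small and their signs do not agree, which suggests that this distribution is very close to symmetrical.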

References: Basic Mathematics Grade XII; A Foundation of Mathematics Volume II; Wikipedia.



