statistics - Entropy and Information Gain


Hopefully a simple question.

If I have a data set like this:

  Classification  attribute-1  attribute-2
  correct         dog          dog
  correct         dog          dog
  wrong           dog          cat
  correct         cat          cat
  wrong           cat          dog
  wrong           cat          dog

Then what is the information gain of attribute-2 relative to attribute-1?

I have calculated the entropy of the whole data set: -(3/6) log2(3/6) - (3/6) log2(3/6) = 1
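A quick check of that number in Python (a minimal sketch, just evaluating the formula above with math.log2):

  from math import log2

  # 3 "correct" and 3 "wrong" out of 6 rows: a uniform class split
  print(-(3/6) * log2(3/6) - (3/6) * log2(3/6))  # 1.0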

Then I'm stuck! I think I also need to calculate the entropy for attribute-1 and attribute-2. Do these three calculations then go into the information-gain computation?

Any help would be great,

thanks :).

Well, first you have to calculate the entropy for each attribute; after that you calculate the information gain. Give me a moment and I will show how it should be done.
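For reference, the general definition: for a data set S and an attribute A, where S_v is the subset of rows on which A takes the value v,

  gain(A) = info(S) - sum over all values v of A of (|S_v| / |S|) * info(S_v)

and info(.) is the entropy of the class distribution in the given set.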

For attribute-1:

  attr-1 = dog: info([2c, 1w]) = entropy(2/3, 1/3)
  attr-1 = cat: info([1c, 2w]) = entropy(1/3, 2/3)
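Numerically, both splits give the same value:

  entropy(2/3, 1/3) = -(2/3) log2(2/3) - (1/3) log2(1/3) ≈ 0.918 bits
  entropy(1/3, 2/3) ≈ 0.918 bits (the same distribution, just reordered)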

The expected information for attribute-1 (the weighted average of the two values above):

  info([2c, 1w], [1c, 2w]) = (3/6) * info([2c, 1w]) + (3/6) * info([1c, 2w])
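Numerically:

  info([2c, 1w], [1c, 2w]) = (3/6) * 0.918 + (3/6) * 0.918 ≈ 0.918 bits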

The gain for attribute-1:

  gain(attr-1) = info([3c, 3w]) - info([2c, 1w], [1c, 2w])
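Plugging in the numbers:

  gain(attr-1) = 1 - 0.918 ≈ 0.082 bits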

And then you do the same for the other attribute, attribute-2, and compare the two gains; see the sketch below.
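Putting it all together, here is a minimal Python sketch; the entropy and info_gain helper names are my own, and the data list encodes the six rows from the question:

  from collections import Counter
  from math import log2

  def entropy(labels):
      # Shannon entropy (in bits) of a list of class labels
      total = len(labels)
      return -sum((n / total) * log2(n / total)
                  for n in Counter(labels).values())

  def info_gain(rows, attr):
      # Information gain from splitting rows on column index attr;
      # the class label is assumed to be in column 0
      base = entropy([r[0] for r in rows])
      remainder = 0.0
      for value in set(r[attr] for r in rows):
          subset = [r[0] for r in rows if r[attr] == value]
          remainder += (len(subset) / len(rows)) * entropy(subset)
      return base - remainder

  # (classification, attribute-1, attribute-2), one tuple per row
  data = [
      ("correct", "dog", "dog"),
      ("correct", "dog", "dog"),
      ("wrong",   "dog", "cat"),
      ("correct", "cat", "cat"),
      ("wrong",   "cat", "dog"),
      ("wrong",   "cat", "dog"),
  ]

  print(info_gain(data, 1))  # attribute-1: ~0.082 bits
  print(info_gain(data, 2))  # attribute-2: 0.0 bits

Running it gives a gain of about 0.082 bits for attribute-1 and exactly 0 for attribute-2: splitting on attribute-2 leaves the class distribution just as mixed as before.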
