Only questions (c) and (d) have to be answered with Python code.
1. Decision trees

As part of this question you will implement and compare the Information Gain, Gini Index, and CART evaluation measures for splits in decision tree construction.

Let D = (X, y), |D| = n, be a dataset with n samples. The entropy of the dataset is defined as

    H(D) = - \sum_{i=1}^{k} P(c_i | D) \log_2 P(c_i | D),

where P(c_i | D) is the fraction of samples in class c_i. A split on an attribute, of the form X_j \le c, partitions the dataset into two subsets D_Y and D_N based on whether samples satisfy the split predicate or not, respectively. The split entropy is the weighted average entropy of the resulting datasets D_Y and D_N:

    H(D_Y, D_N) = (n_Y / n) H(D_Y) + (n_N / n) H(D_N),

where n_Y is the number of samples in D_Y and n_N is the number of samples in D_N. The Information Gain (IG) of a split is defined as the difference between the entropy and the split entropy:

    IG = H(D) - H(D_Y, D_N).

The higher the information gain, the better.

The Gini index of a dataset is defined as

    G(D) = 1 - \sum_{i=1}^{k} P(c_i | D)^2,

and the Gini index of a split is defined as the weighted average of the Gini indices of the resulting partitions:

    G(D_Y, D_N) = (n_Y / n) G(D_Y) + (n_N / n) G(D_N).

The lower the Gini index, the better.

Finally, the CART measure of a split is defined as

    CART(D_Y, D_N) = 2 (n_Y / n)(n_N / n) \sum_{i=1}^{k} | P(c_i | D_Y) - P(c_i | D_N) |.

The higher the CART measure, the better.

You will need to fill in the implementation of the three measures in the provided Python code as part of the homework. Note: you are not allowed to use existing implementations of the measures.

The homework includes two data files, train.txt and test.txt. The first consists of 100 observations to use to train your classifiers; the second has 10 to test. Each file is comma-separated, and each row contains 11 values: the first 10 are attributes (a mix of numeric and categorical attributes translated to numeric, e.g. [T, F] -> [0, 1]), and the final value is the true class of that observation. You will need to separate attributes and class in your load(filename) function.
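As a starting point, the three measures above can be sketched directly from their definitions. This is a minimal illustration, not the provided homework skeleton: the function names (entropy, info_gain, gini, gini_split, cart) and the convention that labels arrive as NumPy arrays are assumptions for the sketch, and each function computes its formula from class frequencies rather than calling any library implementation.

```python
import numpy as np

def class_probs(y):
    """P(c_i|D): fraction of samples in each class present in y."""
    _, counts = np.unique(y, return_counts=True)
    return counts / len(y)

def entropy(y):
    """H(D) = -sum_i P(c_i|D) log2 P(c_i|D)."""
    p = class_probs(y)
    return -np.sum(p * np.log2(p))

def info_gain(y, y_yes, y_no):
    """IG = H(D) - weighted split entropy; higher is better."""
    n = len(y)
    split_h = (len(y_yes) / n) * entropy(y_yes) + (len(y_no) / n) * entropy(y_no)
    return entropy(y) - split_h

def gini(y):
    """G(D) = 1 - sum_i P(c_i|D)^2."""
    p = class_probs(y)
    return 1.0 - np.sum(p ** 2)

def gini_split(y_yes, y_no):
    """Weighted average Gini of the two partitions; lower is better."""
    n = len(y_yes) + len(y_no)
    return (len(y_yes) / n) * gini(y_yes) + (len(y_no) / n) * gini(y_no)

def cart(y_yes, y_no, classes):
    """CART = 2 (nY/n)(nN/n) sum_i |P(c_i|DY) - P(c_i|DN)|; higher is better."""
    n = len(y_yes) + len(y_no)
    diff = sum(abs(np.mean(y_yes == c) - np.mean(y_no == c)) for c in classes)
    return 2 * (len(y_yes) / n) * (len(y_no) / n) * diff
```

For a quick sanity check: a balanced binary dataset has entropy 1.0 and Gini 0.5, and a split that perfectly separates the two classes yields information gain 1.0, split Gini 0.0, and CART 1.0.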