In this post, you will learn how to do One-Hot Encoding in pandas using the pd.get_dummies() method.

One-Hot Encoding is a method of converting categorical data to numeric data in which we create a new numeric column for every unique value in the categorical column. When the quality of wine is bad, the bad column gets a value of 1 and all the other columns get a value of 0; when the quality is medium, the medium column gets a value of 1 and all the other columns get a value of 0. Exactly one column is "hot" in each row, which is why it is called One-Hot Encoding.

To do One-Hot Encoding in pandas, we use the pd.get_dummies() method.

Now we can train a logistic regression model on this data.

    from sklearn.metrics import accuracy_score
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    # split the data into training and test set
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

    Logreg = LogisticRegression(max_iter=1000, random_state=42)

If you are going to do One-Hot Encoding for a model, it is better to use the scikit-learn OneHotEncoder instead of pandas get_dummies. Sometimes the training set contains all the unique values in a column but the test set contains only some of them. In that case get_dummies produces fewer columns for the test set than for the training set, whereas OneHotEncoder remembers the categories it saw during fitting and produces the same columns for both.
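
The difference between the two encoders on a test set with fewer unique values can be sketched as follows. This is a minimal illustration, not the post's original code: the `train` and `test` frames, the `quality` column and the value `good` are made up for the example, and `handle_unknown="ignore"` is an optional setting chosen here so that categories unseen during fitting would not raise an error.

```python
import pandas as pd
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: the training set has three quality levels,
# the test set happens to contain only two of them.
train = pd.DataFrame({"quality": ["bad", "medium", "good", "bad"]})
test = pd.DataFrame({"quality": ["bad", "good"]})

# pd.get_dummies builds columns from whatever values it sees,
# so the two frames end up with different column sets.
train_dummies = pd.get_dummies(train["quality"])
test_dummies = pd.get_dummies(test["quality"])
print(list(train_dummies.columns))  # ['bad', 'good', 'medium']
print(list(test_dummies.columns))   # ['bad', 'good']

# OneHotEncoder learns the categories once from the training set
# and reuses them for any data transformed later.
encoder = OneHotEncoder(handle_unknown="ignore")
encoder.fit(train[["quality"]])
train_encoded = encoder.transform(train[["quality"]]).toarray()
test_encoded = encoder.transform(test[["quality"]]).toarray()
print(train_encoded.shape)  # (4, 3)
print(test_encoded.shape)   # (2, 3)
```

With get_dummies the test set is missing the medium column, so a model trained on three columns could not be applied to it directly. With OneHotEncoder both arrays have three columns, and the medium column is simply all zeros for the test rows.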