Transformer In Sklearn
Scikit-learn provides a type of object called a transformer. A transformer performs data preprocessing and feature transformation, whereas model training is handled by separate objects called models, such as linear regression or classifiers. Examples of transformers include StandardScaler, which rescales a feature so that it has mean = 0 and standard deviation = 1, as well as PCA, Imputer (SimpleImputer in current scikit-learn versions), and MinMaxScaler. All of these techniques apply some preprocessing to the input data that changes its format, and the transformed data is then used for model training.
Suppose we have features f1, f2, f3 and f4, where f1, f2 and f3 are independent features and f4 is the dependent feature. When we standardize, we take a feature F and convert it into F’ by applying the standardization formula. Notice that at this stage we take one input feature F and convert it into another input feature F’. For this, a transformer offers three different operations:
1. fit()
2. transform()
3. fit_transform()
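To make the standardization step from F to F’ concrete, here is a minimal sketch that applies the usual formula F’ = (F - mean) / std by hand with NumPy. The feature values are made up for illustration, and the population standard deviation (NumPy's default) is used because that is what StandardScaler computes.

```python
import numpy as np

# Hypothetical values for a single feature F (made up for illustration).
F = np.array([10.0, 20.0, 30.0, 40.0, 50.0])

mean = F.mean()  # average of the feature values
std = F.std()    # population standard deviation (ddof=0)

# Standardization: subtract the mean, then divide by the standard deviation.
F_prime = (F - mean) / std

print(F_prime)
print(F_prime.mean(), F_prime.std())  # approximately 0 and 1
```

The two calculations on F (mean and std) correspond to what fit() does, and the last line of arithmetic corresponds to what transform() does.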
fit() :
The fit() method applies the required formula to the feature values of the input data and stores the results of that calculation in the transformer. For StandardScaler, for example, this means computing the mean and standard deviation of each feature. fit() does not change the data itself. To apply it, call .fit() on the transformer object.
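As a rough illustration of fit(), the sketch below fits a StandardScaler on a small made-up feature matrix and prints what the transformer has stored. The array X and its values are assumptions for the example; mean_ and scale_ are the attributes StandardScaler sets after fitting.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: 4 samples, 3 independent features (f1, f2, f3).
X = np.array([
    [1.0, 10.0, 100.0],
    [2.0, 20.0, 200.0],
    [3.0, 30.0, 300.0],
    [4.0, 40.0, 400.0],
])

scaler = StandardScaler()
scaler.fit(X)  # computes and stores per-feature statistics; X is unchanged

print(scaler.mean_)   # mean of each column
print(scaler.scale_)  # standard deviation of each column
```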
transform() :
To actually change the data we use transform(). The transform() method applies the values calculated in fit() to every data point in feature F. Call .transform() on a transformer that has already been fitted, because it relies on the fitted calculations.
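The following sketch shows transform() on an already fitted StandardScaler. The arrays X_train and X_new are hypothetical; the point is that transform() reuses the statistics stored during fit() rather than recomputing them, so new data is scaled with the same mean and standard deviation.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical single-feature data (scikit-learn expects 2D input).
X_train = np.array([[1.0], [2.0], [3.0], [4.0], [5.0]])
X_new = np.array([[3.0], [6.0]])

# fit() stores the mean (3.0) and standard deviation (sqrt(2)) of X_train.
scaler = StandardScaler().fit(X_train)

# transform() applies (x - mean) / std using the stored values.
X_train_scaled = scaler.transform(X_train)
X_new_scaled = scaler.transform(X_new)

print(X_train_scaled.ravel())
print(X_new_scaled.ravel())
```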
fit_transform():
The fit_transform() method combines the fit and transform methods; it is equivalent to fit().transform(). It fits the transformer on the input data and transforms the data points in a single call. When we need both steps on the same data, calling fit() and transform() separately gives the same result but is more verbose and, for some transformers, less efficient, so we use fit_transform() to do both at once.
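A small sketch, using made-up data, that checks the equivalence between fit().transform() and fit_transform() for StandardScaler:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical data: 3 samples, 2 features.
X = np.array([[1.0, 50.0], [2.0, 60.0], [3.0, 70.0]])

# Two steps: fit, then transform the same data.
two_step = StandardScaler().fit(X).transform(X)

# One step: fit and transform in a single call.
one_step = StandardScaler().fit_transform(X)

print(np.allclose(two_step, one_step))  # True
```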