Feature scaling is a method used to normalize the range of independent variables or features of data.
In data processing, it is also known as data normalization and is generally performed during the data preprocessing step.
If one of the features has a broad range of values, the distance will be governed by this particular feature.
Therefore, the range of all features should be normalized so that each feature contributes approximately proportionately to the final distance.
Another reason why feature scaling is applied is that gradient descent converges much faster with feature scaling than without it.