What is Underfitting in Machine Learning?


Post by quantumadmin »

Underfitting in machine learning occurs when a model is too simple to capture the underlying patterns in the data. This results in poor performance on both the training and test datasets because the model fails to learn the complexities of the data. Essentially, underfitting means the model has not learned enough from the training data to make accurate predictions.

Characteristics of Underfitting
  • High Training Error: The model performs poorly on the training dataset, indicating that it cannot capture the patterns in the data.
  • High Validation/Test Error: The model also performs poorly on the validation or test dataset, showing that it has not generalized to new, unseen data (see the short sketch after this list).
  • Simple Model: Underfitting typically occurs with models that are too simple for the task, such as linear models applied to non-linear data or shallow neural networks applied to complex problems.
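
Here is a minimal sketch of how those two symptoms look in practice, assuming scikit-learn is available; the synthetic data and the depth-1 tree are illustrative placeholders:

Code: Select all

import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Illustrative non-linear data
rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.1, 200)

X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

# A depth-1 tree (a "stump") is far too simple for sine-shaped data
stump = DecisionTreeRegressor(max_depth=1).fit(X_train, y_train)

train_mse = mean_squared_error(y_train, stump.predict(X_train))
val_mse = mean_squared_error(y_val, stump.predict(X_val))

# Both errors are high and close together -- the signature of underfitting
print(f"Train MSE: {train_mse:.3f}, Validation MSE: {val_mse:.3f}")

The key diagnostic is that the two errors are both high and roughly equal; overfitting, by contrast, shows a low training error and a much higher validation error.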
Causes of Underfitting

1. Too Simple a Model: Using a model that lacks the complexity or capacity to learn the data. Examples include:
  • Linear regression for a problem that requires polynomial regression.
  • Decision trees with too few splits.
  • Neural networks with too few layers or neurons.
2. Insufficient Training Time: Not training the model long enough for it to learn the patterns in the data.
3. Inadequate Features: Using too few or irrelevant features that do not provide enough information for the model to learn effectively.
4. High Regularization: Applying too much regularization penalizes the model's parameters to the point where it cannot fit even the training data properly (illustrated in the sketch after this list).
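
As a rough illustration of cause 4 (a minimal sketch, assuming scikit-learn's Ridge; the synthetic data and the alpha values are arbitrary):

Code: Select all

import numpy as np
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error

rng = np.random.RandomState(0)
X = rng.uniform(-3, 3, size=(100, 1))
y = 2.0 * X.ravel() + rng.normal(0, 0.3, 100)

# Moderate vs. extreme L2 regularization on the same linear data
for alpha in (1.0, 1e6):
    model = Ridge(alpha=alpha).fit(X, y)
    mse = mean_squared_error(y, model.predict(X))
    print(f"alpha={alpha:g}: coef={model.coef_[0]:.4f}, train MSE={mse:.3f}")

# With alpha=1e6 the coefficient is squashed toward zero, so even the
# training error becomes large -- regularization-induced underfitting.

With alpha=1.0 the fitted slope stays close to the true value of 2, while alpha=1e6 shrinks it to nearly zero and the model underfits data it could otherwise fit perfectly.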

Mitigation Strategies

1. Increase Model Complexity: Use a more complex model that has the capacity to learn the underlying patterns in the data. For example:
  • Use polynomial regression instead of linear regression if the relationship is non-linear.
  • Increase the depth of decision trees.
  • Use deeper neural networks with more layers and neurons.
2. Feature Engineering: Add more relevant features that provide useful information to the model. This can involve:
  • Creating new features based on domain knowledge.
  • Using techniques such as polynomial features, interaction terms, and transformations.
3. Reduce Regularization: Decrease the regularization parameters to allow the model more flexibility to learn the data.
4. Train Longer: Allow the model to train for more epochs or iterations, ensuring it has enough time to learn from the data.
5. Hyperparameter Tuning: Adjust the model's hyperparameters to find a better balance between bias and variance (see the sketch after this list).
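
Strategies 1, 3, and 5 can be combined in a single cross-validated search. Here is a sketch, assuming scikit-learn; the degree and alpha grids are arbitrary choices for illustration, not recommendations:

Code: Select all

import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV

# Illustrative non-linear data, same flavor as the worked example below
rng = np.random.RandomState(0)
X = rng.uniform(0, 10, size=(150, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.1, 150)

# Pipeline whose complexity (degree) and regularization (alpha) we can tune
pipe = Pipeline([("poly", PolynomialFeatures()), ("ridge", Ridge())])

param_grid = {
    "poly__degree": [1, 3, 5, 7],          # degree 1 = plain linear model
    "ridge__alpha": [1e-3, 1e-1, 1.0, 10.0],
}
search = GridSearchCV(pipe, param_grid, scoring="neg_mean_squared_error", cv=5)
search.fit(X, y)

print("Best parameters:", search.best_params_)
print(f"Best CV MSE: {-search.best_score_:.3f}")

Degree 1 in the grid corresponds to an ordinary linear model, so if the search settles on a higher degree with a small alpha, that is direct evidence the linear model was underfitting this data.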

Example of Underfitting and Mitigation in Python
Here's an example of a linear regression model underfitting non-linear data, followed by a polynomial model that mitigates the problem:

Code: Select all

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Generate synthetic data
np.random.seed(0)
X = np.sort(np.random.rand(100, 1) * 10, axis=0)
y = np.sin(X).ravel() + np.random.normal(0, 0.1, X.shape[0])

# Split data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

# Create a linear regression model (which will underfit the non-linear data)
linear_model = LinearRegression()
linear_model.fit(X_train, y_train)

# Predict and evaluate
y_train_pred = linear_model.predict(X_train)
y_test_pred = linear_model.predict(X_test)

print(f"Train MSE (Linear Model): {mean_squared_error(y_train, y_train_pred):.3f}")
print(f"Test MSE (Linear Model): {mean_squared_error(y_test, y_test_pred):.3f}")

# Plotting the results for the linear model
plt.scatter(X, y, color='black', label='Data')
plt.plot(X, linear_model.predict(X), color='blue', label='Linear Model')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.title('Underfitting Example')
plt.show()

# Now, create a polynomial regression model to mitigate underfitting
degree = 5  # Using a polynomial degree to better fit the data
poly_model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
poly_model.fit(X_train, y_train)

# Predict and evaluate
y_train_pred_poly = poly_model.predict(X_train)
y_test_pred_poly = poly_model.predict(X_test)

print(f"Train MSE (Polynomial Model): {mean_squared_error(y_train, y_train_pred_poly):.3f}")
print(f"Test MSE (Polynomial Model): {mean_squared_error(y_test, y_test_pred_poly):.3f}")

# Plotting the results for the polynomial model
plt.scatter(X, y, color='black', label='Data')
plt.plot(X, poly_model.predict(X), color='red', label=f'Polynomial Model (degree {degree})')
plt.xlabel('X')
plt.ylabel('y')
plt.legend()
plt.title('Mitigating Underfitting with Polynomial Regression')
plt.show()
Explanation
1. Data Generation: We generate synthetic data from a sine function and add Gaussian noise.
2. Linear Model Training: We train a linear regression model, which is too simple for this non-linear data, leading to underfitting.
3. Evaluation: We calculate and print the mean squared error (MSE) on both the training and test sets; both are high, the hallmark of underfitting.
4. Visualization: We plot the original data and the linear model's predictions to inspect the underfitting visually.
5. Polynomial Model Training: We then train a polynomial regression model of degree 5, which is better suited to the non-linear data.
6. Evaluation and Visualization: We evaluate the polynomial model, print its MSE, and plot its predictions, showing improved performance and reduced underfitting.
By using a more complex model (polynomial regression in this case), we can better capture the underlying patterns in the data and reduce underfitting.