From 4426d9dca5e3049ef27084be60a9ab4660a35185 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Fri, 24 Apr 2026 10:59:19 +0000
Subject: [PATCH] Fix polynomial regression plot zigzag by using np.linspace
for smooth curve
Agent-Logs-Url: https://github.com/microsoft/ML-For-Beginners/sessions/ec6b5ce7-6500-4c18-8c79-bc21779ac649
Co-authored-by: leestott <2511341+leestott@users.noreply.github.com>
---
2-Regression/3-Linear/README.md | 24 ++++++++++++++++++-
2-Regression/3-Linear/solution/notebook.ipynb | 7 ++++--
2 files changed, 28 insertions(+), 3 deletions(-)
diff --git a/2-Regression/3-Linear/README.md b/2-Regression/3-Linear/README.md
index 8978b79ee..70e55dd29 100644
--- a/2-Regression/3-Linear/README.md
+++ b/2-Regression/3-Linear/README.md
@@ -258,7 +258,29 @@ pipeline.fit(X_train,y_train)
Using `PolynomialFeatures(2)` means that we will include all second-degree polynomials from the input data. In our case it will just mean `DayOfYear`<sup>2</sup>, but given two input variables X and Y, this will add X<sup>2</sup>, XY and Y<sup>2</sup>. We may also use higher degree polynomials if we want.
-Pipelines can be used in the same manner as the original `LinearRegression` object, i.e. we can `fit` the pipeline, and then use `predict` to get the prediction results. Here is the graph showing test data, and the approximation curve:
+Pipelines can be used in the same manner as the original `LinearRegression` object, i.e. we can `fit` the pipeline, and then use `predict` to get the prediction results:
+
+```python
+pred = pipeline.predict(X_test)
+
+mse = np.sqrt(mean_squared_error(y_test,pred))
+print(f'Mean error: {mse:3.3} ({mse/np.mean(pred)*100:3.3}%)')
+
+score = pipeline.score(X_train,y_train)
+print('Model determination: ', score)
+```
+
+To plot the smooth approximation curve, we use `np.linspace` to create a uniform range of input values, rather than plotting directly on the unordered test data (which would produce a zigzag line):
+
+```python
+X_range = np.linspace(X_test.min(), X_test.max(), 100).reshape(-1,1)
+y_range = pipeline.predict(X_range)
+
+plt.scatter(X_test, y_test)
+plt.plot(X_range, y_range)
+```
+
+Here is the graph showing test data, and the approximation curve:
diff --git a/2-Regression/3-Linear/solution/notebook.ipynb b/2-Regression/3-Linear/solution/notebook.ipynb
index 23943df4f..9953bf948 100644
--- a/2-Regression/3-Linear/solution/notebook.ipynb
+++ b/2-Regression/3-Linear/solution/notebook.ipynb
@@ -781,8 +781,11 @@
"score = pipeline.score(X_train,y_train)\n",
"print('Model determination: ', score)\n",
"\n",
- "plt.scatter(X_test,y_test)\n",
- "plt.plot(sorted(X_test),pipeline.predict(sorted(X_test)))"
+ "X_range = np.linspace(X_test.min(), X_test.max(), 100).reshape(-1,1)\n",
+ "y_range = pipeline.predict(X_range)\n",
+ "\n",
+ "plt.scatter(X_test, y_test)\n",
+ "plt.plot(X_range, y_range)"
]
},
{
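---

The patch's fix rests on two ideas: `PolynomialFeatures(2)` expands each input into all second-degree terms, and `np.linspace` supplies a sorted, evenly spaced grid so `plt.plot` draws one smooth curve instead of zigzagging through unordered test points. A minimal sketch of both, outside the patch, using synthetic stand-in data (the names `X_test`/`y_test` here are illustrative, not the lesson's pumpkin dataset):

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression

# PolynomialFeatures(2) on two inputs X, Y yields [1, X, Y, X^2, XY, Y^2].
expanded = PolynomialFeatures(2).fit_transform([[3.0, 5.0]])
# expanded -> [[1., 3., 5., 9., 15., 25.]]

# Synthetic, unordered "test" data with a noisy quadratic trend,
# standing in for the lesson's DayOfYear/price split.
rng = np.random.default_rng(0)
X_test = rng.integers(1, 365, size=40).reshape(-1, 1)
y_test = 0.001 * (X_test.ravel() - 180.0) ** 2 + rng.normal(0, 2, 40)

pipeline = make_pipeline(PolynomialFeatures(2), LinearRegression())
pipeline.fit(X_test, y_test)

# Plotting pipeline.predict(X_test) against the unordered X_test would
# connect points in random x-order (the zigzag); a uniform grid from
# np.linspace gives strictly increasing x-values and a smooth curve.
X_range = np.linspace(X_test.min(), X_test.max(), 100).reshape(-1, 1)
y_range = pipeline.predict(X_range)
```

The earlier approach in the notebook, `plt.plot(sorted(X_test), pipeline.predict(sorted(X_test)))`, sorts the x-values but still only evaluates the model at the (possibly sparse, duplicated) test points; the `np.linspace` grid decouples the plotted curve from the test split entirely.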