From 4426d9dca5e3049ef27084be60a9ab4660a35185 Mon Sep 17 00:00:00 2001
From: "copilot-swe-agent[bot]" <198982749+Copilot@users.noreply.github.com>
Date: Fri, 24 Apr 2026 10:59:19 +0000
Subject: [PATCH] Fix polynomial regression plot zigzag by using np.linspace
 for smooth curve

Agent-Logs-Url: https://github.com/microsoft/ML-For-Beginners/sessions/ec6b5ce7-6500-4c18-8c79-bc21779ac649
Co-authored-by: leestott <2511341+leestott@users.noreply.github.com>
---
 2-Regression/3-Linear/README.md               | 24 ++++++++++++++++++-
 2-Regression/3-Linear/solution/notebook.ipynb |  7 ++++--
 2 files changed, 28 insertions(+), 3 deletions(-)

diff --git a/2-Regression/3-Linear/README.md b/2-Regression/3-Linear/README.md
index 8978b79ee..70e55dd29 100644
--- a/2-Regression/3-Linear/README.md
+++ b/2-Regression/3-Linear/README.md
@@ -258,7 +258,29 @@ pipeline.fit(X_train,y_train)
 
 Using `PolynomialFeatures(2)` means that we will include all second-degree polynomials from the input data. In our case it will just mean `DayOfYear`², but given two input variables X and Y, this will add X², XY and Y². We may also use higher degree polynomials if we want.
 
-Pipelines can be used in the same manner as the original `LinearRegression` object, i.e. we can `fit` the pipeline, and then use `predict` to get the prediction results. Here is the graph showing test data, and the approximation curve:
+Pipelines can be used in the same manner as the original `LinearRegression` object, i.e. we can `fit` the pipeline, and then use `predict` to get the prediction results:
+
+```python
+pred = pipeline.predict(X_test)
+
+mse = np.sqrt(mean_squared_error(y_test,pred))
+print(f'Mean error: {mse:3.3} ({mse/np.mean(pred)*100:3.3}%)')
+
+score = pipeline.score(X_train,y_train)
+print('Model determination: ', score)
+```
+
+To plot the smooth approximation curve, we use `np.linspace` to create a uniform range of input values, rather than plotting directly on the unordered test data (which would produce a zigzag line):
+
+```python
+X_range = np.linspace(X_test.min(), X_test.max(), 100).reshape(-1,1)
+y_range = pipeline.predict(X_range)
+
+plt.scatter(X_test, y_test)
+plt.plot(X_range, y_range)
+```
+
+Here is the graph showing test data, and the approximation curve:
 
 Polynomial regression
 
diff --git a/2-Regression/3-Linear/solution/notebook.ipynb b/2-Regression/3-Linear/solution/notebook.ipynb
index 23943df4f..9953bf948 100644
--- a/2-Regression/3-Linear/solution/notebook.ipynb
+++ b/2-Regression/3-Linear/solution/notebook.ipynb
@@ -781,8 +781,11 @@
     "score = pipeline.score(X_train,y_train)\n",
     "print('Model determination: ', score)\n",
     "\n",
-    "plt.scatter(X_test,y_test)\n",
-    "plt.plot(sorted(X_test),pipeline.predict(sorted(X_test)))"
+    "X_range = np.linspace(X_test.min(), X_test.max(), 100).reshape(-1,1)\n",
+    "y_range = pipeline.predict(X_range)\n",
+    "\n",
+    "plt.scatter(X_test, y_test)\n",
+    "plt.plot(X_range, y_range)"
    ]
   },
   {
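The fix in this patch can be exercised end-to-end in isolation. The sketch below substitutes synthetic data for the lesson's pumpkin dataset (the quadratic trend, noise level, and sample count are invented for the example; only `X_range`/`y_range` and the pipeline construction mirror the patched code) and checks the property that removes the zigzag: the `np.linspace` grid is sorted and evenly spaced, whereas raw test data is in arbitrary order.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the lesson's data: a noisy quadratic in "day of year".
rng = np.random.default_rng(0)
X_test = rng.uniform(1, 365, size=80).reshape(-1, 1)  # unordered, like real test data
y_test = 0.001 * (X_test.ravel() - 180) ** 2 + rng.normal(0, 1, size=80)

# Same degree-2 pipeline shape as in the lesson.
pipeline = make_pipeline(PolynomialFeatures(2), LinearRegression())
pipeline.fit(X_test, y_test)

# Plotting predictions against the unordered X_test joins points in sample
# order, producing a zigzag; a sorted, evenly spaced grid gives a smooth curve.
X_range = np.linspace(X_test.min(), X_test.max(), 100).reshape(-1, 1)
y_range = pipeline.predict(X_range)

assert np.all(np.diff(X_range.ravel()) > 0)  # strictly increasing: no zigzag
```

With `matplotlib` available, `plt.scatter(X_test, y_test)` followed by `plt.plot(X_range, y_range)` then renders the scatter with a smooth approximation curve, as in the patched notebook cell.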