Training a Simple Linear Regression Model on Azure

This page demonstrates how to train a simple linear regression model in Python on Azure Machine Learning using a free Azure account. It covers the training script, the scoring script used for deployment, and the setup steps in Azure ML Studio.

Overview

We use a tiny dataset (x, y pairs where y = 2x) stored in Azure Blob Storage, train a linear regression model with scikit-learn, register it in Azure ML, and deploy it as a web service. The workload is small enough to run on a free Azure account, although compute and storage are billed against that account's credit rather than coming from a separate free tier.

Dataset

Create a CSV file named simple_data.csv with the following content and upload it to Azure Blob Storage via Azure ML Studio:

x,y
1,2
2,4
3,6
4,8
5,10
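
If you would rather generate the file than type it by hand, a minimal sketch using only the Python standard library follows. The file name and the y = 2x rule come from this page; the five rows simply mirror the listing above.

```python
import csv

# Build x,y pairs where y = 2x, matching the listing above.
rows = [(x, 2 * x) for x in range(1, 6)]

# newline="" lets the csv module control line endings itself.
with open("simple_data.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["x", "y"])  # header row
    writer.writerows(rows)
```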

Training Script

This Python script loads the dataset from Azure storage, trains a linear regression model, and registers it. Run it in a Jupyter notebook in Azure ML Studio.

import pandas as pd
from azureml.core import Workspace, Dataset
from sklearn.linear_model import LinearRegression
import joblib
from azureml.core.model import Model

# Connect to Azure ML workspace
ws = Workspace.from_config()

# Load data from Azure Blob Storage
dataset = Dataset.get_by_name(ws, name='simple-dataset')
df = dataset.to_pandas_dataframe()
X = df[['x']].values
y = df['y'].values
print("Data loaded from Azure Blob Storage:", df.head())

# Train the model
model = LinearRegression()
model.fit(X, y)
print("Model trained. Coefficient:", model.coef_[0])

# Save and register the model
joblib.dump(model, 'linear_model.pkl')
registered_model = Model.register(workspace=ws,
                                 model_path='linear_model.pkl',
                                 model_name='simple-linear-model',
                                 description='Simple linear regression model')
print("Model registered:", registered_model.name)
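
Because the data lie exactly on the line y = 2x, the coefficient printed by the script should be very close to 2, and the intercept (which LinearRegression fits by default) close to 0. As a sanity check that needs neither Azure nor scikit-learn, the sketch below computes the ordinary least-squares slope and intercept by hand for the same five rows. The helper name fit_line is illustrative only.

```python
def fit_line(xs, ys):
    """Closed-form least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

xs = [1, 2, 3, 4, 5]
ys = [2, 4, 6, 8, 10]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)        # 2.0 0.0
print(slope * 6 + intercept)   # 12.0
```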

Scoring Script for Deployment

This script (score.py) is used when deploying the model as a real-time endpoint in Azure ML.

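import json
import os
import joblib

def init():
    # Called once when the service starts. Azure ML sets AZUREML_MODEL_DIR to
    # the folder holding the registered model; the file keeps its original name.
    global model
    model_dir = os.getenv('AZUREML_MODEL_DIR', '.')
    model = joblib.load(os.path.join(model_dir, 'linear_model.pkl'))

def run(data):
    # Called once per request with the raw JSON body, e.g. '{"data": 6}'.
    input_data = json.loads(data)['data']
    prediction = model.predict([[input_data]])
    # Return a plain float rather than a NumPy scalar so it serializes cleanly.
    return {'prediction': float(prediction[0])}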
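
Before deploying, the request/response contract of run() can be exercised locally. The sketch below mirrors run() and swaps in a hypothetical stand-in model (StubModel, which simply doubles its input), so no Azure resources or saved model file are needed. It is a test harness, not part of score.py.

```python
import json

class StubModel:
    """Hypothetical stand-in for the trained model: predicts y = 2x."""
    def predict(self, rows):
        return [2.0 * row[0] for row in rows]

model = StubModel()

def run(data):
    # Mirrors run() in score.py: parse the JSON body and predict one value.
    input_data = json.loads(data)['data']
    prediction = model.predict([[input_data]])
    return {'prediction': float(prediction[0])}

result = run(json.dumps({'data': 6}))
print(result)  # {'prediction': 12.0}
```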

Steps to Train and Deploy on Azure

Create Azure ML Workspace: In the Azure portal, create a Machine Learning workspace. A free Azure account is sufficient.
Upload Data: In Azure ML Studio (ml.azure.com), upload simple_data.csv to the workspace's default Blob Storage datastore and register it as a dataset named "simple-dataset".
Create Compute: Set up a compute instance (e.g., Standard_DS3_v2) in Azure ML Studio for running scripts.
Run Training Script: Create a notebook in Azure ML Studio, paste the training script, and execute it on the compute instance.
Register Model: No separate action is needed; the training script registers the model in Azure ML's model registry as "simple-linear-model".
Deploy Model: In Azure ML Studio, go to "Models", select "simple-linear-model", deploy it to a real-time endpoint on Azure Container Instances, and provide score.py as the entry script.
Use Model: Test the endpoint with a POST request whose JSON body is {"data": 6}, which should return a prediction close to 12, using a tool such as Postman or Python's requests library.
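
The Deploy Model step can also be scripted instead of clicked through. The following is a sketch only, assuming the v1 azureml-core SDK that the training script already imports, a score.py in the working directory, and a workspace config file. The service name, environment name, package list, and the 1 core / 1 GB sizing are placeholder choices, not values from this page.

```python
def deploy_to_aci(service_name="simple-linear-service"):
    """Deploy the registered model to Azure Container Instances (sketch)."""
    # Imports live inside the function so that merely defining it does not
    # require the Azure ML SDK to be installed.
    from azureml.core import Environment, Workspace
    from azureml.core.conda_dependencies import CondaDependencies
    from azureml.core.model import InferenceConfig, Model
    from azureml.core.webservice import AciWebservice

    ws = Workspace.from_config()
    model = Model(ws, name="simple-linear-model")  # latest registered version

    # Assumed package list: what score.py needs at inference time.
    env = Environment(name="simple-linear-env")
    env.python.conda_dependencies = CondaDependencies.create(
        pip_packages=["azureml-defaults", "scikit-learn", "joblib"]
    )

    inference_config = InferenceConfig(entry_script="score.py", environment=env)
    deployment_config = AciWebservice.deploy_configuration(cpu_cores=1, memory_gb=1)

    service = Model.deploy(ws, service_name, [model],
                           inference_config, deployment_config)
    service.wait_for_deployment(show_output=True)
    return service.scoring_uri
```

Calling deploy_to_aci() from the same notebook as the training script would return the scoring URI to use when testing the endpoint.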

Testing the Deployed Model

Use this Python code locally to test the deployed endpoint:

import json
import requests

# REST endpoint shown for the deployment in Azure ML Studio.
# It normally already ends in '/score', so it is used as-is.
scoring_uri = 'YOUR_ENDPOINT_URL'
headers = {'Content-Type': 'application/json'}
data = json.dumps({'data': 6})

response = requests.post(scoring_uri, data=data, headers=headers)
response.raise_for_status()  # Stop here if the service returned an HTTP error
print("Prediction:", response.json())