If you're a seasoned C# developer who's been curious about machine learning but intimidated by Python-heavy tutorials and complex mathematical concepts, this article is for you. Today, I'll walk you through building a complete house price prediction system using nothing but C# and ML.NET – no Python required, no need to learn new syntax, just pure C# goodness applied to machine learning.
By the end of this journey, you'll have:
As C# developers, we have a secret weapon in the machine learning world: ML.NET. While the data science community gravitates toward Python, ML.NET offers several compelling advantages:
Our goal is ambitious but practical: build a system that can predict house price appreciation over 3, 5, 10, and 20-year time horizons. This isn't just an academic exercise – it's the kind of problem real estate professionals, investors, and homeowners face daily.
Before diving into ML.NET, we needed training data. Real estate data is often:
Instead of struggling with messy real-world data, we took a pragmatic approach: generate synthetic but realistic training data that captures real market dynamics.
#r "nuget: Microsoft.Data.Analysis, 0.21.1"
#r "nuget: MathNet.Numerics, 5.0.0"
using Microsoft.Data.Analysis;
using MathNet.Numerics.Statistics;
// Non-linear price projection model
public class PriceProjectionModel
{
    private const double BaseAnnualAppreciationRate = 0.03; // 3% baseline

    public double ProjectPrice(double currentPrice, string city, string sizeCategory, int years)
    {
        // GetCityMultiplier and GetSizeMultiplier are helper methods not shown in this excerpt
        var cityMultiplier = GetCityMultiplier(city);
        var sizeMultiplier = GetSizeMultiplier(sizeCategory);
        var baseRate = BaseAnnualAppreciationRate * cityMultiplier * sizeMultiplier;

        // Non-linear growth: the effective annual rate ramps up toward baseRate
        // as the horizon lengthens, with diminishing increments
        var adjustedRate = baseRate * (1.0 - Math.Exp(-years / 10.0));
        return currentPrice * Math.Pow(1 + adjustedRate, years);
    }
}
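To get a feel for how the projection formula behaves across horizons, here is a minimal sketch written as a C# script. It assumes both multipliers are 1.0 (the real multiplier values are not shown in the article), and the starting price of 500,000 is invented for illustration.

```csharp
using System;

// Hypothetical standalone version of the projection formula. The city and size
// multipliers are folded into baseRate because their real values are not shown.
static double ProjectPrice(double currentPrice, double baseRate, int years)
{
    // The effective annual rate ramps from 0 toward baseRate as the horizon grows.
    var adjustedRate = baseRate * (1.0 - Math.Exp(-years / 10.0));
    return currentPrice * Math.Pow(1 + adjustedRate, years);
}

foreach (var years in new[] { 3, 5, 10, 20 })
{
    var adjustedRate = 0.03 * (1.0 - Math.Exp(-years / 10.0));
    var projected = ProjectPrice(500_000, 0.03, years);
    Console.WriteLine($"{years,2} years: effective rate {adjustedRate:P2}, projected price {projected:N0}");
}
```

Printing the effective rate next to the price makes the non-linearity visible: short horizons compound at well under the 3% baseline, and long horizons approach it.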
For C# developers, this data generation approach offers several advantages:
The key insight: we're not just generating random numbers – we're modeling real market behavior using mathematical principles that C# handles beautifully.
// Calculate appreciation rates with precision control
var appreciationRates = roundedCurrentPrices.Zip(historicalPrices, (current, historical) =>
    Math.Round(DataFrameHelpers.CalculateAppreciationRate(current, historical), 3)).ToArray();
// Add size categories
var sizeCategories = sizes.Select(DataFrameHelpers.GetSizeCategory).ToArray();
df["SizeCategory"] = new StringDataFrameColumn("SizeCategory", sizeCategories);
This DataFrame approach feels natural to C# developers because it's essentially strongly-typed data manipulation – like LINQ for structured data.
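The Zip-and-round step can be tried without a DataFrame at all. The sketch below is a guess at what a helper like DataFrameHelpers.CalculateAppreciationRate might compute (fractional change from historical to current price); the article does not show its body, and the prices are invented.

```csharp
using System;
using System.Linq;

// Hypothetical stand-in for DataFrameHelpers.CalculateAppreciationRate:
// fractional change from the historical price to the current price.
static double CalculateAppreciationRate(double current, double historical) =>
    historical == 0 ? 0 : (current - historical) / historical;

double[] currentPrices = { 750_000, 420_000 };     // invented values
double[] historicalPrices = { 600_000, 400_000 };  // invented values

// Pair each current price with its historical price, then round to 3 decimals.
var appreciationRates = currentPrices
    .Zip(historicalPrices, (current, historical) =>
        Math.Round(CalculateAppreciationRate(current, historical), 3))
    .ToArray();

Console.WriteLine(string.Join(" | ", appreciationRates));
```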
The beauty of ML.NET lies in its pipeline approach, which mirrors the builder pattern familiar to C# developers:
#r "nuget: Microsoft.ML, 3.0.1"
using Microsoft.ML;
using Microsoft.ML.Data;
public static class MLPipelineBuilder
{
    public static IEstimator<ITransformer> CreateHousePricePipeline(MLContext mlContext)
    {
        return mlContext.Transforms.Categorical.OneHotEncoding("CityEncoded", "City")
            .Append(mlContext.Transforms.Categorical.OneHotEncoding("SizeCategoryEncoded", "SizeCategory"))
            .Append(mlContext.Transforms.Concatenate("Features",
                "Size", "HistoricalPrice", "CurrentPrice", "AppreciationRate",
                "CityEncoded", "SizeCategoryEncoded"))
            // Cache the featurized data before the trainer so SDCA's repeated passes reuse it
            .AppendCacheCheckpoint(mlContext)
            .Append(mlContext.Regression.Trainers.Sdca(labelColumnName: "Label", featureColumnName: "Features"));
    }
}
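To see what the encoding and concatenation steps actually produce, the following sketch runs only the feature-building part of such a pipeline on a few in-memory rows and inspects the resulting Features column. The MiniHouse class and its rows are hypothetical and much smaller than the article's schema; the script assumes dotnet-script and the same Microsoft.ML package version used elsewhere in the article.

```csharp
#r "nuget: Microsoft.ML, 3.0.1"
using System;
using Microsoft.ML;
using Microsoft.ML.Data;

// Hypothetical miniature schema; the article's real HouseData has more columns.
public class MiniHouse
{
    public string City { get; set; }
    public float Size { get; set; }
}

var mlContext = new MLContext(seed: 0);

// Invented rows, purely for illustration.
var rows = new[]
{
    new MiniHouse { City = "A", Size = 1500f },
    new MiniHouse { City = "B", Size = 2500f },
    new MiniHouse { City = "A", Size = 2000f },
    new MiniHouse { City = "B", Size = 1800f },
};
IDataView data = mlContext.Data.LoadFromEnumerable(rows);

// Feature steps only: one-hot encode the city, then merge everything into one vector.
var featurePipeline = mlContext.Transforms.Categorical.OneHotEncoding("CityEncoded", "City")
    .Append(mlContext.Transforms.Concatenate("Features", "Size", "CityEncoded"));

var transformed = featurePipeline.Fit(data).Transform(data);

// The vector holds one numeric slot for Size plus one slot per distinct city.
var featuresType = (VectorDataViewType)transformed.Schema["Features"].Type;
Console.WriteLine($"Features vector length: {featuresType.Size}");
```

Inspecting the schema like this is a quick way to confirm that every column you expect has made it into Features before you spend time training.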
Fluent Interface Pattern: The pipeline uses method chaining, which feels natural to C# developers familiar with LINQ:
// This ML.NET pipeline...
mlContext.Transforms.Categorical.OneHotEncoding("CityEncoded", "City")
    .Append(mlContext.Transforms.Concatenate("Features", ...))
    .Append(mlContext.Regression.Trainers.Sdca(...));
// ...feels like familiar LINQ
data.Where(x => x.IsActive)
    .Select(x => x.Price)
    .OrderBy(x => x);
Strongly-Typed Models: Define your data structure with attributes – just like Entity Framework:
public class HouseData
{
    [LoadColumn(0)]
    public float Id { get; set; }

    [LoadColumn(1)]
    public string City { get; set; }

    [LoadColumn(2)]
    public float Size { get; set; }

    [LoadColumn(4)]
    public float CurrentPrice { get; set; }
}

public class HousePricePrediction
{
    [ColumnName("Score")]
    public float PredictedPrice { get; set; }
}
One of our key insights was that different time horizons require different models. Rather than trying to create one model that predicts everything, we created specialized models:
// Train separate models for each time horizon
var models = new Dictionary<string, ITransformer>();
// Load and split once; every horizon trains on the same rows
var data = mlContext.Data.LoadFromTextFile<HouseData>(dataPath, hasHeader: true, separatorChar: ',');
var trainTestSplit = mlContext.Data.TrainTestSplit(data, testFraction: 0.2);
Directory.CreateDirectory("models");
foreach (var (timeHorizon, targetColumn) in timeHorizons)
{
    // Copy this horizon's target into "Label" so each model learns a different target
    // (assumes HouseData exposes a column named by targetColumn)
    var pipeline = mlContext.Transforms.CopyColumns("Label", targetColumn)
        .Append(MLPipelineBuilder.CreateHousePricePipeline(mlContext));
    var model = pipeline.Fit(trainTestSplit.TrainSet);
    models[timeHorizon] = model;
    // Save for production use
    mlContext.Model.Save(model, trainTestSplit.TrainSet.Schema, $"models/house-price-{timeHorizon}-model.zip");
}
This approach leverages a key C# strength: organizing complex logic into manageable, reusable components.
public static void EvaluateModel(MLContext mlContext, ITransformer model, IDataView testData, string modelName)
{
    var predictions = model.Transform(testData);
    var metrics = mlContext.Regression.Evaluate(predictions, "Label", "Score");

    Console.WriteLine($"\n{modelName} Model Evaluation:");
    Console.WriteLine($"R-Squared: {metrics.RSquared:F4}");
    Console.WriteLine($"Root Mean Squared Error: ${metrics.RootMeanSquaredError:F0}");
    Console.WriteLine($"Mean Absolute Error: ${metrics.MeanAbsoluteError:F0}");
}
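The evaluation metrics read naturally to developers: R-squared is the share of price variance the model explains (closer to 1 is better), while RMSE and MAE express the typical prediction error in dollars. No deep statistical knowledge is needed to judge model performance.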
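If the three metrics still feel abstract, it can help to compute them by hand once. The sketch below uses the standard textbook definitions on a handful of invented actual and predicted prices; it is meant as an illustration of what the numbers mean, not as a description of ML.NET's internal implementation.

```csharp
using System;
using System.Linq;

// Hypothetical actual and predicted prices, invented purely for illustration.
double[] actual = { 300_000, 450_000, 500_000, 750_000 };
double[] predicted = { 310_000, 430_000, 520_000, 740_000 };

// Per-row prediction errors.
var errors = actual.Zip(predicted, (a, p) => a - p).ToArray();

// MAE: average size of the error, ignoring direction.
var mae = errors.Average(e => Math.Abs(e));

// RMSE: like MAE, but squaring first penalizes large misses more heavily.
var rmse = Math.Sqrt(errors.Average(e => e * e));

// R-squared: 1 minus (unexplained variation / total variation around the mean).
var mean = actual.Average();
var residualSumOfSquares = errors.Sum(e => e * e);
var totalSumOfSquares = actual.Sum(a => (a - mean) * (a - mean));
var rSquared = 1.0 - residualSumOfSquares / totalSumOfSquares;

Console.WriteLine($"MAE: {mae:F0}, RMSE: {rmse:F0}, R-squared: {rSquared:F4}");
```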
Here's where C# really shines. We built a complete command-line interface that any developer can understand and extend:
#!/usr/bin/env dotnet-script
#r "nuget: Microsoft.ML, 3.0.1"

public class HousePricePredictor
{
    private readonly MLContext _mlContext;

    public PredictionResult PredictAllTimeHorizons(string city, float size, float currentPrice)
    {
        var house = CreateHouseData(city, size, currentPrice);
        var result = new PredictionResult();

        var models = new[]
        {
            ("house-price-5year-model.zip", 5),
            ("house-price-10year-model.zip", 10),
            ("house-price-20year-model.zip", 20)
        };

        foreach (var (modelFile, years) in models)
        {
            var predictedPrice = PredictPrice(house, modelFile);
            // Store results...
        }

        return result;
    }
}
# Interactive mode
dotnet script home-price-predictor.csx interactive
# Direct prediction
dotnet script home-price-predictor.csx predict "Desirable City A" 2500 750000
============================================================
HOUSE PRICE APPRECIATION FORECAST
============================================================
Input Parameters:
  City: Desirable City A
  Size: 2500 sq ft (Large)
  Current Price: $750000
  Estimated Historical Appreciation: 26.3%
Price Projections:
Time Horizon   Projected Price   Price Change   % Appreciation
------------   ---------------   ------------   --------------
5 Years        $921,875          $171,875       22.9%
10 Years       $1,125,000        $375,000       50.0%
20 Years       $1,687,500        $937,500       125.0%
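The two derived columns in this output follow directly from the projected price and the current price. As a small sketch of that arithmetic (the variable names are my own, and only the numbers come from the sample output above):

```csharp
using System;
using System.Globalization;

const double currentPrice = 750_000;

// Projected prices copied from the sample output.
var projections = new (int Years, double Price)[]
{
    (5, 921_875), (10, 1_125_000), (20, 1_687_500)
};

foreach (var (years, price) in projections)
{
    var change = price - currentPrice;            // Price Change column
    var percent = change / currentPrice * 100;    // % Appreciation column
    Console.WriteLine(string.Format(CultureInfo.InvariantCulture,
        "{0,2} Years  change ${1:N0}  appreciation {2:F1}%", years, change, percent));
}
// Prints:
//  5 Years  change $171,875  appreciation 22.9%
// 10 Years  change $375,000  appreciation 50.0%
// 20 Years  change $937,500  appreciation 125.0%
```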
// This feels like any other C# service class
public class PredictionService
{
    private readonly MLContext _mlContext;
    private readonly Dictionary<string, ITransformer> _models;

    public async Task<PredictionResult> PredictAsync(HouseData input)
    {
        // Standard async/await patterns
        var result = await ProcessPredictionAsync(input);
        return result;
    }
}
// Drop into any ASP.NET Core application
[ApiController]
[Route("[controller]")]
public class PredictionController : ControllerBase
{
    private readonly PredictionService _predictionService;

    [HttpPost("predict")]
    public async Task<ActionResult<PredictionResult>> Predict([FromBody] HouseData input)
    {
        var result = await _predictionService.PredictAsync(input);
        return Ok(result);
    }
}
Inference Speed: Our trained models can make predictions in under 10ms – fast enough for real-time web applications.
Memory Efficiency: ML.NET models are optimized for production scenarios, with minimal memory overhead.
Scalability: Easy to deploy with familiar .NET hosting options (IIS, Azure App Service, containers).
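Latency figures like these are worth checking on your own hardware and models. One rough way to do that is a Stopwatch loop with a warm-up phase, sketched below. MeasureAverageMs is a hypothetical helper, and the square-root call is only a stand-in workload; in practice the delegate would wrap a call such as predictionEngine.Predict(house).

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: average wall-clock milliseconds per call of `predict`.
static double MeasureAverageMs(Action predict, int iterations = 1_000, int warmup = 50)
{
    // Warm-up calls give the JIT a chance to compile the hot path first.
    for (var i = 0; i < warmup; i++) predict();

    var stopwatch = Stopwatch.StartNew();
    for (var i = 0; i < iterations; i++) predict();
    stopwatch.Stop();

    return stopwatch.Elapsed.TotalMilliseconds / iterations;
}

// Stand-in workload so the sketch runs without a trained model.
var averageMs = MeasureAverageMs(() => Math.Sqrt(750_000));
Console.WriteLine($"Average latency: {averageMs:F4} ms per call");
```

For anything beyond a quick sanity check, a dedicated benchmarking library gives more trustworthy numbers than a hand-rolled loop.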
// Standard Visual Studio debugging works
public float? PredictPrice(HouseData house, string modelFileName)
{
    try
    {
        // Assumes models live in the "models" folder used when they were saved
        var modelPath = Path.Combine("models", modelFileName);
        var model = _mlContext.Model.Load(modelPath, out var schema);
        var predictionEngine = _mlContext.Model.CreatePredictionEngine<HouseData, HousePricePrediction>(model);

        // Set breakpoints, inspect variables, step through code
        var prediction = predictionEngine.Predict(house);
        return prediction.PredictedPrice;
    }
    catch (Exception ex)
    {
        // Familiar exception handling patterns
        _logger.LogError(ex, "Prediction failed for model {ModelFile}", modelFileName);
        return null;
    }
}
// In Startup.cs
services.AddSingleton<MLContext>();
// Singleton: models load once, and a hot-swapped model is visible to every request
services.AddSingleton<IPredictionService, PredictionService>();

// Model loading with dependency injection
public class PredictionService : IPredictionService
{
    public PredictionService(MLContext mlContext, IConfiguration config)
    {
        _mlContext = mlContext;
        LoadModels(config.GetValue<string>("ModelPath"));
    }
}
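// Hot-swap models without application restart
public async Task UpdateModelAsync(string modelPath, string timeHorizon)
{
    // Load off the calling thread so a large model file does not block it
    var newModel = await Task.Run(() => _mlContext.Model.Load(modelPath, out _));
    // Dictionary is not safe for concurrent reads and writes; guard _models with a lock
    // (or use ConcurrentDictionary) if predictions can run during the swap
    _models[timeHorizon] = newModel;
    _logger.LogInformation("Model updated for {TimeHorizon}", timeHorizon);
}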
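One way to make the swap itself safe under concurrent requests is to keep the models in a ConcurrentDictionary. The sketch below is a hypothetical ModelRegistry, not the article's implementation; TModel stands in for ITransformer so the script runs without ML.NET.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical registry; TModel stands in for ITransformer in this sketch.
public class ModelRegistry<TModel> where TModel : class
{
    private readonly ConcurrentDictionary<string, TModel> _models =
        new ConcurrentDictionary<string, TModel>();

    // Publishes a new model atomically: readers see either the old or the new instance.
    public void Update(string timeHorizon, TModel model)
    {
        if (model == null) throw new ArgumentNullException(nameof(model));
        _models[timeHorizon] = model;
    }

    // Returns false when no model has been registered for the horizon.
    public bool TryGet(string timeHorizon, out TModel model) =>
        _models.TryGetValue(timeHorizon, out model);
}

var registry = new ModelRegistry<string>();
registry.Update("5year", "model-v1");
registry.Update("5year", "model-v2");   // hot-swap: replaces v1 without locking readers out

registry.TryGet("5year", out var current);
Console.WriteLine(current);   // prints "model-v2"
```

Note that this only protects the dictionary. A PredictionEngine created from a model is a separate object and is not safe to share across threads, so each caller still needs its own engine or a pooled one.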
ML.NET abstracts away much of the mathematical complexity, letting you focus on solving business problems with familiar C# patterns.
Spending time on good data generation and feature engineering pays dividends. The SDCA algorithm we used is simple but effective because our data is well-structured.
Our first model was basic linear regression. We improved incrementally, just like refactoring any C# application.
This pattern extends beyond house prices:
// Product demand forecasting
public class DemandPredictor : IPredictionService<DemandData, DemandPrediction> { }
// Customer lifetime value
public class CLVPredictor : IPredictionService<CustomerData, CLVPrediction> { }
// Inventory optimization
public class InventoryPredictor : IPredictionService<InventoryData, InventoryPrediction> { }
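The article does not show the generic IPredictionService definition, so the following is only one plausible shape for it. The record types and the averaging logic are invented placeholders standing in for a trained model; the point is the contract, not the forecast.

```csharp
using System;
using System.Linq;

// Hypothetical generic contract shared by every predictor.
public interface IPredictionService<TInput, TPrediction>
{
    TPrediction Predict(TInput input);
}

// Invented input and output shapes for the demand example.
public record DemandData(double[] RecentWeeklySales);
public record DemandPrediction(double ExpectedNextWeek);

// Placeholder logic (a plain average) where a trained model would normally sit.
public class DemandPredictor : IPredictionService<DemandData, DemandPrediction>
{
    public DemandPrediction Predict(DemandData input) =>
        new DemandPrediction(input.RecentWeeklySales.Average());
}

IPredictionService<DemandData, DemandPrediction> predictor = new DemandPredictor();
var forecast = predictor.Predict(new DemandData(new[] { 100.0, 120.0, 110.0 }));
Console.WriteLine($"Expected next week: {forecast.ExpectedNextWeek}");
```

Because callers depend only on the interface, swapping the placeholder for an ML.NET-backed implementation later would not change any calling code.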
# Verify the .NET SDK is installed
dotnet --version
# Install dotnet-script for easy prototyping
dotnet tool install -g dotnet-script
# Create a new project
dotnet new console -n MyMLProject
cd MyMLProject
dotnet add package Microsoft.ML
Clone our repository and explore the notebooks:
- home-price-projections.ipynb - Data generation
- housing-price-prediction-mlnet.ipynb - Model training
- home-price-predictor.csx - Production inference
Replace our house price logic with your business domain:
As C# developers, we have a unique advantage in the machine learning space. ML.NET allows us to leverage our existing skills, tooling, and ecosystem knowledge to build powerful ML solutions without leaving the .NET world.
The house price prediction system we built demonstrates that you don't need to become a data scientist or learn Python to solve real ML problems. You need:
✅ Good software engineering practices (which you already have)
✅ Understanding of your business domain (which you already have)
✅ Willingness to iterate and improve (which you already do)
The machine learning part? ML.NET handles that for you, presenting it through familiar C# abstractions.
Machine learning isn't magic – it's just another tool in your software development toolkit. With ML.NET, it's a tool that speaks C#.
Ready to build your next ML-powered application? Start with familiar C# patterns, add ML.NET for the smart bits, and watch your applications become more intelligent, one dotnet add package at a time.