Instantaneous, on-demand insight into a commercial building’s energy use would have practical value. With it, real estate investors could estimate a property’s energy costs to inform purchase decisions, micro-grid developers could better estimate aggregate energy needs in a given location to inform system design, and governments could use it to shape net-zero policy for energy-efficient buildings.

Today, information on a building’s energy use is not readily available to these types of users. It is only available to those with access to the building’s energy bills – usually the building operator, the owner, or the utility itself. While some local and state/provincial governments in Canada and the U.S. are working to publish this data in open databases (e.g., Ontario’s Energy Benchmarking and Reporting Regulation), these datasets are not yet available, hard to find, or limited to a single geographic region.

What if building energy use could be accurately predicted using machine learning without previous knowledge of a building’s energy use?

About Supervised Machine Learning

Machine learning methods use historical data to train a function, which we call a model, to map inputs to an output. In a commercial building energy use prediction model, inputs like weather, property type, and property size could be mapped to a building’s energy use (the output).

It is standard to train the model using supervised learning, which requires historical data with both known inputs and known outputs (this is what makes it “supervised”). During training, the model is adjusted repeatedly against the historical data until it reasonably maps the building features to the energy usage. The resulting relationship (formula) is then used to predict energy use for new (future) inputs. See the example in the table below.

Table 1 – Example of a supervised training data set using weather data (inputs) to predict next-day building energy (output)

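| Date      | Cooling Degree Days | Heating Degree Days* | Actual Building Energy Use |
|-----------|---------------------|----------------------|----------------------------|
| Tuesday   | 50                  | 0                    | 200 kWh                    |
| Wednesday | 54                  | 0                    | 210 kWh                    |
| Today     | 46                  | 0                    | ?                          |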

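*Heating and cooling degree days are units of measure used in the building energy efficiency community. They indicate how much heating or cooling a building needs in a day, based on how far the day’s average outdoor temperature falls below (heating) or rises above (cooling) a reference temperature.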
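
The idea behind Table 1 can be sketched in a few lines of plain Python. This is only an illustration: it fits a straight line by ordinary least squares to the two labeled rows and uses it to fill in the “?” row. The function names `fit_line` and `predict` are hypothetical, and the choice of a straight line is an assumption made here, not a method taken from the article.

```python
# Labeled rows from Table 1: (cooling degree days, energy use in kWh).
# Heating degree days are 0 in every row, so they carry no signal here
# and are left out of this one-feature sketch.
history = [(50.0, 200.0), (54.0, 210.0)]


def fit_line(rows):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(rows)
    mean_x = sum(x for x, _ in rows) / n
    mean_y = sum(y for _, y in rows) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in rows)
    variance = sum((x - mean_x) ** 2 for x, _ in rows)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, cooling_degree_days):
    """Apply the learned relationship to a new input."""
    slope, intercept = model
    return slope * cooling_degree_days + intercept


model = fit_line(history)               # "training" on known inputs and outputs
todays_estimate = predict(model, 46.0)  # the "?" row in Table 1
print(f"Estimated energy use: {todays_estimate:.0f} kWh")
```

With only two rows the line passes exactly through both points and gives 190 kWh for today’s 46 cooling degree days. Two rows are far too few for a real model; the point is only to show the fit-then-predict pattern.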

Gradient Boosting Machine Learning Research in Building Energy Consumption

In “Machine Learning Approaches for Estimating Commercial Building Energy Consumption,” Caleb Robinson, Bistra Dilkina, and colleagues evaluated various machine learning models as commercial building energy use predictors.

They trained the models with a large U.S. building energy consumption data set that contained only a handful of features per building: heating degree days, cooling degree days, square footage, the number of floors, and principal building activity. The target to predict was the annual BTUs (British Thermal Units) consumed by the building. The authors wanted to determine the accuracy of various models using limited, easily accessible information about each building.

The model experiments included linear regression models, support vector machines, neural networks, and gradient boosting machines (GBMs), all of which are classical machine learning models. Don’t let these model types intimidate you: they can be built fairly simply with Python’s scikit-learn package, as long as one is not concerned with the internal details of each model. Although the models and the training algorithms can be tuned, the authors chose to keep the default scikit-learn settings for ease of implementation.
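
To give a feel for how little code such a comparison takes, here is a rough sketch that trains the four model families with their default scikit-learn settings. Everything about the data is an assumption: the 500 buildings are randomly generated, the formula that produces `annual_energy` is invented, and the variable names are hypothetical. The `StandardScaler` step and the R² score are also choices made for this sketch, not details taken from the study.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic stand-in data (NOT the study's data set).
rng = np.random.default_rng(0)
n_buildings = 500
heating_degree_days = rng.uniform(0, 8000, n_buildings)
cooling_degree_days = rng.uniform(0, 4000, n_buildings)
square_feet = rng.uniform(1_000, 200_000, n_buildings)
floors = rng.integers(1, 30, n_buildings)
activity = rng.integers(0, 5, n_buildings)  # 5 made-up building activities
activity_one_hot = np.eye(5)[activity]      # one column per activity

X = np.column_stack([
    heating_degree_days, cooling_degree_days, square_feet, floors,
    activity_one_hot,
])

# Invented relationship between the features and annual energy use.
intensity = (30 + 0.005 * heating_degree_days
             + 0.008 * cooling_degree_days + 5 * activity)
annual_energy = square_feet * intensity * rng.normal(1.0, 0.1, n_buildings)

X_train, X_test, y_train, y_test = train_test_split(
    X, annual_energy, test_size=0.25, random_state=0
)

# Default settings for every model; random_state only fixes the seed.
models = {
    "linear regression": LinearRegression(),
    "support vector machine": SVR(),
    "neural network": MLPRegressor(random_state=0),
    "gradient boosting": GradientBoostingRegressor(random_state=0),
}

scores = {}
for name, model in models.items():
    # Scaling the inputs is an assumption of this sketch.
    pipeline = make_pipeline(StandardScaler(), model)
    pipeline.fit(X_train, y_train)
    scores[name] = pipeline.score(X_test, y_test)  # R^2 on held-out buildings
    print(f"{name}: R^2 = {scores[name]:.3f}")
```

The scores this prints reflect only the made-up data and say nothing about the study’s results. The neural network may also emit a convergence warning under its default iteration limit.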

Per the study, the model that had the best overall energy use prediction accuracy was the GBM. Further, when the authors added more building features to the training data set, the average prediction performance of the GBM improved.

We now provide a high-level review of how the GBM works. The details of the other models are outside the scope of this article.

About Gradient Boosting Machines (GBMs)

The GBM is composed of decision trees that act as weak learners, meaning that each tree is only a modest predictor on its own. A decision tree produces its output by applying a sequence of simple rules, like a series of decision points where the data goes either “left” or “right.”

Figure 1 – Example of a Machine Learning Decision Tree
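
The left/right rules of a decision tree can be written as a small nested data structure and a loop that walks it. The tree below is entirely hypothetical: the features, thresholds, and leaf values are invented for illustration and are not learned from any data.

```python
# A hand-written (not trained) decision tree. Internal nodes hold a rule;
# leaves hold the predicted value, in made-up energy units.
tree = {
    "feature": "square_feet", "threshold": 50_000,
    "left": {
        "feature": "cooling_degree_days", "threshold": 1_000,
        "left": {"value": 900.0},
        "right": {"value": 1_400.0},
    },
    "right": {
        "feature": "floors", "threshold": 5,
        "left": {"value": 4_000.0},
        "right": {"value": 7_500.0},
    },
}


def predict(node, building):
    """Follow the rules from the root to a leaf, recording each turn."""
    path = []
    while "value" not in node:
        go_left = building[node["feature"]] <= node["threshold"]
        path.append("left" if go_left else "right")
        node = node["left"] if go_left else node["right"]
    return node["value"], path


building = {"square_feet": 80_000, "cooling_degree_days": 1_500, "floors": 3}
estimate, path = predict(tree, building)
print(estimate, path)
```

For this building the walk goes right at the root (more than 50,000 square feet) and then left (five floors or fewer), ending at the 4,000 leaf. In a trained tree, the rules and leaf values are chosen by the training algorithm rather than written by hand.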

The training of the GBM consists of a fixed number of stages. At each stage, a new decision tree is added and trained to reduce the leftover error from all previous decision trees (like any algorithm, a decision tree has some level of error).

The idea is to improve the model performance by training new decision trees to account for the weaknesses of the old decision trees. The implementation of this algorithm can be a few lines of code when using the scikit-learn package.
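
scikit-learn hides the stages behind a single estimator, so a dependency-free sketch may make the stage-by-stage idea more concrete. The version below is a simplification under stated assumptions: it uses one input feature, one-split trees (“stumps”), a squared-error objective, and made-up data. The names `fit_stump`, `fit_gbm`, and `gbm_predict` are hypothetical, and this is not the study’s implementation.

```python
def fit_stump(xs, residuals):
    """Find the single left/right split that best fits the residuals."""
    best = None
    candidates = sorted(set(xs))
    for low, high in zip(candidates, candidates[1:]):
        threshold = (low + high) / 2
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)


def stump_predict(stump, x):
    threshold, left_mean, right_mean = stump
    return left_mean if x <= threshold else right_mean


def fit_gbm(xs, ys, n_stages=20, learning_rate=0.5):
    """Each stage adds a stump trained on the error left by earlier stages."""
    base = sum(ys) / len(ys)  # stage 0: predict the average
    predictions = [base] * len(ys)
    stumps, errors = [], []
    for _ in range(n_stages):
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump_predict(stump, x)
                       for p, x in zip(predictions, xs)]
        errors.append(sum((y - p) ** 2 for y, p in zip(ys, predictions))
                      / len(ys))
    model = {"base": base, "stumps": stumps, "learning_rate": learning_rate}
    return model, errors


def gbm_predict(model, x):
    correction = sum(stump_predict(s, x) for s in model["stumps"])
    return model["base"] + model["learning_rate"] * correction


# Made-up data: cooling degree days -> energy use (kWh).
xs = [40, 44, 46, 50, 54, 58, 60, 64]
ys = [172, 185, 190, 200, 210, 223, 226, 240]

model, errors = fit_gbm(xs, ys)
print(f"training error after stage 1:  {errors[0]:.2f}")
print(f"training error after stage 20: {errors[-1]:.2f}")
print(f"estimate for 52 cooling degree days: {gbm_predict(model, 52):.1f} kWh")
```

The training error shrinks as stages are added because every new stump is fit to what the earlier ones got wrong. Library implementations typically use deeper trees and more features, but the loop has the same shape.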

Improving the GBM with More Testing

The GBM as an energy use predictor could benefit from further model exploration. As mentioned above, training the model with more building features can improve the prediction performance of GBMs. So, for each building type, it is worth trying a GBM trained with many specialized features.

Another future direction to advance the study’s work would be to test the model against actual energy use data from other U.S. and Canadian buildings to see how it performs in different regions and climates.


The study is an interesting probe into building energy prediction models (predictors) for commercial buildings using easily accessible data sets that are available to a wide variety of stakeholders, while excluding hard-to-acquire data such as a building’s actual energy use. The GBM performed best in the study.

The GBM could be extended by testing it with data from other jurisdictions and with additional input parameters. Care must be taken not to add so many inputs that the original goal is violated – making accurate building energy use predictions with high-availability data sets.