
In retail real estate, you can't go very far without hearing discussion about retail sales forecasting models. Everyone knows about them and many people like them.

Yet few people really understand them. I've spent a good chunk of my life creating and working on retail models, both for my own business and for others, though I don't make them anymore. So you might say I've seen how the sausage is made. That said, models are very complex, and generalizations are just that. This is a blog, not a journal article, so there is no way to be totally definitive on the subject in the space allotted. However, we can gain some insights by looking at models from 50,000 feet.


There are a number of different types of models used in retail to forecast sales. In general, they can be categorized into one of the following:

Aggregate Regression Models

  • How they work:     Analyze the trade area as one big chunk using mathematical equations that contain information about the trade area (like population, income, competition, etc.).  When you calculate the equation for any given trade area, the result is a sales forecast.

  • Pros:     Relatively easy to build. Can be easy to run.

  • Cons:     Less accurate than other techniques unless combined with a heuristic model (below). 
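To make the "one equation for the whole trade area" idea concrete, here is a minimal sketch in Python. The coefficients and the input names (`population`, `median_income`, `competitors`) are made up for illustration; a real model would estimate its coefficients by fitting a regression to existing stores.

```python
# Hypothetical coefficients, invented for illustration only. A real model
# would estimate them by fitting a regression to existing stores.
COEFFICIENTS = {
    "intercept": 250_000.0,
    "population": 12.0,         # dollars of annual sales per resident
    "median_income": 4.0,       # dollars of sales per dollar of median income
    "competitors": -60_000.0,   # dollars of sales lost per competing store
}


def aggregate_forecast(trade_area: dict) -> float:
    """Evaluate one linear equation for the trade area as a whole."""
    return (
        COEFFICIENTS["intercept"]
        + COEFFICIENTS["population"] * trade_area["population"]
        + COEFFICIENTS["median_income"] * trade_area["median_income"]
        + COEFFICIENTS["competitors"] * trade_area["competitors"]
    )


# One made-up trade area, described by three summary numbers.
site = {"population": 40_000, "median_income": 65_000, "competitors": 3}
forecast = aggregate_forecast(site)
print(f"Forecast annual sales: ${forecast:,.0f}")
```

Notice that everything about the trade area has been boiled down to three numbers before the equation ever sees it. That is what makes these models easy to run, and it is also where the accuracy goes.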

Disaggregate Regression Model

  • How they work:     These are like aggregate regression models except you analyze the trade area in little pieces and add up the results to get a sales forecast.

  • Pros:     Can be more accurate than aggregate regression models

  • Cons:    Harder to build. May be harder to run than aggregate models
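The "little pieces" idea can be sketched in a few lines of Python. Everything here is an assumption for illustration: the pieces (think block groups), the spending figure, and especially `capture_rate`, which stands in for whatever per-piece equation a real model would fit.

```python
def capture_rate(distance_miles: float) -> float:
    """Hypothetical share of a resident's spending that the store captures.

    Assumed shape: 30% right at the store, falling in a straight line
    to 0% at 5 miles. A real model would fit this relationship to data.
    """
    return max(0.0, 0.30 * (1.0 - distance_miles / 5.0))


def disaggregate_forecast(pieces: list, spend_per_person: float) -> float:
    """Forecast each piece of the trade area separately, then add them up."""
    return sum(
        piece["population"] * spend_per_person * capture_rate(piece["distance_miles"])
        for piece in pieces
    )


# Three made-up pieces of a trade area.
pieces = [
    {"population": 2_000, "distance_miles": 0.0},
    {"population": 4_000, "distance_miles": 2.5},
    {"population": 3_000, "distance_miles": 6.0},  # too far away to contribute
]
total = disaggregate_forecast(pieces, spend_per_person=500.0)
print(f"Forecast annual sales: ${total:,.0f}")
```

The extra work is all in the data: you need population and distance for every piece, not just one summary of the trade area.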

Heuristic Model

  • How they work:     "If this, then that." If there are more than 20,000 people within 2 miles, then if there are fewer than 4 competitors within 5 miles... And so on. These models run the gamut from overly simplistic to extremely sophisticated and are often combined with regression models to say, "If it's like this, use this regression model.  If it’s like that, use that regression model," and so on.

  • Pros:      Can be very powerful, especially combined with other model types.

  • Cons:     Can be overly simplistic
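Here is a rough Python sketch of the "if this, then that" logic, using the two thresholds from the example above to pick which regression model to apply. The model labels are hypothetical placeholders, and a real rule set would have many more branches.

```python
def choose_model(pop_within_2_miles: int, competitors_within_5_miles: int) -> str:
    """Pick a regression model by walking through nested if/then rules.

    The returned labels are placeholders for separately built regression
    models; they are not names from any real system.
    """
    if pop_within_2_miles > 20_000:
        if competitors_within_5_miles < 4:
            return "dense_low_competition_model"
        return "dense_high_competition_model"
    return "sparse_market_model"


print(choose_model(25_000, 2))
print(choose_model(25_000, 6))
print(choose_model(8_000, 1))
```

The rules themselves carry the judgment. If the thresholds are poorly chosen, the right regression model gets applied to the wrong kind of site.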

Spatial Interaction Model

  • How they work:    This is the only modeling technology that actually tries to model people's shopping behavior directly. The goal is to create mathematical equations that describe the flow of dollars or people from each neighborhood to each store. Once you find the math, you can put a new store into the model and recalculate. The model will then tell you how many dollars or people should flow to your proposed store.

  • Pros:    Can be purchased commercially.  Can implicitly provide estimates of sister store and competitive impacts.

  • Cons:   Require a very skilled person to run. Need very accurate data about competitor locations and demand, or they will work poorly.
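One common way to write down this kind of dollar flow is a Huff-style gravity formulation, where a store's pull on a neighborhood grows with its size and shrinks with distance. The sketch below uses that formulation as an assumption; commercial packages differ in the details. The store sizes, distances, dollar amounts, and the `beta` exponent are all made up, and the sketch assumes every dollar goes to one of the modeled stores.

```python
def store_sales(neighborhoods, stores, beta=2.0):
    """Split each neighborhood's dollars among stores by size / distance**beta."""
    sales = {name: 0.0 for name in stores}
    for hood in neighborhoods:
        # Pull of each store on this neighborhood: bigger and closer wins.
        pulls = {
            name: size / hood["miles_to"][name] ** beta
            for name, size in stores.items()
        }
        total_pull = sum(pulls.values())
        for name, pull in pulls.items():
            sales[name] += hood["dollars"] * pull / total_pull
    return sales


# Two made-up neighborhoods with their available dollars and store distances.
neighborhoods = [
    {"dollars": 1_000_000, "miles_to": {"existing": 1.0, "proposed": 2.0}},
    {"dollars": 500_000, "miles_to": {"existing": 2.0, "proposed": 1.0}},
]

# Store sizes in square feet (hypothetical).
before = store_sales(neighborhoods, {"existing": 40_000})
after = store_sales(neighborhoods, {"existing": 40_000, "proposed": 40_000})

print(f"Existing store before: ${before['existing']:,.0f}")
print(f"Existing store after:  ${after['existing']:,.0f}")
print(f"Proposed store:        ${after['proposed']:,.0f}")
```

With these numbers the proposed store picks up $600,000, and the existing store drops from $1,500,000 to $900,000. That drop is the sister store impact, and it falls out of the same calculation without any extra modeling.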

Analog Model

  • How they work:     This is a different kind of model. Instead of providing a sales forecast, it provides a list of your existing stores that have characteristics similar to your site, usually along with a score of how well they match. You can then look at the stores that match and see how well they perform.

  • Pros:     Analog models are typically fairly easy to build and to run. They have the added benefit of being easy to explain to stakeholders.

  • Cons:    They do not provide an actual sales forecast, although some people use an average of the matching stores' sales to estimate one.
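A bare-bones version of the matching step might look like the Python sketch below. The scoring rule (one minus the average relative gap across features) is just one simple choice I'm assuming for illustration, and the stores, features, and sales figures are invented. It also assumes every feature value is positive.

```python
def match_score(site, store, features):
    """Score from 0 to 1: one minus the average relative gap across features."""
    gaps = [abs(site[f] - store[f]) / max(site[f], store[f]) for f in features]
    return 1.0 - sum(gaps) / len(gaps)


def find_analogs(site, stores, features, top_n=3):
    """Return (score, store name, store sales) tuples, best match first."""
    scored = [(match_score(site, s, features), s["name"], s["sales"]) for s in stores]
    return sorted(scored, reverse=True)[:top_n]


features = ["population", "median_income"]
site = {"population": 30_000, "median_income": 60_000}

# Made-up existing stores with their trade area characteristics and sales.
stores = [
    {"name": "Store 3", "population": 60_000, "median_income": 30_000, "sales": 1_400_000},
    {"name": "Store 7", "population": 15_000, "median_income": 60_000, "sales": 1_700_000},
    {"name": "Store 12", "population": 30_000, "median_income": 60_000, "sales": 2_100_000},
]

analogs = find_analogs(site, stores, features)
for score, name, sales in analogs:
    print(f"{name}: match {score:.2f}, annual sales ${sales:,.0f}")
```

The output is a ranked list, not a forecast. What you do with the matching stores' sales is still a judgment call.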

The question that people always ask the model builders is "How accurate is your model?" They usually answer with a number (like within ±10%). The question that should be asked by the model builder in return (but usually isn’t) is "How much does the quality of the store manager impact your sales?" Whatever number that is, the model builder’s answer should be higher. Many retail chains claim that the quality of manager can impact sales ±15% or more. In a grocery store case I heard from a reliable source, a bad store manager cut sales in half!

There are other factors that can drastically affect sales that can't be readily captured by models. For example, site characteristics are too complex to be fully considered by any model.  In addition, most models use some measure of competition, usually from a third-party source.  But the best of the best of these is only 85% accurate, implying 15% error. And almost no one is using this quality of data to build models; it's too expensive. So, unless the competitor data is field verified, the real error rate is worse.

Things you need to know about models can be summed up as follows:

  1. Models can provide an assessment of the relative quality of different sites within the constraints of the model's design and inherent error. So if the competition data used is wrong 20% of the time, we really shouldn't expect the model to do better than that. If the quality of the store manager is not a model parameter, we will see that variability in the model results as well. The bottom line is that anyone who expects a model to produce sales estimates within ±10% or 15% is going to be disappointed.  So don't expect it!

  2. Models ARE useful as a reality check. They are never opinionated and don’t fall in love with deals. They can be thought of as an unbiased opinion with emphasis on the word opinion; they may be wrong.  A model should never make the decision for you (see Abdication: The Sea Slug as Retail Site Selector).

  3. If you build your model using data that is wrong, your model will be wrong.  Before you embark on a model project, embark on a project to field verify your data.

  4. If you plan to have a third party build your model for you, find out who owns the resulting model, you or them.  Can you see under the hood? Can you implement it in the system of your choice, or can you only run it through their system?

  5. People like to blame models when a store performs worse than projected. Don't let them. It's their job to find good sites, not the model's.

  6. As things change, your model gets old. New products, new marketing, competition, and industry trends all shift the characteristics that drive your business. The model will need to be updated or it will get continually less useful. Analog models are less susceptible to this because they rely on existing stores, and those stores' performance is changing as well.

  7. Companies rarely do post mortems on their model results to see how close they came.  I’m not sure why.  Maybe they don’t really want to know.  If you decide to implement a model, don’t make this mistake.
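A post mortem doesn't have to be elaborate. As a minimal sketch, assuming you have the original forecast and the actual sales for each opened store, a few lines of Python will tell you how often the model landed within ±10%. The store names and dollar figures below are invented for illustration.

```python
def percent_error(forecast: float, actual: float) -> float:
    """Signed error as a percentage of actual sales (positive = over-forecast)."""
    return (forecast - actual) / actual * 100.0


# (store, forecast sales, actual sales) -- made-up numbers.
results = [
    ("Store 1", 1_000_000, 800_000),
    ("Store 2", 950_000, 1_000_000),
    ("Store 3", 1_200_000, 1_150_000),
    ("Store 4", 700_000, 1_000_000),
]

errors = {name: percent_error(f, a) for name, f, a in results}
within_10 = sum(1 for e in errors.values() if abs(e) <= 10.0)
mean_abs_error = sum(abs(e) for e in errors.values()) / len(errors)

for name, err in errors.items():
    print(f"{name}: {err:+.1f}%")
print(f"Within ±10%: {within_10} of {len(errors)} stores")
print(f"Mean absolute error: {mean_abs_error:.1f}%")
```

Keeping the sign on each error matters. A model that misses high and low in equal measure is a different problem from one that over-forecasts every time.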

Regardless of which modeling technology you choose, the most important thing is to understand the true nature and limitations of a model and to use it appropriately.  This requires setting expectations, in terms of what can be expected from the model as well as what will be expected from the real estate reps. Once you understand what models are good at and what they aren’t, you can decide whether they are worth the investment and, if they are, how they can be properly implemented into your site selection process.

Joe Rando