Does MLXChange MLS Software Calculate Price Per Square Foot Properly?
I have been critical of our new Austin MLS System called MLXChange since it was first placed into production in Austin in November 2007. It produces what I believe to be incorrect data. I was contacted yesterday by a product manager from MLXChange because they saw my last blog article showcasing some of the bad data. In that article, I mentioned that the price per square foot is not calculated correctly by the system.
The product manager was very nice and I enjoyed our conversation. She wanted to explain to me how the average price per square foot numbers are calculated and that, in fact, the numbers are calculated correctly. She is right on a certain level. That is, if one understands and follows the formula being used by MLXChange, it can be argued that their number is mathematically correct.
But I maintain that the method being used by MLXChange to calculate average price per square foot is not the correct formula. I’m not smart enough to articulate it in a scientific or mathematical argument, but my intuition and instinct tell me I’m right. I’ll do my best to explain below and I hope a smart reader can chime in and offer a mathematical or statistical point of view and explanation of why two different methods, each of which seems correct independently on face value, produce different results.
So let’s look at an example that illustrates the MLXChange formula versus the method I believe most of us would use, and the method I believe makes the most sense. I’ve created a small sample set of data, shown in the chart below.
| | Square Feet | Sold Price | Sold $ per SQFT |
|---|---|---|---|
| House 1 | 1200 | $150,000 | $125.00 |
| House 2 | 1300 | $110,000 | $84.62 |
| House 3 | 1400 | $95,000 | $67.86 |
| House 4 | 1500 | $120,000 | $80.00 |
| House 5 | 1600 | $85,000 | $53.13 |
| MLXChange Avg | 1400 | $112,000 | $82.12 |
| My Average | 1400 | $112,000 | $80.00 |

What is the average sold price per square foot of the 5 homes shown above?
MLXChange computes the Average Price per Square Foot by adding together all of the individual price per square foot numbers (the five psf numbers in the far right column) and dividing by the number of sales (five). This produces a result of $82.12 per square foot as the average price per square foot of the sold homes in our sample set.
I compute the average price per square foot by taking the average sqft size of all homes (1400) and dividing it into the average sold price of all homes ($112,000). My result is $80 per square foot. That’s what you would see on a stats chart that I manually produce and post regularly on this blog.
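For anyone who wants to check the arithmetic, the two calculations can be sketched in a few lines of Python. This is only an illustration using the five sample homes from my chart. The function names are my own invention and are not taken from MLXChange, whose actual code I have not seen.

```python
# Sample data from the chart above: (square feet, sold price)
homes = [
    (1200, 150_000),
    (1300, 110_000),
    (1400, 95_000),
    (1500, 120_000),
    (1600, 85_000),
]


def average_of_ratios(sales):
    """Average the individual psf figures (the MLXChange approach, as it was described to me)."""
    return sum(price / sqft for sqft, price in sales) / len(sales)


def ratio_of_averages(sales):
    """Divide the average sold price by the average square footage (my approach)."""
    avg_price = sum(price for _, price in sales) / len(sales)
    avg_sqft = sum(sqft for sqft, _ in sales) / len(sales)
    return avg_price / avg_sqft


print(f"Average of individual psf figures: ${average_of_ratios(homes):.2f}")
print(f"Average price / average size:      ${ratio_of_averages(homes):.2f}")
```

Run against the sample set, the first line should come out at $82.12 and the second at $80.00, matching the two bottom rows of the chart.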
The two methods, in this particular example, produce results that are 2.65% different. 2.65% is not an insignificant amount when pricing a home. It’s a bigger gap than the List/Sold price difference on most sold homes in Austin. It’s a $5,300 difference on a 2500 sqft home at our example psf rates. Would you like to sell your home for $5,300 too little? Would you like to pay $5,300 too much, based on your Realtor’s CMA, produced by MLXChange?
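Here is the same gap worked out as a quick sketch, using the two psf figures from the example and the 2500 sqft home size mentioned above. The variable names are just for illustration.

```python
mlx_psf = 82.12    # average of the individual psf figures
my_psf = 80.00     # average sold price / average square feet
home_sqft = 2500   # example home size from the paragraph above

# Percentage gap between the two psf figures, relative to my figure
pct_gap = (mlx_psf - my_psf) / my_psf * 100

# Dollar difference when each psf figure is applied to the example home
dollar_gap = (mlx_psf - my_psf) * home_sqft

print(f"Gap between the two methods: {pct_gap:.2f}%")
print(f"Difference on a {home_sqft} sqft home: ${dollar_gap:,.0f}")
```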
Let’s dig deeper into why I think the MLXChange method is flawed.
First of all, what really bothers me as a Realtor and a follower of the Austin Real Estate Market is that the MLXChange computation method produces a result that is visually and mathematically disconnected from the relationship between the two other numbers that the result is supposed to represent.
In other words, a casual observer of the above chart, were it to show only the $82.12 per square foot, would not be able to easily understand how $82.12 relates to the average sold price of $112,000 or the average sqft size of 1400. As I told the MLXChange rep, it’s a “voodoo” number. A mystery result which can’t be reproduced or validated with the numbers provided on a summary stats report. Sure, we can add together these 5 numbers in my example, but when running a quarterly comparison report with 5,000 sales, comparing the current year to the previous year, we can’t manually validate the data. We just look at the number and say “that ain’t right”.
When I learned math, a Numerator (Avg Sales Price) divided by a Denominator (Avg Size SQFT) equals a number that can be operated in reverse with either of the other two numbers to produce the third. So, an average price per square foot figure necessarily says to me that “$80 represents the answer derived from dividing $112,000 by 1400”.
I reject and do not accept that $82.12 can represent $112,000 divided by 1400. It simply doesn’t compute.
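To make the "operate it in reverse" idea concrete, here is a small sketch using only the summary numbers from my chart. Again, the variable names are mine and purely illustrative.

```python
avg_price = 112_000   # average sold price from the chart
avg_sqft = 1400       # average square feet from the chart

# My figure: divide, then multiply back to recover the average price
my_psf = avg_price / avg_sqft
assert my_psf * avg_sqft == avg_price   # $80 x 1400 gets back to $112,000

# The MLXChange figure does not survive the same round trip
mlx_psf = 82.12
implied_price = mlx_psf * avg_sqft
print(f"${mlx_psf} x {avg_sqft} sqft implies an average price of ${implied_price:,.0f}")
```

Only the $80 figure gets back to $112,000. Multiplying $82.12 by 1400 gives about $114,968, a price that appears nowhere on the summary line.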
Why do the different methods produce different results in the first place?
I don’t know for certain why the MLXChange method produces a result that can vary significantly from my method, but I suspect it may have something to do with the combination of rounding and averages in general. Maybe there is some invisible embedded calculus involved. Sometimes there is little if any variance at all between the result produced by each method, but the results are rarely if ever the same across the two methods. In homogeneous data sets with a narrow band of price and size, the numbers remain fairly close. In more diverse areas, such as Central Austin, where smaller homes can sell for more than bigger homes, and vice versa, the variance can look similar to what is shown in my example. Certainly, when evaluating data across the entire Austin market, a tremendous amount of diversity exists in the individual per square foot sold amounts.
Like I said, this is where my ability to articulate the problem falls apart from a scientific or mathematical standpoint. I won’t argue that if we add the numbers in the right column and then divide by 5, that the answer is $82.12. But 82.12 is not the result of dividing 1400 into $112,000, and that is what the result is supposed to be representing, so it’s wrong in that context.
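For readers who want the arithmetic behind the gap, here is one way to frame it, offered as a sketch rather than a formal proof. The symbols are my own shorthand: $p_i$ is the sold price of home $i$, $s_i$ is its square footage, $r_i = p_i / s_i$ is its price per square foot, and $n$ is the number of sales.

```latex
\[
\text{Average of ratios:}\quad \frac{1}{n}\sum_{i=1}^{n} r_i
\qquad\qquad
\text{Ratio of averages:}\quad
\frac{\tfrac{1}{n}\sum_{i=1}^{n} p_i}{\tfrac{1}{n}\sum_{i=1}^{n} s_i}
= \frac{\sum_{i=1}^{n} s_i\, r_i}{\sum_{i=1}^{n} s_i}
\]
```

Seen this way, average price divided by average size is a size-weighted average of the individual psf figures, while the other method gives every sale equal weight regardless of size. The two agree exactly only when size and price per square foot are uncorrelated within the sample, for instance when every home is the same size. In my chart the smallest home has the highest psf and the largest home has the lowest, so the equal-weight average lands above $80.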
Why is the Other Method Better?
My method is better because it’s more static and less susceptible to unexplainable swings. No matter what the variance is in sold prices, square footage sizes of homes, or how heterogeneous the neighborhoods from which the data is drawn, the answer is always the same relative to the numbers it represents. The relationship between the results can be visually understood and validated with simple math. In other words, Avg Sold Price divided by Average Square Footage always equals a constant result that can be reverse operated and validated with the other two numbers. A casual observer of a stats report or CMA would not have questions as to where that psf number came from or think that it might be wrong.
With the MLXChange computation method, the number produced can take wild swings relative to its relationship to the other two numbers. In other words, it’s not wrong in a consistent, predictable or observable way. Sometimes the number is way off from my simple method, sometimes it’s very close. Sometimes it’s way higher, sometimes it’s way lower. I find this maddening. Let’s look at an example of what I mean.
Let’s change ONLY the square foot size of TWO of the sales in our chart. I’m simply going to swap the first and fifth square foot numbers so that the average size and average sold price don’t change at all. The fields that do change as a result of this are marked with an asterisk (*) so you can see what I’m doing.

| | Square Feet | Sold Price | Sold $ per SQFT |
|---|---|---|---|
| House 1 | 1600* | $150,000 | $93.75* |
| House 2 | 1300 | $110,000 | $84.62 |
| House 3 | 1400 | $95,000 | $67.86 |
| House 4 | 1500 | $120,000 | $80.00 |
| House 5 | 1200* | $85,000 | $70.83* |
| MLXChange Avg | 1400 | $112,000 | $79.41* |
| My Average | 1400 | $112,000 | $80.00 |

Note that the average square footage and average sales price remain unchanged at 1400 and $112,000. So does my average sold price per square foot, at $80. But look at the average price per square foot from the MLXChange formula. Before, it was 2.65% higher than my computation, at $82.12. Now it’s 0.74% lower, at $79.41. That’s a spread of 3.39% according to the MLXChange formula for calculating average price per square foot. How can the same average data produce such varying results?
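The swap can be reproduced with a short sketch. The data is exactly what appears in my two charts. The function names are hypothetical and only describe the two methods as I understand them.

```python
def average_of_ratios(sales):
    """Average of the individual psf figures (the MLXChange approach, as described to me)."""
    return sum(price / sqft for sqft, price in sales) / len(sales)


def ratio_of_averages(sales):
    """Average price / average size; the shared 1/n cancels, leaving total price / total size."""
    total_price = sum(price for _, price in sales)
    total_sqft = sum(sqft for sqft, _ in sales)
    return total_price / total_sqft


# (square feet, sold price) for each house
original = [(1200, 150_000), (1300, 110_000), (1400, 95_000), (1500, 120_000), (1600, 85_000)]
# Same prices, with the sizes of House 1 and House 5 swapped
swapped = [(1600, 150_000), (1300, 110_000), (1400, 95_000), (1500, 120_000), (1200, 85_000)]

for label, sales in (("original", original), ("swapped", swapped)):
    print(
        f"{label:>8}: average of ratios ${average_of_ratios(sales):.2f}, "
        f"ratio of averages ${ratio_of_averages(sales):.2f}"
    )
```

The ratio of averages stays at $80.00 in both runs because the totals never change. The average of ratios moves from $82.12 to $79.41 because it depends on which price is paired with which size.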
Why does it Matter?
Often, our CMA analysis that we print out and use in determining listing or offer prices will include only a handful of comparable sold listings, such as in my sample set above.
In our market, correct or not, agents rely heavily on average price per square foot in setting or defending home prices and values. So do appraisers. As illustrated in my example above, minor changes in the data set can produce inexplicable swings in the price per square foot numbers in MLXChange. These numbers print at the bottom of our CMA reports, which we use to advise buyers and sellers. We are supposed to be informed and empowered by the numbers we see, not confused.
So, depending on whether an agent plucks out “Comparable Sold Property A” versus “Comparable Sold Property B”, to include in a group of 4 or 5 other comps, the CMA could suggest to Realtor A and Seller A that House A should be priced over or under the true market value.
And to Realtor B and Buyer B who are considering an offer on House A, their CMA may suggest a price per square foot 2 or 3 percent different (could be either higher or lower) than the CMA being relied upon by the other side of the deal.
With the simpler method, even if the sets of comps are slightly different in price and square footage, the per square foot amount will be less susceptible to variance and both sides of the deal can be measuring apples to apples.
I’m not comfortable with an MLS tool that leaves me scratching my head as to how one of my most important measures of home value is calculated. I don’t know from CMA to CMA when these swings can occur, when they apply, whether they are high or low from how I would do it, or whether the per square foot calculation is an accurate reflection of true market value. So I have resorted to verifying with a calculator.
Nice huh? MLXChange, our “state of the art” MLS Software, better than anything else that could have been purchased for us by our MLS, can’t be relied upon to evaluate the market value of a home. Yes, that chaps me.
So, to MLXChange I ask:
1) Who determined that this computation method is the superior and/or correct method to use for real estate sales analysis?
2) Can you explain why the variances occur and how they benefit us as Realtors?
3) Did our Board of Realtors instruct the computations to be done this way, or did some programmer just think that it would make sense to produce an average of the actual individual per square foot amounts?
4) Why is it better to have a number on a stats report or CMA that doesn’t correlate with the corresponding companion numbers and who told you that we have the time or desire to explain this to buyers and sellers when looking at our CMA reports?
5) Why can’t the price per square foot simply be calculated as I’ve suggested and what bad outcome would result if it were?