How to buy a car using data part III: The cheap-ass car version

Guestpost by Dev Nambi on May 20th, 2013

Remember when Dev introduced us to buying a car using data part I and II? He's back with part III, and this time it's all about buying cheap-ass cars.

My sister called me from her trusted car repair shop. Her '95 Ford Escort had been troublesome for months, and was now truly dead. The car's demise left my sister, her husband, and their two-year-old without transport. Worse, they had a 20+ mile commute to work, didn't have time off, and would be fired if they couldn't get to work at a moment's notice.

I had less than 72 hours to find a replacement car. So I broke out my data nerd skills and got to work.

My first step was to find out which features my sister cared about most in a car:

  • Space for a child seat and groceries
  • Reliable
  • Less than $5,000
  • Low operating cost: the yearly cost of running the car, including repairs, insurance, and gas.

I had researched how to buy a car using data. Sleuthing on Craigslist and AutoTrader revealed that vehicles this cheap are 9+ years old and have 100K+ miles. Many seemed of dubious reliability.

There was no way to know the reliability of a car from its description. That suggested there were both ripoffs and deals in the listings. This was an information asymmetry problem. The seller had perfect knowledge and the buyer had little.

Where to start: Make and Model

Internet sleuthing led to FleetBusiness, which reported how long different brands last before they die (are junked). I also found TrueDelta, which had reports from car owners about repairs, mileage and cost. Here's what I found in the FleetBusiness data:

[Chart: death_vs_age_half]

The brand that died off quickest was Suzuki. The brands that died off slowest were Toyota, Honda and Subaru. The die-off curve was not a straight line… it was an S-shape, like the cumulative normal distribution. Looking at the scrap rate per year, I saw a roughly normal distribution:

[Chart: incr_death_vs_age_half]

Most cars died between 10 and 20 years of age. The cars I was looking at were 10-13 years old, which was the worst possible age: the odds were good that the car I purchased would die within the next 5-10 years. However, any cars that had died before reaching that age weren't for sale, so I could exclude that percentage and look only at the cars that survived to age 10.

[Chart: death_vs_age_10_half]
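Excluding the cars that died before age 10 amounts to a conditional probability. Here is a minimal Python sketch of that idea, assuming the age at which cars are scrapped follows a normal distribution. The mean of 15 years and standard deviation of 4 years are made-up illustration values, not figures from the FleetBusiness data, and the function names are hypothetical.

```python
import math

def normal_cdf(x: float, mean: float, sd: float) -> float:
    """Cumulative normal distribution: share of cars scrapped by age x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sd * math.sqrt(2.0))))

def prob_dies_within(age_now: float, years: float, mean: float, sd: float) -> float:
    """P(scrapped before age_now + years | still running at age_now)."""
    dead_now = normal_cdf(age_now, mean, sd)
    dead_later = normal_cdf(age_now + years, mean, sd)
    # Divide by the share of cars still alive today: cars that already
    # died are not for sale, so they drop out of the denominator.
    return (dead_later - dead_now) / (1.0 - dead_now)

# ASSUMPTION: hypothetical brand parameters, not taken from the article's data.
MEAN_DEATH_AGE = 15.0
SD_DEATH_AGE = 4.0

for years in (1, 5, 10):
    p = prob_dies_within(10, years, MEAN_DEATH_AGE, SD_DEATH_AGE)
    print(f"Chance a 10-year-old car is scrapped within {years} years: {p:.0%}")
```

Running the same function with a different mean and standard deviation per brand would give a rough way to compare brands at the age they are actually for sale.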

The most reliable brands to buy at 10 years of age were Honda and Toyota, followed by Chrysler. I picked 6 models:

  • Toyota Corolla
  • Honda Fit
  • Honda Civic
  • Toyota Camry
  • Hyundai Sonata
  • Hyundai Elantra

I added two Hyundai models, the Elantra and Sonata, because I had heard their later-generation models were well-built. This was not data-driven, and it was foolish.

What's on sale?

I collected 117 car listings. My goal was to have enough listings that there were a few good deals.

The biggest cost of owning a car is depreciation: the difference between your purchase price and what you sell it for. Buying a cheap car that lasts a long time seemed the best way to reduce that cost.

I didn't care about mileage or age for their own sake. I wanted a car with as many miles remaining as possible, so I needed to find out how long each car model would last. If a car already has 125K miles, there's a big difference between a model that lasts 200K miles and one that lasts 150K miles. The 200K car will get you 3X farther.
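The 3X figure comes from comparing miles remaining, not total lifetime miles. A tiny Python sketch of that arithmetic, using the numbers from the paragraph above (the function name is just for illustration):

```python
def miles_remaining(expected_lifetime_miles: int, odometer_miles: int) -> int:
    """Miles left before the car is expected to die (never negative)."""
    return max(expected_lifetime_miles - odometer_miles, 0)

odometer = 125_000
long_lived = miles_remaining(200_000, odometer)   # model that lasts 200K miles
short_lived = miles_remaining(150_000, odometer)  # model that lasts 150K miles

print(f"200K-mile model: {long_lived:,} miles left")
print(f"150K-mile model: {short_lived:,} miles left")
print(f"Ratio: {long_lived / short_lived:.0f}X")
```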

I guessed mileage was roughly 5X more important than age. I also assumed maintenance costs would rise faster than linearly as mileage and age increased. I puzzled out an equation to compute a "quality score" for each car.

Score = fnNormalize ( Age^1.2 ) * 20% + fnNormalize ( Mileage^1.4 ) * 80%
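The post doesn't define fnNormalize, so here is a hedged Python sketch of one way the score could be computed. It assumes fnNormalize is a min-max scaling to the range 0 to 1, flipped so that the youngest, lowest-mileage car scores highest; that choice, the function names, and the sample listings are all assumptions for illustration, not the original implementation.

```python
def normalize_inverted(values):
    """Min-max scale to [0, 1], flipped so the smallest raw value scores 1.0.

    ASSUMPTION: stands in for the post's undefined fnNormalize.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]
    return [(hi - v) / (hi - lo) for v in values]

def quality_scores(ages, mileages):
    """Score = normalize(Age^1.2) * 20% + normalize(Mileage^1.4) * 80%."""
    age_part = normalize_inverted([a ** 1.2 for a in ages])
    mile_part = normalize_inverted([m ** 1.4 for m in mileages])
    return [0.2 * a + 0.8 * m for a, m in zip(age_part, mile_part)]

# Made-up listings: (age in years, odometer miles).
ages = [10, 12, 13]
mileages = [110_000, 150_000, 190_000]

scores = quality_scores(ages, mileages)
for age, miles, score in zip(ages, mileages, scores):
    print(f"{age} years, {miles:,} miles -> quality {score:.2f}")
```

With this normalization the scores are relative to the set of listings collected, so adding or removing a listing can shift every car's score.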

[Chart: quality_histogram_half]

The ratio of this score to the price is the "value score." Higher value scores were better deals:

[Chart: quality_vs_price_half]

Roughly, better-quality cars were more expensive. However, the relationship wasn't a straight line. There were ripoffs (in the upper left, with smaller dots) and potential deals (in the lower right, with larger dots).
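Ranking by value score is a simple sort once each listing has a quality score and a price. A minimal Python sketch, with a hypothetical Listing class and made-up listings (none of these numbers come from the 117 real ones):

```python
from dataclasses import dataclass

@dataclass
class Listing:
    name: str
    quality: float  # quality score, higher is better
    price: float    # asking price in dollars

    @property
    def value(self) -> float:
        """Value score: quality per dollar."""
        return self.quality / self.price

def best_deals(listings, n):
    """Return the n listings with the highest value score."""
    return sorted(listings, key=lambda item: item.value, reverse=True)[:n]

# Made-up example data.
listings = [
    Listing("A", 0.80, 4800),
    Listing("B", 0.75, 3500),
    Listing("C", 0.40, 4900),
    Listing("D", 0.60, 3000),
    Listing("E", 0.30, 2000),
    Listing("F", 0.55, 4500),
]

for item in best_deals(listings, 3):
    print(f"{item.name}: quality {item.quality:.2f} at ${item.price:,.0f}")
```

Note that listing A has the best quality but is not the best deal: B and D offer more quality per dollar.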

Let's go shopping!

Now I had a shopping list: the five cars with the highest value scores.

  • Car #1 had already sold, in under an hour.
  • We went to see car #3 at a nearby dealership. The test drive was illuminating: the car was junk. The brakes barely worked, the fan belt made a whistling sound, and the lowest gear didn't work… in an automatic. We left in a hurry.
  • I wasn't hopeful about car #2 after that first test drive, but was surprised when it handled well. The engine, brakes, and steering all worked perfectly. A roller-coaster route through West Seattle found no issues. We made plans for my trusted mechanic to look over the car.

Open your bonnet and say "vrooom"

The car and seller were legitimate. A check of the vehicle's VIN found no thefts or accidents.

The mechanic confirmed car #2 was in good working condition, except that it burned some oil when accelerating. Some hasty Internet searches suggested this was not unusual for old Toyota Corollas and didn't mean the engine was toast. We quickly bought the car. Success!

Epilogue

  • Work quickly. Good deals sell fast, in a day or two.
  • Hundreds of cars in Seattle were listed on Craigslist and AutoTrader each day.
  • The dealer car we tried was worse and more expensive than the private seller's car. A NADA report showed that dealerships' profit margins on used cars were 12%. A $5,000 dealer car would cost about $4,465 from a private seller.
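The dealer-versus-private comparison depends on how the 12% is applied. The Python sketch below works through two readings; which one the NADA figure actually means is an assumption here, and the variable names are only illustrative. The number quoted in the last bullet is close to the first reading.

```python
dealer_price = 5000.0
rate = 0.12

# Reading 1 (ASSUMPTION): the dealer adds 12% on top of the private-sale price.
private_if_markup = dealer_price / (1 + rate)

# Reading 2 (ASSUMPTION): 12% of the dealer's sticker price is profit.
private_if_margin = dealer_price * (1 - rate)

print(f"12% as markup on cost: ${private_if_markup:,.0f}")
print(f"12% as margin on price: ${private_if_margin:,.0f}")
```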



About Dev Nambi

I am a data nerd, developer and aspiring polymath. My day job is a developer for the University of Washington. The rest of the time I play board games, bike, and play with Legos.

http://devnambi.com