More information makes for better predictions, and better management of public budgets.
Do you know what your salary is going to be three months from now? Or nine months? If you live in federally subsidized housing, the U.S. government knows.
At the September 2012 Predictive Analytics World conference in Washington, D.C., Dave Vennergrund, director of business analytics for the business consultancy CACI, discussed how his firm helped the U.S. Department of Housing and Urban Development (HUD) anticipate the future cost of one of its key programs by forecasting how much renters would be able to afford in the future.
The Project-Based Rental Assistance program (PBRA) connects HUD to the owners of apartment buildings and housing developments. The building owners rent their apartments or houses directly to HUD. The government then rents those units to applicants and subsidizes the portion of the rent that the applicants can’t pay. Knowing what tenants will be able to pay in the future is key to funding the program.
CACI used time-series forecasting to predict subsidies, on an individual renter basis, by month. They used Office of Management and Budget (OMB) forecasts for wage, income, and cost of living, going out 36 months. Result: CACI can predict the subsidy that HUD will have to pay at the unit level, based on area demographics and the type of income that the tenant receives.
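The article does not describe CACI's actual model, but the basic arithmetic behind a unit-level subsidy projection can be sketched. The example below is only an illustration under stated assumptions: the function name `project_monthly_subsidy`, the single annual income-growth rate standing in for the OMB forecasts, and the 30% share of income that the tenant is assumed to pay are all hypothetical choices, not details from the talk.

```python
def project_monthly_subsidy(contract_rent, monthly_income, annual_income_growth,
                            months=36, tenant_share=0.30):
    """Project the subsidy for one unit, month by month.

    Assumptions (not from the article): the tenant pays a fixed share of
    income toward rent, and income grows at one constant annual rate that
    stands in for an external wage or income forecast.
    """
    subsidies = []
    for month in range(1, months + 1):
        # Compound the annual growth rate down to the current month.
        projected_income = monthly_income * (1 + annual_income_growth) ** (month / 12)
        # The tenant never pays more than the full contract rent.
        tenant_payment = min(contract_rent, tenant_share * projected_income)
        # The subsidy covers whatever portion of the rent remains.
        subsidies.append(round(contract_rent - tenant_payment, 2))
    return subsidies


# Hypothetical unit: $1,000 rent, $2,000 monthly income, 2% annual income growth.
forecast = project_monthly_subsidy(1000, 2000, 0.02)
```

Summing such projections across every unit in a contract would give a program-level estimate. It also shows why an error in the income forecast flows straight through to the subsidy figure.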
The system isn’t perfect. OMB forecasts can be off, and when they are, the rest of the model works less well. But HUD serviced 17,000 contracts with building owners, for 1 million units and 2.2 million tenants, at a cost of $9.2 billion in 2012. Even a slight improvement in forecasting the cost of the PBRA program can make a big difference. This, in part, is why the U.S. government got into the business of predicting future trends with massive data sets in the first place, and why it has recently moved more aggressively in this direction.
Big Data in Big Government
In March 2012, the Obama administration announced $200 million in big data research and development funding. Big data, meaning petabyte-scale stores of structured and unstructured data, has emerged as one of the most important business concepts of 2012 and 2013, gracing the covers of the Harvard Business Review, Scientific American, the New York Times, and the Wall Street Journal. But the U.S. government has been trying to extract useful insight from massive data sets for decades.
“Data mining wasn’t used as a business practice until the mid-1990s; we were using it at the dawn of the computer age,” said Dean Silverman of the Internal Revenue Service. The IRS processes 140 million individual tax returns a year. Auditing even a small fraction of these returns can be extremely costly, so knowing which returns to flag for possible audit when they come in is the key to managing costs. Silverman didn’t disclose how the IRS does that, but did indicate that bad tax returns fit certain patterns that individual agents might not notice, while cutting-edge machine-learning programs pick them up fairly easily.
Using a database of seven years of adjudicated claims to train its machine-learning system, the Centers for Medicare and Medicaid Services now knows right away when one of the doctors who get payments from Medicaid dies, loses his or her license, and so on. This cuts down considerably on fraud, a principal goal of the fraud-prevention program mandated by Health and Human Services Secretary Kathleen Sebelius in 2010.
The agency now says that the false-positive rate on its predictive models is 22%, meaning that 22% of the providers it flags turn out to be legitimate, so when a health-care provider is flagged as a fraudster, there’s a 78% chance the agency is right. And Medicare can now detect fraud at the first instance, for amounts as low as $4,000.
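The 78% figure follows from the 22% only if the 22% is read as the share of flagged providers who turn out to be legitimate. Statisticians usually call that the false discovery rate. The textbook false-positive rate is measured against all legitimate providers and can be a very different number. The sketch below illustrates the distinction with entirely made-up counts; the function name `flag_metrics` and every number passed to it are hypothetical and are not Medicare data.

```python
def flag_metrics(true_pos, false_pos, true_neg, false_neg):
    """Compute three rates from the four confusion-matrix counts."""
    flagged = true_pos + false_pos       # everyone the model flagged
    legitimate = false_pos + true_neg    # everyone who is actually legitimate
    return {
        # Chance that a flagged provider really is fraudulent.
        "precision": true_pos / flagged,
        # Share of flags that were wrong (1 - precision).
        "false_discovery_rate": false_pos / flagged,
        # Share of legitimate providers who were wrongly flagged.
        "false_positive_rate": false_pos / legitimate,
    }


# Hypothetical counts chosen so that 22 of every 100 flags are wrong.
metrics = flag_metrics(true_pos=78, false_pos=22, true_neg=9878, false_neg=22)
```

With these illustrative counts, 78% of flags are correct even though only a small fraction of one percent of legitimate providers are ever flagged. A reported rate therefore needs its denominator stated.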
“We call that a huge success,” said David Nelson, director of Medicare’s Data Analytics and Control Group.
Medicare recently built a new central command room that looks like something from a spy movie, and the agency brings in law enforcement professionals from other agencies to attack specific types of fraud. These sound like big improvements, but for most agencies, the challenge, and the amount of data, will far outstrip the ability of budget-constrained government workers to deal with them for the foreseeable future.
As originally reported by Bloomberg News, Senators Orrin G. Hatch (R-Utah) and Tom Coburn (R-Oklahoma) have sent letters urging the Centers for Medicare and Medicaid Services to be more open about the performance metrics for the predictive analytics program. The absence of such metrics makes it difficult to judge the success of the program, the senators told Bloomberg. But establishing such metrics won’t be easy.
“We don’t even know how much fraud there is in Medicare,” said Nelson.
Big data, it seems, can create more problems than it solves. -Patrick Tucker
Source: Predictive Analytics World, Washington, D.C., September 17-18, 2012, http://www.predictiveanalyticsworld.com/gov/2012/.
[Editor’s note: FUTURIST deputy editor Patrick Tucker’s forthcoming book, A Future Ever Certain: How the Science of Prediction Will Change the Way We Live, Work, and Love, will be published by Current, an imprint of Penguin, in 2013.]
Originally published in THE FUTURIST, January-February 2013