The fight to make New York City's complex algorithmic math public
In New York City, government bureaucrats use algorithms to help make decisions on where students are assigned to school, whether a suspect is allowed out of jail and which buildings should be targeted by inspectors.
But amid the growing reliance on technology and complicated formulas to deploy resources and make critical decisions, criminal justice advocates and watchdog groups are seeking more transparency and input over the way they’re used.
Legislation recently introduced in the New York City Council, which may be the first of its kind in the country, would peek under the hood of tools being deployed across the city and help legislators and citizens shed light on any biases in the computations. The bill’s sponsor, City Councilman James Vacca, said his legislation, which would require agencies to publish the source code of the algorithms they use for targeting and penalizing individuals, has “touched a nerve” and started a conversation about the use of such tools.
“I want people to know how data is analyzed and how data is utilized,” he told New York Nonprofit Media. “The governance of data is going to be increasingly important to our society going forth. Yet, very few places in our country are discussing algorithms, and they’re not discussing the collection of data and then how it’s used.”
In the 1970s, the New York City-RAND Institute developed a formula for the New York City Fire Department that officials reportedly used to justify closing fire stations in the Bronx, just as a wave of arson swept across the borough. In the 1980s, Vacca, then a district manager in the borough, also had requests for more police officers stymied because of formulas, experiences he said helped inspire the bill. “To this day, I don’t know what is the formula the police use to determine how many officers are in every station house,” he said.
Vacca’s bill would apply to every city agency, and it would mandate that agencies post the algorithms on their websites. Criminal justice advocates have been some of its loudest backers because of the direct effect the legislation would have on their clients, and because the biases of some algorithms have already been well-publicized.
In the criminal justice system, these formulas, or “risk assessment instruments,” weigh a number of factors, such as previous arrests, to determine whether someone arrested for a particular crime is likely to return for court dates or should get bail under certain conditions.
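One way to picture such an instrument is as a weighted checklist. The sketch below is illustrative only and does not describe any tool used in New York City: the factor names, point values and cutoffs are all invented for this example, and real instruments may work quite differently.

```python
# Hypothetical factors and point values, invented for illustration only.
# They are NOT taken from any real risk assessment instrument.
HYPOTHETICAL_WEIGHTS = {
    "prior_arrests": 2,             # points added per prior arrest
    "prior_missed_court_dates": 3,  # points added per missed court date
    "currently_employed": -1,       # points subtracted if employed (1) vs. not (0)
}


def risk_score(person):
    """Multiply each factor's value by its weight and add up the results.

    Factors missing from `person` are treated as zero.
    """
    return sum(
        weight * person.get(factor, 0)
        for factor, weight in HYPOTHETICAL_WEIGHTS.items()
    )


def risk_band(score, low_cutoff=3, high_cutoff=7):
    """Translate a numeric score into a label using assumed cutoffs."""
    if score < low_cutoff:
        return "low"
    if score < high_cutoff:
        return "medium"
    return "high"


if __name__ == "__main__":
    person = {
        "prior_arrests": 2,
        "prior_missed_court_dates": 1,
        "currently_employed": 1,
    }
    score = risk_score(person)
    print(score, risk_band(score))
```

Even in a toy version like this, the choice of factors, weights and cutoffs determines who lands in which band, which is why advocates want those details made public.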
Proponents say the tools can keep the public safe. A working paper released in February by the National Bureau of Economic Research examined simulations of the tools using arrest data from New York City between 2008 and 2013 and found crime could be reduced by nearly a quarter with no change in jailing rates, and the number of people detained in jails could be reduced by 42 percent with no increase in the crime rate.
But critics warn that these technologies have built-in flawed assumptions that can lead to continued bias against affected communities.
In an example highlighted by ProPublica last year, an algorithm wrongly considered a black woman who briefly took a bike from a neighbor’s yard to be more likely to commit a crime in the future than a white man arrested for shoplifting who had a lengthy criminal record. The judge set a higher bond for the black woman, but did not recall whether her “risk score” affected his decision.
Rashida Richardson, the legislative counsel at the New York Civil Liberties Union, said she’s concerned about how the algorithms are used. Even if the formulas are written in a way to reduce any bias, the end users – the civil servants, for example – might not fully understand how to wield the results, or might rely too heavily on them. “It’s possible with those systems that the person making the decision will just rely on what the system shoots out, rather than using any human judgement,” Richardson said.
It’s a delicate balance. Richardson acknowledged that there could be benefits to using the algorithms, but there’s a need for legislation that raises the level of data transparency across agencies without violating individual rights and while retaining each tool’s effectiveness.
“Is there a one-size-fits-all regulation model that can be developed for all of these different agency uses?” Richardson asked. “Or is it going to have to be agency-specific, where NYPD has one form of regulatory oversight and then DOE will have another, because of not only the population it’s serving, but how it’s being used?”
Richardson also said New York largely lags behind other progressive-leaning states when it comes to releasing data. “A lot of the times, I feel like we’re chasing behind California or Massachusetts, or other fairly progressive states,” she said.
Attorneys with Brooklyn Defender Services, a public defender organization representing nearly 40,000 people each year, said that possible indicators of flight risk – such as homelessness, employment status, school enrollment, previous convictions or imprisonment – can be discriminatory because of the societal factors that lead to those indicators.
Currently, judges are only able to consider whether someone is at risk of fleeing when setting bail conditions. But the city is supportive of allowing judges to also consider public safety.
Yung-Mi Lee, a supervising attorney specializing in criminal defense at BDS, said it was important to alert people to this technology and how it is used before it expands to the point where people are singled out even before they commit a crime. “That’s also the inherent danger of risk assessment instruments: That it will allow for the detention of people that have not even committed that future crime yet,” she said.
Scott Hechinger, the senior staff attorney and director of policy at Brooklyn Defender Services, said, “Any time there’s a criminal justice reform conversation, public defenders and clients – the people affected by the practice – should be at the table. And unfortunately we’re not called upon enough, our voices are not listened to enough.”
Getting updated risk assessment tools into the hands of judges is the first strategy listed in New York City Mayor Bill de Blasio’s plan to close the Rikers Island jail complex and reduce the city’s jail population.
Elizabeth Glazer, director of the Mayor’s Office of Criminal Justice, said researchers are helping redesign the risk assessment tools for judges, with the goal of showing advocates and citizens the processes that go into the next generation of tools. “The big change for us is really just to make the process as open as possible, to ultimately post the data that underlies the tool publicly so that people can see for themselves how the tools operate,” she said. (Some of that data is already available to the public.)
Glazer’s office is seeking a partnership with ideas42, a nonprofit behavioral economics firm, to help redesign the “failure to appear” risk assessment tool to add clarity to how judges see and weigh the results in order to ensure they aren’t solely relying upon the tools. “Judges are humans. They’re not machines. And you truly don’t want simply this algorithm to rule a judgement,” Glazer said. “A judge can see all kinds of things and an algorithm can’t.”
Formulas and technologies have long shaped how police are deployed in the city, most notably with the introduction of CompStat in the 1990s, which helped police predict where an incident is likely to occur and is credited with helping to drive down the astronomical crime levels of that era. Today, the predictive policing technology is based on historical data and dozens of data points, making the assessments that much more granular.
Another City Council bill, backed by Council members Dan Garodnick and Vanessa Gibson, seeks to make public more information about the use, capabilities, guidelines and training surrounding the NYPD’s surveillance technology.
Academics and civil liberties groups have also been seeking information on how predictive policing tools are used, arguing that they create feedback loops that send more police to areas that already have a high concentration of police, based on the increased number of infractions and crimes that police observe. Critics also argue there is little evidence that such efforts reduce crime; a study on an algorithm-based crime prevention program in Chicago found it didn’t save any lives.
The Brennan Center for Justice has been fighting to learn more about these technologies. A legal claim against the NYPD is pending after it withheld some information about a $2.5 million software contract with Palantir that provides, among other services, predictive policing tools.
The AI Now Institute, a collection of researchers at New York University, issued a report earlier this year that encouraged governments to eschew “black box” tools in favor of openness, test appropriately for any bias and encourage staff with diverse backgrounds and from various specialties to help develop and test the algorithms.
For its part, the city does share a lot of information. The city’s open data portal makes a tremendous amount of data generated by city agencies available to the public. The Mayor’s Office of Data Analytics even put online some of the analytics tools it used to identify which cooling towers to inspect after a 2015 outbreak of Legionnaires’ disease.
During an October New York City Council hearing on the measure, city officials said publishing the source code that companies use to generate these algorithms could allow people to hack the systems, and would have a chilling effect on technology vendors looking to do business with the city. There is precedent for some open-source developers to share their code for predictive policing tools, but many vendors worry that sharing their code could reduce their competitive advantage.
Vacca has suggested that the city appoint an expert who can gauge the openness and fairness of city agencies’ use of formulas. It isn’t clear yet that Vacca’s broad 150-word legislative proposal will ultimately strike the balance of revealing the factors behind these formulas while retaining the confidence of the city’s tech staffers and external partners.
Vacca said he is taking a look at the officials’ comments and will work to incorporate those concerns into the legislation, particularly because it could influence how similar rules are formed in other jurisdictions. “I realize that because this bill is tackling things that have not been tackled throughout the nation, I realize that we have to be very deliberate,” he said.
As Noel Hidalgo of BetaNYC, a technology civic organization, said during the hearing on Vacca’s bill, “If we refuse to hold algorithms and their authors accountable, we outsource our government to the unknown.”