Last year, Chinese police arrested a man at a pop concert after he was flagged as a criminal suspect by a facial recognition system installed at the venue. The software that called the cops was developed by Shanghai startup Yitu Tech. It was marketed with a stamp of approval from the US government.
Yitu is a top performer on a testing program run by the National Institute of Standards and Technology that’s vital to the fast-growing facial recognition industry. More than 60 companies took part in the most recent rounds of testing. The rankings are dominated by entrants from Russia and China, where governments are bullish about facial recognition, and relatively unconcerned about privacy.
“It’s considered the industry standard and users rely on NIST’s benchmark for their business decisions and purchases,” says Shuang Wu, a Yitu research scientist and head of Yitu’s Silicon Valley outpost. “Both Chinese and international customers ask about it.”
Yitu’s technology is in use by police, and at subway stations and ATMs. It’s currently ranked first on one of NIST’s two main tests, which challenges algorithms to detect when two photos show the same face. That task is at the heart of systems that check passports or control access to buildings and computer systems.
The next five best-performing companies on that test are Russian or Chinese. When the State Department last June picked Paris-based Idemia to provide software used to screen passport applications, it said it had chosen “the most accurate non-Russian or Chinese software” to manage the 360 million faces it has on file.
In a subsequent round of tests, US startup Ever AI ranked seventh, making it the top-performing company outside Russia and China. “Ever since the NIST results came out there’s been a pretty steady stream of customers,” including new interest from government agencies, says Doug Aley, Ever AI’s CEO.
NIST is an arm of the US Commerce Department with the mission of promoting US competitiveness by advancing the science of measurement. Its Face Recognition Vendor Test program began in 2000, with the support of the Pentagon, after numerous US agencies became interested in using the technology.
Since then, NIST has tracked the steady improvement in algorithms designed to scrutinize human physiognomy, and developed new testing regimes to keep up. The agency now tests algorithms in a subterranean computer room in Gaithersburg, Maryland, using millions of anonymized mugshots and visa photos sourced from government agencies. Its results show that accuracy has improved significantly since the emergence of the neural network technology driving the tech industry’s current AI obsession.
The other NIST test simulates the way facial recognition is used by police investigators, asking algorithms to search for a specific face in a sea of many others. In 2010, the best software could identify someone in a collection of 1.6 million mugshots about 92 percent of the time. In a late 2018 version of that test the best result was 99.7 percent, a nearly 30-fold reduction in error rate.
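The arithmetic behind that comparison is easy to check. The short Python sketch below is only an illustration that uses the rounded percentages quoted above; the function name is made up for this example and is not part of any NIST tool.

```python
def error_reduction_factor(old_accuracy, new_accuracy):
    """Return how many times smaller the new error rate is than the old one.

    Accuracies are fractions between 0 and 1, so the error rate is 1 - accuracy.
    """
    old_error = 1.0 - old_accuracy
    new_error = 1.0 - new_accuracy
    return old_error / new_error


# Rounded figures from the article: about 92 percent in 2010, 99.7 percent in late 2018.
factor = error_reduction_factor(0.92, 0.997)
print(f"Error rate fell by a factor of about {factor:.1f}")
```

With these rounded inputs, the miss rate drops from about 8 percent to 0.3 percent, a factor of roughly 27, which is in line with the "nearly 30-fold" description.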
The best performer on that test is Microsoft, which was scored by NIST for the first time in November. The next three best entrants were Russian and Chinese, with Yitu fourth. Ever AI came fifth. Of the more than 60 entrants listed in NIST’s most recent reports whose home base could be identified, 13 were from the US, 12 from China, and 7 from Russia.
For companies outside of Russia and China, doing well on NIST’s rankings opens the door to contracts with the US government. “Federal agencies don’t make buying decisions without checking with NIST,” says Benji Hutchinson, vice president of federal operations at NEC. The company has facial recognition contracts with the departments of State, Homeland Security, and Defense, and its technology is being tested to check the identity of international passengers at several US airports.
Microsoft President Brad Smith touted the company’s new NIST results in a December blog post that called for federal regulations on the technology and highlighted the importance of independent testing. The company declined to answer queries about its decision to enter the program and its interest in government facial recognition contracts, but it defended government use of the technology in recent testimony opposing a Washington state bill that would restrict facial recognition.
IBM and Amazon both sell facial recognition to local US law enforcement agencies, but neither has submitted its technology to NIST’s testing. Amazon said in January it respects NIST’s test but that its technology is deeply integrated with Amazon’s cloud computing platform and can’t be sent off to Gaithersburg for the agency to test on its own computers.
IBM computer vision research manager John Smith said the company was working with NIST to broaden its testing of how well facial recognition works across different demographics before deciding whether to take part.
Tech companies and their critics have become more concerned about demographic bias in facial recognition after experiments showed that Amazon’s technology made more errors on black faces, and that facial analysis software from IBM and Microsoft was less accurate for women with darker skin. Amazon disputes the findings, and Microsoft and IBM say they have upgraded their systems.
Os Keyes, a researcher at the University of Washington, says findings like those help show that facial recognition must be scrutinized more broadly than through lab tests of accuracy.
Keyes published a paper last year criticizing NIST and others for contributing to the development of gender recognition software that doesn’t account for trans people, potentially causing problems for an already marginalized group. A 2015 NIST report on testing gender recognition software suggested that the technology could be used in alarm systems for women’s bathrooms or locker rooms to alert if a man enters. “NIST needs to employ ethicists or sociologists or qualitative researchers that could go out and look at the impact of these technologies,” Keyes says.
Patrick Grother, one of the NIST scientists leading the testing exercise, says his group is expanding its testing of demographic differences in facial recognition technology, and helping address potential flaws in the technology in its own way.
Although discussion of racial and gender bias has grown, more work is needed on figuring out how to test and measure it, he says. NIST can help the industry address any problems by advancing the science of detecting and tracking them. “We try and bring sunlight and oxygen to the marketplace,” Grother says.
President Trump appears to want NIST to take a more active role in sustaining the development of artificial intelligence. An executive order he signed last month to encourage AI development in the US directed the agency to develop standards and tools to encourage “reliable, robust, and trustworthy” AI systems.