Ahead of 2020, Facebook falls short on plan to share data on disinformation
IN APRIL 2018, Mark Zuckerberg, Facebook's chief executive, told Congress about an ambitious plan to share huge amounts of posts, links and other user data with researchers around the world so that they could study and flag disinformation on the site.
"Our goal is to focus on both providing ideas for preventing interference in 2018 and beyond, and also for holding us accountable," Zuckerberg told lawmakers questioning him about Russian interference on the site in the 2016 presidential election. He said he hoped "the first results" would come by the end of that year.
But nearly 18 months later, much of the data remains unavailable to academics because Facebook said it has struggled to share the information while also protecting its users' privacy. And the information the company eventually releases is expected to be far less comprehensive than originally described.
As a result, researchers said, the public may have little more insight into disinformation campaigns on the social network heading into the 2020 presidential election than they had in 2016. Seven non-profit groups that have helped finance the research efforts, including the Knight Foundation and the Charles Koch Foundation, have even threatened to end their involvement.
BuzzFeed News earlier reported on researchers' concerns over delays in Facebook's data-sharing project.
"Silicon Valley has a moral obligation to do all it can to protect the American political process," said Dipayan Ghosh, a fellow at the Shorenstein Center at Harvard and a former privacy and public policy adviser at Facebook. "We need researchers to have access to study what went wrong."
Political disinformation campaigns have continued to grow since the 2016 campaign. Last week, Oxford researchers said that the number of countries with disinformation campaigns more than doubled to 70 in the last two years, and that Facebook remained the No 1 platform for those campaigns.
But while company executives express an eagerness to prevent the spread of knowingly false posts and photos on the social network, by far the world's largest, they also face numerous questions about their ability to secure people's private information.
Revelations last year that Cambridge Analytica, a political consulting firm, had harvested the personal data of up to 87 million Facebook users set off an outcry in Washington.
In the months after the scandal, Facebook cut off many of the most common avenues for researchers to access information about the more than two billion people on the service. This past July, it also agreed with federal regulators to pay US$5 billion for mishandling users' personal information.
"At one level, it's difficult as there's a large amount of data and Facebook has concerns around privacy," said Tom Glaisyer, chairman of the group of seven nonprofits supporting the research efforts. "But frankly, our digital public square doesn't appear to be serving our democracy," added Mr Glaisyer, who is also managing director of the Democracy Fund, a nonpartisan group that promotes election security.
Three months after Mr Zuckerberg spoke in Washington last year, Facebook announced plans to provide approved researchers with detailed information about users, like age and location, where a false post appeared in their feeds and even their friends' ideological affiliation. Dozens of researchers applied to get the information.
The company partnered with an independent research commission, Social Science One, which had been set up for the initiative, to determine what information could be sent to researchers. Facebook and Social Science One also brought in the Social Science Research Council, an independent nonprofit organisation that oversees international social science research, to sort through applications from academics and conduct a peer review and an ethical review on their research proposals.
But privacy experts brought in by Social Science One quickly raised concerns about disclosing too much personal information. In response, Facebook began trying to apply what's known in statistics and data analytics as "differential privacy", in which researchers can learn a lot about a group from data, but virtually nothing about a specific individual. It is a method that has been adopted by directors at the Census Bureau and promoted by Apple.
Facebook is still working on that effort. But researchers said that even when Facebook delivers the data, what they can learn about activity on the social network will be much more limited than they planned for.
"We and Facebook have learned how difficult it is to make" a database that was not just privacy-protected but at a "grand scale", said Nate Persily, a Stanford law professor and co-founder of Social Science One.
Facebook said researchers had access to other data sets, including from its ads archive and CrowdTangle, a news-tracking tool that Facebook owns. Two researchers said they and others visited Facebook's headquarters in California in June to learn how to study the available data set.
And both Facebook and Social Science One said they would continue to make more data available to researchers in time. In September, the two released 32 million links that included data about whether users labelled millions of posts as fake news, spam or hate speech, or whether fact-check organisations raised doubts about the posts' accuracy. The release also included how many times stories were shared publicly and the countries where the stories were most shared.
Facebook's effort is a "tremendous step forward", said Joshua Tucker, a professor at New York University studying the spread of polarising content across multiple platforms. "In the long term, if methods for making these data available for outside research are successfully implemented, it will have a very positive impact."
But other researchers said the existing databases are severely limiting. And some said that Facebook's concerns about privacy are overblown.
Ariel Sheen, a doctoral student at Universidad Pontificia Bolivariana in Medellin, Colombia, whose research team has been through the Social Science One approval process but has not yet received the data, said his group has, on its own, uncovered hints of a large coordinated campaign in Venezuela.
His group believes it has found more than 3,000 still-active fake Facebook accounts - profiles run by people impersonating others, for example - that are spreading false information. The accounts, Mr Sheen said, are tied to Telesur, a Latin American television network largely financed by the Venezuelan government.
But because Facebook is not providing the original data described, Mr Sheen noted, his team's work cannot proceed as planned. "We believe that it is imperative for our research to continue as was originally agreed to by Facebook," he added. NYTIMES