Birds of a feather flock together. The popular saying applies to politics and to the computational analysis of complex networks in research on corruption, judging from a study by scientists at the University of São Paulo in Brazil.
According to the authors, it is possible to predict whether deputies (members of the Câmara dos Deputados, the lower house of Congress in Brazil) will be convicted of corruption or white-collar crime in the future by analyzing the similarity between their voting records and those of already convicted lawmakers.
The study is published as a chapter of the book Corruption Networks. The researchers analyzed the voting history and consonance of 2,455 politicians elected to the lower house between 1991 and 2019, in a total of 3,407 sessions involving votes on bills covering a wide array of subjects.
“The surprising aspect of the study is that we didn’t need to use data from cases tried by the law courts to find this correlation between voting history and corruption. We used only house voting records to create networks showing what we call ‘voting vicinity’ in terms of how and with whom deputies voted. On this basis, our model can predict whether a deputy is corrupt with 90% accuracy,” said Tiago Colliri, a co-author of the study.
The analysis was conducted during Colliri’s PhD research at USP’s São Carlos Department of Computer Science, with a scholarship from the National Council for Scientific and Technological Development (CNPq). FAPESP provided support via a Thematic Project and the Center for Artificial Intelligence (C4AI), a partnership between FAPESP and IBM.
The analysis of complex networks has been applied in many fields, including the study of biological neural networks and food chains. In crime-related studies, the aims have ranged from finding a correlation between social capital and the risk of corruption in local government contracts to identifying hidden links among the members of an Italian mafia group.
To understand the approach, it helps to bear in mind that complex networks are large-scale graphs with non-trivial connection patterns. “One of the main features of any complex network is the modeling of various types of relationship between nodes – or deputies, in the case of our study. These may be local, intermediate or global relationships. Here we set out to identify the relationship between deputies and how they voted in Congress, and our method was highly accurate for the purpose of prediction,” said Zhao Liang, a professor in the Department of Computing and Mathematics at USP’s Ribeirão Preto School of Philosophy, Science and Letters (FFCLRP) and the other co-author of the study.
After creating a network based on the voting histories of almost 2,500 deputies, the researchers noted that some deputies whose corruption convictions were widely known had voting histories similar to those of other deputies. “In this kind of network analysis, each node represents a deputy and each edge stands for the similarity of votes between a pair of deputies,” Colliri explained.
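To make the idea of a voting network concrete, here is a minimal Python sketch of how one might be built. It is not the authors' implementation. The deputies, the votes, the similarity measure (share of identical votes in sessions both deputies attended) and the 0.7 edge threshold are all assumptions chosen for illustration.

```python
from itertools import combinations

# Hypothetical roll-call data: deputy -> {session_id: vote}.
# Names and votes are invented for illustration only.
votes = {
    "A": {1: "yes", 2: "no", 3: "yes", 4: "yes"},
    "B": {1: "yes", 2: "no", 3: "yes", 4: "no"},
    "C": {1: "no", 2: "yes", 3: "no", 4: "no"},
    "D": {1: "yes", 2: "no", 3: "yes"},  # absent from session 4
}


def vote_similarity(a, b):
    """Fraction of shared sessions in which two deputies cast the same vote."""
    shared = a.keys() & b.keys()
    if not shared:
        return 0.0
    return sum(a[s] == b[s] for s in shared) / len(shared)


def build_network(votes, threshold=0.7):
    """Return {deputy: {neighbor: similarity}}, keeping edges at or above threshold.

    The threshold is an assumed parameter, not a value from the study.
    """
    graph = {deputy: {} for deputy in votes}
    for u, v in combinations(votes, 2):
        sim = vote_similarity(votes[u], votes[v])
        if sim >= threshold:
            graph[u][v] = sim
            graph[v][u] = sim
    return graph


network = build_network(votes)
for deputy, neighbors in network.items():
    print(deputy, neighbors)
```

In this toy data, deputies A, B and D end up linked to one another, while C, who votes the opposite way in most sessions, is left without edges.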
The researchers detected a pattern in the network. “The pattern showed voting vicinity or similarity among deputies whose convictions had been reported in the media. There was consonance in their voting histories,” he said.
To validate the finding, they assembled a separate database with data on deputies convicted of corruption taken from sources such as Brazil’s Supreme Court (STF). “With this secondary database, we verified 33 individuals who had been convicted, and they were not scattered but clustered in the network,” Colliri said. “They formed a pattern of what we call ‘corruption neighbors’. We then tested this map using a number of link prediction algorithms based on common neighbors. The algorithms proved capable of predicting whether a deputy is corrupt with 90% accuracy.”
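One way to picture a common-neighbors score is sketched below in Python. It is an illustration only, not the algorithms used in the study. It assumes that conviction status is modeled as a hypothetical extra node linked to every known convicted deputy, so that the number of neighbors a deputy shares with that node equals the number of convicted deputies among the deputy's own neighbors. The graph and the set of convicted deputies are invented.

```python
# Hypothetical voting network as adjacency sets (undirected, unweighted).
graph = {
    "A": {"B", "D"},
    "B": {"A", "D", "E"},
    "C": {"F"},
    "D": {"A", "B", "E"},
    "E": {"B", "D"},
    "F": {"C"},
}

# Hypothetical set of deputies with known convictions.
convicted = {"A", "B"}


def common_neighbors(g, u, v):
    """Nodes adjacent to both u and v."""
    return g[u] & g[v]


def conviction_scores(graph, convicted, label="__convicted__"):
    """Score each unlabeled deputy by its common neighbors with a label node.

    The label node is an assumed modeling device: it is connected to every
    known convicted deputy, so a high score means many convicted neighbors.
    """
    g = {node: set(neighbors) for node, neighbors in graph.items()}
    g[label] = set(convicted)
    for deputy in convicted:
        g[deputy].add(label)
    return {
        node: len(common_neighbors(g, node, label))
        for node in graph
        if node not in convicted
    }


scores = conviction_scores(graph, convicted)
for deputy, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(deputy, score)
```

Here deputy D, who is adjacent to both convicted deputies, receives the highest score, while C and F, who sit in a separate part of the network, score zero. Turning such scores into a yes/no prediction would require a cutoff, which this sketch does not attempt to choose.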
One of the conclusions to be drawn from the study is that corruption in Congress can be monitored in a simpler manner. “We discovered that corrupt deputies vote similarly in our Congress, so that a predictive model can be obtained more simply and monitoring can be much easier to do. It’s far more straightforward to analyze this data than search among lawsuits, criminal trials, media reports, and even family trees,” Zhao said.