Scientific authorship and collaboration network analysis on malaria research in Benin: papers indexed in the web of science (1996–2016)
Global Health Research and Policy volume 3, Article number: 11 (2018)
Abstract
Background
To sustain the critical progress made, prioritization and a multidisciplinary approach to malaria research remain important to the national malaria control program in Benin. To document the structure of the malaria collaborative research in Benin, we analyze authorship of the scientific documents published on malaria from Benin.
Methods
We collected bibliographic data from the Web of Science on malaria research in Benin from January 1996 to December 2016. From the collected data, we generated a multigraph coauthorship network in which vertices represent authors. An edge was drawn between two authors whenever they coauthored a paper. We computed vertex degree, betweenness, closeness, and eigenvector centrality, among other measures, to identify prolific authors. We further assessed the weak points of the network and how information flows through it. Finally, we performed a hierarchical clustering analysis and Monte Carlo simulations.
Results
Overall, 427 publications were included in this study. The generated network contained 1792 authors and 116,388 parallel edges, which converted into a weighted graph of 1792 vertices and 95,787 edges. Our results suggested that prolific authors with higher degrees tend to collaborate more. The hierarchical clustering revealed 23 clusters, seven of which form a giant component containing 94% of all the vertices in the network. This giant component has all the characteristics of a small-world network, with a small average shortest path distance of three between pairs of vertices, a diameter of 10 and a high clustering coefficient of 0.964. However, Monte Carlo simulations suggested that our observed network is an unusual type of small-world network. Sixteen vertices were identified as weak articulation points within the network.
Conclusion
The malaria research collaboration network in Benin is a complex network that seems to display the characteristics of a small-world network. This research reveals the presence of closed research groups where collaborative research likely happens only between members. Interdisciplinary collaboration tends to occur at higher levels between prolific researchers. Continuously supporting and stabilizing the identified key brokers and most productive authors in the malaria research collaboration network is an urgent need in Benin. Doing so will foster the malaria research network and ensure the promotion of junior scientists in the field.
Background
Malaria remains one of the three major public health concerns in sub-Saharan Africa, where it affects millions of people and negatively impacts their socioeconomic life [1]. In the Millennium Declaration, malaria was given special attention with regard to the successful achievement of the sixth Millennium Development Goal [2]. In Benin, initiatives such as the US President’s Malaria Initiative have supported governmental and nongovernmental organizations in reducing malaria-related mortality and morbidity [3, 4]. With this financial support, such efforts in Benin have led to a sharp increase in public health interventions and to many positive public health outcomes in terms of reduced malaria-related mortality and morbidity [5]. This increase in public health interventions translated into the successful implementation and sustainability of entomological surveillance of malaria for more than six years since 2008 [6]. Between 2000 and 2009, the increase in funding led to an annual decrease of 5.2% in the incidence of malaria and 5.3% in malaria-related deaths [7]. These encouraging success stories have even motivated other authors to set out an ambitious malaria eradication plan [8].
Despite the progress against malaria, very little is known about the dynamics of the malaria research collaboration network. As a result, information is lacking on the main players and drivers of the progress made. As with the eradication of smallpox [9], collaborative research will undoubtedly play an important role in the successful attainment of the malaria eradication plan in sub-Saharan Africa in general and in Benin in particular. By collaborating with each other, researchers form continuous and sustainable collaborations through intensive network practices that go beyond regional boundaries [10]. In addition, the fact that extensive research has not prevented malaria from outpacing the proposed solutions is a strong reason to investigate the structure of the malaria research community. Research collaboration constitutes a stable basis for providing evidence-based information for the formulation of fundamental principles and guidelines for public health strategies. Therefore, we propose in this study to document, describe and analyze the different aspects of malaria research collaboration in Benin.
Understanding the structure of this network is crucial since it can help improve research prioritization [11], identify prolific researchers, improve the design, strategic planning and implementation of research programs [12], and promote cooperation and translational research initiatives [13]. We chose a social network analysis approach, which can reveal previously undocumented knowledge about researchers’ efforts to work together towards reducing the burden of malaria in Benin.
Our study focuses on the analysis of scientific collaborations through coauthorship network analysis. Its aim is to document the structure of malaria collaborative research in Benin.
Methods
Data collection
The data collection was carried out on papers indexed in Thomson’s Institute for Scientific Information Web of Science (formerly known as the Web of Knowledge). The search was conducted using combinations of malaria-related MeSH terms including “malaria”, “Anopheles”, “Plasmodium” and “vector”. We restricted the search to the period from 1996 to 2016 and to “Benin” as the country. We further screened the papers to select only those published by Beninese authors or published on malaria from Benin. All published documents under consideration included at least one author from Benin. No restriction was placed on document types. We first queried each term independently and then combined the terms so that the query returned the maximum number of results. The full citation information, containing the authors’ names, their institutional affiliations, the year of publication, and the number of times the document was cited, was recorded as a bibliographic corpus in text format. After a second screening, only research that met the inclusion criteria listed above and that was published between January 1, 1996 and December 31, 2016 was selected for this study.
Text mining and network generation
From the bibliographic text files, we built a corpus of the published documents using Tethne v0.8, a Python package for parsing bibliographic data. Using NetworkX [14], another Python package, we generated an undirected multigraph coauthorship network containing parallel edges. Vertices were defined by several attributes including name, affiliation, city, country, number of publications and total number of times cited. Edges also had attributes, such as a unique identifier, the number of times a pair of authors was cited and the number of publications by a pair of authors. We normalized and disambiguated the information collected, such as researchers’ names, research center names, and any other information that appeared ambiguous.
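The construction described above can be illustrated with a minimal plain-Python sketch. It does not use Tethne or NetworkX and is not the code used in this study; the toy corpus, the author labels and the function names `build_multigraph_edges` and `collapse_to_weighted` are hypothetical and only illustrate how parallel coauthorship edges arise and how they collapse into weighted edges.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy corpus: each paper is a list of disambiguated author labels.
papers = [
    ["A", "B", "C"],
    ["A", "B"],
    ["C", "D"],
]


def build_multigraph_edges(papers):
    """Return one (u, v) edge per coauthored paper, so a pair can repeat."""
    edges = []
    for authors in papers:
        # sorted(set(...)) yields each unordered pair exactly once per paper
        for u, v in combinations(sorted(set(authors)), 2):
            edges.append((u, v))
    return edges


def collapse_to_weighted(edges):
    """Merge parallel edges; the weight is the number of joint papers."""
    return Counter(edges)


multi_edges = build_multigraph_edges(papers)
weighted = collapse_to_weighted(multi_edges)

# Strength of a vertex: sum of the weights of its incident edges.
strength = Counter()
for (u, v), w in weighted.items():
    strength[u] += w
    strength[v] += w

print(len(multi_edges), "parallel edges ->", len(weighted), "weighted edges")
print(dict(strength))
```

In this toy corpus the pair (A, B) shares two papers, so it contributes two parallel edges to the multigraph and a single edge of weight 2 to the weighted graph.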
Author name disambiguation
One common challenge in collecting bibliometric data is the matching problem: multiple names can refer to the same author. A well-known approach to solving this issue is Author Name Disambiguation (AND). While many AND methods have been reported in the literature [15, 16], we used a fuzzy matching machine learning technique. We used Dedupe, a Python library, to disambiguate authors’ names and assign a unique identification number to each author. We manually annotated 10% of the names and then trained the algorithm to automatically disambiguate the remaining entries. Dedupe is interactive and requests further annotations as the disambiguation process evolves. Dedupe is based on the work of Bilenko [17] and has been developed by Forest Gregg and Derek Eder. For more information on Dedupe, we refer the reader to the authors’ GitHub repository available at https://github.com/dedupeio/dedupe. We evaluated our AND fuzzy matching machine learning method by computing precision and recall.
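As a rough illustration of the evaluation step, the sketch below pairs a very simple fuzzy matcher with pairwise precision and recall. It is not Dedupe and not the procedure used in this study: the matcher relies on the standard-library `difflib.SequenceMatcher` as a stand-in, the 0.8 threshold, the fictional records and the pairwise definition of precision and recall are all assumptions made for the example.

```python
from difflib import SequenceMatcher
from itertools import combinations

# Fictional records: (record id, name as printed, manually assigned author id).
records = [
    (0, "Doe, J.", "a1"),
    (1, "Doe, John", "a1"),
    (2, "doe j", "a1"),
    (3, "Nguyen, T", "a2"),
    (4, "Nguyen, Thi", "a2"),
]


def normalize(name):
    """Lowercase the name and drop punctuation and repeated spaces."""
    return " ".join(name.lower().replace(",", " ").replace(".", " ").split())


def similar(a, b, threshold=0.8):
    """Assumed rule: two names match if their similarity ratio is high enough."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio() >= threshold


def pairwise_precision_recall(predicted_pairs, true_pairs):
    """Precision and recall over unordered pairs of record ids."""
    true_positives = len(predicted_pairs & true_pairs)
    precision = true_positives / len(predicted_pairs) if predicted_pairs else 1.0
    recall = true_positives / len(true_pairs) if true_pairs else 1.0
    return precision, recall


# Pairs the matcher links, and pairs the manual annotation says are the same author.
predicted = {(i, j) for (i, a, _), (j, b, _) in combinations(records, 2) if similar(a, b)}
truth = {(i, j) for (i, _, x), (j, _, y) in combinations(records, 2) if x == y}

precision, recall = pairwise_precision_recall(predicted, truth)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

Precision penalizes merging distinct authors, while recall penalizes leaving name variants of one author unmerged; a learned matcher such as Dedupe replaces the fixed threshold with a model trained on the annotated pairs.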
Descriptive data analysis
Using igraph, a network analysis package developed in R, we computed the following vertex centrality measures:

Degree: the number of ties to a given author. After converting the multigraph network into a weighted graph, where weights are the number of joint authorships between two authors, we also computed the strength of the vertices.

Betweenness: the number of shortest paths between alters that pass through a particular author. It reflects the view that the importance of a vertex depends on where it is located with respect to the paths in the network graph. According to Freeman [18], it is defined as:
\( {c}_B(v)=\sum_{s\ne t\ne v\in V}\frac{\sigma \left(s,t\mid v\right)}{\sigma \left(s,t\right)} \)
where σ(s, t | v) is the total number of shortest paths between s and t that pass through v, and σ(s, t) is the total number of shortest paths between s and t regardless of whether or not they pass through v.

Closeness: the number of steps required for a particular author to access every other author in the network. It captures the notion that a vertex is central if it is close to many other vertices. Considering a network G = (V, E), where V is the set of vertices and E the set of edges, the closeness centrality c_{Cl}(v) of a vertex v is defined as:
\( {c}_{Cl}(v)=\frac{1}{\sum_{u\in V} dist\left(v,u\right)} \)
where dist(v, u) is defined as the geodesic distance between the vertices u, v ∈ V.

Eigenvector centrality: the degree to which an author is connected to other well-connected authors in the network. It seeks to capture the idea that the more central the neighbors of a vertex are, the more central that vertex itself is. According to Bonacich [19] and Katz [20], the eigenvector centrality measure is defined as:
\( {c}_{E_i}(v)=\alpha \sum_{\left\{u,v\right\}\in E}{c}_{E_i}(u) \)
where the vector \( {\mathbf{c}}_{E_i}={\left({c}_{E_i}(1),\dots, {c}_{E_i}\left({N}_v\right)\right)}^T \) is the solution to the eigenvalue problem \( {\mathbf{Ac}}_{E_i}={\alpha}^{-1}{\mathbf{c}}_{E_i} \), where A is the adjacency matrix for the network G. According to Bonacich [19], an optimal choice of α^{−1} is the largest eigenvalue of A.

Brokerage: degree to which an actor occupies a brokerage position across all pairs of alters.
We also computed edge betweenness centrality, which extends the notion of vertex betweenness by assigning to each edge a value reflecting the number of shortest paths traversing that edge. We calculated edge betweenness to assess which coauthorship collaborations are important for the flow of information. In the Results section, we present the 10 most important collaborations in the malaria coauthorship network.
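The vertex centrality measures listed above can be sketched in plain Python on a toy graph. This is not the igraph code used in the study; the graph, the function names and the use of Brandes' accumulation scheme for betweenness and of power iteration for eigenvector centrality are choices made for the example, and the closeness function assumes a connected graph.

```python
from collections import deque

# Hypothetical toy graph: A - B - C, with C also in a triangle C - D - E.
graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": ["B", "D", "E"],
    "D": ["C", "E"],
    "E": ["C", "D"],
}


def bfs_distances(graph, source):
    """Geodesic distance from source to every reachable vertex."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def closeness(graph):
    """1 / (sum of geodesic distances); assumes the graph is connected."""
    return {v: 1 / sum(bfs_distances(graph, v).values()) for v in graph}


def betweenness(graph):
    """Shortest-path betweenness for an undirected, unweighted graph."""
    cb = dict.fromkeys(graph, 0.0)
    for s in graph:
        stack, preds = [], {v: [] for v in graph}
        sigma = dict.fromkeys(graph, 0)  # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in graph[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = dict.fromkeys(graph, 0.0)
        while stack:  # accumulate dependencies from the farthest vertices back
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                cb[w] += delta[w]
    # every unordered pair (s, t) was visited from both endpoints
    return {v: c / 2 for v, c in cb.items()}


def eigenvector_centrality(graph, iterations=200):
    """Power iteration on the adjacency matrix, scaled so the maximum is 1."""
    x = dict.fromkeys(graph, 1.0)
    for _ in range(iterations):
        new = {v: sum(x[u] for u in graph[v]) for v in graph}
        top = max(new.values())
        x = {v: value / top for v, value in new.items()}
    return x


degree = {v: len(nbrs) for v, nbrs in graph.items()}
clo = closeness(graph)
btw = betweenness(graph)
eig = eigenvector_centrality(graph)
print(degree, btw)
```

In this toy graph, B and C are the brokers: every shortest path from A to the triangle passes through both of them, which is what their nonzero betweenness reflects.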
Characterizing network cohesion
The extent to which subsets of authors are cohesive with respect to their relations in the coauthorship network was assessed through network cohesion. Specifically, we determined whether the collaborators (coauthors) of a given author tend to collaborate with each other as well, and which subsets of collaborating authors tend to be more productive in the network. While there are many techniques to determine network cohesion, we chose local triads and global giant components. In addition, we conducted clique detection and clustering (community detection) on the network:

Cliques: According to Kolaczyk and Csárdi [21], cliques are defined as complete subgraphs such that all vertices within the subset are connected by edges. We computed the number of maximal cliques and assessed their size.

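Density: Defined as the frequency of realized edges relative to potential edges, the density of a subgraph H in G provides a measure of how close H is to being a clique in G. Density values vary between 0 and 1:
\( den(H)=\frac{\mid {E}_H\mid }{\mid {V}_H\mid \left(\mid {V}_H\mid -1\right)/2} \)
where V_H and E_H are the vertex and edge sets of H.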

Relative frequency: we assessed the relative frequency of G by computing its transitivity, defined as:
\( {cl}_T(G)=\frac{3{\tau}_{\Delta}(G)}{\tau_3(G)} \)
where τ_{Δ}(G) is the number of triangles in G, and τ_{3}(G) is the number of connected triples (sometimes referred to as 2-stars).
This measure is also referred to as the fraction of transitive triples. It represents a measure of global clustering of G summarizing the relative frequency with which connected triples close to form triangles [21].

Connectivity, Cuts, and Flows: We investigated the concepts of vertex and edge cuts derived from the concept of vertex (edge) connectivity. The vertex (edge) connectivity of a graph G is the largest integer k such that G is k-vertex- (k-edge-) connected [21]. These measures helped assess the information flow in the network. Since coauthorship networks are undirected graphs, the distinction between weak and strong connectivity was irrelevant in this study. A graph G is said to be connected if every vertex in G is reachable from every other vertex. Usually, one of the connected components dominates the others, hence the concept of a giant component.

Graph Partitioning: Regularly framed as community detection problem, we applied graph partitioning to find subsets of vertices that demonstrate a ‘cohesiveness’ with respect to their underlying relational patterns. Cohesive subsets of vertices generally are well connected among themselves and are well separated from the other vertices in the graph. Two established methods of graph partitioning are Hierarchical clustering (agglomerative vs divisive) and Spectral clustering [21]. In this study, we applied agglomerative Hierarchical Clustering to the coauthorship network.
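Several of the cohesion measures above can be sketched in plain Python on a toy graph. The example below is not the code used in this study: the graph and function names are hypothetical, articulation points are found by brute force (removing one vertex at a time and recounting components) rather than with a linear-time algorithm, and clique enumeration and hierarchical clustering are left out.

```python
from itertools import combinations

# Hypothetical toy graph with two components: {A, B, C, D, E} and {F, G}.
graph = {
    "A": {"B"},
    "B": {"A", "C"},
    "C": {"B", "D", "E"},
    "D": {"C", "E"},
    "E": {"C", "D"},
    "F": {"G"},
    "G": {"F"},
}


def components(graph, removed=frozenset()):
    """Connected components, optionally ignoring the vertices in `removed`."""
    seen, comps = set(removed), []
    for start in graph:
        if start in seen:
            continue
        comp, stack = set(), [start]
        seen.add(start)
        while stack:
            v = stack.pop()
            comp.add(v)
            for w in graph[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        comps.append(comp)
    return comps


def density(graph):
    """Realized edges divided by the number of possible edges."""
    n = len(graph)
    m = sum(len(nbrs) for nbrs in graph.values()) // 2
    return m / (n * (n - 1) / 2)


def transitivity(graph):
    """3 * (number of triangles) / (number of connected triples)."""
    triangles = sum(
        1
        for u, v, w in combinations(sorted(graph), 3)
        if v in graph[u] and w in graph[u] and w in graph[v]
    )
    triples = sum(len(nbrs) * (len(nbrs) - 1) // 2 for nbrs in graph.values())
    return 3 * triangles / triples if triples else 0.0


def articulation_points(graph):
    """Vertices whose removal increases the number of connected components."""
    base = len(components(graph))
    return {v for v in graph if len(components(graph, frozenset([v]))) > base}


giant = max(components(graph), key=len)
print("density:", round(density(graph), 3))
print("transitivity:", transitivity(graph))
print("giant component share:", len(giant) / len(graph))
print("cut vertices:", sorted(articulation_points(graph)))
```

Here B and C are the cut vertices: removing either one splits the larger component, which is the sense in which such authors are weak points for information flow.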
Mathematical modeling
The purposes of network graph modeling are to test the significance of the characteristics of observed network graphs, and to study proposed mechanisms of real-world networks such as degree distributions and small-world effects [21]. A model for a network graph is a collection of possible graphs \( \mathcal{G} \) with a probability distribution ℙ_{θ} defined as:
\( \left\{{\mathbb{P}}_{\theta }(G),G\in \mathcal{G}:\theta \in \Theta \right\} \)
where θ is a vector of parameters ranging over values in Θ.
Given our observed malaria coauthorship network graph G^{obs} and some structural characteristic η(·), our goal is to assess whether η(G^{obs}) is unusual. We then compare η(G^{obs}) to the collection of values \( \left\{\eta (G):G\in \mathcal{G}\right\} \). If η(G^{obs}) is too extreme with respect to this collection, then we have enough evidence to assert that G^{obs} is not a uniform draw from \( \mathcal{G} \).
Given the computationally expensive calculations involved in modeling in general, and the expected large size of our network, we parallelized all the processing.
We applied different mathematical models for network graphs including:

Classical Random Graph Models: First established by Erdős and Rényi [22,23,24], it specifies a collection of graphs \( \mathcal{G} \) with a uniform probability ℙ(·) over \( \mathcal{G} \). A variant of this model called the Bernoulli Random Graph Model was also defined by Gilbert [25].

Generalized Random Graph Models: These models emanated from the generalization of Erdős and Rényi’s formulation, defining a collection of graphs \( \mathcal{G} \) with prespecified degree sequence.

Mechanistic Network Graph Models: These models mimic real-world phenomena and include Small-World Models, commonly associated with the idea of “six degrees of separation”. They were introduced by Watts and Strogatz [26] and have since received a lot of interest in the literature, especially in neuroscience. Small-world networks usually exhibit high levels of clustering and small distances between vertices. Examples of known small-world networks include the network of connected proteins and the transcriptional networks of genes [27]. Another mechanistic model is the Preferential Attachment Model, based on the popular principle of “the rich get richer”. Examples of preferential attachment networks include the World Wide Web [28] and the scientific citation network [29, 30]. An important characteristic of these models is that, as time tends to infinity, their degree distribution tends to follow a power law.
For each mathematical model, we ran 1000 Monte Carlo simulations. We then compared the observed characteristics to the simulated ones using a one-sample Student’s t-test. The characteristics whose significance we assessed are the average shortest path length, the clustering coefficient and the number of communities detected by the hierarchical clustering method.
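The simulation logic can be illustrated with a small plain-Python sketch for one model and one characteristic. It is not the code used in this study: it draws classical random graphs with the same number of vertices and edges as a hypothetical observed graph, uses the clustering coefficient (transitivity) as the characteristic, runs 200 simulations instead of 1000 to keep it quick, and approximates the p-value of the one-sample t statistic with the standard normal distribution, which is an assumption that is reasonable only when the number of simulations is large.

```python
import math
import random
import statistics
from itertools import combinations


def to_adjacency(nodes, edges):
    """Build an adjacency-set dictionary from an edge list."""
    adj = {v: set() for v in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    return adj


def transitivity(adj):
    """Fraction of connected triples that close into triangles."""
    closed = triples = 0
    for nbrs in adj.values():
        for a, b in combinations(sorted(nbrs), 2):
            triples += 1
            if b in adj[a]:
                closed += 1  # each triangle is counted once per centre vertex
    return closed / triples if triples else 0.0


def random_graph_same_size(nodes, n_edges, rng):
    """Uniform draw among graphs with the given vertices and edge count."""
    all_pairs = list(combinations(nodes, 2))
    return to_adjacency(nodes, rng.sample(all_pairs, n_edges))


def monte_carlo_test(observed_adj, n_sim=200, seed=0):
    """Compare the observed transitivity with simulated random graphs."""
    rng = random.Random(seed)
    nodes = sorted(observed_adj)
    n_edges = sum(len(nbrs) for nbrs in observed_adj.values()) // 2
    observed = transitivity(observed_adj)
    sims = [
        transitivity(random_graph_same_size(nodes, n_edges, rng))
        for _ in range(n_sim)
    ]
    mean = statistics.mean(sims)
    std_error = statistics.stdev(sims) / math.sqrt(n_sim)
    t_stat = (mean - observed) / std_error
    # Two-sided p-value under a normal approximation to the t distribution
    p_value = math.erfc(abs(t_stat) / math.sqrt(2))
    return observed, mean, t_stat, p_value


# Hypothetical observed graph: two 5-author cliques joined by a single edge.
nodes = list("abcdefghij")
edges = list(combinations("abcde", 2)) + list(combinations("fghij", 2)) + [("e", "f")]
observed_adj = to_adjacency(nodes, edges)

observed, simulated_mean, t_stat, p_value = monte_carlo_test(observed_adj)
print(f"observed={observed:.3f} simulated mean={simulated_mean:.3f} p={p_value:.4g}")
```

The same loop applies to the other models and characteristics by swapping the graph generator and the function that computes the characteristic.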
Results
Data collection
Of all the different queries formulated, the WOS query “TOPIC: (malaria) OR TOPIC: (mosquito) OR TOPIC: (anopheles) OR TOPIC: (plasmodium) OR TOPIC: (net) OR TOPIC: (vector) Refined by: COUNTRIES/TERRITORIES: (BENIN)” returned 630 records. After a rigorous screening process carried out by all the authors, 424 documents met the selection criteria. On average, there were 10.67 authors per published document.
After the Author Name Disambiguation, we identified 1792 unique authors with a precision of 99.87% and a recall of 95.46%. The generated multigraph coauthorship network therefore contained 1792 vertices (authors) and 116,388 parallel edges (collaborations). Each vertex (author) in the network has 2 attributes: name and a unique identification number. Each edge has 8 attributes: key, subject, abstract, year, wosid (Web of Science identification number), journal, title and doi (digital object identifier).
Descriptive data analysis
The degrees of the multigraph network range between 1 and 1338 with an average degree of 106.46. We noted, in addition, a substantial number of vertices with low degrees (Fig. 1). There was also a nontrivial number of vertices with degrees of higher orders of magnitude. A log-scale distribution of the degrees demonstrates that the vertex degrees tend to follow a heavy-tailed distribution (Fig. 2).
Converting the multigraph network into a weighted graph resulted in a simple graph of 1792 vertices and 95,787 weighted edges. Closeness centrality ranges between 3.118 × 10^{−7} and 5.152 × 10^{−6} with a median of 5.112 × 10^{−6}. This measure suggests a highly skewed distribution. Betweenness measures range between 0 and 245,600 with a median of 1985. A network visualization with the vertices’ size proportional to betweenness centrality clearly reveals the presence of broker authors (Table 1). Eigenvector centrality has a median of 0.005 and a mean of 0.09. Eigenvector centrality measures reveal the presence of multiple clustered authors, suggesting the presence of closed collaboration groups. Table 1 presents a list of the 10 authors with the highest eigenvector centrality values.
The computation of edge betweenness identifies coauthorship collaborations that are important for the flow of information. In Table 1, we present the top 10 most important collaborations for the flow of information in the malaria coauthorship network in Benin.
Network cohesion
A total of 365 maximal cliques are identified in the network, among which 9 cliques of size 2, 14 cliques of size 3, 142 cliques of size 7, and 155 cliques of size 8. Larger maximal clique sizes range from 102 authors to 365 authors, and each of these is found once in the network.
The malaria coauthorship network has a density of 0.0596 and a transitivity of 0.965, indicating that 96.5% of the connected triples in the network close to form triangles. The transitivity metric is a measure of the global clustering of the network.
The network is not connected and a census of all the connected components within the network reveals the existence of a giant component that dominates all the other connected components. This giant component includes 94% (1686 vertices) of all the vertices in the network with none of the other components alone carrying less than 1% of the vertices in the network (Fig. 3).
The assessment of information flow in the network via cut vertices reveals the existence of 16 authors as the most vulnerable vertices in the network. Table 1 lists the authors that constitute the weak articulation points in the malaria coauthorship network. Cut vertices are crucial to the sustainability of networks [21].
The agglomerative hierarchical clustering method identifies 23 research communities (or clusters) in the network. Sizes of the clusters range between 2 and 570, with large research communities containing between 202 and 569 authors. Medium-sized research communities contain between 10 and 62 authors. Only seven out of the 23 research communities identified are part of the giant component. Figure 3 displays the giant component of the network, with a different color representing each of the seven research communities.
Mathematical modeling
The hierarchical clustering community detection algorithm identified 23 different clusters (communities) in the coauthorship network, seven of which form a giant component. One question of interest in this section is whether the number of communities detected is expected or not. We performed 1000 Monte Carlo simulations to test the significance of this observed characteristic of the malaria coauthorship network. Figure 4 clearly demonstrates that the number of communities detected is unusual from the perspective of both classical random graphs and generalized random graphs (p-value < 0.0001). From the classical random graph model, the expected number of communities is 3.934 (95%CI: 3.90–3.97). Similarly, the expected number of communities from the generalized random graph model is 7.501 (95%CI: 7.39–7.61).
Figure 5 displays the number of detected research communities using the Barabási-Albert preferential attachment and the Watts-Strogatz models. Surprisingly, the observed number of communities is also extreme under both models (p-value < 0.0001). The expected number is 3.056 (95%CI: 3.04–3.07) from the Watts-Strogatz model simulations and 45.569 (95%CI: 45.42–45.72) from the Barabási-Albert model simulations.
We also compared the clustering coefficient and the average shortest-path length. The observed clustering coefficient is 0.9645. Surprisingly, there is substantially more clustering in our malaria coauthorship network than expected from all 4 mathematical models (p-value < 0.0001). The expected clustering coefficient is 0.0596 (95%CI: 0.05963068–0.05964648) and 0.4334 (95%CI: 0.4333912–0.4334522) for the classical random graph and the generalized random graph models, respectively. Similarly, the expected clustering under the Watts-Strogatz small-world model is 0.7464 (95%CI: 0.7464326–0.7464356).
We observed an average shortest-path length of 2.99 in the malaria coauthorship network. This observed shortest-path length is significantly larger than what is expected from the random graph models (p-value < 0.0001) and significantly lower than what is expected from the Watts-Strogatz small-world model and the Barabási-Albert preferential attachment model (p-value < 0.0001).
The average shortest-path length is 1.94 (95%CI: 1.941955–1.941960) and 2.26 (95%CI: 2.259468–2.259586) for the classical random graph and the generalized random graph models, respectively. For the Watts-Strogatz small-world and the Barabási-Albert models, the average shortest-path length is 3.83 (95%CI: 3.81–3.86) and 9.17 (95%CI: 9.14–9.21), respectively.
All simulations were also performed on the giant component of the network and led to similar outcomes.
Discussion and conclusion
This study provides insights into the structural characteristics of the malaria coauthorship network in Benin over a relatively long period. The 20 years of data collected, from 1996 until December 2016, begin with the onset of active malaria research in the country. The significant increase in malaria research and collaborations (Table 2) between the authors over the years is an expected finding given the renewed interest in the malaria control and elimination goals set forth [8, 31]. This research shows that the mechanism underlying the formation of the malaria coauthorship network in Benin is not random. It further demonstrates that the malaria research collaboration network in Benin is a complex network that seems to display small-world properties (often referred to as “six degrees of separation”).
The nontrivial number of authors with degrees of higher orders of magnitude confirms the presence of closed research groups where collaborative research likely happens only among members. In other words, interdisciplinary collaboration tends to occur at higher levels between prolific researchers, with the majority of the collaborations happening between researchers from the same scientific communities. Prominent authors with important collaborations tend to collaborate with similar authors, whereas young or less prolific authors tend to collaborate with both prolific authors and authors with very few collaborations. Similar findings were reported by Janet Okamoto [32], who studied scientific collaboration on a much smaller scale. Key brokers facilitate scientific collaborations within and outside their scientific community [33]. Betweenness centrality measures identify such brokers, who are important hubs for inter- and transdisciplinary research. Many of the main brokers also proved to be the most connected and the most central authors, confirming the presence of authors with long publishing tenure in our network [34]. The flow of information in the malaria coauthorship network in Benin is slow, as it relies on only 16 authors representing less than 1% of all the authors in the network. Such a low information flow was also reported by Salamatia and Soheili [35] in a 2016 study on a coauthorship analysis of Iranian researchers in the field of violence. Generally, the most important authors in a coauthorship network are the ones with the highest degree of collaboration [36, 37]. However, for the long-term sustainability of the malaria research network in Benin, the 16 authors identified as cut vertices are the most important authors. In other words, the removal of less than 1% of the authors from the network would lead to its collapse. Such a collapse would undoubtedly be detrimental to the future of malaria research in Benin.
This finding clearly confirms the conclusion of Toivanen and Ponomariov [38] that the African research collaboration network is vulnerable to structural weaknesses and uneven integration.
Small-world networks are known to have a small shortest path distance and a high clustering coefficient. Although our network seems to display such properties, the Monte Carlo simulations revealed that the observed network has unexpected properties compared to classic small-world networks. A study of the coauthorship network on Chagas disease reported similar findings [13]. Unlike our study, the authors of that study did not extend their analysis to confirm the small-world nature of their observed network. Other mechanisms such as preferential attachment have been found to explain the structure of the international scientific collaboration network [39]. Unlike those studies, our network displayed unexpected properties that are more extreme than those of the 4 mathematical models we simulated. Our network has a shortest path distance that differs significantly from, and clustering that is significantly higher than, what is expected from the 4 mathematical models presented here. One observation we are sure of is that none of the random graph models used here explains the growth and the structure of the malaria coauthorship network in Benin. We therefore conclude that the structure and growth of our network are not random, which points to hidden factors explaining the current structure of the network. Assessing such factors and the extent to which they influence scientific collaborations is important for the future of malaria research and its long-term sustainability. Unfortunately, none of the proposed models seems to accurately describe the observed structure of the network. This is why we believe that advanced analyses involving statistical modeling are needed to better explain the structure of this network. In addition, unlike mathematical modeling, statistical modeling allows model fitting to the observed network [21, 40].
Our research has strengths. Unlike most studies on coauthorship analysis, it applies not only descriptive methods but also inferential network analysis methods such as Monte Carlo simulations. We tested the significance of the properties of our network to accurately understand its structure. Our data mining strategy involved a machine learning algorithm that helped address the crucial issue of disambiguating authors’ names and assigning a unique identifier to each of them. This technique maintained good quality of the data collected throughout the preprocessing and analysis steps. To the best of our knowledge, our study is the first to describe the malaria research collaboration network in Benin via coauthorship network analysis.
The fact that our study collected data only from the Web of Science can be considered an important limitation of this study. However, according to Falagas and colleagues [41], who compared PubMed, Scopus, Web of Science and Google Scholar in their paper, the Web of Science appears to be a reasonable scientific database source for our analysis. In addition, it proved to cover a wide range of both old and recently published papers. Falagas and colleagues [41] found PubMed to be the optimal choice in terms of scientific database. For that reason, we ran the same bibliographic search in PubMed. However, the Web of Science returned more relevant data than PubMed. Another limitation worth noting is that this study only looks at a static snapshot of the malaria research network. There is also a need to apply dynamic statistical models such as Temporal Exponential Random Graph [42] and Dynamic Stochastic Block [43] modeling to better understand the temporal dynamics of collaboration formation in this network. Yet another limitation is inherent to the nature of all coauthorship studies. Collaborators in a coauthorship network often do not come from the same scientific discipline or do not play the same roles in a particular research project. The data we collected did not allow us to accurately assess or even infer the disciplines each author came from or their specific contribution to the published document.
As malaria continues to be highly prevalent in Benin, it is essential to consolidate the knowledge generated from the numerous studies on the disease and reinforce the different communities involved in the research effort. Our results suggest that there is an urgent need to foster the malaria research network in Benin by continuously supporting and stabilizing the identified key brokers and most productive authors, and by promoting the junior scientists in the field. Taking such measures will ultimately ensure the long-term sustainability of the malaria coauthorship and collaboration network in Benin.
Abbreviations
AND: Author Name Disambiguation
CI: Confidence Interval
US: United States
WOS: Web Of Science
References
 1.
Davis JR, Lederberg J. Emerging infectious diseases from the global to the local perspective: workshop summary. National Academies Press. 2001.
 2.
United Nations. Department of Economic. The Millennium Development Goals Report 2008. United Nations Publications; 2008.
 3.
Arthur M. Institute for Health Metrics and Evaluation. Nurs Stand. 2014;28(42):32–32.
 4.
Stoops C. President's malaria initiative. Washington DC: Navy Medical Services Corps; 2008.
 5.
Barat LM. Four malaria success stories: how malaria burden was successfully reduced in Brazil, Eritrea, India, and Vietnam. Am J Trop Med Hyg. 2006;74(1):12–6.
 6.
Akogbéto MC, Aïkpon RY, Azondékon R, Padonou GG, Ossè RA, Agossa FR, Beach R, Sèzonlin M. Six years of experience in entomological surveillance of indoor residual spraying against malaria transmission in Benin: lessons learned, challenges and outlooks. Malar J. 2015;14(1) https://doi.org/10.1186/s1293601507575.
 7.
World Health Organization. World malaria report 2010. Geneva: World Health Organization View Article Google Scholar; 2012.
 8.
Alonso PL, Brown G, ArevaloHerrera M, Binka F, Chitnis C, Collins F, Doumbo OK, Greenwood B, Hall BF, Levine MM. A research agenda to underpin malaria eradication. PLoS Med. 2011;8(1):1000406.
 9.
Jamison D, Feacham R, Makgoba M, Bos E, Baingana F, Hofman K, Rogo K. Disease and mortality in subSaharan Africa. Second Edition. Washington, DC: World Bank; 2006.
 10.
Newman MEJ. The structure of scientific collaboration networks. Proc Natl Acad Sci. 2001;98(2):404–9. https://doi.org/10.1073/pnas.98.2.404.%2004061.
 11.
Ghafouri HB, Mohammadhassanzadeh H, Shokraneh F, Vakilian M, Farahmand S. Social network analysis of Iranian researchers on emergency medicine: a sociogram analysis. Emerg Med J. 2014;31(8):619–24. https://doi.org/10.1136/emermed2012201781.
 12.
Morel CM, Serruya SJ, Penna GO, Guimarães R. Coauthorship network analysis: a powerful tool for strategic planning of research, development and capacity building programs on neglected diseases. PLoS Negl Trop Dis. 2009;3(8):501. https://doi.org/10.1371/journal.pntd.0000501.
 13.
GonzálezAlcaide G, Park J, Huamaní C, Gascón J, Ramos JM. Scientific authorships and collaboration network analysis on Chagas disease: papers indexed in PubMed (19402009). Rev Inst Med Trop Sao Paulo. 2012;54(4):219–28.
 14.
Schult, D.A., Swart, P.. Exploring network structure, dynamics, and function using NetworkX, vol. 2008, pp. 11–16 (2008).
 15.
Ferreira AA, Gonçalves MA, Laender AH. A brief survey of automatic methods for author name disambiguation. ACM SIGMOD Rec. 2012;41(2):15–26.
 16.
Giles, C.L., Zha, H., Han, H.. Name disambiguation in author citations using a kway spectral clustering method. IEEE; 2005. p. 334343.
 17.
Bilenko MY. Learnable similarity functions and their application to record linkage and clustering. Austin: PhD thesis, University of Texas at Austin; 2006.
 18.
Freeman LC. A set of measures of centrality based on betweenness. Sociometry. 1977;40(1):35–41.
 19.
Bonacich P. Factoring and weighting approaches to status scores and clique identification. J Math Sociol. 1972;2(1):113–20.
 20.
Katz L. A new status index derived from sociometric analysis. Psychometrika. 1953;18(1):39–43.
 21.
Kolaczyk ED, Csárdi G. Statistical analysis of network data with R (vol. 65). 2014.
 22.
Erdös P, Rényi A. On random graphs, I. Publ Math Debr. 1959;6:290–7.
 23.
Erdos P, Rényi A. On the evolution of random graphs. Publ Math Inst Hung Acad Sci. 1960;5(1):17–60.
 24.
Erdös P, Rényi A. On the strength of connectedness of a random graph. Acta Math Acad Sci Hung. 1964;12(1–2):261–7.
 25.
Gilbert EN. Random graphs. Ann Math Stat. 1959;30(4):1141–4.
 26.
Watts DJ, Strogatz SH. Collective dynamics of 'smallworld' networks. Nature. 1998;393(6684):440–2.
 27.
Van Noort V, Snel B, Huynen MA. The yeast coexpression network has a smallworld, scalefree architecture and can be explained by a simple model. EMBO Rep. 2004;5(3):280–4.
 28.
Barabási AL, Albert R. Emergence of scaling in random networks. Science. 1999;286(5439):509–12.
 29.
Albert R, Jeong H, Barabási AL. Internet: diameter of the worldwide web. Nature. 1999;401(6749):130–1.
 30.
Jeong H, Néda Z, Barabási AL. Measuring preferential attachment in evolving networks. EPL (Europhysics Letters). 2003;61(4):567.
 31.
Breman JG. Eradicating malaria. Sci Prog. 2009;92(1):1–38.
 32.
The Centers for Population Health and Health Disparities Evaluation Working Group, Okamoto J. Scientific collaboration and team science: a social network analysis of the centers for population health and health disparities. Transl Behav Med. 2015;5(1):12–23. https://doi.org/10.1007/s1314201402801.
 33.
Bellanca L. Measuring interdisciplinary research: analysis of coauthorship for research staff at the University of York. Biosci Horiz. 2009;2(2):99–112. https://doi.org/10.1093/biohorizons/hzp012.
 34.
Li EY, Liao CH, Yen HR. Coauthorship networks and research impact: a social capital perspective. Res Policy. 2013;42(9):1515 1530. https://doi.org/10.1016/j.respol.2013.06.012.
 35.
Salamati P, Soheili F. Social network analysis of Iranian researchers in the field of violence. Chin J Traumatol. 2016;19(5):264–70. https://doi.org/10.1016/j.cjtee.2016.06.008.
 36.
Bales, M.E., Johnson, S.B., Weng, C.: Social network analysis of interdisciplinarity in obesity research, vol. 870. 2008.
 37.
Bales ME, Johnson SB, Keeling JW, Carley KM, Kunkel F, Merrill JA. Evolution of coauthorship in public health services and systems research. Am J Prev Med. 2011;41(1):112–7.
 38.
Toivanen H, Ponomariov B. African regional innovation systems: bibliometric analysis of research collaboration patterns 20052009. Scientometrics. 2011;88(2):471–93. https://doi.org/10.1007/s1119201103901.
 39.
Wagner CS, Leydesdorff L. Network structure, selforganization, and the growth of international collaboration in science. Res Policy. 2005;34(10):1608–18. https://doi.org/10.1016/j.respol.2005.08.002.
 40.
Kolaczyk, E.D.: Statistical Analysis of Network Data: Methods and Models. Springer series in statistics. Springer, New York; [London] (2009). OCLC: ocn288985465.
 41.
Falagas ME, Pitsouni EI, Malietzis GA, Pappas G. Comparison of PubMed, Scopus, web of science, and Google scholar: strengths and weaknesses. FASEB J. 2007;22(2):338–42. https://doi.org/10.1096/fj.079492LSF.
 42.
Leifeld P, Cranmer SJ, Desmarais BA. Temporal exponential random graph models with xergm: estimation and bootstrap confidence intervals. J Stat Softw. 2015;83(6).
 43.
Matias C, Miele V. Statistical clustering of temporal networks through a dynamic stochastic block model. J R Stat Soc B. 2016. https://doi.org/10.1111/rssb.12200.
Acknowledgements
We would like to thank the Welzig Computational Neuroscience and Neurotechnology Laboratory at the Medical College of Wisconsin for making their servers available for the Monte Carlo simulations.
Funding
Not applicable.
Availability of data and materials
Please contact corresponding author for data and material requests.
Author information
Contributions
RA designed the study, collected the data, conducted the analysis and wrote the manuscript. ZJH contributed to the choice of the analytical tools and participated in the analysis and interpretation of the results. FRA participated in the data collection and proofread the manuscript. Prof. CMW co-supervised the work, provided the analytical platform for this work and proofread the manuscript. Prof. SM supervised the work and proofread the manuscript. All authors read and approved the final manuscript.
Corresponding author
Correspondence to Roseric Azondekon.
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare that they have no competing interests.
Rights and permissions
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
About this article
Cite this article
Azondekon, R., Harper, Z.J., Agossa, F.R. et al. Scientific authorship and collaboration network analysis on malaria research in Benin: papers indexed in the web of science (1996–2016). Glob Health Res Policy 3, 11 (2018). https://doi.org/10.1186/s41256-018-0067-x
Keywords
 Network analysis
 Scientific collaboration
 Coauthorship
 Malaria
 Benin