Integrative taxonomy resuscitates two species in the Lasioglossum villosulum complex (Kirby, 1802) (Hymenoptera: Apoidea: Halictidae)

Morphological and allozyme analyses suggested the occurrence of a pseudocryptic species in the Lasioglossum villosulum (Kirby, 1802) species complex (Hymenoptera: Halictidae). We analysed the morphology of more than 1500 specimens and the DNA barcode fragment of the cytochrome c oxidase subunit I (COI) of 102 specimens of this species complex from several Palaearctic countries. Our phylogenetic tree reconstructions, based on maximum likelihood and Bayesian inference, revealed one clade corresponding to all specimens morphologically identified as Lasioglossum medinai (Vachal, 1895) and one divergent specimen morphologically identified as Lasioglossum berberum (Benoist, 1941). The other specimens, morphologically identified as L. villosulum, aggregated into at least three other lineages in our phylogenetic trees. The tree-based species delineation methods based on the Generalized Mixed Yule Coalescent (GMYC) model and the Bayesian Poisson Tree Process (bPTP) identified five to ten candidate species within the L. villosulum species complex, with L. medinai and L. berberum consistently recognized as separate from all other candidate species. Diagnostic morphological differences were found among L. medinai, L. berberum and the remaining specimens identified as L. villosulum. No diagnostic morphological differences were found to distinguish the different phylogenetic candidate species or lineages found within L. villosulum and L. medinai.
European Journal of Taxonomy 541: 1–43. ISSN 2118-9773. https://doi.org/10.5852/ejt.2019.541 www.europeanjournaloftaxonomy.eu 2019 · Pauly A. et al. This work is licensed under a Creative Commons Attribution License (CC BY 4.0). Research article. urn:lsid:zoobank.org:pub:FDA7A52D-404E-4E50-A519-AF1300D8F435

The Lasioglossum villosulum (Kirby, 1802) species complex is a highly variable group of halictid bees, widespread throughout the Palaearctic (Fig. 1). The nominal subspecies (Fig. 2A, B) is widely distributed from the Azores to Mongolia, from the Canary Islands to Egypt, and north to Finland. The subspecies trichopse (Strand, 1914) occurs from Japan to Manchuria, Taiwan and Malaysia. The subspecies arabicum Ebmer, 2008 (Fig. 2E, F) is endemic to the Arabian Peninsula (Ebmer 1988, 2008). The biology of L. villosulum was studied in detail by Plateaux-Quénu & Plateaux (1981, 1986, 1993) and Plateaux-Quénu (1993). These authors report the presence of larger females inhabiting the south of France (Fig. 2C, D). On the basis of an electrophoretic analysis of allozymes, Packer et al. (1999) concluded that these unusually large individuals belong to a new species but did not name it, writing: "a formal description of the new species will be presented elsewhere pending additional surveys of museum collections from a wider range of localities and detailed examination of male specimens". That description was never published. As in other species complexes, such as Andrena bicolor (Praz et al. 2019) or Bombus lucorum (Bossert et al. 2016), a combination of morphological characters and genetic analysis could reveal distinct species within L. villosulum.
Fig. 2. A-B. Lasioglossum villosulum (Kirby, 1802), ♀ and ♂. C-D. L. medinai (Vachal, 1895), ♀ and ♂ (holotype). E-F. L. villosulum arabicum Ebmer, 2008, ♀ and ♂.
The objective of this study is to analyse the morphology and a DNA barcode fragment of the cytochrome c oxidase subunit I (COI) gene of specimens previously identified as L. villosulum from various localities in the Palaearctic, with a focus on Europe and the Mediterranean Basin, and to evaluate whether the larger females collected in the south of France, Israel and Spain belong to a separate species. If our results show the existence of a hitherto undescribed species, our aim is either to resurrect a name from among the eight synonyms of L. villosulum listed by Ebmer (1988) or to give this species a new name.

Repositories
The morphological analysis is based on the examination of specimens of the Lasioglossum villosulum complex preserved in the following institutes as well as the private collections mentioned in the acknowledgements:

Specimens examined
We re-examined all the types necessary to establish the oldest name of the pseudocryptic species as L. medinai (Vachal, 1895), the oldest synonym from southern Europe with a description potentially corresponding to the unusually large pseudocryptic species. Ebmer (1988) examined all the types, and we follow his synonymy where we did not examine the types ourselves.
NHMUK specimens were assigned unique specimen numbers. Specimen data and images for NHMUK specimens were recorded on the NHMUK database and are publicly available through the NHMUK Data Portal (Natural History Museum 2014).
For DNA analysis, we collected 22 female specimens (Table 1) of Lasioglossum villosulum including larger specimens of the presumed new species and representing the widest possible distribution coverage (i.e., representing the maximum number of countries).

DNA extraction
We used the NucleoSpin® Tissue Kit (Macherey-Nagel, Germany) to extract genomic DNA, following the manufacturer's protocol but eluting the DNA in 80 μl elution buffer. If possible, one middle leg per specimen was chosen to prepare samples, and the rest of the specimen was preserved as voucher.

DNA amplification
Two pairs of primers were used to amplify the 5' end of the cytochrome c oxidase subunit I (COI) mitochondrial gene (the standard DNA barcode region for animals). Primers LCO1490 and HCO2198 (Folmer et al. 1994) were used first on all specimens and, when amplification failed or the sequence was not valid (contamination or multiple peaks), a second primer pair, BarBeeF (Françoso & Arias 2013) and MtD9 (Simon et al. 1994), was used.

Molecular and phylogenetic analysis
All 22 sequences obtained here (with accession numbers from MK388879 to MK388901) were merged with all 80 additional unique barcode sequences retrieved from GenBank and BOLD (using keywords "Lasioglossum villosulum COI" and "Lasioglossum villosulum", respectively, on 23 January 2018) and with barcode data provided by Professor Laurence Packer from BOLD in November 2018 (Table 2). One barcode of Lasioglossum bluethgeni, a halictid belonging to the subgenus Evylaeus Robertson, 1902 (AP222; Table 1), was added as outgroup to root the phylogenetic trees. All COI sequences were aligned using ClustalW (Larkin et al. 2007) with the default parameters and pairwise deletion for gap treatment, as implemented in MEGA ver. 7.0.21 (Kumar et al. 2016). After quality control of all barcodes, the resulting alignment included at most 658 nucleotide characters. For a first distance-based analysis, pairwise p-distances (i.e., proportion of variable sites) were computed and a Neighbour-Joining (NJ) tree was reconstructed with MEGA ver. 7.0.21 with bootstrap pseudo-replicates (n = 1000).
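The p-distance with pairwise deletion can be illustrated with a short sketch. This is not the MEGA implementation: the function name `p_distance` and the choice to treat any character other than A, C, G or T as missing are assumptions made for illustration, and the sequences are toy data.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Pairwise deletion: a site is skipped for this pair only if either
    sequence has a gap or ambiguous base there (assumption: anything
    other than A, C, G, T counts as missing).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differences += 1
    if compared == 0:
        raise ValueError("no comparable sites between the two sequences")
    return differences / compared


if __name__ == "__main__":
    # Hypothetical toy alignment, not data from this study.
    print(p_distance("ACGTACGT", "ACGTACGA"))  # 1 difference over 8 sites
    print(p_distance("AC-TACGT", "ACGTACGA"))  # gap site skipped: 1 over 7
```

With pairwise deletion the denominator can differ between pairs, which is why shorter or gappy barcodes can still be compared against full-length ones.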
For phylogenetic analysis, we extracted the unique haplotypes from the DNA sequence dataset obtained above using the ape (Paradis & Schliep 2018) and pegas (Paradis 2010) packages in RStudio ver. 3.5.1 (2018). Label details of each haplotype are given in Table 3. Maximum likelihood (ML) and Bayesian inference (BI) analyses were conducted using GARLI ver. 2.1 (Zwickl 2006) and MrBayes ver. 3.2.6 (Ronquist et al. 2012), respectively. The best substitution model for each codon position (1st, 2nd and 3rd) was searched using PartitionFinder2 ver. 2.1.1 (Lanfear et al. 2017), with linked branch lengths, the AICc model selection metric and the greedy search algorithm (Lanfear et al. 2012). The models chosen for the codon positions of COI were: TIM+G (1st), TrN+I (2nd) and F81+I (3rd). For the ML analysis, we configured a group of independent runs: 10 runs for bootstrapping (n total = 1000; n per run = 100) and another run to search for the best ML tree. A random unconstrained starting tree was used for each run and an automated stopping criterion was applied: a run stopped when the ln score remained constant for 10000 consecutive generations (bootstrap calculation) or 100000 consecutive generations (best ML tree), with a required minimum score improvement for topology of 0.01 and 0.00001, respectively. The set of ML trees generated in GARLI (best ML tree + 10 ML trees resulting from the ten bootstrapping runs, in Newick format) was summarized (Fig. 3) with the support values mapped from the non-parametric bootstrap calculation using SumTrees ver. 4.3 (Sukumaran & Holder 2017) in DendroPy (Sukumaran & Holder 2010). Bootstrap values ≥ 70% were taken as the threshold for good confidence in tree topologies (Hillis & Bull 1993).
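Extracting unique haplotypes amounts to grouping identical aligned sequences. The study used the R packages ape and pegas; the Python sketch below is a stand-in that assumes haplotypes are defined by exact, case-insensitive identity of aligned sequences. The function name, the labels `H1`, `H2`, ... and the sequence identifiers are hypothetical.

```python
def collapse_haplotypes(sequences: dict[str, str]) -> dict[str, list[str]]:
    """Group sequence identifiers by identical aligned sequence.

    Returns a mapping from a haplotype label (H1, H2, ... in order of
    first appearance) to the identifiers sharing that sequence.
    Assumption: case-insensitive exact identity defines a haplotype.
    """
    by_sequence: dict[str, list[str]] = {}
    for seq_id, seq in sequences.items():
        by_sequence.setdefault(seq.upper(), []).append(seq_id)
    return {
        f"H{i}": ids for i, ids in enumerate(by_sequence.values(), start=1)
    }


if __name__ == "__main__":
    # Hypothetical toy sequences, not data from this study.
    toy = {"spec1": "ACGT", "spec2": "acgt", "spec3": "ACGA"}
    print(collapse_haplotypes(toy))
```

Sequences with missing data or differing lengths would need an explicit rule before they can be collapsed this way; the sketch deliberately ignores that case.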

Species delineation analyses
For species delineation analysis, we generated an ultrametric tree using the BEAST2 ver. 2.5.2 environment (Bouckaert et al. 2014). Following Michonneau (2016), a coalescent model with constant population size combined with a constant clock was used as the tree prior. We used the GTR model, as in the MrBayes analysis, with a Gamma Category Count of four to capture most of the rate variation. We set the MCMC length to five million generations. For the run, we checked the posterior probabilities as well as the effective sample size (> 200) with Tracer ver. 1.7.1 (Rambaut et al. 2018). Then, we summarized the Bayesian information into a phylogenetic tree with TreeAnnotator ver. 2.5.2 (included in the BEAST environment), with a burn-in of 10%.
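An ultrametric tree is one in which every tip lies at the same distance from the root. A minimal check of that property, assuming a hypothetical nested-tuple tree encoding `(label_or_children, branch_length)` rather than actual BEAST2 output, might look like this:

```python
def root_to_tip_distances(node, distance=0.0, out=None):
    """Collect the root-to-tip path length for every tip.

    node: (label_or_children, branch_length), where the first element is
    a tip name (str) or a list of child nodes. This encoding is an
    assumption for illustration only.
    """
    if out is None:
        out = {}
    label_or_children, branch_length = node
    distance += branch_length
    if isinstance(label_or_children, str):
        out[label_or_children] = distance
    else:
        for child in label_or_children:
            root_to_tip_distances(child, distance, out)
    return out


def is_ultrametric(tree, tolerance=1e-9):
    """True if all tips are equidistant from the root, within tolerance."""
    distances = list(root_to_tip_distances(tree).values())
    return max(distances) - min(distances) <= tolerance


if __name__ == "__main__":
    # Hypothetical three-tip trees with made-up branch lengths.
    clock_like = ([("a", 1.0), ([("b", 0.4), ("c", 0.4)], 0.6)], 0.0)
    not_clock_like = ([("a", 1.0), ([("b", 0.4), ("c", 0.7)], 0.6)], 0.0)
    print(is_ultrametric(clock_like), is_ultrametric(not_clock_like))
```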
Fig. 4. [...] (Kirby, 1802), Lasioglossum medinai (Vachal, 1895) and one as Lasioglossum berberum (Benoist, 1941). Each label corresponds to one Roman numeral, which encompasses all sequences from a haplotype (for more details see Table 3). A. Lasioglossum medinai. B-D. Three supported clusters (a fourth cluster could be defined in C) within Lasioglossum villosulum. This phylogenetic tree is rooted using Lasioglossum bluethgeni Ebmer, 1971 as outgroup (label III, voucher AP222). Posterior probabilities are given at nodes. The three colour gradients on the tree correspond to morphological delineation. Results of the species delimitation analyses are represented on the right side of the figure: the Bayesian Poisson Tree Process (bPTP) analyses based on the trees obtained using Bayesian inference (BI) or maximum likelihood (ML); the Generalized Mixed Yule Coalescent (GMYC) analysis resulting in seven candidate species (using the single threshold represented as a red line on the tree) and five alternative scenarios. Numbers at the top of the columns correspond to the number of candidate species in the GMYC analysis.
As recommended by Tang [...] performed on the Web Server (https://species.h-its.org/ptp/) of The Exelixis Lab with default parameters and using the ultrametric tree as input.
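The single-threshold idea behind GMYC can be pictured as cutting the ultrametric tree at one time point: nodes older than the threshold are treated as speciation events, and each lineage crossing the threshold defines one candidate species. The sketch below only applies a given threshold; it does not perform the likelihood fit that GMYC uses to choose it. The tree encoding `(node_age, [children])` with tips as plain strings, and all names and ages, are hypothetical.

```python
def tips(node):
    """Return all tip names below a node (a tip is a plain string)."""
    if isinstance(node, str):
        return [node]
    _age, children = node
    names = []
    for child in children:
        names.extend(tips(child))
    return names


def clusters_at_threshold(tree, threshold):
    """Cut an ultrametric tree at a given age and return the tip clusters.

    tree: tip name (str) or (node_age, [children]); ages are times before
    present, so tips have age 0 and the root is the oldest node.
    A node no older than the threshold is treated as within-species
    coalescence, so all tips below it form one candidate species.
    """
    clusters = []

    def walk(node):
        age = 0.0 if isinstance(node, str) else node[0]
        if age <= threshold:
            clusters.append(sorted(tips(node)))
        else:
            for child in node[1]:
                walk(child)

    walk(tree)
    return clusters


if __name__ == "__main__":
    # Hypothetical five-tip tree with made-up node ages.
    toy_tree = (10.0, [(2.0, ["a", "b"]), (6.0, ["c", (1.0, ["d", "e"])])])
    for cut in (0.5, 3.0, 7.0):
        print(cut, clusters_at_threshold(toy_tree, cut))
```

Moving the threshold towards the tips increases the number of candidate species, which is one way to read the alternative scenarios reported alongside the single-threshold estimate.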

DNA analyses
Our COI dataset comprised 29 unique haplotypes: 1 corresponding to the outgroup (Fig. 3; label III, L. bluethgeni), 1 to L. berberum (Fig. 3; label XX), 6 to L. medinai (Fig. 3A) and 21 to L. villosulum (Fig. 3B, C (1-2) and D). All specimens identified as L. medinai and sequenced here cluster together in all phylogenetic analyses, with a bootstrap support of 94% in the ML analysis and a posterior probability of one in the Bayesian inferences (ultrametric and non-ultrametric trees) (clade A in Fig. 3 and Fig. 4). The haplotype identified as Lasioglossum berberum (haplotype XX, highlighted in blue in Fig. 3) appears as the sister species of L. medinai in all tree reconstructions. All remaining haplotypes, identified as L. villosulum, were found on other branches (Fig. 3B-D and Fig. 4). They formed a clade in the ultrametric tree obtained by BI (Fig. 4), but their relationships were not resolved in the other phylogenetic analyses (Fig. 3). They clustered in at least three divergent groups with posterior probabilities of 0.96-1.00 in the BI analysis. Only two of these clusters were supported (with bootstrap values of 74-76%) in the ML analysis (Fig. 3).
The GMYC analysis estimated seven candidate species (six clusters of haplotypes and the single haplotype of L. berberum; haplotype XX in Fig. 4), with a minimum of five and a maximum of 10 candidate species. The same seven candidate species were suggested using the bPTP approach (Fig. 4).
In the species delimitation suggesting the lowest number of species (five candidate species), L. medinai, L. berberum and three clades of L. villosulum (corresponding to clades B-D in the phylogenetic trees) were considered as possible distinct species. In the species delimitations proposing more than five species, L. medinai and L. villosulum were further split into two to seven candidate species (Fig. 4). We did not find any diagnostic morphological characters that distinguish these candidate species, and they are not distributed according to geographical patterns.
Intraspecific genetic divergences within and among clusters B, C and D of L. villosulum (ranging from 0% to 4.8%) are comparable to those within the cluster of L. medinai (ranging from 0% to 3.2%). In the NJ tree, all specimens identified as L. medinai and sequenced here cluster together with maximum bootstrap support (Fig. S1A). In all analyses, the cluster of L. medinai also includes seven sequences from GenBank: two identified as L. villosulum in the study of Schmidt et al. (2015) (accession numbers KJ838409 and KJ837678; highlighted in red in Fig. S1, available in Supplementary material) and five records of specimens that were not identified to the species level.
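Ranges of divergence within clusters can be summarized from a pairwise distance table once each sequence is assigned to a cluster. The sketch below assumes precomputed p-distances keyed by unordered pairs; the function name, identifiers, cluster labels and values are invented for illustration and are not the study's data.

```python
from itertools import combinations


def within_cluster_ranges(distances, clusters):
    """Return {cluster: (min, max)} of pairwise distances within each cluster.

    distances: dict mapping frozenset({id1, id2}) -> distance
    clusters:  dict mapping cluster label -> list of sequence ids
    Clusters with fewer than two members are omitted.
    """
    ranges = {}
    for label, members in clusters.items():
        values = [
            distances[frozenset(pair)] for pair in combinations(members, 2)
        ]
        if values:
            ranges[label] = (min(values), max(values))
    return ranges


if __name__ == "__main__":
    # Hypothetical distances and cluster assignments.
    toy_distances = {
        frozenset({"a", "b"}): 0.0,
        frozenset({"a", "c"}): 0.02,
        frozenset({"b", "c"}): 0.01,
        frozenset({"d", "e"}): 0.03,
    }
    toy_clusters = {"X": ["a", "b", "c"], "Y": ["d", "e"], "Z": ["f"]}
    print(within_cluster_ranges(toy_distances, toy_clusters))
```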

The punctation and genitalia of males are similar to those of L. villosulum.

Distribution
This subspecies inhabits the Arabian Peninsula (Oman, UAE) and is distinguished mainly by red terga. Also conspicuous are the zones of snow-white, felted, adpressed pubescence on the face, particularly on the frontal area and pronotum, and white, felted, upstanding hairs on the mesopleura, particularly the front, as well as on the posterior propodeal surface and the lateral fields above. This pubescence marks the subspecies as a desert form (Ebmer 2008: 560; Dathe 2009: 385). Specimens from Oman and UAE have red terga, whereas those from Yemen, identified as L. villosulum by Ebmer (D. Notton, unpublished new record), have dark black terga. There may be more taxonomic complexity here, but examining it further is outside the scope of the current project. The colour difference may correlate with topography and climate, since Oman and UAE are generally at lower altitude (hotter) and Yemen is generally at higher altitude (cooler).

New diagnosis
Morphological examination of the type of Halictus medinai confirms that this large specimen belongs to the cryptic species. Its size is 7 mm (Fig. 2C-D) and the punctation of the scutum is denser than in L. villosulum. The head, punctation of the scutum, sculpture of the propodeum and punctation of the terga of the male holotype are illustrated (Fig. 15).
Morphological examination of the females immediately reveals, in addition to a larger body size (7 mm), the denser punctation of the scutum (Fig. 16) as well as the stronger wrinkles of the propodeum, which reach the posterior edge (Fig. 17). In most cases, the identification can be verified using two more subtle characters:
(1) the punctation of tergum 1 is much more superficial in the middle and on the apical margin, and absent in the middle of the apical margin, whereas in L. villosulum the punctation of tergum 1 is deeper and the apical margin is punctate even in the middle (Fig. 18);
(2) a small impunctate area in front of the larger anterior ocelli, more extended than in L. villosulum, with finer punctation around this area (Fig. 19).
Morphological examination of a single male of L. medinai from Uchaux, obtained by breeding, confirms that males of this species also have denser punctation on the scutum. The genitalia of several large males with denser punctation of the scutum, collected at the same time as females of L. medinai, show only very subtle differences which may be characteristic (Fig. 20):
- L. medinai: reflexed gonostylus lobe with apex blunt (Fig. 20Cb), outer hind corner more produced (Fig. 20Ca); gonostylus in line with axis of gonocoxite (Fig. 20Cc).
Also, in L. medinai the bristles of the last sterna form a slightly thicker fringe (Fig. 21).

Fig. 22. Geographical distribution of the two pseudocryptic species in the Western Palaearctic. A. Lasioglossum villosulum (Kirby, 1802). B. L. medinai (Vachal, 1895).
Other cryptic or pseudocryptic species of Halictidae may be discovered in Europe in the future, thanks to molecular analyses. In this bee family, molecular analysis enables the identification of females of several species that are indistinguishable by morphology, whereas males generally have species-specific traits in the genitalia (Pauly et al. 2015). Therefore, we advocate developing a DNA library on a European scale to obtain reliable identifications of halictid bee species.