
Estimate Dimorphism in a Univariate or Multivariate Sample
Function to calculate or estimate dimorphism for a univariate or multivariate sample.
Usage
dimorph(
x,
method = "SSD",
methodMulti = "GMM",
sex = NULL,
sex.female = 1,
center = "geomean",
ads = NULL,
templatevar = NULL,
completevars = F,
na.rm = T,
ncorrection = F,
details = F,
dfout = F
)
Arguments
- x
A dataframe, matrix, or vector of positive numbers corresponding to measurements for one or more size variable(s). Values must be in the original measurement space, not log-transformed.
- method
A character string specifying the univariate method used to calculate or estimate dimorphism. Options include:
"SSD": Sexual Size Dimorphism. Follows Smith (1999). Calculates actual sexual dimorphism in the sample as the ratio of mean male size to mean female size. Depending on center, sex-specific means are calculated either as geometric means or arithmetic means. Requires sex to be specified.
"MMR": Mean Method Ratio. Follows Godfrey et al. (1993). Splits the sample at its mean, then calculates the ratio of the mean of measurements larger than the overall mean to the mean of measurements smaller than the overall mean. If any measurements are exactly equal to the overall mean, they contribute to both the larger and smaller group as half an individual in a weighted mean. Depending on center, the overall mean and subgroup means are calculated either as geometric means or arithmetic means. Ignores sex.
"BDI": Binomial Dimorphism Index. Follows Reno et al. (2003). Given n measurements, calculates all possible ratios of the mean of larger specimens to the mean of smaller specimens when the sample is split into the k largest specimens and n-k smallest specimens, where k ranges from 1 to n-1. A weighted mean is then calculated for all ratios, where the weights are equal to the probability of k successes in n trials in a binomial distribution. Depending on center, means are calculated either as geometric means or arithmetic means. Ignores sex.
"ERM": Exact Resampling Method. Modification of Lee's (2001) Assigned Resampling Method (ARM) following Gordon (2025a). ARM is a resampling-based estimate of dimorphism that repeatedly samples two values with replacement from x, then calculates their ratio unless both values lie more than 0.5 standard deviations above the mean or both lie more than 0.5 standard deviations below the mean (in which case the pair is rejected). ARM typically oversamples the possible combinations of two values drawn from a small sample (as originally described it samples 1,000 pairs, whereas a sample of 20 measurements only has 210 possible pairs), and sampling with replacement biases dimorphism estimates downwards by incorporating multiple ratios of 1 whenever the same value is sampled twice and is not rejected by the retention criterion. "ERM" performs an exact resampling of all possible pairs of values without replacement, but otherwise follows Lee's algorithm. Depending on center, the procedure is applied in either logarithmic ("geomean") or raw ("mean") data space. Ignores sex.
"MoM": Method of Moments. Follows Josephson et al. (1996). Assumes that the sample is a mixture of two underlying lognormal distributions and uses three moments around the mean of the logged combined sex distribution to estimate the means of the underlying distributions. This calculation is always performed on the log-transformed data regardless of the value of center. Assumes that the sample contains an equal proportion of females and males and that those two subsamples have equal variance. Ignores sex.
"FMA": Finite Mixture Analysis. Follows Godfrey et al. (1993). Assumes that the sample is a mixture of two underlying normal or lognormal distributions. Assumes that the sample contains an equal proportion of females and males and that those two subsamples have equal variance, then estimates the maximum separation of the two means. Depending on center, the underlying distributions are treated as either normal ("mean") or lognormal ("geomean"). Ignores sex.
"BFM": Bayesian Finite Mixture. Follows Gordon (2025a). Assumes that the sample is a finite mixture of two underlying normal or lognormal distributions. Unlike "FMA" and "MoM", estimates the proportion of females and males assuming that they may not be equal, and uses a Bayesian Information Criterion (BIC) approach to select between a model that estimates a single variance for both sexes and a model that estimates variances separately for the two constituent distributions using mclustBIC. It then calculates the ratio of the two estimated means. Depending on center, the underlying distributions are treated as either normal ("mean") or lognormal ("geomean"). Ignores sex. When performed on lognormal data, this method is similar to the pdPeak method of Sasaki et al. (2021), particularly when the BIC procedure selects an equal variance model (which it typically does).
"CV": Coefficient of Variation. Calculates the coefficient of variation as the standard deviation of x divided by the mean of x, then multiplied by 100. This calculation is always performed on the raw data regardless of the value of center (an analogous method using logarithmic data is "sdlog"). Ignores sex. Additionally, Sokal and Braumann's (1980) sample size correction factor can be applied by setting ncorrection to TRUE, although this is FALSE by default.
"CVsex": Modified Coefficient of Variation. Calculates a modified version of the coefficient of variation: the standard deviation is replaced by the square root of the sum of squared differences of every value of x from the unweighted mean of the sex-specific means in x, divided by the square root of n-1; this is divided by the unweighted mean of the sex-specific means, then multiplied by 100. This calculation is always performed on the raw data regardless of the value of center (an analogous method using logarithmic data is "sdlogsex"). Requires sex. Additionally, Sokal and Braumann's (1980) sample size correction factor can be applied by setting ncorrection to TRUE, although this is FALSE by default.
"sdlog": Standard Deviation of Logged Data. First, x is log-transformed using the natural logarithm, then the standard deviation is calculated. This is a measure of proportional variation of the values of x about their geometric mean; i.e., analogous to the coefficient of variation for a lognormal distribution. This calculation is always performed on the log-transformed data regardless of the value of center (an analogous method using raw data is "CV"). Ignores sex.
"sdlogsex": Modified Standard Deviation of Logged Data. First, x is log-transformed using the natural logarithm. Then a modified version of standard deviation is calculated: the square root of the sum of squared differences of every logged value of x from the unweighted mean of the sex-specific means of log-transformed x, divided by the square root of n-1. This calculation is always performed on the log-transformed data regardless of the value of center (an analogous method using raw data is "CVsex"). Requires sex.
Defaults to "SSD".
- methodMulti
A character string specifying the multivariate method used to calculate or estimate dimorphism. Note that regardless of the value of this argument, multivariate estimation procedures will only be carried out if x is a multivariate dataset. Options include:
"GMsize": If x is a dataframe or matrix, this method first calculates overall size as the geometric mean of measurements in all variables for those specimens that are complete for all variables in the data set. The selected univariate method is then applied to this measure of overall size. If any specimens are missing data, those specimens will be dropped from the analysis if na.rm=TRUE, or the function will return NA if na.rm=FALSE.
"GMM": Follows the Geometric Mean Method of Gordon et al. (2008). The selected univariate method is applied to all variables separately, then the geometric mean is calculated of the dimorphism estimates for all variables to produce a single estimate for the whole data set. Note that this methodology is not appropriate for variance-based univariate methods; i.e., "CV", "CVsex", "sdlog", and "sdlogsex".
"TM": Follows the Template Method of Reno et al. (2003). A variable of interest is specified by the user with the argument templatevar. The algorithm identifies a template individual that can be used to estimate the largest number of values for the selected variable of interest. It does so by calculating ratios between the value of the variable of interest and other variables in the template individual, which are then multiplied by the value of those other variables in individuals missing the target variable. The template individual selected is the specimen that allows for the largest number of target variable estimates, maximizing the data set for that variable. A user-selected univariate method is then applied to the combined data set of actual and estimated values for the target variable. Note that this method has been critiqued by several authors on multiple grounds (see Gordon 2025a for a summary of those critiques and references), and is only included here for the sake of completeness.
Defaults to "GMM".
- sex
A vector indicating sex for the measurements in x. If present, must include exactly two groups and have the same length as the number of specimens in x. Non-factor vectors will be coerced to factors if possible. May be NULL since some methods do not require sex information. Methods which require sex information will generate an error if sex is NULL. For methods that do not require sex information, if sex is provided it will be ignored for the calculation of the estimate, but it will be used to report the actual proportion of females and males in the sample. Defaults to NULL.
- sex.female
An integer scalar (1 or 2) specifying which level of sex corresponds to female. Ignored if sex is NULL. Defaults to 1.
- center
A character string specifying the method used to calculate a mean, either "geomean" (default) which uses the geometric mean, or "mean" which uses the arithmetic mean. More broadly, "geomean" indicates analyses are conducted in logarithmic data space and "mean" indicates analyses are conducted in raw data space. Some methods can only be applied in one domain or the other: "CV" and "CVsex" are always calculated in raw data space and center will be set to "mean" for these methods regardless of the value set by the user; "MoM", "sdlog", and "sdlogsex" are always calculated in logarithmic data space and center will be set to "geomean" for these methods regardless of the value set by the user.
- ads
A vector of integer addresses for positions in the data vector x to be included in the calculation of dimorphism; any other data in x will be ignored. If ads is NULL then all data are included in the calculation. Defaults to NULL.
- templatevar
A character object or integer value specifying the name or column number of the variable in x to be estimated using the template method. Ignored if the template method is not used. Defaults to NULL.
- completevars
A logical scalar indicating whether geometric mean method multivariate estimates should require all variables to contain enough observations to calculate the selected univariate estimator. If some variables need to be dropped, NA will be returned if completevars is TRUE, while the geometric mean of the dimorphism estimates for the remaining variables will be returned if completevars is FALSE. Defaults to FALSE (although this is always TRUE when dimorph is called by bootdimorph, SSDtest, or resampleSSD).
- na.rm
A logical scalar indicating whether NA values should be stripped before the computation proceeds. Defaults to TRUE.
- ncorrection
A logical scalar indicating whether to apply Sokal and Braumann's (1980) sample size correction factor to CV estimates. Defaults to FALSE.
- details
A logical scalar indicating whether variable names and specimen names should be retained (if available) as attributes in the output object. Defaults to FALSE.
- dfout
A logical scalar indicating whether the result should be given as a dimorphEstDF object; if FALSE, returns a dimorphEst object. Defaults to FALSE.
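As a rough illustration of the "MMR" and "BDI" descriptions under method, the following base R sketch computes both on the geometric-mean scale (center = "geomean"). The helper names (geomean, mmr_sketch, bdi_sketch), the tolerance used to detect ties with the mean, the binomial success probability of 0.5, and the use of an arithmetic weighted mean of the ratios are assumptions made for this sketch. It is not the package's implementation and may not reproduce its output exactly.

```r
# Hypothetical helpers for illustration only; not exported by the package.

# Weighted geometric mean (all weights 1 by default).
geomean <- function(v, w = rep(1, length(v))) exp(sum(w * log(v)) / sum(w))

# MMR sketch: split at the overall geometric mean; values equal to the mean
# (within a small tolerance, an assumption) count as half an individual in
# both the larger and the smaller group.
mmr_sketch <- function(x) {
  m    <- geomean(x)
  tie  <- abs(x - m) <= sqrt(.Machine$double.eps) * m
  w_hi <- ifelse(tie, 0.5, as.numeric(x > m))
  w_lo <- ifelse(tie, 0.5, as.numeric(x < m))
  geomean(x, w_hi) / geomean(x, w_lo)
}

# BDI sketch: ratio of the k largest to the n-k smallest for k = 1..n-1,
# averaged with binomial weights (p = 0.5 is an assumption).
bdi_sketch <- function(x, p = 0.5) {
  xs <- sort(x, decreasing = TRUE)
  n  <- length(xs)
  k  <- seq_len(n - 1)
  ratios <- vapply(k, function(i) geomean(xs[1:i]) / geomean(xs[(i + 1):n]),
                   numeric(1))
  w <- dbinom(k, size = n, prob = p)
  sum(w * ratios) / sum(w)
}

x <- c(1, 2, 4)
mmr_sketch(x)   # the value 2 ties with the geometric mean and is split
bdi_sketch(x)
```

Neither helper uses sex information, which matches the note that both methods ignore sex.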
Value
An object of class dimorphEst or dimorphEstDF. dimorphEst objects are numeric
vectors of length one corresponding to measured or estimated dimorphism in x, with associated information
preserved as attributes. dimorphEstDF objects are single-row data frames that contain the dimorphism
estimate for x along with other associated information. Applying summary to either type of
object provides information about the dataset and method used to generate it.
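The attribute layout of a dimorphEst object can be explored with base R accessors. The sketch below builds a stand-in object by hand, assuming the "details" attribute and the field names printed by str() in the Examples. The numbers are the rounded values from that example output, and a real object would come from a call such as dimorph(..., details=TRUE).

```r
# Hand-built stand-in that mimics part of the structure shown by str(gorSSD)
# in the Examples; it is an assumption for illustration, not package output.
est <- structure(
  c(SSD = 1.22447),
  details = list(
    methodUni   = "SSD",
    center      = "geomean",
    ratio.means = c(numerator = 49.98, denominator = 40.81)
  ),
  class = "dimorphEst"
)

det <- attr(est, "details")   # list of associated information
det$methodUni                 # univariate method used
det$ratio.means               # numerator and denominator of the ratio

# Ratio recomputed from the rounded means; close to, but not exactly,
# the stored estimate because the means were rounded for display.
ratio <- unname(det$ratio.means["numerator"] / det$ratio.means["denominator"])
ratio

as.numeric(est)               # the bare numeric estimate without attributes
```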
References
Godfrey LR, Lyon SK, Sutherland MR. (1993) Sexual dimorphism in large-bodied primates: The case of the subfossil lemurs. American Journal of Physical Anthropology. 90:315-334. (https://doi.org/10.1002/ajpa.1330900306)
Gordon AD. (2025a) Interpreting statistical significance in hominin dimorphism: Power and Type I error rates for resampling tests of univariate and missing-data multivariate size dimorphism estimation methods in the fossil record. Journal of Human Evolution. 199:103630. (https://doi.org/10.1016/j.jhevol.2024.103630)
Gordon AD, Green DJ, Richmond BG. (2008) Strong postcranial size dimorphism in Australopithecus afarensis: Results from two new resampling methods for multivariate data sets with missing data. American Journal of Physical Anthropology. 135:311-328. (https://doi.org/10.1002/ajpa.20745)
Josephson SC, Juell KE, Rogers AR. (1996) Estimating sexual dimorphism by method-of-moments. American Journal of Physical Anthropology. 100:191-206. (https://doi.org/10.1002/(SICI)1096-8644(199606)100:2<191::AID-AJPA3>3.0.CO;2-0)
Lee S-H. (2001) Assigned Resampling Method: A new method to estimate size sexual dimorphism in samples of unknown sex. Anthropological Review. 64:21–39. (https://doi.org/10.18778/1898-6773.64.02)
Reno PL, Meindl RS, McCollum MA, Lovejoy CO. (2003) Sexual dimorphism in Australopithecus afarensis was similar to that of modern humans. Proceedings of the National Academy of Sciences. 100:9404-9409. (https://doi.org/10.1073/pnas.1133180100)
Sasaki T, Semaw S, Rogers MJ, Simpson SW, Beyene Y, Asfaw B, White TD, Suwa G. (2021) Estimating sexual size dimorphism in fossil species from posterior probability densities. Proceedings of the National Academy of Sciences. 118:e2113943118. (https://doi.org/10.1073/pnas.2113943118)
Sokal RR, Braumann CA. (1980) Significance tests for coefficients of variation and variability profiles. Systematic Zoology. 29:50–66. (https://doi.org/10.2307/2412626)
Smith RJ. (1999) Statistics of sexual size dimorphism. Journal of Human Evolution. 36:423-458. (https://doi.org/10.1006/jhev.1998.0281)
Examples
## Univariate estimates:
data(apelimbart)
gorillas <- apelimbart[apelimbart$Species=="Gorilla gorilla",]
# Next line would generate an error: sex is required
# dimorph(x=gorillas$FHSI, method="SSD")
gorSSD <- dimorph(x=gorillas$FHSI, # variable and specimen names not preserved
                  method="SSD", sex=gorillas$Sex, details=TRUE)
gorSSD2 <- dimorph(x=gorillas[,"FHSI",drop=FALSE], # variable and specimen names preserved
                   method="SSD", sex=gorillas$Sex, details=TRUE)
gorSSD
#> SSD
#> 1.22447
str(gorSSD)
#> 'dimorphEst' Named num 1.22
#> - attr(*, "details")=List of 19
#> ..$ estimate : logi NA
#> ..$ methodUni : chr "SSD"
#> ..$ methodMulti : logi NA
#> ..$ center : chr "geomean"
#> ..$ n.vars.overall : num 1
#> ..$ n.specimens.overall : int 94
#> ..$ proportion.female.overall : num 0.5
#> ..$ n.vars.realized : num 1
#> ..$ n.specimens.realized : int 94
#> ..$ proportion.female.realized : num 0.5
#> ..$ proportion.missingdata.overall : num 0
#> ..$ proportion.missingdata.realized: num 0
#> ..$ proportion.templated : logi NA
#> ..$ template.specimen : logi NA
#> ..$ ratio.means : Named num [1:2] 50 40.8
#> .. ..- attr(*, "names")= chr [1:2] "numerator" "denominator"
#> ..$ vars.used : logi NA
#> ..$ specimens.used : chr [1:94] "specimen_1" "specimen_2" "specimen_3" "specimen_4" ...
#> ..$ model.parameters : logi NA
#> ..$ estvalues : chr "raw"
#> - attr(*, "names")= chr "SSD"
summary(gorSSD)
#> estimate: 1.22447
#> univariate method: SSD
#> no. of variables (overall): 1
#> no. of specimens (overall): 94
#> female proportion of sample (overall): 0.5
#> no. of variables (realized): 1
#> no. of specimens (realized): 94
#> female proportion of sample (realized): 0.5
#> proportion of missing data (overall): 0
#> proportion of missing data (realized): 0
#> mean function: geometric mean
#> ratio numerator and denominator: 49.98, 40.81
summary(gorSSD2) # results are identical to 'gorSSD'
#> estimate: 1.22447
#> univariate method: SSD
#> no. of variables (overall): 1
#> no. of specimens (overall): 94
#> female proportion of sample (overall): 0.5
#> no. of variables (realized): 1
#> no. of specimens (realized): 94
#> female proportion of sample (realized): 0.5
#> proportion of missing data (overall): 0
#> proportion of missing data (realized): 0
#> mean function: geometric mean
#> ratio numerator and denominator: 49.98, 40.81
summary(gorSSD, verbose=TRUE)
#> estimate: 1.22447
#> univariate method: SSD
#> no. of variables (overall): 1
#> no. of specimens (overall): 94
#> female proportion of sample (overall): 0.5
#> no. of variables (realized): 1
#> no. of specimens (realized): 94
#> female proportion of sample (realized): 0.5
#> proportion of missing data (overall): 0
#> proportion of missing data (realized): 0
#> mean function: geometric mean
#> ratio numerator and denominator: 49.98, 40.81
#>
#> Included specimens:
#> specimen_1
#> specimen_2
#> specimen_3
#> specimen_4
#> specimen_5
#> specimen_6
#> specimen_7
#> specimen_8
#> specimen_9
#> specimen_10
#> specimen_11
#> specimen_12
#> specimen_13
#> specimen_14
#> specimen_15
#> specimen_16
#> specimen_17
#> specimen_18
#> specimen_19
#> specimen_20
#> specimen_21
#> specimen_22
#> specimen_23
#> specimen_24
#> specimen_25
#> specimen_26
#> specimen_27
#> specimen_28
#> specimen_29
#> specimen_30
#> specimen_31
#> specimen_32
#> specimen_33
#> specimen_34
#> specimen_35
#> specimen_36
#> specimen_37
#> specimen_38
#> specimen_39
#> specimen_40
#> specimen_41
#> specimen_42
#> specimen_43
#> specimen_44
#> specimen_45
#> specimen_46
#> specimen_47
#> specimen_48
#> specimen_49
#> specimen_50
#> specimen_51
#> specimen_52
#> specimen_53
#> specimen_54
#> specimen_55
#> specimen_56
#> specimen_57
#> specimen_58
#> specimen_59
#> specimen_60
#> specimen_61
#> specimen_62
#> specimen_63
#> specimen_64
#> specimen_65
#> specimen_66
#> specimen_67
#> specimen_68
#> specimen_69
#> specimen_70
#> specimen_71
#> specimen_72
#> specimen_73
#> specimen_74
#> specimen_75
#> specimen_76
#> specimen_77
#> specimen_78
#> specimen_79
#> specimen_80
#> specimen_81
#> specimen_82
#> specimen_83
#> specimen_84
#> specimen_85
#> specimen_86
#> specimen_87
#> specimen_88
#> specimen_89
#> specimen_90
#> specimen_91
#> specimen_92
#> specimen_93
#> specimen_94
summary(gorSSD2, verbose=TRUE) # variable and specimen names preserved
#> estimate: 1.22447
#> univariate method: SSD
#> no. of variables (overall): 1
#> no. of specimens (overall): 94
#> female proportion of sample (overall): 0.5
#> no. of variables (realized): 1
#> no. of specimens (realized): 94
#> female proportion of sample (realized): 0.5
#> proportion of missing data (overall): 0
#> proportion of missing data (realized): 0
#> mean function: geometric mean
#> ratio numerator and denominator: 49.98, 40.81
#>
#> Included variables:
#> FHSI
#>
#> Included specimens:
#> AIMZ 13488
#> CMNH HTB 1423
#> CMNH HTB 1710
#> CMNH HTB 1725
#> CMNH HTB 1743
#> CMNH HTB 1756
#> CMNH HTB 1765
#> CMNH HTB 1806
#> CMNH HTB 1846
#> CMNH HTB 1849
#> CMNH HTB 1851
#> CMNH HTB 1852
#> CMNH HTB 1854
#> CMNH HTB 1856
#> CMNH HTB 1897
#> CMNH HTB 1997
#> CMNH HTB 2069
#> CMNH HTB 3393
#> MNHN 1856-67
#> PCM Gg-C1-042
#> PCM Gg-C1-096
#> PCM Gg-C1-098
#> PCM Gg-C1-149
#> PCM Gg-C1-150
#> PCM Gg-FC146
#> PCM Gg-M035
#> PCM Gg-M058
#> PCM Gg-M095
#> PCM Gg-M096
#> PCM Gg-M136
#> PCM Gg-M138
#> PCM Gg-M150
#> PCM Gg-M174
#> PCM Gg-M329
#> PCM Gg-M470
#> PCM Gg-M696
#> PCM Gg-M716
#> PCM Gg-M786
#> PCM Gg-M798
#> PCM Gg-M840
#> PCM Gg-M856
#> PCM Gg-M877
#> PCM Gg-M902
#> PCM Gg-M932
#> PCM Gg-Z6-33
#> RBINS 33238
#> ZSM 1954/0201
#> AIMZ 6841
#> AIMZ 6884
#> AIMZ PAL-11
#> AIMZ PAL-12
#> AIMZ PAL-8
#> CMNH HTB 1407
#> CMNH HTB 1712
#> CMNH HTB 1729
#> CMNH HTB 1731
#> CMNH HTB 1732
#> CMNH HTB 1746
#> CMNH HTB 1796
#> CMNH HTB 1797
#> CMNH HTB 1859
#> CMNH HTB 1954
#> CMNH HTB 1991
#> CMNH HTB 1994
#> CMNH HTB 2028
#> CMNH HTB 2745
#> CMNH HTB 2767
#> CMNH HTB 3391
#> CMNH HTB 3400
#> CMNH HTB 3404
#> MNHN 1866-92
#> MNHN 1899-16
#> MNHN 1912-475
#> MNHN 1914-99
#> MNHN 1931-657
#> MNHN 1982-56
#> MNHN 2007-1493
#> MNHN A12747
#> PCM Gg-C1-099
#> PCM Gg-C1-105
#> PCM Gg-C1-106
#> PCM Gg-C1-229
#> PCM Gg-M183
#> PCM Gg-M264
#> PCM Gg-M372
#> PCM Gg-M687
#> PCM Gg-M962
#> PCM Gg-Z1-30
#> PCM Gg-Z2-65
#> PCM Gg-Z6-32
#> RBINS 871
#> ZSM 1908/0034
#> ZSM 1911/2397
#> ZSM 1962/0333
# A subset of specimens can be specified for analysis using 'ads'
summary(dimorph(x=gorillas$FHSI, method="SSD", sex=gorillas$Sex, ads=c(1:10, 51:60)))
#> estimate: 1.22771
#> univariate method: SSD
#> no. of variables (overall): 1
#> no. of specimens (overall): 20
#> female proportion of sample (overall): 0.5
#> no. of variables (realized): 1
#> no. of specimens (realized): 20
#> female proportion of sample (realized): 0.5
#> proportion of missing data (overall): 0
#> proportion of missing data (realized): 0
#> mean function: geometric mean
#> ratio numerator and denominator: 49.84, 40.6
# Methods for estimating dimorphism:
summary(dimorph(x=gorillas$FHSI, method="MMR"))
#> estimate: 1.2261
#> univariate method: MMR
#> no. of variables (overall): 1
#> no. of specimens (overall): 94
#> female proportion of sample (overall): unknown
#> no. of variables (realized): 1
#> no. of specimens (realized): 94
#> female proportion of sample (realized): unknown
#> proportion of missing data (overall): 0
#> proportion of missing data (realized): 0
#> mean function: geometric mean
#> ratio numerator and denominator: 49.69, 40.52
summary(dimorph(x=gorillas$FHSI, method="BDI"))
#> estimate: 1.2243
#> univariate method: BDI
#> no. of variables (overall): 1
#> no. of specimens (overall): 94
#> female proportion of sample (overall): unknown
#> no. of variables (realized): 1
#> no. of specimens (realized): 94
#> female proportion of sample (realized): unknown
#> proportion of missing data (overall): 0
#> proportion of missing data (realized): 0
#> mean function: geometric mean
summary(dimorph(x=gorillas$FHSI, method="MoM"))
#> estimate: 1.22663
#> univariate method: MoM
#> no. of variables (overall): 1
#> no. of specimens (overall): 94
#> female proportion of sample (overall): unknown
#> no. of variables (realized): 1
#> no. of specimens (realized): 94
#> female proportion of sample (realized): unknown
#> proportion of missing data (overall): 0
#> proportion of missing data (realized): 0
#> mean function: geometric mean
summary(dimorph(x=gorillas$FHSI, method="FMA"))
#> estimate: 1.1292
#> univariate method: FMA
#> no. of variables (overall): 1
#> no. of specimens (overall): 94
#> female proportion of sample (overall): unknown
#> no. of variables (realized): 1
#> no. of specimens (realized): 94
#> female proportion of sample (realized): unknown
#> proportion of missing data (overall): 0
#> proportion of missing data (realized): 0
#> mean function: geometric mean
#> ratio numerator and denominator: 47.99, 42.5
summary(dimorph(x=gorillas$FHSI, method="BFM"))
#> estimate: 1.21998
#> univariate method: BFM
#> no. of variables (overall): 1
#> no. of specimens (overall): 94
#> female proportion of sample (overall): unknown
#> no. of variables (realized): 1
#> no. of specimens (realized): 94
#> female proportion of sample (realized): unknown
#> proportion of missing data (overall): 0
#> proportion of missing data (realized): 0
#> mean function: geometric mean
#> ratio numerator and denominator: 49.72, 40.75
#> BFM model parameters:
#> BFM estimate of proportion of sample composed of smaller sex: 0.48289
#> BFM model of variance: equal for both sexes
#> BFM estimate of variance: 0.00354 (logged data)
summary(dimorph(x=gorillas$FHSI, method="ERM"))
#> estimate: 1.18265
#> univariate method: ERM
#> no. of variables (overall): 1
#> no. of specimens (overall): 94
#> female proportion of sample (overall): unknown
#> no. of variables (realized): 1
#> no. of specimens (realized): 94
#> female proportion of sample (realized): unknown
#> proportion of missing data (overall): 0
#> proportion of missing data (realized): 0
#> mean function: geometric mean
summary(dimorph(x=gorillas$FHSI, method="CV"))
#> estimate: 11.65676
#> univariate method: CV
#> no. of variables (overall): 1
#> no. of specimens (overall): 94
#> female proportion of sample (overall): unknown
#> no. of variables (realized): 1
#> no. of specimens (realized): 94
#> female proportion of sample (realized): unknown
#> proportion of missing data (overall): 0
#> proportion of missing data (realized): 0
#> mean function: arithmetic mean
summary(dimorph(x=gorillas$FHSI, method="CV", ncorrection=TRUE))
#> estimate: 11.68776
#> univariate method: CV (sample size correction factor applied)
#> no. of variables (overall): 1
#> no. of specimens (overall): 94
#> female proportion of sample (overall): unknown
#> no. of variables (realized): 1
#> no. of specimens (realized): 94
#> female proportion of sample (realized): unknown
#> proportion of missing data (overall): 0
#> proportion of missing data (realized): 0
#> mean function: arithmetic mean
summary(dimorph(x=gorillas$FHSI, method="CVsex", sex=gorillas$Sex))
#> estimate: 11.65676
#> univariate method: CVsex
#> no. of variables (overall): 1
#> no. of specimens (overall): 94
#> female proportion of sample (overall): 0.5
#> no. of variables (realized): 1
#> no. of specimens (realized): 94
#> female proportion of sample (realized): 0.5
#> proportion of missing data (overall): 0
#> proportion of missing data (realized): 0
#> mean function: arithmetic mean
summary(dimorph(x=gorillas$FHSI, method="CVsex", sex=gorillas$Sex, ncorrection=TRUE))
#> estimate: 11.68776
#> univariate method: CVsex (sample size correction factor applied)
#> no. of variables (overall): 1
#> no. of specimens (overall): 94
#> female proportion of sample (overall): 0.5
#> no. of variables (realized): 1
#> no. of specimens (realized): 94
#> female proportion of sample (realized): 0.5
#> proportion of missing data (overall): 0
#> proportion of missing data (realized): 0
#> mean function: arithmetic mean
summary(dimorph(x=gorillas$FHSI, method="sdlog"))
#> estimate: 0.11642
#> univariate method: sdlog
#> no. of variables (overall): 1
#> no. of specimens (overall): 94
#> female proportion of sample (overall): unknown
#> no. of variables (realized): 1
#> no. of specimens (realized): 94
#> female proportion of sample (realized): unknown
#> proportion of missing data (overall): 0
#> proportion of missing data (realized): 0
#> mean function: geometric mean
summary(dimorph(x=gorillas$FHSI, method="sdlogsex", sex=gorillas$Sex))
#> estimate: 0.11642
#> univariate method: sdlogsex
#> no. of variables (overall): 1
#> no. of specimens (overall): 94
#> female proportion of sample (overall): 0.5
#> no. of variables (realized): 1
#> no. of specimens (realized): 94
#> female proportion of sample (realized): 0.5
#> proportion of missing data (overall): 0
#> proportion of missing data (realized): 0
#> mean function: geometric mean
# Now setting 'dfout' to TRUE:
allmethods <- rbind(dimorph(x=gorillas$FHSI, method="SSD", sex=gorillas$Sex, dfout=TRUE),
                    dimorph(x=gorillas$FHSI, method="MMR", dfout=TRUE),
                    dimorph(x=gorillas$FHSI, method="BDI", dfout=TRUE),
                    dimorph(x=gorillas$FHSI, method="MoM", dfout=TRUE),
                    dimorph(x=gorillas$FHSI, method="FMA", dfout=TRUE),
                    dimorph(x=gorillas$FHSI, method="BFM", dfout=TRUE),
                    dimorph(x=gorillas$FHSI, method="ERM", dfout=TRUE),
                    dimorph(x=gorillas$FHSI, method="CV", dfout=TRUE),
                    dimorph(x=gorillas$FHSI, method="CVsex", sex=gorillas$Sex, dfout=TRUE),
                    dimorph(x=gorillas$FHSI, method="sdlog", dfout=TRUE),
                    dimorph(x=gorillas$FHSI, method="sdlogsex", sex=gorillas$Sex, dfout=TRUE))
allmethods
#> estimate methodUni methodMulti center n.vars.overall
#> SSD 1.224470 SSD NA geomean 1
#> MMR 1.226097 MMR NA geomean 1
#> BDI 1.224297 BDI NA geomean 1
#> MoM 1.226632 MoM NA geomean 1
#> FMA 1.129200 FMA NA geomean 1
#> BFM 1.219985 BFM NA geomean 1
#> ERM 1.182653 ERM NA geomean 1
#> CV 11.656756 CV NA mean 1
#> CVsex 11.656756 CVsex NA mean 1
#> sdlog 0.116417 sdlog NA geomean 1
#> sdlogsex 0.116417 sdlogsex NA geomean 1
#> n.specimens.overall proportion.female.overall n.vars.realized
#> SSD 94 0.5 1
#> MMR 94 NA 1
#> BDI 94 NA 1
#> MoM 94 NA 1
#> FMA 94 NA 1
#> BFM 94 NA 1
#> ERM 94 NA 1
#> CV 94 NA 1
#> CVsex 94 0.5 1
#> sdlog 94 NA 1
#> sdlogsex 94 0.5 1
#> n.specimens.realized proportion.female.realized
#> SSD 94 0.5
#> MMR 94 NA
#> BDI 94 NA
#> MoM 94 NA
#> FMA 94 NA
#> BFM 94 NA
#> ERM 94 NA
#> CV 94 NA
#> CVsex 94 0.5
#> sdlog 94 NA
#> sdlogsex 94 0.5
#> proportion.missingdata.overall proportion.missingdata.realized
#> SSD 0 0
#> MMR 0 0
#> BDI 0 0
#> MoM 0 0
#> FMA 0 0
#> BFM 0 0
#> ERM 0 0
#> CV 0 0
#> CVsex 0 0
#> sdlog 0 0
#> sdlogsex 0 0
#> proportion.templated
#> SSD NA
#> MMR NA
#> BDI NA
#> MoM NA
#> FMA NA
#> BFM NA
#> ERM NA
#> CV NA
#> CVsex NA
#> sdlog NA
#> sdlogsex NA
# Alternatively, using apply
res <- apply(data.frame(method=c("SSD","MMR","BDI","MoM","FMA","BFM",
                                 "ERM","CV","CVsex","sdlog","sdlogsex")),
             MARGIN=1,
             FUN=function(method, ...) dimorph(x=gorillas$FHSI, method=method,
                                               sex=gorillas$Sex, dfout=TRUE),
             simplify=FALSE)
as.data.frame(do.call(rbind, res))
#> estimate methodUni methodMulti center n.vars.overall
#> SSD 1.224470 SSD NA geomean 1
#> MMR 1.226097 MMR NA geomean 1
#> BDI 1.224297 BDI NA geomean 1
#> MoM 1.226632 MoM NA geomean 1
#> FMA 1.129200 FMA NA geomean 1
#> BFM 1.219985 BFM NA geomean 1
#> ERM 1.182653 ERM NA geomean 1
#> CV 11.656756 CV NA mean 1
#> CVsex 11.656756 CVsex NA mean 1
#> sdlog 0.116417 sdlog NA geomean 1
#> sdlogsex 0.116417 sdlogsex NA geomean 1
#> n.specimens.overall proportion.female.overall n.vars.realized
#> SSD 94 0.5 1
#> MMR 94 0.5 1
#> BDI 94 0.5 1
#> MoM 94 0.5 1
#> FMA 94 0.5 1
#> BFM 94 0.5 1
#> ERM 94 0.5 1
#> CV 94 0.5 1
#> CVsex 94 0.5 1
#> sdlog 94 0.5 1
#> sdlogsex 94 0.5 1
#> n.specimens.realized proportion.female.realized
#> SSD 94 0.5
#> MMR 94 0.5
#> BDI 94 0.5
#> MoM 94 0.5
#> FMA 94 0.5
#> BFM 94 0.5
#> ERM 94 0.5
#> CV 94 0.5
#> CVsex 94 0.5
#> sdlog 94 0.5
#> sdlogsex 94 0.5
#> proportion.missingdata.overall proportion.missingdata.realized
#> SSD 0 0
#> MMR 0 0
#> BDI 0 0
#> MoM 0 0
#> FMA 0 0
#> BFM 0 0
#> ERM 0 0
#> CV 0 0
#> CVsex 0 0
#> sdlog 0 0
#> sdlogsex 0 0
#> proportion.templated
#> SSD NA
#> MMR NA
#> BDI NA
#> MoM NA
#> FMA NA
#> BFM NA
#> ERM NA
#> CV NA
#> CVsex NA
#> sdlog NA
#> sdlogsex NA
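The "CV" and "CVsex" estimates above are identical, as are "sdlog" and "sdlogsex". A base R sketch of the definitions given under method suggests why: when the two sexes are equally represented, the unweighted mean of the sex-specific means equals the overall mean. The data and the helper names cv_sketch and cvsex_sketch are made up for this illustration and are not part of the package.

```r
# Made-up measurements with four specimens of each sex (balanced sample).
x   <- c(38, 40, 41, 43, 47, 49, 50, 52)
sex <- factor(rep(c("F", "M"), each = 4))

# Coefficient of variation on the raw data.
cv_sketch <- function(x) 100 * sd(x) / mean(x)

# Modified CV: deviations are taken from the unweighted mean of the
# sex-specific means rather than from the overall mean.
cvsex_sketch <- function(x, sex) {
  mid <- mean(tapply(x, sex, mean))
  100 * sqrt(sum((x - mid)^2) / (length(x) - 1)) / mid
}

cv_sketch(x)
cvsex_sketch(x, sex)           # same value as cv_sketch(x) in a balanced sample
cvsex_sketch(x[-1], sex[-1])   # differs from cv_sketch(x[-1]) once unbalanced
sd(log(x))                     # log-scale analogue corresponding to "sdlog"
```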
## Multivariate estimates:
# GMsize (only usable for complete datasets)
Gg.GMsize <- dimorph(x=gorillas[,c("FHSI","HHMaj","TPMAP","RHMaj")],
                     method="SSD", methodMulti="GMsize", sex=gorillas$Sex, details=TRUE)
Gg.GMsize
#> GMsize.SSD
#> 1.258901
summary(Gg.GMsize)
#> estimate: 1.2589
#> univariate method: SSD
#> multivariate method: GMsize
#> no. of variables (overall): 1 (geometric mean of 4 variables)
#> no. of specimens (overall): 94
#> female proportion of sample (overall): 0.5
#> no. of variables (realized): 1 (geometric mean of 4 variables)
#> no. of specimens (realized): 94
#> female proportion of sample (realized): 0.5
#> proportion of missing data (overall): 0
#> proportion of missing data (realized): 0
#> mean function: geometric mean
#> ratio numerator and denominator: 49.82, 39.58
# GMM (produces the same values for ratio estimators as GMsize when data are complete)
Gg.GMM1 <- dimorph(x=gorillas[,c("FHSI","HHMaj","TPMAP","RHMaj")],
method="SSD", methodMulti="GMM", sex=gorillas$Sex, details=TRUE)
# now with subset of gorilla data
Gg.GMM2 <- dimorph(x=gorillas[,c("FHSI","HHMaj","TPMAP","RHMaj")],
                   method="SSD", methodMulti="GMM", sex=gorillas$Sex,
                   ads=c(1:10, 51:60), details=TRUE)
summary(Gg.GMM1)
#> estimate: 1.2589
#> univariate method: SSD
#> multivariate method: GMM
#> no. of variables (overall): 4
#> no. of specimens (overall): 94
#> female proportion of sample (overall): 0.5
#> no. of variables (realized): 4
#> no. of specimens (realized): 94
#> female proportion of sample (realized): 0.5
#> proportion of missing data (overall): 0
#> proportion of missing data (realized): 0
#> mean function: geometric mean
summary(Gg.GMM2)
#> estimate: 1.26797
#> univariate method: SSD
#> multivariate method: GMM
#> no. of variables (overall): 4
#> no. of specimens (overall): 20
#> female proportion of sample (overall): 0.5
#> no. of variables (realized): 4
#> no. of specimens (realized): 20
#> female proportion of sample (realized): 0.5
#> proportion of missing data (overall): 0
#> proportion of missing data (realized): 0
#> mean function: geometric mean
## Now with some simulated fossil data
SSDvars <- c("FHSI", "TPML", "TPMAP", "TPLAP", "HHMaj",
             "HHMin", "RHMaj", "RHMin", "RDAP", "RDML")
Fs1 <- fauxil[fauxil$Species=="Fauxil sp. 1", SSDvars]
Fs1GMM <- dimorph(x=Fs1, method="MMR", methodMulti="GMM", details=TRUE)
Fs1TM <- dimorph(x=Fs1, method="MMR", methodMulti="TM",
                 templatevar="FHSI", details=TRUE)
#> Warning: The following variable(s) were removed because they
#> were not included in the template specimen:
#> TPLAP
#> HHMaj
#> RHMaj
#> RDML
#> Warning: The following specimens(s) were removed because they
#> did not have templatable variables:
#> CMNH HTB 1797
Fs1GMM
#> GMM.MMR
#> 1.252471
Fs1TM
#> TM.MMR
#> 1.277495
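Both estimates above use MMR as the univariate method. As described under `method`, MMR splits a sample at its mean and takes the ratio of the mean of the larger values to the mean of the smaller values, with values exactly at the mean counted as half an individual in each group. The sketch below illustrates that description for a single variable with geometric means. The names `geomean_w`, `mmr_sketch`, and `toy_size` are hypothetical, and this is not the package's code.

```r
# Illustrative helper: weighted geometric mean of strictly positive values
geomean_w <- function(v, w) exp(sum(w * log(v)) / sum(w))

# Sketch of MMR for one variable, assuming a geometric-mean center
mmr_sketch <- function(v) {
  m <- geomean_w(v, rep(1, length(v)))              # overall geometric mean
  # Weight 1 on the relevant side of the mean, 0.5 for exact ties, 0 otherwise.
  # Exact ties are rare with floating-point geometric means.
  w_large <- ifelse(v > m, 1, ifelse(v == m, 0.5, 0))
  w_small <- ifelse(v < m, 1, ifelse(v == m, 0.5, 0))
  geomean_w(v, w_large) / geomean_w(v, w_small)
}

# Hypothetical measurements (not taken from the fauxil dataset)
toy_size <- c(38, 40, 41, 43, 50, 52, 55, 57)
mmr_toy <- mmr_sketch(toy_size)
mmr_toy
```

Because the overall geometric mean of `toy_size` falls between 43 and 50, the result equals the geometric mean of the four largest values divided by the geometric mean of the four smallest.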
summary(Fs1GMM)
#> estimate: 1.25247
#> univariate method: MMR
#> multivariate method: GMM
#> no. of variables (overall): 10
#> no. of specimens (overall): 16
#> female proportion of sample (overall): unknown
#> no. of variables (realized): 10
#> no. of specimens (realized): 16
#> female proportion of sample (realized): unknown
#> proportion of missing data (overall): 0.775
#> proportion of missing data (realized): 0.775
#> mean function: geometric mean
summary(Fs1TM, verbose=TRUE)
#> estimate: 1.2775
#> univariate method: MMR
#> multivariate method: TM
#> no. of variables (overall): 10
#> no. of specimens (overall): 16
#> female proportion of sample (overall): unknown
#> no. of variables (realized): 6
#> no. of specimens (realized): 15
#> female proportion of sample (realized): unknown
#> proportion of missing data (overall): 0.775
#> proportion of missing data (realized): 0.8
#> proportion of template variable data estimated: 0.73333
#> template specimen: CMNH HTB 1710
#> mean function: geometric mean
#> ratio numerator and denominator: 51.96, 40.67
#>
#> Included variables:
#> FHSI
#> TPML
#> TPMAP
#> HHMin
#> RHMin
#> RDAP
#>
#> Included specimens:
#> PCM Gg-M877
#> PCM Gg-Z6-33
#> CMNH HTB 1997
#> CMNH HTB 1851
#> MNHN 1982-56
#> CMNH HTB 1729
#> CMNH HTB 1994
#> CMNH HTB 1732
#> CMNH HTB 1954
#> CMNH HTB 3400
Fs1Both <- rbind(dimorph(x=Fs1, method="MMR", methodMulti="GMM", details=TRUE, dfout=TRUE),
                 dimorph(x=Fs1, method="MMR", methodMulti="TM",
                         templatevar="FHSI", details=TRUE, dfout=TRUE))
#> Warning: The following variable(s) were removed because they
#> were not included in the template specimen:
#> TPLAP
#> HHMaj
#> RHMaj
#> RDML
#> Warning: The following specimens(s) were removed because they
#> did not have templatable variables:
#> CMNH HTB 1797
Fs1Both
#>         estimate methodUni methodMulti  center n.vars.overall
#> GMM.MMR 1.252471       MMR         GMM geomean             10
#> TM.MMR  1.277495       MMR          TM geomean             10
#>         n.specimens.overall proportion.female.overall n.vars.realized
#> GMM.MMR                  16                        NA              10
#> TM.MMR                   16                        NA               6
#>         n.specimens.realized proportion.female.realized
#> GMM.MMR                   16                         NA
#> TM.MMR                    15                         NA
#>         proportion.missingdata.overall proportion.missingdata.realized
#> GMM.MMR                          0.775                           0.775
#> TM.MMR                           0.775                           0.800
#>         proportion.templated
#> GMM.MMR                   NA
#> TM.MMR             0.7333333
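The missing-data columns above can be read as the share of empty cells in the specimen-by-variable table. That reading is an assumption rather than something the output states, although it is consistent with the printed values: 0.775 of 16 x 10 cells is 124 missing cells, and 0.8 of 15 x 6 cells is 72. A minimal sketch on a hypothetical matrix (`toy` and `realized` are illustrative names):

```r
# Hypothetical specimen-by-variable matrix with missing measurements
toy <- matrix(c(12, NA, NA, 30,
                NA, 21, NA, NA,
                11, NA, 18, NA),
              nrow = 3, byrow = TRUE,
              dimnames = list(c("spec1", "spec2", "spec3"),
                              c("A", "B", "C", "D")))

# "Overall" proportion: NA cells over all specimens and variables
prop_missing_overall <- mean(is.na(toy))

# "Realized" proportion: the same calculation after dropping the
# specimens and variables that an analysis could not use
realized <- toy[c("spec1", "spec3"), c("A", "B", "C")]
prop_missing_realized <- mean(is.na(realized))

c(overall = prop_missing_overall, realized = prop_missing_realized)
```

The realized proportion can be higher or lower than the overall one, depending on which rows and columns are dropped.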