Calculate the conditional Mahalanobis distance of a set of dependent variables after controlling for a set of independent variables.
Usage
cond_maha(
  data,
  R,
  v_dep,
  v_ind = NULL,
  v_ind_composites = NULL,
  mu = 0,
  sigma = 1,
  use_sample_stats = FALSE,
  label = NA
)
Arguments
- data
 Data.frame with the independent and dependent variables. Unless mu and sigma are specified, data are assumed to be z-scores.
- R
 Correlation among all variables.
- v_dep
 Vector of names of the dependent variables in your profile.
- v_ind
 Vector of names of independent variables you would like to control for.
- v_ind_composites
 Vector of names of independent variables that are composites of dependent variables.
- mu
 A vector of means. A single value means that all variables have the same mean.
- sigma
 A vector of standard deviations. A single value means that all variables have the same standard deviation.
- use_sample_stats
 If TRUE, estimate R, mu, and sigma from data. Only complete cases are used (i.e., no missing values in v_dep, v_ind, v_ind_composites).
- label
 Optional tag for labeling output.
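As a rough illustration of what the mu and sigma arguments describe, the base R sketch below converts raw scores to z-scores, first with a single mean and standard deviation shared by all variables and then with one value per variable. The data frame `d_raw` and its values are hypothetical, and the conversion shown is an assumption about how the arguments are applied, not code taken from the package.

```r
# Hypothetical raw scores on a standard-score metric (mean 100, SD 15)
d_raw <- data.frame(Read = c(85, 100, 130),
                    Math = c(115, 90, 100))

# A single value: every variable shares the same mean and SD
mu <- 100
sigma <- 15
z_single <- (as.matrix(d_raw) - mu) / sigma

# One value per variable: subtract each column's mean, divide by its SD
mu_vec <- c(Read = 100, Math = 100)
sigma_vec <- c(Read = 15, Math = 15)
z_vec <- sweep(sweep(as.matrix(d_raw), 2, mu_vec, "-"), 2, sigma_vec, "/")

z_single
```

Both forms give the same z-scores here because every variable has the same mean and standard deviation.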
Value
A list containing the conditional Mahalanobis distance and related output:
- dCM = Conditional Mahalanobis distance
- dCM_df = Degrees of freedom for the conditional Mahalanobis distance
- dCM_p = A proportion that indicates how unusual this profile is compared to profiles with the same independent variable values. For example, if dCM_p = 0.88, this profile is more unusual than 88 percent of profiles after controlling for the independent variables.
- dM_dep = Mahalanobis distance of just the dependent variables
- dM_dep_df = Degrees of freedom for the Mahalanobis distance of the dependent variables
- dM_dep_p = Proportion associated with the Mahalanobis distance of the dependent variables
- dM_ind = Mahalanobis distance of just the independent variables
- dM_ind_df = Degrees of freedom for the Mahalanobis distance of the independent variables
- dM_ind_p = Proportion associated with the Mahalanobis distance of the independent variables
- v_dep = Dependent variable names
- v_ind = Independent variable names
- v_ind_singular = Independent variables that can be perfectly predicted from the dependent variables (e.g., composite scores)
- v_ind_nonsingular = Independent variables that are not perfectly predicted from the dependent variables
- data = Data used in the calculations
- d_ind = Independent variable data
- d_inp_p = Assuming normality, cumulative distribution function of the independent variables
- d_dep = Dependent variable data
- d_dep_predicted = Predicted values of the dependent variables
- d_dep_deviations = d_dep - d_dep_predicted (i.e., residuals of the dependent variables)
- d_dep_residuals_z = Standardized residuals of the dependent variables
- d_dep_cp = Conditional proportions associated with standardized residuals
- d_dep_p = Assuming normality, cumulative distribution function of the dependent variables
- R2 = Proportion of variance in each dependent variable explained by the independent variables
- zSEE = Standardized standard error of the estimate for each dependent variable
- SEE = Standard error of the estimate for each dependent variable
- ConditionalCovariance = Covariance matrix of the dependent variables after controlling for the independent variables
- distance_reduction = 1 - (dCM / dM_dep) (Degree to which the independent variables decrease the Mahalanobis distance of the dependent variables. Negative reductions mean that the profile is more unusual after controlling for the independent variables. Returns 0 if dM_dep is 0.)
- variability_reduction = 1 - sum((X_dep - predicted_dep) ^ 2) / sum((X_dep - mu_dep) ^ 2) (Degree to which the independent variables decrease the variability of the dependent variables (X_dep). Negative reductions mean that the profile is more variable after controlling for the independent variables. Returns 0 if X_dep == mu_dep.)
- mu = Variable means
- sigma = Variable standard deviations
- d_person = Data frame consisting of Mahalanobis distance data for each person
- d_variable = Data frame consisting of variable characteristics
- label = Label slot
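To make several of these quantities concrete, here is a minimal base R sketch of the standard conditional-distribution formulas for one case in z-score units. It is not the package's implementation: the correlation matrix, the scores in `z`, and all object names are hypothetical, and it assumes there are no composite independent variables, so the degrees of freedom equal the number of dependent variables.

```r
# Hypothetical correlations: one independent variable (x), two dependent (y1, y2)
vars <- c("x", "y1", "y2")
R <- matrix(c(1.0, 0.5, 0.4,
              0.5, 1.0, 0.3,
              0.4, 0.3, 1.0),
            nrow = 3, byrow = TRUE, dimnames = list(vars, vars))
v_ind <- "x"
v_dep <- c("y1", "y2")

# One hypothetical case, already expressed as z-scores
z <- c(x = 1.0, y1 = -0.5, y2 = 1.5)

R_ii <- R[v_ind, v_ind, drop = FALSE]  # correlations among independent variables
R_di <- R[v_dep, v_ind, drop = FALSE]  # dependent-by-independent correlations
R_dd <- R[v_dep, v_dep, drop = FALSE]  # correlations among dependent variables

# Regression weights, predicted dependent scores, and deviations from them
B <- R_di %*% solve(R_ii)
predicted <- drop(B %*% z[v_ind])
deviations <- z[v_dep] - predicted

# Covariance of the dependent variables after controlling for x
cond_cov <- R_dd - B %*% t(R_di)

# Conditional and unconditional Mahalanobis distances
dCM <- sqrt(drop(t(deviations) %*% solve(cond_cov) %*% deviations))
dM_dep <- sqrt(drop(t(z[v_dep]) %*% solve(R_dd) %*% z[v_dep]))

# Proportion from the chi-square distribution (assumed df = number of dependent variables)
dCM_df <- length(v_dep)
dCM_p <- stats::pchisq(dCM^2, df = dCM_df)

# Reduction in distance, following the formula listed above
distance_reduction <- 1 - (dCM / dM_dep)
```

In this sketch, `cond_cov` plays the role of ConditionalCovariance and `deviations` the role of d_dep_deviations; the package output may differ in details such as how composite variables are handled.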
Examples
library(unusualprofile)
library(simstandard)
m <- "
Gc =~ 0.85 * Gc1 + 0.68 * Gc2 + 0.8 * Gc3
Gf =~ 0.8 * Gf1 + 0.9 * Gf2 + 0.8 * Gf3
Gs =~ 0.7 * Gs1 + 0.8 * Gs2 + 0.8 * Gs3
Read =~ 0.66 * Read1 + 0.85 * Read2 + 0.91 * Read3
Math =~ 0.4 * Math1 + 0.9 * Math2 + 0.7 * Math3
Gc ~ 0.6 * Gf + 0.1 * Gs
Gf ~ 0.5 * Gs
Read ~ 0.4 * Gc + 0.1 * Gf
Math ~ 0.2 * Gc + 0.3 * Gf + 0.1 * Gs"
# Generate 10 cases
d_demo <- simstandard::sim_standardized(m = m, n = 10)
# Get model-implied correlation matrix
R_all <- simstandard::sim_standardized_matrices(m)$Correlations$R_all
cond_maha(data = d_demo,
          R = R_all,
          v_dep = c("Math", "Read"),
          v_ind = c("Gf", "Gs", "Gc"))
#> Conditional Mahalanobis Distance = 1.0270, df = 2, p = 0.4098
#> Conditional Mahalanobis Distance = 1.0696, df = 2, p = 0.4356
#> Conditional Mahalanobis Distance = 1.0953, df = 2, p = 0.4511
#> Conditional Mahalanobis Distance = 1.9265, df = 2, p = 0.8436
#> Conditional Mahalanobis Distance = 0.4878, df = 2, p = 0.1122
#> Conditional Mahalanobis Distance = 2.3043, df = 2, p = 0.9297
#> Conditional Mahalanobis Distance = 0.1760, df = 2, p = 0.0154
#> Conditional Mahalanobis Distance = 2.0633, df = 2, p = 0.8810
#> Conditional Mahalanobis Distance = 2.4448, df = 2, p = 0.9496
#> Conditional Mahalanobis Distance = 0.7477, df = 2, p = 0.2438
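The printed p values appear consistent with evaluating the chi-square cumulative distribution function at the squared distance, using the printed degrees of freedom. This page does not state that formula, so treat it as an assumption. The base R check below uses three distances copied from the output above; because the printed distances are rounded, small discrepancies in the last digit are possible.

```r
# Three distances copied from the printed output (rounded to 4 decimals)
dCM <- c(1.0270, 2.4448, 0.1760)
p_printed <- c(0.4098, 0.9496, 0.0154)

# Assumed relationship: p = chi-square CDF of the squared distance, df = 2
p_check <- stats::pchisq(dCM^2, df = 2)

# Compare the recomputed proportions with the printed ones
data.frame(dCM = dCM, p_printed = p_printed, p_check = round(p_check, 4))
```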
