1 Introduction
This is a book about analyzing dyadic data with latent variables using the SEM framework in R. As such, it occupies a unique position among the available books across several related domains. Kenny, Kashy, and Cook (2006), for example, is an excellent book focused exclusively on dyadic data analysis. It describes the challenges of managing dyadic data, as well as many of the most popular dyadic models. But though Kenny, Kashy, and Cook (2006) describe the SEM framework and its application to some of these models, they don't really mean "SEM" in the way that I do. That is, their treatment of SEM isn't particularly engaged with the prospect of modeling latent variables, whereas I am exclusively focused on the use of latent variables.
A book like Little (2013), meanwhile, describes the use of latent variable modeling for dependent data like dyadic data, albeit in the context of longitudinal research designs, where the source of dependency is repeated observation. As such, though Little (2013) holds some wisdom for those wishing to use latent variables in their models of dyadic data, it was not written with dyadic data or dyadic models in mind. And as it turns out, things get weird when you are analyzing dependent groups of two (as opposed to, say, individuals over the span of three, four, or five waves of repeated assessment).
There are many other excellent books, too, that offer additional guidance, to one (in)direct degree or another, for applying latent variable models to dyadic data, such as Brown (2015), Kline (2023), and Bolger and Laurenceau (2013). By writing this book, I mean to take nothing away from their value propositions; they are great resources. But they either do not consider the unique features of dyadic data or do not substantively engage with the distinctive benefits of modeling dyadic data with latent variables.
This is a book about analyzing dyadic data with latent variables using the SEM framework in R. And so I will exclusively discuss the analysis of dyadic data, and I will exclusively discuss its analysis with latent variable models.
In Act I of the book, I first lay out "The Big Picture" (Chapter 2) of the whats, whys, and hows of dyadic SEM with latent variables, and introduce some of the technical jargon I will use throughout the rest of the book. I then discuss some of the unique considerations of data management for dyadic data analysis (Chapter 3), before providing an overview of latent variable theory in the context of modeling dyadic data (Chapter 4).
In Act II of the book, I attempt to provide what I consider to be a "sufficient" overview of the conceptual and applied specifics of modeling latent variables, without yet engaging with how to extend this framework to the analysis of dyadic data. Chapter 5 is essential reading if you are unaccustomed to the statistical features of latent variable models (e.g., their visual depictions, the notation and interpretation of particular features, etc.) and common analytic practices within them (e.g., fixing or constraining parameters). I then discuss two related problems that must be resolved when fitting SEMs, model identification and setting the scale of the latent variable(s) (Chapter 6), before models can be estimated (Chapter 7). These topics set the table for describing how we evaluate (Chapter 8) and compare (Chapter 9) SEMs. I then work through all of this (and more) in an applied non-dyadic example (Chapter 10).
I then pause in Intermission 1 (Chapter 11) to provide some foreshadowing about the dyadic SEMs we will encounter in the subsequent chapters. For those approaching this book with some familiarity with the models described in Kenny, Kashy, and Cook (2006), this chapter will help transition you to thinking about those models recast in latent space.
In Act III, we finally get into the specification of models for dyadic data with latent variables, beginning with the simplest cross-sectional models possible: those intended to capture only one construct (i.e., "uni-construct" models) shared somehow between dyad members. These include the dyadic one-factor model (Chapter 12), the correlated two-factor model (Chapter 13), the bifactor model (Chapter 14), and the hierarchical factor model (Chapter 15). These models have a surprisingly interesting (and complex) relationship to one another, which I discuss in the subsequent chapter (Chapter 16). I also describe how to use dyadic invariance testing within these models to evaluate the generalizability of latent variable model parameters across partners (Chapter 17); such testing plays an important role in many other comparisons in dyadic data analysis, and is an interesting (if underappreciated) phenomenon in its own right. I conclude this section with a discussion of an important but vastly underappreciated issue: how to choose among competing uni-construct models for a given set of dyadic data (Chapter 18).
In Act IV, we move to discussing dyadic SEMs that are latent embodiments of the kinds of models that may seem more prototypical in dyadic data analysis (i.e., those covered in Kenny, Kashy, and Cook (2006)). These models involve the prediction of one dyad-related construct by another (i.e., they are bi-construct). I first describe bi-construct models in which the predictor construct and outcome construct share the same uni-construct dyadic model, including the Couple Interdependence Model (Chapter 19), the Actor-Partner Interdependence Model (Chapter 20), the Bifactor Structural Model (Chapter 21), and the Common Fate Model (Chapter 22). I conclude this section with a discussion of (and some encouragement for) how different uni-construct models could be combined into more boutique bi-construct models (Chapter 23).
We then pause once more, in Intermission 2 (Chapter 24), to discuss, with the knowledge of uni-construct and bi-construct dyadic SEMs under our belts, just how complicated the concept of "distinguishability" becomes when cast through the SEM lens.
Finally, in Act V, we delve into even more complex applications of dyadic SEM, including some practices that are not yet widespread, but which I hope will be on the (near) horizon of analytic practice in our field. These include the modeling of so-called "third variable" processes (Chapter 25), testing dyadic SEMs across groups (Chapter 26), the modeling of both dyadic and longitudinal dependency with latent variables (Chapter 27), and the deployment of data-driven exploratory models to provide a plausible dyadic measurement model (Chapter 28). I also discuss the application of (and need for more) Monte Carlo simulation studies to evaluate the performance of dyadic SEMs (and other modeling strategies) (Chapter 29); in that chapter, I also discuss how these simulations can help to inform sample size planning. I then end this section with some encouragement and guidance for how to contribute to open-source dyadic data modeling tools (Chapter 30), for those so inclined.
I’ll also (eventually) write a Conclusion to this book (Chapter 31), and I’m sure it’ll be very meaningful and impressive. But for now, I need to generate some content, before I can realize what it is I ought to conclude.
What I will not write about at length, however, is the basics of using R, the open-source, cross-platform statistical programming language that I use in my dyadic SEM work (and upon which this book and its applications currently rely). If you are entirely new to R, the good news is that most dyadic SEM applications require precious little fiddling around with basic data management in R. That is, you're often "good to go" soon after data importation. If you need additional scaffolding for using R, however, I encourage you to check out "R for Data Science", or "R4DS" as it's sometimes known (Wickham, Çetinkaya-Rundel, and Grolemund 2023). It's a gold mine of useful information for R users of all levels of comfort.
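To give you a sense of just how little base-R fluency is typically required, below is a minimal sketch of what "good to go soon after data importation" can look like. It assumes the lavaan package, and the file name and indicator names are hypothetical placeholders rather than data used elsewhere in this book:

```r
# A minimal sketch: import wide-format dyadic data (one row per dyad)
# and fit a simple two-factor dyadic CFA with lavaan. The file name
# and indicator names (x1_a:x3_a, x1_b:x3_b) are hypothetical.
library(lavaan)

dyad_data <- read.csv("dyad_data.csv")

dyad_model <- '
  # One latent factor per partner
  factor_a =~ x1_a + x2_a + x3_a
  factor_b =~ x1_b + x2_b + x3_b

  # The two factors are allowed to covary across partners
  factor_a ~~ factor_b
'

fit <- cfa(dyad_model, data = dyad_data)
summary(fit, fit.measures = TRUE, standardized = TRUE)
```

The recipe here, import the data, specify the model as a text string, fit it, and summarize the results, is representative of how little general-purpose R programming most of these models demand.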
And no: I will not provide analytic resources for other programming/statistical software (e.g., SPSS/AMOS, SAS, Mplus). Though I have sometimes done this in the past (e.g., Sakaluk and Short (2017); Sakaluk (2019)), and some of the models I describe herein are possible to specify in these other programs, I have decided that I am done supporting proprietary software. This may lose me some readers; so be it. I want to create learning resources and tools that are available to anyone and everyone, for free, and the increasing expense of these other software packages threatens what I see as a necessary mandate to democratize access to learning.