@lesnardUsingOptimalMatching2014

Using Optimal Matching Analysis in Sociology: Cost Setting and Sociology of Time

(2014) - Laurent Lesnard

Journal:: Advances in Sequence Analysis: Theory, Method, Applications
Link:: http://link.springer.com/10.1007/978-3-319-04969-4_3
DOI:: 10.1007/978-3-319-04969-4_3
Links::
Tags:: #paper #Methods #OptimalMatchingAnalysis
Cite Key:: [@lesnardUsingOptimalMatching2014]

Abstract

This paper is a reflection on the conditions required to use Optimal Matching Analysis (OMA) in sociology. The success of OMA in biology is not related to any supposed similarity of the method with biological processes but comes from setting costs in OMA in accordance with biological theory. As sequences in sociology are made of events and time, the determination of costs should be guided by sociological theories of time. After a discussion of the sociological meaning and consequences of costs, this paper comes back on the Dynamic Hamming Distance and the body of social theories of time (Durkheim, Elias, Bourdieu) from which it is derived as an example of how sociological theory can inform cost setting in using OMA in sociology.

Notes

“As sequences in sociology are made of events and time, the determination of costs should be guided by sociological theories of time. After a discussion of the sociological meaning and consequences of costs, this paper comes back on the Dynamic Hamming Distance and the body of social theories of time (Durkheim, Elias, Bourdieu) from which it is derived as an example of how sociological theory can inform cost setting in using OMA in sociology.” (Lesnard, 2014, p. 2)

“Substitution costs are determined on the basis of phylogenetic hypotheses and statistical results, while insertion and deletion costs are set more arbitrarily. Indeed, substitution costs must reflect the probability of change and are empirically estimated from sample sequences with established or probable phylogenetic links. For example, the PAM (Point Accepted Mutation) matrixes are estimated on the basis of proteins that are experimentally or hypothetically related phylogenetically (Dayhoff et al. 1978).” (Lesnard, 2014, p. 3)

“As Katherine Stovel and her colleagues stated, ‘The assignment of transformation costs haunts all optimal matching analyses’ (Stovel et al. 1996, p. 394). If the results are too sensitive to costs it means that the costs, and not the data, are producing the results (Wu 2000). Conversely, if the results are not sensitive to costs, how could OMA be valid (Levine 2000)?” (Lesnard, 2014, p. 4)

“When the data include several strong clusters, costs are important at two levels. First, costs play a role in cases that do not fit into stable groups as easily as others do.” (Lesnard, 2014, p. 4)

“Second, costs also have an impact on the distance between the strong clusters and, in the case of an ascending hierarchical classification, affect the way in which they might be aggregated.” (Lesnard, 2014, p. 4)

“Determining costs first requires understanding and interpreting them with regard to the processed data and purpose of the analysis. In the social sciences the order of states is not tied to a biochemical combination as it is in biology, but rather to time. When a state is inserted in, or deleted from a sequence to make it identical to another, a time shift occurs between these two sequences. The more these two operations are used, the more the elements to be compared will be distant in the original sequences.” (Lesnard, 2014, p. 4)

“Insertion and deletion operations therefore warp the position of the elements of the sequence. On the other hand, substitution operations respect the position of sequence elements, the trade-off being that states are replaced.” (Lesnard, 2014, p. 5)

“In sum, insertions and deletions shift the sequences in order to identify identically coded subsequences, while substitutions do not shift the sequences but replace states with others. In the first case the time structure of the sequences is altered, and in the second the states of which they are made undergo change (Lesnard 2010). When the substitution cost is double the insertion-deletion cost or more, substitution operations are no longer used since it is equivalent or less costly to insert the desired state and delete the other than it is to substitute them (Kruskal 1983). Conversely, when substitution costs are very low compared to insertion-deletion costs, the latter operations are no longer used.” (Lesnard, 2014, p. 5)
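
To make the trade-off between the two kinds of operations concrete, here is a minimal sketch of an optimal matching distance with a single constant substitution cost and a single insertion-deletion cost. The function name `om_distance`, the toy sequences, and the cost values are my own illustrative assumptions, not taken from the paper. The sketch suggests that once the substitution cost reaches twice the indel cost, raising it further no longer changes the distance, and that with a very high indel cost the result reduces to counting position-wise mismatches.

```python
def om_distance(a, b, sub_cost, indel_cost):
    """Minimal cost of turning sequence a into sequence b.

    Assumes one constant substitution cost for any pair of different
    states and one constant insertion/deletion cost (illustrative only).
    """
    n, m = len(a), len(b)
    # d[i][j] = minimal cost of turning a[:i] into b[:j]
    d = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        d[i][0] = i * indel_cost  # delete the first i states of a
    for j in range(1, m + 1):
        d[0][j] = j * indel_cost  # insert the first j states of b
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = 0.0 if a[i - 1] == b[j - 1] else sub_cost
            d[i][j] = min(
                d[i - 1][j] + indel_cost,   # deletion: shifts a in time
                d[i][j - 1] + indel_cost,   # insertion: shifts a in time
                d[i - 1][j - 1] + sub,      # substitution: keeps positions
            )
    return d[n][m]


# Hypothetical toy sequences: the same pattern, shifted by one time unit.
a = "HWSWS"
b = "WSWSH"

# Substitution cost >= 2 * indel cost: only indels are used.
shifted = om_distance(a, b, sub_cost=2.0, indel_cost=1.0)
shifted_higher_sub = om_distance(a, b, sub_cost=3.0, indel_cost=1.0)

# Very high indel cost: only substitutions are used (Hamming-like).
position_wise = om_distance(a, b, sub_cost=0.5, indel_cost=10.0)
```

In the first two calls the distance is driven by one deletion and one insertion, so the two sequences look close despite differing at every position. In the last call the time structure is kept fixed and every mismatched position is paid for with a substitution.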

“Thus, in their study of intergenerational social mobility, Halpin and Chan define substitution costs according to a nomenclature of social classes and their theoretical distance (Halpin and Chan 1998).” (Lesnard, 2014, p. 5)

“Typologies created with a matrix of substitution costs present two types of groups. The first includes sequences where some parts are close but time-shifted while the second includes sequences where identical or very similar sub-sequences are contemporaneous.” (Lesnard, 2014, p. 6)

“transitions provide partial information on social rhythms and it therefore seems reasonable to use this information; indeed, the use of a low substitution cost in the social sciences implies that the difference between the two states does not indicate that the sequences belong to two different groups of rhythms.” (Lesnard, 2014, p. 7)
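
As a note to self on how transitions can feed into substitution costs, here is a rough sketch that derives a cost matrix from observed transition rates. It assumes the common convention cost(a, b) = 2 - p(a→b) - p(b→a), pooled over all time points; this is a simplification and not the Dynamic Hamming Distance itself, which as I understand it lets the costs vary with time. The function names and toy sequences are hypothetical.

```python
from collections import Counter
from itertools import product


def transition_rates(sequences):
    """Estimate p(next state | current state), pooled over all positions."""
    counts = Counter()
    origin_totals = Counter()
    for seq in sequences:
        for cur, nxt in zip(seq, seq[1:]):
            counts[(cur, nxt)] += 1
            origin_totals[cur] += 1
    return {pair: c / origin_totals[pair[0]] for pair, c in counts.items()}


def substitution_costs(sequences):
    """Assumed convention: cost(a, b) = 2 - p(a->b) - p(b->a), 0 on the diagonal."""
    states = sorted({s for seq in sequences for s in seq})
    rates = transition_rates(sequences)
    costs = {}
    for a, b in product(states, repeat=2):
        if a == b:
            costs[(a, b)] = 0.0
        else:
            # Frequent transitions between a and b lower the cost of swapping them.
            costs[(a, b)] = 2.0 - rates.get((a, b), 0.0) - rates.get((b, a), 0.0)
    return costs


# Hypothetical toy data: H = home, W = work, S = sleep.
toy = ["HHWWH", "HWWWH", "HHSSH"]
costs = substitution_costs(toy)
```

Under this convention, states that are never observed to follow one another (here W and S) get the maximum cost of 2, while states linked by frequent transitions get a lower cost, which matches the idea that a low substitution cost treats two states as belonging to the same rhythm.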

“A large portion of my research using OMA focuses on time use surveys (Lesnard 2008; Lesnard 2009; Lesnard 2010; Lesnard and Kan 2011).” (Lesnard, 2014, p. 7)

“For Durkheim categories of understanding are not transcendental (that is to say innate, constitutive of human nature) but rather derive from the organization of life in society (Durkheim 1912).” (Lesnard, 2014, p. 9)