We are delighted to see that there is interest in the resources we have posted on this blog (more than 5,000 page views as of today). Most recently, Rick Davies has used our dataset for a presentation at the 11th biennial EES conference (see earlier posts below), which compares decision tree analysis with Qualitative Comparative Analysis (QCA). Rick concludes that (1) a decision tree analysis of our data would have yielded fewer paths to evaluation effectiveness, and that (2) those paths would have differentiated more precisely between cases of effective and ineffective evaluation. Rick's presentation is available on YouTube. We have found it stimulating to examine the merits of decision tree analysis (and other methods) as compared to QCA.
However, we believe that Rick’s conclusions are flawed, for the following reasons (this is going to be a bit technical):
1. The decision tree analysis is not based on the same data set as our QCA. Rick has converted our data set into a binary format, and based his decision tree analysis on this changed data set.
Our QCA rests on a data set which specifies four different values for each condition – not just two, as in a binary data set. Our data is therefore twice as nuanced as the data used in Rick’s decision tree. While the datasets display some similarities, they are certainly not equivalent. An equivalent decision tree would have to take into account four different values for each condition in the tree, not just two. It is likely that in that case, a decision tree of similar precision would need to be much more complex to explain the outcome.
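To illustrate the information lost in such a conversion, here is a minimal Python sketch. The four membership values (0, 0.33, 0.67, 1) and the 0.5 crossover are assumptions chosen for illustration; they are a common four-value fuzzy-set scheme, not necessarily the calibration used in our review or in Rick's conversion.

```python
def dichotomise(score, threshold=0.5):
    """Collapse a fuzzy membership score to 0 or 1 at the given crossover."""
    return 1 if score > threshold else 0

# Hypothetical four-value memberships for a single condition (assumed values).
four_value = [0.0, 0.33, 0.67, 1.0]

# After conversion, "fully in" and "more in than out" become indistinguishable,
# as do "fully out" and "more out than in".
binary = [dichotomise(s) for s in four_value]
print(binary)  # [0, 0, 1, 1]
```

Cases scored 0.67 and 1 end up with the same binary value, so any analysis run on the converted data can no longer tell them apart.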
2. Decision tree analysis is compared with a type of QCA solution that is not meant to maximise parsimony – although the comparison focuses on parsimony.
QCA produces three types of solutions, called ‘conservative’, ‘intermediate’ and ‘parsimonious’ respectively. Rick has compared the results of the decision tree analysis with our intermediate QCA solution. An intermediate QCA solution is generally recommended for communication to audiences interested in the practical implications of QCA analyses; that is why we have used and argued for an intermediate QCA solution in our review report. However, an intermediate solution is not the most parsimonious solution QCA provides. If the purpose was to compare the parsimony of QCA results with those of decision trees, then the 'parsimonious' QCA solution should be used.
3. The decision tree analysis performs less well than stated in the presentation.
Similar criteria must be applied when comparing the number of paths (and the consistency value) in the QCA and in decision tree analysis respectively.
If the question is how many paths the respective method uses to cover cases of effective evaluations (i.e. to explain positive outcomes), then the answer for the decision tree is four, not three. There are four paths in the decision tree that cover cases with positive outcomes, and the consistency value for these four paths is not 82%, but 76%.
A side note: Contrary to the contention in Rick's presentation, none of the paths in our QCA is redundant. An aggregated total coverage of more than 100% is no indication of the redundancy of an individual path. While there may be cases that are covered by two or three paths, there can still be cases for each path that are covered only by that single path. In our solution, all paths have a unique coverage of more than 0. Otherwise, they would not feature in the solution, as they would be eliminated by the QCA algorithm.
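The point about coverage can be sketched with a small crisp-set example in Python. The cases and paths below are invented for illustration and are not taken from our dataset; the sketch only shows that raw coverage values can add up to more than 100% while every path still covers at least one case uniquely.

```python
def raw_coverage(path_cases, outcome_cases):
    """Share of outcome cases that are covered by a path."""
    return len(path_cases & outcome_cases) / len(outcome_cases)

# Hypothetical cases showing the outcome, and hypothetical paths covering them.
outcome = {"A", "B", "C", "D", "E", "F"}
paths = {
    "path1": {"A", "B", "C"},
    "path2": {"C", "D", "E"},
    "path3": {"E", "F"},
}

raw = {name: raw_coverage(cases, outcome) for name, cases in paths.items()}

unique = {}
for name, cases in paths.items():
    # Cases covered by any of the other paths.
    others = set().union(*(c for n, c in paths.items() if n != name))
    # Unique coverage: outcome cases covered by this path and by no other.
    unique[name] = len((cases - others) & outcome) / len(outcome)

print(sum(raw.values()) > 1)                 # raw coverage sums to more than 100%
print(all(u > 0 for u in unique.values()))   # yet no path is redundant
```

In this toy example, cases C and E are each covered by two paths, which pushes the summed raw coverage above 100%, but removing any one path would leave at least one case unexplained.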
4. The decision tree consistency measure is less rigorous than in QCA.
Consistency values in fsQCA take into account not only whether a case shows the outcome, but also the degree to which the case is in line with the posited set relationship. This is possible due to the “gradual” nature of the data in fsQCA. For the consistency value, it means that even if all cases covered by a path show the outcome, the consistency value can be smaller than 1. As the decision tree analysis is based on binary data, the extent to which cases contradict the posited explanation cannot be taken into account; a higher consistency value is therefore more easily achieved.
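A short Python sketch may make this difference concrete. It assumes the widely used fuzzy-set consistency formula for sufficiency (the sum of min(x, y) divided by the sum of x, where x is membership in the path and y is membership in the outcome); the membership scores are invented for illustration and the 0.5 crossover is an assumption.

```python
def consistency(path, outcome):
    """Sufficiency consistency: sum of min(x, y) divided by sum of x."""
    return sum(min(x, y) for x, y in zip(path, outcome)) / sum(path)

# Hypothetical fuzzy memberships for three cases (assumed values).
path_fuzzy = [1.0, 0.67, 0.67]
outcome_fuzzy = [0.67, 1.0, 0.67]

# The same cases after conversion to binary at an assumed 0.5 crossover.
path_binary = [1 if x > 0.5 else 0 for x in path_fuzzy]
outcome_binary = [1 if y > 0.5 else 0 for y in outcome_fuzzy]

fuzzy_value = consistency(path_fuzzy, outcome_fuzzy)
binary_value = consistency(path_binary, outcome_binary)

print(round(fuzzy_value, 2))  # 0.86
print(binary_value)           # 1.0
```

All three cases are members of both the path and the outcome, yet the first case has a higher membership in the path than in the outcome, which lowers the fuzzy consistency value. The binary version cannot register this and reports perfect consistency.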
Intrigued by the idea of 'triangulating' QCA results with decision tree analysis, we have converted our QCA dataset into a binary format (as Rick did, see point 1 above) and conducted a csQCA with that data. This allows for a more precise comparison of QCA with the decision tree analysis as conducted by Rick. (Incidentally, Rick Davies may have made a slight error when converting the data, as his decision tree features 29, not 28, effective evaluations.)
The resulting parsimonious solution consists of 7 paths and has a consistency value of 96%. The comparable values for the decision tree analysis are 4 paths and a consistency value of 76%. For the non-occurrence of outcomes, the parsimonious solution consists of 3 paths and has a consistency value of 100%. The comparable values for the decision tree analysis are 4 paths and a consistency value of 29%.
This means that, if we run the identical dataset through QCA and the decision tree algorithm respectively, then the decision tree analysis yields slightly more parsimonious findings (i.e. fewer paths) than QCA for the occurrence of effective evaluations. For the non-occurrence of effective evaluations, the QCA solution is more parsimonious. At the same time, applying the decision tree algorithm to our data set comes with a massive loss in accuracy for the findings regarding both outcomes.