Data Quality Experiment
The match rate is a crucial indicator of data quality in fieldwork.
Achieving a perfect match rate can be challenging due to non-systematic errors.
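As a minimal sketch of the quantity discussed here, assuming the match rate is the share of callback answers that agree with the answers recorded at the original visit, it could be computed as below. The function name `match_rate` and the sample answers are hypothetical and are not taken from the study.

```python
def match_rate(original, callback):
    """Share of items whose callback answer equals the original answer."""
    if len(original) != len(callback):
        raise ValueError("answer lists must have the same length")
    if not original:
        raise ValueError("no answers to compare")
    matches = sum(1 for a, b in zip(original, callback) if a == b)
    return matches / len(original)


# Hypothetical answers to four questions for one respondent (illustration only).
visit_answers = ["yes", "3", "renter", "no"]
callback_answers = ["yes", "4", "renter", "no"]

rate = match_rate(visit_answers, callback_answers)
print(f"match rate: {rate:.2f}")
```

A single non-systematic error, such as the second answer above, is enough to pull the rate below 1.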
Experimental Design
A random sample of approximately 1000 respondents was called back.
Two types of questions were examined: those expected to have high match rates and those prone to change.
Half of the respondents were reminded of their initial response; the other half received no reminder.
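One way to produce the 50/50 reminder split is a seeded random shuffle of the callback sample. This is only a sketch under assumptions: the source does not say how the assignment was carried out, and `assign_reminder`, the integer respondent IDs, and the seed are hypothetical.

```python
import random


def assign_reminder(respondent_ids, seed=0):
    """Randomly assign half of the respondents to the reminder arm."""
    rng = random.Random(seed)  # fixed seed keeps the assignment reproducible
    ids = list(respondent_ids)
    shuffled = ids[:]
    rng.shuffle(shuffled)
    treated = set(shuffled[: len(shuffled) // 2])
    # True = reminded of the initial response, False = no reminder
    return {rid: (rid in treated) for rid in ids}


# Hypothetical sample of 1000 respondent IDs.
assignment = assign_reminder(range(1000), seed=42)
n_reminded = sum(assignment.values())
print(f"reminded: {n_reminded}, not reminded: {len(assignment) - n_reminded}")
```

Shuffling and splitting gives an exact half-and-half split, whereas an independent coin flip per respondent would only give roughly 50% in each arm.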
Logistic Regression Analysis Results
We estimate a simple logistic regression model to analyze factors potentially influencing the match rate.
Three main candidates:
Question type (high-match-rate or prone-to-change)
Days since visit (duration between the original visit and follow-up call)
Reminder (50% treatment, 50% control).
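The model described above can be sketched as a logistic regression of a match indicator (1 if the callback answer matches the original, 0 otherwise) on the three candidates. The example below is an illustration on synthetic data only: the variable names, the simulated effect sizes, and the plain gradient-ascent fitting routine are all assumptions, not the study's data, estimates, or software. In practice a statistics package would be used to obtain standard errors and significance tests.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(rows, labels, steps=500, lr=0.5):
    """Fit coefficients by gradient ascent on the mean log-likelihood."""
    beta = [0.0] * len(rows[0])
    for _ in range(steps):
        grad = [0.0] * len(beta)
        for x, y in zip(rows, labels):
            p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
            for j, xi in enumerate(x):
                grad[j] += (y - p) * xi
        beta = [b + lr * g / len(rows) for b, g in zip(beta, grad)]
    return beta


# Synthetic data (assumption): arbitrary effect sizes chosen for illustration.
rng = random.Random(0)
rows, labels = [], []
for _ in range(1000):
    high_match = rng.randint(0, 1)      # 1 = high-match-rate question
    days = rng.randint(1, 30) / 30.0    # days since visit, scaled to (0, 1]
    reminder = rng.randint(0, 1)        # 1 = respondent was reminded
    logit = -0.5 + 1.5 * high_match + 0.0 * days + 1.0 * reminder
    rows.append([1.0, high_match, days, reminder])
    labels.append(1 if rng.random() < sigmoid(logit) else 0)

names = ["intercept", "high_match_question", "days_since_visit_scaled", "reminder"]
coefs = dict(zip(names, fit_logistic(rows, labels)))
for name, value in coefs.items():
    print(f"{name}: {value:+.2f}")
```

A positive coefficient means the factor raises the odds of a match. The simulated data deliberately give days since visit no effect, so its fitted coefficient should come out small relative to the other two.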
Key Findings and Implications
Takeaways:
The type of question asked significantly influences the match rate.
Careful question selection is essential in ensuring response consistency.
Reminding the respondent of their initial response also significantly affects the match rate. Jogging the respondent’s memory can help in aligning the responses.
The passage of time alone may not be a significant factor affecting response consistency.
Conclusion:
These findings provide valuable insights for optimizing data collection processes and improving the match rate.
Enhancing data quality may require careful question selection and the use of reminder strategies.