An iterative approach to developing, refining and validating machine-scored constructed response assessments