Predicting Math Success in an Online Tutoring System Using Language Data and Click-Stream Variables: A Longitudinal Analysis

Crossley, Scott; Karumbaiah, Shamya; Ocumpaugh, Jaclyn; Labrum, Matthew J.; Baker, Ryan S.

doi:10.4230/OASIcs.LDK.2019.25

Abstract

Previous studies have demonstrated strong links between students' linguistic knowledge, their affective language patterns, and their success in math. Other studies have shown that demographic and click-stream variables in online learning environments are important predictors of math success. This study builds on this research in two ways. First, it combines linguistic and click-stream variables with demographic information to improve the prediction of math success. Second, it examines how random variance, as found in repeated participant data, can explain math success beyond linguistic, demographic, and click-stream variables. The findings indicate that linguistic, demographic, and click-stream factors explained about 14% of the variance in math scores. Combined with random factors, these variables explained about 44% of the variance.
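
The two variance figures can be read as the fixed-effects and full-model shares of variance in a mixed-effects model. The abstract does not state how they were computed, so the following is only a sketch under one common convention (marginal and conditional R-squared); the symbols are assumptions introduced here for illustration, not notation from the paper.

```latex
% Assumed convention; the abstract does not specify the computation.
% \sigma^2_f          : variance of the fixed-effect predictions
%                       (linguistic, demographic, click-stream variables)
% \sigma^2_\alpha     : variance of the random effects
%                       (e.g. repeated measures per participant)
% \sigma^2_\varepsilon: residual variance
\begin{align}
R^2_{\text{marginal}} &= \frac{\sigma^2_f}{\sigma^2_f + \sigma^2_\alpha + \sigma^2_\varepsilon} \\
R^2_{\text{conditional}} &= \frac{\sigma^2_f + \sigma^2_\alpha}{\sigma^2_f + \sigma^2_\alpha + \sigma^2_\varepsilon}
\end{align}
```

Under this reading, the figure of about 14% would correspond to the marginal value and the figure of about 44% to the conditional value, which would leave a gap of roughly 30 percentage points attributable to the random factors.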

