Applying the Conditional Decision Tree Method to Modeling Parents' School Choice
For citation
Tenisheva K.A., Savelyeva S.S., Alexandrov D.A. Applying the Conditional Decision Tree Method to Modeling Parents' School Choice // Sociology: Methodology, Methods, Mathematical Modeling (Sociology: 4M). 2018. No. 46. P. 44–84.
Abstract
The article presents an approach to studying choice that is new to sociology: the method of conditional decision trees. The logic of the method and its advantages are examined in detail, using parents' choice of school in two districts of St. Petersburg as an example. Decision trees are shown to be well suited to identifying groups that follow different decision-making strategies. The method can serve as an effective tool for modeling and interpreting the logic of decision making. It compares favorably with traditional modeling by logistic regression because it makes it possible to assess how homogeneous the preferences (choices) of the resulting groups are, rather than only identifying the factors that are key to the choice. The authors propose that scientific and applied social research on complex choice combine regression analysis with the decision tree method.
Keywords:
logistic regression, classification, choice modeling, conditional decision tree method, school choice.
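The abstract's point about assessing the homogeneity of choices within the resulting groups can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the records are invented toy data, the variable names (`higher_education`, `school_nearby`, `choice`) are hypothetical, and `leaf_of` is a hand-written stand-in for a fitted tree, not the authors' model or a conditional tree algorithm.

```python
from collections import Counter, defaultdict

# Invented toy data: (higher_education, school_nearby, choice) per family.
ROWS = (
    [(True, True, "selective")] * 3
    + [(True, False, "selective")] * 2
    + [(True, True, "nearest")] * 1
    + [(False, True, "nearest")] * 4
    + [(False, False, "nearest")] * 1
    + [(False, False, "selective")] * 3
)
FAMILIES = [
    {"higher_education": he, "school_nearby": near, "choice": choice}
    for he, near, choice in ROWS
]


def leaf_of(family):
    """Hypothetical hand-written tree: assign a family to a terminal group."""
    if family["higher_education"]:
        return "higher education"
    if family["school_nearby"]:
        return "no higher education, school nearby"
    return "no higher education, no school nearby"


def leaf_homogeneity(records, assign_leaf, outcome="choice"):
    """For each leaf, report its size, modal outcome and the modal share."""
    leaves = defaultdict(list)
    for record in records:
        leaves[assign_leaf(record)].append(record[outcome])
    summary = {}
    for leaf, outcomes in leaves.items():
        modal, count = Counter(outcomes).most_common(1)[0]
        summary[leaf] = {
            "n": len(outcomes),
            "modal": modal,
            "share": count / len(outcomes),
        }
    return summary


if __name__ == "__main__":
    for leaf, stats in leaf_homogeneity(FAMILIES, leaf_of).items():
        print(f"{leaf}: n={stats['n']}, modal={stats['modal']}, "
              f"share={stats['share']:.2f}")
```

A modal share close to 1 marks a group whose members choose almost uniformly, while a lower share marks a group with mixed strategies. A table of regression coefficients does not report this directly.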
Citation formats
APA
Tenisheva, K. A., Savelyeva, S. S., & Alexandrov, D. A. (2018). Applying the Conditional Decision Tree Method to Modeling Parents' School Choice. Sociology: Methodology, Methods, Mathematical Modeling (Sociology: 4M), (46), 44–84. Retrieved from https://soc4m.ru/index.php/soc4m/article/view/6124
Section
PRACTICES OF COLLECTING AND ANALYZING FORMALIZED DATA