Steve Kroon
Senior Lecturer: Computer Science, Stellenbosch University.
My personal web pages are available here.
Research
For topics I'd be interested in supervising, see this page.
Broad research areas: artificial intelligence/machine learning; statistical learning theory; probability and computing.
Here's a longer list of things that interest me: computer opponents for various games, including Go, Starcraft, Angband, and Ingenious; getting a better understanding of social networks; support vector machines and kernel methods; graphical models; statistical learning theory and generalization bounds; concentration inequalities; regularization; design of probabilistic algorithms and data structures, and probabilistic analysis of algorithms/data structures.
My prime requirement of a student is mathematical maturity. A good background
in mathematics and statistics are a real bonus, and good programming skills
will make your job a lot easier. If you also have a background in machine
learning, you may be my perfect student.
If you're interested in pursuing research in any of the directions above, please contact me.
Research output
Also see the page of our Monte Carlo Tree Search research group.
- Decision Trees for Computer Go Features (with Francois van Niekerk), accepted at the 2013 International Joint Conference on Artificial Intelligence Workshop on Computer Games, August 3, Beijing. This article proposes using decision trees for extracting domain knowledge in the context of Monte Carlo Tree Search for Computer Go.
- Binary Jumbled String Matching for Highly Run-Length Compressible Texts (with Golnaz Badkobeh, Gabriele Fici, and Zsuzsanna Lipták), Information Processing Letters (2013), DOI: 10.1016/j.ipl.2013.05.007. This article presents an alternative algorithm for jumbled pattern matching on binary strings making use of a different index to previous approaches. Typically, the index is smaller than the traditional approach, but query times are no longer constant. The advantage of our approach is most pronounced for strings with short run-length encoding.
- Monte-Carlo Tree Search Parallelisation for Computer Go (with Francois van Niekerk, Gert-Jan van Rooyen, and Cornelia Inggs), Proceedings of the 2012 Annual Research Conference of the South African Institute for Computer Scientists and Information Technologists, October 2012. This article presents results of parallelisation of Monte-Carlo Tree Search for multi-core and cluster systems in the Computer Go program Oakfoam. (Some results from this paper were also presented by Francois van Niekerk at a talk at the 2012 International Go Symposium, "New Work on MCTS Parallelisation and The State of the Art of Supercomputer Go and its Future".)
- Unsupervised Construction of Topic-based Twitter Lists (with Francois de Villiers and McElory Hoffmann), Proceedings of the 2012 ASE/IEEE International Conference on Social Computing (SocialCom 2012), Amsterdam, Netherlands, September 2012. This article investigates compares different document representation techniques, similarity measures, and clustering methods for unsupervised list construction according to topics tweeted about in Twitter.
- A Community-Based Model of Online Social Networks (with Leendert Botha), Proceedings of the 4th SNA-KDD Workshop on Social Network Mining and Analysis (SNAKDD 2010), Washington D.C., USA, July 2010 (accepted). (Also a poster presentation at the Eighth Workshop on Mining and Learning with Graphs (MLG-2010), Washington D.C., USA, July 2010.) This article describes a model for generating social networks with similar properties to existing social networks.
- Generalizing the Margin Concept to Arbitrary Classifiers (with Sarel Steel), Proceedings of the 57th Session of the International Statistics Institute, Durban, South Africa, August 2009 (abstract) - This paper highlights a result from my Ph.D. thesis, which extends the margin concept from thresholding real-valued classifiers to classifiers in general metric spaces.
- A PAC-Bayesian Generalization of a Result of Devroye (with Sarel Steel), Slides from a talk presented at the South African Statistical Conference, 2008 - This presentation highlights selected results from my PhD thesis concerning classical covering number bounds: a classical covering number bound for classification is generalized to regression and arbitrary ghost sample sizes, inter alia, and it is shown the resulting bound is actually a form of PAC-Bayesian bound employing a transductive "prior".
- A Framework for Estimating Risk, Ph.D. Thesis, Stellenbosch University, 2008 - This research was related primarily to risk estimation, notably constructing confidence intervals for the risk of a fitted model from the data used to fit the model. (slides from thesis defence)
- Getting to Grips with Support Vector Machines: Theory (with Christian Omlin), South African Statistical Journal, 38(2), pp. 93-114, 2004 - An introduction to the theory underlying SVMs, based on material in my Masters thesis.
- Getting to Grips with Support Vector Machines: Application (with Christian Omlin), South African Statistical Journal, 38(2), pp. 159-172, 2004 - Illustrates the use of LIBSVM for analyzing a simple data set.
- Support Vector Machines, Generalization Bounds and Transduction, Masters Thesis, Stellenbosch University, 2003 - Surveys the theory of Support Vector Machines and Generalization Bounds, and presents a transductive generalization bound.
- Putting the SVM in context (with Sarel Steel), Slides from a talk presented at the South African Statistical Conference, 2003 - This presentation presents the SVM from the regularization viewpoint, contrasting it with other well-known regularization techniques such as the lasso.
- Bounding Generalization of Support Vector Machines (with Christian Omlin), Slides from a talk presented at the South African Statistical Conference, 2003 - This presents the core ideas behind classical covering number bounds, applied to SVMs.
- A Covering Number PAC Bound for Transductive Problems with Applications to SVMs (with Christian Omlin), Unpublished manuscript, 2003 - This contains the transductive bound for SVMs presented in my Masters thesis.
Talks at colloquiua, seminars, etc.
- The Bias-Variance Dilemma and Regularization Paths, Slides from a series of 3 seminars presented to Vision and Learning at Stellenbosch research group, October-November 2009 - Covers bias-variance decomposition of mean-squared error, the role of regularization in this context, and then investigates the relationships between forward stagewise modelling, least angle regression, and the lasso..
- Margin Bounds for Arbitrary Classifiers, Slides from a seminar presented to Vision and Learning at Stellenbosch research group, September 2009 - This presentation was a slight revision of my ISI presentation earlier in the year.
- Covering Number Bounds and Statistical Learning Theory, Slides from a seminar presented to Vision and Learning at Stellenbosch research group, July 2009 - This presentation was an introduction to core concepts in deriving covering number bounds and their relationship to basic ideas of uniform laws of large numbers in statistical learning theory.
- Bounding Generalization of Support Vector Machines, Slides from Department of Statistics Seminar series, Stellenbosch University, 2003 - This presents the core ideas behind classical covering number bounds, applied to SVMs.
- An introduction to Support Vector Machines, Slides for a Masters course in Intelligent Systems at the Department of Computer Science, Stellenbosch University, 2000 - Very simple introduction to SVMs
Graduate students
Project notes for honours students w.r.t. supervision (largely applicable to Masters and PhD students as well). (Last updated: 25 October 2012.)
Current
- Hilgard Bell (from 2012): Hilgard is working on using Monte-Carlo Tree Search trees to assess difficulty of puzzles for procedural content generation.
- Francois van Niekerk (from 2012): Francois is working on improving computer opponents for Go.
- Francois de Villiers (from 2011): Francois is working on recommender systems with social media.
Past
- James Saunders (2011): James worked on using graphical models for social-ecological systems. Now at Business Optics.
Completed
- Leendert Botha (December 2011): Modeling Online Social Networks using Quasi-clique Communities. Now at Google.
- Andre Kriek (March 2009): RoboCup Formation Modeling. Now at 4i Software Development.
Old pages
Please report any errors or inaccuracies on this page by emailing Steve Kroon.