# Honors Project Topics

Find below a list of projects that are available under my supervision. You are welcome to contact me if you want more detail on these topics.

### AE1: Multi-objective Control Parameter Tuning

Just about all optimization and machine learning algorithms have control parameters. The drawback of these control parameters is that the performance of the algorithm is very sensitive to their values, and the best values are usually very problem dependent. Consequently, if one wants to apply an optimized instance of an algorithm, these control parameters have to be tuned. A number of different approaches to control parameter tuning can be found, including factorial design, F-Race, and iterated F-Race, amongst others. However, these approaches tune control parameters with respect to one performance criterion, usually either accuracy or computational effort.

This study will develop a multi-objective approach to control parameter tuning, considering a number of performance criteria, including accuracy, efficiency, robustness, and stability. A very simple approach will be to evaluate performance for a range of parameter values with respect to each of the performance criteria, to rank the different control parameter values per performance criterion, and then to compute an average rank over all of the performance criteria.
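The simple average-rank approach described above can be sketched as follows. This is only an illustration under stated assumptions: the function names, the three criteria, and all measured values are invented for the example, and ties are resolved by sharing the average rank, which is one possible choice among several.

```python
from statistics import mean

def rank_values(values, higher_is_better=False):
    """Return 1-based ranks (1 = best); tied values share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=higher_is_better)
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2 + 1  # average of the 1-based positions pos+1 .. end+1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def average_ranks(results, higher_is_better=()):
    """results: {config: {criterion: value}} -> {config: mean rank over all criteria}."""
    configs = list(results)
    criteria = list(results[configs[0]])
    per_criterion = {
        c: rank_values([results[k][c] for k in configs], c in higher_is_better)
        for c in criteria
    }
    return {k: mean(per_criterion[c][i] for c in criteria) for i, k in enumerate(configs)}

# Hypothetical measurements for three values of one control parameter (all values invented).
results = {
    "w=0.4": {"error": 0.10, "evaluations": 5000, "success_rate": 0.90},
    "w=0.7": {"error": 0.05, "evaluations": 8000, "success_rate": 0.95},
    "w=0.9": {"error": 0.20, "evaluations": 9000, "success_rate": 0.60},
}
avg = average_ranks(results, higher_is_better={"success_rate"})
best = min(avg, key=avg.get)  # configuration with the lowest (best) average rank
```

An average rank weighs all criteria equally, which is exactly the limitation that a non-dominance based ranking avoids.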

A second approach will be developed to use non-dominance based ranking of algorithms based on the performance measures under evaluation.
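A minimal sketch of non-dominance based ranking is given below, assuming that every performance measure is to be minimized. The configuration names and objective values are hypothetical, and the quadratic-time front extraction is chosen for readability rather than efficiency.

```python
def dominates(a, b):
    """True if a is no worse than b on every criterion and strictly better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def non_dominated_fronts(points):
    """Split {name: objective tuple} into successive non-dominated fronts (front 1 first)."""
    remaining = dict(points)
    fronts = []
    while remaining:
        front = sorted(
            name for name, p in remaining.items()
            if not any(dominates(q, p) for other, q in remaining.items() if other != name)
        )
        fronts.append(front)
        for name in front:
            del remaining[name]
    return fronts

# Hypothetical (error, number of evaluations) pairs; both are minimized.
points = {
    "A": (0.10, 5000),
    "B": (0.05, 8000),
    "C": (0.20, 9000),
    "D": (0.10, 6000),
}
fronts = non_dominated_fronts(points)
```

Here A and B form the first front because neither is better than the other on both measures; D is dominated by A, and C is dominated by both A and D.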

An alternative will be to adapt an approach such as F-Race to drill down to the best regions of the control parameter space, using the non-dominance relation with respect to all of the performance criteria.

In addition, it is also required to visualize how performance of an algorithm changes for different values of the control parameters, over different axes, one axis per performance measure.

This is a research project, with a significant amount of empirical analysis and coding. The project has very good potential to result in a publication. The control parameter tuning approach needs to interface with CIlib, an open-source library of Computational Intelligence algorithms developed in Scala. However, it should be general enough to also interface with any other algorithm or algorithm framework.

Prerequisite: Statistics, and students are advised to do Artificial Intelligence 791.

### AE2: Particle Swarm Optimization: Which Variations Are State-of-the-art?

Many particle swarm optimization (PSO) variants have been developed, each with varying degrees of performance on different problem landscapes. With so many variants, the question is raised as to which PSO algorithm(s) can be considered as the state-of-the-art. This question becomes increasingly important when a new variant is introduced, or when other optimization algorithms are compared with PSO. In such cases, it is necessary for these new variants and algorithms to be compared with the state-of-the-art. Various opinions are raised in the literature as to which PSO algorithms are best. However, these opinions are mostly based on a very limited empirical analysis on a small set of benchmark functions, and in comparison with very few other PSOs.

This study will initiate a process to rank PSO algorithms with reference to the performance of a base (control) algorithm, on a benchmark suite of problems that provides good coverage of all fitness landscape characteristics. As a first step, a statistically sound ranking approach will be proposed, and a set of performance measures will be defined. Note that multiple performance measures will be considered. This will require ranking on individual measures, and a multi-objective ranking over all of the measures. As a second step, the benchmark problems will be selected: the focus will be on single-objective, static, boundary constrained continuous-valued optimization problems, and the outcomes of a recent fitness landscape analysis study of such benchmark problems will be used to identify problems ranging in levels of difficulty. The third step will then be to start with the most frequently referenced PSO algorithms, to empirically analyze their performance, and to provide a ranking of these algorithms. As an outcome, a results repository will be developed and populated with performance results.

This is a research project, with some software engineering requirements in the development of the ranking system and the results repository. Code will be developed as part of a Python computational intelligence library.

Prerequisite: Statistics, and students are advised to do Artificial Intelligence 791.

### AE3: Python Computational Intelligence library

There exists a variety of computational intelligence libraries. Some are extremely popular, such as TensorFlow and PyTorch. However, the ecosystem for nature-inspired optimization algorithms is fragmented and underdeveloped. For example, a library may support only multi-objective optimization but not constrained optimization, while another library supports the opposite. Some libraries only support a specific computational intelligence paradigm, such as genetic algorithms or particle swarm optimization. Some libraries are not easily extensible, which makes integration with other code difficult. Libraries may also lack documentation or rigorous testing, or may not have received updates in months or years. For these reasons, using these libraries for research can be difficult.

The objective of this Honours project is to develop a new computational intelligence library guided by high-level but practical software design principles. The library will be developed in Python with a focus on nature-inspired optimization algorithms and the tools related to those algorithms. The library will be very generic to allow for single-solution problems, multi-solution problems, single objectives, multi- and many-objectives, static objective functions, dynamically changing search landscapes, boundary constrained and constrained problems, single population and multi-population algorithms, and hyper-heuristic algorithms. Ideas and code for the library can be adapted from the existing Scala-based CIlib project.

A very important consideration in the design of the library will be interaction with future planned empirical analysis libraries, results repositories, and fitness landscape analysis libraries.

This work will also include an analysis of existing libraries to identify the shortcomings that the proposed library aims to address. The success of this project will be determined by the usability of the proposed library and feedback from fellow students and supervisors.

Prerequisite: Though this is a software engineering project, students are advised to do Artificial Intelligence 791.

### AE4: Large-Scale Multi-Modal Particle Optimization

Multi-modal optimization (MMO) refers to approaches that find multiple solutions within multi-modal search landscapes. Traditionally, algorithms with the ability to find multiple solutions are referred to as niching algorithms. Thus far, MMO algorithms have been developed and evaluated mostly on small scale optimization problems. A number of MMO particle swarm optimization (PSO) algorithms have been developed. This study will analyze the scalability of these algorithms to large scale optimization problems (LSOPs), and will develop approaches to improve their scalability. These approaches will be informed by recent studies by our research group into the reasons why PSO does not scale well, and algorithms will be developed to incorporate recent decomposition-based approaches. These decomposition-based approaches follow a divide-and-conquer strategy towards solving LSOPs, by dividing a large-dimensional optimization problem into several smaller-dimensional problems. These smaller-dimensional problems are then solved, and a fusion approach is implemented to combine the optimal solutions from these smaller-dimensional problems. The scalability study will consider scalability with reference to both problem dimensionality and the number of optima in the search landscape.
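The divide-and-conquer idea can be illustrated with the following sketch. It rests on several assumptions that are not part of the project description: dimensions are split into consecutive fixed-size groups, a shared context vector holds the values of all dimensions that are not currently being optimized, and a simple accept-if-better random perturbation search stands in for the PSO sub-swarm. Fusion is done by writing each improved sub-solution back into the context vector.

```python
import random

def sphere(x):
    """Separable test function with its minimum of 0 at the origin."""
    return sum(v * v for v in x)

def split_dimensions(n_dims, group_size):
    """Partition the dimension indices into consecutive groups."""
    return [list(range(i, min(i + group_size, n_dims))) for i in range(0, n_dims, group_size)]

def optimize_group(f, context, group, rng, iters=200, step=0.5):
    """Improve only the dimensions in `group`; a stand-in for a PSO sub-swarm."""
    best = list(context)
    best_val = f(best)
    for _ in range(iters):
        cand = list(best)
        for d in group:
            cand[d] += rng.gauss(0.0, step)
        val = f(cand)
        if val < best_val:  # keep a candidate only if it improves the full solution
            best, best_val = cand, val
    return best

def decompose_and_solve(f, n_dims, group_size, cycles=5, seed=1):
    """Return the objective value before and after the decomposed search."""
    rng = random.Random(seed)
    context = [rng.uniform(-5.0, 5.0) for _ in range(n_dims)]
    start = f(context)
    for _ in range(cycles):
        for group in split_dimensions(n_dims, group_size):
            # Fusion step: the improved sub-solution is written back into the context vector.
            context = optimize_group(f, context, group, rng)
    return start, f(context)

start, end = decompose_and_solve(sphere, n_dims=100, group_size=10)
```

On a separable function such as the sphere, any grouping works; for non-separable problems, how the dimensions are grouped becomes the central design question.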

This is a research project, with development to be done in a Python computational intelligence library.

Prerequisite: Students are advised to do Artificial Intelligence 791.

### AE5: Design and Development of an Automated System for Medical Diagnosis from X-Rays

Diagnosis of dental anomalies and diseases is experiencing severe strain across Africa, due to a lack of sufficient infrastructure and dental radiologist expertise. Public hospitals across Africa are understaffed. People therefore do not have access to timely diagnosis of their X-rays, which can lead to severe discomfort, pain, and even fatalities. Recently, we have developed computer vision and deep learning approaches to automate the diagnosis of dental diseases and anomalies from panoramic X-rays. This work has led to the design of a more generic system to automate disease diagnoses from X-rays in general. The next stage of the research includes developing the software to maintain and deploy the models for various disease diagnoses from different types of X-rays. The purpose of this project is to refine the current architecture design to be more generic, and then to implement and deploy this system in a cloud-based environment. This project involves the design and development of a system that allows a medical professional to automatically train and deploy computer vision models for diagnostics from X-rays into a software application.

The system consists of two main components: a training component, where different convolutional neural networks are trained to diagnose different medical conditions, and a prediction component, where a new X-ray is provided as input and routed via the appropriate classifiers to make a diagnosis. For both components, images are routed through a pre-processing component, and then on towards a convolutional neural network used to predict if the X-ray image is workable or not. This network determines if the image is of sufficient quality for automated analysis. For the training component, an annotation tool needs to be developed so that features of interest can be labelled and used for training. The system must be deployed on a scalable cloud-based system.

Already developed and available is the architecture design of the system, with a single X-ray dataset type in mind, for example panoramic X-rays. Convolutional neural network code is available for a few dental anomalies and diseases, as is a basic ReactJS frontend. Python code to facilitate model deployment has also been developed. This project will require implementation of the full architecture for X-rays in general, and not for specific types of X-rays, including databases for storage of images, Docker images for a scalable architecture, and the image annotation tool. The training pipeline has to be implemented, and should include transfer learning and one-shot learning approaches. A user interface has to be developed, and a reporting component needs to be implemented.

The project will largely be based on AWS services. Different services are glued together using Python code that runs either on an EC2 instance or in a Lambda function. Research will be needed into how the AWS services can be scaled to accommodate many different applications for different users.

The technologies that you will work with include Python for the backend, JavaScript for the frontend, HTML, CSS, and YAML for configuration files. In addition, the following AWS services will be used for cloud deployment: SQS, S3, Lambda, ECS, ECR, DynamoDB, Amplify, and EC2 (with autoscaling). ReactJS and Docker/AMI (for containerization of images) will also be used.

Requirements: This is a software engineering project, with some research components, specifically in the development and testing of the convolutional neural networks, transfer learning and one-shot learning. You should have been exposed to a number of the above technologies.

### AE6: Competitive Coevolutionary Particle Swarm Optimization in Dynamic Environments

Competitive coevolution (CCE) models the arms race that is observed in competing species in nature, e.g. predator-prey relationships. Competitive coevolutionary particle swarm optimization (CCEPSO) algorithms have been used successfully to solve a range of optimization problems where the environment (search landscape) is static, including the training of neural networks (NNs). Since coevolution in nature is a dynamic process, it is intuitively believed that CCE models should be efficiently applicable to dynamic optimization problems (DOPs). This study will investigate to what extent CCEPSO algorithms can track optima in dynamic environments of various types. We have recently identified 27 different classes of DOPs, and this study will investigate performance on these classes of optimization problems. As a first step, a review of CCE will be done, focusing on its application in optimization algorithms and to solve optimization problems. Then, a CCEPSO algorithm will be developed to first solve static optimization problems. As part of this study, different relative fitness measures and competition pool selection strategies will be investigated and analyzed. The study will then proceed to apply the CCEPSO to the different classes of DOPs. The performance of the CCEPSO will be analyzed in depth. If the CCEPSO struggles, or prematurely converges as was observed for NN training, reasons for this will be investigated and solutions developed and further analyzed. The performance of the CCEPSO will then be compared with the performance of other state-of-the-art PSO algorithms for DOPs.
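To make the notions of a relative fitness measure and a competition pool concrete, the sketch below scores each individual by the number of sampled opponents it defeats. Everything in it is an illustrative assumption: the function names, uniform random pool sampling, the win-count measure, and the toy contest in which the larger number wins. The project itself will compare several such measures and selection strategies.

```python
import random

def sample_pool(opponents, pool_size, rng):
    """Select the competition pool by uniform random sampling without replacement."""
    return rng.sample(opponents, pool_size)

def simple_relative_fitness(individual, pool, beats):
    """Relative fitness = number of pool members that the individual defeats."""
    return sum(1 for opponent in pool if beats(individual, opponent))

def beats(a, b):
    """Toy contest, assumed for illustration only: the larger number wins."""
    return a > b

rng = random.Random(42)
population = [1.0, 2.0, 3.0, 4.0]
opponents = [0.5, 2.5, 3.5]
pool = sample_pool(opponents, pool_size=3, rng=rng)
scores = [simple_relative_fitness(ind, pool, beats) for ind in population]
```

Because the fitness depends on who is in the pool, the same individual can score differently from one iteration to the next, which is what makes the search landscape of each population change over time.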

This is a research project.

Prerequisite: Students are advised to do Artificial Intelligence 791.

### AE7: Incremental Feature Learning

Incremental feature learning refers to problems where the number of features increases over time. More generally, it may also refer to incrementally adding features, from the current set of available features, to the predictive model during the training process. As new features are added to the predictive model, which in this case is a neural network, focus is given to training the newly added weights, while still refining the older weights. Incremental feature learning is ultimately a dynamic optimization problem, in the sense that the search landscape dimensionality changes over time with the addition of new input units, hidden units, and weights. A training algorithm should therefore be able to cope with changing search landscapes.

This project will develop a dynamic particle swarm optimization (dPSO) training algorithm for neural network incremental feature learning. A feature ranking approach will be used to rank features based on feature importance, and features will be added based on feature rank. Aspects to consider are different feature ranking approaches, when to add a new feature, when to grow the neural network architecture, how to prevent overfitting, when to stop adding incremental features, and lastly how to expand the dPSO particle position and velocity vectors to cope with growing neural network architectures.
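One possible way to expand the particle vectors when the architecture grows is sketched below. The choices made here are assumptions rather than the method to be developed: particles are stored as flat weight lists, new weights are initialized with small random values, new velocity components start at zero, and each personal best is padded with the same new weights as the current position.

```python
import random

def expand_particle(position, velocity, n_new, rng, init_range=0.1):
    """Append n_new weights to a particle; existing weights and velocities are kept."""
    new_weights = [rng.uniform(-init_range, init_range) for _ in range(n_new)]
    return position + new_weights, velocity + [0.0] * n_new

def expand_swarm(swarm, n_new, seed=0):
    """Expand every particle, including its personal best, by the same number of dimensions."""
    rng = random.Random(seed)
    expanded = []
    for particle in swarm:
        pos, vel = expand_particle(particle["position"], particle["velocity"], n_new, rng)
        # Assumption: pad the personal best with the same new weights so that it keeps
        # the same dimensionality; its fitness must be re-evaluated afterwards.
        pbest = particle["pbest"] + pos[len(pos) - n_new:]
        expanded.append({"position": pos, "velocity": vel, "pbest": pbest})
    return expanded

# Hypothetical swarm of two 3-dimensional particles; adding one input feature to a
# network with 4 hidden units adds 4 input-to-hidden weights.
swarm = [
    {"position": [0.1, 0.2, 0.3], "velocity": [0.01, 0.02, 0.03], "pbest": [0.1, 0.2, 0.3]},
    {"position": [0.4, 0.5, 0.6], "velocity": [0.0, 0.0, 0.0], "pbest": [0.4, 0.5, 0.7]},
]
grown = expand_swarm(swarm, n_new=4)
```

In a real implementation the new weights also have to be inserted at the positions in the vector that match the network's weight layout, which simple appending does not guarantee.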

This is a research project.

Prerequisite: Students are advised to do Artificial Intelligence 791 and Machine Learning 741.

### AE8: Research Manager

When doing research on a specific topic, for example particle swarm optimization (PSO), one of the first tasks that the researcher has to do is to find as many research papers on PSO as possible, to organize these in some way, and then to read and possibly annotate the papers with keywords/key phrases or to summarize the papers. This project will develop a web-based tool to help the researcher in optimizing these processes. The research manager must include the following functionalities:

* Provided a topic, crawl the web to find as many research papers and theses on that topic as possible.
* Papers have to be tagged uniquely, and it should be possible to view papers in different orders/categories, for example,
    * based on year of publication
    * alphabetically based on first author
    * based on country of authors, or author institutions
    * based on sub-topics, e.g. multi-modal optimization, constrained optimization, multi-objective optimization, etc.
    * based on article type, i.e. journal article, conference article, thesis
    * based on whether the paper is theoretical or application-based
    * based on application
* Bibliometric information should be extracted, for example,
    * number of publications on the topic or sub-topic per year
    * list of authors and their details for a specific topic or sub-topic
    * number of publications per author, country, institution on a topic or sub-topic
* Automate the process to extract sub-topic maps from a collection of papers, where the collection can be specified on different granularity levels, for example,
    * over all documents
    * over all documents per year
    * over all documents for a specific author
* Automated keyword extraction from the collected papers.
* Facility for the reader to tag each paper with user-defined keywords or phrases
* A search facility to find all papers in a specified collection of papers that match a given set of keywords or key phrases
* A process to auto-generate BibTeX files for the papers, and to provide ordered bibliographies for papers in different formats such as HTML or Markdown.
* Allowance for different main topics, e.g. particle swarm optimization as one topic and genetic algorithms as another topic. Note that some papers may address more than one main topic.
* An approach to cluster a collection of papers, and then to analyze these clusters.
* A timeline of papers published on different sub-topics within a given main topic.
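As an illustration of the BibTeX requirement, the sketch below renders a paper record as a BibTeX entry. The record layout (a dict with `type`, `authors`, `title`, `venue`, and `year`), the citation key scheme, and the mapping of every thesis to `phdthesis` are assumptions made for the example, and the sample record is a placeholder rather than a real publication.

```python
def bibtex_entry(paper):
    """Render one paper record (a dict) as a BibTeX entry string."""
    # Assumed mapping from the article types listed above to BibTeX entry types.
    entry_type = {"journal": "article", "conference": "inproceedings", "thesis": "phdthesis"}[paper["type"]]
    venue_field = {"article": "journal", "inproceedings": "booktitle", "phdthesis": "school"}[entry_type]
    # Citation key: surname of the first author followed by the year (hypothetical scheme).
    first_surname = paper["authors"][0].split(",")[0].strip().lower().replace(" ", "")
    key = f"{first_surname}{paper['year']}"
    fields = [
        ("author", " and ".join(paper["authors"])),
        ("title", paper["title"]),
        (venue_field, paper["venue"]),
        ("year", str(paper["year"])),
    ]
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields)
    return f"@{entry_type}{{{key},\n{body}\n}}"

# Placeholder record; not a real publication.
paper = {
    "type": "journal",
    "authors": ["Placeholder, A.", "Sample, B."],
    "title": "An Example Title",
    "venue": "Example Journal",
    "year": 2020,
}
entry = bibtex_entry(paper)
```

An ordered bibliography then reduces to sorting the records, for instance by year or by first author, before rendering them.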

The developed research manager can therefore be used to support the researcher, to provide bibliometric information, and to obtain an overview of the broad research topic.

This is a software engineering project. The use case will be on particle swarm optimization, and therefore it will be of benefit if you do Artificial Intelligence 791.

### AE9: Deep Learning Approaches for Anomaly Detection in Dental Panoramic X-Rays

The current approach followed by orthodontic radiologists to identify teeth abnormalities is through manual inspection of teeth X-rays. This is a time-consuming process, and often leads to incorrect diagnoses, or even to subtle abnormalities being missed. A recent project has developed a prototype to predict teeth anomalies using convolutional neural networks (CNNs). The approach followed was to develop a prediction model for each separate anomaly. This project will develop and evaluate alternative approaches to improve the performance of the current prototype. The following approaches will be implemented and evaluated:

* Transfer learning from a convolutional neural network trained to detect a specific anomaly, to develop additional classifiers for other anomalies
* Development of multi-class convolutional neural networks to predict multiple anomalies
* Stacked convolutional neural networks for multi-anomaly prediction

The nature of this application is that infrequently occurring anomalies will not have many X-ray images to use for training. In addition to the above approaches, few-shot learning will be investigated for diagnosis of rare diseases.

This is a research project.

### AE10: Optimization Algorithm Behavior Comparison

In the nature-inspired optimization algorithm research field, there is a proliferation of optimization algorithms inspired by some natural phenomenon. Examples of such algorithms include particle swarm optimization, evolutionary algorithms, artificial bee colony optimization, ant colony optimization, firefly optimization, fish school optimization, salp swarm optimization, krill herd optimization, grey wolf optimization, and Harris hawk optimization, to name a few. A recent survey has identified just over 150 swarm-based optimization algorithms. When the mathematical and algorithmic logic of these algorithms is analyzed, it becomes clear that many of these "new" nature-inspired algorithms are basically the same, with the main difference lying in the nature-inspired analogy. The research field of nature-inspired optimization can very much be likened to a box full of smarties, where different smarties have different colours, but in the end each is simply a smartie, with the only difference in the colour.

This project will investigate and develop an approach to evaluate the search behavior of a collection of these nature-inspired optimization algorithms. All these algorithms are population-based, where a number of search positions are randomly initialized throughout the search space. Over time, these positions are adjusted, converging towards one point in the search space, hopefully a good optimum. When looking at the search trajectories of each one of these solutions, collectively the search trajectories can be seen as a graph, with the root node being the point (or small region) to which all the solutions converge. The idea is then to construct search trajectory graphs for the different algorithms, and then to determine the similarity among these trajectory graphs. This approach will only work if all of the algorithms use the same random seed to initialize the population of individuals. In addition to the comparison of the search behavior through analysis of the trajectory graphs, a formal comparison of the position update equations will also be done to find the common components of these algorithms, and the aspects where they differ. The objective here is to see how the different optimization algorithms considered in this study can be factorized to the same general form.
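The shared-seed requirement and the recording of search trajectories can be sketched as follows. The function names, the uniform initialization within fixed bounds, and the representation of a trajectory as a list of visited positions per individual are assumptions for illustration; how the trajectories are turned into a graph and compared is the subject of the project.

```python
import random

def init_population(n_individuals, n_dims, lower, upper, seed):
    """Initialize positions uniformly in [lower, upper]; the same seed gives the same positions."""
    rng = random.Random(seed)
    return [[rng.uniform(lower, upper) for _ in range(n_dims)] for _ in range(n_individuals)]

def record_step(trajectories, positions):
    """Append the current position of every individual to its trajectory."""
    for path, pos in zip(trajectories, positions):
        path.append(tuple(pos))

# Two algorithms under comparison start from identical populations.
start_a = init_population(5, 2, -10.0, 10.0, seed=123)
start_b = init_population(5, 2, -10.0, 10.0, seed=123)

# One trajectory per individual; record_step would be called after every iteration.
trajectories = [[] for _ in start_a]
record_step(trajectories, start_a)
```

Using a dedicated `random.Random` instance per run, rather than the global generator, keeps the initial population reproducible even when the algorithm itself draws further random numbers.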

This is a research project with strong publication potential.

Prerequisite: Students are advised to do Artificial Intelligence 791.