Author: Agustín Bignú
Physicist – Machine Learning Engineer.
This is the first paper of a more ambitious project by Stening®, whose goal is to revolutionize the market and production in the field of medicine and custom prostheses. In this first work we train a neural network to distinguish six different kinds of devices. The results are very encouraging and demonstrate the potential of this type of technology for the future of medicine.
Key words: Machine Learning. Deep Learning. Silicone stent. Tracheal prostheses. Convolutional Neural Network.
At Stening® we have always been interested in research and innovation, which is why we renew ourselves continuously. Artificial intelligence is a computing discipline that is growing at a fast pace, as is its impact on medicine.
The main objectives of the study are the following:
- enhance the company’s innovative and research level
- make a difference in the sector of the market in which Stening® competes, being pioneers in this type of research
- publicize artificial intelligence and machine learning
In response to these objectives, this paper briefly introduces artificial intelligence and machine learning, as well as the types of learning that underlie the discipline. We continue with deep learning and neural networks in order to present the model used in this study: the Convolutional Neural Network. We then explain how the dataset was made and why it was done that way, and discuss the results with some examples and graphs. Finally, we give our conclusions and the future directions of the investigation.
The goal of this section is to introduce the type of technology used. We present what Artificial Intelligence (AI) is, what neural networks are, and the type of network used for this study, the Convolutional Neural Network (CNN).
AI is a branch of computing that focuses on building computer programs that perform operations and tasks associated with human intelligence, such as self-learning.
In particular, we will focus on machine learning. This discipline, within AI, is responsible for creating this type of program. Within machine learning we find three well-differentiated branches of learning:
- supervised learning
- unsupervised learning
- reinforcement learning
The first consists of algorithms that learn from data for which the correct answer is known, and then use what they learned to make predictions. The second differs from the first in that the answers are not known, so the problem is approached differently: these algorithms focus on analyzing the datasets themselves. Unlike the other two, the third type does not rely on large amounts of data but learns by trial and error in an environment. This type of learning is used in autonomous driving, robotics and AI applied to games.
In this study, supervised learning was used. A dataset was collected, in this case images of medical devices of six different classes. The data were then fed to a neural network that learned from the images and was able to classify new devices of these classes. In the next section we explain what a neural network is.
Artificial neural networks are inspired by how the brain works. Human beings are able to see an image and describe what appears in it. We are also able to listen to a song and know what genre it belongs to. We know all this because we have heard it on another occasion and we learned about it. We are applying supervised learning.
The main purpose of AI is to recreate this type of learning behavior, as stated above. To do this, algorithms such as neural networks are created. This type of algorithm connects nodes called neurons. They are so named because they try to emulate the neurons of the brain, which act as information processing units. This information enters through the senses: sight, touch, etc.
Figure 1: neural network [2]
Figure 1 shows a fairly simple neural network structure. The circles are the neurons. The network consists of two layers, an input layer (bottom) and an output layer (top). In the image, the “x” values are the information we feed into the network and the “y” values are the results we obtain from it. The “w” values are numbers called weights, which encode what the network has learned and are updated as it trains. Training proceeds in passes over the entire dataset, called epochs, and stops after a set number of them.
It should be noted that a neural network can have more layers of neurons, not just two. Networks that have internal layers, called hidden layers, are the most widely used. In this study we use a type of neural network well suited to image recognition tasks, the Convolutional Neural Network. It is explained in the next section.
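As a minimal sketch of the two-layer structure in Figure 1, the following pure-Python example computes the outputs y from the inputs x through the weights w and applies one update to the weights. The linear activation, the squared-error measure and the learning rate are illustrative assumptions, not details taken from this study.

```python
def forward(x, w):
    """Compute each output y[j] as the weighted sum of the inputs x[i]."""
    return [sum(w_ji * x_i for w_ji, x_i in zip(row, x)) for row in w]


def update_weights(x, w, target, learning_rate=0.1):
    """One gradient-descent step on the squared error 0.5 * sum((y - target)^2)."""
    y = forward(x, w)
    return [
        [w_ji - learning_rate * (y_j - t_j) * x_i for w_ji, x_i in zip(row, x)]
        for row, y_j, t_j in zip(w, y, target)
    ]


x = [1.0, 2.0]                 # inputs fed to the network
w = [[0.5, -0.5], [0.0, 1.0]]  # one row of weights per output neuron
target = [1.0, 0.0]            # known correct answers (supervised learning)

before = forward(x, w)            # outputs with the initial weights
w = update_weights(x, w, target)  # weights move toward the correct answers
after = forward(x, w)             # outputs are now closer to the target
```

Repeating the update over many examples is what "training" means here: each step nudges the weights so that the outputs move closer to the known answers.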
Convolutional Neural Network [3]
A CNN is a type of neural network based on the structure and functioning of the first vision layers of the brain. This type of network was introduced in 1998 (LeCun et al.) [4]. A CNN consists of two main parts: one that extracts information from the image and another that classifies it.
Figure 2: components of a CNN [5]
In the first part of the network, feature extraction, we extract the information from the image using a method called convolution. We describe this first part only briefly. Here the main qualities of the image are extracted, in our case: device shape, position, length, etc.
Broadly, the convolution process reduces the dimensions of the image while keeping the features most relevant for classifying it correctly. This first phase contains the largest number of layers in the network.
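To make the idea concrete, here is a small pure-Python sketch of a convolution followed by max pooling on a toy image. The 5x5 image, the 2x2 edge kernel and the pooling size are hypothetical choices for illustration; a real CNN learns its kernels during training and applies many of them per layer.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation, as in CNNs)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[r + i][c + j] * kernel[i][j] for i in range(kh) for j in range(kw))
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


def max_pool(feature_map, size=2):
    """Keep the largest value in each non-overlapping size x size window."""
    return [
        [
            max(feature_map[r + i][c + j] for i in range(size) for j in range(size))
            for c in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for r in range(0, len(feature_map) - size + 1, size)
    ]


# Toy 5x5 image: dark columns (0) on the left, bright columns (1) on the right
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical kernel that responds to a dark-to-bright vertical edge
kernel = [[-1, 1], [-1, 1]]

features = convolve2d(image, kernel)  # 4x4 map, non-zero only where the edge is
pooled = max_pool(features)           # 2x2 summary of the feature map
```

The image shrinks from 5x5 to 4x4 to 2x2, yet the pooled result still records that an edge is present on the left side of the image.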
Once the first phase is over, we move on to the second: classification. This last phase contains only two layers of neurons: the first receives the information that arrives from the first phase and passes it to the output layer that tells us what class the image belongs to.
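The classification phase can be sketched as a layer of weighted sums followed by a softmax that turns the scores into class probabilities. The feature values, the weights and the use of three classes below are made-up numbers for illustration only, and the softmax output is an assumption, since the text does not name the output activation.

```python
import math


def dense(inputs, weights, biases):
    """One layer of neurons: a weighted sum of the inputs plus a bias per neuron."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # subtract the max for stability
    total = sum(exps)
    return [e / total for e in exps]


features = [2.0, 0.0, 2.0, 0.0]  # hypothetical flattened output of phase one
weights = [
    [1.0, 0.0, 1.0, 0.0],  # neuron for class 0
    [0.0, 1.0, 0.0, 1.0],  # neuron for class 1
    [0.5, 0.5, 0.5, 0.5],  # neuron for class 2
]
biases = [0.0, 0.0, 0.0]

scores = dense(features, weights, biases)
probabilities = softmax(scores)
predicted_class = probabilities.index(max(probabilities))
```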
In this section we will analyze the development of the dataset, describe the programs with which we obtained the results and study the obtained results.
We created our own dataset. Because a large amount of data is needed to achieve high classification precision, we had to proceed in an innovative way: obtaining a high classification percentage (> 90%) requires a fair number of images per class (> 500). As we did not have that many photos, we modeled the tubes with a 3D design program called Blender (version 2.9), available for free from its website (Blender: www.blender.org). All of the models are based on the stents manufactured by Stening®.
The results of the designs were as follows:
Figure 3: 3D stents generated with Blender.
In Figure 3: A) HE, B) SET, C) SY13, D) SY16, E) TF12 and F) TM.
For each class, 1400 images were generated, of which 1000 were used for training and the other 400 for predictions (testing). To add variety to the images, the devices were rotated into different positions, encouraging the network to find different characteristics to rely on.
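The split described above can be sketched as follows. The file naming scheme and the random shuffle are assumptions made for illustration; the text does not say how the 1000 training images were chosen from the 1400.

```python
import random

CLASS_NAMES = ["HE", "SET", "SY13", "SY16", "TF12", "TM"]
IMAGES_PER_CLASS = 1400
TRAIN_PER_CLASS = 1000


def split_dataset(seed=0):
    """Return {class: (train_files, test_files)} with a 1000/400 split per class."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    splits = {}
    for name in CLASS_NAMES:
        # Hypothetical file names, e.g. "HE/HE_0001.png"
        files = [f"{name}/{name}_{i:04d}.png" for i in range(1, IMAGES_PER_CLASS + 1)]
        rng.shuffle(files)
        splits[name] = (files[:TRAIN_PER_CLASS], files[TRAIN_PER_CLASS:])
    return splits


splits = split_dataset()
train_total = sum(len(train) for train, _ in splits.values())  # 6 x 1000 images
test_total = sum(len(test) for _, test in splits.values())     # 6 x 400 images
```

Keeping the two sets disjoint matters: the testing images must not appear in training, otherwise the measured performance would be overstated.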
Programs and results
We wrote two programs in Python (version 3.5). One contains the neural network and performs the training; the other receives an image and makes the prediction (on images never seen by the neural network before). We used the Keras library for machine learning and OpenCV to visualize the images.
The first program contains the neural network. It was built on a CNN called VGG16 [6], a 16-layer neural network capable of recognizing different kinds of images. The last layers of its classification part were removed and replaced with our own layers so that it could recognize the images we wanted.
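A setup of this kind might look like the sketch below, which uses the Keras API shipped with TensorFlow. It is not the program used in the study: the ImageNet weights, the frozen base, the 224x224 input size, the 256-unit hidden layer, the optimizer and the loss are all assumptions chosen for illustration.

```python
CLASS_NAMES = ["HE", "SET", "SY13", "SY16", "TF12", "TM"]


def build_model(num_classes=len(CLASS_NAMES), weights="imagenet"):
    """VGG16 convolutional base with a new two-layer classification head."""
    # Imported here so this file can be loaded without TensorFlow installed.
    from tensorflow.keras import layers, models
    from tensorflow.keras.applications import VGG16

    # include_top=False drops VGG16's original classification layers.
    base = VGG16(weights=weights, include_top=False, input_shape=(224, 224, 3))
    base.trainable = False  # assumption: keep the pretrained features fixed

    x = layers.Flatten()(base.output)
    x = layers.Dense(256, activation="relu")(x)  # hidden size is a guess
    outputs = layers.Dense(num_classes, activation="softmax")(x)

    model = models.Model(inputs=base.input, outputs=outputs)
    model.compile(
        optimizer="adam",
        loss="categorical_crossentropy",
        metrics=["accuracy"],
    )
    return model
```

Calling `build_model()` downloads the pretrained weights on first use; passing `weights=None` builds the same architecture with random weights instead.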
The results of the training and the predictions were the following:
Figure 4: results from training and testing and the loss
Figure 4 shows two graphs. The first shows the progression of the prediction accuracy during both training and testing. A total of 25 epochs were run. At the last epoch, the accuracy in both training and testing exceeded 90%. The second graph shows the loss for training and testing, that is, the error between the prediction of the neural network and the correct result. The error is very high in the first epoch and decreases as the epochs pass until it almost reaches zero.
The second program makes a prediction for a single image, indicating what class it belongs to. We introduced 3D designs that the neural network had not seen, neither in training nor in testing. The results were the following:
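The two quantities plotted per epoch can be computed as in this short sketch. Cross-entropy is assumed as the error measure because it is a common choice for classification; the text does not state which loss was used, and the network outputs below are made-up values.

```python
import math


def accuracy(predictions, labels):
    """Fraction of images whose highest-probability class matches the label."""
    correct = sum(
        1
        for probs, label in zip(predictions, labels)
        if probs.index(max(probs)) == label
    )
    return correct / len(labels)


def cross_entropy(predictions, labels):
    """Mean negative log-probability assigned to the correct class."""
    total = sum(math.log(probs[label]) for probs, label in zip(predictions, labels))
    return -total / len(labels)


# Hypothetical network outputs for three images and three classes
predictions = [[0.8, 0.1, 0.1], [0.2, 0.7, 0.1], [0.5, 0.3, 0.2]]
labels = [0, 1, 2]  # index of the correct class for each image

acc = accuracy(predictions, labels)        # two of the three images are correct
loss = cross_entropy(predictions, labels)  # grows when the correct class gets low probability
```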
Figure 5: examples of predictions
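The final step of such a prediction, turning the network output into a class name, can be sketched as below. The ordering of the class names (taken from Figure 3) and the probability values are assumptions for illustration.

```python
CLASS_NAMES = ["HE", "SET", "SY13", "SY16", "TF12", "TM"]


def decode_prediction(probabilities):
    """Return the most likely class name and the probability assigned to it."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return CLASS_NAMES[best], probabilities[best]


# Hypothetical network output for one unseen image
label, confidence = decode_prediction([0.01, 0.02, 0.90, 0.04, 0.02, 0.01])
```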
To conclude, the results obtained were very good, reaching an accuracy of more than 90%. In the future, improvements could be made to the dataset, namely training with real photos of the devices and evaluating the results.
This type of technology is entering a boom and Stening® does not want to be left behind, which is why it is strongly committed to research and innovation in this area. Studies oriented to AI and medicine will continue to be carried out.
A special acknowledgement to Nicolás Bignú for his collaboration in the development of the dataset using the Blender software.
[1] E. Alpaydin. Introduction to Machine Learning, Second Edition. The MIT Press, 2010.
[2] E. Alpaydin. Introduction to Machine Learning, Second Edition. The MIT Press, 2010. Fig. 11.2.
[3] Convolutional Neural Networks, https://medium.com/x8-the-ai-community/cnn-9c5e63703c3f
[4] Y. LeCun, L. Bottou, Y. Bengio and P. Haffner. Gradient-Based Learning Applied to Document Recognition. Proc. of the IEEE, 1998.
[5] ResearchGate, https://www.researchgate.net/figure/An-Example-CNN-architecture-for-a-handwritten-digit-recognition-task_fig1_220785200 Fig. 1.
[6] K. Simonyan and A. Zisserman. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv:1409.1556.