{"id":229445,"date":"2021-04-17T19:00:28","date_gmt":"2021-04-17T16:00:28","guid":{"rendered":"https:\/\/en.buradabiliyorum.com\/new-to-computer-vision-and-medical-imaging-start-with-these-10-projects\/"},"modified":"2021-04-17T19:00:28","modified_gmt":"2021-04-17T16:00:28","slug":"new-to-computer-vision-and-medical-imaging-start-with-these-10-projects","status":"publish","type":"post","link":"https:\/\/buradabiliyorum.com\/en\/new-to-computer-vision-and-medical-imaging-start-with-these-10-projects\/","title":{"rendered":"#New to computer vision and medical imaging? Start with these 10 projects"},"content":{"rendered":"<div id=\"ez-toc-container\" class=\"ez-toc-v2_0_84 counter-hierarchy ez-toc-counter ez-toc-custom ez-toc-container-direction\">\n<p class=\"ez-toc-title\" style=\"cursor:inherit\">Table of Contents<\/p>\n<label for=\"ez-toc-cssicon-toggle-item-6a295139b9880\" class=\"ez-toc-cssicon-toggle-label\"><span class=\"\"><span class=\"eztoc-hide\" style=\"display:none;\">Toggle<\/span><span class=\"ez-toc-icon-toggle-span\"><svg style=\"fill: #dd3333;color:#dd3333\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" class=\"list-377408\" width=\"20px\" height=\"20px\" viewBox=\"0 0 24 24\" fill=\"none\"><path d=\"M6 6H4v2h2V6zm14 0H8v2h12V6zM4 11h2v2H4v-2zm16 0H8v2h12v-2zM4 16h2v2H4v-2zm16 0H8v2h12v-2z\" fill=\"currentColor\"><\/path><\/svg><svg style=\"fill: #dd3333;color:#dd3333\" class=\"arrow-unsorted-368013\" xmlns=\"http:\/\/www.w3.org\/2000\/svg\" width=\"10px\" height=\"10px\" viewBox=\"0 0 24 24\" version=\"1.2\" baseProfile=\"tiny\"><path d=\"M18.2 9.3l-6.2-6.3-6.2 6.3c-.2.2-.3.4-.3.7s.1.5.3.7c.2.2.4.3.7.3h11c.3 0 .5-.1.7-.3.2-.2.3-.5.3-.7s-.1-.5-.3-.7zM5.8 14.7l6.2 6.3 6.2-6.3c.2-.2.3-.5.3-.7s-.1-.5-.3-.7c-.2-.2-.4-.3-.7-.3h-11c-.3 0-.5.1-.7.3-.2.2-.3.5-.3.7s.1.5.3.7z\"\/><\/svg><\/span><\/span><\/label><input type=\"checkbox\"  id=\"ez-toc-cssicon-toggle-item-6a295139b9880\" checked aria-label=\"Toggle\" \/><nav><ul class='ez-toc-list ez-toc-list-level-1 
' ><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-1\" href=\"https:\/\/buradabiliyorum.com\/en\/new-to-computer-vision-and-medical-imaging-start-with-these-10-projects\/#Project_1_MNIST_and_Fashion_MNIST_for_Image_Classification_Level_Easy\" >Project 1: MNIST and Fashion MNIST for Image Classification (Level: Easy)<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-2\" href=\"https:\/\/buradabiliyorum.com\/en\/new-to-computer-vision-and-medical-imaging-start-with-these-10-projects\/#Project_2_Pathology_Classification_for_Medical_Images_Level_Easy\" >Project 2: Pathology Classification for Medical Images (Level: Easy)<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-3\" href=\"https:\/\/buradabiliyorum.com\/en\/new-to-computer-vision-and-medical-imaging-start-with-these-10-projects\/#Project_3_AI_Explainability_for_Multi-label_Image_Classification_Level_Easy\" >Project 3: AI Explainability for Multi-label Image Classification (Level: Easy)<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-4\" href=\"https:\/\/buradabiliyorum.com\/en\/new-to-computer-vision-and-medical-imaging-start-with-these-10-projects\/#Project_4_Transfer_learning_for_2D_Bounding_box_detection_on_new_objects_Level_Medium\" >Project 4: Transfer learning for 2D Bounding box detection on new objects (Level: Medium)<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-5\" href=\"https:\/\/buradabiliyorum.com\/en\/new-to-computer-vision-and-medical-imaging-start-with-these-10-projects\/#Project_5_Personalized_Medicine_and_Explainability_Level_Medium\" >Project 5: Personalized Medicine and Explainability (Level: Medium)<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-6\" 
href=\"https:\/\/buradabiliyorum.com\/en\/new-to-computer-vision-and-medical-imaging-start-with-these-10-projects\/#Project_6_Point_cloud_segmentation_for_object_detection_Level_Hard\" >Project 6: Point cloud segmentation for object detection. (Level: Hard)<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-7\" href=\"https:\/\/buradabiliyorum.com\/en\/new-to-computer-vision-and-medical-imaging-start-with-these-10-projects\/#Project_7_Image_semantic_segmentation_using_U-net_for_binary_and_multi-class_Medium\" >Project 7: Image semantic segmentation using U-net for binary and multi-class. (Medium)<\/a><\/li><li class='ez-toc-page-1 ez-toc-heading-level-2'><a class=\"ez-toc-link ez-toc-heading-8\" href=\"https:\/\/buradabiliyorum.com\/en\/new-to-computer-vision-and-medical-imaging-start-with-these-10-projects\/#Project_8_Machine_Translation_for_Posture_and_Intention_Classification_Level_Hard\" >Project 8: Machine Translation for Posture and Intention Classification (Level: Hard)<\/a><\/li><\/ul><\/nav><\/div>\n<p>&#8220;<strong>#New to computer vision and medical imaging? Start with these 10 projects<\/strong>&#8221;<\/p>\n<div style=\"text-align: left;\">\n<p id=\"8e19\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\">Computer vision (CV) is a field of artificial intelligence (AI) and computer science that enables automated systems to see, i.e. to process images and video in<span id=\"rmm\"><span>\u00a0<\/span><\/span>a human-like manner to detect and identify objects or regions of importance, predict an outcome, or even alter the image to a desired format [1]. 
The most popular use cases in the CV domain include automated perception for autonomous driving, augmented and virtual realities (AR, VR) for simulations, games, glasses, reality, and fashion or beauty-oriented e-commerce.<\/p>\n<p class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\">Medical image (MI) processing, on the other hand, involves much more detailed analysis of medical images, which are typically grayscale (such as MRI, CT, or X-ray images), for automated pathology detection, a task that otherwise requires a trained specialist\u2019s eye. The most popular use cases in the MI domain include automated pathology labeling, localization, association with treatment or prognostics, and personalized medicine.<\/p>\n<p id=\"effa\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\">Prior to the advent of deep learning methods, 2D signal processing solutions such as image filtering, wavelet transforms, and image registration, followed by classification models [2\u20133], were heavily applied in solution frameworks. 
Signal processing solutions remain the top choice for model baselining owing to their low latency and high generalizability across data sets.<\/p>\n<p class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\">However, deep learning solutions and frameworks have emerged as a new favorite owing to their end-to-end nature, which eliminates the need for feature engineering, feature selection, and output thresholding altogether. In this tutorial, we will review \u201c<em class=\"ko\">Top 10\u201d<\/em><span>\u00a0project<\/span> choices for<span>\u00a0<\/span><em class=\"ko\">beginners<span>\u00a0<\/span><\/em>in the fields of CV and MI and provide examples with data and starter code to aid self-paced learning.<\/p>\n<p id=\"9e68\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\">CV and MI solution frameworks can be analyzed in three segments:<span>\u00a0<\/span><em class=\"ko\">Data, Process,<\/em><span>\u00a0<\/span>and<span>\u00a0<\/span><em class=\"ko\">Outcomes<span>\u00a0<\/span><\/em>[4]. It helps to always think of the<span>\u00a0<\/span><em class=\"ko\">data<\/em>\u00a0required for such solution frameworks as having the format \u201c{X,Y}\u201d, where X represents the image\/video data and Y represents the data target or labels. While naturally occurring unlabelled images and video sequences (X) can be plentiful, acquiring accurate labels (Y) can be an expensive process. 
With the advent of several data annotation platforms such as [5\u20137], images and videos can be labeled for each use case.<\/p>\n<p id=\"fc90\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\">Since deep learning models typically rely on large volumes of annotated data to automatically learn features for subsequent detection tasks, the CV and MI domains often suffer from the \u201c<a rel=\"nofollow noopener\" target=\"_blank\" class=\"eh kp\" href=\"https:\/\/blog.fourthbrain.ai\/check-out-our-graduates-final-projects\">small data challenge<\/a>\u201d, wherein the number of samples available for training a machine learning model is several orders of magnitude smaller than the number of model parameters.<\/p>\n<p class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\">The \u201csmall data challenge\u201d, if unaddressed, can lead to overfit or underfit models that may not generalize to new, unseen test data sets. 
Thus, the<span>\u00a0<\/span><em class=\"ko\">process<\/em>\u00a0of designing a solution framework for CV and MI domains must always include model complexity constraints, wherein models with fewer parameters are typically preferred to prevent model overfitting.<\/p>\n<p class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\">Finally, the solution framework outcomes are analyzed both qualitatively through visualization solutions and quantitatively in terms of well-known metrics such as precision, recall, accuracy, and F1 or Dice coefficients [8\u20139].<\/p>\n<p id=\"d705\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\">The projects listed below span a range of difficulty levels (<em class=\"ko\">Easy, Medium, Hard<\/em>) with respect to data pre-processing and model building. These projects also represent a variety of use cases that are currently prevalent in the research and engineering communities. Each project is defined in terms of its<span>\u00a0<\/span><em class=\"ko\">Goal, Methods,<\/em><span>\u00a0<\/span>and<span>\u00a0<\/span><em class=\"ko\">Results<\/em>.<\/p>\n<h2 id=\"a081\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\"><span class=\"ez-toc-section\" id=\"Project_1_MNIST_and_Fashion_MNIST_for_Image_Classification_Level_Easy\"><\/span><strong class=\"ju gn\">Project 1: MNIST and Fashion MNIST for Image Classification (Level: Easy)<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p id=\"a133\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Goal:<span>\u00a0<\/span><\/em>To process images (X) of size [28\u00d728] pixels and classify them into one of the 10 output categories (Y). 
For the MNIST data set, the input images are handwritten digits in the range 0 to 9 [10]. The training and test data sets contain 60,000 and 10,000 labeled images, respectively. Inspired by the handwritten digit recognition problem, another data set called the Fashion MNIST data set was launched [11], where the goal is to classify images (of size [28\u00d728]) into clothing categories as shown in Fig. 1.<\/p>\n<figure class=\"kr ks kt ku kv ix fs ft paragraph-image\">\n<div class=\"iy iz am ja v jb\" tabindex=\"0\" role=\"button\">\n<div class=\"fs ft kq\">\n<div class=\"jh s am ji\">\n<div class=\"kw jk s\">\n<div class=\"fa jc ep fd ez jd v je jf jg\"><img loading=\"lazy\" decoding=\"async\" class=\"ep fd ez jd v jl jm fe acf lazyreplaced\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/60\/0*wfYSrPZOQsMaAnvd?q=20\" alt=\"\" width=\"1154\" height=\"590\" srcset=\"\"\/><\/div>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"oa xq ep fd ez jd v c lazyreplaced\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/1154\/0*wfYSrPZOQsMaAnvd\" alt=\"\" width=\"1154\" height=\"590\" srcset=\"\"\/><figcaption>Fig. 1: The MNIST and Fashion MNIST data sets with 10 output categories each. (Image by Author)<\/figcaption><\/figure>\n<\/div>\n<\/div>\n<\/div>\n<\/div><figcaption class=\"jo jp fu fs ft jq jr az b ba bb dt\" data-selectable-paragraph=\"\"\/><\/figure>\n<p id=\"ecd0\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Methods:<span>\u00a0<\/span><\/em>When the input images are small ([28\u00d728] pixels) and grayscale, convolutional neural network (CNN) models with anywhere from one to several convolutional layers are suitable classification models. An example of an MNIST classification model built using Keras is presented in the Colab file:<\/p>\n<p id=\"8830\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><a rel=\"nofollow noopener\" target=\"_blank\" class=\"eh kp\" href=\"https:\/\/colab.research.google.com\/github\/AviatorMoser\/keras-mnist-tutorial\/blob\/master\/MNIST%20in%20Keras.ipynb#scrollTo=uPtlBJoPhI9f\">MNIST colab file<\/a><\/p>\n<p id=\"5e94\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\">Another example of classification on the Fashion MNIST data set is shown in:<\/p>\n<p id=\"ab0e\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><a rel=\"nofollow noopener\" target=\"_blank\" class=\"eh kp\" href=\"https:\/\/colab.research.google.com\/github\/tensorflow\/tpu\/blob\/master\/tools\/colab\/fashion_mnist.ipynb#scrollTo=DkYyndj8oO24\">Fashion MNIST Colab file<\/a><\/p>\n<p id=\"fcf1\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\">In both 
instances, the key parameters to tune include the number of layers, dropout, optimizer (adaptive optimizers preferred), learning rate, and kernel size, as seen in the notebooks linked above. Since this is a multi-class problem, the \u2018softmax\u2019 activation function is used in the final layer to convert the outputs into a probability distribution over the 10 classes, so that the class with the highest probability is taken as the prediction.<\/p>\n<p id=\"eaaa\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Results<\/em>: As the number of convolutional layers increases from 1 to 10, the classification accuracy is found to increase as well. The MNIST data set is well studied in the literature, with test accuracies in the range of 96\u201399%. For the Fashion MNIST data set, test accuracies are typically in the range of 90\u201396%. An example of visualization of the MNIST classification outcome using CNN models is shown in Fig. 2 below (<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/www.cs.ryerson.ca\/~aharley\/vis\/conv\/\">See visualization at front end here<\/a>).<\/p>\n<figure class=\"kr ks kt ku kv ix fs ft paragraph-image\">\n<div class=\"iy iz am ja v jb\" tabindex=\"0\" role=\"button\">\n<div class=\"fs ft ky\">\n<div class=\"jh s am ji\">\n<div class=\"kz jk s\">\n<div class=\"fa jc ep fd ez jd v je jf jg\"><img loading=\"lazy\" decoding=\"async\" class=\"ep fd ez jd v jl jm fe acf lazyreplaced\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/60\/0*isLs9bvkahccUAwy?q=20\" alt=\"\" width=\"1600\" height=\"759\" srcset=\"\"\/><\/div>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"oa xq ep fd ez jd v c lazyreplaced\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/1600\/0*isLs9bvkahccUAwy\" alt=\"\" width=\"1600\" height=\"759\" srcset=\"\"\/><figcaption>Fig. 2: Example of visualizing the outcome of a CNN model for MNIST data. The input is shown in the top left corner and the respective layer activations are shown. 
The final result is between 5 and 8.<\/figcaption><\/figure>\n<\/div>\n<\/div>\n<\/div>\n<\/div><figcaption class=\"jo jp fu fs ft jq jr az b ba bb dt\" data-selectable-paragraph=\"\"\/><\/figure>\n<h2 id=\"d8fa\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\"><span class=\"ez-toc-section\" id=\"Project_2_Pathology_Classification_for_Medical_Images_Level_Easy\"><\/span><strong class=\"ju gn\">Project 2: Pathology Classification for Medical Images (Level: Easy)<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p id=\"5293\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Goal:<span>\u00a0<\/span><\/em>To classify medical images (acquired using Optical Coherence Tomography, OCT) as Normal, Diabetic Macular Edema (DME), Drusen, or Choroidal Neovascularization (CNV), as shown in [12]. The data set contains about 84,000 training images and about 1,000 test images with labels, and each image has a width of 800 to 1,000 pixels, as shown in Fig 2.<\/p>\n<figure class=\"kr ks kt ku kv ix fs ft paragraph-image\">\n<div class=\"iy iz am ja v jb\" tabindex=\"0\" role=\"button\">\n<div class=\"fs ft la\">\n<div class=\"jh s am ji\">\n<div class=\"lb jk s\">\n<div class=\"fa jc ep fd ez jd v je jf jg\">\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"ep fd ez jd v jl jm fe acf aligncenter lazyreplaced\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/60\/0*OTGPfd_7jLM0Ivwu?q=20\" alt=\"\" width=\"708\" height=\"543\" srcset=\"\"\/><\/figure>\n<\/div>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"oa xq ep fd ez jd v c lazyreplaced\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/708\/0*OTGPfd_7jLM0Ivwu\" alt=\"\" width=\"708\" height=\"543\" srcset=\"\"\/><figcaption>Fig 2: Examples of OCT images from the Kaggle Dataset in [12].<\/figcaption><\/figure>\n<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div><figcaption class=\"jo jp fu fs ft jq jr az b ba bb dt\" data-selectable-paragraph=\"\"\/><\/figure>\n<p id=\"9dc2\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Methods:<span>\u00a0<\/span><\/em>Deep CNN models such as ResNet and CapsuleNet [12] have been applied to classify this data set. The images need to be resized to [512\u00d7512] or [256\u00d7256] to be fed to standard classification models. 
Since medical images have less variation in object categories per image frame when compared to non-medical outdoor and indoor images, the number of medical images required to train large CNN models is found to be significantly smaller than the number of non-medical images.<\/p>\n<p class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\">The work in [12] and the<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" class=\"eh kp\" href=\"https:\/\/github.com\/anoopsanka\/retinal_oct\/blob\/main\/notebooks\/11-Experiments_on_Supervised_Model.ipynb\">OCT code base<\/a><span>\u00a0<\/span>demonstrate retraining the ResNet layers for transfer learning and classification of test images. The parameters to be tuned here include the optimizer, learning rate, size of input images, and number of dense layers added after the ResNet layers.<\/p>\n<p id=\"85e5\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Results:<span>\u00a0<\/span><\/em>For the ResNet model, test accuracy can vary between 94% and 99% depending on the number of training images, as shown in [12]. Fig. 3 
qualitatively demonstrates the performance of the classification model.<\/p>\n<figure class=\"kr ks kt ku kv ix fs ft paragraph-image\">\n<div class=\"iy iz am ja v jb\" tabindex=\"0\" role=\"button\">\n<div class=\"fs ft lc\">\n<div class=\"jh s am ji\">\n<div class=\"ld jk s\">\n<div class=\"fa jc ep fd ez jd v je jf jg\"><img loading=\"lazy\" decoding=\"async\" class=\"ep fd ez jd v jl jm fe acf lazyreplaced\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/60\/0*rhTwsJtH5uOc1ZLL?q=20\" alt=\"\" width=\"950\" height=\"477\" srcset=\"\"\/><\/div>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"oa xq ep fd ez jd v c lazyreplaced\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/950\/0*rhTwsJtH5uOc1ZLL\" alt=\"\" width=\"950\" height=\"477\" srcset=\"\"\/><figcaption>Fig. 
3: Regions of interest (ROIs) for each pathology superimposed on the original image using the\u00a0Gradcam library\u00a0in Python. (Image by author)<\/figcaption><\/figure>\n<\/div>\n<\/div>\n<\/div>\n<\/div><figcaption class=\"jo jp fu fs ft jq jr az b ba bb dt\" data-selectable-paragraph=\"\"\/><\/figure>\n<p id=\"d12a\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\">These visualizations are produced using the<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" class=\"eh kp\" href=\"https:\/\/keras.io\/examples\/vision\/grad_cam\/\">Gradcam library<\/a><span>\u00a0<\/span>, which overlays a gradient-weighted heatmap of the CNN layer activations onto the original image to highlight the regions of interest, or automatically detected features of importance, for the classification task. Grad-CAM is also available through the<span>\u00a0<\/span><em class=\"ko\">tf_explain<\/em>\u00a0library.<\/p>\n<h2 id=\"70cc\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\"><span class=\"ez-toc-section\" id=\"Project_3_AI_Explainability_for_Multi-label_Image_Classification_Level_Easy\"><\/span><strong class=\"ju gn\">Project 3: AI Explainability for Multi-label Image Classification (Level: Easy)<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p id=\"d157\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Goal:<span>\u00a0<\/span><\/em>CNN models enable end-to-end delivery, which means there is no need to engineer and rank features for classification, and the model outcome is the desired process outcome. 
However, it is often important to visualize and explain CNN model performance, as shown in the later parts of Project 2.<\/p>\n<p class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\">Some well-known visualization and explainability libraries are<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" class=\"eh kp\" href=\"https:\/\/tf-explain.readthedocs.io\/en\/latest\/\">tf_explain<\/a><span>\u00a0<\/span>and<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" class=\"eh kp\" href=\"https:\/\/homes.cs.washington.edu\/~marcotcr\/blog\/lime\/\">Local Interpretable Model-Agnostic Explanations (LIME)<\/a>. In this project, the goal is to achieve multi-label classification and explain which features the CNN model relies on to classify images in a particular way. In this case, we consider a multi-label scenario wherein one image can contain multiple objects, for example a cat and a dog in<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" class=\"eh kp\" href=\"https:\/\/colab.research.google.com\/github\/arteagac\/arteagac.github.io\/blob\/master\/blog\/lime_image.ipynb#scrollTo=8fPvSbn0woWP\">Colab for LIME<\/a>.<\/p>\n<p id=\"6246\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\">Here, the input is an image containing both a cat and a dog, and the goal is to identify which regions correspond to the cat and which to the dog.<\/p>\n<p id=\"2f62\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Method:<span>\u00a0<\/span><\/em>In this project, each image is subjected to super-pixel segmentation that divides the image into several sub-regions with similar pixel color and texture characteristics. The number of sub-regions can be manually provided as a parameter. 
Next, perturbed copies of the image are generated by switching random subsets of superpixels on or off, and the InceptionV3 model is invoked on each copy to obtain the probability of the classes of interest, out of the 1000 classes that InceptionV3 was originally trained on. Finally, a weighted linear regression model is fit from the on\/off state of each superpixel to these probabilities, and its coefficients explain the ROIs corresponding to each class, as shown in Fig. 4.<\/p>\n<figure class=\"kr ks kt ku kv ix fs ft paragraph-image\">\n<div class=\"iy iz am ja v jb\" tabindex=\"0\" role=\"button\">\n<div class=\"fs ft le\">\n<div class=\"jh s am ji\">\n<div class=\"lf jk s\">\n<div class=\"fa jc ep fd ez jd v je jf jg\">\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"ep fd ez jd v jl jm fe acf aligncenter lazyreplaced\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/60\/0*2ghfBk_nkZAi1EWi?q=20\" alt=\"\" width=\"808\" height=\"531\" srcset=\"\"\/><\/figure>\n<\/div>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"oa xq ep fd ez jd v c lazyreplaced\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/808\/0*2ghfBk_nkZAi1EWi\" alt=\"\" width=\"808\" height=\"531\" srcset=\"\"\/><figcaption>Fig. 4: Explainability of image super-pixels using regression-like models in LIME. (Image by author)<\/figcaption><\/figure>\n<\/div>\n<\/div>\n<\/div>\n<\/div><figcaption class=\"jo jp fu fs ft jq jr az b ba bb dt\" data-selectable-paragraph=\"\"\/><\/figure>\n<p id=\"2eea\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Results:<span>\u00a0<\/span><\/em>Using the proposed method, ROIs in most non-medical images should be explainable. Qualitative assessment and explainability as shown here are especially useful in corner cases, or in situations where the model misclassified or missed objects of interest. In such situations, explaining what the CNN model is looking at and boosting ROIs accordingly to correct overall classification performance can help significantly reduce data-induced biases.<\/p>\n<h2 id=\"7702\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\"><span class=\"ez-toc-section\" id=\"Project_4_Transfer_learning_for_2D_Bounding_box_detection_on_new_objects_Level_Medium\"><\/span><strong class=\"ju gn\">Project 4: Transfer learning for 2D Bounding box detection on new objects (Level: Medium)<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p id=\"a444\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Goal:<span>\u00a0<\/span><\/em>The next step after image classification is the detection of objects of interest by placing bounding boxes around them. 
This is a significant problem in autonomous driving, where moving objects such as cars and pedestrians must be accurately distinguished from static objects such as roadblocks, street signs, trees, and buildings.<\/p>\n<p class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\">The major difference between this project and the prior projects is the format of the data. Here, labels Y are typically in the form of [x,y,w,h] per object of interest, where (x,y) typically represents the top-left corner of the bounding box and<span>\u00a0<\/span><em class=\"ko\">w<\/em><span>\u00a0<\/span>and<span>\u00a0<\/span><em class=\"ko\">h<\/em><span>\u00a0<\/span>correspond to the width and height of the output bounding box. In this project, the goal is to leverage a pre-trained classifier for its feature extraction capabilities and then to retrain it on a small set of images to create a tight bounding box around a new object.<\/p>\n<p id=\"9ea8\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Method:<span>\u00a0<\/span><\/em>In the\u00a0<a rel=\"nofollow noopener\" target=\"_blank\" class=\"eh kp\" href=\"https:\/\/colab.research.google.com\/github\/tensorflow\/models\/blob\/master\/research\/object_detection\/colab_tutorials\/eager_few_shot_od_training_tf2_colab.ipynb\">Bounding Box colab<\/a>, we can extend a pre-trained object detector, such as a single shot detector (SSD) with ResNet50 skip connections and a feature pyramid network backbone that is pre-trained for object detection on the MS-COCO dataset [13], to detect a completely unseen new object category, a<span>\u00a0<\/span><em class=\"ko\">rubber duck<\/em>\u00a0in this case.<\/p>\n<p class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\">In this transfer learning setup, the already learned weights from 
early layers of the object detector are useful to extract local structural and textural information from images and only the final classifier layer requires retraining for the new object class. This enables retraining the object detector for a new class, such as a rubber duck in this use case, using as few as 5\u201315 images of the new object. The parameters to be tuned include optimizer, learning rate, input image size, and number of neurons in the final classifier layer.<\/p>\n<p id=\"a43e\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Results:<span>\u00a0<\/span><\/em>One major difference between object detectors and the prior CNN-based classifier models shown above is an additional output metric called Intersection over Union (IoU) [11] that measures the extent of overlap between the actual bounding box and the predicted bounding box. Additionally, an object detector model typically consists of a classifier (that predicts the object class) and a bounding box regressor that predicts the dimensions of the bounding box around the object. An example of the Google API for object detection on a new unseen image is shown in Fig. 
5 and code below.<\/p>\n<p id=\"50b7\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\">Extensions of the 2D bounding box detector to 3D bounding boxes specifically for autonomous driving are shown in<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" class=\"eh kp\" href=\"https:\/\/blog.fourthbrain.ai\/check-out-our-graduates-final-projects\">these projects<\/a>.<\/p>\n<figure class=\"kr ks kt ku kv ix fs ft paragraph-image\">\n<div class=\"iy iz am ja v jb\" tabindex=\"0\" role=\"button\">\n<div class=\"fs ft lg\">\n<div class=\"jh s am ji\">\n<div class=\"lh jk s\">\n<div class=\"fa jc ep fd ez jd v je jf jg\">\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"ep fd ez jd v jl jm fe acf aligncenter lazyreplaced\" src=\"https:\/\/miro.medium.com\/max\/60\/0*YlUgY3abqP-WXbPg?q=20\" alt=\"Fig 5: Example of 2D bounding box detection using the tensorflow api for object detection\" width=\"1127\" height=\"849\" srcset=\"\"\/><\/figure>\n<\/div>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"oa xq ep fd ez jd v c lazyreplaced\" src=\"https:\/\/miro.medium.com\/max\/1127\/0*YlUgY3abqP-WXbPg\" alt=\"\" width=\"1127\" height=\"849\" srcset=\"\"\/><figcaption>Fig 5: Example of 2D bounding box detection using the TensorFlow API for object detection<\/figcaption><\/figure>\n<\/div>\n<\/div>\n<\/div>\n<\/div><figcaption class=\"jo jp fu fs ft jq jr az b ba bb dt\" data-selectable-paragraph=\"\"\/><\/figure>\n<h2 id=\"26bf\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\"><span class=\"ez-toc-section\" id=\"Project_5_Personalized_Medicine_and_Explainability_Level_Medium\"><\/span><strong class=\"ju gn\">Project 5: Personalized Medicine and Explainability (Level: Medium)<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p id=\"1f81\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Goal:<span>\u00a0<\/span><\/em>In this project, the goal is to automatically segment ROIs from multiple pathology sites to classify the extent of anemia-like pallor in a patient and track the pallor over time [13]. The two major differences between this project and the previous ones are: 1) pallor needs to be detected across multiple image sites, such as the conjunctiva (under eye) and tongue, to predict a single label as shown in Fig. 
6, and 2) ROIs corresponding to pallor need to be displayed and tracked over time.<\/p>\n<figure class=\"kr ks kt ku kv ix fs ft paragraph-image\">\n<div class=\"iy iz am ja v jb\" tabindex=\"0\" role=\"button\">\n<div class=\"fs ft li\">\n<div class=\"jh s am ji\">\n<div class=\"lj jk s\">\n<div class=\"fa jc ep fd ez jd v je jf jg\">\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"ep fd ez jd v jl jm fe acf aligncenter lazyreplaced\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/60\/0*VOIlTHaq2up0ddkX?q=20\" alt=\"\" width=\"816\" height=\"462\" srcset=\"\"\/><\/figure>\n<\/div>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"oa xq ep fd ez jd v c lazyreplaced\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/816\/0*VOIlTHaq2up0ddkX\" alt=\"\" width=\"816\" height=\"462\" srcset=\"\"\/><figcaption>Fig 6: Example of anemia-like pallor detection using images processed from multiple pathological sites. 
(Image by author)<\/figcaption><\/figure>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/figure>\n<p id=\"c7d6\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Methods:<span>\u00a0<\/span><\/em>For this project, feature-based models and CNN-based classifiers are applied with heavy data augmentation using the<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" class=\"eh kp\" href=\"https:\/\/keras.io\/api\/preprocessing\/image\/\">ImageDataGenerator<\/a><span>\u00a0<\/span>in Keras. To fuse the outcomes from multiple pathology sites, early, mid, or late fusion can be applied.<\/p>\n<p class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\">The work in [13] applies late fusion, wherein the layer before the classifier, which is considered to be the optimal feature representation of the image, is used to fuse features across multiple pathological sites. Finally, the Deepdream algorithm, as shown in the<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" class=\"eh kp\" href=\"https:\/\/colab.research.google.com\/github\/tensorflow\/docs\/blob\/master\/site\/en\/tutorials\/generative\/deepdream.ipynb\">Deepdream Colab<\/a>, is applied to the original eye and tongue images to visualize the ROIs and explain the extent of pathology. 
The parameters to tune in this project include the parameters from Project 2 along with the additive gradient factor for the Deepdream visualizations.<\/p>\n<p id=\"adc6\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Results:<span>\u00a0<\/span><\/em>The data for this work is available for<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" class=\"eh kp\" href=\"https:\/\/sites.google.com\/site\/sohiniroychowdhury\/automated-pallor-detection-project?authuser=0\">benchmarking<\/a>. The visualizations produced by the Deepdream algorithm are shown in Fig. 7, where we observe a higher concentration of features corresponding to pallor in the under-eye blood vessels than anywhere else in the eye. Similarly, we observe differences in features between the inner and outer segments of the tongue. These assessments are useful to create a personalized pathology tracking system for patients with anemia.<\/p>\n<figure class=\"kr ks kt ku kv ix fs ft paragraph-image\">\n<div class=\"fs ft lk\">\n<div class=\"jh s am ji\">\n<div class=\"ll jk s\">\n<div class=\"fa jc ep fd ez jd v je jf jg\">\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"ep fd ez jd v jl jm fe acf aligncenter lazyreplaced\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/60\/0*t7Ku338S-3drXKLS?q=20\" alt=\"\" width=\"560\" height=\"200\" srcset=\"\"\/><\/figure>\n<\/div>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"oa xq ep fd ez jd v c lazyreplaced\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/560\/0*t7Ku338S-3drXKLS\" sizes=\"auto, 560px\" alt=\"\" width=\"560\" height=\"200\" srcset=\"https:\/\/miro.medium.com\/max\/276\/0*t7Ku338S-3drXKLS 276w, https:\/\/miro.medium.com\/max\/552\/0*t7Ku338S-3drXKLS 552w, https:\/\/miro.medium.com\/max\/560\/0*t7Ku338S-3drXKLS 560w\"\/><figcaption>Fig 7: Example of feature concentrations from the Deep Dream implementation. 
Heavy concentration of gradients is observed in the conjunctiva or under-eye blood vessel regions. (Image by author)<\/figcaption><\/figure>\n<\/div>\n<\/div>\n<\/div><figcaption class=\"jo jp fu fs ft jq jr az b ba bb dt\" data-selectable-paragraph=\"\"\/><\/figure>\n<h2 id=\"c112\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\"><span class=\"ez-toc-section\" id=\"Project_6_Point_cloud_segmentation_for_object_detection_Level_Hard\"><\/span><strong class=\"ju gn\">Project 6: Point cloud segmentation for object detection. 
(Level: Hard)<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p id=\"a77e\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Goal:<span>\u00a0<\/span><\/em>In this project, the input is a stream of point clouds, i.e., the output from Lidar sensors that provide depth resolution. The primary difference between Lidar point clouds and an image is that point clouds provide 3D resolution, so each voxel (the 3D equivalent of a pixel) represents the location and height of an object relative to the Lidar source. The main challenges posed by point cloud data models are i) model computational complexity if 3D convolutions are used and ii) object transformation invariance, which means a rotated object should be detected as the object itself as shown in [13].<\/p>\n<p id=\"a098\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Method:<span>\u00a0<\/span><\/em>The data set for this project is the ModelNet40 shape classification benchmark that contains over 12,000 3D models from 40 object classes. Each object is sub-sampled to extract a fixed number of points, followed by augmentation to cater to multiple transformations in shape. 
Next, 1D convolutions are used to learn the shape<em class=\"ko\">ness<span>\u00a0<\/span><\/em>features using the PyTorch library in the<span>\u00a0<\/span><a rel=\"nofollow noopener\" target=\"_blank\" class=\"eh kp\" href=\"https:\/\/colab.research.google.com\/github\/nikitakaraevv\/pointnet\/blob\/master\/nbs\/PointNetClass.ipynb\">Pointnet colab<\/a><span>\u00a0<\/span>as shown below.<\/p>\n<p id=\"18ea\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Results:<\/em><span>\u00a0<\/span>The outcome of the model can be summarized using Fig. 8 below. This method can achieve up to 89% training accuracy for object classification and can also be extended to 3D semantic segmentation. Extensions to this work can be useful for 3D bounding box detection for autonomous driving use cases.<\/p>\n<figure class=\"kr ks kt ku kv ix fs ft paragraph-image\">\n<div class=\"iy iz am ja v jb\" tabindex=\"0\" role=\"button\">\n<div class=\"fs ft lm\">\n<div class=\"jh s am ji\">\n<div class=\"ln jk s\">\n<div class=\"fa jc ep fd ez jd v je jf jg\">\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"ep fd ez jd v jl jm fe acf aligncenter lazyreplaced\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/60\/0*W05OKgP1TODxqGlt?q=20\" alt=\"\" width=\"875\" height=\"372\" srcset=\"\"\/><\/figure>\n<\/div>\n<p><figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"oa xq ep fd ez jd v c lazyreplaced\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/875\/0*W05OKgP1TODxqGlt\" alt=\"\" width=\"875\" height=\"372\" srcset=\"\"\/><figcaption>Fig. 8: Image from [15] that identifies objects from the point clouds<\/figcaption><\/figure>\n<\/p>\n<\/div>\n<\/div>\n<\/div>\n<\/div><figcaption class=\"jo jp fu fs ft jq jr az b ba bb dt\" data-selectable-paragraph=\"\"\/><\/figure>\n<h2 id=\"d29a\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\"><span class=\"ez-toc-section\" id=\"Project_7_Image_semantic_segmentation_using_U-net_for_binary_and_multi-class_Medium\"><\/span><strong class=\"ju gn\">Project 7: Image semantic segmentation using U-net for binary and multi-class. (Medium)<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p id=\"39a1\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Goal:<span>\u00a0<\/span><\/em>The CNN models so far have been applied to automatically learn features that can then be used for classification. This process is known as<span>\u00a0<\/span><em class=\"ko\">feature encoding<\/em>. As a next step, we apply a decoder unit with a structure similar to that of the encoder to enable generation of an output image. This encoder-decoder pairing enables the input and output to have similar dimensions, i.e. 
the input is an image and the output is also an image.<\/p>\n<p id=\"0a27\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Methods:<span>\u00a0<\/span><\/em>The encoder-decoder combination with residual skip connections is popularly known as the U-net [15]. For binary and multi-class problems, the data has to be formatted such that if X (input image) has dimensions [m x m] pixels, Y has dimensions [m x m x d], where \u2018d\u2019 is the number of classes to be predicted. The parameters to tune include the optimizer, learning rate, and depth of the U-net model as shown in [15] and Fig. 9 below (<a rel=\"nofollow noopener\" target=\"_blank\" href=\"https:\/\/paperswithcode.com\/paper\/multiresunet-rethinking-the-u-net\">source here<\/a>).<\/p>\n<figure class=\"kr ks kt ku kv ix fs ft paragraph-image\">\n<div class=\"iy iz am ja v jb\" tabindex=\"0\" role=\"button\">\n<div class=\"fs ft ky\">\n<div class=\"jh s am ji\">\n<div class=\"lo jk s\">\n<div class=\"fa jc ep fd ez jd v je jf jg\"><img loading=\"lazy\" decoding=\"async\" class=\"ep fd ez jd v jl jm fe acf lazyreplaced\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/60\/0*agnAfRLrZi10Zy2O?q=20\" alt=\"\" width=\"1600\" height=\"568\" srcset=\"\"\/><\/div>\n<figure class=\"post-image post-mediaBleed aligncenter\"><img loading=\"lazy\" decoding=\"async\" class=\"oa xq ep fd ez jd v c lazyreplaced\" role=\"presentation\" src=\"https:\/\/miro.medium.com\/max\/1600\/0*agnAfRLrZi10Zy2O\" alt=\"\" width=\"1600\" height=\"568\" srcset=\"\"\/><figcaption>Fig. 9. Example of U-net model.<\/figcaption><\/figure>\n<\/div>\n<\/div>\n<\/div>\n<\/div>\n<\/figure>\n<p id=\"e964\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Results:<span>\u00a0<\/span><\/em>The U-net model can learn to generate binary and multi-class semantic maps from large and small data sets [16\u201317], but it is found to be sensitive to data imbalance. 
Thus, selecting the right training data set is important for optimal outcomes. Other extensions to this work would include DenseNet connections to the model, or other encoder-decoder networks such as MobileNet or Xception networks [17].<\/p>\n<h2 id=\"c7f2\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\"><span class=\"ez-toc-section\" id=\"Project_8_Machine_Translation_for_Posture_and_Intention_Classification_Level_Hard\"><\/span><strong class=\"ju gn\">Project 8: Machine Translation for Posture and Intention Classification (Level: Hard)<\/strong><span class=\"ez-toc-section-end\"><\/span><\/h2>\n<p id=\"50b8\" class=\"js jt gm ju b hl jv jw jx ho jy jz ka kb kc kd ke kf kg kh ki kj kk kl km kn gf hj\" data-selectable-paragraph=\"\"><em class=\"ko\">Goal:<span>\u00a0<\/span><\/em>Automated detection of posture or gesture often includes keypoint identification (such as identification of the skeletal structure) in videos that can lead to identification of posture (standing, walking, moving) or intention for pedestrians (crossing road, not crossing), etc. [18\u201319], as shown in Fig. 10 below. 
For this category of problems, keyframe information from multiple subsequent video frames is processed collectively to generate<a rel=\"nofollow noopener\" target=\"_blank\" class=\"eh kp\" href=\"https:\/\/www.youtube.com\/watch?v=a7SrsA--mtA&amp;t=1s\"><span>\u00a0<\/span>pose\/intention-related predictions.<\/a><\/p>\n<\/div>\n<p><span style=\"color: black;\"><a style=\"color: #ff9900;\" href=\"https:\/\/thenextweb.com\/news\/computer-vision-and-medical-imaging-start-with-10-projects-beginners-syndication\" target=\"_blank\" rel=\"noopener\">Source<\/a><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>&#8220;#New to computer 
vision and medical imaging? Start with these 10 projects&#8221; (AI) and computer science that enables automated systems to see, i.e. to process images and video in\u00a0a human-like manner to detect and identify objects or regions of importance, predict an outcome or even alter the image to a desired format [1]. Most popular&#8230;<\/p>\n","protected":false},"author":1,"featured_media":229446,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"fifu_image_url":"https:\/\/img-cdn.tnwcdn.com\/image\/neural?filter_last=1&fit=1280,640&url=https:\/\/cdn0.tnwcdn.com\/wp-content\/blogs.dir\/1\/files\/2021\/04\/hero-.png&signature=cd61268df83fc72ad7dcae4ffc028246","fifu_image_alt":"","footnotes":""},"categories":[18],"tags":[],"class_list":["post-229445","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technology"],"_links":{"self":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/229445","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/comments?post=229445"}],"version-history":[{"count":0,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/posts\/229445\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media\/229446"}],"wp:attachment":[{"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/media?parent=229445"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/categories?post=229445"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/buradabiliyorum.com\/en\/wp-json\/wp\/v2\/
tags?post=229445"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}