📉
Tutorials
  • Computer History
  • Function
    • Finance
      • Calculate
    • Manage Data
    • Date&Time
    • Strings and Character
  • Snippets
    • Web Application
      • Hugo
      • JavaScript
        • Stopwatch using JavaScript?
    • Note
    • Start Project
      • GitHub
      • GitLab
    • Python Programming
      • Strings and Character Data
      • List
      • Dictionaries
    • Data Science
      • Setting Option
      • Get Data
  • Link Center
    • Next Articles
    • Google
    • Excel VBA
    • Python
      • Notebook
    • WebApp
      • Vue.js
    • Finance
    • Project
      • Kids
        • Scratch
      • Finance
        • Plotly.js
        • Portfolio
      • Mini Lab
        • Systems Administration
        • Auto Adjust Image
      • Sending Emails
      • ECS
        • Knowledge Base
        • ระบบผู้เชี่ยวชาญ (Expert System)
        • Check product
        • Compare two SQL databases
      • e-Library
        • Knowledge base
        • การจัดหมวดหมู่ห้องสมุด
        • Temp
      • AppSheet
        • บัญชีรายรับรายจ่าย
      • Weather App
      • COVID-19
  • Tutorials
    • Data Science
      • Data Science IPython notebooks
    • UX & UI
      • 7 กฎการออกแบบ UI
    • Web Scraping
      • Scrape Wikipedia Articles
      • Quick Start
    • GUI
      • pysimple
        • Create a GUI
      • Tkinter
        • Python Tkinter Tutorial
      • PyQt
        • PyQt Tutorial
    • MachineLearning
      • การพัฒนา Chat Bot
      • AI ผู้ช่วยใหม่ในการทำ Customer Segmentation
      • Customer Segmentation
      • ตัดคำภาษาไทย ด้วย PyThaiNLP API
    • Excel & VBA
      • INDEX กับ MATCH
      • รวมสูตร Excel ปี 2020
      • How to Write Code in a Spreadsheet
    • Visualization
      • Bokeh
        • Part I: Getting Started
        • Data visualization
        • Plotting a Line Graph
        • Panel Document
        • Interactive Data Visualization
    • VueJS
      • VueJS - Quick Guide
    • Django
      • Customize the Django Admin
      • พัฒนาเว็บด้วย Django
    • Git
      • วิธีสร้าง SSH Key
      • Git คืออะไร
      • เริ่มต้นใช้งาน Git
      • การใช้งาน Git และ Github
      • รวม 10 คำสั่ง Git
      • GIT Push and Pull
    • Finance
      • Stock Analysis using Pandas (Series)
      • Building Investment AI for fintech
      • Resampling Time Series
      • Python for Finance (Series)
      • Stock Data Analysis (Second Edition)
      • Get Stock Data Using Python
      • Stock Price Trend Analysis
      • Calculate Stock Returns
      • Quantitative Trading
      • Backtrader for Backtesting
      • Binance Python API
      • Pine Script (TradingView)
      • Stocks Analysis with Pandas and Scikit-Learn
      • Yahoo Finance API
      • Sentiment Analysis
      • yfinance Library
      • Stock Data Analysis
      • YAHOO_FIN
      • Algorithmic Trading
    • JavaScript
      • Split a number
      • Callback Function
      • The Best JavaScript Examples
      • File and FileReader
      • JavaScript Tutorial
      • Build Reusable HTML Components
      • Developing JavaScript components
      • JavaScript - Quick Guide
      • JavaScript Style Guide()
      • Beginner's Handbook
      • Date Now
    • Frontend
      • HTML
        • File Path
      • Static Site Generators.
        • Creating a New Theme
    • Flask
      • Flask - Quick Guide
      • Flask Dashboards
        • Black Dashboard
        • Light Blue
        • Flask Dashboard Argon
      • Create Flask App
        • Creating First Application
        • Rendering Pages Using Jinja
      • Jinja Templates
        • Primer on Jinja Templating
        • Jinja Template Document
      • Learning Flask
        • Ep.1 Your first Flask app
        • Ep.2 Flask application structure
        • Ep.3 Serving HTML files
        • Ep.4 Serving static files
        • Ep.5 Jinja template inheritance
        • Ep.6 Jinja template design
        • Ep.7 Working with forms in Flask
        • Ep.8 Generating dynamic URLs in Flask
        • Ep.9 Working with JSON data
        • Ep.23 Deploying Flask to a VM
        • Ep.24 Flask and Docker
        • Ep. 25: uWSGI Introduction
        • Ep. 26 Flask before and after request
        • Ep. 27 uWSGI Decorators
        • Ep. 28 uWSGI Decorators
        • Ep. 29 Flask MethodView
        • Ep. 30 Application factory pattern
      • The Flask Mega-Tutorial
        • Chapter 2: Templates
      • Building Flask Apps
      • Practical Flask tutorial series
      • Compiling SCSS to CSS
      • Flask application structure
    • Database
      • READING FROM DATABASES
      • SQLite
        • Data Management
        • Fast subsets of large datasets
      • Pickle Module
        • How to Persist Objects
      • Python SQL Libraries
        • Create Python apps using SQL Server
    • Python
      • Python vs JavaScript
      • Python Pillow – Adjust Image
      • Python Library for Google Search
      • Python 3 - Quick Guide
      • Regular Expressions
        • Python Regular Expressions
        • Regular Expression (RegEx)
        • Validate ZIP Codes
        • Regular Expression Tutorial
      • Python Turtle
      • Python Beginner's Handbook
      • From Beginner to Pro
      • Standard Library
      • Datetime Tutorial
        • Manipulate Times, Dates, and Time Spans
      • Work With a PDF
      • geeksforgeeks.org
        • Python Tutorial
      • Class
      • Modules
        • Modules List
        • pickle Module
      • Working With Files
        • Open, Read, Append, and Other File Handling
        • File Manipulation
        • Reading & Writing to text files
      • Virtual Environments
        • Virtual Environments made easy
        • Virtual Environmen
        • A Primer
        • for Beginners
      • Functions
        • Function Guide
        • Inner Functions
      • Learning Python
        • Pt. 4 Python Strings
        • Pt. 3 Python Variables
      • Zip Function
      • Iterators
      • Try and Except
        • Exceptions: Introduction
        • Exceptions Handling
        • try and excep
        • Errors and Exceptions
        • Errors & Exceptions
      • Control Flow
      • Lambda Functions
        • Lambda Expression คืออะไร
        • map() Function
      • Date and Time
        • Python datetime()
        • Get Current Date and Time
        • datetime in Python
      • Awesome Python
      • Dictionary
        • Dictionary Comprehension
        • ALL ABOUT DICTIONARIES
        • DefaultDict Type for Handling Missing Keys
        • The Definitive Guide
        • Why Functions Modify Lists and Dictionaries
      • Python Structures
      • Variable & Data Types
      • List
        • Lists Explained
        • List Comprehensions
          • Python List Comprehension
          • List Comprehensions in 5-minutes
          • List Comprehension
        • Python List
      • String
        • Strings and Character Data
        • Splitting, Concatenating, and Joining Strings
      • String Formatting
        • Improved String Formatting Syntax
        • String Formatting Best Practices
        • Remove Space
        • Add Spaces
      • Important basic syntax
      • List all the packages
      • comment
    • Pandas
      • Tutorial (GeeksforGeeks)
      • 10 minutes to pandas
      • Options and settings
      • เริ่มต้น Set Up Kaggle.com
      • Pandas - Quick Guide
      • Cookbook
      • NumPy
        • NumPy Package for Scientific
      • IO tools (text, CSV, …)
      • pandas.concat
      • Excel & Google Sheets
        • A Guide to Excel
        • Quickstart to the Google Sheets
        • Python Excel Tutorial: The Definitive Guide
      • Working With Text Data
        • Quickstart
      • API Reference
      • Groupby
      • DateTime Methods
      • DataFrame
      • sort_values()
      • Pundit: Accessing Data in DataFrames
      • datatable
        • DataFrame: to_json()
        • pydatatable
      • Read and Write Files
      • Data Analysis with Pandas
      • Pandas and Python: Top 10
      • 10 minutes to pandas
      • Getting Started with Pandas in Python
    • Markdown
      • Create Responsive HTML Emails
      • Using Markup Languages with Hugo
    • AngularJS
      • Learn AngularJS
    • CSS
      • The CSS Handbook
      • Box Shadow
      • Image Center
      • The CSS Handbook
      • The CSS Handbook
      • Loading Animation
      • CSS Grid Layout
      • Background Image Size
      • Flexbox
  • Series
    • จาวาสคริปต์เบื้องต้น
      • 1: รู้จักกับจาวาสคริปต์
  • Articles
    • Visualization
      • Dash
        • Introducing Dash
    • Finance
      • PyPortfolioOpt
      • Best Libraries for Finance
      • Detection of price support
      • Portfolio Optimization
      • Python Packages For Finance
    • Django
      • เริ่มต้น Django RestFramework
    • General
      • Heroku คืออะไร
      • How to Crack Passwords
    • Notebook
      • IPython Documentation
      • Importing Notebooks
      • Google Colab for Data Analytics
      • Creating Interactive Dashboards
      • The Definitive Guide
      • A gallery of interesting Jupyter Notebooks
      • Advanced Jupyter Notebooks
      • Converting HTML to Notebook
    • Pandas
      • Pandas_UI
      • Pandas Style API
      • Difference Between two Dataframes
      • 19 Essential Snippets in Pandas
      • Time Series Analysis
      • Selecting Columns in a DataFrame
      • Cleaning Up Currency Data
      • Combine Multiple Excel Worksheets
      • Stylin’ with Pandas
      • Pythonic Data Cleaning
      • Make Excel Faster
      • Reading Excel (xlsx) Files
      • How to use iloc and loc for Indexing
      • The Easiest Data Cleaning Method
    • Python
      • pip install package
      • Automating your daily tasks
      • Convert Speech to Text
      • Tutorial, Project Ideas, and Tips
      • Image Handling and Processing
        • Image Processing Part I
        • Image Processing Part II
        • Image tutorial
        • Image Processing with Numpy
        • Converts PIL Image to Numpy Array
      • Convert Dictionary To JSON
      • JSON Dump
      • Speech-to-Text Model
      • Convert Text to Speech
      • Tips & Tricks
        • Fundamentals for Data Science
        • Best Python Code Examples
        • Top 50 Tips & Tricks
        • 11 Beginner Tips
        • 10 Tips & Tricks
      • Password hashing
      • psutil
      • Lambda Expressions
    • Web Scraping
      • Web Scraping using Python
      • Build a Web Scraper
      • Web Scraping for beginner
      • Beautiful Soup
      • Scrape Websites
      • Python Web Scraping
        • Web Scraping Part 1
        • Web Scraping Part 2
        • Web Scraping Part 3
        • Web Scraping Part 4
      • Web Scraper
    • Frontend
      • Book Online with GitBook
      • Progressive Web App คืออะไร
      • self-host a Hugo web app
  • Examples
    • Django
      • Build a Portfolio App
      • SchoolManagement
    • Flask
      • Flask Stock Visualizer
      • Flask by Example
      • Building Flask Apps
      • Flask 101
    • OpenCV
      • Build a Celebrity Look-Alike
      • Face Detection-OpenCV
    • Python
      • Make Game FLASH CARD
      • Sending emails using Google
      • ตรวจหาภาพซ้ำด้วย Perceptual hashing
        • Sending Emails in Python
      • Deck of Cards
      • Extract Wikipedia Data
      • Convert Python File to EXE
      • Business Machine Learning
      • python-business-analytics
      • Simple Blackjack Game
      • Python Turtle Clock
      • Countdown
      • 3D Animation : Moon Phases
      • Defragmentation Algorithm
      • PDF File
        • จัดการข้อความ และรูป จากไฟล์ PDF ด้วย PDFBox
      • Reading and Generating QR codes
      • Generating Password
        • generate one-time password (OTP)
        • Random Password Generator
        • Generating Strong Password
      • PyQt: Building Calculator
      • List Files in a Directory
      • [Project] qID – โปรแกรมแต่งรูปง่ายๆ เพื่อการอัพลงเว็บ
      • Python and Google Docs to Build Books
      • Tools for Record Linking
      • Create Responsive HTML Email
      • psutil()
      • Transfer Learning for Deep Learning
      • ดึงข้อมูลคุณภาพอากาศประเทศไทย
        • Image Classification
    • Web Scraper
      • Scrape Wikipedia Articles
        • Untitled
      • How Scrape Websites with Python 3
    • Finance
      • Algorithmic Trading for Beginners
      • Parse TradingView Stock
      • Creating a stock price database with MariaDB and python
      • Source Code
        • stocks-list
      • Visualizing with D3
      • Real Time Stock in Excel using Python
      • Create Stock Quote Module
      • The Magic Formula Lost Its Sparkle?
      • Stock Market Analysis
      • Stock Portfolio Analyses Part 1
      • Stock Portfolio Analyses Part 2
      • Build A Dashboard In Python
      • Stock Market Predictions with LSTM
      • Trading example
      • Algorithmic Trading Strategies
      • DOWNLOAD FUNDAMENTALS DATA
      • Algorithmic Trading
      • numfin
      • Financial Machine Learning
      • Algorithm To Predict Stock Direction
      • Interactive Brokers API Code
      • The (Artificially) Intelligent Investor
      • Create Auto-Updating Excel of Stock Market
      • Stock Market Predictions
      • Automate Your Stock Portfolio
      • create an analytics dashboard
      • Bitcoin Price Notifications
      • Portfolio Management
    • WebApp
      • CSS
        • The Best CSS Examples
      • JavaScript
        • Memory Game
      • School Clock
      • Frontend Tutorials & Example
      • Side Menu Bar with sub-menu
      • Create Simple CPU Monitor App
      • Vue.js building a converter app
      • jQuery
        • The Best jQuery Examples
      • Image Slideshow
      • Handle Timezones
      • Text to Speech with Javascript
      • Building Blog for Your Portfolio
      • Responsive Website Layout
      • Maths Homework Generator
  • Books
    • Finance
      • Python for Finance (O'Reilly)
    • Website
      • Hugo
        • Go Bootcamp
        • Hugo in Action.
          • About this MEAP
          • Welcome
          • 1. The JAM stack with Hugo
          • 2. Live in 30 minutes
          • 3. Using Markup for content
          • 4. Content Management with Hugo
          • 5. Custom Pages and Customized Content
          • 6. Structuring web pages
          • A Appendix A.
          • B Appendix B.
          • C Appendix C.
    • Python
      • ภาษาไพธอนเบื้องต้น
      • Python Cheatsheet
        • Python Cheatsheet
      • Beginning Python
      • IPython Cookbook
      • The Quick Python Book
        • Case study
        • Part 1. Starting out
          • 1. About Python
          • 2. Getting started
          • 3. The Quick Python overview
        • Part 2. The essentials
          • 14. Exceptions
          • 13. Reading and writing files
          • 12. Using the filesystem
          • 11. Python programs
          • 10. Modules and scoping rules
          • 9. Functions
          • 8. Control flow
          • 4. The absolute basics
          • 5. Lists, tuples, and sets
          • 6. Strings
          • 7. Dictionaries
        • Part 3. Advanced language features
          • 19. Using Python libraries
          • 18. Packages
          • 17. Data types as objects
          • 16. Regular expressions
          • 15. Classes and OOP
        • Part 4. Working with data
          • Appendix B. Exercise answers
          • Appendix A. Python’s documentation
          • 24. Exploring data
          • 23. Saving data
          • 20. Basic file wrangling
          • 21. Processing data files
          • 22. Data over the network
      • The Hitchhiker’s Guide to Python
      • A Whirlwind Tour of Python
        • 9. Defining Functions
      • Automate the Boring Stuff
        • 4. Lists
        • 5. Dictionaries
        • 12. Web Scraping
        • 13. Excel
        • 14. Google Sheets
        • 15. PDF and Word
        • 16. CSV and JSON
    • IPython
    • Pandas
      • จัดการข้อมูลด้วย pandas เบื้องต้น
      • Pandas Tutorial
  • Link Center
    • Temp
  • เทควันโด
    • รวมเทคนิค
    • Help and Documentation
  • Image
    • Logistics
Powered by GitBook
On this page
  • Intensity Transformation
  • The Reason of Gamma Correction
  • Convolution
  • Thresholding
  • Line Detection
  • Hough Transform
  • Edge Detection
  • Process
  • Vectorization
  • Contour tracking
  • Image Compression
  • Stacked Autoencoder

Was this helpful?

  1. Articles
  2. Python
  3. Image Handling and Processing

Image Processing Part II

August 22, 2018

PreviousImage Processing Part INextImage tutorial

Last updated 5 years ago

Was this helpful?

Previously we’ve seen some of the very basic image analysis operations in Python. In this last part of basic image analysis, we’ll go through some of the following contents.

Following contents is the reflection of my completed academic image processing course in the previous term. So, I am not planning on putting anything into production sphere. Instead, the aim of this article is to try and realize the fundamentals of a few basic image processing techniques. For this reason, I am going to stick to using or mainly to perform most of the manipulations, although I will use other libraries now and then rather than using most wanted tools such as :

In the previous article, we’ve gone through some of the following basic operations. To keep pace with today’s content, continuous reading is highly appreciated.

Table of Contents

    • Thresholding

    • Object Detection

    • Contour tracking

    • Stacked Autoencoder

Intensity Transformation

Let’s begin with importing an image.

%matplotlib inline

import imageio
import matplotlib.pyplot as plt
import warnings
import matplotlib.cbook
warnings.filterwarnings("ignore",category=matplotlib.cbook.mplDeprecation)

pic = imageio.imread('img/parrot.jpg')

plt.figure(figsize = (6,6))
plt.imshow(pic);
plt.axis('off');

Image Negative

The intensity transformation function mathematically defined as:

s=T(r)

where r is the pixels of the input image and s is the pixels of the output image. T is a transformation function that maps each value of r to each value of s.

Negative transformation, which is the invert of identity transformation. In negative transformation, each value of the input image is subtracted from the L−1 and mapped onto the output image.

In this case, the following transition has been done.

s=(L–1)–r

So each value is subtracted by 255. So what happens is that the lighter pixels become dark and the darker picture becomes light. And it results in image negative.

negative = 255 - pic # neg = (L-1) - img

plt.figure(figsize = (6,6))
plt.imshow(negative);
plt.axis('off');

Log transformation

The log transformations can be defined by this formula

s=c∗log(r+1)

Where s and r are the pixel values of the output and the input image and c is a constant. The value 1 is added to each of the pixel value of the input image because if there is a pixel intensity of 0 in the image, then log(0) is equal to infinity. So 1 is added, to make the minimum value at least 1.

During log transformation, the dark pixels in an image are expanded as compared to the higher pixel values. The higher pixel values are kind of compressed in log transformation. This result in the following image enhancement.

The value of c in the log transform adjust the kind of enhancement we are looking for.

%matplotlib inline

import imageio
import numpy as np
import matplotlib.pyplot as plt

pic = imageio.imread('img/parrot.jpg')
gray = lambda rgb : np.dot(rgb[... , :3] , [0.299 , 0.587, 0.114]) 
gray = gray(pic)

'''
log transform
-> s = c*log(1+r)

So, we calculate constant c to estimate s
-> c = (L-1)/log(1+|I_max|)

'''

max_ = np.max(gray)

def log_transform():
    return (255/np.log(1+max_)) * np.log(1+gray)

plt.figure(figsize = (5,5))
plt.imshow(log_transform(), cmap = plt.get_cmap(name = 'gray'))
plt.axis('off');

Gamma Correction

Gamma correction, or often simply gamma, is a nonlinear operation used to encode and decode luminance or tristimulus values in video or still image systems. Gamma correction is also known as the Power Law Transform. First, our image pixel intensities must be scaled from the range 0, 255 to 0, 1.0. From there, we obtain our output gamma corrected image by applying the following equation:

Vo=V1Gi

Where Vi is our input image and G is our gamma value. The output image, Vo is then scaled back to the range 0-255.

A gamma value, G < 1 is sometimes called an encoding gamma, and the process of encoding with this compressive power-law nonlinearity is called gamma compression; Gamma values < 1 will shift the image towards the darker end of the spectrum.

Conversely, a gamma value G > 1 is called a decoding gamma and the application of the expansive power-law nonlinearity is called gamma expansion. Gamma values > 1 will make the image appear lighter. A gamma value of G = 1 will have no effect on the input image:

import imageio
import matplotlib.pyplot as plt

# Gamma encoding 
pic = imageio.imread('img/parrot.jpg')
gamma = 2.2 # Gamma < 1 ~ Dark  ;  Gamma > 1 ~ Bright

gamma_correction = ((pic/255) ** (1/gamma)) 
plt.figure(figsize = (5,5))
plt.imshow(gamma_correction)
plt.axis('off');

The Reason of Gamma Correction

The reason we apply gamma correction is that our eyes perceive color and luminance differently than the sensors in a digital camera. When a sensor on a digital camera picks up twice the amount of photons, the signal is doubled. However, our eyes do not work like this. Instead, our eyes perceive double the amount of light as only a fraction brighter. Thus, while a digital camera has a linear relationship between brightness our eyes have a non-linear relationship. In order to account for this relationship, we apply gamma correction.

There is some other linear transformation function. Listed below:

  • Contrast Stretching

  • Intensity-Level Slicing

  • Bit-Plane Slicing

Convolution

Now, this filter is also an array of numbers where the numbers are called weights or parameters. A very important note is that the depth of this filter has to be the same as the depth of the input, so the dimensions of this filter are 3 x 3 x 3.

Now, let’s take the filter to the top left corner. As the filter is sliding, or convolving, around the input image, it is multiplying the values in the filter with the original pixel values of the image (aka computing element-wise multiplications). These multiplications are all summed up. So now we have a single number. Remember, this number is just representative of when the filter is at the top left of the image. Now, we repeat this process for every location on the input volume. Next step would be moving the filter to the right by a stride or step 1 unit, then right again by stride 1, and so on. Every unique location on the input volume produces a number. We can also choose stride or the step size 2 or more, but we have to care whether it will fit or not on the input image.

After sliding the filter over all the locations, we will find out that, what we’re left with is a 30 x 30 x 1 array of numbers, which we call an activation map or feature map. The reason we get a 30 x 30 array is that there are 900 different locations that a 3 x 3 filter can fit on a 32 x 32 input image. These 900 numbers are mapped to a 30 x 30 array. We can calculate the convolved image by following:

Convolved:N−FS+1

where N and F represent Input image size and kernel size respectively and S represent stride or step size. So, in this case, the output would be

32−31+1=30

Let’s say we’ve got a following 3x3 filter, convolving on a 5x5 matrix and according to the equation we should get a 3x3 matrix, technically called activation map or feature map.⎛⎝⎜⎜⎜⎜⎜⎜3032230100212001322001321⎞⎠⎟⎟⎟⎟⎟⎟∗⎛⎝⎜020121202⎞⎠⎟=⎛⎝⎜1210912176171914⎞⎠⎟

let’s take a look visually,

Moreover, we practically use more filters instead of one. Then our output volume would be 28x28xn (where n is the number of activation map). By using more filters, we are able to preserve the spatial dimensions better.

Now, I do realize that some of these topics are quite complex and could be made in whole posts by themselves. In an effort to remain concise yet retain comprehensiveness, I will provide links to resources where the topic is explained in more detail.

Let’s first apply some custom uniform window to the image. This has the effect of burning the image, by averaging each pixel with those nearby

%%time
import numpy as np
import imageio
import matplotlib.pyplot as plt
from scipy.signal import convolve2d

def Convolution(image, kernel):
    conv_bucket = []
    for d in range(image.ndim):
        conv_channel = convolve2d(image[:,:,d], kernel, 
                               mode="same", boundary="symm")
        conv_bucket.append(conv_channel)
    return np.stack(conv_bucket, axis=2).astype("uint8")


kernel_sizes = [9,15,30,60]
fig, axs = plt.subplots(nrows = 1, ncols = len(kernel_sizes), figsize=(15,15));

pic = imageio.imread('img:/parrot.jpg')

for k, ax in zip(kernel_sizes, axs):
    kernel = np.ones((k,k))
    kernel /= np.sum(kernel)
    ax.imshow(Convolution(pic, kernel));
    ax.set_title("Convolved By Kernel: {}".format(k));
    ax.set_axis_off();

Following is an Outline Kernel. An outline kernel (aka “edge” kernel) is used to highlight large differences in pixel values. A pixel next to neighbor pixels with close to the same intensity will appear black in the new image while one next to neighbor pixels that differ strongly will appear white.Edge Kernel=⎛⎝⎜−1−1−1−18−1−1−1−1⎞⎠⎟

%%time
from skimage import color
from skimage import exposure
import numpy as np
import imageio
import matplotlib.pyplot as plt 

# import image
pic = imageio.imread('img/crazycat.jpeg')
plt.figure(figsize = (5,5))
plt.imshow(pic)
plt.axis('off');
# Convert the image to grayscale 
img = color.rgb2gray(pic)

# outline kernel - used for edge detection
kernel = np.array([[-1,-1,-1],
                   [-1,8,-1],
                   [-1,-1,-1]])

# we use 'valid' which means we do not add zero padding to our image
edges = convolve2d(img, kernel, mode = 'valid')


# Adjust the contrast of the filtered image by applying Histogram Equalization
edges_equalized = exposure.equalize_adapthist(edges/np.max(np.abs(edges)),
                                              clip_limit = 0.03)

# plot the edges_clipped
plt.figure(figsize = (5,5))
plt.imshow(edges_equalized, cmap='gray')    
plt.axis('off');

Let’s play around for a while with different types of other filters. Let’s choose with Sharpen Kernel. The Sharpen Kernel emphasizes differences in adjacent pixel values. This makes the image look more vivid.Sharpen Kernel=⎛⎝⎜0−10−15−10−10⎞⎠⎟

%%time

from skimage import color
from skimage import exposure
from scipy.signal import convolve2d
import numpy as np
import imageio
import matplotlib.pyplot as plt 

# Convert the image to grayscale
img = color.rgb2gray(pic)  

# apply sharpen filter to the original image
sharpen_kernel = np.array([[0,-1,0],
                           [-1,5,-1],
                           [0,-1,0]])
image_sharpen = convolve2d(img, sharpen_kernel, mode = 'valid')

# apply edge kernel to the output of the sharpen kernel
edge_kernel = np.array([[-1,-1,-1],
                        [-1,8,-1],
                        [-1,-1,-1]])
edges = convolve2d(image_sharpen, edge_kernel, mode = 'valid')

# apply normalize box blur filter to the edge detection filtered image
blur_kernel = np.array([[1,1,1],
                        [1,1,1],
                        [1,1,1]])/9.0;

denoised = convolve2d(edges, blur_kernel, mode = 'valid')

# Adjust the contrast of the filtered image by applying Histogram Equalization
denoised_equalized = exposure.equalize_adapthist(denoised/np.max(np.abs(denoised)),
                                                 clip_limit=0.03)

plt.figure(figsize = (5,5))
plt.imshow(denoised_equalized, cmap='gray')    
plt.axis('off')
plt.show()

For blurring an image, there is a whole host of different windows and functions that can be used. The one of the most common is the Gaussian window. To get a feel what it is doing to an image, let’s apply this filters to the image.Gaussianblur Kernel=116⎛⎝⎜121242121⎞⎠⎟

%%time

from skimage import color
from skimage import exposure
from scipy.signal import convolve2d
import numpy as np
import imageio
import matplotlib.pyplot as plt 

# import image
pic = imageio.imread('img/parrot.jpg')

# Convert the image to grayscale 
img = color.rgb2gray(pic)

# gaussian kernel - used for blurring
kernel = np.array([[1,2,1],
                   [2,4,2],
                   [1,2,1]])
kernel = kernel / np.sum(kernel)

# we use 'valid' which means we do not add zero padding to our image
edges = convolve2d(img, kernel, mode = 'valid')


# Adjust the contrast of the filtered image by applying Histogram Equalization
edges_equalized = exposure.equalize_adapthist(edges/np.max(np.abs(edges)),
                                              clip_limit = 0.03)

# plot the edges_clipped
plt.figure(figsize = (5,5))
plt.imshow(edges_equalized, cmap='gray')    
plt.axis('off')
plt.show()

By finding the gradient in both the X and Y directions, and then taking the magnitude of these values we get a map of the gradients in an image for each colour

%%time

from skimage import color
from skimage import exposure
from scipy.signal import convolve2d
import numpy as np
import imageio
import matplotlib.pyplot as plt 

# import image
pic = imageio.imread('img/parrot.jpg')

# right sobel
sobel_x = np.c_[
    [-1,0,1],
    [-2,0,2],
    [-1,0,1]
]

# top sobel
sobel_y = np.c_[
    [1,2,1],
    [0,0,0],
    [-1,-2,-1]
]

ims = []
for i in range(3):
    sx = convolve2d(pic[:,:,i], sobel_x, mode="same", boundary="symm")
    sy = convolve2d(pic[:,:,i], sobel_y, mode="same", boundary="symm")
    ims.append(np.sqrt(sx*sx + sy*sy))

img_conv = np.stack(ims, axis=2).astype("uint8")

plt.figure(figsize = (6,5))
plt.axis('off')
plt.imshow(img_conv);

To reduce noise. we generally use a filter like, Gaussian Filter which is a digital filtering technique which is often used to remove noise from an image. Here, by combining Gaussian filtering and gradient finding operations together, we can generate some strange patterns that resemble the original image and being distorted in interesting ways.

%%time

from scipy.signal import convolve2d
from scipy.ndimage import (median_filter, gaussian_filter)
import numpy as np
import imageio
import matplotlib.pyplot as plt 

def gaussain_filter_(img):
    """
    Applies a median filer to all channels
    """
    ims = []
    for d in range(3):
        img_conv_d = gaussian_filter(img[:,:,d], sigma = 4)
        ims.append(img_conv_d)
        
    return np.stack(ims, axis=2).astype("uint8")

filtered_img = gaussain_filter_(pic)

# right sobel
sobel_x = np.c_[
    [-1,0,1],
    [-2,0,2],
    [-1,0,1]
]

# top sobel
sobel_y = np.c_[
    [1,2,1],
    [0,0,0],
    [-1,-2,-1]
]

ims = []
for d in range(3):
    sx = convolve2d(filtered_img[:,:,d], sobel_x, mode="same", boundary="symm")
    sy = convolve2d(filtered_img[:,:,d], sobel_y, mode="same", boundary="symm")
    ims.append(np.sqrt(sx*sx + sy*sy))

img_conv = np.stack(ims, axis=2).astype("uint8")

plt.figure(figsize=(7,7))
plt.axis('off')
plt.imshow(img_conv);

Now, let’s see using a Median filter to see what sort of effect it can make on the image.

%%time
from scipy.signal import convolve2d
from scipy.ndimage import (median_filter, gaussian_filter)
import numpy as np
import imageio
import matplotlib.pyplot as plt 

def median_filter_(img, mask):
    """
    Applies a median filer to all channels
    """
    ims = []
    for d in range(3):
        img_conv_d = median_filter(img[:,:,d], size=(mask,mask))
        ims.append(img_conv_d)
        
    return np.stack(ims, axis=2).astype("uint8")

filtered_img = median_filter_(pic, 80)

# right sobel
sobel_x = np.c_[
    [-1,0,1],
    [-2,0,2],
    [-1,0,1]
]

# top sobel
sobel_y = np.c_[
    [1,2,1],
    [0,0,0],
    [-1,-2,-1]
]

ims = []
for d in range(3):
    sx = convolve2d(filtered_img[:,:,d], sobel_x, mode="same", boundary="symm")
    sy = convolve2d(filtered_img[:,:,d], sobel_y, mode="same", boundary="symm")
    ims.append(np.sqrt(sx*sx + sy*sy))

img_conv = np.stack(ims, axis=2).astype("uint8")

plt.figure(figsize=(7,7))
plt.axis('off')
plt.imshow(img_conv);

Thresholding

Ostu’s Method

Thresholding is a very basic operation in image processing. Converting a greyscale image to monochrome is a common image processing task. And, a good algorithm always begins with a good basis!

Otsu thresholding is a simple yet effective global automatic thresholding method for binarizing grayscale images such as foregrounds and backgrounds. In image processing, Otsu’s thresholding method (1979) is used for automatic binarization level decision, based on the shape of the histogram. It is based entirely on computation performed on the histogram of an image.

The algorithm assumes that the image is composed of two basic classes: Foreground and Background. It then computes an optimal threshold value that minimizes the weighted within class variances of these two classes.

Otsu threshold is used in many applications from medical imaging to low-level computer vision. It’s many advantages and assumptions.

Mathematical Formulation

In Otsu’s method we comprehensively search for the threshold that minimizes the intra-class variance - the variance within the class which defined as a weighted sum of variances of the two classes:σ2w(t)=w0(t)σ20(t)+w1(t)σ21(t)

Weights W0 adn W1 are the probabilities of the two classes separated by a threshold t , and σ20(t) and σ21(t) are variances of these two classes. Mathematically probabilities of the two classes are defined asw0(t)w1(t)=∑i=0t−1p(i)=∑i=tL−1p(i)

Now, the Otsu’s method involves the iterative fashion across all the possible threshold values and measuring the spread for the pixel levels for each side of the threshold or the pixels that either fall in background or foreground regions. The goal is to find the threshold value where the sum of the bacground and foreground spreads is at its minimum.

Let’s demonstrate the process using the a simple 6:6 gray level image. The histogram for the image is drawn below. For simplification we choose only 6 levels grayscale.

Now, let’s calculate for finding variacne which is the measure of spread for a single threshold. Let’s assume, our threshold value is 3.

Back, μ0(t)Fore, μ1(t)=∑t−1i=0ip(i)w0(t)=∑L−1i=tip(i)w1(t)

Background

Weight, W0Mean, μ0Variance σ20=10+8+636=0.67=[(0∗10)+(1∗8)+(2∗6)]24=0.83=[(0−0.83)2∗10+(1−0.83)2∗8+(2−0.83)2∗6]24=6.89+0.23+8.2124=0.64

Foreground

Weight, W1Mean, μ1Variance σ21=8+4+236=0.39=[(3x8)+(4x4)+(5x2)]14=3.57=[(3−3.57)2∗8+(4−3.57)2∗4+(5−3.57)2∗2]14=0.325∗8+0.185∗4+2.05∗214=0.53

The next step is to calculate the Within-Class Variance. This is simply the sum of the two variances multiplied by their associated weights.σ2w(t)=w0(t)σ20(t)+w1(t)σ21(t)=0.67∗0.64+0.39∗0.53=0.64

This value is the sum of weighted of intra-class variance for the threshold value 3.

Otsu shows that minimizing the intra-class variance is the same as maximizing inter-class variance. Inter calss variance is mathematically defined as:σ2b(t)=σ2(t)−σ2w(t)=w0(μ0−μT)2+w1(μ1−μT)2=w0(t)w1(t)[μ0(t)−μ1(t)]2

As previously we randomly choose threshold value 3, let’s calculate inter-class variance for this threshold value.σ2b(t)=0.67∗0.39∗[0.83−3.57]2=1.96

This same calculation needs to be performed for all the possible threshold values 0 to 5, which is 0 to 255 for real gray level practical image.Within Class Variance: σ2w(t)Between Class Variance: σ2b(t)=w0(t)σ20(t)+w1(t)σ21(t)=w0(t)w1(t)[μ0(t)−μ1(t)]2

Algorithm

If we incorporate a little math into that simple step-wise algorithm, such an explanation evolves:

  • Compute histogram and probabilities of each intensity level.

  • Set up initial wi and μi.

  • Step through from threshold t = 0 to t = L-1:

    • update: wi and μi

    • compute: σ2b(t)

  • The Desired threshold corresponds to the maximum value of σ2b(t).

import numpy as np
import imageio
import matplotlib.pyplot as plt

pic = imageio.imread('img/potato.jpeg')
plt.figure(figsize=(7,7))
plt.axis('off')
plt.imshow(pic);
def otsu_threshold(im):

    # Compute histogram and probabilities of each intensity level
    pixel_counts = [np.sum(im == i) for i in range(256)]

    # Initialization
    s_max = (0,0)
    
    for threshold in range(256):

        # update
        w_0 = sum(pixel_counts[:threshold])
        w_1 = sum(pixel_counts[threshold:])

        mu_0 = sum([i * pixel_counts[i] for i in range(0,threshold)]) / w_0 if w_0 > 0 else 0       
        mu_1 = sum([i * pixel_counts[i] for i in range(threshold, 256)]) / w_1 if w_1 > 0 else 0

        # calculate - inter class variance
        s = w_0 * w_1 * (mu_0 - mu_1) ** 2

        if s > s_max[1]:
            s_max = (threshold, s)
            
            
    return s_max[0]
def threshold(pic, threshold):
    return ((pic > threshold) * 255).astype('uint8')

gray = lambda rgb : np.dot(rgb[... , :3] , [0.21 , 0.72, 0.07]) 

plt.figure(figsize=(7,7))
plt.imshow(threshold(gray(pic), otsu_threshold(pic)), cmap='Greys')
plt.axis('off');

Nice but not Great. Otsu’s method exhibits the relatively good performance if the histogram can be assumed to have bimodal distribution and assumed to possess a deep and sharp valley between two peaks.

So, now if the object area is small compared with the background area, the histogram no longer exhibits bimodality And if the variances of the object and the background intensities are large compared to the mean difference, or the image is severely corrupted by additive noise, the sharp valley of the gray level histogram is degraded.

KMeans Clustering

from sklearn import cluster
import matplotlib.pyplot as plt

# load image
pic = imageio.imread('img/purple.jpg') 

plt.figure(figsize=(7,7))
plt.imshow(pic)
plt.axis('off');

For clustering the image, we need to convert it into a two-dimensional array.

x, y, z = pic.shape
pic_2d = pic.reshape(x*y, z)

The clustering number 5 was chosen heuristically for this demonstration. One can change the number of clusters to visually validate image with different colors and decide that closely matches the required number of clusters.

%%time

# fit on the image with cluster five
kmeans_cluster = cluster.KMeans(n_clusters=5)
kmeans_cluster.fit(pic_2d)

cluster_centers = kmeans_cluster.cluster_centers_
cluster_labels = kmeans_cluster.labels_

Once the clusters are formed, we can recreate the image with the cluster centers and labels to display the image with grouped patterns.

plt.figure(figsize=(7,7))
plt.imshow(cluster_centers[cluster_labels].reshape(x, y, z))
plt.axis('off');

Line Detection

Hough Transform

Hough Transform is a popular technique to detect any shape if we can represent that shape in mathematical form. It can detect the shape even if it is broken or distorted a little bit. We won’t go too deeper to analyze the mechanism of Hough transform rather than giving intuitive mathematical description before implementing it on code and also provide some resource to understand it more in details.

Mathematical Formulation:

We all know that a line in Cartesian form has the equation:

y=mx+c

where    
m = gradient or slope of the line  
c = y-intercept

We re-arrange the above equation:

c=−xm+y

Let’s consider a linear equation we’ve plotted in x−y plane or the Image Space. This line has a lot of combinations of values of x and y . Now, instead of plane x−y, let’s consider plane m−c , here which is called feature space or parameter space. In this feature space, point A and B, define a lot of possibilities. Now, the intersection point between two line in feature space is the actual values of m and c, for which the linear equation which plotted in image space is fulfilled.

We use re-arrange linear equation on feature space and find the all possible paired of c,m for various x,y. All points on a line in image space intersect at a common point in parameter space. This common point (m,c) represents the line in image space.

In other words, all points on a line in image space intersect at a common point in parameter space. This common point (m, b) represents the line in image space.

But, the slope, m is undefined when the line is vertical. To overcome this problem we use polar coordinatee system. A line in a Polar coordinate system has the equation:

ρ=xcosθ+ysinθ

where
ρ = distance from origin to the line. [-Dmax, Dmax]
Dmax is the diagonal length of the image.

θ = angle from origin to the line. [-90° to 90°]

Any line can be represented in these two terms, (ρ,θ). So first it creates a 2D array or accumulator (to hold values of two parameters) and it is set to 0 initially.

Let rows denote the ρ and columns denote the θ. Size of the array depends on the accuracy we need. Suppose we want the accuracy of angles to be 1 degree, we need 180 columns. For ρ, the maximum distance possible is the diagonal length of the image. So taking one-pixel accuracy, a number of rows can be a diagonal length of the image.

So the problem of detecting a line in an image becomes a problem of detecting a point in Hough space. Once we detect the points in Hough space, we can do an inverse transform to get the corresponding line in image space.

Algorithm

  • Corner or edge detection

  • ρ range and θ range creation

    • ρ : -Dmax to Dmax

    • θ : -90 to 90

  • Hough accumulator

    • 2D array with the number of rows equal to the number of ρ values and the number of columns equal to the number of θ values.

  • Voting in the accumulator

    • For each edge point and for each θ value, find the nearest ρ value and increment that index in the accumulator.

  • Peak finding

    • Local maxima in the accumulator indicate the parameters of the most prominent lines in the input image.

def hough_line(img):
    # Rho and Theta ranges
    thetas = np.deg2rad(np.arange(-90.0, 90.0))
    width, height = img.shape
    diag_len = int(np.ceil(np.sqrt(width * width + height * height)))   # Dmax
    rhos = np.linspace(-diag_len, diag_len, diag_len * 2.0)

    # Cache some resuable values
    cos_t = np.cos(thetas)
    sin_t = np.sin(thetas)
    num_thetas = len(thetas)

    # Hough accumulator array of theta vs rho
    accumulator = np.zeros((2 * diag_len, num_thetas), dtype=np.uint64)
    y_idxs, x_idxs = np.nonzero(img)  # (row, col) indexes to edges

    # Vote in the hough accumulator
    for i in range(len(x_idxs)):
        x = x_idxs[i]
        y = y_idxs[i]
        
        for t_idx in range(num_thetas):
            # Calculate rho. diag_len is added for a positive index
            rho = round(x * cos_t[t_idx] + y * sin_t[t_idx]) + diag_len
            accumulator[rho, t_idx] += 1
    return accumulator, thetas, rhos

Edge Detection

Edge detection is an image processing technique for finding the boundaries of objects within images. It works by detecting discontinuities in brightness. Common edge detection algorithms include

  • Sobel

  • Canny

  • Prewitt

  • Roberts and

  • fuzzy logic methods.

Here, We’ll cover one of the most popular methods, which is the Canny Edge Detection.

A multi-stage edge detection operation capable of detecting a wide range of edges in images. Now, the Process of Canny edge detection algorithm can be broken down into 5 different steps:

  1. Apply Gaussian Filter

  2. Find the intensity gradients

  3. Apply non-maximum suppression

  4. Apply double threshold

  5. Track edge by hysteresis.

Process

  • First, apply the Gaussian filter: to smooth the image in order to remove the noise. As the edge detection is susceptible to noise in the image, so we have to remove the noise in the image.

∇f=(gxgy)=∣∣∣∣∂f∂x∂f∂y∣∣∣∣

where:

∂f∂x = is the derivative with respect to x (gradient in the x direction)

∂f∂y = is the derivative with respect to y (gradient in the y direction).

And the magnitude and the gradient direction is given by:G=G2x+G2y−−−−−−−√ (θ)=tan−1(GyGx)

For demonstrate purpose, let’s consider the following images:

On the left, an intensity image of a cat. In the center, a gradient image in the x-direction measuring a horizontal change in intensity. On the right, a gradient image in the y-direction measuring a vertical change in intensity.

  • Apply non-maximum suppression: to get rid of spurious response to edge detection. So, after getting gradient magnitude and direction, a full scan of an image is done to remove any unwanted pixels which may not constitute the edge. For this, at every pixel, a pixel is checked if it is a local maximum in its neighborhood in the direction of the gradient.

  • Apply double threshold: to determine potential edges.

  • Track edge by hysteresis: Finalize the detection of edges by suppressing all the other edges that are weak and not connected to strong edges. For this, we need two threshold values, minVal and maxVal. Any edges with intensity gradient more than maxVal are sure to be edges and those below minVal are sure to be non-edges, so discarded. Those who lie between these two thresholds are classified as edges or non-edges based on their connectivity.

Vectorization

Contour tracking

We can use a contour tracing algorithm to Scikit-Image to extract the paths around the object. This controls how accurately the path follows the original bitmap shape.

from sklearn.cluster import KMeans
import matplotlib.pyplot as plt
from skimage import measure
import numpy as np
import imageio

pic = imageio.imread('img/parrot.jpg')

h,w = pic.shape[:2]

im_small_long = pic.reshape((h * w, 3))
im_small_wide = im_small_long.reshape((h,w,3))

km = KMeans(n_clusters=2)
km.fit(im_small_long)

seg = np.asarray([(1 if i == 1 else 0)
                  for i in km.labels_]).reshape((h,w))

contours = measure.find_contours(seg, 0.5, fully_connected="high")
simplified_contours = [measure.approximate_polygon(c, tolerance=5) 
                       for c in contours]

plt.figure(figsize=(5,10))
for n, contour in enumerate(simplified_contours):
    plt.plot(contour[:, 1], contour[:, 0], linewidth=2)
    
    
plt.ylim(h,0)
plt.axes().set_aspect('equal')

Image Compression

Stacked Autoencoder

We like to conclude with a brief overview of Autoencoder. It’s a data compression algorithm where the compression and decompression functions are

  • Data-specific,

  • Lossy, and

  • Learned automatically from examples rather than engineered by a human.

And also some following stuff, in case if someone is interested too.

Now, I do realize that these topics are quite complex and lots of stuff can take concern and can be made in whole posts by each of them. In an effort to remain concise and complex free, I will implement the code but will skip explaining in details that what makes it happens and just showing the outcome of it.

import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf

from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data",one_hot=True)

# Parameter
neurons_hid3    = neurons_hid1 # Decoder Begins
num_inputs      = 784 # 28*28
num_outputs     = num_inputs
learning_rate   = 0.01
neurons_hid1    = 392
neurons_hid2    = 196
num_epochs      = 5
batch_size      = 150
num_test_images = 10 # Test Autoencoder output on Test Data

# activation function
actf = tf.nn.relu

# place holder
X = tf.placeholder(tf.float32, shape=[None, num_inputs])

# Weights
initializer = tf.variance_scaling_initializer()
w1 = tf.Variable(initializer([num_inputs, neurons_hid1]), dtype=tf.float32)
w2 = tf.Variable(initializer([neurons_hid1, neurons_hid2]), dtype=tf.float32)
w3 = tf.Variable(initializer([neurons_hid2, neurons_hid3]), dtype=tf.float32)
w4 = tf.Variable(initializer([neurons_hid3, num_outputs]), dtype=tf.float32)

# Biases
b1 = tf.Variable(tf.zeros(neurons_hid1))
b2 = tf.Variable(tf.zeros(neurons_hid2))
b3 = tf.Variable(tf.zeros(neurons_hid3))
b4 = tf.Variable(tf.zeros(num_outputs))

# Activation Function and Layers
act_func = tf.nn.relu

hid_layer1   = act_func(tf.matmul(X, w1) + b1)
hid_layer2   = act_func(tf.matmul(hid_layer1, w2) + b2)
hid_layer3   = act_func(tf.matmul(hid_layer2, w3) + b3)
output_layer = tf.matmul(hid_layer3, w4) + b4

# Loss Function
loss = tf.reduce_mean(tf.square(output_layer - X))

# Optimizer
optimizer = tf.train.AdamOptimizer(learning_rate)
train     = optimizer.minimize(loss)

# Intialize Variables
init  = tf.global_variables_initializer()
saver = tf.train.Saver() 

with tf.Session() as sess:
    sess.run(init)
    
    # Epoch == Entire Training Set
    for epoch in range(num_epochs):
        num_batches = mnist.train.num_examples // batch_size
        
        # 150 batch size
        for iteration in range(num_batches):
            X_batch, y_batch = mnist.train.next_batch(batch_size)
            sess.run(train, feed_dict={X: X_batch})
            
        training_loss = loss.eval(feed_dict={X: X_batch})   
        print("Epoch {} Complete Training Loss: {}".format(epoch,training_loss))
    saver.save(sess, "./stacked_autoencoder.ckpt")       
    
with tf.Session() as sess:
    saver.restore(sess,"./stacked_autoencoder.ckpt")
    results = output_layer.eval(feed_dict={X:mnist.test.images[:num_test_images]})

Training phase; loss decreases with epochs.

Epoch (0,1,2,3,4) Complete Training Loss: (0.02334, 0.02253, 0.02003, 0.02132, 0.01938)

Let’s visualize some outcomes.

# Compare original images with their reconstructions
f, a = plt.subplots(2, 10, figsize=(20, 4))
for i in range(num_test_images):
    a[0][i].imshow(np.reshape(mnist.test.images[i], (28, 28)))
    a[1][i].imshow(np.reshape(results[i], (28, 28)))

I’m so excited, let’s begin.

png
png

Resources:

-

png

Resources:

-

png

Resources:

- - -

We’ve discussed briefly in our previous is that, when a computer sees an image, it sees an array of pixel values. Now, Depending on the resolution and size of the image, it will see a 32 x 32 x 3 array of numbers where the 3 refers to RGB values or channels. Just to drive home the point, let’s say we have a color image in PNG form and its size is 480 x 480. The representative array will be 480 x 480 x 3. Each of these numbers is given a value from 0 to 255 which describes the pixel intensity at that point.

Like we mentioned before, the input is a 32 x 32 x 3 array of pixel values. Now, the best way to explain a convolution is to imagine a flashlight that is shining over the top left of the image. Let’s say that the flashlight shines covers a 3 x 3 area. And now, let’s imagine this flashlight sliding across all the areas of the input image. In machine learning terms, this flashlight is called a filter or or sometimes referred to as weights or mask and the region that it is shining over is called the .

An image kernel or filter is a small matrix used to apply effects like the ones we might find in Photoshop or Gimp, such as blurring, sharpening, outlining or embossing. They’re also used in machine learning for feature extraction, a technique for determining the most important portions of an image. For more, have a look at Gimp’s excellent documentation on using . We can find a list of most common kernels .

png
png

However, For the pixels on the border of the image matrix, some elements of the kernel might stand out of the image matrix and therefore does not have any corresponding element from the image matrix. In this case, we can eliminate the convolution operation for these positions which end up an output matrix smaller than the input or we can apply to the input matrix.

png

Depending on the element values, a kernel can cause a wide range of effects. this site to visualize the output of various kernel. In this article, we’ll go through few of them.

png
png

Let’s apply the edge detection kernel to the output of the sharpen kernel and also further with box blur filter.

png
png

By using more exotic windows, was can extract different kinds of information. The are used to show only the differences in adjacent pixel values in a particular direction. It tries to approximate the along one direction using kernel functions of the form following.Right Sobel Kernel=⎛⎝⎜−1−2−1000121⎞⎠⎟ Left Sobel Kernel=⎛⎝⎜121000−1−2−1⎞⎠⎟Top Sobel Kernel=⎛⎝⎜10−120−210−1⎞⎠⎟ Bottom Sobel Kernel=⎛⎝⎜−101−202−121⎞⎠⎟

png
png
png

Resources:

- - - - - - - - -

png
png
png
png
png

As a result, the possibly incorrect threshold determined by Otsu’s method results in the segmentation error. But we can further .

k-means clustering is a method of , originally from signal processing, that is popular for in .

In Otsu thresholding, we found the threshold which minimized the intra-segment pixel variance. So, rather than looking for a threshold from a gray level image, we can look for clusters in color space, and by doing so we end up with the technique.

png

Next, we use method to create clusters. We pass n_clusters as 5 to form five clusters. The clusters appear in the resulting image, dividing it into five parts with distinct colors.

png

Resources:

- - -

png

I’ve provided full implementation Code of Hough Transform in this .

Resources:

- - -

Let’s understand each of them intuitively. For a more comprehensive overview, please check the given link at the end of this article. However, this article is already becoming too big, so we decide not to provide the full implementation of code here rather than giving an intuitive overview of an algorithm of that code. But one can skip and jump to the for the code :

Find the : of the image. The smoothened image is then filtered with a Sobel kernel in both the horizontal and vertical direction to get first derivative in horizontal direction (Gx) and vertical direction (Gy). Gradient direction is always perpendicular to edges. It is rounded to one of four angles representing vertical, horizontal and two diagonal directions. From these two images, we can find edge gradient and direction for each pixel as follows:

(image courtesy to wiki)

Resources:

- - -

png
png

As it’s data specific and lossy, it’s not good for image compression in general. The fact that autoencoders are data-specific which makes them generally impractical for real-world data compression problems. But there’s a hope, future advances might change this. I find it interesting, though it’s not good enough and also very poor performance compared to another compression algorithm like JPEG, MPEG etc. this keras blog post regarding on this issue.

- -

png

Here, in the first row which is the loaded training set and the second row is the reconstructed those training set after encoding and decoding using autoencoder. Looks nice, but not great, there’s a lot of information missing in the reconstructed images. So, autoencoder is not as good as other compression technique but as a part of fast growing promising technology, future advances might change this, who knows.

At the ends of our 2 part series on Basic Image-Processing in Python, hope everyone was able to follow along, and if you feel that I have done something important mistake, please let me know in the comments!

Source Code:

Reference :

Importing images and observe it’s properties
Splitting the layers
Greyscale
Using Logical Operator on pixel values
Masking using Logical Operator
Satellite Image Data Analysis
Intensity Transformation
Convolution
Segmentation
Vectorization
Image Compression
1
2
1
2
1
2
3
4
article
kernel
receptive field
Image kernel’s
here
padding
Check out
normalizing
Sobel kernels
gradients of the image
1
2
3
4
5
6
7
8
9
10
improve Otsu’s method
vector quantization
cluster analysis
data mining
K-means clustering
scikit-learn’s cluster
1
2
3
4
repo
1
2
3
4
Canny Edge Detection
intensity gradients
1
2
3
4
Check out
1
2
3
MNIST
GitHub
Sources
https://innat.github.io/innat.github.io/Image-Processing-in-Python-Part-2/
imageio
numpy
OpenCV
repo
:smirk:
:books:
:books:
:books:
:seedling:
:books:
:books:
:books:
:books:
:thumbsup:
:relaxed: