Digital image processing
Algorithmic processing of digitally represented images / From Wikipedia, the free encyclopedia
Digital image processing is the use of a digital computer to process digital images through an algorithm.^{[1]}^{[2]} As a subcategory or field of digital signal processing, digital image processing has many advantages over analog image processing. It allows a much wider range of algorithms to be applied to the input data and can avoid problems such as the buildup of noise and distortion during processing. Since images are defined over two dimensions (perhaps more), digital image processing may be modeled in the form of multidimensional systems. The generation and development of digital image processing are mainly affected by three factors: first, the development of computers; second, the development of mathematics (especially the creation and improvement of discrete mathematics theory); third, the increasing demand for a wide range of applications in environment, agriculture, military, industry and medical science.
Many of the techniques of digital image processing, or digital picture processing as it often was called, were developed in the 1960s at Bell Laboratories, the Jet Propulsion Laboratory, Massachusetts Institute of Technology, University of Maryland, and a few other research facilities, with application to satellite imagery, wirephoto standards conversion, medical imaging, videophone, character recognition, and photograph enhancement.^{[3]} The purpose of early image processing was to improve the quality of the image for human viewers. In this setting, the input is a low-quality image, and the output is an image with improved quality. Common image processing tasks include image enhancement, restoration, encoding, and compression. The first successful application was at the American Jet Propulsion Laboratory (JPL), which applied techniques such as geometric correction, gradation transformation, and noise removal to the thousands of lunar photos sent back by the space probe Ranger 7 in 1964, taking into account the position of the Sun and the environment of the Moon. The computer-based mapping of the Moon's surface was a success. Later, more complex image processing was performed on the nearly 100,000 photos sent back by the spacecraft, yielding a topographic map, a color map and a panoramic mosaic of the Moon. These results laid a solid foundation for human landing on the Moon.^{[4]}
The cost of processing was fairly high, however, with the computing equipment of that era. That changed in the 1970s, when digital image processing proliferated as cheaper computers and dedicated hardware became available. This led to images being processed in real time for some dedicated problems such as television standards conversion. As general-purpose computers became faster, they started to take over the role of dedicated hardware for all but the most specialized and computer-intensive operations. With the fast computers and signal processors available in the 2000s, digital image processing has become the most common form of image processing, and is generally used because it is not only the most versatile method, but also the cheapest.
Image sensors
The basis for modern image sensors is metal–oxide–semiconductor (MOS) technology,^{[5]} which originates from the invention of the MOSFET (MOS field-effect transistor) by Mohamed M. Atalla and Dawon Kahng at Bell Labs in 1959.^{[6]} This led to the development of digital semiconductor image sensors, including the charge-coupled device (CCD) and later the CMOS sensor.^{[5]}
The charge-coupled device was invented by Willard S. Boyle and George E. Smith at Bell Labs in 1969.^{[7]} While researching MOS technology, they realized that an electric charge was analogous to the magnetic bubble and that it could be stored on a tiny MOS capacitor. As it was fairly straightforward to fabricate a series of MOS capacitors in a row, they connected a suitable voltage to them so that the charge could be stepped along from one to the next.^{[5]} The CCD is a semiconductor circuit that was later used in the first digital video cameras for television broadcasting.^{[8]}
The NMOS active-pixel sensor (APS) was invented by Olympus in Japan during the mid-1980s. This was enabled by advances in MOS semiconductor device fabrication, with MOSFET scaling reaching smaller micron and then sub-micron levels.^{[9]}^{[10]} The NMOS APS was fabricated by Tsutomu Nakamura's team at Olympus in 1985.^{[11]} The CMOS active-pixel sensor (CMOS sensor) was later developed by Eric Fossum's team at the NASA Jet Propulsion Laboratory in 1993.^{[12]} By 2007, sales of CMOS sensors had surpassed CCD sensors.^{[13]}
MOS image sensors are widely used in optical mouse technology. The first optical mouse, invented by Richard F. Lyon at Xerox in 1980, used a 5 µm NMOS integrated circuit sensor chip.^{[14]}^{[15]} Since the first commercial optical mouse, the IntelliMouse introduced in 1999, most optical mouse devices use CMOS sensors.^{[16]}^{[17]}
Image compression
An important development in digital image compression technology was the discrete cosine transform (DCT), a lossy compression technique first proposed by Nasir Ahmed in 1972.^{[18]} DCT compression became the basis for JPEG, which was introduced by the Joint Photographic Experts Group in 1992.^{[19]} JPEG compresses images down to much smaller file sizes, and has become the most widely used image file format on the Internet.^{[20]} Its highly efficient DCT compression algorithm was largely responsible for the wide proliferation of digital images and digital photos,^{[21]} with several billion JPEG images produced every day as of 2015.^{[22]}
Medical imaging techniques produce very large amounts of data, especially from CT, MRI and PET modalities. As a result, storage and communications of electronic image data are prohibitive without the use of compression.^{[23]}^{[24]} JPEG 2000 image compression is used by the DICOM standard for storage and transmission of medical images. The cost and feasibility of accessing large image data sets over low or various bandwidths are further addressed by use of another DICOM standard, called JPIP, to enable efficient streaming of the JPEG 2000 compressed image data.^{[25]}
Digital signal processor (DSP)
Electronic signal processing was revolutionized by the wide adoption of MOS technology in the 1970s.^{[26]} MOS integrated circuit technology was the basis for the first single-chip microprocessors and microcontrollers in the early 1970s,^{[27]} and then the first single-chip digital signal processor (DSP) chips in the late 1970s.^{[28]}^{[29]} DSP chips have since been widely used in digital image processing.^{[28]}
The discrete cosine transform (DCT) image compression algorithm has been widely implemented in DSP chips, with many companies developing DSP chips based on DCT technology. DCTs are widely used for encoding, decoding, video coding, audio coding, multiplexing, control signals, signaling, analog-to-digital conversion, formatting luminance and color differences, and color formats such as YUV444 and YUV411. DCTs are also used for encoding operations such as motion estimation, motion compensation, inter-frame prediction, quantization, perceptual weighting, entropy encoding, variable encoding, and motion vectors, and decoding operations such as the inverse operation between different color formats (YIQ, YUV and RGB) for display purposes. DCTs are also commonly used for high-definition television (HDTV) encoder/decoder chips.^{[30]}
Medical imaging
In 1972, Godfrey Hounsfield, an engineer at the British company EMI, invented the X-ray computed tomography device for head diagnosis, which is what is usually called CT (computed tomography). The core of the CT method is to take projections of a section of the human head and process them by computer to reconstruct the cross-sectional image, which is called image reconstruction. In 1975, EMI successfully developed a CT device for the whole body, which obtained clear tomographic images of various parts of the human body. In 1979, this diagnostic technique won the Nobel Prize.^{[4]} Digital image processing technology for medical applications was inducted into the Space Foundation Space Technology Hall of Fame in 1994.^{[31]}
As of 2010, 5 billion medical imaging studies had been conducted worldwide.^{[32]}^{[33]} Radiation exposure from medical imaging in 2006 made up about 50% of total ionizing radiation exposure in the United States.^{[34]} Medical imaging equipment is manufactured using technology from the semiconductor industry, including CMOS integrated circuit chips, power semiconductor devices, sensors such as image sensors (particularly CMOS sensors) and biosensors, and processors such as microcontrollers, microprocessors, digital signal processors, media processors and system-on-chip devices. As of 2015, annual shipments of medical imaging chips amounted to 46 million units and $1.1 billion.^{[35]}^{[36]}
Digital image processing allows the use of much more complex algorithms, and hence, can offer both more sophisticated performance at simple tasks, and the implementation of methods which would be impossible by analogue means.
In particular, digital image processing is a concrete application of, and a practical technology based on:
Some techniques which are used in digital image processing include:
Filtering
Digital filters are used to blur and sharpen digital images. Filtering can be performed by:
 convolution with specifically designed kernels (filter array) in the spatial domain^{[37]}
 masking specific frequency regions in the frequency (Fourier) domain
The following examples show both methods:^{[38]}
Filter type  Kernel or mask
Original image  ${\begin{bmatrix}0&0&0\\0&1&0\\0&0&0\end{bmatrix}}$
Spatial lowpass  ${\frac {1}{9}}\times {\begin{bmatrix}1&1&1\\1&1&1\\1&1&1\end{bmatrix}}$
Spatial highpass  ${\begin{bmatrix}0&-1&0\\-1&4&-1\\0&-1&0\end{bmatrix}}$
Fourier representation  Pseudocode: image = checkerboard; F = Fourier transform of image; show image: log(1 + absolute value(F))
Fourier lowpass  mask that passes only the low-frequency region
Fourier highpass  mask that passes only the high-frequency region
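The two filtering routes listed above (a spatial kernel and a frequency-domain mask) can be sketched side by side in base MATLAB. This is only an illustration under stated assumptions: the test image is built by hand instead of with a toolbox function, the mask is an ideal circular one, and the names `cutoff`, `maskLow`, `lowFreq` and `highFreq` as well as the cutoff radius of 10 are hypothetical choices, not values from the examples above.

```matlab
% Build a small checkerboard test image without toolbox functions.
tile = [zeros(8) ones(8); ones(8) zeros(8)];    % one 16x16 period
img  = repmat(tile, 4, 4);                      % 64x64 test image

% Spatial domain: convolve with the 3x3 averaging (lowpass) kernel.
kbox    = ones(3) / 9;
lowSpat = conv2(img, kbox, 'same');

% Frequency domain: transform, then shift zero frequency to the centre.
F = fftshift(fft2(img));
[rows, cols] = size(img);
[u, v] = meshgrid((1:cols) - (floor(cols/2) + 1), ...
                  (1:rows) - (floor(rows/2) + 1));

cutoff  = 10;                                   % assumed cutoff radius
maskLow = sqrt(u.^2 + v.^2) <= cutoff;          % ideal lowpass mask

% Mask the spectrum and transform back; the highpass mask is the complement.
lowFreq  = real(ifft2(ifftshift(F .* maskLow)));
highFreq = real(ifft2(ifftshift(F .* ~maskLow)));

% Log-magnitude spectrum, as in the pseudocode row of the table.
spectrum = log(1 + abs(F));
```

Because the two masks are complementary, `lowFreq + highFreq` reproduces `img` up to rounding error. An ideal mask with a hard edge tends to produce ringing, which is one reason smoother masks are often preferred in practice.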
Image padding in Fourier domain filtering
Images are typically padded before being transformed to the Fourier space, and the choice of padding technique affects the filtered result.
Notice that a highpass filter shows extra edges when the image is zero padded compared to repeated edge padding.
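The effect of the padding choice can be reproduced with a deliberately simple case. The sketch below assumes a constant image (which contains no real edges), a hypothetical padding width `p` of 8 pixels, and an ideal highpass mask with an assumed cutoff radius of 4; the padding is written with plain indexing so that no toolbox function is needed.

```matlab
img = ones(32);                 % constant image: no genuine edges
p   = 8;                        % assumed padding width in pixels
[rows, cols] = size(img);

% Zero padding: surround the image with a black border.
zeroPad = zeros(rows + 2*p, cols + 2*p);
zeroPad(p+1:p+rows, p+1:p+cols) = img;

% Repeated-edge padding: copy the nearest border pixel outward.
ri = [ones(1, p), 1:rows, rows * ones(1, p)];
ci = [ones(1, p), 1:cols, cols * ones(1, p)];
edgePad = img(ri, ci);

% Ideal highpass mask on the padded grid (zero frequency at the centre).
[M, N] = size(zeroPad);
[u, v] = meshgrid((1:N) - (floor(N/2) + 1), (1:M) - (floor(M/2) + 1));
maskHigh = sqrt(u.^2 + v.^2) > 4;               % assumed cutoff radius

highpass = @(x) real(ifft2(ifftshift(fftshift(fft2(x)) .* maskHigh)));
crop     = @(x) x(p+1:p+rows, p+1:p+cols);      % remove the padding again

hpZero = crop(highpass(zeroPad));
hpEdge = crop(highpass(edgePad));

fprintf('max |response| with zero padding:     %g\n', max(abs(hpZero(:))));
fprintf('max |response| with repeated padding: %g\n', max(abs(hpEdge(:))));
```

With zero padding the jump from image to border acts as an artificial edge, so the highpass output is clearly non-zero along the image boundary. With repeated-edge padding the padded image stays constant and the response is zero up to rounding error.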
Filtering code examples
MATLAB example for spatial domain highpass filtering.
img = checkerboard(20);                  % generate checkerboard (Image Processing Toolbox)
% ************************** SPATIAL DOMAIN ***************************
klaplace = [0 -1 0; -1 4 -1; 0 -1 0];    % 3x3 Laplacian kernel; weights sum to 0
                                         % (a centre weight of 5 would sharpen instead)
X = conv2(img, klaplace, 'same');        % convolve test img with the kernel,
                                         % keeping the original image size
figure()
imshow(X, [])                            % show Laplacian filtered image
title('Laplacian Edge Detection')
Affine transformations
Affine transformations enable basic image transformations including scale, rotate, translate, mirror and shear as is shown in the following examples:^{[38]}
Transformation name  Affine matrix
Identity  ${\begin{bmatrix}1&0&0\\0&1&0\\0&0&1\end{bmatrix}}$
Reflection  ${\begin{bmatrix}-1&0&0\\0&1&0\\0&0&1\end{bmatrix}}$
Scale  ${\begin{bmatrix}c_{x}=2&0&0\\0&c_{y}=1&0\\0&0&1\end{bmatrix}}$
Rotate  ${\begin{bmatrix}\cos(\theta )&\sin(\theta )&0\\-\sin(\theta )&\cos(\theta )&0\\0&0&1\end{bmatrix}}$  where θ = π/6 = 30°
Shear  ${\begin{bmatrix}1&c_{x}=0.5&0\\c_{y}=0&1&0\\0&0&1\end{bmatrix}}$
To apply the affine matrix to an image, the image is converted to a matrix in which each entry corresponds to the pixel intensity at that location. Then each pixel's location can be represented as a vector indicating the coordinates of that pixel in the image, [x, y], where x and y are the row and column of a pixel in the image matrix. This allows the coordinate to be multiplied by an affine-transformation matrix, which gives the position that the pixel value will be copied to in the output image.
However, transformations that involve translation require three-dimensional homogeneous coordinates. The third dimension is set to a nonzero constant, usually 1, so that the new coordinate is [x, y, 1]. This allows the coordinate vector to be multiplied by a 3 by 3 matrix whose last column holds the translation offsets, which a 2 by 2 matrix cannot express because it always maps the origin to itself.
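A short sketch may make the role of the constant 1 concrete. It assumes the column-vector convention (the matrix multiplies the coordinate from the left); the offsets `tx` and `ty`, the sample point, and the 2 by 2 matrix `A2` are arbitrary illustrative values.

```matlab
tx = 5;  ty = -2;            % assumed translation offsets

% Translation as a 3x3 matrix acting on homogeneous coordinates.
T = [1 0 tx;
     0 1 ty;
     0 0 1];

p = [3; 4; 1];               % pixel coordinate [x; y] with the constant 1 appended
q = T * p;                   % the 1 multiplies tx and ty, giving [x + tx; y + ty; 1]

% A 2x2 matrix can never shift the origin, whatever its entries are.
A2 = [2 0.5; 0 1];
originImage = A2 * [0; 0];   % always [0; 0]
```

Here `q` is `[8; 2; 1]`: the third component stays 1, so the result can be fed directly into another 3 by 3 affine matrix.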
Because matrix multiplication is associative, multiple affine transformations can be combined into a single affine transformation by multiplying the matrix of each individual transformation in the order that the transformations are done. This results in a single matrix that, when applied to a point vector, gives the same result as all the individual transformations performed on the vector [x, y, 1] in sequence. Thus a sequence of affine transformation matrices can be reduced to a single affine transformation matrix.
For example, two-dimensional coordinates only allow rotation about the origin (0, 0). But three-dimensional homogeneous coordinates can be used to first translate the desired center of rotation to (0, 0), then perform the rotation, and lastly translate the origin (0, 0) back to the original point (the opposite of the first translation). These three affine transformations can be combined into a single matrix, thus allowing rotation around any point in the image.^{[39]}
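The translate, rotate, translate-back sequence can be written out as a minimal sketch. It assumes column-vector coordinates, so the matrix of the first transformation sits rightmost in the product; the center `c` and the test point `p` are hypothetical values, and the sign convention chosen for the sine terms is one of two common ones (the opposite choice rotates in the other direction).

```matlab
theta = pi/6;                          % 30 degrees
c     = [10; 20];                      % assumed centre of rotation

T1 = [1 0 -c(1); 0 1 -c(2); 0 0 1];    % step 1: move c to the origin
R  = [ cos(theta) sin(theta) 0;
      -sin(theta) cos(theta) 0;
       0          0          1];       % step 2: rotate about the origin
T2 = [1 0  c(1); 0 1  c(2); 0 0 1];    % step 3: move the origin back to c

A = T2 * R * T1;                       % the three steps as one matrix

p       = [13; 24; 1];                 % a point 5 pixels away from c
stepped = T2 * (R * (T1 * p));         % apply the steps one at a time
direct  = A * p;                       % apply the combined matrix once

fixedPoint = A * [c; 1];               % the centre itself must not move
```

`stepped` and `direct` agree up to rounding error, which is the associativity argument in practice, and `fixedPoint` equals `[c; 1]`. Precomputing `A` means each pixel coordinate needs one matrix multiplication instead of three.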
Image denoising with Morphology
Mathematical morphology is suitable for denoising images. Structuring elements are central to mathematical morphology.
The following example illustrates structuring elements. The image I', the structuring element B, and the resulting dilation and erosion are shown below, followed by a table of implementations.
e.g. $I'={\begin{bmatrix}45&50&65\\40&60&55\\25&15&5\end{bmatrix}},\quad B={\begin{bmatrix}1&2&1\\2&1&1\\1&0&3\end{bmatrix}}$
Define Dilation(I, B)(i,j) = $\max\{I(i+m,j+n)+B(m,n)\}$. Let Dilation(I,B) = D(I,B)
D(I', B)(1,1) = $\max(45+1,50+2,65+1,40+2,60+1,55+1,25+1,15+0,5+3)=66$
Define Erosion(I, B)(i,j) = $\min\{I(i+m,j+n)-B(m,n)\}$. Let Erosion(I,B) = E(I,B)
E(I', B)(1,1) = $\min(45-1,50-2,65-1,40-2,60-1,55-1,25-1,15-0,5-3)=2$
After dilation $(I')={\begin{bmatrix}45&50&65\\40&66&55\\25&15&5\end{bmatrix}}$ After erosion $(I')={\begin{bmatrix}45&50&65\\40&2&55\\25&15&5\end{bmatrix}}$
An opening is simply erosion followed by dilation, while a closing is dilation followed by erosion. In practice, D(I,B) and E(I,B) can be implemented in a manner similar to convolution, by sliding B over the image.
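The worked numbers above can be checked with a few lines of MATLAB. This sketch assumes the definitions just given (add B before taking the maximum, subtract B before taking the minimum) and uses the fact that, for a 3 by 3 image and a 3 by 3 structuring element, only the center pixel has a complete neighborhood. MATLAB indices start at 1, so the pixel written as (1,1) above is element (2,2) here.

```matlab
I = [45 50 65; 40 60 55; 25 15 5];     % image I' from the example
B = [1 2 1; 2 1 1; 1 0 3];             % structuring element B

% The 3x3 window centred on the middle pixel covers the whole image.
dilated = max(I(:) + B(:));            % greyscale dilation at the centre: 66
eroded  = min(I(:) - B(:));            % greyscale erosion at the centre: 2

% Write the results back into copies of the image.
afterDilation = I;  afterDilation(2, 2) = dilated;
afterErosion  = I;  afterErosion(2, 2)  = eroded;

disp(afterDilation)
disp(afterErosion)
```

The maximum comes from the top-right pixel (65 + 1) and the minimum from the bottom-right pixel (5 - 3), matching the matrices shown after dilation and after erosion.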
Structuring element  Mask  Code  Example 

Original Image  None  Use MATLAB to read the original image
original = imread('scene.jpg');
image = rgb2gray(original);              % convert to grayscale
[r, c, channel] = size(image);           % r rows, c columns
se = logical([1 1 1 ; 1 1 1 ; 1 1 1]);   % 3x3 structuring element (mask)
[p, q] = size(se);
halfH = floor(p/2);                      % half-height of the mask
halfW = floor(q/2);                      % half-width of the mask
time = 3; % apply each denoising method 3 times


Dilation  ${\begin{bmatrix}1&1&1\\1&1&1\\1&1&1\end{bmatrix}}$  Use MATLAB to perform dilation
imwrite(image, "scene_dil.jpg")
extractmax = zeros(size(image), class(image));
for i = 1 : time
    dil_image = imread('scene_dil.jpg');
    for col = (halfW + 1) : (c - halfW)
        for row = (halfH + 1) : (r - halfH)
            % bounds of the neighborhood centred on (row, col)
            dpointD = row - halfH;
            dpointU = row + halfH;
            dpointL = col - halfW;
            dpointR = col + halfW;
            dneighbor = dil_image(dpointD:dpointU, dpointL:dpointR);
            filter = dneighbor(se);              % pixels under the mask
            extractmax(row, col) = max(filter);  % dilation: local maximum
        end
    end
    imwrite(extractmax, "scene_dil.jpg");
end


Erosion  ${\begin{bmatrix}1&1&1\\1&1&1\\1&1&1\end{bmatrix}}$  Use MATLAB to perform erosion
imwrite(image, 'scene_ero.jpg');
extractmin = zeros(size(image), class(image));
for i = 1 : time
    ero_image = imread('scene_ero.jpg');
    for col = (halfW + 1) : (c - halfW)
        for row = (halfH + 1) : (r - halfH)
            % bounds of the neighborhood centred on (row, col)
            pointDown = row - halfH;
            pointUp = row + halfH;
            pointLeft = col - halfW;
            pointRight = col + halfW;
            neighbor = ero_image(pointDown:pointUp, pointLeft:pointRight);
            filter = neighbor(se);               % pixels under the mask
            extractmin(row, col) = min(filter);  % erosion: local minimum
        end
    end
    imwrite(extractmin, "scene_ero.jpg");
end


Opening  ${\begin{bmatrix}1&1&1\\1&1&1\\1&1&1\end{bmatrix}}$  Use MATLAB to perform opening
imwrite(extractmin, "scene_opening.jpg")    % start from the eroded image
extractopen = zeros(size(image), class(image));
for i = 1 : time
    dil_image = imread('scene_opening.jpg');
    for col = (halfW + 1) : (c - halfW)
        for row = (halfH + 1) : (r - halfH)
            % bounds of the neighborhood centred on (row, col)
            dpointD = row - halfH;
            dpointU = row + halfH;
            dpointL = col - halfW;
            dpointR = col + halfW;
            dneighbor = dil_image(dpointD:dpointU, dpointL:dpointR);
            filter = dneighbor(se);               % pixels under the mask
            extractopen(row, col) = max(filter);  % dilate the eroded image
        end
    end
    imwrite(extractopen, "scene_opening.jpg");
end


Closing  ${\begin{bmatrix}1&1&1\\1&1&1\\1&1&1\end{bmatrix}}$  Use MATLAB to perform closing
imwrite(extractmax, "scene_closing.jpg")    % start from the dilated image
extractclose = zeros(size(image), class(image));
for i = 1 : time
    ero_image = imread('scene_closing.jpg');
    for col = (halfW + 1) : (c - halfW)
        for row = (halfH + 1) : (r - halfH)
            % bounds of the neighborhood centred on (row, col)
            dpointD = row - halfH;
            dpointU = row + halfH;
            dpointL = col - halfW;
            dpointR = col + halfW;
            dneighbor = ero_image(dpointD:dpointU, dpointL:dpointR);
            filter = dneighbor(se);                % pixels under the mask
            extractclose(row, col) = min(filter);  % erode the dilated image
        end
    end
    imwrite(extractclose, "scene_closing.jpg");
end

