Wednesday, September 23, 2009

ACTIVITY 19 - Restoration of Blurred Image

One of the problems encountered by people who take pictures is the inability to keep the camera steady. Most of the time, images are blurred due to the motion of the camera while the shutter is open. This, together with noise, produce images with poor quality.

In this activity, restoration via filtering in the Fourier domain is being investigated. In particular, we make use of the Weiner filter. Let us use the following image for our study.
























figure 1. Image taken from http://emeagwali.com/images/website/ebony-magazine-cover-april-2008.jpg



figure 2. This is the red channel in grayscale.

Let f be the image in figure 2. Then its FT is F. Blurring may be modeled in Fourier space by multiplying F point by point to a function, H, which is an exponential function multiplied by a sine function. H depends on the quantities a and b, where these constants represent the velocity of the camera along the horizontal and the vertical respectively as the image is being taken. This product will be called G. If we take inverse Fourier transform of G, we will get g, which is the blurred version of f. Then, let us add different types of noise into this image.

The Weiner filter involves calculations of H and G. If we know the power spectrum of the added noise, then we can include it in the expression for the Weiner filter and then multiply this by G. Its inverse transform will yield a somewhat good restoration. However, we usually don't know the power spectrum of the noise. We may opt to replace the power spectrum by a constant, K. We need to vary K and observe which value yields the best visual result.

Below are the results. In the first set, we added Gaussian noise with mean and variance of 0.5 and 0.1 respectively. The first image is the blurred version. The next image results upon noise addition. The third image is the restoration with a knowledge of the power spectrum of the Gaussian noise. Below, we have restorations for K=-.5, 0, 1.




















figure 3. Gaussian noise restoration


Second, we added exponential noise with a mean of .5. The images in the second row made use of the following values K=-1, 0, 0.1, 1.




















figure 4. Exponential noise restoration


Then, we have uniform noise that spans 0-1. We use K=0, 0.2, 1.
















figure 5. Uniform noise spanning 0-1



Finally, we have uniform noise that spans 0.5-1. K=-0.2, 0, 0.2.




















figure 6. Uniform noise, 0.5-1


We obtain the best restoration when the power spectrum is known. For certain types of noise, a certain constant K will suffice as we have seen above. In others, this estimate is not enough. Further investigation is needed on replacing the power spectrum by a suitable estimate.

Rating: 10 for proper execution
Note: imwrite causes my laptop to hang. I opted to use imshow to obtain the blurred images, which shows an incorrect "blurred" version. Apologies to readers.

Monday, September 14, 2009

ACTIVITY 18 - Noise Modelling and Basic Image Reconstruction

Previous techniques in image processing were highly subjective in nature. In this post, we use a more objective technique. In particular, we assume a model for image noise that may arise from the components of the sensor, lighting conditions during image capture, etc.

Noise comes in many forms, such as gaussian, exponential, gamma, salt and pepper, constant, etc. There are diferent techniques of reducing noise, each being more effective than others in certain types of noise. We will consider each type of noise and apply 6 techniques which involve replacing a pixel by a specified mean. There are four types of averaging that we can use. We will specify a window around the pixel that would only include a small neighborhood around it.

First, we will generate a tri grayscale image. Its grayscale PDF is given by three dirac deltas of approximately the same height.










Figure 1. Image

















Figure 2. PDF of figure 1


We will add different types of noise in this image with the following PDFs.















Figure 3. Noise PDFs


The resulting images are shown below. Then four types of averaging are applied:

A. Arithmetic Mean
Let our window be of dimensions m by n and centered at a particular pixel. To obtain the arithmetic mean, get the sum of all pixels within the window and divide by the total number of pixels or m*n.

B. Geometric Mean
Get the product of all m*n pixels and then take its (m*n)th root. This method will notably fail within the window surrounding a zero valued pixel.

C. Harmonic Mean
Take the sum of the inverse pixel values (sum of 1/g, were g is the pixel value). Then divide m*n by this sum.

D. Contraharmonic Mean
Let Q be a nonzero integer. If g is the pixel value, take the sum of g^(Q+1). Divide this by the sum of g^Q. Q is called the filter order.


We get the following noisy images and reconstructed ones using the four methods averaging.








Figure 4. Gaussian Noise









Figure 5. Gamma Noise









Figure 6. Exponential Noise









Figure 7. Uniform Noise









Figure 8. Rayleigh Noise








Figure 9. Pepper Noise


The arithmetic mean produces a blurred version of the original image. This may remove the noise from images but compromises details. However, if the image does not have sharp boundaries but gradual changes in grayscale, this filter will work well. However, even if the image does not have sharp boundaries, it will not work well with noise with a very limited range in its PDF, such as salt/pepper noise. If there are a few such "spots" throughout the image, the reconstructed one will be blotchy in places. If there are several spots, the result will be less blotchy and smoother but the overall brightness will be affected depending on the value of the noise.

The geometric mean generally produces a higher contrast image than the previous. Even when the noisy image is very dim, the reconstructed image will still be bright. It is effective with noise that is highly "imbalanced" such as gamma and exponential noise. One drawback is that blotches from salt/pepper noise are more prominent.

The harmonic mean is an interesting filter. It produces a sharp image. It performs remarkably well on pepper noise and fairly well with uniform noise and gaussian noise. But it does not work with gamma, exponential and salt noise. In the latter, the reconstructed images are very tile-like. With rayleigh noise, the quality is somewhat in between. On the other hand, the tile effect is nice as it does not overpower too much detail.

The contraharmonic filter comes in two flavors. Q may be positive or negative. When Q is positive, pepper noise is removed. We expect it to work well with dim noise. It fails with salt noise and expect it to fail with other types of bright noise. The opposite is true for negative Q.


Another factor that we can look into with filtering is the window size. If the window is small, blurring is minimized but cleaning is reduced as well. If the window is large, more noise is removed but detail is lost. We must find the proper window size that will make a good compromise between the two.












Figure 10. Effect of Window Size


Now let us observe the filters when applied to a more realistic grayscale image such as the one below. Its histogram is spread out rather than being confined to three values. In this image, gaussian noise was applied. We can see that the corrected images do not differ much from those corrected ones of the simple shapes.






Figure 11. Image with gaussian noise and results of filtering

Each filter has its own advantages and disadvantages. We need to know the nature of the noise in the image so that we can wisely choose the correct filter or set of filters which we can apply successively.

Rating: 10 for completing this very long activity

ACTIVITY 17 - Stereometry

Real objects are 3 dimensional (at least that's what we can observe easily). When they are captured by a camera, they reduce to 2 dimensional images. Each pixel in the image would reflect the color of the object and the lighting conditions it was subjected to at the time that it was captured. If we can keep the camera and the object fixed and change the lighting conditions, we may be able to determine the 3D shape of the object based only on the variations of brightness across different images.

The easiest case to work with would be a point source of light. The intensity would depend as 1/r^2, where r is the distance from the point source. The mathematical derivations will not be
shown here. Just note that we need information on the different positions of the light source relative to the object and the following images.







Figure 1.

After applying some matrix operations, we obtain the following reconstruction.




















Figure 2. Reconstruction

The reconstruction is rather good. The noise (not clear from figure 2) is due mainly to rounding off in between calculations. Note from figure 2 that the object need not be monochromatic. This is because the depth information is determined only from variations of a pixel from one image to another. This is why we need to make sure that the object is fixed relative to the camera.

Rating: 10

Wednesday, September 2, 2009

ACTIVITY 16 - Neural Networks

We present here yet another method of classifying objects. We will not consider the details but present the overall procedure when using neural networks.
















figure 1. Schematic of a neural network

In a neural network, there are three layers: input, hidden, output. The input layer may consist of the features. These values are then transferred to each node in the hidden layer using a specific weighting factor for each node. Finally, the nodes from the hidden layer transfer data onto the output layer. To obtain the proper weighting factors, the network needs to be trained.

In this activity, we vary the number of iterations done in determining the factors. As we can see below, the result improves significantly in the range 100-7000 iterations. More than that, the accuracy does not increase as quickly. For the given system then, the optimum iteration is seen to be 7000.








figure 2. Values as a function of number of iterations










figure 3. Plot of the values as a function of iterations for one object


When the number of iterations is too low, the system is barely able to classify an object. The values indicate that it is almost equally like to be one as to be the other. On the other hand, there is some form of saturation. Beyond 7000 iterations for our system, the accuracy does not increase as much. Another limit of the neural network is its inability to classify when one set of features are much larger than the other set. For instance, the area value was supposed to be in the order of tens of thousands. In that case, the network fails. However, if we divide this set my tens of thousands, the resulting values are easier to handle, especially with the red component being anywhere between 0 and 1 only.

For this system, the classification works best at about 4000-5000 iterations. In other words, the compromise between speed and accuracy is rather fair. For a different data set, especially a more difficult one, the calculations need to be repeated to determine the behavior of the network.

Rating: 10 for having completed the work

Wednesday, August 26, 2009

Activity 15 - Probabilistic Classification

Objects can be classified into categories based on features that can be measured. In the previous activity, we determine class membership by determining which mean feature vector it is closer to. However, this may prove tricky when the distributions fall along parallel lines, for instance. A feature vector may fall along the line crossing one mean vector but be nearer the other mean vector.

Linear discriminant analysis may be used in such cases. This is a mathematically involved process. The formula to be used is given by

where C is the covariance matrix of a set of measurements x,
u is the mean of the set of measurements of the class i,
p is the prior probability (expected value prior to any measurement, usually given by
number of samples in a class/total number of samples)

An object with a particular measurement x belongs to the class i which has the largest discriminant function, f.

Using LDA, we obtain a 100% recognition rate. Below is a snapshot of a sample calculation. Note that we made use of the same data set as in the previous activity.












figure 1. Sample calculation


Rating: 10 for correct classification

Activity 14 - Pattern Recognition

There are many objects around us that we can recognize. This is because our brains retain certain patterns or features that make an object unique. The simplest features to recognize are shape, size and color. These characteristics may be obtained using the image processing techniques that we have learned in the past activities.

Features may be assigned values in a coordinate system. For instance, in a 3D system, the dimensions may be area, average red and average green. We may choose any other feature for as long as we can quantify it. A set of one type of object will have a distribution of features along this coordinate system. Typically, this distribution is gaussian in form.

A set of another type of object will have its own distribution. If the two distributions do not overlap significantly, we can determine whether one object will belong to one set or the other by finding the distance of its feature point in coordinate space from the average feature point of each set. It belongs to the set with which its feature point is located closer.

Take for example these two sets of snack items.























Figure 1. abov e-Cheez it, below-Pillows

The features of 5 pieces were taken from each set and averaged. Another five were taken from each and then their features are plotted together with the average. For aree and average red, the following were obtained.





















Figure 2. Area in pixels vs average red

In this example, the objects are very easy to classify. We simply find the feature point or feature vector (f1,f2,f3,..) and find its distance from each mean feature (a1,a2,a3,..) and (b1,b2,b3,...) using the distance formula d=sqrt((f1-a1)^2 + (f2-a2)^2 + (f3-a3)^2 + ...).

In this activity, the classification rate is 100%. The two sets are well separated in feature space.

Rating: 10

Wednesday, August 19, 2009

Activity 13 - Correcting Geometric Distortion

Imaging systems are imperfect. Individual components such as lenses usually introduce noticable distortions in images. Many cameras have been designed to minimize such effects. For simpler ones such as camera phones, the distortions are obvious, especially when the object of interest is considerably far from the optical axis or the middle where distortions are negligible.

Two of the most common distortions are the barrel and pincushion distortion. In the first case, the image becomes smaller as one moves farther from the optical axis. The opposite is true for the second case. Below is an example of a barrel distortion.
















Figure 1. Ideal Grid



















Figure 2. Grid with Barrel Distortion


In image 2, the distortion is not extreme. In fact, if we look at any square, it still looks like a polygon with four straight sides. Thus, we can work on the corners of one square, find their undistorted counterparts, determine the linear transformation, and use this knowledge to recover the square, including the pixels contained in the square. The counterpart may be found by simply generating a straight grid based on the squares in the middle of the image since their distortions are imperceptible.

In principle ,we can fully recover the ideal image. However, images are discretized into pixels. A point on the reconstructed will almost always never fall exactly on a pixel in the distorted image but at some "fraction of a pixel" from it. The simplest is to assume the value of the pixel nearest it. However, the quality may be poor. A more involved but still quick method is to assume a bilinear interpolation. Using the steps outlined here, we get the following reconstructions of the distorted grid in figure 2. Note that each square has its own transformation. In scilab, for loops may be used to go through each square.

















Figure 3. Reconstructed Grid (Nearest Neighbor)

















Figure 3. Reconstructed Grid (Bilinear interpolation)


Figure 3 has a much better quality than figure 2. The lines are smoother and more accurate. This reconstruction may be very useful as shown by the next example below.
















Figure 4. Tilted and distorted image of oscilloscope display

Some experiments require gathering data that can only be displayed as in figure 4. The usual approach is to estimate the desired parameter manually. Image processing can help improve the accuracy tremendously, especially if the data is not as regular as the example in figure 4. First, we have to reconstruct the image. Note how the gridlines of the display may serve as reference squares and rectangles for the reconstruction. Nearest neighbor and bilinear interpolation produced the following results.
















Figure 5. Reconstructed Oscilloscope Display (nearest neighbor)

















Figure 6.Reconstructed Oscilloscope Display (bilinear interpolation)

The result of nearest neighbor approximation for images with "thick" elements are not as bad as for thin ones (compare figure 3 and figure 5), precisely because being one pixel off is not too big for objects much thicker than one pixel. In figure 3, the grid lines are mostly one pixel thick, so being off by one pixel makes a relatively huge difference.

The techniques learned from previous blog entries may be used to extract data. For instance, conversion to black and white and thinning may be used to obtain a result such as the following.
















Figure 7. Binarized and thinned


The quality of figure 7 may be improved, if we will, be additional morplological processes and added care in applying image processing techniques. Then we can determine the coordinates of the white pixels above and convert them via ratio and proportion into physically meaningful variables since each unit length of a side of a square in figure 6 corresponds to, say 10 Volts, which corresponds to 40 pixels (assumed width of an ideal square in figure 7).

...

This activity was frustrating and fun at the same time. I used four for loops and a few if conditions within, which ate up a lot of my mental energy. I was only too happy when I finally succeeded. 10 points for completely understanding the concepts.