A No-Reference Image Quality Metric with Application in Low-Dose Human Lung CT Image Processing

In this paper a no-reference image quality metric designed for human lung CT scans is presented. The metric can be used for several purposes, including the evaluation of the visual quality of CT scans or the control of enhancement processes. The developed method is based on a modified SKFCM image segmentation algorithm combined with the SSIM metric. A lung phantom was constructed for validation purposes. Tests were performed both with synthetic images, using the lung phantom with added noise, and with real CT images. The presented methods include simulations, quantitative studies and subjective evaluation. Experimental results show that the metric values reliably follow the visual image quality of CT scans.

Keywords: CT, fuzzy C-means, image quality metrics, low-dose, lung phantom, noise modeling


I. INTRODUCTION
One of the motivations behind the construction of the proposed metric is to provide a tool for objective evaluation of the quality of low-dose CT scans. It is well known that dose reduction lowers the radiation exposure risks, but at the same time decreases the image quality. Therefore a quality measurement method is relevant and useful in this respect.
In X-ray computed tomography, the attenuation of the X-ray photon beam by the human body follows the Beer-Lambert law:
$$I = I_0 \exp\left(-\int_L \mu(x)\,dx\right), \qquad (1)$$
where $I_0$ is the source intensity of the beam, $I$ is the observed intensity recorded by the CT scanner sensors, $L$ represents the beam path and $\mu$ is the linear attenuation coefficient function of the body parts. The recorded sinogram can be considered as a sampling of the Radon transform of $\mu$. The reconstruction of $\mu$ can be obtained by means of the so-called filtered backprojection method. For details we refer the reader to [3]. We note that $\mu$ depends on the photon energy, therefore normalized HU (Hounsfield Unit) values are commonly used. Quantum noise and electric noise are present in the recording process, and quantization noise arises during the reconstruction due to discretization. There are several existing methods for image quality measurement, based on various approaches (see e.g. [6]). The mainstream full-reference metrics are SSIM (structural similarity index) [12] and VIF (visual information fidelity) [10].
Unfortunately, these full-reference metrics cannot be applied directly to CT scans, because in general no high-quality reference scans exist. We remark that the SSIM metric has already been applied [7] to compare low-dose reconstructions and original CT scans, where the latter serve as reference images. The existing no-reference metrics are relative in the sense that, while they may follow the decrease in quality of the same image, the metric values are not comparable across different images.
The proposed metric, presented in the following section, is based on the SSIM metric and on a modified version of the SKFCM (spatially constrained kernelized fuzzy C-means) image segmentation algorithm [13]. Besides SSIM we also studied the CNR (contrast-to-noise ratio) and SNR (signal-to-noise ratio) metrics [2], which are often used in medical image processing. It turned out that SSIM performs better. For instance, CNR and SNR are very sensitive to the selection of the regions. On the other hand, SSIM is normalized, i.e. the metric values are always within the interval [−1, 1].
The outline of this article is as follows. In Section II the quality measurement method is described. Section III contains an analytical study of the metric with the constructed lung phantom and noise model. In Section IV the results on real low-dose and normal lung CT images are presented.

II. METRIC CONSTRUCTION
The metric construction consists of three main steps: preprocessing, segmentation and quality measurement.
We start with a preprocessing step, since experiments performed on real low-dose CT images show that the segmentation technique detailed below is more effective if a preliminary background removal and gamma correction are applied. Here background refers to the region outside the body. For background removal, Gaussian and median filtering, thresholding and region fill turned out to be appropriate. Gamma correction was applied to regions with intensity below −700 HU. Fig. 2 (a) shows the result of this step for the test image of Fig. 1 (a).
We continue with the segmentation part. Human lung CT images, using HU values as pixel intensities, have similar structures and histograms (see Fig. 1). This happens because different human bodies have similar tissue combinations. The proposed segmentation method assigns pixels to tissue clusters according to intensities, spatial distribution and tissue properties.
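The SKFCM algorithm was used as the starting model. SKFCM is derived from the conventional FCM (fuzzy C-means) algorithm. Its objective function is
$$J_m(U, v) = \sum_{i=1}^{C} \sum_{k=1}^{N} u_{ik}^m \left(1 - K(x_k, v_i) + P_{m,i,k}\right),$$
where $v = (v_i)_{i=1}^{C}$ are the cluster prototypes, $U = (u_{ik})_{i=1,k=1}^{C,N} \in [0, 1]^{C \times N}$ stands for the partition membership matrix, satisfying
$$\sum_{i=1}^{C} u_{ik} = 1 \quad (k = 1, \dots, N),$$
$K$ is a Gaussian RBF with standard deviation $\sigma$, and $m > 1$ determines the level of cluster fuzziness. $P_{m,i,k}$ represents the spatial penalty term in the neighborhood $N_k$ of pixel $x_k$, within cluster $v_i$:
$$P_{m,i,k} = \frac{\alpha}{|N_k|} \sum_{r \in N_k} (1 - u_{ir})^m,$$
where the parameter $\alpha \in (0, 1)$ controls the penalty effect.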
Then SKFCM can be considered as an optimization problem, i.e. the minimization of $J_m$ under the constraints for $U$. The standard SKFCM cannot be applied directly to lung CT scans, because it generates the same homogeneity level in all of the clusters. In the case of lung CT scans, the segmentation goal is to create homogeneous fat, muscle and internal organ regions, with a detailed lung region. The structure of the lung tissue must be preserved since it is diagnostically relevant. In order to achieve this goal, the spatial penalty term $P_{m,i,k}$ in the objective function had to be modified.
One of the problems with the original penalty term was the constant window size. In the modified version, we specified different window sizes for different clusters. In addition, a weighted summation is used inside the windows, allowing different homogeneity levels in the clusters. The modified spatial penalty is then
$$P_{m,i,k} = \alpha \sum_{r \in N_{ik}} w_{ir} (1 - u_{ir})^m,$$
where $N_{ik}$ stands for the neighborhood of pixel $x_k$ within cluster $v_i$, and $w_{ir}$ are the weight coefficients satisfying
$$\forall i: \quad \sum_{r \in N_{ik}} w_{ir} = 1.$$
Another important modification was the use of fixed cluster prototypes. In the original SKFCM the number of clusters is fixed, but the cluster prototypes are allowed to change during the iterative optimization. In the case of lung CT scans, fixed cluster prototypes give a reliable and comparable segmentation result. Table I contains the selected clusters and prototypes.
Similarly to the standard SKFCM, the objective function $J_m$ can be minimized by an iteration scheme: $U$ is initialized first and then updated in each iteration step.

Fig. 2: The result of preprocessing and segmentation.
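The iteration can be illustrated with a minimal sketch. It assumes the membership update of the standard SKFCM [13], in which each membership is proportional to the kernel-induced distance plus the spatial penalty raised to the power −1/(m−1), and it works on a one-dimensional signal for brevity. The function names, the toy prototypes, the border clamping and the value of `sigma_k` are illustrative assumptions, not the authors' implementation.

```python
import math

def gaussian_weights(radius, sigma):
    """Normalized Gaussian window weights for offsets -radius..radius."""
    raw = [math.exp(-(d * d) / (2.0 * sigma * sigma))
           for d in range(-radius, radius + 1)]
    total = sum(raw)
    return [w / total for w in raw]

def memberships(signal, prototypes, windows, sigma_k, m=2.0, alpha=0.3, steps=10):
    """Iterate fuzzy memberships u[i][k] of sample k in cluster i.

    prototypes are fixed; windows[i] is the odd-length weight list of cluster i.
    """
    eps = 1e-12  # avoids division by zero when a sample equals a prototype
    exponent = -1.0 / (m - 1.0)
    n = len(signal)
    # Kernel-induced distance 1 - K(x_k, v_i) with a Gaussian RBF.
    dist = [[1.0 - math.exp(-((x - v) ** 2) / sigma_k ** 2) for x in signal]
            for v in prototypes]

    def normalize(cost):
        u = [[(c + eps) ** exponent for c in row] for row in cost]
        for k in range(n):
            s = sum(row[k] for row in u)
            for row in u:
                row[k] /= s  # memberships of each sample sum to 1
        return u

    u = normalize(dist)  # assumed initialization: no spatial term
    for _ in range(steps):
        cost = []
        for i, row in enumerate(dist):
            w = windows[i]
            radius = len(w) // 2
            new_row = []
            for k in range(n):
                penalty = 0.0
                for j, wj in enumerate(w):
                    r = min(max(k + j - radius, 0), n - 1)  # clamp at borders
                    penalty += wj * (1.0 - u[i][r]) ** m
                new_row.append(row[k] + alpha * penalty)
            cost.append(new_row)
        u = normalize(cost)
    return u

# Toy HU signal with two hypothetical prototypes (air-like and tissue-like).
signal = [-1000, -990, -400, -1005, 40, 35, 50, 45]
u = memberships(signal, prototypes=[-1000.0, 40.0],
                windows=[gaussian_weights(1, 0.5), gaussian_weights(4, 1.5)],
                sigma_k=300.0)
```

The narrow window of the first cluster and the wide window of the second are one-dimensional stand-ins for cluster-specific window sizes: a wider window pushes the corresponding cluster towards more homogeneous regions.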
As a consequence of noise, overfitting may happen. To avoid it, early stopping is necessary. The iteration stops when one of the following criteria is met.
• The iteration reaches $t_{max} = 10$ steps,
• $E^{(t)} > \beta E^{(t-1)}$, where $\beta \in (0, 1)$ is a fixed constant.
The last criterion prevents overfitting: if $E^{(t)}$ decreases more slowly than desired, the iteration stops.
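The stopping logic can be sketched as follows. The sketch assumes that each iteration step reports an error value E(t) and that the iteration stops once E(t) exceeds β times the previous value; how E(t) is computed is left open here, and the `step` callback and the error sequence are hypothetical.

```python
def run_with_early_stopping(step, t_max=10, beta=0.75):
    """Call step() at most t_max times; step() returns the current error E(t).

    Stops early when E(t) > beta * E(t-1), i.e. when the error decreases
    more slowly than the factor beta allows. Returns the number of steps run.
    """
    previous = None
    for t in range(1, t_max + 1):
        error = step()
        if previous is not None and error > beta * previous:
            return t
        previous = error
    return t_max

# Hypothetical error sequence: fast decrease first, then stagnation.
errors = iter([1.0, 0.5, 0.2, 0.19, 0.18, 0.17])
steps_done = run_with_early_stopping(lambda: next(errors))
```

In this example the iteration stops at the fourth step, where the error only drops from 0.2 to 0.19.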
The parameters were optimized by means of tests performed on real CT images. The result is as follows: $m = 2$, $\alpha = 0.3$, $\beta = 0.75$, $t_{max} = 10$. The size of the window $N_{ik}$ is 3 × 3 for the air and lung clusters, and the weights are generated from a Gaussian filter with standard deviation 0.5. For the other clusters the window size is 9 × 9, and the weights come from a Gaussian filter with standard deviation 1.5. Experimental results show that the iteration works better if the standard deviation of the Gaussian RBF $K$ is proportional to the image noise level. The noise level was roughly estimated by applying a wavelet transform and calculating the standard deviation of the detail component, which mostly contains noise. The result of the segmentation applied to the test image of Fig. 1 (a) is shown in Fig. 2 (b).
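A rough noise estimate of this kind can be sketched with a one-level Haar transform. The choice of the Haar wavelet and of the diagonal detail band is an assumption made for illustration; the wavelet actually used is not specified here.

```python
import random
import statistics

def haar_diagonal_detail(image):
    """One-level 2-D Haar transform, diagonal detail coefficients only.

    image: list of rows with even width and height.
    """
    detail = []
    for y in range(0, len(image) - 1, 2):
        for x in range(0, len(image[y]) - 1, 2):
            a, b = image[y][x], image[y][x + 1]
            c, d = image[y + 1][x], image[y + 1][x + 1]
            detail.append((a - b - c + d) / 2.0)
    return detail

def estimate_noise_level(image):
    """Standard deviation of the diagonal detail band as a rough noise level."""
    return statistics.pstdev(haar_diagonal_detail(image))

# Synthetic check: pure Gaussian noise with standard deviation 10.
rng = random.Random(0)
noisy = [[rng.gauss(0.0, 10.0) for _ in range(64)] for _ in range(64)]
estimate = estimate_noise_level(noisy)
```

Smooth image content such as a constant region or a linear ramp cancels out in the diagonal band, so the estimate is dominated by pixel-level noise.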
In the final step, we apply the full-reference SSIM metric to the result of the segmentation and the preprocessed starting image. Here the result of the segmentation serves as the reference image. The metric value lies within the interval [−1, 1], where a higher value means better quality. The performed tests using both synthetic and real CT images show that it agrees with the visual quality of the CT scans.
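The final step can be illustrated with a simplified, single-window form of SSIM. The SSIM of [12] is computed over local windows and averaged; the global statistic below, the constants 0.01 and 0.03 and the assumption of non-negative intensities with a known dynamic range are simplifications made only for this sketch.

```python
def global_ssim(reference, image, dynamic_range):
    """Single-window SSIM between two equally sized lists of pixel values."""
    n = len(reference)
    mu_x = sum(reference) / n
    mu_y = sum(image) / n
    var_x = sum((x - mu_x) ** 2 for x in reference) / (n - 1)
    var_y = sum((y - mu_y) ** 2 for y in image) / (n - 1)
    cov = sum((x - mu_x) * (y - mu_y)
              for x, y in zip(reference, image)) / (n - 1)
    c1 = (0.01 * dynamic_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * dynamic_range) ** 2  # stabilizes the contrast/structure term
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / (
        (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

# The segmentation result plays the role of the reference image.
segmented = [0.0, 0.0, 100.0, 100.0]
preprocessed = [5.0, -5.0, 110.0, 90.0]
quality = global_ssim(segmented, preprocessed, dynamic_range=100.0)
```

The noisier the preprocessed image is relative to its own segmentation, the further the value drops below 1.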

III. PHANTOM STUDY
We constructed a lung phantom for testing and validating the developed metric. This phantom serves as an analytical and schematic model of the lung. We used the lung phantom and a noise model to generate synthetic images with different noise levels. They served as test cases for the segmentation and quality measurement method.
During the construction of the lung phantom we followed the principles of the Shepp-Logan head phantom [11], which is the most popular analytical phantom. The tissues are represented by means of elliptical regions, therefore the Radon transform of the phantom can be calculated directly [9], and the synthetic sinogram can be obtained in analytical form. The lung phantom, shown in Fig. 3, is built from ellipses: large ones represent the tissue regions and small ones the lung tissue. The small ellipses have random sizes and are placed randomly inside the lung area, 100 on each side. The ellipse intensities are the prototypes of Table I. The exact phantom specification is given in Appendix A.
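The analytical sinogram of an elliptical phantom can be sketched with the well-known closed-form projection of an ellipse. The parameter layout of the tuples and the assumption that the ellipse intensities add up where ellipses overlap (as in the Shepp-Logan phantom) are illustrative choices, not the specification of Appendix A.

```python
import math

def ellipse_projection(theta, t, x0, y0, a, b, phi, rho):
    """Line integral of an ellipse along the line x*cos(theta) + y*sin(theta) = t.

    (x0, y0): center, a and b: semi-axes, phi: rotation angle, rho: intensity.
    """
    # Shift the line offset to the ellipse center, rotate into the ellipse axes.
    t_rel = t - (x0 * math.cos(theta) + y0 * math.sin(theta))
    angle = theta - phi
    half_width_sq = (a * math.cos(angle)) ** 2 + (b * math.sin(angle)) ** 2
    if t_rel * t_rel >= half_width_sq:
        return 0.0  # the line misses the ellipse
    chord = math.sqrt(half_width_sq - t_rel * t_rel)
    return 2.0 * rho * a * b * chord / half_width_sq

def phantom_projection(theta, t, ellipses):
    """The Radon transform is linear, so the ellipse contributions add up."""
    return sum(ellipse_projection(theta, t, *e) for e in ellipses)

# Hypothetical two-ellipse phantom: a unit disc with a less dense inner disc.
toy_phantom = [(0.0, 0.0, 1.0, 1.0, 0.0, 1.0),
               (0.0, 0.0, 0.5, 0.5, 0.0, -0.5)]
central_value = phantom_projection(0.0, 0.0, toy_phantom)
```

Sampling `phantom_projection` on a grid of angles and offsets yields a synthetic sinogram without any discretization of the phantom itself.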
We used the noise model for digital sensors given in [4]. It includes both quantum noise and electric noise. This model can be applied to CT scanner sensors as well. The model describes the electric noise as a signal-independent Gaussian random variable with expected value 0 and fixed variance, and the quantum noise as a signal-dependent Poissonian random variable. Approximating the Poisson distribution with a Gaussian distribution, the noise model is
$$I_{noise} \sim \mathcal{N}(I,\ aI + b),$$
where $I$ is as in (1) and $I_{noise}$ stands for the detected, noisy intensity. The positive parameters $a$ and $b$ control the effect of the quantum and electric noise, respectively. A similar equation, with modified $a$ and $b$ parameters, can be given for the quotient of the observed and the source intensities:
$$\frac{I_{noise}}{I_0} \sim \mathcal{N}\left(\frac{I}{I_0},\ a \frac{I}{I_0} + b\right),$$
where $I_0$ stands for the source intensity.
The CT screening process was simulated by using the lung phantom and the noise model above. The simulation parameters were chosen to imitate the settings of the real CT scans. Namely, the photon beam energy is set to 57 keV, with the corresponding value $\mu_{H_2O} \approx 0.2$ cm$^{-1}$ (see [5]). The output resolution is 512 × 512 pixels and the pixel distance is set to 0.06 cm. The simulation starts with generating a synthetic sinogram by sampling the analytical Radon transform of the lung phantom. The generated noise can be added after converting the sinogram to intensity quotients, and the conversion is then reversed. Adding noise to $I/I_0$, with $I_{noise}/I_0$ the noisy Radon transform $P_{noise}(L)$ will be
$$P_{noise}(L) = -\ln \frac{I_{noise}}{I_0}.$$
Fig. 4 contains two synthetic lung phantom reconstructions with their histograms. The images are generated with parameters $b = 10^{-7}$, $a = 6 \cdot 10^{-5}$ and $a = 3 \cdot 10^{-4}$, respectively. The synthetic and the real CT images have similar visual appearance and histogram structure (compare Fig. 4 to Fig. 1). The noise appears in a similar way and similar artifacts can be observed (e.g. beam hardening). This shows that the lung phantom can be used as a schematic model for a two-dimensional slice of a human lung CT scan.
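The noise injection used in the simulation can be sketched as follows. The sketch assumes the Gaussian approximation with variance proportional to a·I + b, applied to the intensity quotient obtained from a line integral through the Beer-Lambert law (1); the function names and the clipping of non-positive samples are illustrative assumptions.

```python
import math
import random

def add_sensor_noise(intensity, a, b, rng):
    """Draw a noisy intensity with mean `intensity` and variance a*intensity + b."""
    return intensity + math.sqrt(a * intensity + b) * rng.gauss(0.0, 1.0)

def noisy_line_integral(p, a, b, rng):
    """Convert a line integral p to I/I0 = exp(-p), add noise, convert back."""
    quotient = add_sensor_noise(math.exp(-p), a, b, rng)
    return -math.log(max(quotient, 1e-12))  # guard against non-positive samples

# Noise parameters of the first synthetic image of Fig. 4, fixed seed.
rng = random.Random(0)
samples = [noisy_line_integral(2.0, 6e-5, 1e-7, rng) for _ in range(2000)]
mean_value = sum(samples) / len(samples)
```

Because the noise is added to the intensity quotient rather than to the line integral, strongly attenuated beams (large p) receive relatively more noise after the logarithm.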
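We can deduce from the experimental results on synthetic scans that the segmentation process preserves the structure of the images and the metric value reliably follows the image quality. In Fig. 5 a quantitative study of the metric and the lung phantom is demonstrated, with different $a$ and $b$ parameter values. We can conclude that with a fixed level of electric noise, the metric values follow the change of the quantum noise level.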

IV. REAL CT TESTS
Experimental results on real CT scans show that the proposed segmentation preserves the structure of the images, and the given metric value is a reliable characterization of image quality. We examined multiple datasets, compared the metric to existing methods, performed a quantitative study based on the patient size and made a comparison to subjective evaluations.

A. Datasets
This study is primarily based on 20 low-dose lung CT scans from Pozitron-Diagnostics Health Centre, Budapest, Hungary.
These scans were recorded with a Siemens Somatom scanner using the same settings, namely the photon beam energy is 57 keV and the tube current is 30 mAs with 130 kVp. Each scan consists of 231 to 298 slices, the resolution of a slice is 512 × 512, and the pixel distance is between 0.0576 and 0.0744 cm. In addition, tests were performed on two further datasets: 50 low-dose lung scans from the public database ELCAP [8], and 20 normal-dose lung scans from the public database LIDC-IDRI [1].
For the scans of Pozitron-Diagnostics Health Centre, the metric values lie between 0.61676 and 0.85809. Fig. 14 gives a comparison of metric values for 8 images from the dataset. Tests on the scans from the ELCAP database give similar results, namely metric values between 0.65505 and 0.85857. The similarity of the results is reasonable, because the two low-dose datasets contain visually similar CT images. Segmentation gives a better result on the normal-dose scans from the LIDC-IDRI database than on the low-dose scans. In this case the metric values lie in a higher range, between 0.82946 and 0.93688, as expected.

B. Comparison to CNR and SNR
We compared the CNR and SNR metrics to the proposed metric. Here we remark that to calculate these metrics, a nearly homogeneous background region and body region need to be selected. Both metrics are very sensitive to the way these regions are selected. We used a semi-automatic region selection based on the segmentation, but per-image supervision was required. Fig. 6 shows the selected ROI for a low-dose test image. There is a strong relation between the metric values and the CNR and SNR values: the Pearson linear correlation coefficient is 0.86 in both cases. Fig. 7 shows the CNR values against the metric values.
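For reference, the two ROI-based metrics can be sketched as below. Several definitions of SNR and CNR are in use; the ones shown here (mean of the body ROI, or the difference of the ROI means, divided by the standard deviation of the background ROI) are a common choice and are an assumption, not necessarily the definitions of [2]. The ROI values are hypothetical and assumed to be shifted so that they are non-negative.

```python
import statistics

def snr(body_roi, background_roi):
    """Mean signal of the body ROI over the background noise level."""
    return statistics.mean(body_roi) / statistics.pstdev(background_roi)

def cnr(body_roi, background_roi):
    """Absolute difference of the ROI means over the background noise level."""
    contrast = abs(statistics.mean(body_roi) - statistics.mean(background_roi))
    return contrast / statistics.pstdev(background_roi)

# Hypothetical ROI pixel values.
body = [101, 101, 101, 101]
background = [0, 2, 0, 2]
snr_value = snr(body, background)
cnr_value = cnr(body, background)
```

Both values depend directly on which pixels fall into the two ROIs, which is why the region selection has such a strong influence on them.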

C. Quantitative study of the metric values
Due to the properties of CT screening, the patient size affects the quality of the scan. During the scans of larger patients the radiation dose is usually increased (e.g. by increasing the tube current or the kVp) to maintain image quality. With fixed tube current and kVp, we can expect worse image quality for larger patients. As discussed before, the scans from Pozitron-Diagnostics Health Centre have the same recording parameters, except for the pixel distance and the number of slices. These parameters are set individually for each patient, according to the size of the body. Consequently, for these CT scans, an image quality metric should correlate with the patient size. The test results show that the largest patients have the lowest metric values (see Fig. 14). We studied this effect on both real and synthetic images.
Before a CT scan, the weight of the patient is measured, and this parameter is stored in the file headers. The weight can be considered as a rough characterization of the patient size. We compared the metric values to the weight parameters; the Pearson linear correlation coefficient of 0.62 indicates a relation between these two properties. We note that heavy but tall patients may have a small thoracic region. A better description should consider other properties as well, for instance the height of the patients, but this parameter is not available. To give a better characterization of the patient size, we measured the body area on the selected slices. Using the segmentation, we counted the pixels which, according to their intensities, belong to muscles, fat, internal organs or bones. Then we scaled this area with the pixel distances. The Pearson linear correlation coefficient between the measured area and the metric values is −0.88, which indicates a strong relation. Fig. 8 presents this relation, showing the metric values against the measured body area.
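The body area measurement and the correlation analysis can be sketched as follows. The cluster label names are hypothetical placeholders for the clusters of Table I, and scaling the pixel count by the squared pixel distance is an assumption about how the area was scaled.

```python
import math

def body_area_cm2(labels, body_labels, pixel_distance_cm):
    """Area covered by the pixels whose cluster label is in body_labels."""
    count = sum(1 for row in labels for label in row if label in body_labels)
    return count * pixel_distance_cm ** 2

def pearson(xs, ys):
    """Pearson linear correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical 2 x 2 label image from the segmentation step.
labels = [["air", "fat"], ["muscle", "lung"]]
area = body_area_cm2(labels, {"fat", "muscle", "organ", "bone"}, 0.5)
```

Given one area and one metric value per scan, `pearson(areas, metric_values)` returns the coefficient of the kind reported above; a value close to −1 means that larger body areas go with lower metric values.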
We simulated different patient sizes with the lung phantom. At a fixed level of noise, namely with noise parameters $a = 6 \cdot 10^{-5}$ and $b = 10^{-8}$, we performed two simulations. First we enlarged the phantom with a fixed pixel distance, then with the original phantom we generated synthetic images with different pixel distances. We remark that the two simulations are similar; the main difference is that in the first case the phantom is cropped to the viewing area (see Fig. 9), while in the second case the viewing area is enlarged and the whole phantom is visible (see Fig. 10). The generated synthetic images are still similar to the real CT scans, and with a larger phantom or a bigger pixel distance we can observe a higher level of noise, as expected. The metric values follow the size: the Pearson linear correlation coefficients are nearly −1 in both cases. Fig. 11 and Fig. 12 show the metric values against pixel distances and size, respectively. Each value is calculated as an average of 20 measurements. We can conclude that the proposed metric handles the quality degradation caused by the patient size well.

D. Subjective quality evaluation
To interpret and discuss the measurement results, we transformed the metric values into scores from 1 to 5. As mentioned before, the SSIM values are always between −1 and 1, and in practical cases between 0 and 1. In our case the metric values are above 0.6 for real CT images and above 0.7 for synthetic images. Considering this, we transformed the [0.6, 1) interval into integer scores from 1 to 5. For low-dose images the lowest score is therefore 1, the highest is 4, and most of the images get score 3. For normal-dose images the lowest score is 3, the highest is 5, and most of the images get score 4.
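One possible form of this transformation is sketched below. The text only fixes the interval and the score range; equal-width bins are an assumption, although one that reproduces the score ranges reported for the low-dose and normal-dose datasets.

```python
import math

def metric_to_score(value, low=0.6, high=1.0, levels=5):
    """Map a metric value in [low, high) to an integer score 1..levels.

    Equal-width bins are assumed; values outside the interval are clipped.
    """
    width = (high - low) / levels
    score = math.floor((value - low) / width) + 1
    return min(max(score, 1), levels)

# Extreme metric values reported for the low-dose and normal-dose datasets.
low_dose_scores = [metric_to_score(v) for v in (0.61676, 0.85809)]
normal_dose_scores = [metric_to_score(v) for v in (0.82946, 0.93688)]
```

With these bins the low-dose extremes map to scores 1 and 4, and the normal-dose extremes to scores 3 and 5.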
Finally, we provide a comparison to subjective quality scores. Two radiologists associated with Pozitron-Diagnostics Health Centre evaluated the 20 low-dose scans the institute provided. They assigned scores between 1 and 4, where 1 means the worst and 4 the best quality. We compared the MOS (mean opinion score) of the two evaluations to the metric values. The Pearson linear correlation coefficient between the MOS and the metric values is 0.62, which indicates a relation between them. Fig. 13 shows the MOS against the metric values. Since the Pearson linear correlation coefficient between the two subjective evaluations is also 0.62, we can conclude that, taking the uncertainties into consideration, the proposed metric values describe the quality well with respect to the subjective evaluations. We note that more reliable results could be obtained with further evaluations including more scans and more radiologists.

V. CONCLUSION
In this paper a no-reference image quality metric for human lung CT scans is presented. The metric construction is based on segmentation, modifying and adjusting the SKFCM algorithm. The modified objective function of SKFCM made it possible to apply this method to lung CT images. Quality measurement was performed with SSIM, comparing the result of the segmentation with the preprocessed original image. The metric was tested and validated with a constructed lung phantom and real CT scans. Synthetic images were created using the lung phantom and noise model, and real low-dose and normal-dose lung CT scans were examined. We performed simulations, quantitative studies and a subjective evaluation as well. Experimental results in each case show that the proposed metric is a good estimate of image quality.
The results presented in this paper are preliminary and further clinical evaluation is required.
The possible applications of the metric include measuring and comparing image enhancement methods, optimizing the parameters of the CT process or the settings of the CT scanners.For instance, tube current and voltage may be optimized based on the metric, achieving low radiation dose yet good image quality, even in real time during the CT recording.
The method was developed especially for low-dose lung CT scans, considering the attributes of these images. However, if the parameters of the segmentation are adjusted properly, it seems possible to use the metric with other types of CT images or even with other types of medical images. This use needs further research.

APPENDIX A
The table above contains the exact specification of the lung phantom: center coordinates (X and Y), axes, rotation angles (Theta) and intensity levels in HU. The last row stands for the small ellipses inside the third and fourth big ellipses, 100 in each. These small ellipses are placed randomly, and have random axes between 1/512 and 4/512.

Fig. 5: Quantitative study of the metric and the lung phantom with different $a$ and $b$ parameter values. The diagram shows the dependence of the metric values on parameter $a$ for three fixed values of parameter $b$: blue bars belong to $b = 5 \cdot 10^{-8}$, green bars to $b = 10^{-7}$, red bars to $b = 2 \cdot 10^{-7}$. Each value is calculated as an average of 20 measurements. With a fixed level of electric noise, the metric values follow the change of the quantum noise level.

Fig. 6: ROI used to calculate CNR and SNR, white rectangle for background, black rectangle for body ROI.

Fig. 8: Correlation between the metric values and the measured body area.

Fig. 13: Correlation between the metric values and the subjective MOS.

Fig. 14: Low-dose CT scans with different quality and the metric values.(Pozitron-Diagnostics Health Centre)

TABLE I: Cluster prototypes