Spectral Shaping of the Reconstruction Noise in a Backward-Prediction ADPCM Coder

Subject areas: Electrical and Computer Engineering

1 - Hamedan University of Technology
2 - Shahid Beheshti University

Received: 1394/09/09; Accepted: 1394/09/09; Published: 1393/06/30

Keywords: backward adaptive prediction, noise spectral shaping, speech coding, ADPCM coder, perceptual quality of the speech signal

Abstract:

The main idea in ADPCM coding is to remove the redundancies of the speech signal before quantization. The reconstruction error of this coder has a low level, but an important characteristic is that its spectrum is flat. This work applies an all-zero filter to the backward-prediction ADPCM coder to shape the spectrum of the reconstruction noise, with the aim of improving the perceptual quality of the reconstructed signal. Doing so yields a useful compromise between the energy level and the spectral shape of the reconstruction noise. The results show an improvement in the perceptual quality of the reconstructed signal (higher PESQ score) at the cost of an increase in the reconstruction noise energy (lower SNR).
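The noise-feedback idea summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the fixed-step uniform quantizer, the NLMS predictor update, the function name `nfc_adpcm`, and the single feedback tap of 0.9 are all assumptions chosen for illustration. The sketch only demonstrates that feeding past quantization errors back through an all-zero filter F(z) makes the reconstruction error equal to the quantization error filtered by 1 - F(z).

```python
import math

def nfc_adpcm(samples, f, order=4, step=0.02, mu=0.05):
    """Toy backward-adaptive ADPCM loop with all-zero noise feedback.

    samples: input signal; f: feedback taps f_1..f_K (hypothetical values).
    Returns (reconstructed signal, raw quantization error q).
    """
    w = [0.0] * order        # predictor weights, adapted from decoded data only
    hist = [0.0] * order     # past reconstructed samples, newest first
    qhist = [0.0] * len(f)   # past quantization errors, newest first
    rec, q = [], []
    for x in samples:
        pred = sum(wi * hi for wi, hi in zip(w, hist))
        # Quantizer input: prediction residual minus filtered past errors.
        d = x - pred - sum(fk * qk for fk, qk in zip(f, qhist))
        dq = step * round(d / step)   # uniform quantizer (assumed fixed step)
        q.append(dq - d)
        rec.append(pred + dq)
        # Backward adaptation: NLMS update using only dq and past
        # reconstructed samples, which the decoder also has.
        norm = sum(h * h for h in hist) + 1e-6
        w = [wi + mu * dq * hi / norm for wi, hi in zip(w, hist)]
        hist = ([rec[-1]] + hist)[:order]
        qhist = ([q[-1]] + qhist)[:len(f)]
    return rec, q

# Usage: a synthetic two-tone test signal (assumed, not speech data).
signal = [0.5 * math.sin(2 * math.pi * 0.03 * n)
          + 0.1 * math.sin(2 * math.pi * 0.31 * n) for n in range(400)]
rec_flat, q_flat = nfc_adpcm(signal, f=[])        # no shaping: error equals q
rec_shaped, q_shaped = nfc_adpcm(signal, f=[0.9]) # error equals q[n] - 0.9*q[n-1]
```

With `f=[]` the reconstruction error is just the quantization error, which tends to be spectrally flat. With a nonzero tap the error is reshaped by 1 - F(z); if the quantization error were white, its power would grow by roughly a factor of 1 + 0.9^2, which is consistent with the lower SNR the abstract reports as the price of shaping.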

