Manuscript received September 21, 2024; revised October 27, 2024; accepted November 15, 2024; published December 27, 2024.
Abstract—Driven by Artificial Intelligence (AI), speech information processing technology is developing rapidly. Within it, speech synthesis technology can produce high-fidelity speech output for specified speakers and content, and has broad application prospects in human-computer interaction, pan-entertainment, and other fields. Traditional audio synthesis methods rely on substantial manpower and computing power for post-editing and trimming, and their synthesis quality is limited, which has become a challenge restricting the development of speech synthesis technology. This paper first introduces sound signals and sound processing as the theoretical basis for audio synthesis. Second, it discusses the format, parameters, and structure of audio files. It then reviews the research progress and practical methods of neural networks in the field of sound synthesis, taking music as the main research direction. Finally, it discusses the prospects for applying AI in this field. Research on the application of AI in sound synthesis is expected to change traditional sound synthesis methods, improve audio quality and audibility, and promote innovation and development in audio technology.
Keywords—speech synthesis, audio processing, artificial intelligence, neural networks, application prospects
Cite: Boan Huang, "The Application and Prospect of AI Technology in Audio Synthesis," International Journal of Engineering and Technology, vol. 16, no. 4, pp. 261-268, 2024.
Copyright © 2024 by the authors. This is an open access article distributed under the Creative Commons Attribution License which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited (CC BY 4.0).