A New Direction
Since my first post, I have narrowed down what I hope to accomplish in my project. This post describes my new goal and how I plan to accomplish it.
My new goal is to create a user-friendly program that analyzes vowel formants in an audio signal and presents to the user the IPA symbol of the vowel being produced. The program will consider the most commonly found vowels in English: /i/, /ɪ/, /ɜ/, /ɝ/, /æ/, /u/, /o/, /ʌ/, /ə/, /ɚ/, and /ɑ/. If possible, I would love to also be able to analyze the English diphthongs, /eɪ/, /aɪ/, /oʊ/, /ɔɪ/, and /aʊ/. The diphthongs will be more challenging, since the vowel formants change partway through these sounds.
The program should also present the fundamental frequency of the sung or spoken vowel. That way, a singer can see on the screen both the vowel they’re singing and the pitch of that vowel.
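As a rough illustration of how the fundamental frequency could be estimated from a short voiced recording, here is a minimal sketch using autocorrelation. This is only one common approach and not necessarily the method I will end up using. The function name `estimate_f0` and the search range of 75 to 1000 Hz are my own assumptions, and the input here is a synthetic tone rather than a real recording.

```python
import numpy as np

def estimate_f0(signal, sample_rate, fmin=75.0, fmax=1000.0):
    """Estimate the fundamental frequency (Hz) of a voiced frame by autocorrelation.

    fmin and fmax are assumed bounds on the pitch range to search.
    """
    x = np.asarray(signal, dtype=float)
    x = x - x.mean()  # remove any DC offset
    # Full autocorrelation; keep only the non-negative lags.
    corr = np.correlate(x, x, mode="full")[len(x) - 1:]
    # Only consider lags that correspond to a plausible pitch.
    min_lag = int(sample_rate / fmax)
    max_lag = min(int(sample_rate / fmin), len(corr) - 1)
    best_lag = min_lag + int(np.argmax(corr[min_lag:max_lag + 1]))
    return sample_rate / best_lag

# Synthetic test tone: a 220 Hz fundamental plus one harmonic.
sample_rate = 16000
t = np.arange(0, 0.1, 1 / sample_rate)
tone = np.sin(2 * np.pi * 220 * t) + 0.5 * np.sin(2 * np.pi * 440 * t)
f0 = estimate_f0(tone, sample_rate)
print(f"Estimated F0: {f0:.1f} Hz")
```

Because the lag is a whole number of samples, the estimate is only approximate; a real recording would also need the voiced portion to be selected first.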
I’ve been studying phonetics and phonology this semester, and the tools and methods I’ve learned will be very helpful in this project. Acoustic phonetics studies the acoustic signal of speech, and one area of it considers the formants produced when speaking different vowels. When a speaker produces a voiced sound, the fundamental frequency gives the pitch that we hear; for example, this is the pitch a singer matches when singing a note of a song. The peaks in loudness at other, higher frequencies are the formants of the vowel sound, and the frequencies at which these formants occur determine the quality of the vowel. The most important formants to consider are the first and second, which correlate with where in the vowel space the sound is being produced and therefore correspond to particular vowels: the first formant (F1) is higher for lower (more open) vowels, and the second formant (F2) is higher for vowels produced further front. The third formant (F3) mostly corresponds to the rhoticity of a vowel. These first three formants, especially the first and second, will be the main parameters I will look at to distinguish the vowels. More information about this can be found here: http://hyperphysics.phy-astr.gsu.edu/hbase/Music/vowel.html
Updated Methods and Tools
I will be using Praat to extract the formants from the audio recordings. Praat allows a user to record and/or upload audio, and from this audio, the program analyzes the acoustic signal and creates a spectrogram showing the formants. One feature of the program even superimposes lines to highlight where the formants are, and using tools in Praat one can read off the numeric value (in Hz) of each formant. After recording vowel sounds, I will use these formant values to set the cutoffs in my program that tell the user which vowel is being produced.
I will use a Jupyter notebook to create an accessible interface where a user can upload an audio file and easily find out the IPA representation and fundamental frequency of the vowel in it. The Jupyter interface should make the program easier to use for people unfamiliar with Python.
I plan on writing the necessary functions in Python. These functions will compare the measured formant frequencies against a dictionary of reference frequency values and tell the user which phoneme is being produced.
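To make the dictionary idea concrete, here is a minimal sketch of one way the lookup could work. Instead of hand-set cutoffs, it picks the reference vowel whose F1 and F2 are nearest to the measured values, which is a stand-in technique rather than my final design. The names `REFERENCE_FORMANTS` and `classify_vowel` are hypothetical, and the reference numbers are placeholders, roughly in line with commonly cited adult-male averages; they would be replaced by the values I measure from my own recordings.

```python
import math

# Placeholder (F1, F2) reference values in Hz. These are assumptions for
# illustration only and should be replaced with measured values.
REFERENCE_FORMANTS = {
    "i": (270, 2290),
    "ɪ": (390, 1990),
    "æ": (660, 1720),
    "ɑ": (730, 1090),
    "ʌ": (640, 1190),
    "u": (300, 870),
}

def classify_vowel(f1, f2, reference=REFERENCE_FORMANTS):
    """Return the IPA symbol whose reference (F1, F2) is closest to the input.

    Distance is measured on log-frequency ratios so that F1 and F2
    differences are compared proportionally rather than in raw Hz.
    """
    def distance(symbol):
        ref_f1, ref_f2 = reference[symbol]
        return math.hypot(math.log(f1 / ref_f1), math.log(f2 / ref_f2))

    return min(reference, key=distance)

print(classify_vowel(290, 2200))  # a high front vowel
print(classify_vowel(700, 1100))  # a low back vowel
```

Comparing log ratios is a design choice on my part: a 100 Hz difference matters much more for F1 than for F2, so raw Hz distances would let F2 dominate. Rhotic vowels would additionally need F3 in the comparison.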
To extract the formant values from Praat, I plan on using existing code from GitHub: https://github.com/mwv/praat_formants_python
The above program interacts with Praat to find the formants at a particular point in time or over an interval.
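Once formant measurements over an interval are available, they still need to be reduced to a single F1, F2, and F3 for the vowel. The sketch below shows one possible way to do that, taking the median of each formant inside a time window. It assumes the measurements arrive as rows of (time, F1, F2, F3); that layout, the function name `summarize_formants`, and the example numbers are all my own assumptions for illustration, not something taken from the package.

```python
import numpy as np

def summarize_formants(track, start, end):
    """Median F1, F2, F3 (Hz) over rows whose time lies in [start, end].

    `track` is assumed to hold rows of (time_in_seconds, F1, F2, F3).
    """
    track = np.asarray(track, dtype=float)
    times = track[:, 0]
    in_window = (times >= start) & (times <= end)
    if not in_window.any():
        raise ValueError("no formant measurements in the requested interval")
    # The median is less sensitive than the mean to an occasional
    # mistracked frame.
    return np.median(track[in_window, 1:4], axis=0)

# Made-up measurements for illustration; the last row is outside the window.
example_track = [
    [0.10, 280, 2250, 3000],
    [0.11, 275, 2300, 3050],
    [0.12, 270, 2290, 3010],
    [0.13, 900, 1500, 2500],
]
f1, f2, f3 = summarize_formants(example_track, 0.10, 0.12)
print(f1, f2, f3)
```

For a diphthong, the same function could be called on the first and second halves of the vowel separately, since the formants change partway through the sound.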