
This is it!

  • Writer: Kavan Mehta
  • Jan 30, 2023

Time to start the Final Product! I think combining the video and audio of speech to recognize and understand it more efficiently could be a great problem to solve. I am currently asking some of the professionals I have interviewed about mentorship. This week I plan to conduct independent research on semantics and on using both lip movements and audio segments (converted into spectrograms) to improve recognition. Furthermore, many mainstream algorithms, such as linear regression, logistic regression, decision trees, and deep neural networks (feed-forward and recurrent), could be applied to the various sounds and speech signals.
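
As a rough sketch of what converting an audio segment into a spectrogram can look like, here is a minimal short-time Fourier transform written with NumPy only. The function name `spectrogram`, the frame length, the hop size, and the synthetic 440 Hz tone standing in for a real speech clip are all illustrative assumptions, not part of any specific project or library.

```python
import numpy as np

def spectrogram(signal, frame_len=256, hop=128):
    """Magnitude spectrogram via a simple short-time Fourier transform.

    Hypothetical helper for illustration; frame_len and hop are arbitrary choices.
    """
    window = np.hanning(frame_len)  # taper each frame to reduce spectral leakage
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([
        signal[i * hop : i * hop + frame_len] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real-valued signal
    return np.abs(np.fft.rfft(frames, axis=1)).T  # shape: (freq_bins, time_frames)

# Synthetic stand-in for a speech clip: 1 second of a 440 Hz tone sampled at 8 kHz
sr = 8000
t = np.arange(sr) / sr
tone = np.sin(2 * np.pi * 440 * t)

spec = spectrogram(tone)
print(spec.shape)  # rows are frequency bins, columns are time frames
```

The resulting 2-D array can be treated like an image, which is one reason spectrograms are a common input format for neural networks working on audio.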


I also think that tutorials and published academic research on this topic could improve my understanding and help me work toward a machine-learning solution. Apart from research, I hope to draw on my future mentor's experience to better understand the speech recognition problem as a whole and to frame it so that I can address the root causes of recognition errors, shifting my focus toward lip movements or audio depending on the results on the training data. Working with simplified algorithms first can also help me see where the issue lies, and I am hoping to start my Final Product implementation next week.


So see you next week, same place, same time.

 
 
 


© 2022 by Kavan Mehta                 Independent Study and Mentorship Program
