OpenAI Whisper Applications

Browse applications built on OpenAI Whisper technology. Explore PoC and MVP applications created by our community and discover innovative use cases for OpenAI Whisper technology.


In today's increasingly digitized world, seamless interaction with personal computing devices is crucial for efficiency and productivity. However, a significant segment of the population faces substantial barriers in accessing and utilizing these essential tools due to physical disabilities or special needs. These barriers range from the inability to operate a keyboard due to conditions such as paralysis, broken limbs, or other motor impairments, to cognitive challenges that complicate the use of standard interfaces. This disparity not only hampers individual progress but also represents a broader societal failure to include all individuals in the digital revolution. Our startup, twinzadm, aims to bridge this accessibility gap by offering an innovative voice assistant tailored for PC users with special needs and physical disabilities. Leveraging advanced voice recognition and natural language processing technologies, twinzadm provides a hands-free, intuitive solution for tasks such as opening and closing installed programs, as well as conducting searches on platforms like Google, YouTube, and ChatGPT. Upon its first initialization, twinzadm scans the user's device to create a customized profile of available applications, thus allowing for a more personalized user experience. By transforming auditory input into actionable text commands, and vice versa, twinzadm offers a two-fold communication model. This dual nature ensures not only that the commands are executed effectively but also that the user receives understandable and audible feedback, closing the loop of interaction. In summary, twinzadm tackles the pressing issue of digital accessibility, enabling a marginalized segment of the population to participate more fully in today's technology-driven society. Our solution addresses the complex challenges of adapting human-computer interaction to the diverse needs of users, thereby elevating the standard of inclusive technology.
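As a rough illustration of how a transcribed utterance could be turned into one of the actions described above, the sketch below maps text to open, close, and search commands. It assumes speech has already been converted to text. The function name `parse_command`, the `SEARCH_URLS` table, and the phrase patterns are hypothetical and are not taken from twinzadm itself.

```python
import re
from urllib.parse import quote_plus

# Hypothetical search endpoints; the real assistant may use different ones.
SEARCH_URLS = {
    "google": "https://www.google.com/search?q={}",
    "youtube": "https://www.youtube.com/results?search_query={}",
}


def parse_command(text, installed_apps):
    """Turn a transcribed utterance into an (action, target) pair.

    `installed_apps` stands in for the application profile built on first run.
    Returns ("unknown", original text) when no pattern matches.
    """
    cleaned = text.strip().lower().rstrip(".!?")

    # Patterns such as "open notepad" or "close notepad".
    match = re.fullmatch(r"(open|close)\s+(.+)", cleaned)
    if match and match.group(2) in installed_apps:
        return match.group(1), match.group(2)

    # Patterns such as "search youtube for cat videos".
    match = re.fullmatch(r"search\s+(\w+)\s+for\s+(.+)", cleaned)
    if match and match.group(1) in SEARCH_URLS:
        url = SEARCH_URLS[match.group(1)].format(quote_plus(match.group(2)))
        return "search", url

    return "unknown", text
```

Returning an explicit "unknown" action gives the spoken-feedback side of the loop something concrete to report when a command is not understood.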



An interactive AI system that teaches preschool children how to pronounce the alphabet with different diacritics brings together the best of both worlds: cutting-edge AI technology and effective language learning techniques. By combining the interactive nature of AI technology with the teaching of diacritic-enhanced alphabet pronunciation, this system aims to revolutionize the way young children learn and engage with language. Learning the alphabet is a fundamental step in a child's early education. However, the introduction of diacritics, which are marks or symbols added to letters to indicate specific phonetic sounds, can be challenging for young learners. Traditional teaching methods often struggle to make this aspect of language learning engaging and enjoyable for children. This is where the interactive AI system comes in. With the help of AI, the system can provide a dynamic and personalized learning experience. The AI system is designed to adapt to each child's individual needs and progress, ensuring that they receive targeted instruction and support. Through interactive exercises, games, and activities, the system engages children in a fun and immersive learning environment. By incorporating diacritics into the alphabet pronunciation lessons, the system not only teaches children how to recognize and pronounce different letters but also exposes them to the nuances of phonetic sounds. This fosters their linguistic skills by helping them develop a more accurate and nuanced understanding of language. Moreover, by introducing diacritics from various cultures and languages, the system promotes cultural understanding and appreciation from an early age. The interactive AI system utilizes advanced speech recognition technology to assess and provide feedback on a child's pronunciation. Through this instant feedback loop, children can correct their pronunciation and refine their skills, building confidence and improving their overall language proficiency.

Arabic AI


VisionAI is a program that enhances web accessibility for the visually impaired. It uses artificial intelligence and voice technologies to transform web content into audio output and allow voice interaction with web pages. It analyzes images and text on web pages and converts them into audio output using a synthetic voice. It also allows the user to interact with web pages using voice input, which is transcribed into text and processed by a natural language model. VisionAI can handle various web tasks, such as searching, browsing, reading, and filling in forms. VisionAI leverages GPT-4, a large multimodal model that accepts image and text inputs and emits text outputs, for image analysis and natural language generation. VisionAI also uses OpenAI Whisper, a speech recognition model that transcribes speech into text. For voice output, VisionAI uses espeak, a speech synthesizer that supports various languages and accents. VisionAI integrates these technologies into a seamless and user-friendly web accessibility program that aims to make web browsing more accessible and enjoyable for the visually impaired. VisionAI is not only a program, but also a vision for the future of accessibility. We believe that VisionAI can help the visually impaired overcome the barriers and challenges of accessing the web, and enable them to participate fully and equally in the digital world.
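The step of turning a page into something a synthesizer can read aloud can be sketched with Python's standard `html.parser` module. This is only a minimal sketch under stated assumptions: the class `SpeakableTextExtractor` and the function `page_to_speech_text` are hypothetical names, and image `alt` text stands in for the image analysis that the description attributes to GPT-4.

```python
from html.parser import HTMLParser


class SpeakableTextExtractor(HTMLParser):
    """Collect visible text and image descriptions in reading order."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag == "img":
            # Assumption: alt text replaces a model-generated image description.
            alt = dict(attrs).get("alt")
            self.chunks.append(f"Image: {alt}" if alt else "Image without description")

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Collapse whitespace and ignore text inside script/style elements.
        text = " ".join(data.split())
        if text and self._skip_depth == 0:
            self.chunks.append(text)


def page_to_speech_text(html):
    """Return one string that a speech synthesizer could read aloud."""
    parser = SpeakableTextExtractor()
    parser.feed(html)
    parser.close()
    return ". ".join(parser.chunks)
```

The returned string would then be passed to the speech synthesizer, which the description names as espeak.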

Stellar Pointers

World needs Talents

Scouting New Talents in Football Based on Stats

Football is a global sport with a huge fan base. Every year, millions of young athletes dream of playing professional football. However, only a small percentage of these athletes will ever make it to the pros.

One of the biggest challenges for young athletes is getting noticed by scouts. Scouts are constantly looking for new talent, but they can only watch so many games. This means that many talented athletes are overlooked.

Our project uses data science to identify and scout new talent in football. We collect data on a player's stats, such as passing accuracy, rushing yards, and tackles. We then use this data to create a player profile that identifies the player's strengths and weaknesses. We also use this data to predict a player's potential value to a team.

Our project has the potential to revolutionize the way football is scouted. By using data science, we can identify new talent that would otherwise have been overlooked. This could lead to more competitive and exciting football leagues, and give more athletes the opportunity to reach their full potential.

Business model: We sell our data to clubs that want to know the strengths and weaknesses of each player in their team; to clubs in higher leagues that want to find players who perform at a professional level without costing much money; to clubs in lower leagues that want to market their players to other clubs and leagues; and to individual players without a team who want to showcase their talent to the world.

Our customers: clubs, coaches, academies, and players.

Codex, Cohere, Whisper, Stable Diffusion, GPT-3, GPT-4


Tired of struggling to turn long audio recordings into written text? Look no further! Soma is here to solve that problem. With Soma, you can easily convert audio to text and even translate it into different languages. But there's more! Soma can summarize the audio's content and offers a chat AI where you can ask questions about the audio. And now, let's see our product demo.

Why is Soma a good investment? Because the market it targets is huge. Soma focuses on two groups: English speakers, who number around 1.35 billion people worldwide, and Arabic speakers, who number about 480 million. If we can get just 5% of each language group to use our app, that's approximately 67.5 million English-speaking users and 24 million Arabic-speaking users, for a total of 91.5 million potential users. The demand for Soma's services is massive.

Now, let's discuss the business model. We use a subscription model with three plans: Starter Plan (free), Premium Plan, and Ultimate Plan. You can pause the video to see the details of each plan and what it provides. It's a simple and straightforward way to access our app's features.

Our team consists of four individuals, each with deep knowledge in their specific field. We are confident that our team's skills will drive Soma to great success in the audio conversion and translation industry.

Invest in Soma today and join us on this incredible journey to revolutionize audio conversion and translation. With our wide reach, attractive business model, and talented team, Soma is poised for remarkable achievements. Thank you for considering Soma, and we hope you have a fantastic day!
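The market estimate in the pitch is simple arithmetic, and it can be checked with a few lines of Python. The speaker counts and the 5% adoption rate are the figures quoted above; the function name `potential_users` is only an illustrative choice.

```python
# Speaker counts and adoption rate as quoted in the pitch.
SPEAKERS = {"English": 1_350_000_000, "Arabic": 480_000_000}
ADOPTION_RATE = 0.05


def potential_users(speakers, rate):
    """Return per-language user estimates and their total."""
    per_language = {lang: round(count * rate) for lang, count in speakers.items()}
    return per_language, sum(per_language.values())


per_language, total = potential_users(SPEAKERS, ADOPTION_RATE)
print(per_language, total)
```

The estimate scales linearly with the adoption rate, so a more conservative rate of 1% would give one fifth of these figures.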


AVCB Chatbot

OAN is a Discord bot that uses the Bard large language model to provide a variety of features, including:
- Real-time translation: OAN can translate messages into any language, making it easy to communicate with people from all over the world.
- Voice note recognition: OAN can understand voice notes, so you can send commands or ask questions without having to type.
- Text-to-speech: OAN can convert the response from Bard into a voice note, so you can listen to it instead of reading it.

These features make OAN a powerful tool for communication and collaboration on Discord. Whether you're working on a project with people from different countries or just want to chat with friends in other languages, OAN can help you connect with people from all over the world.

Here are some additional benefits of using OAN:
- It's easy to use. You can just type in a command or ask a question, and OAN will do its best to understand you.
- It's accurate. OAN uses the latest language models to provide accurate translations and responses.
- It's versatile. You can use OAN for a variety of tasks, including translation, research, and creative writing.

If you're looking for a powerful and versatile Discord bot, OAN is a great option.

Here are some examples of how OAN can be used:
- Translate messages: You can use OAN to translate messages into any language. This is a great way to communicate with people from other countries or to learn a new language.
- Answer questions: You can ask OAN questions about anything. It can provide you with information on a variety of topics, including history, science, and pop culture.
- Generate creative content: You can use OAN to generate creative content, such as poems, stories, and code. This is a great way to get your creative juices flowing.


Pharma X

Pharma X is an innovative healthcare solution that utilizes advanced AI algorithms to transform the prescription process. By analyzing patient data, including medical history and medication records, it generates tailored and safe prescriptions, optimizing treatment plans and enhancing patient care. Traditional prescription writing can be cumbersome and prone to errors. Pharma X streamlines this process by integrating comprehensive patient data and leveraging AI algorithms to generate accurate and personalized prescriptions. With its ability to quickly analyze vast amounts of medical information, Pharma X provides healthcare providers with efficient and reliable prescription generation. One of the key benefits of Pharma X is its ability to mitigate potential drug interactions. By cross-referencing current medications with prescribed medicines, it ensures patient safety and reduces the risk of adverse effects. This advanced feature sets Pharma X apart, providing healthcare providers with the confidence that the prescribed medications are safe and suitable for the patient's unique medical profile. Pharma X also enables personalized care by considering individual factors such as age, gender, medical history, and allergies. This tailored approach optimizes treatment effectiveness and patient satisfaction. By providing precise dosage instructions and general advice, Pharma X empowers patients to take an active role in their healthcare journey. In addition to enhancing patient care, Pharma X offers time and cost savings. Its streamlined workflow and efficient prescription generation save valuable time for healthcare providers, allowing them to focus more on patient care. By preventing medication errors and improving patient outcomes, Pharma X also contributes to reducing healthcare costs in the long run.
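The cross-referencing of current medications against newly prescribed ones can be pictured as a lookup over unordered drug pairs. The sketch below is illustrative only and is not Pharma X's implementation: the `INTERACTIONS` table and the function `find_interactions` are hypothetical, and the two sample entries are merely placeholders for a maintained clinical database.

```python
# Hypothetical interaction table: unordered drug pairs mapped to a warning.
# A real system would query a maintained clinical database instead.
INTERACTIONS = {
    frozenset({"warfarin", "aspirin"}): "increased bleeding risk",
    frozenset({"sildenafil", "nitroglycerin"}): "severe drop in blood pressure",
}


def find_interactions(current_meds, new_prescriptions, table=INTERACTIONS):
    """Return (current, new, warning) for every flagged pair."""
    warnings = []
    for current in sorted({m.lower() for m in current_meds}):
        for new in sorted({m.lower() for m in new_prescriptions}):
            # frozenset makes the lookup independent of pair order.
            warning = table.get(frozenset({current, new}))
            if warning:
                warnings.append((current, new, warning))
    return warnings
```

Keying the table on `frozenset` pairs means each interaction is stored once, regardless of which drug the patient is already taking and which is newly prescribed.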

MediSoft Health


The concept revolves around "Aoun," a personal assistant designed to enhance productivity and efficiency in meetings. Each year, countless hours are wasted in inefficient meetings, a problem Aoun aims to mitigate. It reduces the active time of meetings by providing comprehensive summaries, reducing a 60-minute meeting to a 5-minute summary. This tool automatically registers tasks derived from the meeting context, attributing each to the corresponding employee. It also offers an analytical tool that enables users to better understand their meetings while ensuring high security. Aoun functions by taking voice inputs, which are fed into the Whisper AI for processing, and then summarized using the GPT model. Our future vision includes potential integrations with platforms like Microsoft Office. Despite international competition from companies like Fireflies and Microsoft, Aoun offers distinct features like support for all languages, not only English. The current market is estimated at $3 billion in 2023, with projections to reach $14 billion by 2030. Particularly in the Middle East market, there is a glaring absence of investors and competitors, which represents a golden opportunity. We offer three packages: a free one for trial, a standard one for professionals and freelancers, and a premium package for corporations. We need support in further refining our voice recognition capabilities, upgrading our computational power, and establishing better AI connections. As a demonstration of Aoun's capabilities, it was able to summarize this entire pitch on stage into less than 100 words in 11 minutes, while also identifying key tasks and assessing the meeting's effectiveness. This is the transformative power of Aoun.
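The two-stage flow described above, speech-to-text followed by summarization, can be sketched as a small function that takes both stages as parameters. This is a sketch under stated assumptions rather than Aoun's actual code: `summarize_meeting` and `MeetingReport` are hypothetical names, and the `transcribe` and `summarize` callables stand in for calls to Whisper and a GPT model.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class MeetingReport:
    transcript: str
    summary: str


def summarize_meeting(
    audio_path: str,
    transcribe: Callable[[str], str],
    summarize: Callable[[str], str],
) -> MeetingReport:
    """Run speech-to-text first, then pass the transcript to the summarizer."""
    transcript = transcribe(audio_path)
    if not transcript.strip():
        raise ValueError("empty transcript; nothing to summarize")
    return MeetingReport(transcript=transcript, summary=summarize(transcript))
```

Passing the two stages in as callables keeps the flow testable with simple stand-ins and makes it possible to swap either model without touching the pipeline.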


Spark Education

Our project is motivated by Saudi Arabia's Vision 2030 and its goals of improving quality of life, empowering digital transformation, and applying artificial intelligence. Guided by this vision, we are determined to revolutionize education and ensure that every learner has access to high-quality and tailored educational experiences. By harnessing the power of artificial intelligence, we aim to overcome the limitations of traditional education systems and unlock the full potential of each learner. We strongly believe that our platform has the potential to transform the educational landscape, empowering learners, optimizing learning outcomes, and shaping a more effective and efficient educational system. Our platform analyzes individual learners' preferences, performance, and learning patterns to generate a customized model for each learner. This model adapts the learning content, pace, and teaching style to match the learner's unique needs, creating a truly personalized educational journey. By starting with a personalized virtual instructor and adapting the educational experience to the individual, we enhance learner engagement, motivation, and overall learning outcomes. Key features of our platform include customized learning paths, real-time progress tracking, adaptive assessments, interactive virtual instructors, and personalized recommendations for supplementary materials and resources. Additionally, we offer a comprehensive college readiness program, preparing students for higher education and providing them with valuable insights and guidance.

Spark Team
GPT-4, Whisper, Stable Diffusion

Green AI

The problem we address is the lack of vegetation. In this project, we used artificial intelligence to find the most efficient way to plant trees and slow global warming. The model considers factors including air pollutants, temperature, population density, and the ROI standard, which measures fine particles together with temperature, to find the number of trees that should be planted in any region, neighborhood, or spot in the Kingdom, large or small, based on its geographical location.

We are building a deep-learning model using Long Short-Term Memory (LSTM) networks, a modified version of recurrent neural networks (RNNs), to predict the number of trees that should be planted in each city based on population density, air pollution, and temperature. We used air pollution data collected during COVID-19 for the regions, together with geographic area data. As a starting point, the model was applied to the Dammam region to identify its neighborhoods and determine their pollution levels and temperatures. We then computed the ROI standard and mapped it so that the darker an area or neighborhood appears, the more trees it needs.

Our project also supports the following goals:
- Achieving one of the Sustainable Development Goals by ensuring healthy lives and promoting well-being.
- Accelerating and facilitating one of the goals of Vision 2030 by using artificial intelligence.
- Growing the country's economy.
- Preserving the environment.

We used Python for programming; HTML, CSS, and JavaScript for the interfaces; Flask as the web framework; Colab for the Python code and the model; and Visual Studio to put all the parts of the project together. As a future direction, we aspire to integrate more features into our prediction model to further improve our environment.

Green AI
Whisper, Cohere Neural Search


Problem: Healthcare professionals struggle to document patient consultations accurately because traditional note-taking is time-consuming, error-prone, and effortful. This can result in incomplete or inconsistent records, negatively affecting the quality of care provided.

Target Audience: Health institutions and healthcare professionals.

Market Size: According to recent market research, the global medical transcription market is projected to reach $3.7 billion by 2029. This demonstrates the vast opportunities that lie ahead for our AI transcription tool.

Solution: MediScript is a web-based platform that records conversations between physicians and patients and provides a summary of symptoms, diagnosis, and treatment. MediScript works in two main steps. The first step records the patient-doctor conversation and converts it to text using speech recognition AI; for that, we used OpenAI’s Whisper API. The second step feeds the transcription, along with the patient's information, to a large language model to summarize the symptoms and suggest a diagnosis and treatment. We used Google’s Bard, a conversational generative AI. We feed Bard the patient information, including chronic diseases and allergies, along with the conversation. Bard extracts the symptoms and suggests a diagnosis and its treatment.

Benefits:
- Reduces the time and effort of patient visits.
- Improves transcription accuracy.
- Provides standardized transcriptions.
- Enables better collaboration between healthcare professionals.
- Improves healthcare.

Future Work:
- Integrating ICD-10-AM (6th Edition), a standardized classification system of all diagnoses and symptoms, to provide an accurate diagnosis.
- Using DrugBank, a comprehensive and up-to-date medication database, to ensure accurate prescribing by considering the latest pharmaceutical information and guidelines.
- Supporting multiple languages and dialects.
- Sending a report of the visit to the patient's phone and email.
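The second step, in which the transcription is combined with the patient's chronic diseases and allergies before it reaches the language model, amounts to assembling a prompt. The sketch below shows one possible way to do that. The function name `build_consultation_prompt` and the wording of the prompt are assumptions for illustration and are not taken from MediScript.

```python
def build_consultation_prompt(transcript, chronic_diseases, allergies):
    """Combine patient background and the transcript into one LLM prompt."""

    def listed(items):
        # Make missing information explicit rather than leaving a blank.
        return ", ".join(items) if items else "none reported"

    return "\n".join([
        "Patient background:",
        f"- Chronic diseases: {listed(chronic_diseases)}",
        f"- Allergies: {listed(allergies)}",
        "",
        "Consultation transcript:",
        transcript.strip(),
        "",
        "List the symptoms mentioned, then suggest a diagnosis and treatment "
        "for the physician to review.",
    ])
```

Stating "none reported" explicitly keeps the model from treating an empty field as missing context, and the final instruction frames the output as a suggestion for the physician to review.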

MediSoft Health