
Since 1985, the Wild Dolphin Project (WDP) has studied wild Atlantic spotted dolphins in the Bahamas. Their non-invasive approach, “In Their World, on Their Terms,” has created a valuable dataset of underwater audio, video, and behaviour-linked vocalizations. Over time, researchers have learned to associate specific sounds with certain behaviours. For example, signature whistles help mothers and calves reunite, while click “buzzes” often occur during courtship.
Yet, understanding the full structure of dolphin communication remains a formidable challenge. This is where DolphinGemma comes in. Using audio technologies like Google’s SoundStream tokenizer and drawing on the architecture behind Gemini models, DolphinGemma analyses sequences of dolphin sounds to uncover patterns. Much like language models for human speech, it predicts what comes next, only this time with clicks, whistles, and squawks.
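To make the idea concrete, here is a minimal sketch of that two-stage recipe in generic PyTorch: a neural codec (SoundStream-style) would first turn raw audio into discrete tokens, and a decoder then learns to predict the next token in the sequence. Everything below, from the vocabulary size to the toy model itself, is an assumption for illustration; DolphinGemma’s actual tokenizer, architecture, and weights are not shown here.

```python
# Illustrative sketch only: the real tokenizer and model are not public.
# Step 1 (not shown): a SoundStream-style codec maps raw hydrophone audio to discrete tokens.
# Step 2 (below): an autoregressive decoder learns to predict the next audio token.

import torch
import torch.nn as nn

VOCAB_SIZE = 1024   # assumed size of the discrete audio-token codebook
CONTEXT_LEN = 256   # assumed context window, in tokens

class NextTokenModel(nn.Module):
    """Toy decoder-only model over audio tokens (a stand-in, not DolphinGemma)."""
    def __init__(self, vocab_size: int, dim: int = 128, heads: int = 4, layers: int = 2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.pos = nn.Embedding(CONTEXT_LEN, dim)
        block = nn.TransformerEncoderLayer(d_model=dim, nhead=heads, batch_first=True)
        self.decoder = nn.TransformerEncoder(block, num_layers=layers)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        seq_len = tokens.size(1)
        pos = torch.arange(seq_len, device=tokens.device)
        x = self.embed(tokens) + self.pos(pos)
        # Causal mask so each position only attends to earlier tokens.
        mask = torch.triu(
            torch.full((seq_len, seq_len), float("-inf"), device=tokens.device), diagonal=1
        )
        x = self.decoder(x, mask=mask)
        return self.head(x)  # logits over the next audio token at each position

# Dummy "tokenized whistle/click sequence"; a real pipeline would produce these
# tokens from recorded audio via the neural codec.
tokens = torch.randint(0, VOCAB_SIZE, (1, 64))
model = NextTokenModel(VOCAB_SIZE)
logits = model(tokens[:, :-1])
loss = nn.functional.cross_entropy(logits.reshape(-1, VOCAB_SIZE), tokens[:, 1:].reshape(-1))
print(f"next-token training loss: {loss.item():.3f}")
```

Working at the token level rather than on raw waveforms is what lets a text-style decoder treat clicks, whistles, and squawks like any other sequence.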
Importantly, the model is sized to run directly on Pixel smartphones, which are used in the field. That means researchers can work with AI-powered tools right in the dolphins’ environment. This direct analysis enables faster insights and can help uncover hidden structures in communication that would take humans years to find manually.
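As a rough illustration of what on-device analysis might look like in the field, the sketch below chunks a hydrophone buffer into overlapping windows and scores each one with a stand-in function. The sample rate, window length, and scoring rule are all assumptions, not details of the real Pixel pipeline.

```python
# Hypothetical field-analysis loop: window a recording and flag interesting segments.

import numpy as np

SAMPLE_RATE = 48_000        # assumed hydrophone sample rate (Hz)
WINDOW_S, HOP_S = 2.0, 0.5  # assumed analysis window and hop, in seconds

def score_window(window: np.ndarray) -> float:
    """Stand-in for the on-device model; here, just mean signal energy."""
    return float(np.mean(window ** 2))

def stream_windows(audio: np.ndarray, sr: int = SAMPLE_RATE):
    """Yield (start_time, window) pairs of overlapping windows from an audio buffer."""
    win, hop = int(WINDOW_S * sr), int(HOP_S * sr)
    for start in range(0, len(audio) - win + 1, hop):
        yield start / sr, audio[start:start + win]

# Simulated 10-second recording with a louder burst around the 5-second mark;
# in the field this buffer would come from the phone's audio input instead.
audio = np.random.randn(10 * SAMPLE_RATE).astype(np.float32)
audio[5 * SAMPLE_RATE:7 * SAMPLE_RATE] *= 3.0

for t, window in stream_windows(audio):
    score = score_window(window)
    if score > 2.0:  # arbitrary threshold for flagging a segment for later review
        print(f"flagged segment starting at {t:.1f}s (energy {score:.2f})")
```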
From Listening to Interacting
While decoding natural dolphin sounds is essential, WDP is also exploring two-way interaction. Their CHAT (Cetacean Hearing Augmentation Telemetry) system is designed to establish a shared, simplified vocabulary. It does this by linking synthetic whistles with specific objects dolphins enjoy, like seaweed or scarves.
The system is first demonstrated between human researchers; curious dolphins observing the exchange may then begin mimicking the sounds to request the objects themselves. Early results are promising. With Pixel phones now running real-time analysis, the system can hear, identify, and respond to dolphin whistles faster than ever. Future enhancements set for summer 2025 aim to make this process even smoother by combining predictive modeling and advanced audio hardware in one device.
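One way to picture the identification step is template matching: compare an incoming sound against the known synthetic whistles and pick the closest. The sketch below uses a crude spectral fingerprint and an invented two-item vocabulary purely for illustration; it is not the method CHAT actually uses.

```python
# Illustrative whistle-matching sketch; the whistles, objects, and features are invented.

import numpy as np

SAMPLE_RATE = 48_000  # assumed

def synth_whistle(freq_hz: float, duration_s: float = 0.5) -> np.ndarray:
    """Generate a constant-frequency tone as a stand-in synthetic whistle."""
    t = np.arange(int(duration_s * SAMPLE_RATE)) / SAMPLE_RATE
    return np.sin(2 * np.pi * freq_hz * t).astype(np.float32)

def spectrum(signal: np.ndarray) -> np.ndarray:
    """Normalized magnitude spectrum used as a crude acoustic fingerprint."""
    mag = np.abs(np.fft.rfft(signal))
    return mag / (np.linalg.norm(mag) + 1e-9)

# Hypothetical whistle-to-object vocabulary.
vocabulary = {
    "seaweed": synth_whistle(9_000),
    "scarf": synth_whistle(12_000),
}
templates = {name: spectrum(w) for name, w in vocabulary.items()}

def identify(sound: np.ndarray) -> tuple[str, float]:
    """Return the best-matching object label and its similarity score."""
    s = spectrum(sound)
    scores = {name: float(np.dot(s, tmpl)) for name, tmpl in templates.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Simulate a noisy "mimic" of the scarf whistle and identify it.
mimic = synth_whistle(12_000) + 0.3 * np.random.randn(int(0.5 * SAMPLE_RATE)).astype(np.float32)
label, score = identify(mimic)
print(f"best match: {label} (similarity {score:.2f})")
```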
Sharing the Tools for Global Impact
Looking ahead, Google plans to share DolphinGemma as an open model. While it’s currently trained on Atlantic spotted dolphins, researchers studying other species could fine-tune it for their own needs. This openness means more scientists can analyze their acoustic data with powerful AI support.
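In practice, fine-tuning would likely mean continuing next-token training on a new species’ tokenized recordings. The sketch below shows that loop with a tiny stand-in model; the checkpoint path, data shapes, and hyperparameters are assumptions, since the open release is not yet available.

```python
# Hypothetical fine-tuning sketch: continue next-token training on another species' data.

import torch
import torch.nn as nn

VOCAB_SIZE = 1024  # assumed token codebook size, shared with the pretrained model

class TinyAudioLM(nn.Module):
    """Minimal stand-in for a pretrained audio-token language model."""
    def __init__(self, vocab_size: int = VOCAB_SIZE, dim: int = 128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        hidden, _ = self.rnn(self.embed(tokens))
        return self.head(hidden)

model = TinyAudioLM()
# model.load_state_dict(torch.load("checkpoint.pt"))  # hypothetical pretrained weights

# Stand-in for tokenized recordings of another species (batch of token sequences).
new_species_tokens = torch.randint(0, VOCAB_SIZE, (8, 128))

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
for step in range(3):  # a few illustrative fine-tuning steps
    logits = model(new_species_tokens[:, :-1])
    loss = nn.functional.cross_entropy(
        logits.reshape(-1, VOCAB_SIZE), new_species_tokens[:, 1:].reshape(-1)
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    print(f"step {step}: loss {loss.item():.3f}")
```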
Ultimately, DolphinGemma represents more than a technological breakthrough. It’s a bridge between species, powered by decades of fieldwork, AI innovation, and a shared curiosity about the marine world. Although full understanding may still be far off, we’re now closer than ever to truly talking with dolphins.