Will humans ever learn to speak whale?

2021-06-13

Sperm whales are among the loudest living animals on the planet, producing creaking, knocking and staccato clicking sounds to communicate with other whales that are a few feet to even a few hundred miles away.

This symphony of patterned clicks, known as codas, might be sophisticated enough to qualify as a full-fledged language. But will humans ever understand what these cetaceans are saying?

The answer is maybe, but first scientists will have to collect and analyze an unprecedented number of sperm whale communications, researchers told Live Science.

With brains six times larger than ours, sperm whales (Physeter macrocephalus) have intricate social structures and spend much of their time socializing and exchanging codas. These messages can be as brief as 10 seconds, or last over half an hour. In fact, “The complexity and duration of whale vocalizations suggest that they are at least in principle capable of exhibiting a more complex grammar” than other nonhuman animals, according to an April 2021 paper about sperm whales posted to the preprint server arXiv.org.

This paper, by a cross-disciplinary project known as CETI (Cetacean Translation Initiative), outlines a plan to decode sperm whale vocalizations, first by collecting recordings of sperm whales, and then by using machine learning to try to decode the sequences of clicks these fellow mammals use to communicate. CETI chose to study sperm whales over other whales because their clicks have an almost Morse code-like structure, which artificial intelligence (AI) might have an easier time analyzing.

Breaching the surface

The little that humans do know about sperm whales has all been learned quite recently. It was only in the 1950s that we noted they made sounds, and it wasn’t known that they were using those sounds to communicate until the 1970s, according to the new research posted by CETI.

This clicking appears to serve a dual purpose. Sperm whales can dive to depths of 4,000 feet (1,200 meters), or three times deeper than nuclear submarines, according to the Woods Hole Oceanographic Institution. Because it is pitch black at these depths, they have evolved to seek out squid and other marine creatures by using clicks for echolocation, a type of sonar. This same clicking mechanism is also used in their social vocalizations, although the communication clicks are more tightly packed, according to the CETI paper.

Figuring out even this much has been challenging, as sperm whales have “been so hard for humans to study for so many years,” David Gruber, a marine biologist and CETI project leader, told Live Science. But now, “we actually do have the tools to be able to look at this more in depth in a way that we haven’t been able to before.” Those tools include AI, robotics and drones, he said.

Pratyusha Sharma, a data science researcher for CETI and a doctoral candidate in the Computer Science and Artificial Intelligence Laboratory at MIT, told Live Science more about recent developments in artificial intelligence and language models, such as GPT-3, which uses deep learning to construct human-like text or stories on command, and last year took the AI community by storm. Scientists hope these same methods could be applied to the vocalizations of sperm whales, she said. The only problem: these methods have a voracious appetite for data.

The CETI project currently has recordings of about 100,000 sperm whale clicks, painstakingly gathered by marine biologists over many years, but the machine-learning algorithms might need somewhere in the vicinity of 4 billion. To bridge this gap, CETI is setting up numerous automated channels for collecting recordings from sperm whales. These include underwater microphones placed in waters frequented by sperm whales, microphones that can be dropped by eagle-eyed airborne drones as soon as they spot a pod of sperm whales congregating at the surface, and even robotic fish that can follow and listen to whales unobtrusively from a distance.
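To put the size of that gap in concrete terms, the short Python sketch below works through the arithmetic using only the two figures quoted above. The variable names are illustrative, and the 4 billion figure is the article's rough estimate rather than a fixed requirement.

```python
# Figures quoted in the article; the target is an approximate estimate.
clicks_recorded = 100_000
clicks_needed = 4_000_000_000

# How many times larger the target dataset is than the current one.
shortfall_factor = clicks_needed // clicks_recorded

# Share of the target that has been collected so far, as a percentage.
percent_collected = clicks_recorded / clicks_needed * 100

print(f"Target is {shortfall_factor:,} times the current dataset")
print(f"Collected so far: {percent_collected:.4f}% of the target")
```

In other words, the existing recordings would need to grow roughly 40,000-fold, which is why the project is turning to automated collection.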

But even with all this data, will we be able to decipher it? Many of the machine-learning algorithms have found audio more difficult to analyze than text. For instance, it might be challenging to parse apart where one word begins and ends. As Sharma explained, “Suppose there’s a word ‘umbrella.’ Is ‘um’ the word or is it ‘umbrell’ or is it ‘umbrella’?” The barriers between spoken words are more ambiguous and less regular, and patterns may therefore require more data to suss out.
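The ambiguity Sharma describes can be illustrated with a toy example. The Python sketch below lists every way an unbroken stream of letters could be split into units from a made-up vocabulary. Both the function name `segmentations` and the vocabulary are hypothetical, chosen only to mirror the "umbrella" example; this is not CETI's method, and real audio has no such ready-made vocabulary.

```python
def segmentations(stream, vocabulary):
    """Return every way to split `stream` into units found in `vocabulary`."""
    if not stream:
        return [[]]  # one way to split nothing: the empty split
    results = []
    for end in range(1, len(stream) + 1):
        prefix = stream[:end]
        if prefix in vocabulary:
            # Keep this prefix and split whatever remains after it.
            for rest in segmentations(stream[end:], vocabulary):
                results.append([prefix] + rest)
    return results


# Hypothetical candidate units; a learner would not know in advance
# which of these are real words.
vocabulary = {"um", "brella", "umbrell", "a", "umbrella"}
candidates = segmentations("umbrella", vocabulary)

for split in candidates:
    print(" | ".join(split))
```

Even this eight-letter stream admits several competing splits. Without more examples showing which units recur, nothing in the stream itself says which split is right, which is part of why such patterns take so much data to pin down.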

That’s not the only difficulty CETI will face. “Whether someone comes from let’s say Japan or from the U.S. or from wherever, the worlds we talk about are very similar; we talk about people, we talk about their actions,” Sharma said. “But the worlds these whales live in are very different, right? And the behaviors are very different.”

What’s more, sperm whales are known to have dialects, according to a 2016 study in the journal Royal Society Open Science, which analyzed codas recorded from nine sperm whale groups in the Caribbean over six years.

But these difficulties are also what make the project so worthwhile. What exactly one sperm whale says to another remains as dark and murky as the waters they swim in, but this mystery makes any answers CETI finds all the more intriguing. As Gruber put it, “We learn so much when we try to view the world from the perspective of the other.”

Originally published on Live Science.

livescience.com, 13 June 2021; https://www.livescience.com