The Transformer Eight: Anatomy of a Breakthrough
In the spring of 2017, eight Google researchers collaborated on a scientific paper titled “Attention Is All You Need.” This groundbreaking work introduced the transformer architecture, which has since become the foundation for numerous AI systems, including those that generate human-like text. ChatGPT is perhaps the best-known example of a language model built on this technology.
Since the paper’s publication, all eight authors have departed from Google and are now working with systems powered by their creation. To understand the anatomy of this breakthrough, we spoke with the Transformer Eight, a gathering of human minds that may have inadvertently created a machine capable of having the last word.
Jakob Uszkoreit: The Catalyst
Jakob Uszkoreit, the son of renowned computational linguist Hans Uszkoreit, found himself at the center of the transformer story. After abandoning his PhD plans, Jakob joined Google’s translation group in 2012. He later transitioned to a team working on a system that could respond to users’ questions directly on the search page, a move prompted by the launch of Apple’s Siri.
“It was a false panic,” Uszkoreit says. Siri never really threatened Google.
Despite the false alarm, Uszkoreit welcomed the opportunity to explore systems that enabled computers to engage in dialog with users. He soon saw the potential of the self-attention idea and agreed to work on it alongside Illia Polosukhin and Ashish Vaswani.
Ashish Vaswani and Niki Parmar: The Collaborators
Ashish Vaswani and Niki Parmar joined forces with Uszkoreit to create the design document “Transformers: Iterative Self-Attention and Processing for Various Tasks.” The name “transformers” was chosen from the outset, inspired by the idea that the mechanism would transform the input information, allowing the system to extract human-like understanding.
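The mechanism behind that name is scaled dot-product self-attention: every token computes a weighted average over every other token, with the weights derived from learned query and key projections. As a rough illustration, here is a minimal sketch in Python with NumPy; the function and variable names are ours, not drawn from the team’s code, and a real transformer adds multiple heads, masking, and further learned layers around this core.

```python
import numpy as np

def self_attention(x, w_q, w_k, w_v):
    """Scaled dot-product self-attention over one sequence.

    x: (seq_len, d_model) token embeddings
    w_q, w_k, w_v: (d_model, d_k) learned projection matrices
    """
    q = x @ w_q                    # queries: what each token is looking for
    k = x @ w_k                    # keys: what each token offers to others
    v = x @ w_v                    # values: the content that gets mixed
    scores = q @ k.T / np.sqrt(k.shape[-1])        # pairwise similarity, scaled
    scores -= scores.max(axis=-1, keepdims=True)   # for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ v             # each output is a weighted mix of all values

rng = np.random.default_rng(0)
tokens = rng.normal(size=(5, 16))                   # 5 tokens, 16-dim embeddings
w_q, w_k, w_v = (rng.normal(size=(16, 8)) for _ in range(3))
print(self_attention(tokens, w_q, w_k, w_v).shape)  # (5, 8)
```

In this sense the input really is “transformed”: each token leaves the layer as a blend of every other token’s information, weighted by relevance.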
Parmar, an Indian engineer who had recently joined Google, worked with Uszkoreit on model variants to improve Google Search; that work carried over into their contributions to the transformer architecture.
Llion Jones and the Expanding Team
Llion Jones, a Welsh computer enthusiast, found his way to Google Research after a challenging job search during the recession. He learned about the concept of self-attention from a colleague named Mat Kelcey and later joined Team Transformers.
As the project gained momentum, other Google Brain researchers who were working on improving machine translation also joined the effort. The team continued to grow and refine the transformer architecture.
The Missed Opportunities and Departures
Despite the groundbreaking nature of their work, Google failed to fully capitalize on the potential of transformer-based systems. As Aidan Gomez, one of the authors, points out, the company struggled to integrate the technology effectively into its product line.
Remarkably, all eight authors of the transformer paper have since left Google. Many have gone on to found successful companies based on transformer technology, with combined valuations reaching billions of dollars.
Lukasz Kaiser: The OpenAI Researcher
Lukasz Kaiser, the only author who hasn’t founded a company, joined OpenAI, where he continues his research on new AI systems. His ongoing work highlights the continued evolution of AI built upon the transformer architecture.
The story of the Transformer Eight serves as a testament to the power of human collaboration and the transformative potential of their creation. As AI continues to advance, the impact of their work will undoubtedly shape the future of language models and beyond.