2023 was a surprising year for AI – the company long considered the AI top dog lost at its own game to a startup many had never heard of before its meteoric rise. Caught off guard, Google scrambled to catch up with its own AI chatbot, Bard, but users found it unremarkable, at least when compared to others in the race.
Now the search engine giant looks set to dethrone OpenAI with a fresh start – a new family of AI models called Gemini that’s seemingly built from the ground up. And if Google’s own comparisons are to be believed, the AI model does seem to outperform GPT-4 across multiple benchmarks, particularly the multimodal ones. The company has already plugged bits of it into the Pixel 8 Pro and Bard, but that’s just a start. Below, we’ve compiled everything Google has in store with what could be ChatGPT’s biggest threat.
- 01
Gemini sees and talks like a real person
Like GPT-4, Gemini is an AI model that cannot be accessed directly. Rather, it acts as a base on which Google and, ultimately, other developers can build products. Google says Gemini was built from the ground up to be multimodal, which means it can operate across and combine different types of information, including text, audio, image, code, and video. It can recognise images, speak in real time, and even solve physics problems with remarkable ingenuity. Just check out the demo.
While this alone doesn’t set the model apart from GPT-4, which was also designed to be multimodal, Gemini’s versatility is commendable in that it’s more than a single model and can run on everything – from data centres to mobile devices.
- 02
Can run on practically anything
Gemini 1.0 comprises three models – Ultra, Pro, and Nano. Gemini Ultra is Google’s most powerful LLM ever and is aimed at enterprise applications that will run it for “highly complex tasks.” Gemini Pro is the most general-purpose of the three and has already been plugged into Bard for prompts that require advanced reasoning, planning, and understanding. Developers and enterprise customers will be able to access this model via the Gemini API in Google AI Studio or Google Cloud Vertex AI starting December 13. Meanwhile, Gemini Nano, described as the most efficient model for on-device tasks, has been baked into the Pixel 8 Pro to handle tasks like information summarisation and Smart Reply.
- 03
Trained using Tensor Processing Units
TPU v5p AI accelerator supercomputers in a Google data center. (Image: Google)
Google says it trained Gemini 1.0 on its AI-optimised infrastructure using its in-house designed Tensor Processing Units (TPUs) v4 and v5e. If that name sounds familiar, that’s because the Google Pixel’s Tensor chipset includes a scaled-down TPU of its own. Google says that training and running Gemini on TPUs allows it to run faster than earlier, smaller and less-capable models.
- 04
Gemini Ultra beats GPT-4 in benchmarks
Google says that Gemini outperforms the competition across tasks, pointing to a research paper in which Gemini Ultra led the pack in six out of eight benchmarks. When multimodal capabilities including natural image, audio and video understanding are brought into the picture, Gemini Ultra’s performance was found to exceed state-of-the-art results on 30 of the 32 benchmarks used in large language model (LLM) development. However, it’s worth noting that in the research paper’s benchmarks only Gemini Ultra outperformed GPT-4, while the consumer-oriented Gemini Pro landed in a cosy spot between GPT-3.5 and GPT-4.
Benchmarks are just benchmarks, but if the results carry over to real-world use, Gemini Pro – which like GPT-3.5 is free to use – could prove advantageous to the average user because it’s seemingly better at many tasks than GPT-3.5.
- 05
Safety check
Reasoning ability and accuracy are two of the biggest factors that make a ‘good’ AI model, but those qualities are virtually meaningless if they’re not accompanied by appropriate safety checks. To that end, Google says it employed “best-in-class adversarial testing techniques” to identify safety issues before deploying Gemini. The company says it has put checks in place and built dedicated safety classifiers to help its model steer clear of risks like bias, toxicity, and content that encourages violence.
- 06
A new lease of life for Bard
Gemini is also coming to Bard. Starting Wednesday, the AI chatbot is using a tweaked version of Gemini Pro for more advanced reasoning, planning, and understanding in English. A second version of the AI chatbot called Bard Advanced will follow early next year, giving access to the company’s most cutting-edge models including Gemini Ultra. Chances are this will be locked behind a subscription, just like ChatGPT Plus.
- 07
Pixel 8 Pro will benefit too
Another Google product reaping gains from Gemini is the Pixel 8 Pro, which will use the tech to power various on-device experiences. These include Summarize in the Recorder app, which will give you a summary of your recorded conversations, interviews, and more, even when you’re offline.
The model is also starting to power Smart Reply in Gboard as a developer preview, suggesting “high-quality responses with conversational awareness.” Google says this will first be supported on WhatsApp, followed by more apps next year.