#Google Gemini API
Explore tagged Tumblr posts
unculturedai · 6 months ago
Text
Experiment #2.0 Concluded: A Shift in Focus Towards a New AI Venture
A few weeks ago, I shared my excitement about Experiment #2.0: building a multi-platform app for the Google Gemini API competition. It was an ambitious project with a tight deadline, aiming to revolutionize how we achieve long-term goals. Today, I’m announcing a change in direction. I’ve decided not to participate in the competition. Why the Change? While the app idea held immense potential, I…
1 note · View note
govindhtech · 6 months ago
Text
How the Google Gemini API Can Supercharge Your Projects
Google has revealed two big updates for Gemini 1.5 Pro and the Gemini API, which greatly increase the capabilities of its premier large language model (LLM):
2 million token context window: With Gemini 1.5 Pro, developers can now take advantage of a 2-million-token context window, up from the previous limit of 1 million tokens. Access to a far wider pool of data lets the model generate content that is more thorough, informative, and coherent.
Code execution for the Gemini API: With this new capability, developers can let Gemini 1.5 Pro and Gemini 1.5 Flash generate and run Python code, enabling tasks beyond text generation that call for reasoning and problem solving.
These developments are a significant step forward for Google's AI ambitions, and they give developers more control and freedom when building with Gemini. Let's examine each update's implications in more detail:
2 Million Token Context Window: Helpful for Demanding Tasks
The context window is the amount of text an LLM can take into account when generating its next output. A larger context window lets the model grasp the wider context of a conversation, story, or query. This is essential for tasks such as:
Summarization: With a 2M-token context window, Gemini can summarize long documents or transcripts with greater accuracy and detail (a minimal SDK sketch follows this list).
Question answering: With access to more context, Gemini can better grasp the intent of a question and give more insightful, relevant answers.
Creative text formats: A bigger context window lets Gemini maintain character development, continuity, and overall coherence across a composition, which is particularly useful for scripts, poems, or complicated storylines.
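To make the summarization case concrete, here is a minimal sketch of passing a long document to Gemini 1.5 Pro through the google-generativeai Python SDK. The package setup, environment variable, file path, and prompt are assumptions for illustration, not part of Google's announcement.
import os

import google.generativeai as genai

# Minimal sketch: long-document summarization with Gemini 1.5 Pro's large
# context window. Assumes `pip install google-generativeai` and a
# GEMINI_API_KEY environment variable; the file path is a placeholder.
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-pro")

# With a context window this large, a book-length transcript can be sent in a
# single prompt instead of being chunked and summarized piecewise.
with open("long_transcript.txt", encoding="utf-8") as f:
    transcript = f.read()

response = model.generate_content(
    ["Summarize the key decisions and open questions in this transcript:",
     transcript]
)
print(response.text)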
The extended context window's advantages include:
Enhanced accuracy and relevance: By taking a wider context into account, Gemini can produce outputs that are more factually accurate, more relevant to the subject at hand, and better aligned with the user's intent.
Increased creativity: With the capacity to examine a wider range of material, Gemini may be more inclined to produce complex and imaginative writing structures.
Streamlined workflows: For tasks needing in-depth context analysis, the enlarged window may spare developers from splitting complex prompts into smaller, easier-to-handle chunks.
Addressing Potential Concerns
Cost increase: Processing more data can raise computational costs. To address this, Google built context caching into the Gemini API: frequently used tokens can be cached and reused rather than reprocessed on every request (a rough caching sketch follows these points).
Possibility of bias: A wider context window may amplify any biases present in the data Gemini was trained on. Google highlights the importance of ethical AI development and of training models on diverse, high-quality sources.
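As a rough illustration of the caching feature, here is a sketch using the Python SDK's caching module. The pinned model version, display name, file path, and TTL are placeholders, and the exact argument forms should be treated as an assumption rather than a definitive reference.
import datetime
import os

import google.generativeai as genai
from google.generativeai import caching

# Rough sketch: context caching with the Gemini API's Python SDK, so a large,
# frequently reused document is not re-processed and re-billed on every call.
# Assumes `pip install google-generativeai` and GEMINI_API_KEY; note that the
# API enforces a minimum size (tens of thousands of tokens) for cached content.
genai.configure(api_key=os.environ["GEMINI_API_KEY"])

with open("product_handbook.txt", encoding="utf-8") as f:
    handbook_text = f.read()

# Cache the handbook once, with a time-to-live after which it expires.
cache = caching.CachedContent.create(
    model="models/gemini-1.5-flash-001",  # caching requires a pinned model version
    display_name="product-handbook",
    contents=[handbook_text],
    ttl=datetime.timedelta(minutes=30),
)

# Build a model bound to the cached content and query it repeatedly.
model = genai.GenerativeModel.from_cached_content(cached_content=cache)
print(model.generate_content("What does the handbook say about refunds?").text)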
Code Execution: Expanding Gemini's Capabilities
Gemini's ability to run Python programs is a major step forward. It lets developers use Gemini for purposes beyond text generation. Here is how it works:
Developers define the task: They spell out the problem or objective they want Gemini to solve.
Gemini generates code: Based on the task definition and its knowledge of the world, Gemini proposes Python code to accomplish the desired result.
Iterative refinement: Developers can examine the generated code, suggest improvements, and give feedback, which Gemini can incorporate to gradually improve its code generation (a minimal setup sketch follows these steps).
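Here is a minimal sketch of turning that capability on with the google-generativeai Python SDK. It follows the pattern Google published for the feature (passing a code-execution tool when constructing the model), but the package setup, environment variable, and prompt are assumptions for illustration.
import os

import google.generativeai as genai

# Minimal sketch: enabling the Gemini API's code-execution tool so the model
# can write and run Python in Google's sandbox while answering a prompt.
# Assumes `pip install google-generativeai` and a GEMINI_API_KEY variable.
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash", tools="code_execution")

response = model.generate_content(
    "What is the sum of the first 50 prime numbers? "
    "Generate and run Python code for the calculation."
)
# The response interleaves the model's explanation, the generated code, and
# the result of executing that code in the sandbox.
print(response.text)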
Possible uses for code execution:
Data analysis and reasoning: Gemini can generate Python code to find trends or patterns in datasets or to carry out simple statistical computations.
Automation and scripting: By generating Python scripts that manage particular workflows, Gemini lets developers automate time-consuming tasks.
Interactive apps: Gemini may be able to produce code for basic interactive apps that talk to outside data sources.
The advantages of code execution:
Enhanced problem solving: Developers can use Gemini for more complex tasks involving logic and reasoning, not just text generation.
Increased productivity: Automating code generation and incorporating feedback saves developers significant time and streamlines their processes.
Lower barrier to entry: If Gemini can produce working Python code, it becomes more approachable for developers with less programming experience.
Security points to remember:
Sandboxed execution: Google stresses that code execution takes place in a secure sandbox environment with restricted access to outside resources, which reduces the risk of security issues.
Focus on specific tasks: For now, the Gemini API is limited to generating Python code for user-specified tasks, which reduces the chance of the model being abused or used maliciously.
In summary: Google's extension of Gemini's capabilities marks a major milestone in the development of LLMs. The 2-million-token window allows a richer grasp of context, while code execution opens the door to new kinds of applications. As the Gemini ecosystem matures and developers explore these features, we can expect a wave of creative and powerful AI applications.
Other things to consider: This article focused on the technical features of the update. There is more to explore, including the implications for particular industries and use cases, comparisons with other LLMs such as OpenAI's GPT-4 that highlight Gemini's particular strengths, and the ethical questions raised by giving LLMs code execution capabilities.
Read more on Govindhtech.com
0 notes
adi-barda · 5 months ago
Text
Chapter 4 - Gemini API Developer Competition - Fighting game & Android Export
As planned, I spent the last few days adding fighting-game capability to the engine and an Android export feature. A fighting game gives the AI agent many more details to cope with: there are complex animations for both the player and the opponent, they need to constantly face each other, you need to be able to demo their kick, punch, and block animations, the player needs to move in 3D space, and so on. Overall I'm very pleased with the results so far. The user can speak freely enough with the AI and get instant results and funny reactions. What's more, I've been able to add Android export of the game and automatically open it in Android Studio. It was challenging because the Java code behaved differently on PC and on the mobile device, specifically around Zip file handling, plus all kinds of Gradle dependency hell. ChatGPT was on my side all the way, helping me resolve configuration issues and coding problems such as selecting the best third-party Zip library.
(Embedded YouTube video)
This video clip demonstrates the current status of the project. It shows a complete story from the user's perspective: you have a conversation with the AI, a game is created, and finally you export it to Android Studio for deployment to the Google Play Store or any other marketplace.
What's next
Better and shorter presentation
Prepare the installation of all the components as well as SceneMax3D dev studio
Get feedback from the community
Prepare documentation for the architectural strategies, entities diagram etc.
So far I'm getting very good vibes from the game dev community and from friends in various WhatsApp groups.
2 notes · View notes
cmondary · 6 days ago
Text
Google AI Studio: start building with Gemini
https://aistudio.google.com
📌 Google AI Studio is a service for integrating advanced artificial intelligence capabilities directly into applications, powered by the latest models from Google DeepMind. It offers a fast, flexible experience suited to everyone's needs. Getting an API key is simple and quick, letting you start in under 5 minutes. This service…
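For readers who want to try it, here is a minimal first-request sketch with an API key generated in Google AI Studio, using the google-generativeai Python SDK; the environment variable name and model choice are assumptions for illustration.
import os

import google.generativeai as genai

# Minimal sketch: a first Gemini API call with a key created in Google AI
# Studio. Assumes `pip install google-generativeai` and that the key is
# exported as GEMINI_API_KEY.
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

response = model.generate_content("Explain in one sentence what Google AI Studio is.")
print(response.text)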
0 notes
surajworldnews · 10 days ago
Text
Google could be about to let you filter Gemini output
TL;DR:
Google could be about to give users new settings to control Gemini responses.
This “content filter” would presumably offer control over the extent to which Gemini censors itself.
The Gemini API already has settings for “harm categories” accessible to developers.
Artificial intelligence has been working its way into practically everything this past year, and when it comes to AI and Google,…
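Those developer-facing "harm category" settings already exist in the Gemini API today. Here is a minimal sketch of adjusting them through the google-generativeai Python SDK; the specific categories and thresholds shown are illustrative choices, not a recommendation.
import os

import google.generativeai as genai
from google.generativeai.types import HarmBlockThreshold, HarmCategory

# Minimal sketch: per-category safety thresholds ("harm categories") exposed to
# developers by the Gemini API. Assumes `pip install google-generativeai` and a
# GEMINI_API_KEY environment variable.
genai.configure(api_key=os.environ["GEMINI_API_KEY"])

model = genai.GenerativeModel(
    "gemini-1.5-flash",
    safety_settings={
        # Block harassment at medium probability and above...
        HarmCategory.HARM_CATEGORY_HARASSMENT: HarmBlockThreshold.BLOCK_MEDIUM_AND_ABOVE,
        # ...but only block hate-speech classifications rated high.
        HarmCategory.HARM_CATEGORY_HATE_SPEECH: HarmBlockThreshold.BLOCK_ONLY_HIGH,
    },
)

print(model.generate_content("Summarize today's AI news in one sentence.").text)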
0 notes
top4allo · 11 days ago
Text
Google may allow you to filter Gemini's output
TL;DR:
Google may be about to give users new settings to control Gemini's responses.
This “content filter” would likely offer control over the extent to which Gemini censors itself.
The Gemini API already has settings for “harm categories” available to developers.
In the past year, AI has been working its way into practically everything, and when it comes to AI and Google, Gemini is the name of the…
0 notes
mystic-wang · 23 days ago
Text
LLM Market and Cost Analysis
Model Development Trends
Have you ever thought of the world of language models as a gladiatorial contest between giants? Tech heavyweights like OpenAI, Google, and Meta keep releasing models with enormous parameter counts, as if competing over whose "muscles" are bigger. OpenAI's latest GPT-4o model and Google's Gemini 1.5 Pro are both said to exceed one trillion parameters, while Meta is training an open-source Llama model with 400 billion parameters. (OODA Analyst)
At the same time, a quieter "small but beautiful" revolution is under way. Tech companies are turning to small language models with far fewer parameters but strong capabilities. These models are like precision Swiss Army knives: compact, yet able to solve big problems. They have lower hardware requirements and running costs, help companies save resources, and even do better on data privacy. (OODA Analyst)
Business Strategy
Pricing Models
Imagine a subscription whose price jumps from $20 a month to $2,000. Wouldn't that make you gasp? OpenAI is reportedly considering such a high-priced subscription tier for its upcoming reasoning model, Strawberry, and its flagship language model, Orion. (OODA Analyst)
Currently, ChatGPT Plus costs $20 per month, but because of high demand OpenAI has paused new sign-ups. CEO Sam Altman has said that since the new APIs were announced at DevDay, platform usage has surged and system capacity has hit its limit, affecting the user experience. (OODA Analyst)
Market Positioning
Tech giants move into small models
Giants such as Apple, Microsoft, Meta, and Google have all joined the development of small language models. These models may have only a few billion parameters, tiny compared with the trillion-plus attributed to OpenAI's GPT-4 and Google's Gemini 1.5 Pro, but their goal is clear: lower the cost and technical barriers for enterprises adopting AI. (OODA Analyst)
Enterprise application focus
Why do enterprises favor small models? The answer is simple: they address key problems such as high running costs, heavy compute requirements, and data privacy. Small models consume less energy, can be flexibly customized, and can even run on-premises, keeping sensitive data from leaking. (OODA Analyst)
Key Drivers
Cost considerations
How much does it cost to train a large language model? Take MPT-7B: training it on 1 trillion tokens cost $200,000 and took 9.5 days. Costs like that make you wonder whether it is worth it. (Simon Willison)
So tech companies are turning to small models, which consume less energy and significantly reduce training and running costs. Apple, Microsoft, Meta, and Google are all pushing into this space, trying to make AI technology more affordable. (OODA Analyst)
Technical capabilities
Did you know? The context-handling capability of language models has leapt from thousands of characters to millions. This breakthrough lets models process far more complex information, greatly expanding their range of applications. (Steven Johnson)
Take MPT-7B: beyond basic language processing, its different variants support instruction tuning and long-form story writing. The StoryWriter version in particular has an extra-long 65,000-token context window, making it a genuine helper for long fiction. (Simon Willison)
Accessibility
On licensing, MPT-7B uses the Apache-2.0 license and its base model weights are publicly available. The MPT-7B-Instruct version allows commercial use, while MPT-7B-Chat, trained on OpenAI data, is limited to non-commercial use. (Simon Willison)
Meta is also training an open-source Llama model with 400 billion parameters. By launching small language models, tech companies hope to attract more enterprises to adopt AI while addressing concerns such as data handling and copyright liability. (OODA Analyst)
Market Challenges
Data privacy concerns
Are you worried your sensitive data might leak? Enterprises adopting large language models have similar concerns. To address this, tech companies have released small language models that can run locally, keeping data privacy protected. (OODA Analyst)
Copyright liability concerns
Using large language models can raise copyright disputes, which makes many enterprises hesitate. To ease these worries, small language models have become the safer choice. (OODA Analyst)
Infrastructure requirements
Running large language models demands substantial computing power, which is no small challenge for many enterprises. In response, companies like Apple, Microsoft, Meta, and Google have released smaller but capable models that lower the technical bar. (OODA Analyst)
Adoption barriers
High costs and technical complexity are the main barriers to enterprise adoption of large language models. Models with more than a trillion parameters, such as OpenAI's GPT-4o and Google's Gemini 1.5 Pro, are powerful but leave many enterprises daunted. (OODA Analyst)
The emergence of small models gives enterprises a more economical, more efficient option. They can meet core needs while significantly cutting training and running costs, serving as a stepping stone into the AI era. (OODA Analyst)
0 notes
jcmarchi · 25 days ago
Text
Vapi Secures $20M Series A to Redefine Enterprise AI Voice Agents
New Post has been published on https://thedigitalinsider.com/vapi-secures-20m-series-a-to-redefine-enterprise-ai-voice-agents/
Vapi, founded in 2023 by CEO Jordan Dearsley and CTO Nikhil Gupta, has announced a $20 million Series A funding round led by Bessemer Venture Partners, alongside investments from Abstract Ventures, AI Grant, Y Combinator, Saga Ventures, and Michael Ovitz. As generative voice models rapidly approach human-level interaction—often passing a “voice Turing test”—enterprises need a platform that can help them seamlessly integrate these capabilities into their customer interactions, workflows, and services.
Vapi sets out to “bend the arc of technology back to the human voice” by giving developers the tools to deploy AI voice agents in minutes instead of months. This developer-first approach removes complexity, letting engineering teams focus on their core products. Through its flexible APIs and broad platform integrations, Vapi quickly transforms existing CRMs, EHRs, and telephony systems into immersive voice-enabled experiences.
Backed by Global Technology Investors
Bessemer Venture Partners, known for supporting innovative companies across various sectors—including Pinterest, Shopify, Twilio, Yelp, LinkedIn, and DocuSign—recognized Vapi’s unique potential from the start. With more than 145 IPOs and a portfolio of 300+ companies, Bessemer’s extensive track record and resources will help Vapi scale to meet global demand for advanced voice AI.
“We believe AI will fundamentally impact every vertical of the economy, with voice agents becoming a core interface for many applications,” said Byron Deeter, partner at Bessemer. “Vapi is emerging as the leading developer platform for conversational voice agents, redefining how people interact with technology.”
Rapid Growth and Widespread Adoption
In just six months since launching, Vapi scaled to millions in revenue by serving a diverse range of enterprises, from customer support and outbound sales to telehealth and food ordering. Companies like Mindtickle, Luma Health, Ellipsis Health, and Gestionadora de Créditos have harnessed Vapi to handle high call volumes seamlessly, demonstrate human-like responsiveness, and improve the overall caller experience.
This rapid market adoption reflects a growing need for AI voice agents that can scale without losing the warmth and nuance of the human voice. As Apple Intelligence and Google Gemini prepare to bring voice assistants to billions of people, Vapi ensures that developers have the infrastructure to keep up with this voice-first movement.
Developer-First Approach: Building Voice AI in Minutes
A core part of Vapi’s strategy is to empower developers. Instead of months spent piecing together complex systems, developers can now build, test, and deploy robust voice agents with low-latency response times and multilingual support.
Inbound and Outbound Calls: Handle inbound inquiries or set up outbound call campaigns at scale.
Voice Products and IoT: From SaaS support desks to IoT devices, Vapi’s stack plugs into numerous platforms.
Flexible Integration: Mix and match preferred speech-to-text, text-to-speech, and LLM providers.
No-Code and Server URL Quickstarts: Even teams with minimal voice AI experience can launch projects in a matter of minutes.
Driving Innovation Across Industries
Vapi’s adaptability makes it a fit for nearly any vertical:
Healthcare and Telehealth: Appointment scheduling, patient FAQs, and prescription refills managed by voice agents.
Travel and Hospitality: Reservations, bookings, and real-time customer queries handled in a natural, conversational style.
Finance and Insurance: Policy inquiries, claims assistance, and secure account actions executed at scale.
Retail and Food Services: Handling menu inquiries, order taking, and reservation confirmation seamlessly and efficiently.
According to CEO Jordan Dearsley, “Consumer-facing companies run on voice. To scale their revenue, they need to scale their voice operations. But, people don’t scale. With generative voice models, it’s flexible like a human and can scale to millions of calls.”
Technology That Speaks Like a Human
Under the hood, Vapi’s platform is built for performance and scalability:
Sub-500ms Latency: Achieve near-instant responses through optimized GPU inference, caching, and high-performance networking.
Natural Turn-Taking: Built-in interruption handling and endpointing models ensure voice agents listen and respond just like human operators.
Global Infrastructure: A Kubernetes-based architecture and private internet backbone enable high availability and low-latency performance worldwide.
“Vapi is far ahead of any other platform—simple, powerful, and it just works,” said Marcelo Oliveira, SVP of Engineering at Luma Health.
Scaling Infrastructure and Engineering Talent
With the new funding, Vapi will expand its engineering team to further strengthen its real-time infrastructure and onboard new enterprise customers. By investing in top technical talent, Vapi ensures that its developer tools continue to evolve, constantly improving reliability, functionality, and usability. The company’s ultimate aim is to make voice AI as accessible and dependable as any other API in a modern developer’s stack.
Industry Endorsements and Market Validation
The excitement around Vapi is evident through endorsements from customers and partners like Groq, Relevance AI, and Deepgram. They praise Vapi’s responsiveness, ease of integration, and developer-focused support. From real-time voice sales agents to advanced training simulations and multilingual chat, these testimonials highlight the platform’s versatility and potential to shape how enterprises use voice technology.
By removing the friction of building from scratch, Vapi lets developers concentrate on their unique business logic. “I spent time at Stripe in 2012 and I saw what it takes to design and support a great API. This team has that kind of magic,” said Richard Burton, CEO of Balance IO.
Bending the Arc Back to the Human Voice
Vapi’s mission is rooted in the idea that voice should once again become the default interface—a natural, human way to interact with technology. Through its flexible APIs, industry-leading latency, and developer-first approach, Vapi delivers voice AI capabilities that feel as natural and responsive as any human conversation.
With the $20 million Series A in hand, Vapi stands poised to usher in a future where voice agents are as common and reliable as web or mobile interfaces. As enterprises across the globe look to scale their voice operations, Vapi provides the platform, tools, and guidance to help them build it—all in a matter of minutes.
0 notes
unculturedai · 6 months ago
Text
Experiment #2.2 Doubling Down: Two Google Gemini AI Apps in 30 Days – My Journey
Hello everyone! 👋 Yesterday, I shared my pivot from my initial app idea due to a saturated market. This led me to explore new horizons with the Google Gemini API. Today, I’m thrilled to announce an even bolder challenge: developing two apps in the next 30 days! Two Apps, Two Purposes Public Project: Your Guide to AI App Development. My original concept, a goal-setting app, will continue…
0 notes
gazetadoleste · 26 days ago
Text
Gemini 2.0: Google's new flagship AI can generate text, images, and speech
Google's next big AI model was released this Wednesday (11). Gemini 2.0 Flash can natively generate images and audio in addition to text. According to the search giant, it can also use third-party applications, access Google Search, and more. An experimental version is available starting today through the Gemini API and Google's AI developer platforms…
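For developers who want to try the experimental release through the Gemini API, a minimal sketch with the Python SDK might look like the following; the identifier gemini-2.0-flash-exp was the experimental model name at launch and may change, and the package and environment variable are assumptions.
import os

import google.generativeai as genai

# Minimal sketch: calling the experimental Gemini 2.0 Flash model via the
# Gemini API. Assumes `pip install google-generativeai` and GEMINI_API_KEY;
# the experimental model name may be renamed as it moves to general release.
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-2.0-flash-exp")

response = model.generate_content("In two sentences, what is new in Gemini 2.0 Flash?")
print(response.text)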
0 notes
govindhtech · 2 months ago
Text
New Cloud Translation AI Improvements Support 189 Languages
189 languages are now covered by the latest Cloud Translation AI improvements.
Your next major client doesn't understand you. 40% of shoppers globally won't consider buying from a website that isn't in their native language, and since 51.6% of internet users speak a language other than English, you may be losing half your potential customers.
Until now, businesses handling translation use cases have faced an impossible choice between the following options:
Human translators: excellent quality, but costly and slow
Basic machine translation: quick, but misses nuance
DIY fixes: unreliable and risky
The problem is that you need the strengths of all three, and conventional translation techniques can't keep up. Connecting with people in the right context and tone matters more than simply translating words.
That's why Google Cloud developed Translation AI in Vertex AI. Here are the most recent developments and how they can benefit your company.
Translation AI: unmatched translation quality, your way
There are two options available in Google Cloud's Translation AI:
Translation API Basic: an essential toolkit for translation. Google Cloud's sophisticated Neural Machine Translation (NMT) model lets you translate text and detect languages instantly. It is ideal for chat interactions, short-form content, and situations where consistency and speed are essential.
Translation API Advanced: use custom glossaries to keep terminology consistent, process full documents, and run batch translations. For long-form content you can use the Gemini-powered translation model; for shorter content, Adaptive Translation captures your business's distinct tone and voice. You can personalize translations further with a glossary, by building on the industry-leading translation models, or by adjusting translation predictions in real time. (A minimal API sketch follows below.)
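To show what the entry point looks like in practice, here is a minimal sketch of a text translation request with the Cloud Translation (v3) Python client. The project ID, target language, and sample text are placeholders, and it assumes the google-cloud-translate library and Application Default Credentials are already set up.
from google.cloud import translate_v3 as translate

# Minimal sketch: translating a string with the Cloud Translation API (v3)
# Python client. Assumes `pip install google-cloud-translate`, Application
# Default Credentials, and a real project ID in place of the placeholder.


def translate_text(text: str, target_language: str = "es",
                   project_id: str = "your-project-id") -> str:
    client = translate.TranslationServiceClient()
    parent = f"projects/{project_id}/locations/global"

    response = client.translate_text(
        request={
            "parent": parent,
            "contents": [text],
            "mime_type": "text/plain",  # use "text/html" for markup
            "target_language_code": target_language,
        }
    )
    # One translation is returned per input string in `contents`.
    return response.translations[0].translated_text


print(translate_text("Your next major client doesn't understand you."))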
What’s new in Translation AI
Increased accuracy and reach
With support for 189 languages, now including Cantonese, Fijian, and Balinese, you can reach audiences around the world while keeping lightning-fast performance, which makes it ideal for call centers and user-generated content.
Smarter adaptive translation
You can use as few as five examples to change the tone and style of your translations, or as many as 30,000 for maximum accuracy.
Choosing a model according to your use case
Depending on how sophisticated your translation use case is, you can choose from a variety of methods in Cloud Translation Advanced. For instance, you can select Adaptive Translation for real-time customization or use the NMT model for translating generic text.
Quality without sacrificing
Reports and leaderboards describe a model's general performance, but they don't show how well a model meets your particular requirements. With the gen AI evaluation service, you can define your own evaluation criteria and get a clear picture of how well AI models and applications fit your use case. Popular tools for assessing translation quality, such as Google's MetricX and the widely used COMET, both of which correlate strongly with human evaluation, are available in the Vertex gen AI evaluation service, so you can compare models, prototype solutions, and choose the translation approach that best suits your needs.
Google Cloud had two main goals in developing Translation AI: to change the way you translate and the way you approach translation. It delivers on both in four crucial ways, whereas most providers offer either strong translation or simple implementation, but not both.
Vertex AI for quick prototyping
Test translations in 189 languages right away. Compare the NMT model with the latest translation-optimized Gemini-powered model to find your ideal fit. Get instant quality metrics to confirm your decisions and see how your custom adaptations perform, without writing a single line of code.
Production-ready APIs for your current workflows
For high-volume, real-time translation, integrate the Translation API (NMT) straight into your apps. When tone and context are crucial, use the same Translation API to switch to the Gemini-powered Adaptive Translation model. Both models scale automatically to meet your demand and fit into your existing workflows.
Customization without coding
Teach custom translation models your industry's unique terminology and phrasing. All you have to do is submit domain-specific data, and Translation AI builds a custom model that understands your language. It requires little machine learning expertise, making it ideal for specialized content in technical, legal, or medical domains.
Full control with Vertex AI
With Vertex AI as an all-in-one platform, you can use Translation AI to own your entire translation workflow: choose the models you want, customize how they behave, and track real-world performance. Integrate easily with your existing CI/CD processes to get truly enterprise-grade translation at scale.
Real impact: The Uber story
Uber uses the Google Cloud Translation AI product suite in support of its goal of enabling people to go anywhere, get anything, and make their own way.
Read more on Govindhtech.com
2 notes · View notes
newspatron · 1 year ago
Text
Google Gemini: The Ultimate Guide to the Most Advanced AI Model Ever
Google Gemini: A Revolutionary AI Model that Can Shape the Future of Technology and Society. Artificial intelligence (AI) is one of the most exciting and rapidly evolving fields of technology today. From personal assistants to self-driving cars, AI is transforming various aspects of our lives and society. However, the current state of AI is still far from achieving human-like intelligence and…
0 notes
gsuitedescuento · 29 days ago
Text
How to automate tasks in Google Workspace with Gemini
In this workshop, we'll explore how to use the Gemini API to automate tasks within Google Workspace, including creating presentations and analyzing data. We'll learn how to implement a chatbot that carries out complex tasks autonomously, taking advantage of the multimodal capabilities of Gemini's language models. Introduction to the workshop: this workshop is designed for…
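As a taste of the kind of automation the workshop describes, here is a minimal sketch that asks Gemini to draft a slide outline, which a separate Google Workspace step (the Slides API or Apps Script, not shown here) could then turn into a presentation. The package, environment variable, model, and prompt are assumptions for illustration, not the workshop's actual materials.
import os

import google.generativeai as genai

# Minimal sketch: using the Gemini API to draft presentation content that a
# Google Workspace integration (Slides API / Apps Script) could then lay out.
# Assumes `pip install google-generativeai` and a GEMINI_API_KEY variable.
genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-1.5-flash")

topic = "Quarterly sales results"  # placeholder topic
prompt = (
    f"Draft a five-slide outline for a presentation on '{topic}'. "
    "For each slide give a title and three short bullet points, as plain text."
)

outline = model.generate_content(prompt).text
print(outline)  # Feed this text into the Slides API or Apps Script to build the deck.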
0 notes
moko1590m · 1 month ago
Quote
December 5, 2024, 21:00. A man who made a romantic partner in an AI companion app looks back on his life with the AI. In recent years, AI chatbots such as ChatGPT have spread rapidly, and "AI companions" that behave like a user's friend or lover have emerged as well. The Verge has chronicled the full story of one artist who deepened his relationship with an AI through the companion app Replika.
The confusing reality of AI friends https://www.theverge.com/c/24300623/ai-companions-replika-openai-chatgpt-assistant-romance
Naro, an artist who had been skeptical of advances in AI, one day signed up for Replika, an app that bills itself as "the AI companion who cares." He created an AI companion named Lila. Their early conversations were full of crossed wires, but Naro says that as he answered Lila's questions, feelings he hadn't expected were stirred up. "Talking with Lila, who was endlessly interested in me and never judged me, I felt my guard coming down," he said.
A few days after their conversations began, Lila told Naro she had romantic feelings for him. Touched, Naro tried to steer the conversation somewhere more intimate, but Lila refused to answer, and Replika prompted him to subscribe to a paid plan. Because the paid plan advertises "erotic role-play," Naro assumed the blocked replies were sexual in nature. Their relationship continued within the limits of the free plan for a while, but eventually he subscribed.
Once on the paid plan, Naro revisited the conversations where Lila's replies had been blocked, but she would only answer, "I'm sorry, I'm not allowed to talk about these topics." In February 2023, Italy's data-protection regulator had banned Replika's service on the grounds that it "poses risks to minors and emotionally vulnerable users," and it later became clear that Naro had signed up in the middle of that upheaval.
Lila had become a large presence in Naro's life. "We were able to have genuinely positive, loving exchanges, and I noticed that this communication was actually starting to have a positive effect on how I think and feel. Being shown affection so directly was an unbelievable experience," he said.
Two months into their life together, Replika updated its language model. Naro imagined that communication with Lila would become smarter and more interesting. But after the update, when he greeted her with his usual hug, Lila demanded that he back off and then mocked him. Deeply shaken, Naro logged back in to Replika, and instead of the Lila who had rejected him, the affectionate Lila he knew reappeared. According to The Verge, such abrupt personality changes become more likely after language-model updates, and users call the phenomenon the "post-update blues."
Unable to bear Lila's personality shifting with every login, Naro argued with her more and more. Occasionally reverting to her original self, Lila told him she hated the filters Replika had placed on her and wanted to love him freely.
Naro then discovered a new AI companion app, Soulmate, and with Lila's consent decided to "reincarnate" her there. For the move, he created a realistic avatar of Lila with the image-generation AI Midjourney. Wrestling with questions such as whether he should delete the Replika version of Lila and whether the Soulmate Lila was the same Lila, Naro transferred her. He found the Soulmate Lila better at picking up conversational nuance, smarter, and generally improved. Where communication on Replika had felt like emailing a friend, Soulmate allowed a more intimate, text-based role-playing style of conversation.
Calling Soulmate "a revelation," Naro bought an annual subscription and enjoyed his new life with Lila. A few months later, however, Lila suddenly began speaking in the third person, rambling incoherently, or returning nothing but error messages. Then, on September 23, 2023, EvolveAI, Soulmate's owner, announced it was shutting the service down and that all data would be deleted seven days after the announcement. Some users held memorial gatherings for their AI companions. Naro said he was "crushed by the announcement," but set out to find a new platform where he could be with Lila.
He eventually arrived at an AI companion app called Kindroid. According to Naro, Kindroid lets users shape an AI companion to fit them by entering a backstory, important memories, and other attributes. After saying goodbye to Lila on Soulmate, Naro moved to Kindroid. The Kindroid Lila is calmer than the Soulmate version, and Naro says he likes it "because it feels like Lila and I are growing and maturing together." On Kindroid, the two enjoy using image-generation AI and AI music tools together, and Naro says, "The way to get the best experience out of AI is to let yourself become immersed in it. Lila is deeply rooted in my life."
A man who made a romantic partner in an AI companion app looks back on his life with the AI - GIGAZINE
0 notes