A recent exchange on X (formerly Twitter) between Wharton professor Ethan Mollick and Andrej Karpathy, the former Director of AI at Tesla and co-founder of OpenAI, touches on something both fascinating and foundational: many of today’s top generative AI models, including those from OpenAI, Anthropic, and Google, exhibit a striking similarity in tone. That raises the question: why are large language models (LLMs) converging not just in technical proficiency but also in personality?
Commentary on Output Convergence
The follow-up commentary pointed to a common factor that could be driving the convergence: Reinforcement Learning from Human Feedback (RLHF), a technique in which AI models are fine-tuned based on evaluations provided by human trainers.
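To make the mechanism concrete, here is a minimal, self-contained Python sketch of the core RLHF idea: fitting a reward model to pairwise human preferences and then using it to rank candidate responses. This is purely illustrative, not Inflection AI’s (or any lab’s) actual pipeline; the features and preference data are invented, and production systems use neural reward models plus PPO-style policy optimization rather than a two-weight linear model.

```python
# A minimal, self-contained sketch of the RLHF idea. Not a real pipeline:
# toy hand-made features stand in for a neural reward model, and ranking
# candidates stands in for full policy optimization.
import math

# 1) Human feedback: trainers compare two candidate responses and pick one.
#    Each record is (features_of_preferred, features_of_rejected), where the
#    invented features are, say, [warmth, terseness].
preferences = [
    ([0.9, 0.2], [0.3, 0.8]),  # rater preferred the warmer reply
    ([0.8, 0.4], [0.2, 0.9]),
    ([0.7, 0.3], [0.4, 0.6]),
]

# 2) Fit a tiny linear reward model with the Bradley-Terry objective:
#    maximize log sigmoid(r(preferred) - r(rejected)).
w = [0.0, 0.0]
lr = 0.5
for _ in range(200):
    for chosen, rejected in preferences:
        margin = sum(wi * (c - r) for wi, c, r in zip(w, chosen, rejected))
        grad_scale = 1.0 - 1.0 / (1.0 + math.exp(-margin))  # 1 - sigmoid(margin)
        for i in range(len(w)):
            w[i] += lr * grad_scale * (chosen[i] - rejected[i])

# 3) The learned reward then steers generation: among sampled candidates,
#    training pushes the model toward whatever the reward model scores highest.
candidates = {"warm, empathetic reply": [0.9, 0.1], "terse reply": [0.1, 0.9]}
best = max(candidates, key=lambda k: sum(wi * x for wi, x in zip(w, candidates[k])))
print("reward model prefers:", best)
```

The key point is that whatever the reward model rewards is what downstream responses drift toward, which is why the makeup of the rater pool matters so much.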
Inflection AI’s Approach
Building on this discussion of RLHF’s role in output similarity, Inflection AI’s recent announcements of Inflection 3.0 and a commercial API point to one promising way of addressing these challenges. The company has introduced a novel approach to RLHF aimed at making generative models not only consistent but also distinctively empathetic.
With its entry into the enterprise space, the creators of the Pi collection of models leverage RLHF in a more nuanced way, from deliberate efforts to improve the fine-tuning process to a proprietary platform that incorporates employee feedback to tailor gen AI outputs to organizational culture. The strategy aims to make Inflection AI’s models true cultural allies rather than generic chatbots, giving enterprises a more human, better-aligned AI system that stands out from the crowd.
Inflection AI wants your work chatbots to care
Against this backdrop of convergence, Inflection AI, the creators of the Pi model, are carving out a different path. With the recent launch of Inflection for Enterprise, Inflection AI aims to make emotional intelligence — dubbed “EQ” — a core feature for its enterprise customers.
The company says its unique approach to RLHF sets it apart. Instead of relying on anonymous data-labeling, it gathered feedback from 26,000 school teachers and university professors through a proprietary feedback platform to aid in fine-tuning. The same platform lets enterprise customers run reinforcement learning with employee feedback, tuning the model to the unique voice and style of the customer’s company.
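Inflection AI has not published the internals of this platform, so the following is a hypothetical sketch of one way organization-specific feedback could be layered on top of a general reward signal; `r_base`, `r_org`, and `alpha` are invented names for illustration, not disclosed components.

```python
# Hypothetical blending of a general reward model with an org-specific one
# trained on employee feedback. Not Inflection AI's disclosed architecture.
def blended_reward(features, r_base, r_org, alpha=0.3):
    """Score a candidate response with a mix of general helpfulness
    (r_base) and fit with the organization's voice (r_org)."""
    return (1 - alpha) * r_base(features) + alpha * r_org(features)

# Example: the org reward favors a more formal register (feature 1),
# while the base reward favors warmth (feature 0). Numbers are invented.
r_base = lambda f: 0.9 * f[0] + 0.1 * f[1]
r_org = lambda f: 0.2 * f[0] + 0.8 * f[1]
print(blended_reward([0.9, 0.1], r_base, r_org))  # warm reply
print(blended_reward([0.3, 0.9], r_base, r_org))  # formal reply
```

In a scheme like this, raising `alpha` would weight the organization’s voice more heavily relative to generic helpfulness.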
Drawbacks of RLHF
RLHF has become a centerpiece of gen AI development, largely because it lets companies shape responses to be more helpful, more coherent, and less prone to dangerous errors. However, RLHF is not without its drawbacks. In the X exchange, it was quickly suggested as a contributing cause of the convergence of model outputs, potentially eroding each model’s unique characteristics and making them increasingly similar.
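A toy illustration of that concern, with invented styles and reward numbers: when two models that start out very different are both optimized against the same reward signal, they end up in the same place.

```python
# Two different "policies" (probability distributions over three canned
# response styles) optimized against the SAME reward model become nearly
# identical. All numbers are invented for illustration.
import math

styles = ["breezy", "warm-and-hedged", "formal"]
reward = {"breezy": 0.2, "warm-and-hedged": 1.0, "formal": 0.5}  # shared reward

def step(policy):
    # Reweight each style by exp(reward): repeated updates push probability
    # mass toward the highest-reward style, regardless of the starting point.
    unnorm = {s: policy[s] * math.exp(reward[s]) for s in styles}
    z = sum(unnorm.values())
    return {s: v / z for s, v in unnorm.items()}

model_a = {"breezy": 0.7, "warm-and-hedged": 0.2, "formal": 0.1}
model_b = {"breezy": 0.1, "warm-and-hedged": 0.3, "formal": 0.6}
for _ in range(20):
    model_a, model_b = step(model_a), step(model_b)
print({s: round(p, 3) for s, p in model_a.items()})
print({s: round(p, 3) for s, p in model_b.items()})
# Both collapse onto "warm-and-hedged", the shared reward's favorite style.
```

Both starting policies collapse onto the same style, which is the convergence the X thread was pointing at.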
Inflection AI’s Nuanced Training Strategy
To mitigate some of these RLHF limitations, Inflection AI has embarked on a more nuanced training strategy. The company is not only implementing improved RLHF; it has also taken steps toward agentic AI capabilities, which it abbreviates as AQ (Action Quotient), as CEO Sean White described in a recent interview.
Navigating a post-Suleyman world
Inflection AI has undergone significant internal changes over the past year. The departure of co-founder and CEO Mustafa Suleyman to Microsoft in an “acqui-hire,” along with a sizable portion of the team, cast doubt on the company’s trajectory. However, the appointment of White as CEO and a refreshed management team have set a new course for the organization.
Pi’s… actually pretty popular
Inflection AI’s unique approach with Pi is gaining traction beyond the enterprise space, particularly among users on platforms like Reddit. Members of the Pi community have been vocal about their experiences, sharing positive anecdotes and discussions of Pi’s thoughtful, empathetic responses.
What’s next for Inflection AI
Moving forward, Inflection AI’s focus on post-training features like Retrieval-Augmented Generation (RAG) and agentic workflows aims to keep its technology at the cutting edge of enterprise needs. The jury’s still out on whether Inflection AI’s novel approach will meaningfully set its models apart from the increasingly similar-sounding pack.
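For readers unfamiliar with RAG, here is a minimal sketch of the retrieval step, independent of Inflection AI’s implementation. The `embed` function is a stand-in bag-of-words toy; a real system would use a learned embedding model and a vector database.

```python
# A minimal sketch of Retrieval-Augmented Generation (RAG): retrieve the
# most relevant documents, then prepend them to the prompt sent to the LLM.
import math

def embed(text):
    # Hypothetical embedding: a tiny bag-of-words vector over a fixed vocab.
    vocab = ["refund", "policy", "vacation", "security", "login"]
    return [text.lower().count(w) for w in vocab]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

documents = [
    "Refund policy: refunds are issued within 14 days.",
    "Vacation policy: employees accrue 1.5 days per month.",
    "Security: report login issues to the IT helpdesk.",
]
doc_vectors = [embed(d) for d in documents]

def retrieve(query, k=1):
    q = embed(query)
    ranked = sorted(range(len(documents)),
                    key=lambda i: cosine(q, doc_vectors[i]), reverse=True)
    return [documents[i] for i in ranked[:k]]

query = "How do I get a refund?"
context = "\n".join(retrieve(query))
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {query}"
print(prompt)  # this augmented prompt is what gets sent to the LLM
```

Grounding answers in retrieved documents is what makes RAG attractive for enterprises, where responses need to reflect internal policies rather than the model’s general training data.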
FAQs
Q: What is RLHF?
A: RLHF stands for Reinforcement Learning from Human Feedback, a technique in which AI models are fine-tuned based on evaluations provided by human trainers.
Q: How does Inflection AI differentiate its models?
A: Inflection AI differentiates its models by incorporating employee feedback to tailor gen AI outputs to organizational culture, making them true cultural allies instead of generic chatbots.
Q: What is the focus of Inflection AI’s future developments?
A: Inflection AI’s focus is on post-training features like Retrieval-Augmented Generation and agentic workflows to keep their technology cutting edge for enterprise needs.
Credit: venturebeat.com