Why experts writing AI evals is creating the fastest-growing companies in history | Brendan Foody (CEO of Mercor)
Brendan Foody, CEO & co-founder of Mercor, discusses the "era of evals" in AI, Mercor's hyper-growth, and the creation of new high-skilled jobs for training AI models. He shares insights on building a successful company and adapting to the evolving labor market.
Deep Dive Analysis
11 Topic Outline
Introduction to Brendan Foody and Mercor's Rapid Growth
Understanding the Era of AI Evals and Their Importance
Landscape of AI Training and Data Labeling Companies
Future of Work and Skills in an AI-Driven Economy
Evolution of Labor Markets and AI's Impact on Hiring
How AI Models are Trained: Pre-training vs. Post-training
Mercor's Founding Story and Uncovering Market Opportunity
Core Tenets and Values Driving Mercor's Success
Lessons from Brendan's Past Entrepreneurial Ventures
Perspectives on AGI, Superintelligence, and Model Progress
Brendan's Personal AI Use and Entrepreneurial Advice
4 Key Concepts
Era of Evals
This refers to the current period where the primary bottleneck for improving AI models is effectively measuring what success looks like for the model. Evals serve as the product requirement documents (PRDs) for models, guiding researchers and demonstrating capabilities.
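The "eval as PRD" idea can be pictured as a small test harness: each case states a requirement, and the pass rate is the measurement researchers try to move. The sketch below is illustrative only; `EvalCase`, `run_eval`, and `toy_model` are hypothetical names, and the toy model is a stand-in for a real one.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EvalCase:
    prompt: str
    check: Callable[[str], bool]  # True if the output meets the requirement


def run_eval(model: Callable[[str], str], cases: List[EvalCase]) -> float:
    """Return the fraction of cases whose requirement the model satisfies."""
    passed = sum(1 for case in cases if case.check(model(case.prompt)))
    return passed / len(cases)


def toy_model(prompt: str) -> str:
    # Hypothetical stand-in for a real model: a fixed lookup table.
    answers = {"2 + 2": "4", "capital of France": "Paris"}
    return answers.get(prompt, "unknown")


cases = [
    EvalCase("2 + 2", lambda out: out.strip() == "4"),
    EvalCase("capital of France", lambda out: "Paris" in out),
    EvalCase("capital of Japan", lambda out: "Tokyo" in out),
]

score = run_eval(toy_model, cases)
print(f"pass rate: {score:.2f}")
```

Here the toy model meets two of the three requirements, so the eval exposes the missing capability in the same way a PRD exposes an unshipped feature.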
RLAIF (Reinforcement Learning from AI Feedback)
This approach to AI training is more scalable and data-efficient than relying on direct human feedback for every output. Humans define success criteria (e.g., rubrics, unit tests), and an AI then uses these criteria to incentivize and reward desired model capabilities.
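A minimal sketch of rubric-based reward, assuming a weighted rubric written by a human expert and an automated judge that decides whether each criterion is met. The names `rubric_reward` and `keyword_judge` are hypothetical, and the keyword match is only a placeholder for the AI grader; nothing here is taken from Mercor's actual tooling.

```python
from typing import Callable, List, Tuple

Rubric = List[Tuple[str, float]]  # (criterion, weight), defined by a human expert


def rubric_reward(
    output: str,
    rubric: Rubric,
    judge: Callable[[str, str], bool],
) -> float:
    """Weighted share of rubric criteria that the judge marks as satisfied."""
    total = sum(weight for _, weight in rubric)
    earned = sum(weight for criterion, weight in rubric if judge(output, criterion))
    return earned / total


def keyword_judge(output: str, criterion: str) -> bool:
    # Placeholder for an AI grader: a criterion counts as met if its
    # keyword appears in the output. A real judge would be a model call.
    return criterion.lower() in output.lower()


rubric = [("indemnification", 2.0), ("governing law", 1.0), ("termination", 1.0)]
draft = "The redline adds an indemnification cap and clarifies termination rights."

reward = rubric_reward(draft, rubric, keyword_judge)
print(reward)  # 0.75: the draft misses the "governing law" criterion
```

The human effort goes into writing the rubric once; the judge can then score any number of model outputs, which is where the scalability comes from.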
Elastic Demand (in jobs)
This concept describes job categories or industries where increased productivity, often due to AI, leads to a significant increase in demand for the output. Examples include software development and potentially consulting, where more output can always be absorbed by the market.
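One way to make this concrete is a toy constant-elasticity model. The assumptions here are mine, not the episode's: price falls in proportion to the productivity gain, and quantity demanded scales as price raised to the power of minus the elasticity. Under those assumptions, hours of work demanded rise only when elasticity exceeds 1.

```python
def labor_demand_multiplier(productivity_gain: float, elasticity: float) -> float:
    """Relative change in labor hours demanded after a productivity gain.

    Assumptions (illustrative): price is proportional to 1 / productivity,
    and quantity demanded is proportional to price ** (-elasticity).
    """
    price_ratio = 1.0 / productivity_gain
    quantity_ratio = price_ratio ** (-elasticity)
    # Each unit of output now needs 1 / productivity_gain as many hours.
    return quantity_ratio / productivity_gain


elastic = labor_demand_multiplier(2.0, 1.5)    # about 1.41: more hours demanded
inelastic = labor_demand_multiplier(2.0, 0.5)  # about 0.71: fewer hours demanded
print(round(elastic, 2), round(inelastic, 2))
```

With productivity doubled, an elasticity of 1.5 raises demand for labor by roughly 41 percent, while an elasticity of 0.5 cuts it by roughly 29 percent.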
Post-training Data
This type of data is used after a model has undergone initial 'pre-training' to acquire general knowledge. Post-training, often through reinforcement learning, helps the model refine its reasoning, prioritize accurate information, and learn specific capabilities like making medical diagnoses or redlining legal contracts.
6 Questions Answered
AI evals (evaluations) are systematic ways to measure how well AI automates specific tasks or capabilities. They are critical because they serve as the 'product requirement documents' for models, allowing researchers to define success, measure progress, and effectively apply reinforcement learning to improve model performance.
After pre-training, which provides general knowledge, models improve through post-training and reinforcement learning. This involves using human-defined criteria (evals, rubrics) and expert feedback to help the model focus on accurate information, prioritize effective reasoning chains, and learn specific, complex capabilities.
Jobs with 'elastic demand' will be most valuable: these are in industries where increased productivity leads to significantly more output (e.g., software development, product management, certain consulting roles). The ability to use AI tools to enhance daily workflows and do 'so much more' with technology will be crucial.
Mercor focuses on sourcing and assessing high-caliber professionals—like experienced software engineers, investment bankers, doctors, and lawyers—who can evaluate and interpret model capabilities. These experts create rubrics and success criteria, often working part-time on projects, to help models learn and improve in specific domains.
Mercor's success is attributed to three core values: a 'can-do attitude' (setting ambitious goals), 'high standards' (hiring exceptional talent with rigorous vetting), and 'intensity' (a culture of ownership and dedication to pushing boundaries, focusing on output over specific hours).
Brendan believes that while AI models are extraordinary and will automate most knowledge work tasks in the next 10 years, superintelligence is a longer road than some executives predict. He emphasizes that this long road will be paved with continuous expert-created evals and post-training data, rather than just more pre-training data.
24 Actionable Insights
1. Master AI for Career Survival
Prioritize becoming highly proficient in using AI tools and technologies, as individuals who effectively leverage AI will be more competitive and successful than those who don’t.
2. Embrace AI for Abundance
Adopt a mindset of abundance, focusing on how AI enables you to achieve significantly more and create new possibilities, rather than resisting it out of fear of job displacement.
3. Take Initiative and Build
Overcome the barrier of inaction by taking the initiative to build products or experiences that customers want, then invest the time and ambition to scale them up. AI is making it easier than ever to build.
4. Integrate AI into Daily Work
Actively learn and integrate AI tools into your daily workflows, regardless of your industry, to enhance your capabilities and productivity.
5. Prioritize Customer Obsession
Focus 100% of early company resources on building exceptional products and customer experiences, allowing word-of-mouth and customer love to drive growth before investing heavily in sales and marketing.
6. Track Market Leading Indicators
In fast-moving markets, prioritize identifying and acting on leading indicators of new demand pockets, especially where wealthy customers are willing to pay for solutions, to ensure you build the best product for flagship customers.
7. Define AI Model Success Metrics
To improve AI models, clearly define what success looks like for the model, creating effective measurement systems (evals) that serve as benchmarks and verifiers in reinforcement learning environments.
8. Leverage Strengths, Not Weaknesses
In management, focus on leveraging individuals’ strengths to maximize their impact rather than excessively trying to improve their weaknesses, recognizing that some areas may never be world-class.
9. Foster Intense, Output-Oriented Culture
Build an intense, output-oriented early-stage culture where team members are deeply bought-in and committed to achieving ambitious goals, focusing on results rather than specific hours worked.
10. Maintain High Hiring Standards
Implement incredibly high hiring standards, prioritizing talent density by seeking out exceptional individuals, including former founders or those with significant achievements, to shape the organizational culture.
11. Cultivate a Can-Do Attitude
Cultivate a “can-do” attitude to set ridiculously ambitious goals, as the company’s trajectory often forms around these high aspirations.
12. Find Easy-to-Sell Customers
Seek out customers who are surprisingly easy to sell to, indicating a significant pain point and potential for growth; balance strong conviction in your vision with openness to how the market evolves and your company fits in.
13. Embrace AI Tools in Assessments
When assessing talent or designing tasks, encourage the use of AI tools (like ChatGPT, Codex, Cursor) rather than prohibiting them, focusing on what individuals can achieve by leveraging these technologies.
14. Pursue Elastic Demand Industries
For career planning, focus on industries with elastic demand (e.g., software development, consulting, operations) where increased productivity from AI will lead to greater demand and output, rather than job displacement.
15. Automate Hiring with AI
For companies, leverage AI to automate the manual matching problems in hiring (resume review, interviews) to access a global, unified labor market and improve efficiency.
16. Focus on Human-AI Gaps
Identify and focus on tasks and domains where humans currently outperform AI, as these areas will continue to require human expertise for evaluation and improvement of models for the foreseeable future.
17. Adopt RLAIF for Scalability
Transition from human feedback (RLHF) to AI feedback (RLAIF) by having humans define scalable success criteria or rubrics (like unit tests for code) that AI can then use to incentivize and improve model capabilities.
18. Expert-Created Evaluation Rubrics
Engage domain experts (e.g., lawyers for legal tasks) to create detailed rubrics that define success criteria and allow for effective scoring of AI model outputs, guiding model improvement.
19. Hire High-Skilled AI Evaluators
When training AI models, prioritize sourcing and assessing highly skilled professionals (e.g., experienced software engineers, lawyers, doctors) who can effectively evaluate and interpret complex model capabilities.
20. Evals as Sales Collateral
Use evals not only as internal guides for researchers to build model capabilities but also as external sales collateral to demonstrate the efficacy and practical value of your AI products.
21. Measure AI Automation Efficacy
For enterprises, build systematic tests or rubrics to measure how effectively AI automates your company’s core value chain, as this is a prerequisite for effective AI application.
22. Treat Evals as PRDs
Treat AI model evaluations (evals) as essential product requirement documents (PRDs) that guide development and measure success, much as researchers use them to drive incremental improvements.
23. PMs: Leverage AI for Productivity
If you are a product manager, learn to leverage AI to significantly increase your productivity and output, as this will position you extremely well in the evolving job market.
24. Use AI for Writing & Thought Partnering
Utilize AI tools for writing documents and engage with them as a thought partner to reason through problems and gain advice, enhancing your thinking process.
5 Key Quotes
If the model is the product, then the eval is the product requirement document.
Brendan Foody
The market is bound by the amount of things where humans can do something that models can't.
Brendan Foody
AI won't replace you. People that are really good with AI will replace you.
Brendan Foody
Models are only as good as their evals.
Brendan Foody
You can just do things. So many people have ideas, but the barrier to more companies being built, I think, is just initiative and taking the steps to build the product or experience that customers want.
Brendan Foody