Generative AI is reshaping what’s possible in software, from augmenting human work with code and content generation to unlocking entirely new capabilities, such as retrieval-augmented generation, chatbots, and automated processing. But with great power come shocking costs.
In this talk, we’ll explore real-world enterprise use cases where GenAI is actively delivering results, breaking down why these innovations matter and what’s now possible that wasn’t before. We’ll then dive into the actual cost of scaling these systems, including when it makes sense to bring those workloads in-house, with a look at both engineering and infrastructure realities, the best tools in the ecosystem, and usage examples.
Finally, we’ll cover practical strategies to optimize your self-hosted GenAI deployments, from using off-the-shelf compressed models to fine-tuning your own, along with the rationale, expected gains, and tools to make it happen. Whether you're just getting started or an experienced machine learning engineer, you’ll leave with a clear, high-level intuition for what’s possible, why it works, what it costs, and how to start making smarter GenAI decisions for your teams!
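To make the compressed-model strategy concrete, here is a minimal sketch of serving a pre-quantized model with vLLM, one of the open-source inference servers in this ecosystem. The model identifier is illustrative only; any compatible quantized checkpoint could be substituted.

```python
# Minimal sketch: serving an off-the-shelf compressed (quantized) model with vLLM.
# The model ID below is illustrative -- swap in any pre-quantized checkpoint you trust.
from vllm import LLM, SamplingParams

# Load a 4-bit weight-quantized model; vLLM picks up the compression
# format from the checkpoint's configuration.
llm = LLM(model="neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w4a16")

params = SamplingParams(temperature=0.2, max_tokens=128)
outputs = llm.generate(["Summarize the benefits of model quantization."], params)

for output in outputs:
    print(output.outputs[0].text)
```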
Speaker

Mark Kurtz
Enabling Efficient AI @Red Hat | Former CTO Neural Magic | ML Innovator and Researcher
As an AI expert at Red Hat and former CTO of Neural Magic (acquired by Red Hat), I bring a wealth of experience in software engineering, machine learning, and startup leadership. Over 15 years, I've built a reputation for tackling complex research and engineering challenges and driving innovative solutions that make AI more efficient, scalable, and accessible.
At Neural Magic, I led a world-class team to breakthroughs across many AI domains, delivering state-of-the-art performance and accuracy for AI applications. Now, at Red Hat, I continue this mission, leading the development of open-source AI technologies that empower organizations to deploy generative AI solutions more cost-effectively and at scale.
I’m passionate about making AI innovation practical and transformative, as evidenced by my open-source contributions, patents, published research, and active engagement with the broader AI community.