From "Simple" Fine-Tuning to Your Own Mixture of Expert Models Using Open-Source Models


Nowadays, training a large language model (LLM) from scratch is a huge effort, even for very large companies. Starting from pre-trained models to create your own custom models is no longer just an option for resource-constrained organizations; it has become a necessary starting point for many.

In this context, various techniques and strategies can help to maximize the potential of pre-trained models:

  • LoRA: Low-rank adaptation, a technique that makes fine-tuning efficient by training small low-rank matrices added to the frozen pre-trained weights instead of updating all of the model's parameters.
  • Quantization and QLoRA: Methods that store model weights at lower numerical precision to reduce compute and memory requirements without significantly compromising performance, enabling more efficient deployment and fine-tuning.
  • Managing Multiple LoRA Adapters: Using several LoRA adapters on the same base model to equip it with multiple skills, allowing a flexible and modular approach to model capabilities.
  • Fine-Grained Embedding Management to Improve RAG (Retrieval-Augmented Generation): Better management of embeddings can significantly improve the performance of RAG systems, which combine the strengths of information retrieval and generative models.
  • Mixing Models: Creating Your MoE (Mixture of Experts) Model: This advanced technique combines several fine-tuned models into a Mixture of Experts model, leveraging the strengths of each individual model to enhance overall performance.
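
To make the LoRA and multiple-adapter ideas above more concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the method presented in the session: the layer size, the rank, the alpha value, and the adapter names "legal" and "medical" are all hypothetical, and a random matrix stands in for real training. It shows a frozen weight matrix W shared by two adapters, each adding a low-rank update (alpha / rank) * B * A, and compares the number of trainable values with full fine-tuning.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def add(a, b):
    """Element-wise sum of two matrices with the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def scale(a, s):
    """Multiply every entry of a matrix by the scalar s."""
    return [[x * s for x in row] for row in a]

def random_matrix(rows, cols, rng, std=0.02):
    """Matrix filled with small Gaussian values."""
    return [[rng.gauss(0.0, std) for _ in range(cols)] for _ in range(rows)]

class LoRAAdapter:
    """Low-rank update for one layer: delta_W = (alpha / rank) * B @ A."""

    def __init__(self, d_out, d_in, rank, alpha, rng):
        self.rank = rank
        self.alpha = alpha
        self.A = random_matrix(rank, d_in, rng)        # rank x d_in
        self.B = [[0.0] * rank for _ in range(d_out)]  # d_out x rank, zero at start

    def num_params(self):
        """Number of values that would be trained for this adapter."""
        return len(self.A) * len(self.A[0]) + len(self.B) * len(self.B[0])

def forward(W, x, adapter=None):
    """Compute y = W x, plus (alpha / rank) * B (A x) when an adapter is active."""
    y = matmul(W, x)
    if adapter is not None:
        update = matmul(adapter.B, matmul(adapter.A, x))
        y = add(y, scale(update, adapter.alpha / adapter.rank))
    return y

rng = random.Random(0)
d_in = d_out = 64   # hypothetical layer size
rank = 4            # hypothetical LoRA rank

W = random_matrix(d_out, d_in, rng)  # frozen base weights, shared by all adapters
x = random_matrix(d_in, 1, rng)      # one input as a column vector

# Hypothetical skill names; each adapter is small and can be swapped per request.
adapters = {
    "legal": LoRAAdapter(d_out, d_in, rank, alpha=8, rng=rng),
    "medical": LoRAAdapter(d_out, d_in, rank, alpha=8, rng=rng),
}

full_params = d_out * d_in                       # 64 * 64 = 4096
adapter_params = adapters["legal"].num_params()  # 4 * 64 + 64 * 4 = 512

base_out = forward(W, x)
untrained_out = forward(W, x, adapters["legal"])  # B is zero, so nothing changes yet

# Stand-in for training: give the "legal" adapter a non-zero B (assumption).
adapters["legal"].B = random_matrix(d_out, rank, rng)
legal_out = forward(W, x, adapters["legal"])
medical_out = forward(W, x, adapters["medical"])  # still untrained

print("trainable values, full fine-tuning:", full_params)
print("trainable values, one adapter:", adapter_params)
print("untrained adapter leaves output unchanged:", untrained_out == base_out)
print("trained adapter changes output:", legal_out != base_out)
```

In this toy setting one adapter holds 512 values against 4096 for the full matrix, and the base weights are never modified, which is what lets several adapters share one model. In practice, a library such as Hugging Face PEFT would typically handle this rather than hand-written matrix code.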

These strategies provide a robust toolkit for those who plan to adapt and enhance pre-trained models to meet specific needs, even without deep expertise in machine learning. By understanding and applying these techniques, organizations can harness the power of modern AI more efficiently and effectively while cutting costs.
 

What's the focus of your work these days?

My team and I create LLMs from scratch, teach others how to use and fine-tune them, and use existing services (such as Gemini, OpenAI, and Claude). We combine all of these with a proprietary AI engine that orchestrates flows and routes each request to the best option.

My insight is that nothing is perfect, not even products from trillion-dollar companies; so far, there is always something that can be improved or shaped to better fit your needs. Of course, nothing comes free of cost; there is no magic.

What technical aspects of your role are most important?

When dealing with AI, it is certainly the ability to keep your feet on the ground and control costs. We are fascinated by technology, perhaps too much, but AI can be very expensive.

Being a CTO who deals with AI today, much more than in other fields, is not just a matter of algorithms and technology; it also requires the ability to make compromises between quality and cost. The best algorithm is useless if it cannot be sold.

How does your InfoQ Dev Summit Munich session address current challenges or trends in the industry?

At first glance, it is a tech session, and of course, it is. This session will help people find answers very quickly to problems I spent weeks and months addressing; the main goal is to help people create more cost-effective AI products.

How do you see the concepts discussed in your InfoQ Dev Summit Munich session shaping the future of the industry?

Very few companies can compete with giants like OpenAI, Anthropic, Meta, and Microsoft. No matter how rich they believe they are, they simply can't: OpenAI burns through 5 billion per year. AI is powerful enough for many tasks, but for it to be profitable for our businesses, we need to find a way to cut costs dramatically.

With my session, I do not plan to save the world but to offer a different point of view: an approach oriented toward saving costs.


Speaker

Sebastiano Galazzo

CTO @Synapsia AI, Winner of Three AI Awards, Microsoft MVP for Artificial Intelligence Category, 25 Years Working in AI and ML

Winner of three AI awards, I’ve been working in AI and machine learning for 25 years, designing and developing AI and computer graphics algorithms.

I’m very passionate about AI, focusing on Audio, Image and Natural Language Processing, and predictive analysis as well.
I have received several national and international awards that recognize my work and contributions in these areas.

As a Microsoft MVP in the Artificial Intelligence category, I have the pleasure of being a guest speaker at national and international events.
