Hi, I'm Raul Rocha
Data Scientist & LLM Engineer
A Data Scientist and LLM Engineer with a passion for building systems that transform messy, unstructured data into clear signals and smarter decisions.
My journey started at USP, where I studied Molecular Sciences: a deeply interdisciplinary program that gave me strong roots in math, statistics, programming, and scientific thinking. Since then, I've worked across the data lifecycle, from SQL pipelines and automated dashboards to forecasting models, churn prediction, and production-ready GenAI solutions.
Over the last few years, I've led and delivered projects that improved marketing performance, cut operational time, and helped teams move faster with fewer errors. What excites me isn't just building a model; it's building something that works. Something that integrates with people, processes, and platforms.
What I work on
- Fine-tuning open-source LLMs (Mistral, DeepSeek, etc.) for business-specific use cases
- Designing RAG pipelines using vector DBs, LangChain, and semantic search (see the sketch after this list)
- Deploying LLMs on AWS, with scalability and cost in mind (SageMaker, ECR, Athena, S3)
- Building AI agents that automate complex workflows like campaign mapping
- Maintaining clean SQL pipelines for marketing, ecommerce, and funnel analytics
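To make the RAG item concrete, here is a minimal sketch of the retrieve-then-generate pattern I mean. It assumes langchain, langchain-openai, and faiss-cpu are installed and an OpenAI API key is set in the environment; the file name, model, and question are placeholders, not a real project.

```python
# Minimal RAG sketch: load docs, chunk, embed, retrieve, then answer with context.
from langchain_community.document_loaders import TextLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_openai import OpenAIEmbeddings, ChatOpenAI
from langchain_community.vectorstores import FAISS

# Load and chunk the source documents (hypothetical file)
docs = TextLoader("campaign_notes.txt").load()
chunks = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=50).split_documents(docs)

# Embed the chunks and index them for semantic search
store = FAISS.from_documents(chunks, OpenAIEmbeddings())
retriever = store.as_retriever(search_kwargs={"k": 4})

# Retrieve relevant context and let the LLM answer grounded in it
question = "Which campaigns drove the most signups last quarter?"
context = "\n\n".join(d.page_content for d in retriever.invoke(question))
answer = ChatOpenAI(model="gpt-4o-mini").invoke(
    f"Answer using only this context:\n{context}\n\nQuestion: {question}"
)
print(answer.content)
```

In production the interesting work is around this skeleton: chunking strategy, retrieval quality, evaluation, and cost.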
When needed, I still reach for traditional ML, especially when a lean logistic regression or XGBoost model solves the problem just as well.
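By a lean model I mean something like the baseline below: a scikit-learn logistic regression for churn, fit on a handful of tabular features. The dataset and column names are made up for illustration.

```python
# Lean churn baseline sketch; "customers.csv" and its columns are hypothetical.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import roc_auc_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

df = pd.read_csv("customers.csv")
X = df[["tenure_months", "monthly_spend", "support_tickets"]]
y = df["churned"]

X_train, X_test, y_train, y_test = train_test_split(X, y, stratify=y, random_state=42)

# Scale features and fit a simple, interpretable model
model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
model.fit(X_train, y_train)

print("ROC AUC:", roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]))
```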
I currently specialize in the practical side of Generative AI.
My approach
I don't build AI for the sake of buzz. I build things that ship, keep improving, and stay useful over time. I prefer:
- Testable experiments over assumptions
- Simpler tools used well, rather than exotic stacks no one maintains
- Transparent documentation that non-engineers can follow
Some results I'm proud of:
- Reducing API inference cost by ~60% after fine-tuning
- Saving 30 minutes per analyst per day through LLM-powered mapping automation
- Improving decision speed with dashboards tied directly to modeled forecasts and churn signals
But most of all, I like when the people using what I build say: "This actually helped."
Why I created fromdata2ai.com
This site is part portfolio, part lab notebook, part digital garden. It exists because I believe:
- Good ideas become better when written down
- AI should be shared, questioned, improved; not hidden behind buzzwords
- The best learning happens when we build in the open
Here, I document the systems I'm building, the ideas I'm refining, and the trade-offs I'm constantly learning from. Whether it's fine-tuning a model for market prediction, embedding customer feedback into vector stores, or exploring how GenAI tools integrate with human workflows — I share the technical details and the design thinking behind them.
A few things that shape how I think
- I'm deeply curious about how information shapes decisions, especially in markets and in teams
- I believe data science is both a craft and a conversation, and clarity is part of our responsibility
- I enjoy systems that blend language, logic, and structure, from pipelines to prompts
- Outside work, I read sci-fi, study behavior, and reflect on how technology transforms how we relate to each other
If you're someone who cares about building AI that works (and keeps working), I hope this space brings you something useful.