Gold Market Prediction with LLMs

Financial News Impact Prediction

Using LLMs to Predict Gold Market Movements from News Sentiment

Mistral-7B FinBERT AWS SageMaker LoRA Fine-tuning Financial ML

Project Overview

Predicting XAU/USD price movements using LLMs and sentiment analysis

This project demonstrates a production-ready ML pipeline that predicts gold market movements by analyzing financial news using Large Language Models. It combines fine-tuned Mistral-7B with FinBERT sentiment analysis to generate trading signals across multiple time horizons.

Business Value

Automated Trading Signals: Predict gold price direction and magnitude from financial news
Multi-Horizon Forecasting: 6h, 12h, 24h, and 48h prediction windows
Scalable Infrastructure: AWS SageMaker with GPU acceleration
Cost-Effective: LoRA fine-tuning reduces training costs by 90%

Key Features

🤖 Fine-tuned Mistral-7B

Specialized for financial text understanding and gold market prediction

📊 FinBERT Sentiment

Financial news sentiment analysis with confidence scoring

📰 Headline Rewriting

Automated context enhancement for gold market relevance

🎯 Multi-class Prediction

Direction (Up/Down/Neutral) + Magnitude (4 levels)

System Architecture

End-to-end ML pipeline on AWS infrastructure

News API → Headline Rewriting → Sentiment Analysis → Feature Engineering → LLM Prediction
 (Alpaca)     (Mistral-7B)         (FinBERT)          (SQL/Athena)      (Fine-tuned Mistral)
                   ↓                    ↓                    ↓                    ↓
             AWS SageMaker        AWS SageMaker        AWS Athena          AWS SageMaker

Pipeline Components

1. Data Collection

Alpaca Markets API for real-time financial news + MetaTrader 5 for XAU/USD market data

2. Headline Rewriting

Mistral-7B enhances headlines with gold market context

3. Sentiment Analysis

FinBERT extracts sentiment (Positive/Neutral/Negative) with confidence scores

4. Feature Engineering

AWS Athena processes market data, ETFs, indices, and correlates

5. LLM Prediction

Fine-tuned Mistral predicts direction and magnitude across 4 time horizons

Technical Deep Dive

Key implementation details and engineering decisions

LoRA Fine-tuning Strategy

Instead of fine-tuning all 7 billion parameters of Mistral, we use Low-Rank Adaptation (LoRA) which adds small trainable matrices to the model's attention layers.

Why LoRA?

99% parameter reduction: Only ~4M trainable params vs 7B
90% cost savings: Smaller GPU memory requirements
Faster training: 3 epochs in ~2 hours on ml.g5.2xlarge
No catastrophic forgetting: Base model knowledge preserved

# LoRA Configuration
lora_config = LoraConfig(
    r=16,              # Rank of update matrices
    lora_alpha=32,     # Scaling factor
    target_modules=[   # Which layers to adapt
        "q_proj", "k_proj", "v_proj", "o_proj"
    ],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

# Training Configuration
training_args = TrainingArguments(
    output_dir="./mistral-gold-lora",
    num_train_epochs=3,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    learning_rate=2e-4,
    fp16=True,                    # Mixed precision
    logging_steps=50,
    save_strategy="epoch",
    evaluation_strategy="epoch"
)

Data Pipeline Architecture

The system processes financial news through multiple stages, each adding value:

1. News Collection & Filtering

Alpaca Markets API provides real-time financial news. We filter for gold-relevant keywords:

Keywords: ["gold", "XAU", "precious metals", "Fed", "inflation", 
           "dollar", "treasury yields", "safe haven"]

# API Call Example
news = alpaca.get_news(
    symbols=["GLD", "IAU"],
    start=datetime.now() - timedelta(hours=24),
    limit=100
)
filtered = [n for n in news if any(kw in n.headline.lower() 
            for kw in gold_keywords)]

2. Headline Rewriting

Mistral-7B rewrites headlines based on the full article body, focusing on gold-correlated assets (USD, yields, Fed policy, etc.):

Original:

"Fed signals rate cuts in 2024"

Rewritten:

"Fed rate cut signals boost gold as safe haven demand rises"

3. Sentiment Extraction

FinBERT analyzes the rewritten headline (which focuses on gold-correlated assets):

{
  "sentiment": "Positive",
  "confidence": 0.94
}

Note: Only the rewritten headline is analyzed, as it contains the gold market context.

4. Feature Engineering

Combine news sentiment with market data:

XAU/USD price, volume, volatility (30-min candles)
Gold ETFs: GLD, IAU performance
USD Index (DXY) movements
10-year Treasury yields
Real yields (TIPS)

-- Athena Query for Feature Aggregation
SELECT 
    news_timestamp,
    AVG(sentiment_score) as avg_sentiment,
    xau_close - LAG(xau_close, 1) as price_change,
    dxy_close as dollar_strength,
    treasury_10y as yield_level,
    gld_volume / AVG(gld_volume) OVER (ORDER BY timestamp 
        ROWS BETWEEN 20 PRECEDING AND CURRENT ROW) as volume_ratio
FROM market_data
WHERE timestamp BETWEEN news_timestamp - INTERVAL '2' HOUR 
    AND news_timestamp

Model Output Format

The fine-tuned model outputs structured JSON predictions for 4 time horizons:

{
  "6h": {
    "direction": "Up",
    "magnitude": "Medium",
    "confidence": 0.82
  },
  "12h": {
    "direction": "Up", 
    "magnitude": "High",
    "confidence": 0.76
  },
  "24h": {
    "direction": "Neutral",
    "magnitude": "Low",
    "confidence": 0.65
  },
  "48h": {
    "direction": "Down",
    "magnitude": "Medium",
    "confidence": 0.58
  }
}

💡 Key Insight

The model achieves 100% JSON validity through careful prompt engineering and output parsing. This is critical for production systems where downstream processes expect structured data.

Inference Pipeline

# Production Inference
def predict_gold_movement(news_headline, market_features):
    # Rewrite headline with gold context
    rewritten = mistral_rewriter.generate(
        f"Rewrite focusing on gold impact: {news_headline}"
    )
    
    # Extract sentiment
    sentiment = finbert(rewritten)
    
    # Combine features
    features = {
        "headline": rewritten,
        "sentiment": sentiment["label"],
        "confidence": sentiment["score"],
        **market_features  # XAU price, DXY, yields, etc.
    }
    
    # Generate prediction
    prompt = format_prediction_prompt(features)
    prediction = mistral_finetuned.generate(prompt)
    
    return json.loads(prediction)  # Structured output

Results & Performance

Production results on real financial news data

0.90

Direction F1-Score

0.85

Direction Accuracy

100%

JSON Validity

~1000

News/Hour Processing

Cost Analysis

Component	Cost	Notes
Training (ml.g5.2xlarge)	~$2.50/hour	GPU-accelerated fine-tuning
Inference (ml.g4dn.xlarge)	~$0.70/hour	Real-time predictions
Storage (S3)	~$0.023/GB/month	Data and models
Full Training Cycle	~$120	Complete pipeline

Models & Deployment

Modern ML engineering practices

Models & Methods

FinBERT Sentiment Analysis

Model: yiyanghkust/finbert-tone
Task: 3-class sentiment (Positive/Neutral/Negative)
Confidence scoring for intensity measurement

Mistral-7B Fine-tuning

Base: mistralai/Mistral-7B-Instruct-v0.2
Method: LoRA (r=16, alpha=32)
Precision: 4-bit quantization
Training: 3 epochs, ~5200 steps
Reduces trainable parameters by 99%

Use Cases

Algorithmic Trading

Generate trading signals from news events

Risk Management

Anticipate volatility from market events

Market Research

Analyze news impact patterns

Portfolio Optimization

Adjust gold exposure based on sentiment

View on GitHub →