Project Alpha-Sentient

Decoding Crypto Chaos with Fine-Tuned LLMs & Quantitative Market Profiling. A fusion of qualitative market psychology and quantitative market structure.


LLM Fine-tuning
Data Engineering
Quantitative Analysis
React/D3.js

Alpha-Sentient Dashboard: Real-time sentiment gauge fused with price action.

Beyond the Noise

In the relentless 24/7 cycle of cryptocurrency markets, traditional sentiment analysis tools are obsolete. They are too slow, too generic, and utterly incapable of understanding the nuanced, meme-heavy vernacular of Crypto Twitter and Discord. A standard NLP model sees chaos; a trader needs actionable signal.

My goal was to bridge the gap between qualitative market psychology and quantitative market structure. I didn't just want to know if the market was bullish or bearish; I wanted to know the probability of that sentiment translating into price action, and exactly where that action would stall or accelerate based on order flow and dealer positioning.

This project is the culmination of that vision: an end-to-end pipeline that fuses a custom fine-tuned Large Language Model (LLM) with technical and derivatives market data to predict market direction and produce specific price targets.

Phase 1: The Data Engineering Challenge

A model is only as good as its diet. The first hurdle was constructing a robust, low-latency data pipeline capable of ingesting a firehose of unstructured text.

I built scrapers for high-signal Twitter accounts, exclusive Discord trading groups, and real-time news wires. The cleaning process was the critical differentiator. Standard cleaning removes slang, but in crypto, slang is the signal. I developed custom regex and preprocessing steps to retain context around terms like "rekt," "ape in," "wagmi," and specific token tickers, while filtering out bot spam and low-quality engagement farming.


The Chaotic Input:
`"$BTC TO 100K!"`, `"rekt"`, `"ape in"`, `"wagmi"`

The Structured Output:
{"source": "twitter", "text": "apeing into $SOL", "cleaned_text": "entering long position solana", "context_tags": ["high_fomo"]}
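A minimal sketch of how such a cleaning step could look, using only the standard library. The slang glossary, the ticker table, the tag names other than "high_fomo", and the function name clean_post are illustrative assumptions, not the project's actual preprocessing code.

```python
import re

# Hypothetical glossary: slang phrase -> (plain-language replacement, context tag).
SLANG = {
    "apeing into": ("entering long position", "high_fomo"),
    "ape in": ("enter long position", "high_fomo"),
    "rekt": ("heavy losses", "capitulation"),
    "wagmi": ("we are all going to make it", "high_conviction"),
}

# Hypothetical ticker table; a real pipeline would load a full asset list.
TICKERS = {"BTC": "bitcoin", "SOL": "solana"}

TICKER_RE = re.compile(r"\$([A-Za-z]{2,10})\b")
# Longest phrases first so multi-word slang wins over shorter overlaps.
SLANG_RE = re.compile(
    r"\b(?:"
    + "|".join(re.escape(k) for k in sorted(SLANG, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def clean_post(source: str, text: str) -> dict:
    """Translate slang and tickers instead of stripping them, and tag the context."""
    tags = []

    def expand_slang(match):
        replacement, tag = SLANG[match.group(0).lower()]
        if tag not in tags:
            tags.append(tag)
        return replacement

    cleaned = SLANG_RE.sub(expand_slang, text)
    # Unknown tickers are kept as the bare symbol rather than dropped.
    cleaned = TICKER_RE.sub(
        lambda m: TICKERS.get(m.group(1).upper(), m.group(1).upper()), cleaned
    )
    cleaned = re.sub(r"\s+", " ", cleaned).strip().lower()
    return {
        "source": source,
        "text": text,
        "cleaned_text": cleaned,
        "context_tags": tags,
    }


print(clean_post("twitter", "apeing into $SOL"))
```

The point of the design is that slang is mapped to a normalized phrase and a tag rather than deleted, so the signal carried by the vernacular survives cleaning.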

Phase 2: Training the "Crypto-Native" Brain

Off-the-shelf models like GPT-4 are too polite and generalized for financial sentiment grading. They fail to distinguish between a "bearish bot post" and genuine "whale capitulation fear."


I selected an open-source foundation model (Llama 3) for its efficiency. I created a proprietary dataset of 50,000+ crypto-specific text examples labelled not just for sentiment (Positive/Negative), but for market impact potential (e.g., "High-Impact FUD," "Retail FOMO," "Institutional Accumulation").

Using QLoRA, a parameter-efficient fine-tuning method that trains small low-rank adapters on top of a 4-bit quantized base model, I fine-tuned the model on the specific context of financial vernacular. The result was a 38% increase in accuracy over base models in correctly identifying market-moving narratives before price impact.
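To give a sense of why adapter-based fine-tuning is cheap, here is a short arithmetic sketch. For a frozen weight matrix of shape (d_out, d_in), a rank-r low-rank adapter adds r * (d_in + d_out) trainable parameters. The layer size and rank below are illustrative assumptions, not the configuration used in this project.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters in the two adapter matrices: A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)


def trainable_fraction(d_in: int, d_out: int, rank: int) -> float:
    """Adapter parameters as a share of the frozen matrix they sit beside."""
    return lora_trainable_params(d_in, d_out, rank) / (d_in * d_out)


# Illustrative values only: one 4096 x 4096 projection with a rank-16 adapter.
params = lora_trainable_params(4096, 4096, 16)
fraction = trainable_fraction(4096, 4096, 16)
print(params)             # 131072
print(f"{fraction:.2%}")  # 0.78%
```

Under these assumed sizes, the adapter trains well under 1% of the parameters in the layer it modifies, while the quantized base weights stay frozen.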

Phase 3: The Fusion

This is where the project moves beyond standard data science and into actionable quantitative finance. An LLM can tell you everyone is scared, but it can't tell you where the bottom is.

I built an inference engine that takes the LLM’s real-time sentiment score (ranging from -1.0 Extreme Fear to +1.0 Extreme Greed) and cross-references it with hard market structure data, namely Cumulative Volume Delta (CVD), Gamma Exposure (GEX), and Time Price Opportunity (TPO) profiles, to generate actionable trade setups.
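A minimal sketch of what such a fusion rule could look like. The class name, field names, thresholds, and signal labels are all hypothetical, and the rule covers only the contrarian case of extreme fear being overruled by structure; the real engine's logic is not published here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StructureSnapshot:
    """Hypothetical boolean summary of the three market-structure inputs."""
    cvd_bullish_divergence: bool  # CVD holds up while price makes new lows
    dealers_short_gamma: bool     # negative GEX regime
    tpo_low_resolved: bool        # TPO profile shows no unfinished business below


def fuse(sentiment: float, structure: StructureSnapshot,
         fear_threshold: float = -0.7, min_confirmations: int = 2) -> str:
    """Combine a sentiment score in [-1.0, 1.0] with structure confirmations."""
    if not -1.0 <= sentiment <= 1.0:
        raise ValueError("sentiment must be within [-1.0, 1.0]")
    confirmations = sum([
        structure.cvd_bullish_divergence,
        structure.dealers_short_gamma,
        structure.tpo_low_resolved,
    ])
    # Extreme fear alone is treated as lagging; it only becomes a setup
    # when enough structure inputs disagree with the crowd.
    if sentiment <= fear_threshold and confirmations >= min_confirmations:
        return "BULLISH_REVERSAL"
    return "NO_SETUP"


print(fuse(-0.85, StructureSnapshot(True, True, True)))
```

Keeping the structure inputs as explicit booleans makes each setup auditable: the output can always be traced back to which conditions fired.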

Case Study: The "Bear Trap" Reversal Protocol

The market had been bleeding for three days. The LLM detected peak "retail capitulation" language, flashing an Extreme Fear score of -0.85. However, price was stalling.

The Data Fusion (The Alpha):
1. CVD Divergence: Price made lower lows, but Cumulative Volume Delta made higher lows (absorption).
2. Negative GEX: Market makers were short gamma, forced to buy aggressively as price ticked up.
3. Poor Low Structure: The TPO profile suggested that the unfinished business at the prior poor low had been resolved.
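The first condition, a CVD divergence, can be sketched as follows. This is a simplified window comparison with made-up volumes and prices; the function names and the split-point approach are assumptions, and a production detector would likely use proper swing-point detection.

```python
from itertools import accumulate


def cumulative_volume_delta(buy_volume, sell_volume):
    """Running sum of (aggressive buy volume - aggressive sell volume) per bar."""
    return list(accumulate(b - s for b, s in zip(buy_volume, sell_volume)))


def bullish_cvd_divergence(lows, cvd, split):
    """True when the later window prints a lower price low but a higher CVD low."""
    if len(lows) != len(cvd) or not 0 < split < len(lows):
        raise ValueError("series must match in length and split must fall inside them")
    price_lower_low = min(lows[split:]) < min(lows[:split])
    cvd_higher_low = min(cvd[split:]) > min(cvd[:split])
    return price_lower_low and cvd_higher_low


# Made-up bars for illustration only.
buys = [10, 5, 12, 9, 14, 15]
sells = [15, 20, 8, 10, 9, 10]
lows = [100, 97, 99, 96, 98, 99]

cvd = cumulative_volume_delta(buys, sells)
print(cvd)                                   # [-5, -20, -16, -17, -12, -7]
print(bullish_cvd_divergence(lows, cvd, 3))  # True
```

In the sample data, price undercuts its earlier low (96 versus 97) while CVD bottoms higher (-17 versus -20), which is the absorption pattern described in item 1.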

The Output: The Trade Plan
Based on this fusion, the system rejected the bearish sentiment as a lagging indicator and predicted an imminent squeeze.

• SIGNAL: BULLISH Reversal (High Confidence)
• Invalidation: $58,200
• Target 1: $61,500 (Fib 0.382)
• Target 2: $64,100 (GEX Wall & Golden Ratio)
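For readers unfamiliar with how Fibonacci targets are derived, here is a short arithmetic sketch. The swing high and low are round, hypothetical numbers chosen for readability; they are not the swing points behind the targets listed above.

```python
def fib_retracement_targets(swing_high, swing_low, ratios=(0.382, 0.618)):
    """Upside targets for a bounce off swing_low, as retracements of the prior decline."""
    if swing_high <= swing_low:
        raise ValueError("swing_high must be above swing_low")
    span = swing_high - swing_low
    return {ratio: round(swing_low + ratio * span, 2) for ratio in ratios}


# Hypothetical decline from 60,000 to 50,000.
targets = fib_retracement_targets(60_000, 50_000)
print(targets)  # {0.382: 53820.0, 0.618: 56180.0}
```

The 0.618 level is the one commonly called the golden ratio retracement; a target gains weight when it coincides with an independent level such as a GEX wall.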

Architecture & Results

The entire system runs in a Dockerized environment on AWS. Kafka handles real-time ingestion, feeding the PyTorch-based LLM inference engine.
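One common pattern between a message stream and a model is micro-batching, so that inference runs on groups of posts rather than one at a time. The sketch below is an assumption about how that hand-off could work, not a description of the deployed code; it accepts any iterable so it stays self-contained, and a Kafka consumer would take the place of the sample list.

```python
from typing import Iterable, Iterator, List


def micro_batches(messages: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Group a stream of messages into lists of at most batch_size items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[str] = []
    for message in messages:
        batch.append(message)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Flush the final partial batch so no message is dropped.
        yield batch


# A plain list stands in for the real-time stream.
stream = ["post a", "post b", "post c", "post d", "post e"]
for batch in micro_batches(stream, batch_size=2):
    print(batch)
```

A live system would also need a time-based flush so that a quiet stream does not hold a partial batch indefinitely; that is omitted here for brevity.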


This project showed that while sentiment drives narratives, structure dictates price. By fusing LLM sentiment scoring with quantitative market-structure data, I gained a significant edge in anticipating market moves.