BioFinderLM V2.5: From On-Demand Search to a Weekly Email Digest

Project LLM Bioinformatics NLP Automation

Published on Feb 15, 2026

This is the third post in the BioFinderLM series. If you’re new here, start with the project overview, then see v1 and v2.

What Changed, and Why

After running v2 for a few weeks, two pain points emerged:

I was still running it manually. Every Monday morning I’d open a terminal, run the script, wait 10 to 15 minutes, and check the results. This is exactly the kind of task that should run itself.

Classification was wasteful at the tail end. After DPR ranking, the top papers are genuinely relevant, but by paper #200 you’re deep into noise. The LLM was dutifully classifying papers as “Low” confidence, spending API tokens to confirm what the DPR score already suggested. I needed an early stopping mechanism.

v2.5 addresses both by turning BioFinderLM into a scheduled job that emails me a weekly digest of new high-confidence papers.

Workflow

Key Improvements Over v2

Feature	v2.0	v2.5
Execution	Manual	Cron-schedulable
Classification	Always full run	Adaptive early stopping
Email digest	❌	✅ Weekly HTML / plain-text
DPR ranking	Articles only	Articles + Preprints
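
This post does not show how the digest is actually assembled, so the following is only a minimal sketch of one way to build a message with both a plain-text and an HTML body using Python's standard library. The function name `build_digest`, the `title` and `url` keys, and the example addresses are assumptions made for illustration, not BioFinderLM's real code.

```python
from email.message import EmailMessage
from html import escape


def build_digest(papers, sender, recipient):
    """Build a multipart/alternative email listing this week's papers.

    `papers` is assumed to be a list of dicts with "title" and "url" keys
    (hypothetical field names, not taken from BioFinderLM).
    """
    msg = EmailMessage()
    msg["Subject"] = f"BioFinderLM weekly digest: {len(papers)} new papers"
    msg["From"] = sender
    msg["To"] = recipient

    # Plain-text body: one line per paper, with a fallback for quiet weeks.
    lines = [f"- {p['title']} ({p['url']})" for p in papers]
    msg.set_content("\n".join(lines) or "No new high-confidence papers this week.")

    # HTML alternative: escape titles and URLs so special characters render safely.
    items = []
    for p in papers:
        href = escape(p["url"], quote=True)
        title = escape(p["title"])
        items.append(f'<li><a href="{href}">{title}</a></li>')
    html_body = "<html><body><ul>" + "".join(items) + "</ul></body></html>"
    msg.add_alternative(html_body, subtype="html")
    return msg
```

The resulting message could then be handed to `smtplib.SMTP.send_message` from whatever scheduled entry point runs the weekly job. Mail clients that cannot render HTML fall back to the plain-text part.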

Adaptive Classification: Early Stopping

The core idea is simple: papers are classified in descending order of DPR score, and if none of the last several papers classified came back as “High” confidence, there is little value in continuing. In practice, this cuts classification time noticeably on unfocused queries where the relevant literature is small, while leaving highly active fields mostly unaffected, since high-confidence papers keep appearing throughout the ranking.
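
As a rough illustration of that stopping rule, here is a minimal sketch. It assumes the papers are already sorted by DPR score and that the classifier returns a label string such as "High" or "Low". The names `classify_with_early_stopping`, `classify`, and `patience`, along with the default window of 20, are hypothetical. The post does not state the actual window size or implementation.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

Paper = TypeVar("Paper")


def classify_with_early_stopping(
    papers: Sequence[Paper],
    classify: Callable[[Paper], str],
    patience: int = 20,  # assumed window size; the real value is not given
) -> List[Tuple[Paper, str]]:
    """Classify papers in ranked order, stopping after a run of non-High results.

    `papers` must already be sorted by descending DPR score.
    `classify` stands in for the LLM call and returns a confidence label.
    """
    results: List[Tuple[Paper, str]] = []
    since_last_high = 0  # consecutive papers classified without a "High"

    for paper in papers:
        label = classify(paper)
        results.append((paper, label))

        if label == "High":
            since_last_high = 0
        else:
            since_last_high += 1
            if since_last_high >= patience:
                # The tail of the ranking is unlikely to hold more hits.
                break

    return results
```

With this shape, a field where "High" papers keep appearing resets the counter repeatedly and runs to the end of the list, while a sparse query stops `patience` papers after its last hit.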
