The why
The engine room for messy papers.
Research papers are messy in two ways. Each one is long and unstructured — critical details like datasets, baselines, metrics, and limitations are buried deep inside PDFs with no reliable schema. And there are simply too many of them: manual reading doesn't scale, and keyword search returns noise rather than comparable evidence.
What you actually need is to extract evidence and key findings, then compare them side by side across many papers, reliably and repeatedly. That's what gateway makes possible. It's not a standalone product; it's the deep infrastructure behind the Article API extraction pipeline and the research experience at research.jing.vision.
Each extraction is a typed, versioned prompt function. The pipeline takes an arXiv ID, retrieves and parses the PDF into sections, runs six parallel LLM passes, validates every output against a JSON schema with confidence scoring, deduplicates against the index, and serves structured results to the API. The result is repeatable, observable, and cheap enough to run across hundreds of papers.
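To make the shape of that pipeline concrete, here is a minimal Python sketch. It is not gateway's actual code, and everything it names is an assumption made for illustration: the six pass names, the `PROMPT_VERSION` tag, the `fake_llm_pass` stub that stands in for a real model call, the hand-rolled `validate` check that stands in for JSON Schema validation, and deduplication keyed on the arXiv ID. PDF retrieval and parsing are skipped; the sketch starts from already-parsed sections.

```python
import asyncio
from dataclasses import dataclass

# Hypothetical pass names; the text says only that there are six passes.
PASSES = ("datasets", "baselines", "metrics", "limitations", "evidence", "findings")
PROMPT_VERSION = "v1"  # hypothetical version tag attached to every result


@dataclass(frozen=True)
class PassResult:
    name: str
    version: str
    items: tuple
    confidence: float


async def fake_llm_pass(name: str, sections: dict) -> dict:
    """Stand-in for a real LLM call; returns a fixed, schema-shaped dict."""
    await asyncio.sleep(0)  # yield control, as a network call would
    return {"items": [f"{name} found in {len(sections)} sections"], "confidence": 0.9}


def validate(name: str, raw: dict) -> PassResult:
    """Minimal hand-rolled check standing in for JSON Schema validation."""
    items, conf = raw.get("items"), raw.get("confidence")
    if not isinstance(items, list) or not all(isinstance(i, str) for i in items):
        raise ValueError(f"{name}: 'items' must be a list of strings")
    if isinstance(conf, bool) or not isinstance(conf, (int, float)):
        raise ValueError(f"{name}: 'confidence' must be a number")
    if not 0.0 <= conf <= 1.0:
        raise ValueError(f"{name}: 'confidence' must be in [0, 1]")
    return PassResult(name, PROMPT_VERSION, tuple(items), float(conf))


async def extract(arxiv_id: str, sections: dict, index: dict) -> dict:
    """Run all passes concurrently, validate each output, store by arXiv ID."""
    if arxiv_id in index:  # dedup: reuse the stored result instead of re-running
        return index[arxiv_id]
    raw = await asyncio.gather(*(fake_llm_pass(p, sections) for p in PASSES))
    results = {p: validate(p, r) for p, r in zip(PASSES, raw)}
    index[arxiv_id] = results
    return results
```

Calling `asyncio.run(extract("0000.00000", sections, {}))` with a placeholder ID and a dict of parsed sections returns one validated `PassResult` per pass. A second call with the same ID and the same index returns the stored result without running the passes again, which is one simple way to keep repeated runs cheap.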