Case Study
Research Radar is a working prototype. The goal is straightforward: ranked lists of MIR and audio ML papers where you can see why each result landed where it did. You can move from the main feed into a paper, check the evaluation against simpler sorts, and follow trends in the current dataset without losing the reasoning behind the order. The live app runs separately from this case study; the bridge feature is visible as a separate experimental view and is kept out of the main recommendation flow for now.
Research Radar centers on one workflow: inspect ranked recommendations, open a paper dossier, compare against baselines, and read trends in corpus context. The core value is making the ranking signals visible.
There are two main feeds: emerging papers and undercited papers. Each list is precomputed, and every card includes enough context to show why that paper ranked where it did.
Each paper page shows metadata plus a related-papers section, so someone can move from one useful paper to the next without starting over from search every time.
The trends page shows which topics are gaining traction inside the papers the ranker is working with. That gives useful context when the same topics keep recurring in the feeds.
The evaluation page compares the ranking against simpler baselines like citation count and recency so you can sanity-check behavior against scores everyone already understands.
Each ranked list stores the signal mix that produced it alongside the results. When the ranking code changes and something moves, you can check what actually shifted instead of guessing. It also gives anyone reviewing the work something concrete to push back on.
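One way to picture a run that keeps its signal mix next to its results is sketched below. This is a minimal illustration, not the prototype's actual schema: the `RankingRun` class, the helper functions, and the signal names (`citation_velocity`, `recency`, `topic_growth`) are all hypothetical.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class RankingRun:
    """One precomputed ranked list plus the signal weights that produced it.

    Hypothetical structure for illustration; field names are assumptions.
    """
    run_id: str
    signal_weights: dict[str, float]
    ranked_paper_ids: list[str]


def diff_signal_mix(old: RankingRun, new: RankingRun) -> dict[str, tuple[float, float]]:
    """Return {signal: (old_weight, new_weight)} for every weight that changed."""
    changes = {}
    for name in sorted(set(old.signal_weights) | set(new.signal_weights)):
        before = old.signal_weights.get(name, 0.0)
        after = new.signal_weights.get(name, 0.0)
        if before != after:
            changes[name] = (before, after)
    return changes


def moved_papers(old: RankingRun, new: RankingRun) -> dict[str, tuple[int, int]]:
    """Return {paper_id: (old_rank, new_rank)} for papers present in both runs
    whose 1-based position changed."""
    old_rank = {pid: i + 1 for i, pid in enumerate(old.ranked_paper_ids)}
    new_rank = {pid: i + 1 for i, pid in enumerate(new.ranked_paper_ids)}
    return {
        pid: (old_rank[pid], new_rank[pid])
        for pid in old_rank
        if pid in new_rank and old_rank[pid] != new_rank[pid]
    }


# Example with made-up weights and paper ids.
run_a = RankingRun("a", {"citation_velocity": 0.6, "recency": 0.4}, ["p1", "p2", "p3"])
run_b = RankingRun(
    "b",
    {"citation_velocity": 0.5, "recency": 0.4, "topic_growth": 0.1},
    ["p2", "p1", "p3"],
)
weight_changes = diff_signal_mix(run_a, run_b)
rank_changes = moved_papers(run_a, run_b)
```

With both pieces stored per run, a reviewer can line up "which weights changed" against "which papers moved" instead of inferring one from the other.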
This page is the case study, while the prototype runs as its own app. Under the hood there is a Next.js frontend, a FastAPI backend, Postgres with pgvector for storage and similarity search, and Python jobs for ingest, ranking, and clustering. Keeping those pieces separate makes it easier to update the ranking workflow without turning every change into a full-site deploy.
Bridge is the main example: the signal is measured and shown in the UI, but it is kept out of the final score until the feature is strong enough to earn that role. I'd rather show the work in progress than hide it.
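The "measured and shown, but not scored" arrangement can be sketched roughly as follows. Only the names `final_score` and bridge come from this case study; the weights, the other signal names, and the `card_payload` helper are assumptions made for illustration.

```python
from __future__ import annotations

# Hypothetical weights: only signals listed here contribute to final_score.
SCORING_WEIGHTS = {"citation_velocity": 0.6, "recency": 0.4}

# Signals that are computed and displayed but deliberately not scored.
DISPLAY_ONLY_SIGNALS = ("bridge",)


def final_score(signals: dict[str, float]) -> float:
    """Weighted sum over scoring signals only; display-only signals are ignored."""
    return sum(weight * signals.get(name, 0.0) for name, weight in SCORING_WEIGHTS.items())


def card_payload(paper_id: str, signals: dict[str, float]) -> dict:
    """Build what a card might show: the score, its inputs, and experimental signals."""
    return {
        "paper_id": paper_id,
        "final_score": final_score(signals),
        "scored_signals": {name: signals.get(name, 0.0) for name in SCORING_WEIGHTS},
        "experimental_signals": {
            name: signals[name] for name in DISPLAY_ONLY_SIGNALS if name in signals
        },
    }


# Changing the bridge value changes what is displayed, not the score.
with_bridge = card_payload("p1", {"citation_velocity": 0.5, "recency": 0.5, "bridge": 0.9})
without_bridge = card_payload("p1", {"citation_velocity": 0.5, "recency": 0.5})
```

Keeping the display-only list explicit makes the eventual promotion of a signal a visible, reviewable change rather than a silent one.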
Current corpus scope is intentionally narrow. The deployed corpus is curated around the currently wired bootstrap sources, and it should not be read as a comprehensive index of audio-ML literature.
Evaluation here is proxy-only: citation/date baselines, topic-mix and recency checks, plus distribution-level comparisons. There is not yet a human-labeled relevance benchmark.
Most paper tools give you a sorted list and no explanation of why anything landed where it did. I built this partly to fix that and partly to understand what it actually takes to build something like this end to end. The way I use it: open the emerging or undercited feed, read the signal note on each card before trusting the order, click into a paper if it looks worth following, then check the evaluation page to see whether the ranking makes sense against simpler sorts like citation count or date.

This is the main list. Each card shows a paper and a short breakdown of what pushed it up in the ranking for this snapshot. You don't have to take the order on faith.
What to look at
What you can take from this: The signal mix is visible enough to explain each result and compare runs when the code changes.
Boundary: The corpus is curated and narrower than the field. Objective best-paper claims are out of scope.

Click any card from the feed and you land here. You get the paper's full details, where it sat in the ranked list, and a set of similar papers so you can keep exploring without starting over.
What to look at
What you can take from this: You can move from a ranked result to a full paper view and keep moving through related work without losing your place.
Boundary: Similar papers are matched by content similarity. Treat them as navigation aids, not quality judgments.
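As a rough illustration of content-similarity matching, the sketch below ranks other papers by cosine similarity between embedding vectors. The case study does not say which distance measure the prototype uses, so cosine similarity is an assumption here, and `related_papers` and the toy two-dimensional vectors are hypothetical. In the deployed stack this lookup would presumably run inside Postgres with pgvector, not in application code.

```python
from __future__ import annotations

import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def related_papers(target_id: str, embeddings: dict[str, list[float]], limit: int = 5) -> list[str]:
    """Return up to `limit` other paper ids, most similar to the target first."""
    target = embeddings[target_id]
    scored = [
        (paper_id, cosine_similarity(target, vector))
        for paper_id, vector in embeddings.items()
        if paper_id != target_id
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [paper_id for paper_id, _ in scored[:limit]]


# Toy embeddings; real ones would have hundreds of dimensions.
toy_embeddings = {"p1": [1.0, 0.0], "p2": [0.9, 0.1], "p3": [0.0, 1.0]}
neighbors = related_papers("p1", toy_embeddings, limit=2)
```

Note that the score measures closeness of content only, which is why the results work as navigation aids and say nothing about quality.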

This is where I check whether the ranking is actually doing something useful. It puts the system's output next to the most obvious alternatives: sort by citation count, sort by date. If the ranking is adding value, you should be able to see it here.
What to look at
What you can take from this: A quick sanity check that the ranking is doing something beyond what a spreadsheet sort would give you.
Boundary: This compares against simple baselines. It is a useful starting point before a formal relevance benchmark exists.
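One simple proxy check of this kind is top-k overlap between the ranker's list and a citation-count sort. The sketch below uses Jaccard overlap as an illustrative metric; the case study does not name the exact comparisons the evaluation page computes, and the function names and sample records are hypothetical.

```python
from __future__ import annotations


def citation_baseline(papers: list[dict]) -> list[str]:
    """Paper ids sorted by citation count, highest first."""
    ordered = sorted(papers, key=lambda paper: paper["citations"], reverse=True)
    return [paper["id"] for paper in ordered]


def top_k_overlap(ranked_ids: list[str], baseline_ids: list[str], k: int) -> float:
    """Jaccard overlap between the top-k of two orderings (1.0 means identical sets)."""
    top_ranked = set(ranked_ids[:k])
    top_baseline = set(baseline_ids[:k])
    union = top_ranked | top_baseline
    if not union:
        return 0.0
    return len(top_ranked & top_baseline) / len(union)


# Made-up records and a made-up ranker ordering.
sample_papers = [
    {"id": "a", "citations": 10},
    {"id": "b", "citations": 8},
    {"id": "c", "citations": 2},
    {"id": "d", "citations": 1},
]
ranker_order = ["c", "a", "b", "d"]
baseline_order = citation_baseline(sample_papers)
overlap_at_2 = top_k_overlap(ranker_order, baseline_order, k=2)
```

An overlap near 1.0 would suggest the ranker is mostly reproducing the citation sort, while a very low value only shows that it differs, not that it is better. That gap is what a human-labeled benchmark would eventually need to close.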

The trends page shows which topics are picking up momentum inside the papers the ranker actually sees.
What to look at
What you can take from this: Context for why a topic might be heating up in the ranked lists right now.
Boundary: This reflects momentum inside the curated set only. It won't match broader field trends.
Route: /recommended?family=bridge
final_score does not use the bridge signal.
What you can verify without special access: repository history, linked tests, roadmap notes, and the screenshot baseline shown in the walkthrough above. Internal hosting details are intentionally omitted; the live prototype uses the stable radar.mmaitland.dev subdomain.
The interface is built in Next.js, the API is built in FastAPI, and the data lives in Postgres with pgvector. A separate Python pipeline handles ingest, cleanup, embeddings, ranking, and clustering experiments behind the scenes.
Building the explanation layer early turned out to matter more than I expected. Once each run stored why it ranked the way it did, comparing versions became straightforward instead of guesswork. Keeping bridge visible as a separate experimental view also helped: the core flow stayed focused, and I could work on the experimental side without it polluting the main results. The evaluation page earned its place too. I kept expecting a single summary score to be enough, and it never was. Seeing the full comparison grid against simple sorts was always more honest.
Status: Live prototype
The strongest stable claim today is that the prototype makes its ranking behavior visible and understandable over a curated set of MIR and audio ML papers.
Known limits: Bridge is an experimental view separate from the main recommender; semantic similarity only appears in runs where the UI labels it; the corpus is still narrower than the long-term plan.