Feed aggregator

2025 Pulitzer Prizes

Hacker News - Mon, 05/05/2025 - 6:00pm
Categories: Hacker News

You Can Find a Great $300 Phone as Long as You Make One Choice

CNET Feed - Mon, 05/05/2025 - 5:56pm
Commentary: Do you want a phone with fun features or one with longer software support?
Categories: CNET

Apple iPhone 16E vs. iPhone 15: Which Lower-Cost iPhone Is Best for You?

CNET Feed - Mon, 05/05/2025 - 5:53pm
Apple's iPhone 16E and the iPhone 15 are both capable devices that cost less than a $799 iPhone 16, but each comes with different compromises.
Categories: CNET

Guelta

Hacker News - Mon, 05/05/2025 - 5:51pm

Article URL: https://en.wikipedia.org/wiki/Guelta

Comments URL: https://news.ycombinator.com/item?id=43899828

Points: 3

# Comments: 0

Categories: Hacker News

Does Railway Simplify Things?

Hacker News - Mon, 05/05/2025 - 5:24pm

And I mean its approach to NON-Docker deployments in particular. I mean, is fighting their Docker abstraction really worth it? Isn't directly using Docker just easier?

Comments URL: https://news.ycombinator.com/item?id=43899600

Points: 1

# Comments: 0

Categories: Hacker News

Signal Clone Used by Mike Waltz Pauses Service After Reports It Got Hacked

Wired Security - Mon, 05/05/2025 - 5:24pm
The communications app TeleMessage, which was spotted on former US national security advisor Mike Waltz's phone, has suspended “all services” as it investigates reports of at least one breach.
Categories: Wired Security

History of the Centrifuge

Hacker News - Mon, 05/05/2025 - 5:16pm
Categories: Hacker News

q5.js v3.0 [video]

Hacker News - Mon, 05/05/2025 - 5:14pm
Categories: Hacker News

I built a neural classifier to replace Plaid's transaction categories

Hacker News - Mon, 05/05/2025 - 5:06pm

I recently shut down a startup I was building. It was a rewards platform for health-related spending. My users were scattered across the US, but mostly in SF, NYC, LA, Chicago, and Boston.

The core product relied on inferring whether a transaction was health-related or not. I quickly realized that adding rules and heuristics on top of Plaid's categories wouldn't work. Not to mention that Plaid's categorization was way too inaccurate to be deciding financial rewards on.

Here's an account of what I built to make it work, verified with a cleaned dataset of 6k data points collected from my platform.

First of all, Plaid's baseline categorization accuracy was low:
- Categorization accuracy was 65.22% overall
- Accuracy was better for well-known merchants (where Plaid identified an "Entity ID") at 83.99%
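For anyone wanting to reproduce this kind of baseline on their own labeled transactions, here is a minimal sketch of the arithmetic: accuracy overall, and accuracy restricted to rows where an entity ID is present. The row layout, the `accuracy` helper, and the sample rows are all assumptions made for illustration; they are not the author's data or code.

```python
from typing import Iterable, Optional, Tuple

# Assumed row layout: (predicted_category, true_category, entity_id or None).
Row = Tuple[str, str, Optional[str]]


def accuracy(rows: Iterable[Row]) -> float:
    """Fraction of rows whose predicted category equals the true category."""
    rows = list(rows)
    if not rows:
        return 0.0
    correct = sum(1 for predicted, actual, _ in rows if predicted == actual)
    return correct / len(rows)


# Made-up rows for illustration only.
sample = [
    ("pharmacy", "pharmacy", "ent_1"),
    ("groceries", "pharmacy", None),
    ("gym", "gym", "ent_2"),
    ("shopping", "gym", None),
]

overall = accuracy(sample)
# Same metric, restricted to merchants that came with an entity ID.
with_entity = accuracy(row for row in sample if row[2] is not None)
```

Splitting the metric this way shows how much of the error comes from merchants the upstream provider could not identify.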

I tried RAG to start, but that immediately fell apart due to name collisions and regional duplication.

Thankfully I was able to start with Plaid's already cleaned transaction data. To better resolve entities, my pipeline took in:
- Transaction amount (for product band heuristics)
- Location
- POS method (in-person vs. online)
- A list of known bank-specific formatting quirks that I collected as I tried to build this pipeline (for now limited to the Big Banks ™)

Using that data I could much better figure out:
- Which entity the purchase was made from among entities with duplicate names (mostly SMBs)
- How to collapse regional identifiers into a single parent organization
  - Side note: did you know that Orangetheory has a different regional identifier for every location? For example: "Orangetheory", "OTF", "otf", "otf {city}", "orangetheory {city}" are all possible names. This one took so long to solve robustly
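To make the regional-identifier problem concrete, here is a rough sketch of collapsing name variants into a parent organization. It is not the pipeline described above: the `PARENT_ALIASES` table and the `canonical_merchant` function are hypothetical, and a simple prefix match like this would need the location, amount, and bank-format signals listed earlier to be robust.

```python
import re
from typing import Optional

# Hypothetical alias table: parent organization -> lowercase name prefixes
# seen in transaction descriptions. Entries are illustrative only.
PARENT_ALIASES = {
    "Orangetheory": ["orangetheory", "otf"],
}


def canonical_merchant(raw_name: str) -> Optional[str]:
    """Return the parent organization for a raw merchant string, or None."""
    # Lowercase, replace punctuation with spaces, and squeeze whitespace.
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", raw_name.lower())
    cleaned = re.sub(r"\s+", " ", cleaned).strip()
    for parent, aliases in PARENT_ALIASES.items():
        for alias in aliases:
            # The alias must be the whole name or a whole leading token,
            # optionally followed by something like a city name.
            if cleaned == alias or cleaned.startswith(alias + " "):
                return parent
    return None
```

With this table, `canonical_merchant("OTF Austin")` and `canonical_merchant("orangetheory")` both resolve to `"Orangetheory"`, while a lookalike such as `"otfit gym"` returns `None` because the alias has to match a whole leading token.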

Also this way I could provide a custom category to look for. In my case it was "health-related" or not, which I defined with the FSA/HSA eligibility rules (in JSON format), plus some other properties like fitness/studio class merchants and supplements.
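As an illustration of what a JSON-defined custom category could look like, here is a small sketch. The rule schema, the field names, and the `classify` function are assumptions for the example and are not the actual FSA/HSA rule file; the marketplace branch mirrors the "needs more info" tag used for cases like Amazon.

```python
import json

# Hypothetical rule set; field names and entries are illustrative only.
RULES_JSON = """
{
  "category": "health-related",
  "eligible_merchant_types": ["pharmacy", "dentist", "fitness_studio", "supplements"],
  "marketplaces": ["amazon"]
}
"""


def classify(merchant: str, merchant_type: str, rules: dict) -> str:
    """Label one resolved transaction against a custom category rule set."""
    # Marketplaces sell both eligible and ineligible items, so defer.
    if merchant.lower() in rules["marketplaces"]:
        return "needs more info"
    if merchant_type in rules["eligible_merchant_types"]:
        return rules["category"]
    return "not " + rules["category"]


rules = json.loads(RULES_JSON)
```

Keeping the category definition in data rather than code means a different category could be added by writing another rule file instead of changing the classifier.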

The results:
- 87.28% accuracy on classifying "health-related" spend (with a "needs more info" tag for marketplace cases like Amazon)
- 95.78% accuracy on personal finance category classification, with only 300 known entities logged in my database. So this can definitely improve with more effort put into expanding the known entities list

I made this writeup mostly for catharsis after shutting down my startup, and to warn of potential things to look out for when trying to properly utilize transaction data.

But I really do believe that this kind of infra, semantic understanding of financial data, is becoming increasingly valuable as financial data becomes more available, and that new businesses can be built with it. I am considering expanding this infra into a developer API or toolkit. So if you're working on financial rewards, personal finance apps, FSA/HSA/expense platforms, accounting tools, etc., I'd love to hear from you!

Comments URL: https://news.ycombinator.com/item?id=43899459

Points: 1

# Comments: 0

Categories: Hacker News
