Feed aggregator
Programmers I Know
Article URL: https://endler.dev/2025/best-programmers/
Comments URL: https://news.ycombinator.com/item?id=43620504
Points: 1
# Comments: 0
Show HN: I made an app to save you money on unused subscriptions
I lost $90 in one month on unused subscriptions.
So I’m building a tool to stop that. It tracks subscriptions and warns me before I get charged again.
MVP in progress. Who’s in?
Comments URL: https://news.ycombinator.com/item?id=43620501
Points: 1
# Comments: 0
Let's Fix OAuth in MCP
Article URL: https://aaronparecki.com/2025/04/03/15/oauth-for-model-context-protocol
Comments URL: https://news.ycombinator.com/item?id=43620496
Points: 1
# Comments: 0
TCP/IP over Amazon Cloudwatch Logs (2019)
Article URL: https://medium.com/clog/tcp-ip-over-amazon-cloudwatch-logs-c1cf08f2296c
Comments URL: https://news.ycombinator.com/item?id=43620493
Points: 1
# Comments: 0
Google fixes two actively exploited zero-day vulnerabilities in Android
Google has patched 62 vulnerabilities in Android, including two actively exploited zero-days in its April 2025 Android Security Bulletin.
When we say “zero-day” we mean an exploitable software vulnerability for which there was no patch at the time of the vulnerability being exploited or published. The term reflects the amount of time that a vulnerable organization has to protect against the threat by patching—zero days.
The April updates are available for Android 13, 14, and 15. Android vendors are notified of all issues at least a month before publication. However, this doesn’t always mean that the patches are available for all devices immediately.
You can find your device’s Android version number, security update level, and Google Play system level in your Settings app. You’ll get notifications when updates are available for you, but you can also check for them yourself.
For most phones it works like this: Under About phone or About device you can tap on Software updates to check if there are new updates available for your device, although there may be slight differences based on the brand, type, and Android version of your device.
If your Android phone shows patch level 2025-04-05 or later then you can consider the issues as fixed. The difference with patch level 2025-04-01 is that the higher level provides all the fixes from the first batch and security patches for closed-source third-party and kernel subcomponents, which may not necessarily apply to all Android devices.
Keeping your device as up to date as possible protects you from known vulnerabilities and helps you to stay safe.
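The patch level check described above is a plain date comparison. Here is a minimal sketch, assuming the patch level string is copied by hand from the Settings app in the YYYY-MM-DD form shown there; the function name is hypothetical and not part of any Android API.

```python
from datetime import date

# Patch level the article names as containing all April 2025 fixes,
# including the kernel subcomponent patches.
FULL_FIX_LEVEL = date(2025, 4, 5)


def is_fully_patched(patch_level: str) -> bool:
    """Return True if an ISO-formatted patch level is 2025-04-05 or later."""
    return date.fromisoformat(patch_level) >= FULL_FIX_LEVEL


if __name__ == "__main__":
    for level in ("2025-04-01", "2025-04-05"):
        print(level, is_fully_patched(level))
```

A device reporting 2025-04-01 has the first batch of fixes but not the kernel subcomponent patches, which is why the sketch compares against 2025-04-05.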
Technical details
The zero-days are both located in the kernel:
CVE-2024-53150: an out-of-bounds flaw in the USB sub-component of the Linux Kernel that could result in information disclosure. Local attackers can exploit this flaw to access sensitive information on vulnerable devices without user interaction.
The out of bounds vulnerability was caused by the USB-audio driver code which failed to check the length of each descriptor before passing it on. There are currently no details on how CVE-2024-53150 has been exploited in real-world attacks, by whom, and who may have been targeted in those attacks.
CVE-2024-53197: a privilege escalation flaw in the USB audio sub-component of the Linux Kernel. Again, no user interaction is required.
This vulnerability is the missing link to CVE-2024-50302 and CVE-2024-53104, which, chained together, were reportedly exploited in Serbia by law enforcement using Cellebrite forensic tools to unlock a student activist’s device and attempt to install spyware.
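To illustrate the kind of length check that was missing in the CVE-2024-53150 case, here is a small sketch of walking a buffer of USB-style descriptors, where each descriptor starts with a length byte followed by a type byte. This is not the kernel code or the actual fix, which are written in C; the function names are hypothetical. Python would raise an exception on an out-of-range read rather than leak memory, so the sketch only shows where the checks belong.

```python
def iter_descriptors(buf: bytes):
    """Yield (descriptor_type, descriptor_bytes) for each well-formed descriptor.

    Every descriptor begins with a length byte (its total size, header
    included) followed by a type byte.
    """
    offset = 0
    while offset < len(buf):
        remaining = len(buf) - offset
        if remaining < 2:
            raise ValueError("truncated descriptor header")
        length = buf[offset]
        # The length must cover the 2-byte header and stay inside the buffer.
        if length < 2 or length > remaining:
            raise ValueError(f"bad descriptor length {length} at offset {offset}")
        yield buf[offset + 1], buf[offset:offset + length]
        offset += length


def read_field(desc: bytes, index: int, min_length: int) -> int:
    """Read one byte of a descriptor after checking it is as long as expected."""
    if len(desc) < min_length:
        raise ValueError("descriptor shorter than the layout being read")
    return desc[index]
```

The second helper reflects the point made above: a descriptor can be well-formed as a generic record yet still be shorter than the specific layout the driver expects, so the length has to be checked before any field is read.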
Every programming language needs its killer app to succeed
Article URL: https://www.grilly.com/posts/programming-languages-reason-to-exist/
Comments URL: https://news.ycombinator.com/item?id=43620480
Points: 2
# Comments: 1
First medical X-ray taken in space
Article URL: https://news.mit.edu/2025/3-questions-lonnie-petersen-first-medical-x-ray-taken-in-space-0407
Comments URL: https://news.ycombinator.com/item?id=43620479
Points: 1
# Comments: 0
Comparing GenAI Inference Engines: TensorRT-LLM, vLLM, HF TGI, and LMDeploy
Hey everyone, I’ve been diving into the world of generative AI inference engines for quite some time at NLP Cloud, and I wanted to share some insights from a comparison I put together. I looked at four popular options—NVIDIA’s TensorRT-LLM, vLLM, Hugging Face’s Text Generation Inference (TGI), and LMDeploy—and ran some benchmarks to see how they stack up for real-world use cases. Thought this might spark some discussion here since I know a lot of you are working with LLMs or optimizing inference pipelines:
TensorRT-LLM
------------
NVIDIA’s beast for GPU-accelerated inference. Built on TensorRT, it optimizes models with layer fusion, precision tuning (FP16, INT8, even FP8), and custom CUDA kernels.
Pros: Blazing fast on NVIDIA GPUs—think sub-50ms latency for single requests on an A100 and ~700 tokens/sec at 100 concurrent users for LLaMA-3 70B Q4 (per BentoML benchmarks). Dynamic batching and tight integration with Triton Inference Server make it a throughput monster.
Cons: Setup can be complex if you’re not already in the NVIDIA ecosystem. You need to deal with model compilation, and it’s not super flexible for quick prototyping.
vLLM
----
Open-source champion for high-throughput inference. Uses PagedAttention to manage KV caches in chunks, cutting memory waste and boosting speed.
Pros: Easy to spin up (pip install, Python-friendly), and it’s flexible—runs on NVIDIA, AMD, even CPU. Throughput is solid (~600-650 tokens/sec at 100 users for LLaMA-3 70B Q4), and dynamic batching keeps it humming. Latency’s decent at 60-80ms solo.
Cons: It’s less optimized for single-request latency, so if you’re building a chatbot with one user at a time, it might not shine as much. Also, it’s still maturing—some edge cases (like exotic model architectures) might not be supported.
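For anyone unfamiliar with the idea behind PagedAttention, here's a toy sketch of the bookkeeping: KV cache memory is split into fixed-size blocks, and each sequence gets a block table that grows one block at a time instead of reserving space for the maximum length up front. This is not vLLM's code or API; the class and method names are made up for illustration, and it only tracks slots, not actual tensors.

```python
class PagedKVCache:
    """Toy block allocator for per-sequence KV cache slots."""

    def __init__(self, num_blocks: int, block_size: int):
        self.block_size = block_size
        self.free_blocks = list(range(num_blocks))
        self.block_tables: dict[str, list[int]] = {}  # seq id -> block ids
        self.lengths: dict[str, int] = {}             # seq id -> tokens stored

    def append_token(self, seq_id: str) -> tuple[int, int]:
        """Reserve a slot for one more token; return (block_id, offset)."""
        table = self.block_tables.setdefault(seq_id, [])
        length = self.lengths.get(seq_id, 0)
        # A new block is needed for the first token or when the last block is full.
        if length % self.block_size == 0:
            if not self.free_blocks:
                raise MemoryError("no free KV blocks")
            table.append(self.free_blocks.pop())
        self.lengths[seq_id] = length + 1
        return table[-1], length % self.block_size

    def free(self, seq_id: str) -> None:
        """Return all blocks of a finished sequence to the free pool."""
        self.free_blocks.extend(self.block_tables.pop(seq_id, []))
        self.lengths.pop(seq_id, None)
```

With this layout, the most a sequence can waste is the unused tail of its last block, which is where the memory savings come from.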
Hugging Face TGI
----------------
Hugging Face’s production-ready inference tool. Ties into their model hub (BERT, GPT, etc.) and uses Rust for speed, with continuous batching to keep GPUs busy.
Pros: Docker setup is quick, and it scales well. Latency’s 50-70ms, throughput matches vLLM (~600-650 tokens/sec at 100 users). Bonus: built-in output filtering for safety. Perfect if you’re already in the HF ecosystem.
Cons: Less raw speed than TensorRT-LLM, and memory can bloat with big batches. Feels a bit restrictive outside HF’s world.
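Since continuous batching comes up a lot, here's a toy simulation of the idea: new requests join the running batch as soon as a slot frees up, rather than waiting for the whole batch to finish. This is not TGI's scheduler; the function name and the request format (id mapped to a positive number of tokens to generate) are assumptions for illustration.

```python
from collections import deque


def continuous_batching(requests: dict[str, int], max_batch: int) -> list[list[str]]:
    """Simulate decode steps and return the request ids batched at each step."""
    waiting = deque(requests.items())
    running: dict[str, int] = {}  # request id -> tokens still to generate
    steps: list[list[str]] = []
    while waiting or running:
        # Refill free slots before every step, not only when the batch drains.
        while waiting and len(running) < max_batch:
            req_id, remaining = waiting.popleft()
            running[req_id] = remaining
        steps.append(list(running))
        # Each running request produces one token in this step.
        for req_id in list(running):
            running[req_id] -= 1
            if running[req_id] <= 0:
                del running[req_id]
    return steps


if __name__ == "__main__":
    print(continuous_batching({"a": 1, "b": 3, "c": 2}, max_batch=2))
```

In the example, request "c" takes over the slot of "a" after the first step, so all three requests finish in three steps; a scheduler that waited for "a" and "b" to both finish before admitting "c" would need five.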
LMDeploy
--------
A toolkit from the MMRazor/MMDeploy crew, focused on fast, efficient LLM deployment. Features TurboMind (a high-performance engine) and a PyTorch fallback, with persistent batching and blocked KV caching for speed.
Pros: Decoding speed is nuts—up to 1.8x more requests/sec than vLLM on an A100. TurboMind pushes 4-bit inference 2.4x faster than FP16, hitting ~700 tokens/sec at 100 users (LLaMA-3 70B Q4). Low latency (40-60ms), easy one-command server setup, and it even handles multi-round chats efficiently by caching history.
Cons: TurboMind’s picky—doesn’t support sliding window attention (e.g., Mistral) yet. Non-NVIDIA users get stuck with the slower PyTorch engine. Still, on NVIDIA GPUs, it’s a performance beast.
What’s your experience with these tools? Any hidden issues I missed? Or are there other inference engines that should be mentioned? Would love to hear your thoughts!
Julien
Comments URL: https://news.ycombinator.com/item?id=43620472
Points: 1
# Comments: 1
Show HN: Badgeify – Add Any App to Your Mac Menu Bar
Article URL: https://badgeify.app/
Comments URL: https://news.ycombinator.com/item?id=43620471
Points: 1
# Comments: 0
Apple Plans to Source More iPhones from India as Potential Tariff Fix
Article URL: https://www.wsj.com/tech/apple-iphone-production-china-tariffs-6cc37f40
Comments URL: https://news.ycombinator.com/item?id=43620458
Points: 1
# Comments: 0
Tuesday Telescope: Does this Milky Way image remind you of Powers of 10?
Article URL: https://arstechnica.com/space/2025/04/tuesday-telescope-the-heart-of-the-galaxy-revealed-in-two-kinds-of-light/
Comments URL: https://news.ycombinator.com/item?id=43620453
Points: 1
# Comments: 0
Meta got caught gaming AI benchmarks
Article URL: https://www.theverge.com/meta/645012/meta-llama-4-maverick-benchmarks-gaming
Comments URL: https://news.ycombinator.com/item?id=43620452
Points: 2
# Comments: 0
Navy SEAL. Harvard Doctor. NASA Astronaut. Don't Tell Mom About This Overachiever
Article URL: https://www.wsj.com/lifestyle/jonny-kim-nasa-astronaut-navy-seal-harvard-doctor-nasa-astronaut-7ad0e523
Comments URL: https://news.ycombinator.com/item?id=43620444
Points: 1
# Comments: 1
Plebiscitary Override in Venezuela: Eroding Democracy Deepening Authoritarianism
Article URL: https://journals.sagepub.com/doi/10.1177/00027162241309709
Comments URL: https://news.ycombinator.com/item?id=43620441
Points: 1
# Comments: 0
Attack of the Quack-Industrial Complex – Paul Krugman
Article URL: https://paulkrugman.substack.com/p/attack-of-the-quack-industrial-complex
Comments URL: https://news.ycombinator.com/item?id=43620437
Points: 1
# Comments: 0
Bug crowd for small startups and vibe coders?
Article URL: https://picklock.47labs.io/
Comments URL: https://news.ycombinator.com/item?id=43620434
Points: 1
# Comments: 1
Why the Ultrarich Are Unplugging from "Smart Homes"
Article URL: https://www.hollywoodreporter.com/lifestyle/real-estate/tech-free-homes-luxury-trend-1236177909/
Comments URL: https://news.ycombinator.com/item?id=43620421
Points: 1
# Comments: 1
FreeDOS 1.4 Released
Article URL: https://freedos.org/download/announce.html
Comments URL: https://news.ycombinator.com/item?id=43620415
Points: 1
# Comments: 0
What if we taxed advertising?
Article URL: https://matthewsinclair.com/blog/0177-what-if-we-taxed-advertising
Comments URL: https://news.ycombinator.com/item?id=43620407
Points: 1
# Comments: 1