Hacker News

Show HN: Software Engineering Handbook – How to Survive Tech

Hacker News - Sun, 04/27/2025 - 11:32am

Hey HN,

We wrote the Software Engineering Handbook because most engineering advice focuses on technical skills. Ours is about surviving everything else.

It covers:

- Surviving bad managers, layoffs, toxic teams

- Navigating promotions, negotiations, career changes

- Handling burnout, immigration, parenting, even death

- Setting boundaries, managing time, staying sane

- Building a life while growing in your tech career.

No tutorials. No buzzwords. Just real-world survival and growth, written by engineers who’ve been through Amazon, Meta, Workday, TripAdvisor, and startups.

Comments URL: https://news.ycombinator.com/item?id=43812616

Points: 6

# Comments: 0

Categories: Hacker News

Show HN: SemHash – Semantic Text Deduplication, Outlier Filtering and Sampling

Hacker News - Sun, 04/27/2025 - 11:29am

We’ve just released SemHash v0.3.0, a major rework of our open-source text pre-processing library. We’ve added two new features: outlier filtering and representative sampling. The core API has been reworked so that all of these features can be used together in an intuitive way. The new features reuse the approximate nearest neighbors index that we already use for semantic deduplication, so they run very quickly once the index has been built on your dataset. The core package can now be used for:

- Semantic Deduplication: Remove semantic duplicates from your dataset. This can prevent train/test set overlap in classification tasks, or prevent duplicate samples in RAG/semantic search.

- Outlier Filtering: Surface and filter the most anomalous samples from your dataset. This can help with automated removal of low quality data, or data that should not be in your dataset.

- Representative Sampling: Select the most central and diverse examples using Maximal Marginal Relevance. This can help you quickly explore and understand a dataset, or even build a small, diverse, high quality dataset, for example for LLM finetuning.
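
As a rough illustration of the Maximal Marginal Relevance idea mentioned above, here is a minimal pure-Python sketch. It is not SemHash's implementation or API: the function names (`cosine`, `mmr_select`), the use of the dataset centroid as the "relevance" target, and the `lambda_` default are all assumptions made for the example.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(embeddings, k, lambda_=0.5):
    """Return the indices of up to k items picked by Maximal Marginal Relevance.

    Relevance is taken here as similarity to the dataset centroid (an assumption
    for this sketch); redundancy is the highest similarity to any item already picked.
    """
    n = len(embeddings)
    dim = len(embeddings[0])
    centroid = [sum(e[d] for e in embeddings) / n for d in range(dim)]
    relevance = [cosine(e, centroid) for e in embeddings]

    selected = []
    candidates = list(range(n))
    while candidates and len(selected) < k:
        best, best_score = None, -math.inf
        for i in candidates:
            redundancy = max(
                (cosine(embeddings[i], embeddings[j]) for j in selected),
                default=0.0,
            )
            score = lambda_ * relevance[i] - (1 - lambda_) * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy 2-D "embeddings": items 0 and 1 are near-duplicates, item 2 is different.
toy = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]
picked = mmr_select(toy, k=2)
```

With `lambda_=1.0` the redundancy term vanishes and the selection reduces to "closest to the centroid"; lower values trade centrality for diversity.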

We’ve designed these features in the same way as our semantic deduplication: CPU friendly, lightweight, and explainable.

We hope these features help you create cleaner datasets, or simply understand your data better. We’re curious to hear your feedback, and whether there are any other features you think would improve SemHash further!

Comments URL: https://news.ycombinator.com/item?id=43812594

Points: 2

# Comments: 0

Categories: Hacker News

DeepWiki turns 30k+ GitHub repos to tech documentations

Hacker News - Sun, 04/27/2025 - 11:25am

Article URL: https://deepwiki.com/

Comments URL: https://news.ycombinator.com/item?id=43812575

Points: 1

# Comments: 0

Categories: Hacker News

Show HN: A crowd-data collection Project about Indoor Air Quality (CO2)

Hacker News - Sun, 04/27/2025 - 11:25am

I built IndoorCo2map.com and the accompanying (open source, MIT licensed) app with the goal of creating a large, free dataset of CO2 levels in publicly accessible buildings and public transport, measured with portable CO2 monitors.

We spend about 90% of our lives indoors, yet there is barely any data about air quality in enclosed spaces. In most locations CO2 is a very good proxy for the amount of exhaled air in a room, which correlates with infection risk. High CO2 levels can also increase the aerostability of viruses, so CO2 has a direct effect as well. Aside from infection risk, high CO2 levels can also reduce cognitive ability (during exposure, not permanently) and cause dizziness, headaches, and similar symptoms.

Most existing studies are small and focus on homes, hospitals or schools. During the beta test the community already took more than 10,000 measurements (each between 5 and 120 minutes long), which to my knowledge already makes it the largest dataset of its kind. Currently most users are from German-speaking regions, which is a result of me being German and my social graph being mostly German.

The app does not require any user registration and works with most of the common portable CO2 monitors (Aranet4, Airvalent, Inkbird-IAM-T1, Airspot Health). For some of them I had to reverse engineer the Bluetooth messages and, to make things worse, the Airvalent’s data isn’t byte aligned.
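
To give an idea of what parsing non-byte-aligned data involves, here is a minimal sketch of reading a bit field that straddles byte boundaries. The helper `read_bits`, the MSB-first bit order, and the sample payload are all hypothetical; this is not the actual Airvalent message layout, which is not described here.

```python
def read_bits(data: bytes, bit_offset: int, bit_length: int) -> int:
    """Read an unsigned bit field starting bit_offset bits into data.

    Bits are numbered MSB-first (bit 0 is the top bit of data[0]); that
    ordering is an assumption of this sketch, and a real device may differ.
    """
    total_bits = len(data) * 8
    if bit_offset < 0 or bit_length < 0 or bit_offset + bit_length > total_bits:
        raise ValueError("bit field lies outside the buffer")
    as_int = int.from_bytes(data, "big")
    # Shift the wanted field down to the low bits, then mask off the rest.
    shift = total_bits - bit_offset - bit_length
    return (as_int >> shift) & ((1 << bit_length) - 1)


# Hypothetical 2-byte payload: 1011 0011 0101 1100
payload = bytes([0b10110011, 0b01011100])
header = read_bits(payload, 0, 4)    # first 4 bits
reading = read_bits(payload, 4, 6)   # 6 bits crossing the byte boundary
```

Converting the whole buffer to one integer keeps the sketch short; for long messages, slicing only the bytes that overlap the field would avoid building a large integer on every read.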

The app is built using C# MAUI and cross-compiles to both Android and Apple devices. I use it because C# is the language I am most comfortable in, but also because it can be deployed to local iPhones without having to own a Mac. The backend uses a queue server, serverless functions (also C#) and a PostgreSQL database, all hosted on AWS. The website uses maplibre, deck.gl and chart.js. I have no clue about websites, so I just tried to keep things simple. Expanding to other indoor air quality indicators like PM2.5 would be trivial, but currently the number of people with mobile sensors is too small to be worth the effort.

Comments URL: https://news.ycombinator.com/item?id=43812574

Points: 1

# Comments: 0

Categories: Hacker News

Self-hosted apps are awesome, but licensing them is a mess

Hacker News - Sun, 04/27/2025 - 11:24am

Article URL: https://kagehq.com/

Comments URL: https://news.ycombinator.com/item?id=43812566

Points: 1

# Comments: 1

Categories: Hacker News

No Honor Among Mutuals

Hacker News - Sun, 04/27/2025 - 11:16am

Categories: Hacker News

Show HN: Logchef – Schema-agnostic log viewer for ClickHouse

Hacker News - Sun, 04/27/2025 - 11:15am

Hey HN! I’m Karan, creator of Logchef (https://logchef.app), an open-source log viewer built specifically for exploring logs stored in ClickHouse.

This tool grew directly out of my $day job managing massive log volumes. Like many orgs, we migrated our log workloads to ClickHouse for its performance, but found the ecosystem lacked dedicated UI tooling for actually browsing and analyzing those logs effectively.

We were using Metabase, and while great for general BI, it wasn't designed for log exploration workflows. Common pain points included:

- Clunky Ad-hoc Querying: Writing/modifying raw ClickHouse SQL for quick checks was slow and error-prone, especially during incidents.

- Disconnect Between Viz & Raw Logs: Visualizing trends (like error counts) then drilling down to the specific raw logs often required separate, complex queries and wrestling with row limits. The intuitive "slice-and-dice" was missing.

- UI Friction: Simple things like selecting precise time ranges ("last 90 minutes"), easily viewing surrounding log context, or dealing with truncated columns added unnecessary friction.

Debugging sessions were taking longer than they should. So, over the last 3-4 months, I built Logchef to scratch this itch.

Logchef's Core Ideas:

- Purpose-Built for ClickHouse Logs: Designed from the ground up for the specific task of log exploration on top of ClickHouse, focusing on speed and intuitive workflows.

- Schema-Agnostic: Logchef doesn't force OTEL or any other schema. Connect it directly to your existing ClickHouse log tables (it just needs a timestamp column). Bring your own schema!

- Focus on Viewing/Querying: Logchef intentionally doesn't handle log collection/ingestion. It complements great tools like Vector, Fluentbit, etc., by focusing purely on the exploration layer once data is in ClickHouse.

- Simple Search Syntax: Includes a simple query syntax (e.g., `status=200 and path~"/api/"`) that translates to efficient ClickHouse SQL behind the scenes, integrated with the Monaco editor.
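
To make the search-syntax idea concrete, here is a small sketch of how a query like the example above could be turned into a SQL `WHERE` clause. It is written in Python purely for illustration and is not Logchef's parser or its generated SQL: the function `to_where_clause`, the reading of `~` as "contains substring", and the choice of ClickHouse's `position()` for that are assumptions.

```python
import re

# One condition: field, operator (= or ~), then a quoted or bare value.
_CONDITION = re.compile(
    r'\s*(?P<field>\w+)\s*(?P<op>=|~)\s*(?P<value>"[^"]*"|[^\s"]+)\s*'
)
_AND = re.compile(r"and\s+", re.IGNORECASE)
_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def _literal(raw: str, force_string: bool = False) -> str:
    """Render a value as a SQL literal: bare numbers stay numeric, the rest is quoted."""
    if raw.startswith('"'):
        text = raw[1:-1]
    elif _NUMBER.fullmatch(raw) and not force_string:
        return raw
    else:
        text = raw
    return "'" + text.replace("\\", "\\\\").replace("'", "\\'") + "'"


def to_where_clause(query: str) -> str:
    """Translate `a=1 and b~"x"` into a WHERE-clause body (sketch only)."""
    clauses = []
    pos = 0
    while True:
        cond = _CONDITION.match(query, pos)
        if cond is None:
            raise ValueError(f"expected a condition at position {pos}")
        field, op, raw = cond.group("field", "op", "value")
        if op == "=":
            clauses.append(f"{field} = {_literal(raw)}")
        else:
            # Assumption: `~` means "contains substring".
            clauses.append(f"position({field}, {_literal(raw, force_string=True)}) > 0")
        pos = cond.end()
        if pos == len(query):
            break
        joiner = _AND.match(query, pos)
        if joiner is None:
            raise ValueError(f"expected 'and' at position {pos}")
        pos = joiner.end()
    return " AND ".join(clauses)


where = to_where_clause('status=200 and path~"/api/"')
# where == "status = 200 AND position(path, '/api/') > 0"
```

Restricting field names to `\w+` and escaping string values keeps user input from being spliced into the SQL verbatim; a production translator would more likely use parameterized queries and a proper grammar.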

Tech Stack: Go backend, SQLite for metadata, Vue.js + shadcn/ui + Tailwind CSS frontend.

You can try a live public demo here: https://demo.logchef.app (It's pre-populated with sample data using Vector, so you can dive right in. Uses Dex for OIDC auth - creds are on the login page).

What's Next & Getting Involved:

Logchef is already used internally at Zerodha, and I'm driving towards v1.0 this year. The roadmap includes features like Alerting, Live Tail Logs, and Enhanced Dashboarding. It's open source (AGPLv3), and I'd love to get more eyes on it and build a community.

Check out the repo: https://github.com/mr-karan/logchef

I’d love to hear your feedback, whether positive or negative. Please open issues on GitHub with suggestions or bug reports!

Thanks so much, HN!

Comments URL: https://news.ycombinator.com/item?id=43812500

Points: 1

# Comments: 0

Categories: Hacker News

IBM PC Code Page 437 to Unicode Mapping Table

Hacker News - Sun, 04/27/2025 - 10:41am

Article URL: https://mw.rat.bz/cp437map/

Comments URL: https://news.ycombinator.com/item?id=43812291

Points: 1

# Comments: 0

Categories: Hacker News

Show HN: I Made A parody website for the AI addicted (Brain AI)

Hacker News - Sun, 04/27/2025 - 10:37am

Yes, there's no reason for this to exist, it is just funny.

Comments URL: https://news.ycombinator.com/item?id=43812260

Points: 2

# Comments: 0

Categories: Hacker News
