📖

Engineering Guides

Practical, step-by-step guides for everyday engineering challenges. Templates, checklists, and real examples you can use today.

🔀 How to Write a Perfect Pull Request

📚 5 min read · Git & Collaboration · For all levels

A great PR description saves reviewers 30+ minutes per review. It's the difference between "LGTM" rubber-stamps and meaningful code review that catches bugs and improves architecture.

1. Keep it small and focused. One PR = one concern. If you're adding a feature AND refactoring the auth module, open two PRs. PRs under 400 lines get 2.5× more review comments than large ones because reviewers actually read them.
2. Write a clear description. Answer: What does this change? Why is it needed? How was it tested? What should reviewers focus on? Link the related issue/ticket.
3. Add screenshots/videos for UI changes. Before and after screenshots remove ambiguity instantly. Use a Loom recording for complex interactions.
4. Self-review before requesting review. Read your own diff end-to-end. You'll catch 30% of issues before bothering your teammates.
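The 400-line guideline in step 1 is easy to check before you open a PR. The sketch below sums added and deleted lines from `git diff --numstat` output. The function names, the sample file paths, and treating 400 as a hard limit are assumptions for illustration, not part of any standard tooling.

```javascript
// Sum the changed lines reported by `git diff --numstat`.
// Each line looks like "<added>\t<deleted>\t<path>"; binary files report "-" for both counts.
function countChangedLines(numstatOutput) {
  let total = 0;
  for (const line of numstatOutput.split("\n")) {
    if (line.trim() === "") continue;
    const [added, deleted] = line.split("\t");
    // "-" parses to NaN, so binary files count as 0 changed lines.
    total += (Number.parseInt(added, 10) || 0) + (Number.parseInt(deleted, 10) || 0);
  }
  return total;
}

// Hypothetical helper: the default limit mirrors the 400-line guideline above.
function isReviewable(numstatOutput, maxLines = 400) {
  return countChangedLines(numstatOutput) <= maxLines;
}

// Example input, shaped like the output of: git diff --numstat main...HEAD
const sample = [
  "120\t30\tsrc/aiCategorizer.js",
  "45\t0\ttest/aiCategorizer.test.js",
  "-\t-\tdocs/screenshot.png",
].join("\n");

console.log(countChangedLines(sample)); // 195
console.log(isReviewable(sample)); // true
```

A check like this fits naturally in a pre-push hook or CI step, where it can warn instead of failing so that unavoidable large changes still go through.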
## PR Template: add to .github/PULL_REQUEST_TEMPLATE.md

## What does this PR do?
Adds AI-powered category suggestion to the expense form.
When the user types in the Note field, Groq API suggests a category.

## Why is this needed?
Closes #147. Users spend too much time manually selecting categories.
AI automates this for 80% of common expenses.

## How was it tested?
- [ ] Unit tests for `aiCategorizer.js` (mock Groq API)
- [ ] Manual testing: "Bought coffee" → suggests "Food & Drinks"
- [ ] Edge case: API timeout falls back gracefully (no suggestion shown)

## Screenshots
| Before | After |
|--------|-------|
| [img]  | [img] |

## Checklist
- [ ] Tests pass locally (`npm test`)
- [ ] No console.log left in code
- [ ] ENV var documented in README

๐Ÿ‘๏ธ How to Do Effective Code Review

📚 4 min read · Engineering Culture · For seniors

Code review is a skill. Done well, it catches bugs, spreads knowledge, and improves team velocity. Done poorly, it creates gatekeeping, delays, and resentment.

✅ Effective Reviews
  • Ask questions, don't demand
  • Explain WHY, not just WHAT
  • Distinguish: nit / suggestion / must-fix
  • Approve with minor suggestions
  • Review within 24 hours
  • Acknowledge what's done well
โŒ Toxic Review Patterns
  • "This is wrong" (no explanation)
  • Style nitpicks an autoformatter should handle
  • Rewriting everything in your style
  • Waiting 3+ days to review
  • Blocking on personal preferences
  • Never approving, always "Request Changes"
// โŒ Unhelpful comment
// "This is inefficient"

// โœ… Helpful comment
// "nit: This filter runs O(nยฒ) because findExpenseById
// searches the full array each iteration. Consider
// converting expenses to a Map first (O(n)) โ€” would
// help when the list grows to 1000+ items."

// Use prefixes to set expectations:
// nit:      trivial style issue, author can ignore
// question: I'm genuinely curious, not blocking
// suggest:  good improvement but not required
// blocker:  must fix before merge
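The change proposed in the helpful comment could look like the sketch below. `findExpenseById` comes from the comment itself; the data shape, `selectedIds`, and the other names are assumed for illustration.

```javascript
// Sample data; the field names are assumptions for illustration.
const expenses = [
  { id: 1, note: "Bought coffee", amount: 4.5 },
  { id: 2, note: "Train ticket", amount: 32 },
  { id: 3, note: "Notebook", amount: 7 },
];
const selectedIds = [3, 1];

// Before: every lookup scans the whole array, so resolving n ids costs O(n²).
function findExpenseById(list, id) {
  return list.find((expense) => expense.id === id);
}
const slow = selectedIds.map((id) => findExpenseById(expenses, id));

// After: build the Map once (O(n)), then each lookup is O(1) on average.
const expensesById = new Map(expenses.map((expense) => [expense.id, expense]));
const fast = selectedIds.map((id) => expensesById.get(id));

console.log(fast.map((expense) => expense.note)); // [ 'Notebook', 'Bought coffee' ]
```

Both versions return the same objects, so the change is safe to make without touching callers.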

๐Ÿ› Systematic Debugging in Production

📚 6 min read · Operations · For all levels

Production bugs are stressful. A systematic approach removes panic and lets you find the root cause in minutes, not hours.

1. Reproduce the problem. Get exact steps. What's the input? What's the expected output? What's the actual output? Never debug without being able to reproduce first.
2. Check recent changes. 80% of production bugs are caused by something that changed recently. Check: recent deploys (git log --oneline -10), config changes, infrastructure changes, external dependencies.
3. Correlate with metrics. When did the error rate spike? Does it correlate with traffic increase, a deploy, or a cron job? Check your Grafana dashboards.
4. Divide and conquer. Isolate the failing component. If it's a slow API: is it the DB query (check slow query log)? The network? The application logic? Binary search the call stack.
5. Fix → Verify → Document. Apply fix, verify error rate returns to baseline, write a short postmortem explaining root cause and prevention.
💡 "Rubber Duck Debugging": Explain the bug out loud to a rubber duck (or a colleague, or yourself). The act of articulating the problem forces you to examine your assumptions. You'll often find the bug before you finish explaining.
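Steps 2 and 4 combine well: once you have an ordered list of recent changes, a binary search finds the first bad one in about log2(n) checks, which is the same idea `git bisect` automates. A minimal sketch, assuming the changes are ordered oldest to newest, the newest is known to be bad, and every change after the first bad one is also bad. The commit IDs and the `isBad` check are hypothetical; in practice `isBad` might deploy a commit to staging and run your reproduction steps.

```javascript
// Find the first bad change in a list ordered oldest -> newest.
// Assumes the last change is bad and that badness persists once introduced.
function findFirstBad(changes, isBad) {
  let low = 0; // earliest index that could still be the first bad change
  let high = changes.length - 1; // index known (by assumption) to be bad
  let checks = 0;
  while (low < high) {
    const mid = Math.floor((low + high) / 2);
    checks += 1;
    if (isBad(changes[mid])) {
      high = mid; // first bad change is at mid or earlier
    } else {
      low = mid + 1; // first bad change is after mid
    }
  }
  return { firstBad: changes[low], checks };
}

// Hypothetical deploy history; the bug was introduced in "f2a8".
const deploys = ["a1f3", "b7c2", "c9d4", "d0e5", "e6f1", "f2a8", "0b39", "14c0"];
const brokenFrom = deploys.indexOf("f2a8");
const isBad = (sha) => deploys.indexOf(sha) >= brokenFrom;

console.log(findFirstBad(deploys, isBad)); // { firstBad: 'f2a8', checks: 3 }
```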

๐Ÿ—„๏ธ Safe Database Migrations

📚 5 min read · Database · For intermediates

Database migrations are the #1 cause of deployment-related outages. Run them wrong and you lock your tables, corrupt data, or can't roll back. Run them right and they're invisible.

1. Avoid long table locks in production. An ALTER TABLE that rewrites the table can hold an exclusive lock for minutes on large tables; adding a column with a default does this on PostgreSQL versions before 11. Adding a nullable column with no default, such as ALTER TABLE users ADD COLUMN phone VARCHAR(20), is a metadata-only change in Postgres and completes almost instantly once it acquires its lock.
2. Expand/Contract pattern for zero-downtime. When renaming a column: (1) Add new column, (2) Write to both old and new, (3) Backfill old data to new column, (4) Read from new column, (5) Remove old column. Five deploys, zero downtime.
3. Always test on a production-sized copy. A migration that takes 50ms on 1,000 rows takes about 14 hours on 1 billion rows if its cost grows linearly. Use pg_dump to copy production data to staging and test migrations there first.
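The five expand/contract steps can be modeled on plain objects to show what the application writes and reads during each deploy. This is only a sketch: the column names `name` and `full_name`, the phase names, and the helper functions are all hypothetical, and a real migration would run against the database, not in-memory rows.

```javascript
// Hypothetical rename of column `name` to `full_name`.
// Each phase corresponds to one deploy in the expand/contract sequence.
const PHASES = ["add_new", "dual_write", "backfill", "read_new", "drop_old"];

// Write path: which columns the application writes in a given phase.
function writeRow(row, value, phase) {
  const step = PHASES.indexOf(phase);
  if (step < 4) row.name = value; // keep writing the old column until it is dropped
  if (step >= 1) row.full_name = value; // write the new column once dual-write starts
  return row;
}

// Read path: switch to the new column only after the backfill has run.
function readValue(row, phase) {
  return PHASES.indexOf(phase) >= 3 ? row.full_name : row.name;
}

// Backfill: copy old values into the new column where it is still empty.
function backfill(rows) {
  for (const row of rows) {
    if (row.full_name == null) row.full_name = row.name;
  }
}

// One row written before the migration, one written during dual-write.
const legacy = { name: "Ada" };
const recent = writeRow({}, "Grace", "dual_write");
backfill([legacy, recent]);

console.log(readValue(legacy, "read_new")); // Ada
console.log(readValue(recent, "read_new")); // Grace
```

The ordering matters: reads move to the new column only after every row has a value there, and the old column is dropped only after nothing reads it.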
-- โŒ Dangerous: locks table, can timeout
ALTER TABLE expenses ADD COLUMN currency VARCHAR(3) DEFAULT 'USD' NOT NULL;

-- โœ… Safe: three-step migration
-- Migration 1: Add nullable column (instant)
ALTER TABLE expenses ADD COLUMN currency VARCHAR(3);

-- Migration 2: Backfill in batches (non-blocking)
UPDATE expenses SET currency = 'USD'
WHERE currency IS NULL AND id BETWEEN 1 AND 10000;
-- Run in batches, not all at once

-- Migration 3: Add NOT NULL constraint after backfill (fast)
ALTER TABLE expenses ALTER COLUMN currency SET NOT NULL;
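The batched backfill in Migration 2 is usually driven by a script, not typed by hand. The sketch below generates one parameterized UPDATE per id range. The table and column names come from the migration above; the function names, the batch size of 10,000, and the `{ text, values }` shape (the query-config form that node-postgres accepts) are assumptions for illustration.

```javascript
// Yield inclusive [start, end] id ranges covering minId..maxId in fixed-size batches.
function* idBatches(minId, maxId, batchSize) {
  for (let start = minId; start <= maxId; start += batchSize) {
    yield [start, Math.min(start + batchSize - 1, maxId)];
  }
}

// Build one parameterized UPDATE per batch.
function backfillStatements(minId, maxId, batchSize = 10000) {
  const statements = [];
  for (const [start, end] of idBatches(minId, maxId, batchSize)) {
    statements.push({
      text: "UPDATE expenses SET currency = 'USD' WHERE currency IS NULL AND id BETWEEN $1 AND $2",
      values: [start, end],
    });
  }
  return statements;
}

const statements = backfillStatements(1, 25000);
console.log(statements.map((s) => s.values));
// [ [ 1, 10000 ], [ 10001, 20000 ], [ 20001, 25000 ] ]
```

Running each statement in its own transaction, with a short pause between batches, keeps lock durations short and gives replication time to catch up. The `currency IS NULL` filter makes the script safe to rerun if it is interrupted.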