sysctl, load modules, or debug “kernel” vs “application” issues, you’re straddling this boundary.
/bin, /sbin: essential binaries; /etc: config; /var: variable data (logs, spool); /tmp: temp (often cleared); /usr: installed programs, libraries, and read-only shared data; /opt: add-on software; /home: users; /dev: device nodes; /proc, /sys: virtual kernel interfaces.
Logs live in /var/log, service configs in /etc.
readlink -f for canonical symlink target.
ls -i shows inode numbers. Copying a file creates a new inode; moving within the same FS usually renames the directory entry only.
Inode exhaustion: df -i shows 100%, common with millions of tiny files.
df -h OK → check inodes.
Unit types include .service, .timer, and .socket. Targets group units (roughly map to runlevels: multi-user.target, graphical.target).
systemctl isolate rescue.target: change target; systemctl get-default: show the default boot target.
/etc/init.d scripts may still exist via compatibility layers.
chmod 755 and chmod 640 in rwx terms.
755 = rwxr-xr-x: owner full, group/others read+execute (typical scripts).
640 = rw-r-----: owner rw, group r, others nothing (common for a secrets file with the correct group).
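The octal-to-rwx mapping can be checked mechanically. The following is a minimal bash sketch; the function name octal_to_rwx is made up for illustration, and it handles only the three permission digits (no setuid, setgid, or sticky bit).

```shell
#!/usr/bin/env bash
# Translate a three-digit octal mode into its rwx string.
# Each octal digit indexes a lookup table: 4 = r, 2 = w, 1 = x.
octal_to_rwx() {
  local mode=$1 out="" i
  local -a bits=(--- --x -w- -wx r-- r-x rw- rwx)
  for ((i = 0; i < ${#mode}; i++)); do
    out+=${bits[${mode:i:1}]}
  done
  printf '%s\n' "$out"
}

octal_to_rwx 755   # rwxr-xr-x
octal_to_rwx 640   # rw-r-----
```

On a real file, GNU stat -c '%A %a' prints both forms side by side.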
umask and how does it affect new files and directories?
umask clears permission bits from the mode a process requests (files often 666, dirs 777 before mask). umask 022 → files 644, dirs 755.
027 is stricter: it removes group write and all access for others (files 640, dirs 750). Service umask can be set in systemd unit UMask=.
setuid: the executable runs with the file owner's privileges (passwd, sudo). Security-sensitive; Linux ignores setuid on scripts, and nosuid mounts disable it.
Sticky bit (/tmp): only owner can delete/rename own files despite world-writable dir.
ls -l (s/t in mode) or find / -perm -4000 for audits.
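The umask effect described earlier in this answer is easy to observe directly. This is a rough sketch assuming GNU coreutils (stat -c, mktemp); the helper name modes_under_umask is hypothetical, and default ACLs on the temp directory, if any, could change the results.

```shell
#!/usr/bin/env bash
# Show the modes that a new file and directory receive under a given umask.
# The subshell keeps the umask change from leaking into the caller.
modes_under_umask() {
  local mask=$1 dir
  dir=$(mktemp -d)
  (
    umask "$mask"
    touch "$dir/file"      # touch requests 666, the umask clears bits
    mkdir "$dir/subdir"    # mkdir requests 777, the umask clears bits
    stat -c '%a %n' "$dir/file" "$dir/subdir" | sed "s|$dir/||"
  )
  rm -rf "$dir"
}

modes_under_umask 022   # 644 file, 755 subdir
modes_under_umask 027   # 640 file, 750 subdir
```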
setfacl) instead of only chmod/chown?
getfacl -R).
chmod -R 777 almost always wrong in production?
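kill.
Zombie: the child has exited but the parent has not called wait(), so the kernel keeps a minimal process-table entry with the exit status. Uses almost no resources but clutters ps (Z state).
If the parent dies, init adopts and reaps. You cannot kill -9 a zombie (already dead).
systemctl restart vs reload vs reload-or-restart?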
journalctl effectively for service debugging?
journalctl -u nginx -f: follow unit; --since "10 min ago": window; -b: current boot only; -p err: priority filter; -o json-pretty for structured export.
journalctl _PID=1234: entries logged by that PID. Persistent journals require /var/log/journal to exist or Storage=persistent in journald.conf.
Also check /var/log when services log both places.
journalctl -xe for boot errors: classic first step after failed deploy.
OOM killer signs: Killed in shell, dmesg / journal OOM lines. Mitigations: limits (systemd MemoryMax=, cgroups), right-size workloads, fix leaks, add RAM/swap carefully.
vm.overcommit_memory tuning affects allocation behavior: know it exists, don’t over-tweak without measurement.
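To find the OOM lines mentioned above, kernel log text can be piped through a filter. This is a sketch only: the function name oom_lines is invented here, and the pattern is a guess at common kernel wording, which varies between kernel versions.

```shell
#!/usr/bin/env bash
# Filter kernel log text on stdin down to lines that look like OOM-killer activity.
# The pattern is an assumption about typical messages, not an exhaustive list.
oom_lines() {
  grep -Ei 'out of memory|oom-kill|invoked oom-killer|oom_reaper'
}

# Typical sources (may need root); pick whichever the host provides:
#   dmesg -T | oom_lines
#   journalctl -k -b --no-pager | oom_lines
```

Reading from stdin keeps the filter usable on saved logs from another machine.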
df and du disagree on disk usage?
df sees filesystem block allocation; du walks the directory tree summing the disk usage of the files it can reach. Disagreements come from: deleted files still open (space held until FD closed; check lsof +L1), other mounts hiding paths, sparse files, internal FS reserved blocks.
df -i: IUse% at 100%. Errors like “No space left on device” while df -h shows free space often mean inodes.
Causes of a read-only filesystem: mount -o remount,ro, cloud hypervisor storage glitch, full disk on root during boot.
Check dmesg, journalctl -b, SMART / cloud console for volume health. mount | column -t shows ro flags.
fsck offline if ext4 suggests, then remount rw.
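Returning to the df versus du question: sparse files, one of the listed causes, can be reproduced in a few lines. This sketch assumes GNU coreutils (truncate, du --apparent-size) and a filesystem that supports sparse files; sparse_demo is a hypothetical name.

```shell
#!/usr/bin/env bash
# Create a sparse file and compare its apparent size with its allocated size.
sparse_demo() {
  local f
  f=$(mktemp)
  truncate -s 100M "$f"    # sets the length without writing data blocks
  echo "apparent_kb=$(du -k --apparent-size "$f" | cut -f1)"
  echo "allocated_kb=$(du -k "$f" | cut -f1)"
  rm -f "$f"
}

sparse_demo
```

The apparent size is what ls -l reports; the allocated size is what counts against the filesystem.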
/etc/fstab?
device mountpoint fstype options dump pass, e.g. UUID=... / ext4 defaults 0 1.
pass = fsck order (root usually 1, others 2 or 0). dump is a legacy backup flag (often 0). Options include noatime, nodev, nfs specifics.
Use blkid for stable UUIDs instead of /dev/sdX names.
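The six-field layout can be made explicit by labelling each column. This is a small sketch using awk; describe_fstab is a hypothetical helper, and it assumes whitespace-separated fields with missing dump/pass values treated as 0, as fstab(5) describes.

```shell
#!/usr/bin/env bash
# Print the six fstab fields by name for every non-comment, non-blank entry.
describe_fstab() {
  awk '
    /^[[:space:]]*(#|$)/ { next }    # skip comments and blank lines
    {
      printf "device=%s mountpoint=%s fstype=%s options=%s dump=%s pass=%s\n",
             $1, $2, $3, $4, ($5 == "" ? 0 : $5), ($6 == "" ? 0 : $6)
    }
  ' "${1:-/etc/fstab}"
}

# Usage: describe_fstab            (reads /etc/fstab)
#        describe_fstab ./fstab.new (checks a draft before installing it)
```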
fsck and what are the risks?
fsck checks/repairs filesystem metadata. Run on unmounted or read-only FS (or boot maintenance). Running on mounted rw ext4 is unsafe.
Other filesystems have their own tools (xfs_repair, btrfs check); name awareness scores points.
ss -tlnp output?
-t TCP, -l listening sockets, -n numeric (no DNS), -p process. Columns show State, Recv-Q/Send-Q, Local Address:Port, Peer, users:(("nginx",pid=...)).
ss replaces netstat. Use ss -tulpn for UDP too.
If a port is unreachable, suspect the bind address (127.0.0.1 vs 0.0.0.0), or firewall.
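The bind-address check can be scripted against ss output. The sketch below assumes the column layout that ss -Htln prints when only TCP is selected (State, Recv-Q, Send-Q, Local Address:Port, Peer); classify_listeners and the scope labels are made up for illustration.

```shell
#!/usr/bin/env bash
# Read `ss -Htln` output on stdin and label each listener by bind scope.
# Assumed columns: State Recv-Q Send-Q Local-Address:Port Peer-Address:Port
classify_listeners() {
  awk '
    {
      port = $4; sub(/.*:/, "", port)       # text after the last colon
      addr = $4; sub(/:[^:]*$/, "", addr)   # text before the last colon
      scope = (addr ~ /^127\./ || addr == "[::1]") ? "loopback-only" : "non-loopback"
      printf "%s %s %s\n", port, addr, scope
    }
  '
}

# Usage: ss -Htln | classify_listeners
```

A service showing loopback-only will refuse remote clients no matter what the firewall allows.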
dig +trace example.com: full resolution path; dig @8.8.8.8: specific resolver; host/nslookup for quick checks.
Check /etc/resolv.conf, systemd-resolved (resolvectl), VPC DNS settings in cloud.
getent hosts uses NSS, not only DNS.
curl flags for API and load-balancer debugging?
-v verbose TLS + headers; -I HEAD only; -w '%{http_code}\n' format output; --connect-timeout / --max-time; -H 'Host: ...' vhost tests (SNI still comes from the URL host; use --resolve to test SNI against a specific IP); -k skip TLS verify (debug only).
-L follow redirects. Combine with -o /dev/null -s for silent timing checks.
net.ipv4.ip_local_port_range, net.ipv4.tcp_tw_reuse (careful, context-specific), fix app to reuse connections.
iptables -L -n -v or nft list ruleset: both require root, and defaults are distro-specific.
set -euo pipefail do in bash?
-e: exit on first command failure (subtleties in conditionals). -u: error on unset variables. -o pipefail: pipeline fails if any stage fails (not only last).
IFS=$'\n\t' sometimes for parsing safety.
For commands that may fail, use || true or explicit check.
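The three options can be seen in isolation by running tiny scripts in child shells. This is a sketch under the assumption that bash is on PATH; the function names are made up, and each one prints the child's exit status.

```shell
#!/usr/bin/env bash
# Each case runs in a child bash so a failure cannot abort this script.

# Pipeline status without pipefail: only the last stage counts.
status_default()  { bash -c 'false | true'; echo $?; }

# With pipefail: the failing first stage makes the pipeline fail.
status_pipefail() { bash -c 'set -o pipefail; false | true'; echo $?; }

# With -e: the script stops at `false`, so the echo never runs.
status_errexit()  { bash -c 'set -e; false; echo "not reached"'; echo $?; }

# With -u: expanding an unset variable is a fatal error.
status_nounset()  { bash -c 'set -u; echo "$no_such_variable"' 2>/dev/null; echo $?; }

status_default    # 0
status_pipefail   # 1
status_errexit    # 1
status_nounset    # non-zero
```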
set -x for verbose trace when debugging.
"$var")?
Unquoted expansions undergo word splitting and glob (*) expansion.
"$var" preserves single argument. Exception: rare cases you want splitting; be explicit.
Put $(cmd) in quotes when passing to other commands.
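Counting the arguments a command actually receives makes the difference visible. A minimal sketch, assuming the default IFS; count_args and the filename are hypothetical.

```shell
#!/usr/bin/env bash
# Print how many arguments were received.
count_args() { echo "$#"; }

name="release notes.txt"          # hypothetical filename containing a space

count_args $name                  # unquoted: split into 2 words
count_args "$name"                # quoted: stays 1 argument
count_args "$(printf 'a b\n')"    # quoted command substitution: 1 argument
```

The same splitting is what makes rm $file dangerous when a name contains spaces or a glob character.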
find ... -print0 | xargs -0?
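A quick way to see why this pattern matters is to count files whose names contain a space or a newline. This is a sketch assuming GNU findutils (xargs -r); both function names are invented for the comparison.

```shell
#!/usr/bin/env bash
# Count files by running one printf per name that xargs parses.

# NUL-delimited: each filename arrives intact.
count_files_safely() {
  find "$1" -type f -print0 | xargs -0 -r -n 1 printf 'x\n' | wc -l | tr -d ' '
}

# Newline/space-delimited: names with whitespace are split into several words.
count_files_naively() {
  find "$1" -type f | xargs -r -n 1 printf 'x\n' | wc -l | tr -d ' '
}

# Usage: count_files_safely /some/dir
```

When the two counts differ, the naive pipeline would have passed broken names to whatever command replaces printf.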
-print0 emits NUL-delimited records; xargs -0 reads NUL-delimited: safe for arbitrary filenames.
find -exec cmd {} + avoids separate xargs process.
strace / perf at a high level for troubleshooting?
read. Use strace -f -p PID or wrap command. Overhead can be high.
Parent directories need x (execute) for traversal; SELinux/AppArmor denials (ausearch/aa-status); NFS root squash; immutable flag (lsattr); read-only FS.
namei -l /path/to/file shows per-component permissions.
vmstat 1, iostat -xz, ps aux | awk '$8 ~ /^D/'.
iotop, blktrace for deep disk: name-drop if you’ve used them.
getenforce, sestatus, ausearch -m avc, audit2why; AppArmor aa-status, logs in journal.
Fix labels (chcon/semanage fcontext) or profiles rather than permanently disabling without policy approval.
uptime, load; df -h / df -i; free -m; dmesg -T | tail; journalctl -p err -b --no-pager | tail; check critical service systemctl status; ss -tlnp for listeners.
strace, inode check, journal, network trace), result: restored service, time to recover.