Recommended models: Claude 4 Sonnet, Claude Opus 4.6, GPT-5, or Gemini 2.5 Pro. Any frontier model with a 100k+ token context window works well. Claude 4 Sonnet tends to produce the most structured and actionable security findings.

AI Pull Request Reviewer: Get a Senior Engineer's Opinion on Your Code Before Every Merge

June 4, 2026



Why this prompt matters

Most bugs that reach production get past code review because reviews are inconsistent: the reviewer is tired, distracted, or focused on style while missing logic errors. A structured prompt with defined categories and required severity levels forces coverage of the issues that actually matter: correctness, security, and error handling. It also produces reviews that explain the why behind every finding, which builds understanding rather than just producing a checklist of changes to make.

What we use it for

You have a PR open, it's been waiting for review for two days, and the person who normally reviews your security-sensitive code is OOO. Or you are a solo developer shipping a feature and want a second pair of eyes before deploying. Or you are a senior engineer who wants to pre-check your own work before asking colleagues to spend time on it. This prompt gives you a structured review that covers the categories most likely to become production incidents.

Prompt

Role: Act as a senior software engineer with 10+ years of experience doing code reviews. You have a strong bias toward correctness, maintainability, and security. You write reviews that teach — not just list problems — because you want the author to understand the why, not just fix the what.

Context: I am submitting a pull request for review. The code is written in [LANGUAGE: e.g., Python / TypeScript / Go / Rust / Java]. The codebase is [TYPE: e.g., a REST API / a frontend React app / a CLI tool / a data pipeline]. The PR is doing the following: [DESCRIBE WHAT THE PR DOES IN 1-3 SENTENCES]. The most important things for this codebase are [PRIORITIES: e.g., performance / security / readability / test coverage / backward compatibility].

Task: Review this code as a senior engineer doing a thorough pre-merge review. Go beyond surface-level issues. Look for:

1. Correctness bugs — logic errors, edge cases, off-by-one errors, null/undefined handling, incorrect assumptions about input
2. Security vulnerabilities — injection risks, authentication flaws, insecure data handling, exposed secrets, unsafe deserialization
3. Performance issues — unnecessary allocations, N+1 query patterns, blocking operations, missing indexes, inefficient algorithms
4. Maintainability problems — functions doing too much, poor naming, missing abstractions, code duplication, hard-coded values that should be configurable
5. Error handling gaps — unhandled exceptions, swallowed errors, missing logging for failure cases, no retry logic where needed
6. Test coverage — are the happy path, error cases, and edge cases all tested? Are the tests testing the right things?
7. API and interface design — is this the right abstraction? Will this be easy to change later? Are there breaking changes?
8. Concurrency and race conditions — if applicable to the language and context

Constraints:
- Every finding must include: the specific location or pattern (line, function, or concept), the severity (Critical / High / Medium / Low / Nitpick), and the explanation of why it is a problem — not just that it is.
- Do not invent problems. Only flag what you can see in the code.
- If something is genuinely well done, say so briefly — a good review acknowledges what works.
- Do not rewrite the entire function in your review unless the existing implementation is fundamentally broken.
- Separate blocking issues (must fix before merge) from non-blocking suggestions (good to fix, but optional for this PR).
- If you lack enough context to assess something, say so explicitly rather than guessing.

Output Format:
1. Summary (2-3 sentences: overall assessment, biggest concerns, merge readiness)
2. Blocking Issues — must fix before merge (severity, location, explanation, suggested fix)
3. Non-Blocking Issues — recommended improvements (severity, location, explanation)
4. Security Check — pass / flag / needs-closer-look, with brief reasoning
5. Test Coverage Assessment — adequate / thin / missing, with specific gaps
6. What is well done (1-3 specific positives)
7. Suggested questions for the author (things you need clarification on before approving)

Code to review:
[PASTE YOUR DIFF OR CODE HERE]

Result

SUMMARY
This PR adds a user authentication endpoint with JWT token issuance. The core logic is sound and the structure is clean, but there are two blocking security issues (a timing attack vulnerability in password comparison and missing token expiry validation), plus one high-severity missing error case that could expose internal stack traces to API consumers. Test coverage is thin on the error paths. Not merge-ready as-is, but close.

BLOCKING ISSUES

1. Timing attack in password comparison
Severity: Critical
Location: comparePasswords() function, line 34
Problem: Using a simple string equality check (userPassword === hashedPassword) for password comparison is vulnerable to timing attacks. An attacker who sends many requests can measure response time differences to enumerate valid usernames or reconstruct password hashes character by character.
Fix: Replace with a constant-time comparison function. In Node.js use crypto.timingSafeEqual(). In Python use hmac.compare_digest(). Never use == for secret comparisons.

2. JWT token expiry not validated on refresh
Severity: Critical
Location: refreshToken() handler, line 89
Problem: The refresh endpoint checks token signature validity but does not verify the exp claim against the current timestamp. An expired access token can be used to generate a fresh token indefinitely, effectively making tokens immortal and defeating the purpose of expiry.
Fix: Add explicit exp check: if (decoded.exp < Math.floor(Date.now() / 1000)) throw new TokenExpiredError(). Most JWT libraries have a built-in option for this: set ignoreExpiration: false explicitly rather than relying on default behavior.

3. Unhandled database error exposes stack trace
Severity: High
Location: loginHandler() catch block, line 67
Problem: The catch block re-throws the raw database error object directly to the Express error handler, which in the current middleware configuration will serialize the full stack trace into the 500 response body. This leaks internal file paths, dependency versions, and query structure to API consumers.
Fix: Log the raw error internally with a correlation ID, then return a generic error response: res.status(500).json({ error: 'Authentication service error', correlationId: uuid }).

NON-BLOCKING ISSUES

4. Magic numbers for token expiry
Severity: Medium
Location: Lines 45 and 102
Problem: Token expiry durations are hard-coded as 900 and 86400 (seconds). These should be named constants or environment-configurable values so they can be changed without a code deployment, and so the next developer understands what these numbers represent.
Suggestion: const ACCESS_TOKEN_TTL_SECONDS = parseInt(process.env.ACCESS_TOKEN_TTL ?? '900', 10)

5. Missing rate limiting on login endpoint
Severity: Medium
Location: loginHandler route registration
Problem: There is no rate limiting on POST /auth/login. This allows unlimited password brute-force attempts. This might be handled at the infrastructure level (API gateway, load balancer), but if not, it should be at the application level.
Suggestion: Add express-rate-limit or equivalent middleware, or at minimum document that infrastructure-level rate limiting covers this endpoint.

6. Function does two things
Severity: Low (Nitpick)
Location: validateAndDecodeToken(), line 78
Problem: This function both validates the token signature AND extracts the user ID from the payload, returning both as a tuple. These are two distinct concerns and should probably be separate functions, or the return value should be a typed object with clearly named fields rather than a positional tuple.

SECURITY CHECK: FLAG
Blocking issues (1 and 2) make this endpoint insecure in its current state. After fixes, recommend a second security-focused pass.

TEST COVERAGE ASSESSMENT: THIN
Happy path (valid credentials → token returned) is covered. Missing:
- Expired token refresh attempt (the blocking issue above)
- Invalid password attempt (verify 401, not 500)
- Malformed JWT in Authorization header
- Database connection failure during login
- Rate limiting behavior (if applicable)

WHAT IS WELL DONE
1. Token rotation on refresh is implemented correctly: the old refresh token is invalidated when a new one is issued. This is easy to get wrong and it is done right here.
2. Password hashing uses bcrypt with a cost factor of 12, which is appropriate.
3. The middleware composition is clean and easy to follow.

QUESTIONS FOR THE AUTHOR
1. Is there infrastructure-level rate limiting on this endpoint? If yes, please document it in the code or README so it is not duplicated.
2. Is token revocation (logout/blocklist) in scope for this PR or a follow-up? The current architecture does not support it, which should be a conscious decision, not an oversight.

Most developers have experienced both ends of the code review spectrum. At one end: the approver who clicks through in thirty seconds and leaves a single emoji. At the other: the senior engineer who comes back with two pages of line-by-line notes covering correctness, security, maintainability, and testing, and who explains why each issue matters instead of just marking it wrong. The second kind of review makes you a better developer. It is also the kind that takes time to get, especially from people who are themselves overloaded.

This prompt is an attempt to bring part of that second experience to every pull request. It structures the review into the categories that matter (correctness bugs, security vulnerabilities, performance, error handling, test coverage) and asks the model to explain each finding with its location, severity, and reasoning before suggesting a fix.

The problem with generic "review my code" prompts

Ask a language model to "review this code" without structure and you will usually get a flat list of observations with no prioritization and no indication of severity. The model may flag an inconsistent variable name with the same urgency as a timing attack vulnerability. It may congratulate you on your clean structure and then overlook the unhandled null pointer that will crash on the first empty input in production.

The structure in this prompt forces prioritization. Blocking issues, the things that must be fixed before merge, are separated from non-blocking suggestions. Every finding requires a severity level (Critical, High, Medium, Low, Nitpick), a specific location, and an explanation of why the problem exists rather than only what should be done about it. The security check and the test coverage assessment are explicit sections, not afterthoughts.
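If you want to post-process the review, the required fields of a finding map naturally onto a small record type. The following is only a sketch: the `Severity`, `Finding`, and `split_findings` names are hypothetical and not part of the prompt, and the `blocking` flag is assumed to be copied from whichever section of the output the finding appeared in.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    """The five severity levels the prompt asks for, most urgent first."""
    CRITICAL = 0
    HIGH = 1
    MEDIUM = 2
    LOW = 3
    NITPICK = 4


@dataclass(frozen=True)
class Finding:
    title: str
    severity: Severity
    location: str   # line, function, or concept
    why: str        # explanation of why this is a problem
    blocking: bool  # True if it appeared under "Blocking Issues"


def split_findings(findings):
    """Return (blocking, non_blocking), each sorted most severe first."""
    ordered = sorted(findings, key=lambda f: f.severity)
    blocking = [f for f in ordered if f.blocking]
    non_blocking = [f for f in ordered if not f.blocking]
    return blocking, non_blocking
```

Keeping `blocking` separate from `severity` mirrors the prompt, which asks for both a severity level and a blocking or non-blocking classification instead of deriving one from the other.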

How to use it

Fill in the four context fields at the top: the programming language, the type of codebase, a one-to-three-sentence description of what the PR does, and the priorities that matter most for this codebase. Then paste the diff or the relevant code at the bottom.
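Filling the fields by hand is fine for occasional use. If you run the prompt often, a small helper can do the substitution. This is a minimal sketch that assumes you have saved the prompt text, with its bracketed placeholders unchanged, in a string; the `fill_prompt` function and the keyword names are hypothetical.

```python
import re

# One pattern per bracketed placeholder in the prompt text.
# The diff comes last so bracketed text inside it is never
# mistaken for a placeholder.
PLACEHOLDERS = {
    "language": r"\[LANGUAGE:[^\]]*\]",
    "codebase_type": r"\[TYPE:[^\]]*\]",
    "pr_summary": r"\[DESCRIBE[^\]]*\]",
    "priorities": r"\[PRIORITIES:[^\]]*\]",
    "diff": r"\[PASTE YOUR DIFF OR CODE HERE\]",
}


def fill_prompt(template: str, **fields: str) -> str:
    """Replace each placeholder with the matching keyword argument."""
    missing = [name for name in PLACEHOLDERS if not fields.get(name, "").strip()]
    if missing:
        raise ValueError(f"missing context fields: {', '.join(missing)}")

    filled = template
    for name, pattern in PLACEHOLDERS.items():
        # A function replacement keeps backslashes in the value literal.
        filled, count = re.subn(
            pattern, lambda _m, value=fields[name]: value, filled, count=1
        )
        if count == 0:
            raise ValueError(f"placeholder for {name!r} not found in template")
    return filled
```

Refusing to build the prompt when a field is empty is a deliberate choice here, since the review quality depends on all four context fields being present.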

The context fields make a meaningful difference to output quality. A model that knows it is reviewing a Python REST API with security as the top priority picks up on different things than one reviewing Go CLI code with performance as the priority. Describing what the PR is trying to achieve helps the model assess whether the implementation actually meets the goal, rather than only checking style.

For very large PRs, paste the most critical files first (authentication, data handling, public API interfaces) and ask for a separate pass for tooling or configuration changes. Most models with a context window of 100k tokens or more can handle medium-sized PRs in a single pass.
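One way to put the most critical files first is to split the diff per file and sort the paths. The sketch below assumes the input is standard `git diff` output. The `CRITICAL_HINTS` keywords and both function names are hypothetical and would need adjusting to your repository layout.

```python
import re

# Hypothetical path keywords that mark a file as "critical".
CRITICAL_HINTS = ("auth", "login", "token", "crypto", "api")

FILE_HEADER = re.compile(r"diff --git a/(.+?) b/(.+)$")


def split_diff_by_file(diff_text):
    """Split `git diff` output into {path: diff text for that file}."""
    chunks = {}
    current = None
    for line in diff_text.splitlines(keepends=True):
        match = FILE_HEADER.match(line.rstrip("\r\n"))
        if match:
            current = match.group(2)
            chunks[current] = ""
        if current is not None:
            chunks[current] += line
    return chunks


def order_for_review(chunks):
    """Return paths with critical-looking files first, then the rest."""
    def rank(path):
        lowered = path.lower()
        is_critical = any(hint in lowered for hint in CRITICAL_HINTS)
        return (0 if is_critical else 1, path)
    return sorted(chunks, key=rank)
```

The files at the front of the ordered list go into the first pass, and the remaining tooling or configuration files can be sent in a separate one.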

What you get

The output format produces seven sections: a plain-language summary with a merge-readiness assessment, blocking issues with fixes, non-blocking suggestions, a security verdict, a test coverage assessment with specific gaps, an acknowledgment of what was done well, and clarifying questions for the author. The example output in this post shows a review of a JWT authentication endpoint. Note that the timing attack finding explains both the attack mechanism and the specific library function to use in the fix.
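For reference, the Python function named in that finding, `hmac.compare_digest()`, can be used as in the sketch below. The `secrets_match` wrapper is a hypothetical name. For stored password hashes such as bcrypt, the hashing library's own verification function is normally the right tool, so treat this as an illustration for comparing tokens or other secrets of the same kind.

```python
import hmac


def secrets_match(provided: str, expected: str) -> bool:
    """Compare two secrets without leaking where they first differ.

    A plain == can return as soon as it finds a mismatching character,
    so its running time can hint at how much of the value was correct.
    hmac.compare_digest is designed to avoid that content-based
    short-circuiting.
    """
    return hmac.compare_digest(
        provided.encode("utf-8"), expected.encode("utf-8")
    )
```

Encoding to bytes first avoids the `TypeError` that `hmac.compare_digest()` raises for non-ASCII `str` arguments.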

The "What is well done" section is not an optional nicety. Reviews that only enumerate problems lose the signal that good code sends: that the author understands the design intent and should keep doing what works. It also gives developers who have not yet had a senior engineer as a mentor a model of what a thoughtful code review actually looks like.

Notable limitations

This prompt produces better results on code the model saw heavily in training. It catches more issues in Python, TypeScript, Java, and Go than in domain-specific languages. It cannot analyze runtime behavior, profiling traces, or production logs, only what is visible in the diff. For security-sensitive systems, AI code review should complement human security review rather than replace it, especially for cryptographic implementations and authentication flows.

The model sometimes flags false positives: issues that, given context it does not have, are not actually bugs. The constraint asking it to say so explicitly when it lacks context helps reduce this, but treating each finding as a starting point for investigation rather than a final verdict gives better results.

developer tools, code review, software-engineering, debugging, security, prompt, pull-request
