【SOP】Checklist
Li Wei
Related Personnel
| Role | Name |
|---|---|
| Online RD | |
| Double‑Check List Review | |
| Person‑in‑Charge | |
Six‑Point Guidelines
| Six‑Point | Who Fills In | Who Reviews / Assigns Checker / TL |
|---|---|---|
| Testing Required | Self‑test cases: | Check that testing is sufficient |
| Awareness Required | Already informed: | Check that it meets the release standards |
| Review Required | Already applied: | Check that it meets the release standards |
| Gray‑Release Required | During gray release, verify that the gray machines are running normally and that there are no special alerts | Check that the gray plan and gray code are complete |
| Observation Required | Observe machines for anomalies and alerts: | Check that the observation items are complete |
| Rollback Required | Code can be rolled back directly | |
Code‑Related
| Service | Development Branch | Pull Request | Code Review | Completion Date | Code Reviewer |
|---|---|---|---|---|---|
| | | | | 2023‑02‑01 | |
Release Order
Tip: Includes the service itself, third‑party packages, front‑end, and callers/providers.
| Project | Order | Start Release Time | Publisher | Notes (e.g., gray/full) |
|---|---|---|---|---|
| A | 1 | 2023‑02‑01 20:00 | @xx | Gray release on one node, observe 10 min, then release in three batches; after each batch observe for a period before proceeding |
| B | 2 | 2023‑02‑01 20:30 | @xx | Front‑end |
| C | 3 | 2023‑02‑01 21:00 | @xx | ×× business, caller/consumer/provider |
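The staged rollout described in the notes for project A (one gray node first, then the remaining nodes in three batches) can be sketched as below. This is only an illustration of the staging idea, not a deployment tool: the helper name `plan_rollout`, the node names, and the near-even split of the remaining nodes are all assumptions, and the observation pause between stages is left to the operator.

```python
from typing import List


def plan_rollout(nodes: List[str], batches: int = 3) -> List[List[str]]:
    """Return rollout stages: one gray node first, then the rest in `batches` groups.

    Hypothetical helper; the in-order, near-even split is an assumption.
    """
    if not nodes:
        return []
    gray, rest = nodes[:1], nodes[1:]
    stages = [gray]
    # Spread the remaining nodes as evenly as possible across the batches.
    size, extra = divmod(len(rest), batches)
    start = 0
    for i in range(batches):
        end = start + size + (1 if i < extra else 0)
        if end > start:  # skip empty batches when there are few nodes
            stages.append(rest[start:end])
        start = end
    return stages


if __name__ == "__main__":
    for i, stage in enumerate(plan_rollout([f"node-{n}" for n in range(1, 8)])):
        label = "gray" if i == 0 else f"batch {i}"
        # Observe for a period after each stage before proceeding to the next.
        print(label, stage)
```

With seven nodes this yields one gray stage followed by three batches of two nodes each; with fewer nodes than batches, the empty batches are simply dropped.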
Rollback Order
| Project | Order | Notes (important considerations, e.g., dependencies on services outside the team, impact of rollback) |
|---|---|---|
| C | 1 | Need to notify the xx team; roll back service yyy first |
| B | 2 | |
| A | 3 | |
Configuration Check (RD)
Note: All configuration items must be described in the remarks.
| # | Configuration Item | Action | Check Description | Remarks |
|---|---|---|---|---|
| 1 | DB configuration | – | Not applicable | Completed – online SQL tables match offline; do not modify during online application |
| 2 | ES configuration | – | Not applicable | Completed |
| 3 | Config Center | – | Not applicable | Completed |
| 4 | Scheduled‑task configuration | – | Not applicable | Completed – verify that task execution times meet expectations |
| 5 | Message‑queue configuration | – | Not applicable | Completed – check code for correct namespace, topic, consumer‑group settings; ensure producer clusters and consumer machines are balanced; verify idempotent consumption handling; confirm deserialization is reasonable and tolerant of new fields (ignore unknown fields); if dead‑letter functionality is needed, enable it both in code and on the platform (otherwise startup fails) |
| 6 | Thread‑pool configuration | – | Not applicable | Completed |
| 7 | pom changes | – | Not applicable | Completed – diffed JAR against online stable version, no issues |
| 8 | SNAPSHOT → official VERSION | – | Not applicable | Completed |
| 9 | Branch merged to master | – | Not applicable | Completed – merged |
| 10 | Auth added to own‑service interfaces | – | Not applicable | Completed – auth added; confirm health‑check URL for new service is configured |
| 11 | Auth opened on dependent‑service interfaces | – | Not applicable | Completed – auth added; confirm health‑check URL for dependent service is configured |
| 12 | Release order clarified | – | Not applicable | Completed |
| 13 | Rollback plan defined | – | Not applicable | Completed – severe issues trigger immediate rollback of this release |
| 14 | Degradation plan defined | – | Not applicable | Completed |
| 15 | Gray release enabled | – | Not applicable | Completed |
| 16 | Impact assessment of release code performed | – | Not applicable | Completed – data‑source compatibility solution implemented |
| 17 | Config Center details | – | – | |

Config Center details:

| Service | Config key | Config type | Online value | Remarks |
|---|---|---|---|---|
| | hawkeye.rank.demotion.list.resource.info.map | json | Same as offline | HawkEye demotion list |
Note: Configuration checks should be tailored per team.
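Two of the message‑queue checks in row 5, tolerating new fields during deserialization and handling duplicate deliveries idempotently, can be illustrated with a small sketch. Everything here is hypothetical: the `OrderEvent` schema, the `message_id` deduplication key, and the in‑memory set are assumptions chosen for illustration; a real consumer would use the team's own message format and a durable deduplication store.

```python
import json
from dataclasses import dataclass, fields
from typing import List, Set


@dataclass
class OrderEvent:
    """Hypothetical message schema, used only for this illustration."""
    message_id: str
    order_id: int


def parse_event(raw: str) -> OrderEvent:
    """Deserialize JSON, ignoring any fields this consumer does not know about."""
    data = json.loads(raw)
    known = {f.name for f in fields(OrderEvent)}
    return OrderEvent(**{k: v for k, v in data.items() if k in known})


class IdempotentConsumer:
    """Processes each message_id at most once (in-memory sketch only)."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()  # assumption: a real service needs a durable store
        self.processed: List[int] = []

    def handle(self, raw: str) -> bool:
        """Return True if the message was processed, False if it was a duplicate."""
        event = parse_event(raw)
        if event.message_id in self._seen:
            return False  # duplicate delivery, skip side effects
        self._seen.add(event.message_id)
        self.processed.append(event.order_id)
        return True
```

A producer that starts sending an extra field does not break this consumer, because unknown keys are dropped before the object is built. A message that lacks a required field still fails with a `TypeError`, which is usually the desired behavior.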
Business Configuration (RD)
Explanation: Primarily for business‑level page settings managed by RD, such as data‑objectification configs, feature toggles, masking rules, API permission settings, etc.
If not applicable, write: “No such configuration.”
Alert Configuration
Required: External API calls, Kafka consumption, custom Transactions, Events, etc., must have alerts set in Raptor. If none are needed, state so.
| Alert Item | Description | Configured? | Notes |
|---|---|---|---|
| XXX | Type: Transaction; Metric: TP9999; Alert level: P2; Condition: raw value > 100, triggered 3 times within a 5‑minute window | Configured | – |
| – | Not applicable | – | – |
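To make the example condition concrete (raw value > 100, triggered 3 times within a 5‑minute window), here is a minimal sketch of one way such a rule could be evaluated. It is not Raptor's implementation, and Raptor's exact semantics may differ: the class name `ThresholdAlert`, the sliding‑window interpretation, and the use of timestamps in seconds are all assumptions.

```python
from collections import deque
from typing import Deque


class ThresholdAlert:
    """Sliding-window threshold rule (illustrative; not Raptor's actual logic)."""

    def __init__(self, threshold: float = 100.0, times: int = 3,
                 window_s: float = 300.0) -> None:
        self.threshold = threshold
        self.times = times
        self.window_s = window_s
        self._breaches: Deque[float] = deque()  # timestamps of recent breaches

    def observe(self, timestamp_s: float, value: float) -> bool:
        """Record one sample; return True while at least `times` breaches
        lie within the last `window_s` seconds."""
        # Drop breaches that have fallen out of the window.
        while self._breaches and timestamp_s - self._breaches[0] >= self.window_s:
            self._breaches.popleft()
        if value > self.threshold:
            self._breaches.append(timestamp_s)
        return len(self._breaches) >= self.times
```

Feeding samples at 0 s, 120 s, and 180 s that all exceed 100 satisfies the condition on the third breach, whereas breaches spread more than five minutes apart never do.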
Release Checklist
When using a gray release, first deploy to gray machines. If the feature has a switch, verify the default state works during the gray period. After the full release, toggle the switch and watch for correct behavior. Throughout and after deployment, monitor all system metrics.
| Item | Owner | Content | Status | Remarks |
|---|---|---|---|---|
| Log check (post‑release) | | Verify that problem logs have not surged; spot‑check logs on machines for new errors or an increase in error count | OK / Issue | |
| Machine‑metric check | | Metrics to watch: load.1minPerCPU (1‑minute load per CPU: total load ÷ number of cores); cpu.idle (% of CPU idle time); ss.estab (number of established TCP connections); ss.closewait (number of TCP connections in CLOSE‑WAIT); jvm.gc.count (GC count); jvm.memory.used (total memory used); jvm.thread.count (total threads); jvm.thread.blocked.count (blocked threads) | OK / Issue | |
| Core business logic | | Each team has its own core business; list them here. Observe these aspects during the release. | OK / Issue | |
| Verification approach | | Code‑logic coverage verification. After full release, promptly track impact (ensure product usage is normal, confirm no anomalies, and verify that features are actually being used). | OK / Issue | |
| Message‑queue verification | | Check for backlog | OK / Issue | |
| Mark release as stable | | After release, click “Finish Release” and ensure the release package is marked stable | OK / Issue | |
| Other checks | | Fill in as needed | OK / Issue | |
| Requirement pool status update | | Points to note: feature requests should move to “Closed” after testing; after release, requirement status and full‑code timestamps should be automatically updated correctly | OK / Issue | |
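The load.1minPerCPU metric in the machine‑metric check is defined as total 1‑minute load divided by the number of cores. The sketch below shows that arithmetic and one way to read the live value with the Python standard library. The function names are hypothetical, and the monitoring platform may compute the metric differently.

```python
import os
from typing import Optional


def load_per_cpu(load_1min: float, cores: int) -> float:
    """1-minute load per CPU: total load divided by the number of cores."""
    if cores <= 0:
        raise ValueError("cores must be positive")
    return load_1min / cores


def current_load_per_cpu() -> Optional[float]:
    """Read the live value; returns None where load averages are unavailable."""
    cores = os.cpu_count()
    if cores is None or not hasattr(os, "getloadavg"):
        return None  # e.g. platforms without os.getloadavg
    try:
        return load_per_cpu(os.getloadavg()[0], cores)
    except OSError:
        return None


if __name__ == "__main__":
    print("load.1minPerCPU:", current_load_per_cpu())
```

For example, a 1‑minute load of 8.0 on a 4‑core machine gives 2.0 per CPU.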
Originally written by Li Wei (李唯_) and published in Chinese on 后端技术栈全书 (Full-Stack Backend Engineering). Translated and adapted for DriftSeas with permission.