Phase Status Technical Workflows
Overview
The DCP Data Access Service (DAS) uses a synchronous state management system. This was a short-term design choice that offers simplicity and speed to market. However, it is too tightly coupled and does not scale as well as an asynchronous state system. We describe it as synchronous because the handler for one state is responsible for actioning the phase to the next state. For example, the /approve callback is responsible for the logic needed to action to Starting. This unnecessarily blurs logic between states.
In the near future, after GA, when scale becomes an issue, we will make the state handling more asynchronous with queues and listeners that listen for states. For example, the Starting listener will listen for items in an Approved state and perform the logic needed to action to Starting. This allows us to accommodate a state that takes longer by adding more listeners and scaling horizontally.
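As a rough illustration of the listener idea, the sketch below uses an in-memory queue in place of a real message queue. The `Phase` class, `approved_queue`, `on_approved`, and `starting_listener` are hypothetical names invented for this example; they are not part of DAS. The point is only that the approval handler records the state and enqueues, while a separate listener owns the logic for moving to Starting.

```python
import queue
from dataclasses import dataclass


@dataclass
class Phase:
    """Hypothetical stand-in for a deployment phase record."""
    phase_id: str
    state: str


# Assumption: an in-memory queue stands in for a durable queue of Approved phases.
approved_queue = queue.Queue()


def on_approved(phase: Phase) -> None:
    """Approval handler: set the state and enqueue; it does not start the phase."""
    phase.state = "Approved"
    approved_queue.put(phase)


def starting_listener(q: queue.Queue) -> list:
    """Drain the queue and move each Approved phase to Starting.

    Several copies of this listener could run against the same queue,
    which is how a slow state would be scaled horizontally.
    """
    moved = []
    while True:
        try:
            phase = q.get_nowait()
        except queue.Empty:
            break
        if phase.state == "Approved":
            # A real listener would submit the deploy workflow here.
            phase.state = "Starting"
            moved.append(phase)
        q.task_done()
    return moved
```

With this split, the /approve callback would only need to call something like `on_approved`, and the logic for actioning to Starting would live entirely in the listener.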
Details
Note: when a phase begins by starting or approving, the parent Deployment is set to its "Started" state.
- deployment phase created, state set to NEW
  - in the future, we will raise a 'Created' event.
- initialize (set Approving or Starting)
  - evaluate any checks for the Phase's environment
  - if pre-requisites are unmet for the Phase's environment, state is set to Approving. If all pre-requisites are approved or none exist, state is set to Starting.
  - if the phase is kgstg or kprod, the phase will be scheduled, even if scheduledAt is the current time (gusCaseAuto)
- Approving
  - submits approval workflow (human, gus case, etc.)
  - waits for approval or rejection.
  - Argo workflow creates case with Gus CLI and then calls back to DCP to set the case Id, risk, and scheduledAt
- Approved
  - workflow calls back to /approve route when the approval is approved or rejected
  - if approved
    - Approved event emitted
    - immediately begin "starting step" or set to StartScheduled
    - begin "starting step" logic if state is not StartScheduled
  - if approval is rejected
    - Cancel event emitted
    - see Cancel logic below.
- Starting
  - if a gov env, check that the scheduledAt UTC time is after the current UTC time; cancel the phase if not.
  - if state = Starting, submits deploy workflow
  - Starting or StartScheduled event emitted
- StartScheduled
  - for a gov phase, we may need to start the phase after the scheduledAt time so that we can support release windows.
  - periodically, we will query for phases in this state and start them when ready.
- Started
  - workflow calls back to say work is beginning
  - Started event emitted
  - Completed event emitted
- Completed
  - workflow calls back to say work is successful
  - Completed event emitted
  - complete deploy set if all phases done
  - start next deployment phase
- Failed
  - event emitted (handled)
  - fail deploy set if not already failed
- Cancelled
  - cancel deploy set if not already cancelled
  - event emitted (? handled)
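The states listed above can be summarized as a small state machine. The following is a minimal sketch, not the DAS implementation: `PhaseState`, `ALLOWED`, `Phase`, `can_transition`, `initial_state`, and `due_to_start` are hypothetical names. The transition table only covers moves named in the list, and which states may move to Failed or Cancelled is an assumption, since the list does not enumerate them.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from typing import Optional


class PhaseState(Enum):
    NEW = "New"
    APPROVING = "Approving"
    APPROVED = "Approved"
    STARTING = "Starting"
    START_SCHEDULED = "StartScheduled"
    STARTED = "Started"
    COMPLETED = "Completed"
    FAILED = "Failed"
    CANCELLED = "Cancelled"


S = PhaseState
# Assumption: sources of FAILED and CANCELLED are guesses; the list above
# only says that a rejected approval or a bad gov schedule cancels the phase.
ALLOWED = {
    S.NEW: {S.APPROVING, S.STARTING},
    S.APPROVING: {S.APPROVED, S.CANCELLED},
    S.APPROVED: {S.STARTING, S.START_SCHEDULED},
    S.STARTING: {S.STARTED, S.CANCELLED},
    S.START_SCHEDULED: {S.STARTING},
    S.STARTED: {S.COMPLETED, S.FAILED},
    S.COMPLETED: set(),
    S.FAILED: set(),
    S.CANCELLED: set(),
}


@dataclass
class Phase:
    """Hypothetical phase record; field names are illustrative only."""
    phase_id: str
    state: PhaseState
    scheduled_at: Optional[datetime] = None


def can_transition(src: PhaseState, dst: PhaseState) -> bool:
    """Return True if the sketch's table allows moving from src to dst."""
    return dst in ALLOWED[src]


def initial_state(unmet_prerequisites: int) -> PhaseState:
    """Initialize step: Approving if any pre-requisite is unmet, else Starting."""
    return S.APPROVING if unmet_prerequisites > 0 else S.STARTING


def due_to_start(phases, now: Optional[datetime] = None) -> list:
    """Periodic query: StartScheduled phases whose scheduledAt has arrived."""
    now = now or datetime.now(timezone.utc)
    return [
        p for p in phases
        if p.state is S.START_SCHEDULED
        and p.scheduled_at is not None
        and p.scheduled_at <= now
    ]
```

A periodic job could call `due_to_start` and move each returned phase to Starting, which corresponds to the "query for phases in this state and start them when ready" step.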
Last Updated: 2024-07-01T19:32:00+0000