Conversation
Initial zone version only so far, no diffs yet. Lots of code duplication, no docs, etc, i.e. PoC only.
As the DNS XFR wire format used is just an implementation detail, and it could be confusing if it looks like these perhaps relate to incoming XFRs which they do not.
bal-e
left a comment
There was a problem hiding this comment.
@ximon18 and I had a discussion while I was writing this review, and decided that I will add commits changing the PR to implement a new approach for the zone data storage state machine. I've left these comments here for us to come back once I'm done; I also didn't review everything, so there will be more to come.
|
|
||
| /// The index of the current loaded instance. | ||
| pub(super) curr_loaded_index: bool, | ||
|
|
||
| /// The index of the current signed instance. | ||
| pub(super) curr_signed_index: bool, |
There was a problem hiding this comment.
Are these fields necessary, if this state is only used at the very beginning (where they would both be 0/false)?
| pub(super) curr_signed_index: bool, | ||
| } | ||
|
|
||
| pub struct UninitializedStorage { |
There was a problem hiding this comment.
This type (and RestoringPersistedStorage below) need documentation, similar to the other types in this file. I'm happy to write it by adding to the PR.
| assert!( | ||
| old_reviewer.loaded_index == self.curr_loaded_index, | ||
| "'old_reviewer' does not point to the current instance", | ||
| "'old_reviewer' does not point to the current instance: {} vs:{}", |
| let viewer = unsafe { ZoneViewer::new(self.data.clone(), true, true) }; | ||
|
|
||
| let signed = unsafe { &*self.data.signed[0].get() }; | ||
| let serial = signed.soa.as_ref().unwrap().rdata.serial; |
There was a problem hiding this comment.
I don't think the serial number should be extracted here. Perhaps we need methods on LoadedZoneBuilt and SignedZoneBuilt that give you readers so you can observe the data, or just getters for the serial?
| assert!( | ||
| Arc::ptr_eq(&signed_built.data, &self.data), | ||
| "'built' is for a different zone" | ||
| ); |
There was a problem hiding this comment.
You should also check loaded_built. And the assert message should clarify which parameter it is referring to.
| let (transition, state) = if let ZoneDataStorage::Uninitialized(s) = state { | ||
| let s = s.initialize(); | ||
| t.move_to(ZoneDataStorage::Passive(s)); | ||
| transition(&mut self.state.storage.machine) | ||
| } else { | ||
| (t, state) | ||
| }; |
There was a problem hiding this comment.
I'd prefer the above commented-out initialize() function over this silent automatic conversion. Especially since it should only occur at startup and we can handle it outside this function.
There was a problem hiding this comment.
With the current setup, we might actually want to do this check in other places, but it is hard to tell.
| let (s, loaded_reviewer, signed_reviewer, viewer, serial) = | ||
| s.finish(loaded_built, signed_built); | ||
| self.state.storage.loaded_reviewer = loaded_reviewer; | ||
| self.state.storage.signed_reviewer = signed_reviewer; |
There was a problem hiding this comment.
This code does not correctly handle the reviewer types. They should be passed back into the storage state machine, so it can guarantee that viewers only exist for the permitted 0/1 instances. We should discuss what the right adjustments to the storage state machine's flow are.
| /// The zone is in an empty initial uninitialized state pending possible | ||
| /// restoration from persisted state or initialization directly to an | ||
| /// empty passive state. | ||
| Uninitialized(UninitializedStorage), |
There was a problem hiding this comment.
It might be better to omit this state entirely and only have a Restoring state, which can then decide to proceed with or skip restoring. Restoring might already need that functionality, e.g. if the previously persisted instances could not be found.
| Uninitialized(UninitializedStorage), | ||
|
|
||
| /// The zone is being restored from persisted state. | ||
| Restoring(RestoringPersistedStorage), |
There was a problem hiding this comment.
The enum variant name should match the underlying name, for consistency. I think the underlying type should be RestoringStorage.
| @@ -86,9 +94,6 @@ pub enum ZoneDataStorage { | |||
| impl ZoneDataStorage { | |||
| /// Construct a new [`ZoneDataStorage`]. | |||
| pub fn new() -> (Self, LoadedZoneReviewer, SignedZoneReviewer, ZoneViewer) { | |||
There was a problem hiding this comment.
Perhaps we want to change this function so it no longer returns reviewers? That might be easier all around.
WIP: Exploring an idea, learning how the new zone storage state works.
Status
Zone edits (e.g. via file edit and
zone reloador due to changes received via XFR) are persisted both for loaded and signed zones. On application restart persisted zones are restored andcascade zone statusanddig AXFRappear to work as they did prior to application shutdown.The code has been initially reviewed and led to some wanted changes:
zonedatacrate interface to enable moving some of the changes to Cascade itself.save_now()instead ofmark_dirty(). This is because persisting zone data to disk not enough, as if there was a power outage between callingmark_dirty()and publication of the zone the paths to the persisted files (that are recorded in state) would not have been written to disk, thussave_now()should be invoked instead. We may actually want to invokesave_now()in the publication server itself to ensure all published zone related state is persisted, but that is out of scope for this PR.tl;dr
Persisterto write snapshots and deltas to disk (currently in DNS AXFR/IXFR wire format).ZoneStatewith two vecs of paths to written snapshot and delta files.Uninitialized->PassiveandUninitialized->Restoring->PassiveLoader::init()to invoke newLoader::restore()prior to invokingenqueue_refresh().Loader::restore()creates aPassivestorage state populated from parsed snapshot and delta files as if the storage had moved through the pipeline being approved and signed (without actually invoking review hooks or doing signing).PROs/CONs:
Known issues:
unsafe { ... }blocks I had to add are "safe".Alternative ideas
restoringflag that affects the behaviour of existing states so that a zone being restored can be moved through the pipeline without side effects (such as invoking review hooks or signing). This would however complicate the code/logic at multiple existing points in the pipeline for every update to every zone for a situation that only occurs once at startup, while the implemented approach leaves the existing pipeline flow untouched.Alternative snapshot/delta storage formats
A choice would need to be driven by requirements:
Some ideas:
DiffDatato JSON/CBOR/whatever. Would tie the format to the internal format, changing the internal format would require writing additional code to handle storage format migration or support parsing old as well as new format. A PRO of this approach is being able to storage additional meta data with the snapshot/delta, if needed.Issues encountered
transition()does not do a transition, in fact things happen after such calls prior to an actualmove_totransition occurring.switch_loaded_data()andswitch_signed_data().LoadedZonePatcherandSignedZonePatcherbeing separate types with identical interfaces but no common trait required creating duplicate code to use them even though the logic in both implementations is identical. A macro could be used to work around this but feels a bit heavy...usizeanyway before they can be used as indices.Cargo.*,crates/,etc/,integration-tests/,src/):actthrough theact-wrapper(as described inTESTING.md)?