DryDock mechanics

Deepak Giridharagopal
Principal Developer

Updated: 2003/10/17 21:55:10
Version: 1.4

Abstract

This guide gives a more nuts-and-bolts look at DryDock's composition. It covers DryDock's database use, the permissions model, how DryDock interoperates with CVS, and the intricacies of synchronization.


1. Introduction

Note: This guide repeats information found in the DryDock LISA paper.

Note: You should have already read the DryDock Overview Guide; otherwise, you'll probably have no idea what I'm talking about here. :)

DryDock is written in Python and uses Webware for request handling and session management. The user interface is written in HTML using Cheetah, a Python templating library. DryDock's back-end is composed of four main components: a relational database that stores auditing information, a role-based permissions system, a revisioning system that tracks changes in approved documents, and a synchronization daemon that updates the production web server.

2. The database

Since DryDock needs to query its data on a per-file, per-directory, and per-user basis, storing the information in a relational database was a natural fit. DryDock uses MySQL to store authorization information for files, permission definitions for users, review information, and user activity logs.

Each time a file is signed, reviewed, or revoked, DryDock records the operation's details in its tables. DryDock remembers the user, the time, the file's current MD5 fingerprint, users' notes, and any additional information DryDock has been configured to accept. DryDock uses this data to display a file's transaction history and to determine if a file's contents have changed since it was last signed or reviewed.
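The change check described above can be sketched in a few lines. This is a minimal illustration, not DryDock's actual code: the function names are hypothetical, and the recorded fingerprint is assumed to be the hex MD5 digest stored when the file was last signed or reviewed.

```python
import hashlib
from pathlib import Path


def md5_fingerprint(path: Path, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def changed_since(path: Path, recorded_fingerprint: str) -> bool:
    """True if the file's contents no longer match the recorded fingerprint."""
    return md5_fingerprint(path) != recorded_fingerprint
```

Comparing fingerprints rather than timestamps means a file that is touched but not modified still counts as unchanged.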

3. Permissions

DryDock features a role-based permissions model. Users and groups from the underlying UNIX system can be assigned a role of admin, sign, review, view, or none per path on the staging web tree. A role circumscribes all of the actions a user can perform; any capabilities not specifically permitted are prohibited for that directory and all paths underneath. Figure 3.1 describes the different roles and shows how they are cumulative in design; for example, a user with sign authority for a path also has review authority.


Figure 3.1: Role-based permissions model



Permissions for a user resolve in a bottom-up manner. If no role is defined for the user on a path, DryDock searches for one defined for the user on the path's parent directory. The process continues until a role is found or the root directory is reached. If no role is found for the user, then DryDock performs the same recursive check for each group the user is in. If there is still no matching role, DryDock assumes the user has no privileges for the initial path.
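The lookup above can be sketched as follows. This is an illustration under stated assumptions rather than DryDock's implementation: roles are held in plain dictionaries instead of the database, the role ordering is inferred from the order in which the roles are listed, and the first group with a matching role wins. All names are hypothetical.

```python
import posixpath

# Assumed ordering, least to most capable; each role includes those before it.
ROLES = ["none", "view", "review", "sign", "admin"]


def lookup(roles_by_path, path):
    """Walk from path toward the root; return the first role found, or None."""
    current = posixpath.normpath(path)
    while True:
        if current in roles_by_path:
            return roles_by_path[current]
        parent = posixpath.dirname(current)
        if parent == current:  # reached the root without a match
            return None
        current = parent


def resolve_role(user, groups, user_roles, group_roles, path):
    """Check the user's own roles first, then each of the user's groups."""
    role = lookup(user_roles.get(user, {}), path)
    if role is not None:
        return role
    for group in groups:
        role = lookup(group_roles.get(group, {}), path)
        if role is not None:
            return role
    return "none"  # no matching role: no privileges


def allows(role, required):
    """Roles are cumulative, so a higher role satisfies any lower one."""
    return ROLES.index(role) >= ROLES.index(required)
```

For example, with `user_roles = {"alice": {"/docs": "sign"}}`, a request by alice for `/docs/a/b.html` resolves to `sign`, and `allows("sign", "review")` is true.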

4. The revisioning system

CVS 

To let administrators see a document's evolution, DryDock relies on the freely available Concurrent Versions System (CVS) to track changes in approved files. CVS provides stable, production quality, multi-file version control. DryDock interacts with CVS by coordinating files among three directories during versioning operations: the staging tree, a CVS repository, and a CVS working directory (a working copy of the repository used to commit changes to the repository).

When a file is signed, DryDock copies the file to the CVS working directory and then commits it to the repository. Similarly, when a file is revoked, DryDock deletes the file from the CVS working directory and then notifies the repository that the file has been removed. Through use of these mechanisms, the CVS working directory always contains the current version of each approved file.
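A rough sketch of the sign and revoke steps follows. It is not DryDock's actual code: the function names are hypothetical, parent directories are assumed to be under version control already, and a file's presence in the working directory is treated as a sign that CVS already knows about it. The `run` parameter exists only so the commands can be inspected without a real repository.

```python
import shutil
import subprocess
from pathlib import Path


def sign_file(staging_root, working_dir, relative_path, note, run=subprocess.run):
    """Copy an approved file into the CVS working directory and commit it."""
    source = Path(staging_root) / relative_path
    target = Path(working_dir) / relative_path
    # Assumption: a file already in the working directory is already in CVS.
    is_new = not target.exists()
    shutil.copy2(source, target)
    if is_new:
        run(["cvs", "add", relative_path], cwd=working_dir, check=True)
    run(["cvs", "commit", "-m", note, relative_path], cwd=working_dir, check=True)


def revoke_file(working_dir, relative_path, note, run=subprocess.run):
    """Delete a file from the working directory and record its removal."""
    (Path(working_dir) / relative_path).unlink()
    run(["cvs", "remove", relative_path], cwd=working_dir, check=True)
    run(["cvs", "commit", "-m", note, relative_path], cwd=working_dir, check=True)
```

The file is deleted before `cvs remove` runs because CVS expects the working copy to be gone when a removal is scheduled; the following commit makes the removal permanent in the repository.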

Workarounds 

While CVS adequately handled most of our revisioning needs, its file-based design couldn't handle changes in the repository's directory structure: CVS tracks only file contents, not directories, and it provides no way to remove a directory from the repository without losing all of its versioning information. So, if you delete all the files within a directory and still want to access those files' revision histories, the directory must remain in the repository. This lingering directory prevents adding a new file with the same name as the directory, because UNIX file systems won't allow two identically named items in one directory. If a file cannot be added to the repository, then DryDock users cannot approve that file.

To remedy this, we mangle directories' names when they are added to the working directory. Whenever DryDock creates a directory in the CVS working directory, it prepends a predetermined sequence of characters to the directory's name; this mangled name is used when the directory is added to the repository. At the same time, we prohibit users from creating files whose names begin with the same reserved characters. This allows us to have directories and files with the same name under version control simultaneously.
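To make the naming scheme concrete, here is a small sketch. The prefix `_dd_` and the function names are made up for illustration; the guide does not say which characters DryDock actually reserves.

```python
# Hypothetical reserved prefix; the real character sequence is not given here.
RESERVED_PREFIX = "_dd_"


def mangle_path(relative_path):
    """Prefix every directory component, leaving the final file name alone."""
    *directories, filename = relative_path.split("/")
    mangled = [RESERVED_PREFIX + name for name in directories]
    return "/".join(mangled + [filename])


def is_allowed_filename(name):
    """Reject file names that could collide with a mangled directory name."""
    return not name.startswith(RESERVED_PREFIX)
```

With this scheme, a file at `news/2003/index.html` is stored as `_dd_news/_dd_2003/index.html`, so a later file named `news` at the top level cannot clash with the old `news` directory, which lives on as `_dd_news`.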

5. Synchronization

Note: This section assumes that you're examining DryDock in the context of its use with a dual web server setup (a recommended practice).

With DryDock, users never make updates directly to the production web server. Synchronization, DryDock's process of pushing documents out to the production machine's web tree, is the sole way to update the external web site. Periodically, DryDock copies approved versions of all documents to the external machine, reconstructing the production web site.

Scheduling 

Since synchronizations are the only way changes propagate to the production machine, we needed to schedule them frequently. Instructing DryDock to immediately sync whenever a user signed or revoked a file would work well in periods of light use, but the staging server would be overwhelmed if users signed and revoked pages en masse.

Our solution was to employ delayed syncs. Instead of immediately syncing when a user signs or revokes a file, DryDock schedules a sync to occur five minutes later. If users sign or revoke additional files inside this five-minute window, DryDock reschedules the sync for five minutes from the time of the most recent approval operation. This process continues until there is no activity for the duration of the window, at which time the sync occurs. Since heavy usage would keep pushing the sync back by five minutes, perhaps indefinitely, we instituted a one-hour failsafe between syncs. If a sync hasn't occurred in the last 60 minutes, one is forced. Grouping updates in this way gave us a reasonable compromise between update frequency and server load. For situations requiring finer control, however, we allow DryDock administrators to force a sync on demand.
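The scheduling rules above fit in a small class. This is only a sketch with hypothetical names: times are plain seconds passed in by the caller, and the failsafe is read literally, so a sync is forced once an hour has passed since the last one whether or not anything is pending.

```python
class SyncScheduler:
    """Delayed syncs with a quiet window and a failsafe, in seconds."""

    def __init__(self, delay=300, failsafe=3600, now=0.0):
        self.delay = delay          # five-minute quiet window
        self.failsafe = failsafe    # one-hour maximum gap between syncs
        self.last_sync = now
        self.due_at = None          # None means no sync is pending

    def record_approval(self, now):
        """A sign or revoke pushes the pending sync to `delay` from now."""
        self.due_at = now + self.delay

    def should_sync(self, now):
        """True once the quiet window has passed or the failsafe has tripped."""
        if now - self.last_sync >= self.failsafe:
            return True
        return self.due_at is not None and now >= self.due_at

    def mark_synced(self, now):
        """Record a completed sync and clear the pending one."""
        self.last_sync = now
        self.due_at = None
```

A daemon loop would call `should_sync` periodically, run the sync when it returns true, and then call `mark_synced`. An administrator forcing a sync on demand would simply run the sync and call `mark_synced` directly.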

The sync process 

Figure 5.1 details the sync process. Sync is split into two parts: handling pre-approved files and exporting signed files to the production machine.


Figure 5.1: Synchronization flow



Pre-approved files require special handling during sync to ensure their changes are monitored by DryDock's revisioning system. Ordinarily, DryDock only adds files to its revisioning system when they are signed or revoked. Since users aren't required to sign pre-approved files each time their contents change, DryDock would normally be unable to track changes in pre-approved documents over time. To remedy this, DryDock adds the current state of every pre-approved file to its revisioning system at the start of each sync.

Before DryDock exports approved files to the external web server, it needs an image of the production web tree. Rather than building one at sync time, DryDock continually maintains this image in a directory alongside the CVS repository; this is the export directory. Whenever a user signs a file, DryDock copies it to the export directory, and whenever a file is revoked, DryDock deletes it from the export directory.

DryDock can use a variety of user-defined scripts to transmit the export directory to the external web server. For details on how to customize the synchronization process, consult the DryDock SyncKit Guide. In our organization, we use tar and SSH. Our production server is configured to automatically decompress the archive and replace its current web root with the new content. To make the process more secure, we authenticate the connection with a public/private key pair instead of a traditional user name and password, and the key is associated exclusively with the specific script that updates the machine's web root.
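As a rough idea of what a tar-and-SSH transfer script might look like, here is a sketch. It is not the script we use: the function names, host, and key path are placeholders, gzip compression is an assumption, and the receiving side is assumed to run its update script automatically when this key connects.

```python
import io
import subprocess
import tarfile


def build_archive(export_root):
    """Pack the export directory into an in-memory gzip-compressed tarball."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as archive:
        archive.add(export_root, arcname=".")
    return buffer.getvalue()


def ssh_command(host, identity_file):
    """Build an ssh invocation that uses key authentication only."""
    return ["ssh", "-i", identity_file, "-o", "BatchMode=yes", host]


def push(export_root, host, identity_file):
    """Stream the archive to the remote side over ssh's standard input."""
    archive = build_archive(export_root)
    subprocess.run(ssh_command(host, identity_file), input=archive, check=True)
```

No remote command is named on the ssh command line because the key is tied to a single script on the production machine; `BatchMode=yes` keeps ssh from falling back to a password prompt if the key is rejected.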

6. Wrap-up

And that's how all the pieces tie together. From here, you can see if DryDock is right for your organization. If you think it is, you can then check out the Installation Guide to see what you're in for if you want to give DryDock a whirl. Questions or comments? Throw them at the DryDock users mailing list. Have fun!



Outline

1. Introduction

2. The database

3. Permissions

4. The revisioning system
- CVS
- Workarounds

5. Synchronization
- Scheduling
- The sync process

6. Wrap-up

