
[Ganymede Dev] Re: many G... questions

Date Thu, 4 Jul 2002 23:32:45 +0200 (MEST)
From Stephan Wiesand <wiesand@ifh.de>

On Wed, 3 Jul 2002, Jonathan Abbey wrote:

...

> | There are however some things we'd like to do that seem difficult even for
> | Ganymede to achieve, and for those I'd like to ask whether I've overlooked
> | a simple way provided by Ganymede and - if not - how you'd go about
> | enhancing Ganymede to do them, and whether you'd accept such a feature
> | as a contribution to Ganymede:
> |
> | 1) Kerberos(V) authentication
>
> I don't know enough to be able to comment very intelligently on this
> one, unfortunately.  Could you elaborate on how you'd like to see
> Ganymede interact with Kerberos?

Clients should be able to obtain a session with their Kerberos TGT if they
possess one, without having to provide a password. Now that JAAS is part
of the JDK, wouldn't that be the right way of doing authentication?
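
To make the JAAS idea a bit more concrete, here is a minimal sketch of a login configuration that reuses an existing Kerberos ticket cache and never prompts for a password. It relies on the Krb5LoginModule that ships with Sun/Oracle-derived JDKs. The class name TicketCacheLoginConfig and the entry name "GanymedeClient" are invented for illustration; none of this is existing Ganymede code.

```java
import java.util.HashMap;
import java.util.Map;
import javax.security.auth.login.AppConfigurationEntry;
import javax.security.auth.login.AppConfigurationEntry.LoginModuleControlFlag;
import javax.security.auth.login.Configuration;

/** Hypothetical JAAS configuration for a ticket-cache based client login. */
final class TicketCacheLoginConfig extends Configuration {
    // Assumed entry name; not something Ganymede defines.
    static final String ENTRY_NAME = "GanymedeClient";

    @Override
    public AppConfigurationEntry[] getAppConfigurationEntry(String name) {
        if (!ENTRY_NAME.equals(name)) {
            return null; // no configuration for unknown entries
        }
        Map<String, String> options = new HashMap<>();
        options.put("useTicketCache", "true"); // pick up the TGT the user already holds
        options.put("doNotPrompt", "true");    // fail instead of asking for a password
        return new AppConfigurationEntry[] {
            new AppConfigurationEntry(
                "com.sun.security.auth.module.Krb5LoginModule",
                LoginModuleControlFlag.REQUIRED,
                options)
        };
    }
}
```

A client would then create a javax.security.auth.login.LoginContext with this configuration and call login(). That step needs a reachable KDC and a valid TGT, so it is left out of the sketch, as is the question of how the server would verify the resulting credentials.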

...

> A student here has recently demonstrated a proof of RMI over SSL, and
> it appears that it would be on the order of a 10 line patch to the
> server, and a one or two line patch to the client.  Essentially the
> only change that would be required for the client would be the setting
> of the RMI security manager, to allow downloading of the appropriate
> socket factory classes from the server.
>
> We have not actually tried running Ganymede over SSL yet, though, so I
> can't comment on the CPU or memory requirements that running over SSL
> will take.  It's possible that it would be better for performance to
> only SSL-protect certain communications, mostly those having to do
> with passwords.  Whether such performance improvements would be worth
> the potential for intercepting and spoofing a command stream is an
> open question.

Not for us, anyway. Good news. Guess this (plus the previous item) would
make Ganymede much more appealing to many potential adopters.
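
For readers who want to see roughly what exporting an RMI object over SSL looks like, here is a small sketch. It is not the student's patch. It uses the ready-made socket factories in javax.rmi.ssl, which only arrived with J2SE 5.0, and the Ping interface and class names are made up for illustration.

```java
import java.rmi.Remote;
import java.rmi.RemoteException;
import java.rmi.server.RMIClientSocketFactory;
import java.rmi.server.RMIServerSocketFactory;
import java.rmi.server.UnicastRemoteObject;
import javax.rmi.ssl.SslRMIClientSocketFactory;
import javax.rmi.ssl.SslRMIServerSocketFactory;

/** Hypothetical remote interface standing in for a Ganymede server object. */
interface Ping extends Remote {
    String ping() throws RemoteException;
}

/** Trivial implementation used only to have something to export. */
final class PingImpl implements Ping {
    @Override
    public String ping() {
        return "pong";
    }
}

final class SslExport {
    /** Exports the object so that all RMI calls to it travel over SSL sockets. */
    static Remote exportOverSsl(Remote obj, int port) throws RemoteException {
        RMIClientSocketFactory csf = new SslRMIClientSocketFactory();
        RMIServerSocketFactory ssf = new SslRMIServerSocketFactory();
        // The client factory is serialized into the stub, so clients that
        // receive the stub connect over SSL without further code changes.
        return UnicastRemoteObject.exportObject(obj, port, csf, ssf);
    }
}
```

Because these factories are part of the JDK, no class downloading is involved in this variant. A real deployment still needs a server key store and a client trust store (for example via the javax.net.ssl.keyStore and javax.net.ssl.trustStore system properties), which the sketch does not set up.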

> | 3) Enhanced command line interface and/or external API:
> |
> |    The xmlclient seems to do a good job when writing to the directory,
> |    but queries aren't possible, just a complete dump of the DB. Would it
> |    be feasible to enhance xmlclient to allow querying for the value of
> |    particular attributes of particular objects? Could the very nice query
> |    mechanism in the GUI client be reused?
>
> This is an obvious enhancement, and one I'd love to see happen.  The
> GUI client submits queries to the server using a serialized boolean
> logic tree.  Probably the easiest path to what you're suggesting would
> be to define an XML mapping for the serialized data structures that
> are currently used for querying, and to add a new entry point
> in the GanymedeSession class (or one in the GanymedeXMLSession) that
> would take an XML query representation, internally regenerate the
> current logic tree structure, and then pass it to the existing query
> engine.  The results from the query would then be converted to XML for
> the return to the xmlclient.
>
> That's only if you needed to have an externally scriptable XML
> interface.  It would also be possible to write a custom client to
> perform whatever sort of querying would be appropriate.
>
> So, very doable, but a bit of design effort would need to be
> undertaken before starting to code anything, I think, especially if
> you wanted a convenient command-line shortcut for queries rather than
> having to generate XML for each query.

Again, good news. The XML approach, plus some perl for convenience, should
do. By the way, are the permission mechanisms in effect in an XMLSession?
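
The XML-to-logic-tree step discussed above could be sketched along the following lines. The element names (and, or, not, eq) and the attributes field and value are invented for this example and are not an existing Ganymede format. A plain Predicate over field/value maps stands in for the server's real query tree classes, which are not shown in this thread.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

/** Hypothetical mapping from an XML query to a boolean logic tree. */
final class XmlQuery {
    /** Parses the XML text and returns a predicate over field/value maps. */
    static Predicate<Map<String, String>> parse(String xml) {
        try {
            Element root = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)))
                .getDocumentElement();
            return build(root);
        } catch (ParserConfigurationException | SAXException | IOException ex) {
            throw new IllegalArgumentException("malformed query XML", ex);
        }
    }

    /** Recursively turns one element and its children into a predicate. */
    private static Predicate<Map<String, String>> build(Element e) {
        List<Predicate<Map<String, String>>> kids = new ArrayList<>();
        NodeList children = e.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            Node n = children.item(i);
            if (n instanceof Element) { // skip whitespace and other text nodes
                kids.add(build((Element) n));
            }
        }
        switch (e.getTagName()) {
            case "eq": {
                String field = e.getAttribute("field");
                String value = e.getAttribute("value");
                return obj -> value.equals(obj.get(field));
            }
            case "not":
                if (kids.size() != 1) {
                    throw new IllegalArgumentException("<not> takes exactly one child");
                }
                return kids.get(0).negate();
            case "and":
                return obj -> kids.stream().allMatch(p -> p.test(obj));
            case "or":
                return obj -> kids.stream().anyMatch(p -> p.test(obj));
            default:
                throw new IllegalArgumentException("unknown element: " + e.getTagName());
        }
    }
}
```

With such a mapping, a Perl wrapper would only have to turn a short command-line expression into the corresponding XML. Permission checks and the conversion of results back to XML are outside the scope of this sketch.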

> | 4) Incremental, asynchronous propagation of data to target platforms:
> |
> |    The example schema coming with Ganymede does full builds on every
> |    transaction commit, right? What we'd like to do instead is, on commit,
> |    create an "event" with the information about what changed and put it into
> |    a queue read by the mechanism actually doing the work on the platform
> |    (we call that a "platform adaptor"), which should then acknowledge
> |    reading the event, do the work - possibly after querying for additional
> |    information from Ganymede through an enhanced xmlclient - and finally
> |    confirm the event's execution.
>
> Right, I've been thinking about something along those lines for quite
> some time.  I've always come back to the problem of how to let the
> Ganymede server know which changes have been successfully propagated
> to the exterior environment.  Ganymede doesn't (currently) have the
> database scalability to remember older versions of objects, so it's
> hard to tell Ganymede to generate a delta for service X since time Y.

But at the time DBEditObject.finalize...() is called, we know exactly
what changes if the transaction is committed, don't we? And we are allowed
to create new objects and add them to the current DBEditSet.

So why not create an object of type "event: Change to User"? Its
existence after the commit means that something has to be done on a
platform. The builder task could then check for such objects, and act
accordingly for each with the value "new" for the attribute "status" by
sending the relevant information to the platform and then changing the
status to "sent". The platform would acknowledge the event by setting the
status to "read" via xmlclient, do the work, then update the status to
"done" (or delete the event object). In case of failure, status is set to
"failed". A scheduled task would check for "failed" event objects and
overdue ones - resend them and/or notify someone.

It's still ugly because of the additional builder tasks triggered, but
those should be fairly cheap. But communication with the platforms
would be fully decoupled from transaction commits, and full status
information would be available within Ganymede, changes pending execution
could be listed with the query facility, ...

Wouldn't this work?
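
The status life cycle proposed above can be written down as a small state machine. The status names come from the proposal. The transition table (in particular allowing "failed" from both "sent" and "read", and "sent" again after "failed") is only one possible reading of it, and the PlatformEvent class is a hypothetical stand-in for the event object, not a Ganymede class.

```java
import java.util.EnumSet;
import java.util.Set;

/** Status values from the proposal; the allowed transitions are an assumption. */
enum EventStatus {
    NEW, SENT, READ, DONE, FAILED;

    /** Statuses this one may move to. */
    Set<EventStatus> next() {
        switch (this) {
            case NEW:    return EnumSet.of(SENT);          // builder task sent it
            case SENT:   return EnumSet.of(READ, FAILED);  // platform acknowledged or failed
            case READ:   return EnumSet.of(DONE, FAILED);  // work finished or failed
            case FAILED: return EnumSet.of(SENT);          // scheduled task resends
            default:     return EnumSet.noneOf(EventStatus.class); // DONE is terminal
        }
    }
}

/** Hypothetical stand-in for an event object of type "event: Change to User". */
final class PlatformEvent {
    final String description;
    private EventStatus status = EventStatus.NEW;
    private long lastChangeMillis;

    PlatformEvent(String description, long nowMillis) {
        this.description = description;
        this.lastChangeMillis = nowMillis;
    }

    EventStatus status() {
        return status;
    }

    /** Moves to the target status, rejecting transitions the table does not allow. */
    void moveTo(EventStatus target, long nowMillis) {
        if (!status.next().contains(target)) {
            throw new IllegalStateException(status + " -> " + target);
        }
        status = target;
        lastChangeMillis = nowMillis;
    }

    /** True if the scheduled task should resend this event or notify someone. */
    boolean needsAttention(long nowMillis, long timeoutMillis) {
        if (status == EventStatus.FAILED) {
            return true;
        }
        boolean pending = status == EventStatus.SENT || status == EventStatus.READ;
        return pending && nowMillis - lastChangeMillis > timeoutMillis;
    }
}
```

The scheduled task would then loop over all event objects, call needsAttention() with the current time, and resend or escalate the ones that return true.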

> There already exists code to generate transaction-by-transaction
> differential change records.  The database journal file (handled by
> the DBJournal class) is based on that kind of differential logic.
>
> I can certainly see having the Ganymede server generate an XML delta
> description summarizing changes made during a transaction, and then
> having an external interface (your platform adaptor) parse the XML
> chunk and apply the delta.
>
> The challenge would be to make sure that those XML chunks are
> successfully processed, and to report success or failure back to
> Ganymede as you suggest.  If it took any significant time to process
> the delta chunks, then the rate of transaction commits in the Ganymede
> server would be drastically slowed down.  Right now, Ganymede uses its
> scheduler to asynchronously decouple transaction commits from external
> builds.  If the Ganymede server could not finish a transaction commit
> until an external process of unknown duration completed, it could
> severely reduce the server's concurrency.
>
> If you anticipate managing several unlinked types of data in Ganymede,
> running multiple Ganymede servers could reduce the impact on
> concurrency, of course.

I'm afraid it will be more like a fabric.
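
As a rough illustration of the per-transaction delta discussed above, the following sketch compares the field values of one object before and after a transaction and keeps only the fields that changed. It does not use the DBJournal code. Objects are modelled as plain maps, and the class name FieldDelta is invented for the example.

```java
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.TreeMap;
import java.util.TreeSet;

/** Hypothetical field-level delta between two versions of one object. */
final class FieldDelta {
    /**
     * Returns field -> {old value, new value} for every field that differs.
     * A null entry means the field was absent in that version.
     */
    static Map<String, String[]> diff(Map<String, String> before, Map<String, String> after) {
        Set<String> fields = new TreeSet<>(before.keySet());
        fields.addAll(after.keySet());
        Map<String, String[]> delta = new TreeMap<>();
        for (String field : fields) {
            String oldValue = before.get(field);
            String newValue = after.get(field);
            if (!Objects.equals(oldValue, newValue)) {
                delta.put(field, new String[] {oldValue, newValue});
            }
        }
        return delta;
    }
}
```

Serializing such a delta to XML and handing it to a queue rather than to the platform directly would keep the commit itself short, which is the concurrency concern raised in the quoted text.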

> |    I believe all of this would be possible without touching the Ganymede
> |    core: Either by making commitPhase2() talk to some external daemon
> |    actually doing the queuing and pushing the status information back
> |    into Ganymede by creating and manipulating objects of type "Platform
> |    Event" via xmlclient, or by directly creating the "Platform Event"
> |    objects from DBEditObject's set...() or finalize...() methods, and
> |    have scheduled tasks check for existing event objects and do the
> |    talking to the platforms.
>
> Hm.  I'd need to know more about your schema design to be able to
> comment on this.

If only it existed yet... Maybe I made myself clearer above.

> |    Am I right? Could we do that? Is it allowed/possible to create new
> |    objects from within DBEditObject methods and check them into the
> |    current DBEditSet? Can scheduled tasks manipulate/create/delete
> |    objects? And: could you imagine having such a mechanism tied more
> |    closely to the core?
>
> Unfortunately, the commitPhase2() method is not allowed to create new
> objects in the DBEditSet.  The various non-commit-time methods in
> DBEditObject are allowed to create new objects, and scheduled tasks
> most certainly could.

...

> | 7) Derived object types
> |
> |    Not each of our users has accounts on all platforms, access to all
> |    resources etc. To model this, it would be very useful to be able
> |    to define an "Account" object type, with a one-to-many relation
> |    from "User" to "Account", and a "Resource" object type with a many-to-many
> |    relation between "Account" and "Resource". Ganymede makes that
> |    amazingly simple.
> |
> |    But: Unix accounts will for example have a different set of attributes
> |    than W2K accounts or others. Is there a way to define "Unix Account"
> |    and "W2K Account" sharing some set of attributes (including the
> |    2-way object reference to "User") by being derived from "Account"
> |    while extending the base object with additional attributes (like a
> |    2-way object reference to "Unix Group" or "W2K Group", which may
> |    themselves be derived from "Group")?
> |
> |    I figure this one wouldn't be easy.
>
> No, it's not that easy, but not, perhaps, for the reason you'd expect.
> The big difficulty I've had in trying to figure out a way to support
> derived object types is in the user interface side of things.  The
> Ganymede GUI client's tree display currently shows all objects of a
> given type in a folder, and the query engine executes queries based on
> an object type.  It's an open design question to decide how the
> Ganymede client should handle the object tree when there is a type
> hierarchy for the objects.

It seems straightforward to me: Your fine tree widget handles subfolders
nicely, and as a user I'd like queries to consider the object type I
specify and all derived types. I'm probably missing something, what is it?
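
The query behaviour I have in mind, where a query on a type also covers all derived types, could be sketched like this. The TypeHierarchy class is hypothetical and assumes single inheritance. The type names are the ones from the example in item 7.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical registry of object types with single inheritance. */
final class TypeHierarchy {
    // Maps each known type to the types directly derived from it.
    private final Map<String, List<String>> children = new HashMap<>();

    /** Registers a type; parent is null for a root type. */
    void define(String type, String parent) {
        children.putIfAbsent(type, new ArrayList<>());
        if (parent != null) {
            children.computeIfAbsent(parent, k -> new ArrayList<>()).add(type);
        }
    }

    /** Returns the type itself plus every type derived from it, depth-first. */
    Set<String> withDerived(String type) {
        Set<String> result = new LinkedHashSet<>();
        collect(type, result);
        return result;
    }

    private void collect(String type, Set<String> out) {
        if (!out.add(type)) {
            return; // already visited
        }
        for (String child : children.getOrDefault(type, List.of())) {
            collect(child, out);
        }
    }
}
```

A query for "Account" would then be run against every type in withDerived("Account"), and the same traversal would give the tree widget its subfolders.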

> That said, having derivable types would be a wonderful enhancement,
> and one I've wished I had designed into Ganymede at the start for some
> time now.
>
> On the other hand, the DBEditObject class is powerful enough that you
> can define all the fields you might want to have in the user class,
> and then have a pull-down toggle or a series of checkboxes for the
> user type, and the DBEditObject plugin could hide the inappropriate
> fields based on the selected user type.  We do something very similar
> to this in a number of our classes; see the Task object class in the
> default Ganymede schema for an example of this.

Yes, that's an option. But every time a new subtype is introduced,
all object types designed to have references to the corresponding
supertype need to be changed as well, most likely even their plugins.

Would it be possible to allow a *list* of object types as reference
targets instead of just one? That would solve most cases I have in mind
nicely.

...

> I'm pleased and excited, and I would love to hear more about your
> goals.  How large a dataset do you imagine you would eventually want
> to manage with Ganymede?

We have to deal with about 3000 persons. This number will hopefully
double within a couple of years. DESY has two sites of quite different
scale. Most, but not all users have accounts for the larger site's
facilities, fewer for the small one. Not all platforms exist at both
sites. We'd like to avoid keeping unused accounts on any platform or site.

Fluctuation is high. Users from external collaborating institutes,
guests, and customers of certain facilities need access to computing
(and other) resources for a couple of days, months, or years. External
and internal users change their field of activity frequently. They
leave for a couple of months or years to work somewhere else, then come
back and want their old UID, email address, etc.

Resources like file space should be freed when they leave or change group,
and privileges shouldn't be hoarded.

Once we're done, there should be no need for anyone responsible for
granting access to some facility to run his own directory.

I figure we'll eventually manage some 20 distinct platforms and an
even larger number of more generic "access targets", an assortment of
automatically populated mailing lists plus free ones and some more.

A typical user may hold some 5 accounts, 5 resource objects, have access
to 5 additional targets, and be a member of 10 mailing lists. So that
makes some 150k objects, and a lot of links between them (preferably
expirable ones ;-)

Account creation, prolongation, resource granting, password resets etc. must
be delegatable. We want to be able to get new users going immediately,
and do the paperwork and signature stuff later (within a week or so, but
then accounts that haven't been finally authorized must be disabled) if the
person responsible is not present.

Quite a challenge. Do you think Ganymede's up to it?

Cheers,
	Stephan Wiesand

--

 ----------------------------------------------------
| Stephan Wiesand  |                                |
|                  |                                |
| DESY Zeuthen     | phone  +49 33762 7 7370        |
| Platanenallee 6  | fax    +49 33762 7 7216        |
| D-15738 Zeuthen  | mobile +49 171 317 6367        |
| Germany          | email  stephan.wiesand@desy.de |
 ----------------------------------------------------


