Re: [Ganymede Dev] Re: many G... questions

Date Thu, 4 Jul 2002 19:11:54 -0500
From Jonathan Abbey <jonabbey@arlut.utexas.edu>

On Thu, Jul 04, 2002 at 11:32:45PM +0200, Stephan Wiesand wrote:
| On Wed, 3 Jul 2002, Jonathan Abbey wrote:
| 
| ...
| 
| Clients should be able to obtain a session with their Kerberos TGT if they
| possess one, without having to provide a password. Now that JAAS is part
| of the JDK, wouldn't that be the right way of doing authentication?

I'm not familiar with JAAS.  I imagine the issues would revolve
around how the client would get access to the TGT if it is run from a
browser.  By default, of course, a downloaded Java applet has almost
no privileges to access anything in the local environment.  The
Ganymede GUI client and xmlclients might benefit from Kerberos
authentication if run as an application from a command line, but the
Ganymede GUI client is dual-mode, and when run as an applet has very
restricted privileges.

An additional question is how to represent the privileges that the
Kerberos ticket is authenticating for.  Ganymede is designed to have
its own record of the user account, and to rely on it to decide login
and personae availability.  I imagine that changing Ganymede so that
it referenced an external authenticator could certainly be done if
need be, but I'm afraid I don't have enough experience with Kerberos
to understand the advantages over simply entering the
username/password pair.

| Again, good news. The XML approach, plus some perl for convenience, should
| do. By the way, are the permission mechanisms in effect in an XMLSession?

Of course.  XMLSessions are layered on top of the GanymedeSession, so
all permissions are maintained.  Here at ARL, users typically change
their passwords by using a web CGI, written in Perl, which turns
around and submits a generated XML request to the server through the
xmlclient.  Users can't generate arbitrary requests through this web
interface, but all of our Solaris users have direct access to the
xmlclient.  If the xmlclient didn't enforce permissions, that would be
a bad thing.

| But at the time DBEditObject.finalize...() is called, we know exactly
| what changes if the transaction is committed, don't we? And we are allowed
| to create new objects and add them to the current DBEditSet.

The DBEditObject.finalize...() methods are called to finalize changes
to individual fields, not to all changes to a given object.

It would certainly be possible to add another method to DBEditObject
that would be called before commitPhase1(), as long as some reasonable
semantics could be established for it.  The trick would be that if you
hit 'commit', and this method was called on all objects that have been
checked out for editing by the transaction, and then the commit fails
for some reason (such as a missing required field in some object in
the transaction), you'd want to be sure that a subsequent commit
attempt wouldn't generate redundant change event objects.
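
A minimal sketch of that requirement, in plain Java rather than the
actual Ganymede classes: SketchTransaction, checkOut() and preCommit()
are hypothetical names made up for illustration, not DBEditSet or
DBEditObject methods.  The transaction remembers which objects have
already produced a change event, so calling the hook again on a retried
commit adds nothing new.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for a transaction; this is NOT Ganymede's DBEditSet.
class SketchTransaction {
    // Objects checked out for editing by this transaction.
    private final Set<String> editedObjects = new LinkedHashSet<>();
    // Objects for which a change event has already been generated.
    private final Set<String> eventsEmittedFor = new LinkedHashSet<>();
    // Change events waiting to be committed along with the edits.
    private final List<String> pendingEvents = new ArrayList<>();

    void checkOut(String objectId) {
        editedObjects.add(objectId);
    }

    // Hypothetical pre-commit hook.  Safe to call once per commit attempt:
    // Set.add() returns false for objects that already have an event.
    void preCommit() {
        for (String id : editedObjects) {
            if (eventsEmittedFor.add(id)) {
                pendingEvents.add("event: Change to " + id);
            }
        }
    }

    List<String> pendingEvents() {
        return pendingEvents;
    }
}
```

With this shape, a commit that fails on a missing required field and is
then retried still ends up with one event per edited object.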

| So why not create an object of type "event: Change to User". Its
| existence after the commit means that something has to be done on a
| platform. The builder task could then check for such objects, and act
| accordingly for each with the value "new" for the attribute "status" by
| sending the relevant information to the platform and then changing the
| status to "sent". The platform would acknowledge the event by setting the
| status to "read" via xmlclient, do the work, then update the status to
| "done" (or delete the event object). In case of failure, status is set to
| "failed". A scheduled task would check for "failed" event objects and
| overdue ones - resend them and/or notify someone.

Sure, that sounds very feasible.
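
The status values described above could be sketched as a small state
machine.  This is only an illustration in plain Java: the EventStatus
enum and its methods are hypothetical, and the exact set of allowed
transitions (in particular, that a failure can be recorded from any
in-flight state and that a resend moves "failed" back to "sent") is an
assumption, since the proposal doesn't pin that down.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical status values for an "event: Change to User" object.
enum EventStatus {
    NEW, SENT, READ, DONE, FAILED;

    // Statuses this one may legally move to (assumed, not from Ganymede).
    Set<EventStatus> next() {
        switch (this) {
            case NEW:    return EnumSet.of(SENT, FAILED);  // builder task sends it
            case SENT:   return EnumSet.of(READ, FAILED);  // platform acknowledges
            case READ:   return EnumSet.of(DONE, FAILED);  // platform does the work
            case FAILED: return EnumSet.of(SENT);          // scheduled task resends
            default:     return EnumSet.noneOf(EventStatus.class); // DONE is final
        }
    }

    boolean canMoveTo(EventStatus target) {
        return next().contains(target);
    }
}
```

A scheduled task would then only need to query for event objects whose
status is FAILED, or whose status has been SENT or READ for too long.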

| It's still ugly because of the additional builder tasks triggered, but
| those should be fairly cheap. But communication with the platforms
| would be fully decoupled from transaction commits, and full status
| information would be available within Ganymede, changes pending execution
| could be listed with the query facility, ...
| 
| Wouldn't this work?

Yes, I think it would work, and work well.  There would obviously be
some design effort required to implement something clean, and if it
were done right, it could become a generic mechanism that others
might be able to make use of, but I see absolutely no difficulty in
doing such a thing.

We'd probably want to have some sort of appropriate object queueing
and ordering mechanism, for instance.  Right now, neither the GUI
client nor the server API provides a method for reordering items
within a Vector field, which would probably be important if you wanted
to maintain object queues, unless you simply used date fields for
everything.
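
As a rough illustration of the date-field alternative, here is a sketch
in plain Java that recovers queue order by sorting on a creation date
instead of relying on the position of items in a Vector field.
QueuedEvent and EventQueueSketch are hypothetical names, not Ganymede
classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.Date;
import java.util.List;

// Hypothetical event record carrying the date field used for ordering.
class QueuedEvent {
    final String label;
    final Date created;

    QueuedEvent(String label, Date created) {
        this.label = label;
        this.created = created;
    }
}

class EventQueueSketch {
    // Returns a copy of the events, oldest first.  List.sort is stable,
    // so events with identical dates keep their original relative order.
    static List<QueuedEvent> inQueueOrder(List<QueuedEvent> events) {
        List<QueuedEvent> sorted = new ArrayList<>(events);
        sorted.sort(Comparator.comparing((QueuedEvent e) -> e.created));
        return sorted;
    }
}
```

The trade-off is that nothing can be moved ahead in the queue without
changing its date, which is why an explicit reordering method would
still be nicer.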

| > If you anticipate managing several unlinked types of data in Ganymede,
| > running multiple Ganymede servers could reduce the impact on
| > concurrency, of course.
| 
| I'm afraid it will be more like a fabric.

That's okay, Ganymede is made to work with heavily interlinked
objects.  Just so long as you don't attempt to make external builds
synchronous with transaction commits, I don't believe it would be a
problem to have things as heavily linked as you like.

The one place I've seen problems with heavy linkages is when you have
certain objects that lots of objects get symmetrically linked to.  In
such cases, only one transaction at a time can make or break a link to
the common object, which may well limit concurrency.  If you make it
an anonymous link instead, though, the current Ganymede server core
can handle tons of concurrent link and unlink operations.

| > No, it's not that easy, but not, perhaps, for the reason you'd expect.
| > The big difficulty I've had in trying to figure out a way to support
| > derived object types is in the user interface side of things.  The
| > Ganymede GUI client's tree display currently shows all objects of a
| > given type in a folder, and the query engine executes queries based on
| > an object type.  It's an open design question to decide how the
| > Ganymede client should handle the object tree when there is a type
| > hierarchy for the objects.
| 
| It seems straightforward to me: Your fine tree widget handles subfolders
| nicely, and as a user I'd like queries to consider the object type I
| specify and all derived types. I'm probably missing something, what is it?

Mm, no, perhaps not.  You're thinking that there would be a User
folder, which you would open and see 'UNIX Users', 'NT Users',
'UNIX&NT Users' each as separate subfolders, along with all plain
Users listed thereunder?
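
To make the "type I specify and all derived types" idea concrete, here
is a sketch in plain Java of expanding a query's target type into
itself plus everything derived from it.  TypeTreeSketch and its methods
are hypothetical; Ganymede's schema has no such hierarchy today, which
is the whole point of the question.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical type hierarchy; not part of the Ganymede schema classes.
class TypeTreeSketch {
    // Maps each type name to the types directly derived from it.
    private final Map<String, List<String>> children = new HashMap<>();

    void derive(String parent, String child) {
        children.computeIfAbsent(parent, k -> new ArrayList<>()).add(child);
    }

    // The given type followed by all of its descendants, depth first.
    List<String> withDerived(String type) {
        List<String> result = new ArrayList<>();
        collect(type, result);
        return result;
    }

    private void collect(String type, List<String> out) {
        out.add(type);
        List<String> derived = children.getOrDefault(type, Collections.<String>emptyList());
        for (String child : derived) {
            collect(child, out);
        }
    }
}
```

A query on "User" would then run against every type in the returned
list, and the same tree would drive the subfolders in the client.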

| Would it be possible to allow a *list* of object types as reference
| targets instead of just one? That would solve most cases I have in mind
| nicely.

Sure, the DBEditObject class already has methods that you can override
to generate custom choice lists for InvidDBFields.  Right now, though,
the GUI client doesn't show any kind of type identifier for listed
objects, and if you had two objects of differing kinds with the same
name, it would probably get confused.

I've thought about making the Invid selector widget in the GUI client
be tree-based in some fashion, to reflect the types of mixed choice
lists.  Or, alternatively, at least show an icon of some sort and
track and display the type identity to distinguish things that might
not have a shared namespace.

Derivable types aside, the major extension I've tried to consider to
Ganymede's type system is to allow for hierarchical object containment.

That would also complicate the GUI, especially considering that at
present an object may have more than one owner.  If ownership were made
congruent with containment, the client's object tree could not
literally be a tree, as individual objects could appear under multiple
container-owners.  The XML file format that Ganymede uses currently
would need to be changed as well, along with the query engine, and so
on.

| ...
| 
| > I'm pleased and excited, and I would love to hear more about your
| > goals.  How large a dataset do you imagine you would eventually want
| > to manage with Ganymede?
| 
| We have to deal with about 3000 persons. This number will hopefully
| double within a couple of years. DESY has two sites of quite different
| scale. Most, but not all users have accounts for the larger site's
| facilities, fewer for the small one. Not all platforms exist at both
| sites. We'd like to avoid keeping unused accounts on any platform or site.

Right.

| Fluctuation is high. Users from external collaborating institutes,
| guests, and customers of certain facilities need access to computing
| (and other) resources for a couple of days, months, or years. External
| and internal users change their field of activity frequently. They
| leave for a couple of months or years to work somewhere else, then come
| back and want their old UID, email address, etc.
| 
| Resources like file space should be freed when they leave or change group,
| and privileges shouldn't be hoarded.
| 
| Once we're done, there should be no need for anyone responsible for
| granting access to some facility to run his own directory.
| 
| I figure we'll eventually manage some 20 distinguished platforms and an
| even larger number of more generic "access targets", an assortment of
| automatically populated mailing lists plus free ones and some more.
| 
| A typical user may hold some 5 accounts, 5 resource objects, have access
| to 5 additional targets, and be a member of 10 mailing lists. So that
| makes some 150k objects, and a lot of links between them (preferably
| expirable ones ;-)

We have fewer than 12k objects in our Ganymede server, but it works
fine on a single-processor Ultra 5 workstation with only 256 megabytes
of memory.  Our data load takes up only about 10 megabytes of heap
space, so I imagine scaling that up to one or two hundred megabytes or
so shouldn't be too bad, particularly if you use a modern
multiprocessor UltraSparc-III system with Sun's latest HotSpot server
VM technology.
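
The back-of-the-envelope arithmetic behind that guess can be written
out as below.  This is a sketch only: it assumes heap use grows
linearly with object count, which is a simplification, and it reads the
150k figure as 6000 users (3000 doubled) times 25 objects each
(5 accounts, 5 resource objects, 5 targets, 10 mailing lists).
HeapEstimateSketch is a hypothetical helper, not part of Ganymede.

```java
// Hypothetical helper for rough sizing; all inputs come from this thread.
class HeapEstimateSketch {
    // Total objects if every user holds the same number of objects.
    static long objectCount(long users, int objectsPerUser) {
        return users * objectsPerUser;
    }

    // Scales a known heap footprint linearly to a target object count.
    // Linear growth is an assumption, not a measured property.
    static double estimateMegabytes(long targetObjects,
                                    long knownObjects,
                                    double knownMegabytes) {
        return knownMegabytes * targetObjects / knownObjects;
    }
}
```

Plugging in 6000 users at 25 objects each gives 150,000 objects, and
scaling 10 megabytes for 12,000 objects up to 150,000 gives about 125
megabytes, which is where "one or two hundred megabytes" comes from.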

| Account creation, prolongation, resource granting, password resets etc. must
| be delegatable. We want to be able to get new users going immediately,
| and do the paperwork and signature stuff later (within a week or so, but
| then accounts that haven't been finally authorized must be disabled) if the
| responsible person is not present.
| 
| Quite a challenge. Do you think Ganymede's up to it?

As long as Sun's JVM technology is up to that large a heap space, I'd
say that Ganymede should be up to it.

I can't think of anything in the server that should really have a
scaling problem for a scale-up of 10 or 20 times, other than perhaps
garbage collection overhead, and that could be significantly reduced
by using Sun's latest continual-collection JVM on a multiprocessor
system, and/or by simply giving it a really really big heap.  A lot of
the data structures in the server are really over-engineered for our
level of scale and concurrency.  The client may need some work to be
able to effectively handle such large datasets, but even that mostly
from a usability perspective.

Especially if you're thinking of going away from doing complete builds
at each build time, I would think the scalability should be all right.

| Cheers,
| 	Stephan Wiesand
| 
| --
| 
|  ----------------------------------------------------
| | Stephan Wiesand  |                                |
| |                  |                                |
| | DESY Zeuthen     | phone  +49 33762 7 7370        |
| | Platanenallee 6  | fax    +49 33762 7 7216        |
| | D-15738 Zeuthen  | mobile +49 171 317 6367        |
| | Germany          | email  stephan.wiesand@desy.de |
|  ----------------------------------------------------

-- 
-------------------------------------------------------------------------------
Jonathan Abbey 				              jonabbey@arlut.utexas.edu
Applied Research Laboratories                 The University of Texas at Austin
Ganymede, a GPL'ed metadirectory for UNIX     http://www.arlut.utexas.edu/gash2

----------------------------------------------------------------------------
To make changes to your subscription to the Ganymede Dev mailing list, send
mail to majordomo@arlut.utexas.edu.

To unsubscribe, include the line

unsubscribe ganymede-dev

in the body of your mail message.

Visit the Ganymede web page at http://www.arlut.utexas.edu/gash2

----------------------------------------------------------------------------