DRI Forum "working" meeting minutes, 10/31/2001

Agenda

  1. Verify design goals and non-goals of DRI 1.0 before diving into specific issues.
  2. Address the design of the interfaces in the "second half" of the API. That is, those interfaces that are called after DRI_Distribution_create in a typical DRI use case.

Attendance

Agenda item 1  --- DRI scope confirmation

steve: concerned about the definition of DRI 1.0 "compliance"

rossen: regarding subsequent DRI versions after 1.0 spec:

group: decides that the focus of this meeting should be exclusively on buffer sharing issues, and not to dive deeply into dynamic application concerns (which is an issue to be tackled in its entirety after DRI 1.0 is finished)

Agenda item 2  --- Buffer sharing issues


steve: one approach is just to let the implementation make optimization choices based on user's DRI_Reorg objects -- relieves the user from having to specify memory management issues, and enables implementation optimizations

ken: problem is that customers inevitably will want to more explicitly specify their environment in some cases. We should enable that. The best approach would be 2 user-selectable modes -- one in which user specifies the sharing, another in which system detects and optimizes automatically.

steve: the reorg_create_user and reorg_create_system calls enable this right now. The create_user method is awkward, though, because it relies on a function pointer mechanism to attach a user memory allocation routine.

jamie: callback functions for memory allocation should be removed for ease of use reasons. The use of function pointer interfaces was decided by the group to be a problem in the first VSIPL interoperation proposal, for example.

steve: let's first determine what the goals are for providing buffer sharing in DRI.

  1. reduce memory utilization
  2. better cache-locality by doing in-place processing
 
 

group decides to first examine the DRI_Reorg_create_system use-case against the scenarios outlined in Ken's skeleton working briefing. (see http://www.data-re.org/DRI_BufferSharing_Issues.pdf)

Scenario 1 (pg. 8 of briefing): user would have to call DRI_Reorg_process_inplace to share buffers between R1-R and R2-S. There is no opportunity to share buffers for the data reorganizations themselves because this is a pipeline design. Things look good here with respect to the current specification.

Scenario 2 (pg. 9 of briefing): also looks good, by virtue of DRI_Reorg_process_inplace

Scenario 3 (pg. 10):

steve: suggests that the system would never construct this scenario when using DRI_Reorg_create_system(). The system would never want to do an out-of-place data reorg in the clique scenario. It would use an internal temporary buffer, and place the result of the data reorg in the source buffer -- therefore using only 1 user-exposed buffer.

group: there isn't really any value to specifying out of place data reorgs in a clique environment (from an overall memory utilization standpoint). If the user is handed a "destination" buffer that is really the source buffer, no big deal. When you call DRI_Reorg_put_datapart, you are relinquishing control of the source buffer anyway.

jamie: a benefit to specifying out-of-place data reorg is that both buffers affiliated with the reorganization are exposed to the user (a good thing in his opinion). Some users want to have access to all memory affiliated with the application.

group: this level of control could be delegated to the DRI_Reorg_create_user use-case

jon: 1) give the user some control to specify an out-of-place data reorg (oop-dr) with a clique configuration 2) have the system return an error in the event that it needs to create internal temp buffers that won't be exposed to the user. The user request to not create temp buffers could be included in the flag input parameter. The implementation would return an error if it _has_ to create them.

jon: an alternative to the above approach is for the user to specify a process-wide high water mark for temp buffer allocation, and DRI_Reorg_create would return an error in the event that the high water mark is exceeded.
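
A minimal sketch of how jon's two proposals could behave, assuming a hypothetical "no temp buffers" flag bit and a hypothetical process-wide budget structure. None of these names (sketch_status, SKETCH_FLAG_NO_TEMP, sketch_temp_budget, sketch_reserve_temp) come from the DRI specification; they only illustrate the error-return behavior under discussion.

```c
#include <stddef.h>

/* Hypothetical status codes; the DRI spec may define different ones. */
typedef enum {
    SKETCH_OK = 0,
    SKETCH_ERR_TEMP_FORBIDDEN,  /* user asked for no hidden temp buffers */
    SKETCH_ERR_HIGH_WATER       /* request would exceed the process-wide cap */
} sketch_status;

/* Hypothetical flag bit a user could pass in the create call's flag argument. */
#define SKETCH_FLAG_NO_TEMP 0x1u

typedef struct {
    size_t temp_in_use;  /* bytes of temp space already reserved */
    size_t high_water;   /* process-wide cap in bytes; 0 means "no cap" */
} sketch_temp_budget;

/* Called by the (imagined) implementation when a reorg needs nbytes of
 * internal temp space. Returns an error instead of allocating silently. */
static sketch_status sketch_reserve_temp(sketch_temp_budget *budget,
                                         size_t nbytes, unsigned flags)
{
    if (nbytes == 0)
        return SKETCH_OK;                      /* nothing hidden is needed */
    if (flags & SKETCH_FLAG_NO_TEMP)
        return SKETCH_ERR_TEMP_FORBIDDEN;      /* proposal 2: fail loudly */
    if (budget->high_water != 0 &&
        nbytes > budget->high_water - budget->temp_in_use)
        return SKETCH_ERR_HIGH_WATER;          /* alternative: cap exceeded */
    budget->temp_in_use += nbytes;
    return SKETCH_OK;
}
```

With a cap of 100 bytes, a 60-byte request succeeds, a second 60-byte request fails with the high-water error, and a 40-byte request still fits. The flag and the cap are independent, so an implementation could support either or both.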

steve: proposes that we change the DRI_Reorg_create_user semantics, so that when the user's alloc_handler is called, a buffer_index parameter of -1 indicates that this is a temporary buffer that must not be in use by the application when the associated DRI_Reorg is in use. During other times, the application can work with that temp space.
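
A sketch of what a user allocation handler might look like under steve's proposal, assuming a hypothetical handler signature. The typedef, the pool structure, and the limit of 4 buffers are illustrative assumptions and are not taken from the DRI specification; the only detail drawn from the proposal is that a buffer_index of -1 denotes temporary space.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names throughout: this is not the DRI 1.0 handler signature. */
#define SKETCH_TEMP_BUFFER (-1)   /* buffer_index value proposed for temp space */
#define SKETCH_MAX_BUFFERS 4      /* arbitrary limit for this sketch */

typedef void *(*sketch_alloc_handler)(size_t nbytes, int buffer_index,
                                      void *user_data);

typedef struct {
    void  *user_buffers[SKETCH_MAX_BUFFERS]; /* buffers the user sees */
    void  *temp_buffer;   /* scratch space; free for app use while reorg idle */
    size_t temp_bytes;
} sketch_pool;

static void *sketch_alloc(size_t nbytes, int buffer_index, void *user_data)
{
    sketch_pool *pool = (sketch_pool *)user_data;

    if (buffer_index == SKETCH_TEMP_BUFFER) {
        if (pool->temp_buffer == NULL) {
            pool->temp_buffer = malloc(nbytes);
            pool->temp_bytes  = pool->temp_buffer ? nbytes : 0;
        }
        /* Reuse the existing temp block only if it is large enough. */
        return nbytes <= pool->temp_bytes ? pool->temp_buffer : NULL;
    }

    if (buffer_index < 0 || buffer_index >= SKETCH_MAX_BUFFERS)
        return NULL;                          /* index outside this pool */
    if (pool->user_buffers[buffer_index] == NULL)
        pool->user_buffers[buffer_index] = malloc(nbytes);
    return pool->user_buffers[buffer_index];
}
```

Because the handler knows which block is temporary, the application can keep a pointer to it and use that space whenever the associated reorg is not in use, which is the behavior the proposal describes.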

group: the bottom line is that we need to document everywhere when temporary memory will be allocated by the implementation.

With the system-managed memory use case, there is no way to specify whether a data reorg is in place or out of place. We could just say that the Reorg_create_user mode is the way to express this.

informal resolution on this scenario: we leave discretion to the implementation here. There should be no way for the user to specify this scenario when calling DRI_Reorg_create_system. If a specific assignment of buffersets to reorgs is required, DRI_Reorg_create_user must be called instead. In the DRI_Reorg_create_system() use case, the library is free to create (or not create) temp buffers affiliated with the reorgs. No change to the spec is needed (except to document this design). Implementations should document the scenarios in which temp buffers are allocated. The specific size of temp buffers is not documented, and is expected to differ among different vendors/architectures.

group: any user-customizable restrictions on temp buffer space should be deferred until a later version of DRI spec.
 

Scenario 4 (pg. 11): no problems here, we have DRI_Reorg_process_inplace.
 

steve: R1-R and R2-S: what if create_system is used with one, and create_user with another?
cases in question:

another related issue: we resolve that calling DRI_Reorg_create_user with NULL alloc handler is illegal

steve: R2-S should get all of its buffer knowledge from R1-R (regardless of how R2-S was constructed)

Scenario 5 (pg. 12): recall in-place data reorganizations are not specifiable when calling create_system -- ok
This is just run of the mill out of place processing -- should just work
 

Scenario 6 (pg. 13):
similar to other scenarios -- DRI_Reorg_process_inplace will work
 
 

*** discussing user memory buffer sharing

steve: why do we want to allow user allocated memory to be shared among multiple Reorgs?

ken:

myra: allowing this degree of specification can enable other useful types of buffer sharing relationships beyond what we have currently considered. For example, a single process group pushing data to different process groups in a round robin fashion (over disjoint time periods) requires multiple reorgs. Associating the same user-allocated memory to each channel achieves memory efficiency that would be critical to this design.
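
The round-robin case myra describes can be sketched as follows. The sketch_channel type and the helper names are hypothetical stand-ins for a reorg and its associated user memory, not DRI types; the sketch only shows why binding one user buffer to every channel keeps the footprint constant as channels are added.

```c
#include <stddef.h>

/* Hypothetical stand-in for one reorg (channel); not a DRI type. */
typedef struct {
    int    dest_group;  /* which process group this channel feeds */
    void  *buffer;      /* user-allocated memory bound to the channel */
    size_t nbytes;
} sketch_channel;

/* Bind the same user buffer to every channel. */
static void sketch_bind_shared(sketch_channel *ch, size_t n,
                               void *buffer, size_t nbytes)
{
    for (size_t i = 0; i < n; ++i) {
        ch[i].dest_group = (int)i;
        ch[i].buffer     = buffer;
        ch[i].nbytes     = nbytes;
    }
}

/* Round robin: step k uses channel k mod n, so uses are disjoint in time. */
static sketch_channel *sketch_channel_for_step(sketch_channel *ch, size_t n,
                                               size_t step)
{
    return &ch[step % n];
}

/* Total bytes of distinct user memory referenced by the channels. */
static size_t sketch_footprint(const sketch_channel *ch, size_t n)
{
    size_t total = 0;
    for (size_t i = 0; i < n; ++i) {
        int seen = 0;
        for (size_t j = 0; j < i; ++j)
            if (ch[j].buffer == ch[i].buffer)
                seen = 1;               /* already counted this block */
        if (!seen)
            total += ch[i].nbytes;
    }
    return total;
}
```

With three channels sharing one 256-byte buffer the footprint is 256 bytes; giving each channel its own buffer would triple it.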
 

the problem of buffer sharing is related to I/O overlap and the num_buffers argument to Reorg_create

steve: consider R1-S and R1-R in a clique reorg, each created with num_buffers=1. Does this mean that both sides of the reorg can have a buffer checked out (via Reorg_get_datapart) at the same time, or that only 1 side can use the single buffer at a time?

myra: if you had a bufferset object, containing 1 buffer, and associate it with both reorgs, then the number of buffers is determined (it is exactly 1).
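
One possible reading of myra's point, sketched with hypothetical types: if a single-buffer bufferset is associated with both reorgs, then only one side can have the buffer checked out at any moment. The sketch_bufferset1 type and the get/put helpers below are assumptions for illustration and do not reflect DRI_Reorg_get_datapart or DRI_Reorg_put_datapart signatures. The minutes do not record a resolution of steve's question, so this is only one candidate semantics.

```c
#include <stddef.h>

/* Hypothetical bufferset holding exactly one buffer; not a DRI type. */
typedef struct {
    void *data;
    int   owner;   /* -1 when free, otherwise id of the reorg holding it */
} sketch_bufferset1;

/* Check the buffer out for reorg_id; NULL if the other side holds it. */
static void *sketch_get(sketch_bufferset1 *bs, int reorg_id)
{
    if (bs->owner != -1)
        return NULL;
    bs->owner = reorg_id;
    return bs->data;
}

/* Return the buffer; fails (-1) if reorg_id is not the current holder. */
static int sketch_put(sketch_bufferset1 *bs, int reorg_id)
{
    if (bs->owner != reorg_id)
        return -1;
    bs->owner = -1;
    return 0;
}
```

Under this reading the buffer count is fixed by the bufferset rather than by each reorg's num_buffers argument, which removes the ambiguity steve raises.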
 

**** discussing August 2001 minutes candidate buffer sharing scenarios (we didn't get too far on this topic before meeting concluded)

August buffer sharing scenario #1 / system memory

steve: this involves Reorg_connect, and not network wide commit. network wide commit prevented deadlock that could occur if user writes erroneous programs (by calling Reorg_connect in the wrong order)

steve: discusses his implementation of network wide commit that requires 3 passes in the global commit call over all reorg objects in the network

exploring if we can accomplish these phases without network wide commit:

jamie: DRI_Reorg_create cannot register buffer size information, so the connection cannot be finalized at DRI_Reorg_connect time.
ken: actually, we have previously discussed how this would work in clique case (1st Reorg_connect call does nothing, 2nd Reorg_connect call does collective communication).
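
A rough sketch of the deferred-connect behavior ken recalls for the clique case, using a hypothetical sketch_clique_link structure. The collective communication is simulated by setting a flag; nothing here reflects the real DRI_Reorg_connect signature or what the collective step exchanges.

```c
/* Hypothetical state shared by the two reorgs of a clique; not a DRI type. */
typedef struct {
    int connects_seen;  /* how many connect calls have arrived so far */
    int finalized;      /* set once the (simulated) collective step has run */
} sketch_clique_link;

/* Returns 1 if this call performed the collective step, 0 otherwise.
 * The first call only records itself; the second call finalizes. */
static int sketch_connect(sketch_clique_link *link)
{
    link->connects_seen++;
    if (link->connects_seen < 2)
        return 0;               /* 1st call: nothing to do yet */
    if (link->finalized)
        return 0;               /* already connected; extra calls are no-ops */
    link->finalized = 1;        /* 2nd call: stand-in for the collective */
    return 1;
}
```

By the second call both reorgs exist, so buffer size information from each side would be available, which is the concern jamie raises about finalizing too early.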
 

**** miscellaneous topics

rossen: we need error codes in the spec, focusing on a whole list of illegal conditions. Make very clear in documentation for each function what is legal, and what is illegal.
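
To make rossen's request concrete, here is a small sketch of an error-code enumeration and an argument check for the user-memory create call. All names are hypothetical. The NULL alloc handler case comes from the resolution recorded earlier in these minutes; treating num_buffers < 1 as illegal is an assumption added for illustration.

```c
#include <stddef.h>

/* Hypothetical error codes; the spec would define the real list. */
typedef enum {
    SKETCH_SUCCESS = 0,
    SKETCH_ERR_NULL_ALLOC_HANDLER,  /* resolved as illegal in this meeting */
    SKETCH_ERR_BAD_NUM_BUFFERS      /* assumption: num_buffers must be >= 1 */
} sketch_error;

/* Hypothetical handler type, simplified for this sketch. */
typedef void *(*sketch_alloc_fn)(size_t nbytes, int buffer_index);

/* Trivial handler so that callers have something non-NULL to pass. */
static void *sketch_dummy_alloc(size_t nbytes, int buffer_index)
{
    (void)nbytes;
    (void)buffer_index;
    return NULL;
}

/* Validate arguments before any work is done, one code per illegal case. */
static sketch_error sketch_check_create_user(sketch_alloc_fn handler,
                                             int num_buffers)
{
    if (handler == NULL)
        return SKETCH_ERR_NULL_ALLOC_HANDLER;
    if (num_buffers < 1)
        return SKETCH_ERR_BAD_NUM_BUFFERS;
    return SKETCH_SUCCESS;
}
```

Listing one distinct code per illegal condition lets the per-function documentation state exactly which calls are legal and what is returned otherwise.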