steve: one approach is to let the implementation make optimization choices based on the user's DRI_Reorg objects -- this relieves the user of having to specify memory management details, and enables implementation optimizations
ken: the problem is that customers will inevitably want to specify their environment more explicitly in some cases. We should enable that. The best approach would be 2 user-selectable modes -- one in which the user specifies the sharing, another in which the system detects and optimizes automatically.
steve: the DRI_Reorg_create_user and DRI_Reorg_create_system calls enable this right now. The create_user method is awkward because it relies on a function pointer mechanism to attach a user memory allocation routine.
jamie: callback functions for memory allocation should be removed for ease of use reasons. The use of function pointer interfaces was decided by the group to be a problem in the first VSIPL interoperation proposal, for example.
steve: let's first determine what the goals are for providing buffer sharing in DRI. 1) reduce memory utilization 2) better cache locality by doing in-place processing
group decides to first examine the DRI_Reorg_create_system use-case against the scenarios outlined in Ken's skeleton working briefing. (see http://www.data-re.org/DRI_BufferSharing_Issues.pdf)
Scenario 1 (pg. 8 of briefing): user would have to call DRI_Reorg_process_inplace to share buffers between R1-R and R2-S. There is no opportunity to share buffers for the data reorganizations themselves because this is a pipeline design. Things look good here with respect to the current specification.
Scenario 2 (pg. 9 of briefing): also looks good, by virtue of DRI_Reorg_process_inplace
Scenario 3 (pg. 10):
steve: suggests that the system would never construct this scenario when using DRI_Reorg_create_system(). The system would never want to do an out-of-place data reorg in the clique scenario. It would use an internal temporary buffer, and place the result of the data reorg in the source buffer -- therefore using only 1 user-exposed buffer.
group: there isn't really any value to specifying out of place data reorgs in a clique environment (from an overall memory utilization standpoint). If the user is handed a "destination" buffer that is really the source buffer, no big deal. When you call DRI_Reorg_put_datapart, you are relinquishing control of the source buffer anyway.
jamie: a benefit to specifying out-of-place data reorg is that both buffers affiliated with the reorganization are exposed to the user (a good thing in his opinion). Some users want to have access to all memory affiliated with the application.
group: this level of control could be delegated to the DRI_Reorg_create_user use-case
jon: 1) give the user some control to specify out-of-place data reorgs (oop-dr) with a clique configuration 2) have the system return an error in the event that it needs to create internal temp buffers that won't be exposed to the user. The user's request to not create temp buffers could be included in the flag input parameter. The implementation would return an error only if it _has_ to create such buffers.
jon: an alternative to the above approach is for the user to specify a process-wide high water mark for temp buffer allocation, and DRI_Reorg_create would return an error in the event that the high water mark is exceeded.
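A minimal sketch of how the high water mark idea could be tracked, assuming a process-wide byte budget for temp buffers. The names temp_budget, temp_reserve, temp_release and the status codes are hypothetical and do not appear in the DRI spec; an implementation would presumably consult something like this inside DRI_Reorg_create and map a failure to its own error code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes; the DRI spec does not define these names. */
typedef enum { TEMP_OK = 0, TEMP_ERR_HIGH_WATER = 1 } temp_status;

/* Process-wide accounting of implementation-internal temp buffer space. */
typedef struct {
    size_t high_water; /* user-specified ceiling, in bytes */
    size_t in_use;     /* bytes currently reserved for temp buffers */
} temp_budget;

/* Reserve `bytes` of temp space; on failure the budget is left unchanged. */
temp_status temp_reserve(temp_budget *b, size_t bytes)
{
    assert(b->in_use <= b->high_water);      /* invariant of this sketch */
    if (bytes > b->high_water - b->in_use)   /* cannot wrap, given the invariant */
        return TEMP_ERR_HIGH_WATER;
    b->in_use += bytes;
    return TEMP_OK;
}

/* Give back temp space, e.g. when the owning reorg is destroyed. */
void temp_release(temp_budget *b, size_t bytes)
{
    b->in_use -= (bytes <= b->in_use) ? bytes : b->in_use;
}
```

With a budget of 1024 bytes, a 512-byte reservation succeeds and a following 1024-byte reservation is refused until the first one is released.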
steve: proposes that we change the DRI_Reorg_create_user semantics, so that when the user's alloc_handler is called, a buffer_index parameter of -1 indicates that this is a temporary buffer that must not be in use by the application when the associated DRI_Reorg is in use. During other times, the application can work with that temp space.
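A rough sketch of what a user allocation routine might look like under this proposal. The handler signature, the app_pool type and the TEMP_BUFFER_INDEX macro are assumptions made for illustration; only the convention that a buffer_index of -1 marks a temporary buffer comes from the discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define TEMP_BUFFER_INDEX (-1) /* value proposed in the discussion */

/* Hypothetical application-side record of what has been handed out. */
typedef struct {
    void  *temp;       /* scratch buffer lent to the implementation */
    size_t temp_bytes; /* size of that scratch buffer */
    int    user_count; /* number of ordinary (indexed) buffers allocated */
} app_pool;

/* Hypothetical handler shape; the real alloc_handler signature may differ. */
void *app_alloc_handler(void *ctx, int buffer_index, size_t bytes)
{
    app_pool *pool = (app_pool *)ctx;
    void *p;

    assert(pool != NULL);
    p = malloc(bytes);
    if (p == NULL)
        return NULL;

    if (buffer_index == TEMP_BUFFER_INDEX) {
        /* Remember the scratch space so the application can reuse it,
           but only while the associated reorg is not in use. */
        pool->temp = p;
        pool->temp_bytes = bytes;
    } else {
        pool->user_count++;
    }
    return p;
}
```

Recording the temp buffer separately is what lets the application work with that space at other times, as the proposal suggests.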
group: the bottom line is that we need to document everywhere when temporary memory will be allocated by the implementation.
with the system-managed memory use case, there is no way to specify whether a data reorg (DR) is in-place or out-of-place (oop). We could just say that the Reorg_create_user mode is the way to express this.
informal resolution on this scenario: we leave discretion to the implementation here. There should be no way for the user to specify this scenario when calling DRI_Reorg_create_system. If a specific assignment of buffersets to reorgs is required, DRI_Reorg_create_user must be called instead. In the DRI_Reorg_create_system() use case, the library is free to create (or not create) temp buffers affiliated with the reorgs. No change to the spec is needed (except to document this design). Implementations should document the scenarios in which temp buffers are allocated. The specific size of temp buffers is not documented, and is expected to differ among vendors/architectures.
group: any user-customizable restrictions on temp buffer space should be deferred until a later version of the DRI spec.
Scenario 4 (pg. 11): no problems here, we have DRI_Reorg_process_inplace.
steve: R1-R and R2-S: what if create_system is used with one, and create_user with the other?
cases in question:
steve: R2-S should get all of its buffer knowledge from R1-R (regardless of how R2-S was constructed)
Scenario 5 (pg. 12): recall that in-place data reorganizations are not specifiable when calling create_system -- ok. This is just run-of-the-mill out-of-place processing -- should just work
Scenario 6 (pg. 13): similar to other scenarios -- DRI_Reorg_process_inplace will work
**** discussing user memory buffer sharing
steve: why do we want to allow user allocated memory to be shared among multiple Reorgs?
ken: the problem of buffer sharing is related to I/O overlap and the num_buffers argument to Reorg_create
steve: consider R1-S and R1-R in a clique reorg, each created with num_buffers=1. Does this mean that both sides of the reorg can have a buffer checked out (via Reorg_get_datapart) at the same time, or that only 1 side can use a single buffer at a time?
myra: if you had a bufferset object containing 1 buffer, and associated it with both reorgs, then the number of buffers is determined (it is exactly 1).
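One way to picture the shared-bufferset reading, as a sketch only: a single bufferset holding 1 buffer is attached to both sides of the reorg, so a second checkout fails until the first is returned. The bufferset type and the checkout/checkin helpers are hypothetical stand-ins, not DRI calls; in DRI terms the checkout would roughly correspond to Reorg_get_datapart.

```c
#include <assert.h>

/* Hypothetical bufferset shared by the send and receive sides of a reorg. */
typedef struct {
    int num_buffers; /* total buffers in the set */
    int checked_out; /* buffers currently held by either side */
} bufferset;

/* Returns 1 on success, 0 if every buffer is already checked out. */
int bufferset_checkout(bufferset *bs)
{
    assert(bs->checked_out >= 0 && bs->checked_out <= bs->num_buffers);
    if (bs->checked_out == bs->num_buffers)
        return 0;
    bs->checked_out++;
    return 1;
}

/* Returns 1 on success, 0 if nothing was checked out. */
int bufferset_checkin(bufferset *bs)
{
    if (bs->checked_out == 0)
        return 0;
    bs->checked_out--;
    return 1;
}
```

Under this reading the answer to the question is that only 1 side can hold the buffer at a time; two independent buffersets of 1 buffer each would instead allow both sides a checkout.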
**** discussing August 2001 minutes candidate buffer sharing scenarios (we didn't get too far on this topic before the meeting concluded)
August buffer sharing scenario #1 / system memory
steve: this involves Reorg_connect, and not network-wide commit. Network-wide commit prevented the deadlock that could occur if the user writes erroneous programs (by calling Reorg_connect in the wrong order)
steve: discusses his implementation of network-wide commit, which requires 3 passes in the global commit call over all reorg objects in the network
jamie: DRI_Reorg_create cannot register buffer size information, so
the connection cannot be finalized at DRI_Reorg_connect time.
ken: actually, we have previously discussed how this would work in the clique case (the 1st Reorg_connect call does nothing, the 2nd Reorg_connect call does the collective communication).
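The deferred behaviour described for the clique case can be sketched as a small state machine: the first connect only records that it happened, and the second performs the collective step. The clique_link type and clique_connect function are hypothetical, and the collective communication is reduced to setting a flag.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-clique connection state. */
typedef struct {
    int connects_seen; /* how many connect calls have been made */
    int finalized;     /* set once the collective step has run */
} clique_link;

/* Returns 1 if this call performed the (simulated) collective step. */
int clique_connect(clique_link *link)
{
    assert(link != NULL);
    link->connects_seen++;
    if (link->connects_seen < 2)
        return 0;            /* 1st call: nothing to do yet */
    if (link->finalized)
        return 0;            /* already connected */
    link->finalized = 1;     /* 2nd call: buffer info from both sides is known */
    return 1;
}
```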
**** miscellaneous topics
rossen: we need error codes in the spec, covering a whole list of illegal conditions. The documentation for each function should make very clear what is legal and what is illegal.