Hi all and thanks for the discussion about 2.4. I have cc'd the Talk list so that everyone can see this discussion as it is precisely the type of conversation that we want to have happen in our community space.
What I take away from the discussion thread is that there is a tension between taking on the merging work now versus shifting that work to a later point in our development schedule so that the immediate needs of deployments at UCB might instead be more efficiently addressed (based on current resource constraints). The merging work has been talked about and pushed out for over a year now so the landscape looks different from the deployer perspective than it once did, yet the change to the model that the merging work represents remains critical to the long-term viability of the software for future deployers.
A couple of questions that come to mind are:
What do we lose by not doing the merging work now?
I think that the answer is (based on Patrick's comments below) that we lose the ability to extend repeatable groups, and we lose support for localization (neither of which we have now). Is that correct? Chris, you pointed out that UCB has already addressed repeatable groups through a work-around. Chris Pott has confirmed that he has had to do the same thing at SMK, and I suspect that MMI and Walker have had to as well (but I don't know for certain; Jesse, Chris, Nate, please chime in if you are so moved). This means that while the merging work will make extending repeatable groups easier, more efficient and more effective in the future, current deployers won't necessarily benefit from the change to the model right now. The same goes for localization.
The other work defined for 2.4 ensures that we gain support for preferred and non-preferred terms and support for non-default vocabularies and is defined as a low-risk, achievable set of deliverables for a 2-week development sprint.
What is prevented from being developed in releases 2.6, 2.7, and 2.8 if we do not do the merging work now?
I'm not sure that I know what the answer is to this question. Would we, for example, still be able to work on relationships between catalog records (AKA multiple object handling) in 2.7 without the merging work? Is anything related to media handling (2.6) or improving the upgrade process (2.6) affected by not doing the merging work?
If what we gain from pushing the merging work out to a later time in the development schedule is more beneficial for all current deployers, then I support the shift. It sounds like that is the case, but I need your help, Patrick, with answers to the questions above in order to fully understand the ramifications.
Angela
On Apr 20, 2012, at 12:03 PM, Patrick Schmitz wrote:
> Yes, this is all correct so far.
>
> We *could* take a messier approach with the PT/NPT changes. It would work, and although it is messier in a number of places, it is well understood and low risk.
>
> We had been planning to take an approach that was cleaner, but required support for merging.
>
> Yesterday, discussions about how we would actually implement the merging led to some questions about how this would look in the database, and the implications for report authoring. There was some surprise from our deployers at how complex the so-called "merged" model would be in the DB.
>
> For SMK, it will mean that any report authoring will be more complex if they are injecting language all over.
>
> Richard and I are feeling ambivalent about this. As software engineers, we feel that the functionality should be there to support merging, and that it would make the code for services and app cleaner, and would present a nice model to the UI. However, being practical, and considering the additional pain that this will entail for reporting, I am having some doubts. Knowing how long it is taking me to get code working in the app layer, I am concerned about the risk.
>
> The details:
>
> Let's say we want to add language to some repeating group of fields (let's call it FooInfo). This will entail setting up specific fields for language in an extension schema (it actually has to be something like smk_FooInfoList/smk_FooInfo/language), some associated config in the app layer, and the obvious UI work to present it to the user.
>
> What happens on the back-end is that the new language fields end up in a separate table (smk_FooInfo) that the app layer and services have to keep aligned with the FooInfo part of the common schema. The UI and app layer should be able to ensure that if there are three rows in FooInfo, there will be three rows in smk_FooInfo. This is the code I would have to write, and it entails changes to the core of the app layer, so is very high risk.
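To make the alignment requirement concrete, here is a minimal sketch of the kind of check it implies. It assumes each row can be identified by a record id and a position within the repeating group; the names `record_id`, `position`, and `misaligned_records` are hypothetical and are not taken from the app layer or services code.

```python
from collections import Counter


def misaligned_records(common_rows, extension_rows):
    """Return ids of records whose row counts differ between the two groups.

    Each row is a (record_id, position) tuple. Both field names are
    hypothetical stand-ins for however the real tables identify rows.
    """
    common_counts = Counter(record_id for record_id, _ in common_rows)
    extension_counts = Counter(record_id for record_id, _ in extension_rows)
    all_ids = set(common_counts) | set(extension_counts)
    # A Counter returns 0 for a missing key, so a record that exists in
    # only one of the two groups is reported as misaligned as well.
    return sorted(
        record_id
        for record_id in all_ids
        if common_counts[record_id] != extension_counts[record_id]
    )


# Three FooInfo rows for obj1 but only two smk_FooInfo rows.
foo_info = [("obj1", 0), ("obj1", 1), ("obj1", 2), ("obj2", 0)]
smk_foo_info = [("obj1", 0), ("obj1", 1), ("obj2", 0)]
print(misaligned_records(foo_info, smk_foo_info))
```

A count check like this only catches missing or extra rows. It cannot tell whether rows that are present were written in the wrong order, so it covers only part of the problem.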
>
> Issues:
> If someone uses the REST APIs or import and does not get this right, everything will get hosed (e.g., SMK's language markers will end up on the wrong row for the notes they were supposed to be associated with). There is no reasonable way for services to even detect when an import or REST call has screwed this up.
> When someone wants to generate a report that includes the language (or selects for the language) for a note, the query will get a little funky, requiring more careful joins and constraints in what is already a complex data model.
> ChrisH was surprised that the fields would not be merged into the common table, until we explained that they cannot be without changing the common model in the code and for all the users. Even if we split up the repository to be one per tenant, our code (a lot of the REST stuff) would get confused by not having a consistent definition of the schema, and so we have to add a new table for every injected field.
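The reporting concern above can be sketched with a small in-memory example. This is an illustration only: it uses SQLite rather than Postgres, and the `record_id`, `pos`, `note`, and `language` columns are assumptions, not the actual table layout. It shows the extra join a report needs once the language lives in a separate smk_FooInfo table.

```python
import sqlite3

# Hypothetical layout: one common table and one extension table, tied
# together only by the owning record and the row position in the group.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE FooInfo (record_id TEXT, pos INTEGER, note TEXT);
    CREATE TABLE smk_FooInfo (record_id TEXT, pos INTEGER, language TEXT);
    INSERT INTO FooInfo VALUES ('obj1', 0, 'English note'), ('obj1', 1, 'Danish note');
    INSERT INTO smk_FooInfo VALUES ('obj1', 0, 'en'), ('obj1', 1, 'da');
""")

# The report has to join on both the record and the position; joining on
# the record alone would pair every note with every language marker.
REPORT_QUERY = """
    SELECT f.note, s.language
    FROM FooInfo AS f
    JOIN smk_FooInfo AS s
      ON s.record_id = f.record_id AND s.pos = f.pos
    WHERE s.language = ?
    ORDER BY f.record_id, f.pos
"""


def notes_in_language(connection, language):
    """Return (note, language) pairs for notes marked with the given language."""
    return connection.execute(REPORT_QUERY, (language,)).fetchall()


print(notes_in_language(conn, "da"))
```

If an import wrote the smk_FooInfo rows in a different order, the same query would still run and would quietly return the wrong note for each language.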
>
> One alternative is to defer or just not support merging. Instead, we would either:
> Tell folks to live with the model they have now, in which they have to redefine the whole repeating structure everywhere they want to add a field. Ugly, but well understood. The main objection to this, after the amount of typing, was that upgrades could be problematic. If we can stabilize the schemas and stop changing them so much (which we really need to do anyway), this should not be so much of an issue.
> OR
> Be more liberal about adding new fields where folks need them, even if we are not supporting them in the UI right away. E.g., adding Language to a bunch of groups would cause us little pain, and then SMK could contribute templates that use the language fields.
> Neither of these is great, and I do not mean to be glib in suggesting them. I am trying to be practical and realistic about our current resource situation.
>
> HTH - Patrick
>
>
> From: Chris Hoffman [mailto:chris.hoffman@berkeley.edu]
> Sent: Friday, April 20, 2012 9:01 AM
> To: Carly Bogen
> Cc: Angela Spinazze; Heather Hart; Carl Goodman; David A. GREENBAUM; Patrick Schmitz
> Subject: Re: Concerns about 2.4
>
> Thanks, Carly. Yes, you're right. Patrick has been calling this work "merging" because the app layer needs to merge the extension fields into the core group somehow. This would be needed for PT/NPT if the common set of term fields were all in one shared schema. However, Patrick has acknowledged that the "common" fields could be in their own authority-specific schema. We just have extra work to try to make sure those fields don't diverge in the future. I might not be translating this correctly :-)
>
> Chris
>
> On Apr 20, 2012, at 8:54 AM, Carly Bogen wrote:
>
>> Hi Chris,
>>
>> Based on the design and scope discussions we had this week, preferred and non-preferred terms and support for non-default vocabularies are the main drivers for 2.4. I'm not sure where the concern about extending core field groups came from, since it was not something we discussed. Is this related to the merging issue? If so, it was my understanding that fixing the merging issue was key to making PT and NPT and non-default vocabularies work.
>>
>> Patrick, maybe you can weigh in on this and help me understand.
>>
>> Thanks,
>> Carly
>>
>> On Fri, Apr 20, 2012 at 11:50 AM, Chris Hoffman wrote:
>> Hi all,
>>
>> I'm really, really happy to see how much work the team has done to make 2.4 more realistic. However, at UCB we still have some concerns that I want to bring up with you and other deployers.
>>
>> In talking with Patrick and Richard yesterday, I'm concerned that the work to enable deployers to extend core field groups is going to swamp other areas of work that for us are absolutely critical. PAHMA needs to launch with version 2.4, or they will have to extend TMS licensing (probably costing tens of thousands of dollars) and answer to campus auditors.
>>
>> We also want to add fields to existing groups, but in the meantime, we have created our own versions of those groups as needed. The work Patrick described to me is hard, risky, and has other implications for data loading and reporting (reporting will be very hard and the approach might have to change completely to point iReport at the App layer instead of at Postgres tables). Patrick is the only one who can do this work now, and he is concerned about being on the critical path for something like this. He is off for 3 weeks starting in early June.
>>
>> For PAHMA to launch with 2.4, they need the preferred/non-preferred term stuff, the support for multiple vocabularies within a single authority, and a fix to the dollar-sign import bug. There is also some significant customization work here that will need help from Patrick's team (e.g., supporting range search on object numbers, "show me objects with numbers 91-202 through 91-292").
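As a rough illustration of the range search mentioned above, the sketch below assumes object numbers are hyphen-separated integers such as 91-202. Real object numbers may include prefixes or suffixes that this does not handle, and the function names are hypothetical. The point is that the parts need to be compared as numbers rather than as text.

```python
def parse_object_number(number):
    """Split a number like '91-202' into a tuple of ints, e.g. (91, 202)."""
    return tuple(int(part) for part in number.split("-"))


def in_range(number, low, high):
    """True if number falls between low and high, inclusive, part by part."""
    return (
        parse_object_number(low)
        <= parse_object_number(number)
        <= parse_object_number(high)
    )


numbers = ["91-199", "91-202", "91-250", "91-292", "91-300", "92-1"]
matches = [n for n in numbers if in_range(n, "91-202", "91-292")]
print(matches)
```

A plain string comparison would get this wrong for numbers of different lengths: "91-30" sorts after "91-202" as text, even though 30 is below the range.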
>>
>> I suspect field group extension is critical for SMK in particular. However, we really are on the edge of a major situation here.
>>
>> Thanks,
>> Chris
>>
>