talk@lists.collectionspace.org

WE HAVE SUNSET THIS LISTSERV - Join us at collectionspace@lyrasislists.org

what's the best way to represent structured dates?

Christopher Pott
Tue, Aug 23, 2011 3:48 PM

Hi,

I'm looking at the structured date schema and the best way to represent
dates migrated from our old system. Any information about the following
would be much appreciated....

  1. I see some different/overlapping possibilities for recording an
    object creation date. For example, to represent "1718" (display date)
    one could either store:

a) a single date, providing year only (1718)

b) an earliest date (01.01.1718) with a latest date (31.12.1718)

c) a single date (01.01.1718) with a qualifier of + 364 days

Is there any difference in the meaning of these three (and would they
all be treated equally by services such as advanced search)? Our
migration data is currently in format 'b' but the other formats are
sometimes 'friendlier' for new input.

  2. How are date "certainties" and "qualifiers" used by advanced search?

I've specified "1718", as a creation date with a certainty of "circa".
Will a search for objects created during 1717 result in a hit?

Or is it necessary to use qualifiers (eg. 1718 +/- 2 years) to define
exact limits for this type of searching?

  3. Is the Date schema locally extendable? (We need a field to hold an
    English translation of the display text)

  4. Will Advanced search by creation date also search the display date or
    will it just use the single/earliest and latest date groups?
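
To make question 1 concrete, here is a minimal Python sketch of how the three options could be compared. It assumes, purely for illustration, that a service reduces every structured date to an (earliest, latest) pair before searching; nothing in this thread confirms that CollectionSpace does so, and the function names are hypothetical.

```python
from datetime import date, timedelta


def from_year(year):
    """Option (a): a single date with only the year filled in."""
    return date(year, 1, 1), date(year, 12, 31)


def from_range(earliest, latest):
    """Option (b): explicit earliest and latest dates."""
    return earliest, latest


def from_single_with_qualifier(single, plus_days=0, minus_days=0):
    """Option (c): a single date widened by a +/- qualifier given in days."""
    return single - timedelta(days=minus_days), single + timedelta(days=plus_days)


a = from_year(1718)
b = from_range(date(1718, 1, 1), date(1718, 12, 31))
c = from_single_with_qualifier(date(1718, 1, 1), plus_days=364)

# Under the stated assumption, all three describe the same interval.
print(a == b == c)  # True
```

Under that assumption the three forms are equivalent for searching, and they differ only in what the person entering the data has to type.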

Thanks,

Chris

Chris Hoffman
Tue, Aug 23, 2011 6:04 PM

Hi Chris,

Thanks for asking these questions.  We've been wondering the same things.  Have you changed any existing calendar date fields to structured dates or added new structured date fields?  I think this is all quite new, and together we might be able to come up with some best practices (related to data migration and data entry).

I suspect that search and advanced search of structured date fields is going to be an issue for those of us deploying CollectionSpace.  Our users will want CSpace to be smart about structured dates, interpreting qualifiers, and so on, but I bet that we won't see as much as we would like in the first version of advanced search.  I'm always happy to be proved wrong!  :-)

Chris

Talk mailing list
Talk@lists.collectionspace.org
http://lists.collectionspace.org/mailman/listinfo/talk_lists.collectionspace.org

Patrick Schmitz
Wed, Aug 24, 2011 3:55 PM

Aron was out yesterday, and Rick is out for a few days. They have done a lot
of the structured dates work to date.

The last I heard, Aron was hoping to get consensus from the users about how
we map all the text fields to actual date fields that we can reason on (for
search, ordering, etc). To date, there seems to be no such consensus, which
will preclude search, ordering, etc. on structured dates. I would think that
this would make structured dates quite a bit less useful...

Perhaps (I hope) my understanding is out of date, and we just need to
implement the logic. Perhaps Aron can jump in this AM, when he gets in.

Patrick


Chris Hoffman
Wed, Aug 24, 2011 4:01 PM

Hi Patrick, I think it's fair to say there's no consensus because there's been no discussion :-) It's possible that each customer will want to use structured dates differently, which will make it hard to add any intelligence or even predictability to things like search.  However, I also think that if a significant number of deploying institutions can come to some consensus on best ways to use and search structured dates, then many institutions will choose to adopt the same best practices.

Thanks,
Chris

Patrick Schmitz
Wed, Aug 24, 2011 4:16 PM

Really? I am a little surprised that in all the design discussions for
structured dates this did not come up.

Anyway, I will say that we do not have the time to build some system that
allows for a lot of custom behavior, or elaborate heuristics here (i.e.,
trying to divine what is intended from a set of text fields). I would
strongly suggest that we either

  1. require entry of some actual date fields upon which the application
    can reason,
    or
  2. we define a very simple set of rules for building such date values
    from some of the existing text fields. By very simple, I mean that we
    only consider fields like year, month, day, and that the rules produce
    two scalar dates (early and late), which either define a range, or
    define a point (if they are equal). AIUI, a structured date cannot
    model the idea of "some time/any time after (or before) a date".
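
A minimal Python sketch of what such a "very simple" rule set might look like. It assumes only year, month and day are considered, and the function name `scalar_range` is hypothetical rather than anything in CollectionSpace:

```python
import calendar
from datetime import date


def scalar_range(year, month=None, day=None):
    """Return an (early, late) pair of scalar dates for a partial date.

    Year only      -> the whole year
    Year and month -> the whole month
    Full date      -> a point (early == late)
    """
    if month is None:
        return date(year, 1, 1), date(year, 12, 31)
    if day is None:
        # monthrange returns (weekday of first day, number of days in month)
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day)
    point = date(year, month, day)
    return point, point


print(scalar_range(1718))
print(scalar_range(1718, 2))
print(scalar_range(1718, 2, 3))
```

The rules always yield two scalar dates, so a range and a point are handled the same way downstream.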

One further question is the semantics of which date (if any) will be used for
ordering.

If and when advanced search supports date ranges (which also implies support
for before and after logic in search), we can handle the interval logic
associated with the structured dates, but only if we have a pair of scalar
date values.
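
As a rough illustration of that interval logic, here is a Python sketch of an overlap test between a record's (early, late) pair and a query range. Treating "circa 1718" as 1718 +/- 2 years is only the example qualifier from the original question, not an agreed rule, and `overlaps` is a hypothetical helper:

```python
from datetime import date


def overlaps(record_early, record_late, query_early, query_late):
    """True if the closed interval of the record intersects the query's."""
    return record_early <= query_late and query_early <= record_late


query_1717 = (date(1717, 1, 1), date(1717, 12, 31))

# "1718" stored as a plain year range.
exact_1718 = (date(1718, 1, 1), date(1718, 12, 31))

# "circa 1718" with an explicit qualifier of +/- 2 years (assumed mapping).
circa_1718 = (date(1716, 1, 1), date(1720, 12, 31))

print(overlaps(*exact_1718, *query_1717))  # False
print(overlaps(*circa_1718, *query_1717))  # True
```

With only a "circa" label and no numeric qualifier, this test has nothing to widen the range with, so a search for 1717 would miss the record.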

Down the road, I do think it would be interesting to implement some fuzzier
logic on date search, to better support "circa" logic and the uncertainty
values. However, I suspect that it will take longer to arrive at consensus
on these semantics, and would defer that discussion for now.

Thanks - Patrick


Susan Stone
Wed, Aug 24, 2011 4:33 PM

Patrick,

There was quite a bit of discussion of this in our Wednesday afternoon
meetings at Berkeley that I'm sure Aron will recall. I don't think date
fields go back far enough in time, so I'd recommend using integers or
floating point numbers and providing the math in search. It is also
important to make sure that false precision isn't introduced. If
something is dated 1900, the system shouldn't enter that as January 1,
1900 unless the uncertainty values can be used to differentiate and the
display date can be simply 1900.

Susan
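
A minimal sketch of the integer-based approach Susan describes, assuming a hypothetical `PartialDate` type (not part of CollectionSpace): the year is a plain integer, so it is not limited by the range of a native date type, and month and day stay empty unless they were actually recorded. The display form then never shows more precision than was entered.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PartialDate:
    """Hypothetical date record stored as plain integers.

    month and day remain None when they were not recorded, so "1900"
    is never silently turned into "January 1, 1900".
    """
    year: int                     # plain int: may be negative or very large
    month: Optional[int] = None
    day: Optional[int] = None

    def __post_init__(self):
        # A day without a month would be ambiguous, so reject it.
        if self.day is not None and self.month is None:
            raise ValueError("a day requires a month")

    def display(self) -> str:
        """Render only the parts that were actually recorded."""
        parts = [str(self.year)]
        if self.month is not None:
            parts.append(f"{self.month:02d}")
        if self.day is not None:
            parts.append(f"{self.day:02d}")
        return "-".join(parts)

    def sort_key(self) -> tuple:
        """Integer key for ordering; unrecorded parts sort first (as 0)."""
        return (self.year, self.month or 0, self.day or 0)
```

With this shape, `PartialDate(1900)` and `PartialDate(1900, 1, 1)` remain distinct records, which is the distinction the message asks the system to preserve.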

On 08/24/2011 09:16 AM, Patrick Schmitz wrote:

Really? I am a little surprised that in all the design discussions for
structured dates this did not come up.
Anyway, I will say that we do not have the time to build some system
that allows for a lot of custom behavior, or elaborate heuristics here
(i.e., trying to divine what is intended from a set of text fields). I
would strongly suggest that we either

  1. require entry of some actual date fields upon which the application
    can reason,
    or
  2. we define a very simple set of rules for building such date values
    from some of the existing text fields. By very simple, I mean that
    we only consider fields like year, month, day, and that the rules
    produce two scalar dates (early and late), which either define a
    range, or define a point (if they are equal). AIUI, a structured
    date /cannot/ model the idea of "some time/any time after (or
    before) a date".

One further question is the semantic of which date (if any) will be used
for ordering.
If and when advanced search supports date ranges (which also implies
support for /before/ and /after/ logic in search), we can handle the
interval logic associated with the structured dates, but only if we have
a pair of scalar date values.
Down the road, I do think it would be interesting to implement some
fuzzier logic on date search, to better support "circa" logic and the
uncertainty values. However, I suspect that it will take longer to
arrive at consensus on these semantics, and would defer that discussion
for now.
Thanks - Patrick
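
The interval logic Patrick mentions can be sketched as follows, assuming each structured date has already been reduced to a pair of scalar dates. The `DateSpan` type and the function names are illustrative only, and sorting by earliest date and then latest date is just one possible answer to the open ordering question.

```python
from datetime import date
from typing import NamedTuple


class DateSpan(NamedTuple):
    """A pair of scalar dates; earliest == latest models a single point."""
    earliest: date
    latest: date


def is_point(span: DateSpan) -> bool:
    return span.earliest == span.latest


def overlaps(a: DateSpan, b: DateSpan) -> bool:
    """True when the two closed intervals share at least one day."""
    return a.earliest <= b.latest and b.earliest <= a.latest


def entirely_before(span: DateSpan, cutoff: date) -> bool:
    """'Before' search: the whole span ends before the cutoff."""
    return span.latest < cutoff


def entirely_after(span: DateSpan, cutoff: date) -> bool:
    """'After' search: the whole span starts after the cutoff."""
    return span.earliest > cutoff


if __name__ == "__main__":
    created_1718 = DateSpan(date(1718, 1, 1), date(1718, 12, 31))
    query_1717 = DateSpan(date(1717, 1, 1), date(1717, 12, 31))
    print(overlaps(created_1718, query_1717))
    # Tuples compare field by field, so sorted() orders by earliest, then latest.
    print(sorted([created_1718, query_1717]))
```

Under these assumptions a record stored as the year 1718 does not overlap a query for 1717; a "circa" match would need either a wider stored range or the fuzzier logic deferred above.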

AR
Aron Roberts
Wed, Aug 24, 2011 4:53 PM

On Wed, Aug 24, 2011 at 9:16 AM, Patrick Schmitz <pschmitz@berkeley.edu> wrote:

> I would strongly suggest that we either require entry of some actual date
> fields upon which the application can reason,
> or
> we define a very simple set of rules for building such date values from some
> of the existing text fields. By very simple, I mean that we only consider
> fields like year, month, day, and that the rules produce two scalar dates
> (early and late), which either define a range, or define a point (if they
> are equal).

This is how I've understood we'd be modeling this as well, based on
our discussions to date.

To restate what I understand Patrick to have said today, and what he
and others have said in various discussions, so that we can identify
any remaining points of unclarity or disagreement:

  • In either option, an 'earliest' date and a 'latest' date would be
    synthesized and stored internally by the services layer, from values
    entered into other fields in the structured date by the user.  These
    would be exposed in services schema as read-only values; a client
    could not set these.

  • In the first option above, date fields would be defined as xs:date
    or xs:dateTime in the XSD schemas used by Nuxeo, which results in the
    creation of native date types in the underlying database (PostgreSQL
    or MySQL).

  • In the second option, dates would be stored as strings, in a profile
    of ISO 8601, which would also permit date and date range comparisons.

  • Data entry would be constrained/validated in the UI and services
    layers so that at least a year must be present, for a month to be
    entered; and a year and month would need to be present, for a day to
    be entered.  (This validation is not present in structured dates at
    the moment, AFAIK.)

In either case, as Patrick mentioned:

  • These date values would initially be synthesized based on just Year,
    Month, and Day values.  Uncertainty qualifiers and the like wouldn't
    be taken into account in the first pass, although those would be
    retained for potential future use.

Aron
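
As a rough illustration of the rules restated above, the sketch below derives 'earliest' and 'latest' values from Year, Month, and Day alone and returns them as ISO 8601 strings (the second storage option). The function name `synthesize_bounds` is hypothetical, the year range is restricted to 1 through 9999 so that plain string comparison orders the values correctly, and qualifiers and certainty values are ignored, as in the proposed first pass.

```python
import calendar
from typing import Optional, Tuple


def synthesize_bounds(year: Optional[int],
                      month: Optional[int] = None,
                      day: Optional[int] = None) -> Tuple[str, str]:
    """Return (earliest, latest) as YYYY-MM-DD strings.

    Enforces the entry rules from the message: a month needs a year,
    and a day needs both a year and a month.
    """
    if year is None:
        raise ValueError("a year is required before a month or day can be entered")
    if day is not None and month is None:
        raise ValueError("a month is required before a day can be entered")
    if not 1 <= year <= 9999:
        raise ValueError("this sketch only handles years 1 through 9999")
    if month is not None and not 1 <= month <= 12:
        raise ValueError("month must be between 1 and 12")

    # Missing parts expand to the widest range consistent with what was entered.
    first_month = month if month is not None else 1
    final_month = month if month is not None else 12
    days_in_final_month = calendar.monthrange(year, final_month)[1]
    if day is not None and not 1 <= day <= days_in_final_month:
        raise ValueError("day is out of range for that month")
    first_day = day if day is not None else 1
    final_day = day if day is not None else days_in_final_month

    earliest = f"{year:04d}-{first_month:02d}-{first_day:02d}"
    latest = f"{year:04d}-{final_month:02d}-{final_day:02d}"
    return earliest, latest
```

For a year-only entry of 1718 this yields the same pair as option (b) in the original question: 1 January 1718 and 31 December 1718. A fully specified day yields two equal values, which is the "point" case.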


RM
Richard Moe
Wed, Aug 24, 2011 4:58 PM

I echo Susan's recommendation, having spent a lot of time disentangling May 1, 1910 records that are really May 1910, or Jan 1, 1910 records that are really just 1910.

Smasch (as Susan can explain) uses a Julian day representation, with early Julian day (ejd) and late Julian day (ljd) fields to deal with date ranges (Chris's "b" below). I've continued to use that in the Consortium of California Herbaria.

I think Julian day representation may not work well with BC dates. I appreciate that the need for generality, to accommodate the different epochs in different sorts of museums, is troublesome.

Smasch has a fuzzy date module by which uncertainty and variable ranges are represented (ca., about, spring, late fall, before, after, and so on). It modifies the ejd and ljd as necessary.

It seems like a locally modifiable fuzzy date module might be useful.

Dick Moe
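
A sketch of the early/late Julian day idea, under stated assumptions: the conversion is the standard integer formula for a Julian Day Number from a (proleptic) Gregorian date, and the `FUZZY_PADDING_DAYS` table with its values is purely hypothetical. It is not Smasch's actual fuzzy date module, whose rules are not given in this thread.

```python
from typing import Optional, Tuple


def julian_day_number(year: int, month: int, day: int) -> int:
    """Julian Day Number for a proleptic Gregorian calendar date."""
    a = (14 - month) // 12          # 1 for January/February, else 0
    y = year + 4800 - a
    m = month + 12 * a - 3          # March = 0 ... February = 11
    return (day + (153 * m + 2) // 5 + 365 * y
            + y // 4 - y // 100 + y // 400 - 32045)


# Hypothetical padding in days added on each side of the range for a
# fuzzy term; a real module would be locally configurable.
FUZZY_PADDING_DAYS = {"ca.": 365, "about": 365}


def apply_fuzzy(ejd: int, ljd: int, term: Optional[str] = None) -> Tuple[int, int]:
    """Widen the (ejd, ljd) pair for a known fuzzy term; otherwise unchanged."""
    pad = FUZZY_PADDING_DAYS.get(term, 0)
    return ejd - pad, ljd + pad


if __name__ == "__main__":
    # "May 1910" kept as a range rather than collapsed to May 1, 1910.
    ejd = julian_day_number(1910, 5, 1)
    ljd = julian_day_number(1910, 5, 31)
    print(ejd, ljd, apply_fuzzy(ejd, ljd, "ca."))
```

Because both ends are plain integers, range searches reduce to integer comparisons, and a site could swap in its own padding table without touching the conversion.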
On Aug 24, 2011, at 9:33 AM, Susan Stone wrote:

Patrick,

There was quite a bit of discussion of this in our Wednesday afternoon meetings at Berkeley that I'm sure Aron will recall. I don't think date fields go back far enough in time, so I'd recommend using integers or floating point numbers and providing the math in search. It is also important to make sure that false precision isn't introduced. If something is dated 1900, the system shouldn't enter that as January 1, 1900 unless the uncertainty values can be used to differentiate and the display date can be simply 1900.

Susan

On 08/24/2011 09:16 AM, Patrick Schmitz wrote:

Really? I am a little surprised that in all the design discussions for
structured dates this did not come up.
Anyway, I will say that we do not have the time to build some system
that allows for a lot of custom behavior, or elaborate heuristics here
(i.e., trying to divine what is intended from a set of text fields). I
would strongly suggest that we either

  1. require entry of some actual date fields upon which the application
    can reason,
    or
  2. we define a very simple set of rules for building such date values
    from some of the existing text fields. By very simple, I mean that
    we only consider fields like year, month, day, and that the rules
    produce two scalar dates (early and late), which either define a
    range, or define a point (if they are equal). AIUI, a structured
    date /cannot/ model the idea of "some time/any time after (or
    before) a date".

One further question is the semantic of which date (if any) will be used
for ordering.
If and when advanced search supports date ranges (which also implies
support for /before/ and /after/ logic in search), we can handle the
interval logic associated with the structured dates, but only if we have
a pair of scalar date values.
Down the road, I do think it would be interesting to implement some
fuzzier logic on date search, to better support "circa" logic and the
uncertainty values. However, I suspect that it will take longer to
arrive at consensus on these semantics, and would defer that discussion
for now.
Thanks - Patrick

------------------------------------------------------------------------
*From:* Chris Hoffman [mailto:chris.hoffman@berkeley.edu]
*Sent:* Wednesday, August 24, 2011 9:01 AM
*To:* Patrick Schmitz
*Cc:* 'Christopher Pott'; 'Aron Roberts'; talk@lists.collectionspace.org
*Subject:* Re: [Talk] what's the best way to represent structured dates?

Hi Patrick, I think it's fair to say there's no consensus because
there's been no discussion :-) It's possible that each customer will
want to use structured dates differently, which will make it hard to
add any intelligence or even predictability to things like search.
However, I also think that if a significant number of deploying
institutions can come to some consensus on best ways to use and
search structured dates, then many institutions will choose to adopt
the same best practices.

Thanks,
Chris

On Aug 24, 2011, at 8:55 AM, Patrick Schmitz wrote:
Aron was out yesterday, and Rick is out for a few days. They have
done a lot of the structured dates work to date.
The last I heard, Aron was hoping to get consensus from the users
about how we map all the text fields to actual date fields that we
can reason on (for search, ordering, etc). To date, there seems to
be no such consensus, which will preclude search, ordering, etc.
on structured dates. I would think that this would make structured
dates quite a bit less useful...
Perhaps (I hope) my understanding is out of date, and we just need
to implement the logic. Perhaps Aron can jump in this AM, when he
gets in.
Patrick

    ------------------------------------------------------------------------
    *From:* talk-bounces@lists.collectionspace.org
    <mailto:talk-bounces@lists.collectionspace.org>
    [mailto:talk-bounces@lists.collectionspace.org] *On Behalf Of
    *Chris Hoffman
    *Sent:* Tuesday, August 23, 2011 11:05 AM
    *To:* Christopher Pott
    *Cc:* talk@lists.collectionspace.org
    <mailto:talk@lists.collectionspace.org>
    *Subject:* Re: [Talk] what's the best way to represent
    structured dates?

    Hi Chris,

    Thanks for asking these questions. We've been wondering the
    same things. Have you changed any existing calendar date
    fields to structured dates or added new structured date
    fields? I think this is all quite new, and together we might
    be able to come up with some best practices (related to data
    migration and data entry).

    I suspect that search and advanced search of structured date
    fields is going to be an issue for those of us deploying
    CollectionSpace. Our users will want CSpace to be smart about
    structured dates, interpreting qualifiers, and so on, but I
    bet that we won't see as much as we would like in the
    first version of advanced search. I'm always happy to be
    proved wrong! :-)

    Chris

    On Aug 23, 2011, at 8:48 AM, Christopher Pott wrote:
    Hi,

    I’m looking at the structured date schema and the best way to
    represent dates migrated from our old system. Any information
    about the following would be much appreciated….

    1. I see some different/overlapping possibilities for
    recording an object creation date. For example, to represent
    “1718” (display date) one could either store:

    a) a single date, providing year only (1718)

    b) an earliest date (01.01.1718) with a latest date (31.12.1718)

    c) a single date (01.01.1718) with a qualifier of + 364 days

    Is there any difference in the meaning of these three (and
    would they all be treated equally by services such as
    advanced search)? Our migration data is currently in format
    ‘b’ but the other formats are sometimes ‘friendlier’ for new
    input.

    2. How are date “certainties” and “qualifiers” used by
    advanced search?

    I’ve specified “1718”, as a creation date with a certainty of
    “circa”. Will a search for objects created during 1717 result
    in a hit?

    Or is it necessary to use qualifiers (eg. 1718 +/- 2 years)
    to define exact limits for this type of searching?

    3. Is the Date schema locally extendable? (We need a field to
    hold an English translation of the display text)

    4. Will Advanced search by creation date also search the
    display date or will it just use the single/earliest and
    latest date groups?

    Thanks,

    Chris

    _______________________________________________
    Talk mailing list
    Talk@lists.collectionspace.org
    <mailto:Talk@lists.collectionspace.org>
    http://lists.collectionspace.org/mailman/listinfo/talk_lists.collectionspace.org
CH
Chris Hoffman
Wed, Aug 24, 2011 5:02 PM

Hi all,

I think this would be fine for CSpace 2.0.  Of course we would love to see more advanced treatments in the future (e.g., sophisticated handling of circa dates and qualifiers that get interpreted automagically by the system), but for now, if we are able to synthesize early and late dates from entered data and search somewhat intelligently based on those synthesized dates, that gives us a lot of power.

I've asked my team to show me the algorithm we use that creates early and late date fields automatically from information entered in a single field, so we can offer some examples (and even logic if you want it).

Then if we can come to some agreement about how to search these, then we're happy campers!

Thanks,
Chris

On Aug 24, 2011, at 9:53 AM, Aron Roberts wrote:

On Wed, Aug 24, 2011 at 9:16 AM, Patrick Schmitz pschmitz@berkeley.edu wrote:

I would strongly suggest that we either require entry of some actual date
fields upon which the application can reason,
or
we define a very simple set of rules for building such date values from some
of the existing text fields. By very simple, I mean that we only consider
fields like year, month, day, and that the rules produce two scalar dates
(early and late), which either define a range, or define a point (if they
are equal).

This is how I've understood we'd be modeling this as well, based on
our discussions to date.

To restate what I understand Patrick to have said today, and what he
and others have said in various discussions, so that we can identify
any remaining points of unclarity or disagreement:

  • In either option, an 'earliest' date and a 'latest' date would be
    synthesized and stored internally by the services layer, from values
    entered into other fields in the structured date by the user.  These
    would be exposed in services schema as read-only values; a client
    could not set these.
  • In the first option above, date fields would be defined as xs:date
    or xs:dateTime in the XSD schemas used by Nuxeo, which results in the
    creation of native date types in the underlying database (PostgreSQL
    or MySQL).
  • In the second option, dates would be stored as strings, in a profile
    of ISO 8601, which would also permit date and date range comparisons.
  • Data entry would be constrained/validated in the UI and services
    layers so that at least a year must be present, for a month to be
    entered; and a year and month would need to be present, for a day to
    be entered.  (This validation is not present in structured dates at
    the moment, AFAIK.)

In either case, as Patrick mentioned:

  • These date values would initially be synthesized based on just Year,
    Month, and Day values.  Uncertainty qualifiers and the like wouldn't
    be taken into account in the first pass, although those would be
    retained for potential future use.
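
The synthesis and validation rules listed above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the services-layer implementation: the function name `synthesize_range` is hypothetical, only Year, Month, and Day are considered (no certainty or qualifier fields), and Python's `datetime.date` covers years 1 to 9999 only, so ancient dates are out of scope here.

```python
import calendar
from datetime import date
from typing import Optional, Tuple


def synthesize_range(year: Optional[int],
                     month: Optional[int] = None,
                     day: Optional[int] = None) -> Tuple[date, date]:
    """Return (earliest, latest) scalar dates for a partial Year/Month/Day.

    Hypothetical helper: the name and signature are assumptions made
    for illustration, not part of any CollectionSpace API.
    """
    # Validation as described in the thread: a month needs a year,
    # and a day needs both a year and a month.
    if year is None:
        raise ValueError("a year is required")
    if day is not None and month is None:
        raise ValueError("a day requires a month")

    if month is None:
        # Year only: the range spans the whole year.
        return date(year, 1, 1), date(year, 12, 31)
    if day is None:
        # Year and month: the range spans the whole month.
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, 1), date(year, month, last_day)

    # Full date: earliest and latest are equal, which defines a point.
    point = date(year, month, day)
    return point, point


earliest, latest = synthesize_range(1718)
# ISO 8601 strings of equal length compare in date order, so the
# string-storage option still allows range comparisons.
assert earliest.isoformat() < latest.isoformat()
```

With this approach a display date of "1718" keeps its year-only precision in the entered fields, while search and ordering work against the synthesized pair.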

Aron

AIUI, a structured date cannot model the idea of "some time/any
time after (or before) a date".

One further question is the semantic of which date (if any) will be used for
ordering.

If and when advanced search supports date ranges (which also implies support
for before and after logic in search), we can handle the interval logic
associated with the structured dates, but only if we have a pair of scalar
date values.
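
As a rough illustration of the interval logic referred to here, assuming every record carries a synthesized (early, late) pair of scalar dates: the function names below are hypothetical, and how "before" and "after" should treat partially overlapping ranges is exactly the kind of semantics that would still need agreement.

```python
from datetime import date


def overlaps(rec_early: date, rec_late: date,
             q_early: date, q_late: date) -> bool:
    """True if the record's range shares at least one day with the query range."""
    return rec_early <= q_late and q_early <= rec_late


def entirely_before(rec_late: date, cutoff: date) -> bool:
    """True if the whole record range falls before the cutoff date."""
    return rec_late < cutoff


def entirely_after(rec_early: date, cutoff: date) -> bool:
    """True if the whole record range falls after the cutoff date."""
    return rec_early > cutoff


# A record entered as the year 1718, searched with a query for the year 1717.
record = (date(1718, 1, 1), date(1718, 12, 31))
query_1717 = (date(1717, 1, 1), date(1717, 12, 31))
hit = overlaps(*record, *query_1717)
```

Under these assumptions a record stored as 1718 does not match a search for 1717, because certainty values such as "circa" play no part in the comparison.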

Down the road, I do think it would be interesting to implement some fuzzier
logic on date search, to better support "circa" logic and the uncertainty
values. However, I suspect that it will take longer to arrive at consensus
on these semantics, and would defer that discussion for now.

Thanks - Patrick


From: Chris Hoffman [mailto:chris.hoffman@berkeley.edu]
Sent: Wednesday, August 24, 2011 9:01 AM
To: Patrick Schmitz
Cc: 'Christopher Pott'; 'Aron Roberts'; talk@lists.collectionspace.org
Subject: Re: [Talk] what's the best way to represent structured dates?

Hi Patrick, I think it's fair to say there's no consensus because there's
been no discussion :-) It's possible that each customer will want to use
structured dates differently, which will make it hard to add any
intelligence or even predictability to things like search.  However, I also
think that if a significant number of deploying institutions can come to
some consensus on best ways to use and search structured dates, then many
institutions will choose to adopt the same best practices.
Thanks,
Chris
On Aug 24, 2011, at 8:55 AM, Patrick Schmitz wrote:

Aron was out yesterday, and Rick is out for a few days. They have done a lot
of the structured dates work to date.

The last I heard, Aron was hoping to get consensus from the users about how
we map all the text fields to actual date fields that we can reason on (for
search, ordering, etc). To date, there seems to be no such consensus, which
will preclude search, ordering, etc. on structured dates. I would think that
this would make structured dates quite a bit less useful...

Perhaps (I hope) my understanding is out of date, and we just need to
implement the logic. Perhaps Aron can jump in this AM, when he gets in.

Patrick


From: talk-bounces@lists.collectionspace.org
[mailto:talk-bounces@lists.collectionspace.org] On Behalf Of Chris Hoffman
Sent: Tuesday, August 23, 2011 11:05 AM
To: Christopher Pott
Cc: talk@lists.collectionspace.org
Subject: Re: [Talk] what's the best way to represent structured dates?

Hi Chris,
Thanks for asking these questions.  We've been wondering the same things.
Have you changed any existing calendar date fields to structured dates or
added new structured date fields?  I think this is all quite new, and
together we might be able to come up with some best practices (related to
data migration and data entry).
I suspect that search and advanced search of structured date fields is going
to be an issue for those of us deploying CollectionSpace.  Our users will
want CSpace to be smart about structured dates, interpreting qualifiers, and
so on, but I bet that we won't see as much as we would like in the
first version of advanced search.  I'm always happy to be proved wrong!  :-)
Chris
On Aug 23, 2011, at 8:48 AM, Christopher Pott wrote:

Hi,

I’m looking at the structured date schema and the best way to represent
dates migrated from our old system. Any information about the following
would be much appreciated….

  1. I see some different/overlapping possibilities for recording an object
    creation date. For example, to represent “1718” (display date) one could
    either store:

a) a single date, providing year only (1718)

b) an earliest date (01.01.1718) with a latest date (31.12.1718)

c) a single date (01.01.1718) with a qualifier of + 364 days

Is there any difference in the meaning of these three (and would they all be
treated equally by services such as advanced search)? Our migration data is
currently in format ‘b’ but the other formats are sometimes ‘friendlier’ for
new input.

  2. How are date “certainties” and “qualifiers” used by advanced search?

I’ve specified “1718”, as a creation date with a certainty of “circa”. Will
a search for objects created during 1717 result in a hit?

Or is it necessary to use qualifiers (eg. 1718 +/- 2 years) to define exact
limits for this type of searching?

  3. Is the Date schema locally extendable? (We need a field to hold an
    English translation of the display text)

  4. Will Advanced search by creation date also search the display date or
    will it just use the single/earliest and latest date groups?

Thanks,

Chris



JE
Janice Eklund
Wed, Aug 24, 2011 5:17 PM

I second Susan's concern about false precision.  At least 90% of our
dates are year only single dates or date ranges. If you are
considering fuzzy logic down the road, look at the existing code
carefully.  The fuzzy date machine, at least as implemented at UCB for
the HAVRC, does not handle ancient dates properly.

Jan

On 8/24/11, Susan Stone sstone@socrates.berkeley.edu wrote:

Patrick,

There was quite a bit of discussion of this in our Wednesday afternoon
meetings at Berkeley that I'm sure Aron will recall. I don't think date
fields go back far enough in time, so I'd recommend using integers or
floating point numbers and providing the math in search. It is also
important to make sure that false precision isn't introduced. If
something is dated 1900, the system shouldn't enter that as January 1,
1900 unless the uncertainty values can be used to differentiate and the
display date can be simply 1900.

Susan

On 08/24/2011 09:16 AM, Patrick Schmitz wrote:

Really? I am a little surprised that in all the design discussions for
structured dates this did not come up.
Anyway, I will say that we do not have the time to build some system
that allows for a lot of custom behavior, or elaborate heuristics here
(i.e., trying to divine what is intended from a set of text fields). I
would strongly suggest that we either

  1. require entry of some actual date fields upon which the application
    can reason,
    or
  2. we define a very simple set of rules for building such date values
    from some of the existing text fields. By very simple, I mean that
    we only consider fields like year, month, day, and that the rules
    produce two scalar dates (early and late), which either define a
    range, or define a point (if they are equal). AIUI, a structured
    date /cannot/ model the idea of "some time/any time after (or
    before) a date".

One further question is the semantic of which date (if any) will be used
for ordering.
If and when advanced search supports date ranges (which also implies
support for /before/ and /after/ logic in search), we can handle the
interval logic associated with the structured dates, but only if we have
a pair of scalar date values.
Down the road, I do think it would be interesting to implement some
fuzzier logic on date search, to better support "circa" logic and the
uncertainty values. However, I suspect that it will take longer to
arrive at consensus on these semantics, and would defer that discussion
for now.
Thanks - Patrick


 *From:* Chris Hoffman [mailto:chris.hoffman@berkeley.edu]
 *Sent:* Wednesday, August 24, 2011 9:01 AM
 *To:* Patrick Schmitz
 *Cc:* 'Christopher Pott'; 'Aron Roberts'; talk@lists.collectionspace.org
 *Subject:* Re: [Talk] what's the best way to represent structured dates?

 Hi Patrick, I think it's fair to say there's no consensus because
 there's been no discussion :-) It's possible that each customer will
 want to use structured dates differently, which will make it hard to
 add any intelligence or even predictability to things like search.
 However, I also think that if a significant number of deploying
 institutions can come to some consensus on best ways to use and
 search structured dates, then many institutions will choose to adopt
 the same best practices.

 Thanks,
 Chris

 On Aug 24, 2011, at 8:55 AM, Patrick Schmitz wrote:
 Aron was out yesterday, and Rick is out for a few days. They have
 done a lot of the structured dates work to date.
 The last I heard, Aron was hoping to get consensus from the users
 about how we map all the text fields to actual date fields that we
 can reason on (for search, ordering, etc). To date, there seems to
 be no such consensus, which will preclude search, ordering, etc.
 on structured dates. I would think that this would make structured
 dates quite a bit less useful...
 Perhaps (I hope) my understanding is out of date, and we just need
 to implement the logic. Perhaps Aron can jump in this AM, when he
 gets in.
 Patrick

     *From:* talk-bounces@lists.collectionspace.org
     <mailto:talk-bounces@lists.collectionspace.org>
     [mailto:talk-bounces@lists.collectionspace.org] *On Behalf Of
     *Chris Hoffman
     *Sent:* Tuesday, August 23, 2011 11:05 AM
     *To:* Christopher Pott
     *Cc:* talk@lists.collectionspace.org
     <mailto:talk@lists.collectionspace.org>
     *Subject:* Re: [Talk] what's the best way to represent
     structured dates?

     Hi Chris,

     Thanks for asking these questions. We've been wondering the
     same things. Have you changed any existing calendar date
     fields to structured dates or added new structured date
     fields? I think this is all quite new, and together we might
     be able to come up with some best practices (related to data
     migration and data entry).

     I suspect that search and advanced search of structured date
     fields is going to be an issue for those of us deploying
     CollectionSpace. Our users will want CSpace to be smart about
     structured dates, interpreting qualifiers, and so on, but I
     bet that we won't see as much as we would like in the
     first version of advanced search. I'm always happy to be
     proved wrong! :-)

     Chris


I second Susan's concern about false precision. At least 90% of our
dates are year only single dates or date ranges. If you are considering
fuzzy logic down the road, look at the existing code carefully. The
fuzzy date machine, at least as implemented at UCB for the HAVRC, does
not handle ancient dates properly.

Jan

On 8/24/11, Susan Stone <sstone@socrates.berkeley.edu> wrote:
> Patrick,
>
> There was quite a bit of discussion of this in our Wednesday afternoon
> meetings at Berkeley that I'm sure Aron will recall. I don't think date
> fields go back far enough in time, so I'd recommend using integers or
> floating point numbers and providing the math in search. It is also
> important to make sure that false precision isn't introduced. If
> something is dated 1900, the system shouldn't enter that as January 1,
> 1900 unless the uncertainty values can be used to differentiate and the
> display date can be simply 1900.
>
> Susan
>
> On 08/24/2011 09:16 AM, Patrick Schmitz wrote:
>> Really? I am a little surprised that in all the design discussions for
>> structured dates this did not come up.
>> Anyway, I will say that we do not have the time to build some system
>> that allows for a lot of custom behavior, or elaborate heuristics here
>> (i.e., trying to divine what is intended from a set of text fields). I
>> would strongly suggest that we either
>>
>> 1. require entry of some actual date fields upon which the application
>>    can reason,
>>    or
>> 2. we define a very simple set of rules for building such date values
>>    from some of the existing text fields. By very simple, I mean that
>>    we only consider fields like year, month, day, and that the rules
>>    produce two scalar dates (early and late), which either define a
>>    range, or define a point (if they are equal). AIUI, a structured
>>    date /cannot/ model the idea of "some time/any time after (or
>>    before) a date".
>>
>> One further question is the semantic of which date (if any) will be used
>> for ordering.
>> If and when advanced search supports date ranges (which also implies
>> support for /before/ and /after/ logic in search), we can handle the
>> interval logic associated with the structured dates, but only if we have
>> a pair of scalar date values.
>> Down the road, I do think it would be interesting to implement some
>> fuzzier logic on date search, to better support "circa" logic and the
>> uncertainty values. However, I suspect that it will take longer to
>> arrive at consensus on these semantics, and would defer that discussion
>> for now.
>> Thanks - Patrick
>>
>> ------------------------------------------------------------------------
>> From: Chris Hoffman <chris.hoffman@berkeley.edu>
>> Sent: Wednesday, August 24, 2011 9:01 AM
>> To: Patrick Schmitz
>> Cc: 'Christopher Pott'; 'Aron Roberts';
>> talk@lists.collectionspace.org
>> Subject: Re: [Talk] what's the best way to represent structured
>> dates?
>>
>> Hi Patrick, I think it's fair to say there's no consensus because
>> there's been no discussion :-) It's possible that each customer will
>> want to use structured dates differently, which will make it hard to
>> add any intelligence or even predictability to things like search.
>> However, I also think that if a significant number of deploying
>> institutions can come to some consensus on best ways to use and
>> search structured dates, then many institutions will choose to adopt
>> the same best practices.
>>
>> Thanks,
>> Chris
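
For readers following the thread, here is a minimal sketch of the "very simple set of rules" Patrick describes: take only year, month, and day, and produce two scalar dates (early and late) that define either a range or a point. It is an illustration under stated assumptions, not CollectionSpace's schema or search code. The names `DateRange`, `expand`, and `overlaps` are hypothetical. Scalars are plain integer tuples rather than calendar date objects, in the spirit of Susan's suggestion to use integers, so zero and negative years can be stored. The sketch assumes proleptic Gregorian month lengths for every year.

```python
from typing import NamedTuple, Optional, Tuple

# A scalar date is a (year, month, day) tuple of integers. Tuples compare
# lexicographically, and the year may be zero or negative, unlike
# datetime.date, which only covers years 1 through 9999.
Scalar = Tuple[int, int, int]


class DateRange(NamedTuple):
    """Hypothetical early/late pair; equal values mean a single point."""
    early: Scalar
    late: Scalar


def _last_day(year: int, month: int) -> int:
    # Assumption: the Gregorian leap-year rule is applied to every year.
    if month == 2:
        leap = year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)
        return 29 if leap else 28
    return 30 if month in (4, 6, 9, 11) else 31


def expand(year: int, month: Optional[int] = None,
           day: Optional[int] = None) -> DateRange:
    """Build the (early, late) pair from whichever fields were entered."""
    if month is None:
        # Year only: the whole year, so "1718" is not stored as January 1.
        return DateRange((year, 1, 1), (year, 12, 31))
    if day is None:
        # Year and month: the whole month.
        return DateRange((year, month, 1),
                         (year, month, _last_day(year, month)))
    # Full date: early and late are equal, which defines a point.
    return DateRange((year, month, day), (year, month, day))


def overlaps(a: DateRange, b: DateRange) -> bool:
    """Interval test a range search could use: true if a and b intersect."""
    return a.early <= b.late and b.early <= a.late
```

Under these rules a year-only entry of 1718 expands to the same pair as option (b) in the original question, and a search for 1717 does not match it unless the stored range is widened explicitly. How "circa" or the qualifier fields should widen the range is left open here, as it is in the thread.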