Subject: [Talk] Short identifier uniqueness requirements question?
From: Chris Hoffman
Sent: Friday, August 31, 2012 12:58 PM
Hi,
I need to make sure I understand at what scope uniqueness is required for short identifiers in vocabularies. Must short identifiers be unique across all vocabularies within one authority, or only within a single vocabulary?
I ask because our UCJEPS installation has three taxonomy vocabularies. Short identifiers are unique within each vocabulary but not across the entire set. When I tried to display a record that has a "duplicate" short identifier in a demo 2.4 system, CSpace took 5 minutes to display it. The GET request in Firebug was only reporting:
http://ucjeps.collectionspace.org:8180/collectionspace/tenant/ucjeps/vocabularies/taxon/urn:cspace:name(0)
although the referer within the request headers is
http://ucjeps.collectionspace.org:8180/collectionspace/ui/ucjeps/html/taxon.html?csid=urn:cspace:name(0)&vocab=taxon
Thanks,
Chris
Not sure I understand the requirements here entirely, but
urn:cspace:name(0) does seem very brief!
It would be easy to construct identifiers that are globally unique
(e.g. by appending a UUID to urn:cspace:), so why not use those
so that other systems can work directly with the concepts you are
identifying?
John
Talk mailing list
Talk@lists.collectionspace.org
http://lists.collectionspace.org/mailman/listinfo/talk_lists.collectionspace.org
--
John Deck
(541) 321-0689
I thought short identifiers could not begin with a number?
Susan
Must be unique within a vocab, or you will get duplicates, and things will
be unhappy. Having the same ID in several vocabs (even within the same
authority) is fine.
ShortIdentifiers are the basis for fetching, so having a short ID that is a
reasonable word length is good (indexes better). Having one that has a
numeric pseudo-random tail (e.g., "fooBar2345", as we create for auto-create
on termCompletion) balances ease of reading/debugging with uniqueness that
will behave well in the DB. The DB will try to stem the shortId when it
prepares keyword indexes (which we use in certain cases), and so the suffix
keeps things behaving better, I think.
UUIDs will work fine as well, but are harder to make sense of when
debugging, etc.
Patrick
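To make the points above concrete, here is a minimal Python sketch of the two ideas: a readable short ID with a numeric pseudo-random tail (in the style of "fooBar2345"), and a uniqueness check scoped to a single vocabulary. This is not CollectionSpace code. The helper names `make_short_id` and `register`, the four-digit suffix length, the rule of keeping only ASCII letters and digits, and the vocabulary name "common" are all assumptions made for illustration.

```python
import random
import re


def make_short_id(display_name, taken, rng=random):
    """Return a readable short ID with a four-digit tail that is not in `taken`."""
    # Keep only ASCII letters and digits from the display name (assumed rule).
    base = re.sub(r"[^A-Za-z0-9]", "", display_name) or "term"
    while True:
        # Append a zero-padded pseudo-random number, e.g. "fooBar0042".
        candidate = f"{base}{rng.randrange(10000):04d}"
        if candidate not in taken:
            return candidate


def register(vocabularies, vocab_name, short_id):
    """Record short_id under one vocabulary; reject a repeat in that vocabulary."""
    taken = vocabularies.setdefault(vocab_name, set())
    if short_id in taken:
        raise ValueError(f"{short_id!r} already exists in vocabulary {vocab_name!r}")
    taken.add(short_id)


vocabularies = {}
register(vocabularies, "taxon", "fooBar2345")
# The same ID in a different vocabulary is allowed.
register(vocabularies, "common", "fooBar2345")
# A freshly generated ID avoids the ones already taken in "taxon".
new_id = make_short_id("foo Bar", vocabularies["taxon"])
register(vocabularies, "taxon", new_id)
```

Keeping one set of IDs per vocabulary mirrors the rule that duplicates only matter within a vocabulary, not across an authority.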
I think we just require all word chars (alnum and _, IIRC).
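As a rough illustration of the "word characters only" rule recalled above, the following Python sketch checks a candidate short identifier. It is an assumption-laden sketch, not the actual CollectionSpace validation: the function name `is_valid_short_id` is hypothetical, and restricting "alnum" to ASCII is my reading of the rule.

```python
import re

# Assumed rule: one or more ASCII letters, digits, or underscores.
SHORT_ID_PATTERN = re.compile(r"[A-Za-z0-9_]+")


def is_valid_short_id(short_id):
    """Return True if short_id consists only of word characters."""
    return SHORT_ID_PATTERN.fullmatch(short_id) is not None


checks = {
    "fooBar2345": is_valid_short_id("fooBar2345"),
    "0": is_valid_short_id("0"),              # a leading digit passes this rule
    "foo bar": is_valid_short_id("foo bar"),  # whitespace is not a word character
    "": is_valid_short_id(""),                # empty strings are rejected
}
```

Under this reading, an identifier such as "0" is syntactically acceptable even though it is very short.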
Thanks, I'm pretty sure my problem is that the legacy identifier I used in this case was simply a one-digit string "0", and fetching is therefore slow. Sigh.
Chris
Thanks, John. You're right that the string urn:cspace:name(0) wouldn't be a very good identifier! Fortunately, this one is just used internally, and even then only under specific circumstances. There are UUIDs and such attached to this record that will allow us to construct more appropriate globally unique identifiers for data sharing purposes.
Chris