Thank you, Susan, Aron, and Chris. In particular, a shout-out to Aron for writing up a great list of things to look at for improving performance. This certainly generated more of a discussion than I anticipated -- my inquiries seem to do that -- and I appreciate everyone's time in writing out their thoughts. In particular, I thought PostgreSQL's AutoVacuum process would take care of things for me, but during these large record loads it may make sense to run VACUUM more often than the automatic process would dictate. Knowing about the anchors, as Chris called out, is particularly useful as well.
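For the list's benefit, here is roughly what that manual pass might look like -- a minimal Python/psycopg2 sketch, where the database name, user, and table name are placeholders rather than anything CollectionSpace-specific:

    # Minimal sketch; connection details and table name are placeholders.
    import psycopg2

    conn = psycopg2.connect(dbname="cspace", user="postgres")
    conn.autocommit = True  # VACUUM cannot run inside a transaction block
    with conn.cursor() as cur:
        # Reclaim dead rows and refresh planner statistics after a bulk load.
        cur.execute("VACUUM ANALYZE;")
        # Optionally make autovacuum more aggressive on a heavily loaded
        # table ("collectionobjects_common" is a hypothetical name here).
        cur.execute(
            "ALTER TABLE collectionobjects_common "
            "SET (autovacuum_vacuum_scale_factor = 0.05);"
        )
    conn.close()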
I am intrigued by Susan's comment about using external vocabularies. There seems to be foresight in the data model to be able to do that, and hooking into something like the Getty's Linked Open Data SPARQL endpoint would certainly be nice to do. As with most open source, though, if only someone had the time to scratch that itch...
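Should anyone ever find that time, I'd guess the lookup would start out something like this (untested; the endpoint is Getty's public SPARQL service, and the label path follows their SKOS-XL ontology, so the details may need adjusting):

    # Untested sketch: look up a term in Getty's AAT via their SPARQL endpoint.
    import requests

    ENDPOINT = "http://vocab.getty.edu/sparql"
    QUERY = """
    PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
    PREFIX gvp:  <http://vocab.getty.edu/ontology#>
    PREFIX xl:   <http://www.w3.org/2008/05/skos-xl#>
    SELECT ?concept ?label WHERE {
      ?concept skos:inScheme <http://vocab.getty.edu/aat/> ;
               gvp:prefLabelGVP/xl:literalForm ?label .
      FILTER(CONTAINS(LCASE(STR(?label)), "watercolor"))
    }
    LIMIT 10
    """

    resp = requests.get(
        ENDPOINT,
        params={"query": QUERY},
        headers={"Accept": "application/sparql-results+json"},
    )
    resp.raise_for_status()
    for row in resp.json()["results"]["bindings"]:
        print(row["concept"]["value"], row["label"]["value"])

(Getty's documentation suggests their luc:term full-text extension for label searches, which would be much faster than a FILTER over every label, but the shape above is the general idea.)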
Peter
> On Nov 16, 2015, at 6:29 PM, Aron Roberts <aron@socrates.berkeley.edu> wrote:
>
> This will likely repeat a lot of what John, Ray, and Richard have said, but my two cents as well ... there's also a note here from a discussion with Chris about search training.
>
> A summary / tl;dr:
>
> 1. Elasticsearch integration may be a future help in speeding up large searches. Until then (and likely even after then ...)
> 2. Training users on doing effective searching can significantly improve search speed.
> 3. Using just a subset of term lists can help (if it's amenable to that; it may not be here ...).
> 4. Throwing hardware at the problem - particularly fast disk - can help quite a lot.
> 5. Be sure that the database is cared for/tuned a bit:
> 5.1 Run VACUUM/ANALYZE
> 5.2 Look at other tuning tips
>
> Aron
--
Peter Murray
Dev/Ops Lead and Project Manager
Cherry Hill Company