Hello:
I wonder if there's a fast way to get the contents or names of the resources that start with a certain path. We have something like this:
string query = @"for $doc in collection() where substring-after(document-uri($doc), '/') = '" + documentName + @"' return $doc/Root";
But our db has millions of resources, so this doesn't scale too well. I'd like a way to get the contents (or names) of resources whose names start with a certain pattern, without having to check all names. Is it possible?
The alternative would be to add a field inside the XML resources, index on it, and perform a query.
Thanks,
Martín.
Hello Martin,
As stated in our documentation at http://docs.basex.org/wiki/Databases#XML_Documents, you can further restrict the documents in the collection via the path argument of collection().
Also, you can get all document names in a database using db:list("database"), or restrict the list to a path, e.g. db:list("database", "my/path").
Cheers,
Dirk
On 08/28/2015 06:35 PM, Martín Ferrari wrote:
Hello:
I wonder if there's a fast way to get the contents or names of the resources that start with a certain path. We have something like this:
string query = @"for $doc in collection() where substring-after(document-uri($doc), '/') = '" + documentName + @"' return $doc/Root";
But our db has millions of resources, so this doesn't scale too well. I'd like a way to get the contents (or names) of resources whose names start with a certain pattern, without having to check all names. Is it possible?
The alternative would be to add a field inside the XML resources, index on it, and perform a query.
Thanks,
Martín.
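Dirk's two suggestions can be written out as query strings. The following is only a minimal sketch in Python (the original client code is C#); the helper names xq_string, list_query and collection_query are hypothetical, and the database name "database" and path "my/path" are the placeholders from Dirk's message. Both queries address an exact path or directory, not a name prefix.

```python
def xq_string(value: str) -> str:
    """Return value as a double-quoted XQuery string literal."""
    # In XQuery string literals, '&' starts an entity reference and a
    # double quote is escaped by doubling it.
    escaped = value.replace("&", "&amp;").replace('"', '""')
    return '"' + escaped + '"'


def list_query(db: str, path: str = "") -> str:
    """Build a db:list call, optionally restricted to a path."""
    if path:
        return f"db:list({xq_string(db)}, {xq_string(path)})"
    return f"db:list({xq_string(db)})"


def collection_query(db: str, path: str) -> str:
    """Build a collection() call restricted to db/path, returning Root."""
    return f"collection({xq_string(db + '/' + path)})/Root"


if __name__ == "__main__":
    print(list_query("database"))
    print(list_query("database", "my/path"))
    print(collection_query("database", "my/path"))
```

Building the literal through one helper, rather than concatenating the name directly as in the original query, keeps a quote character inside a resource name from breaking the generated query.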
Hi Dirk,
Yes, I've already looked at it. My problem is that I need to get all resources whose names *start* with some string, but I don't know the exact path of the file. For example, I might have paths like "RequestABC20150108" and "RequestABC20150201". I know they start with "RequestABC", but I don't know the rest of the path. :(
The db has more than a million records, so getting a list of all files and processing it seems too inefficient; I need something that accesses indexed data. This is preexisting code, so I was looking for an easy way to fix it. If there is none, I would have to create an indexed value so searches are faster, or maybe have subpaths inside the db, as I assume that would be faster too.
Thanks!
Martín.
From: dk@basex.org
To: ferrari_martin@hotmail.com
CC: basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-talk] Fastest way to get resources starting with path
Date: Sat, 29 Aug 2015 00:51:52 +0200
Hello Martin,
As stated in our documentation at http://docs.basex.org/wiki/Databases#XML_Documents, you can further restrict the documents in the collection via the path argument of collection().
Also, you can get all document names in a database using db:list("database"), or restrict the list to a path, e.g. db:list("database", "my/path").
Cheers
Dirk
--
Dirk Kirsten, BaseX GmbH, http://basexgmbh.de
|-- Firmensitz: Blarerstrasse 56, 78462 Konstanz
|-- Registergericht Freiburg, HRB: 708285, Geschäftsführer:
|   Dr. Christian Grün, Dr. Alexander Holupirek, Michael Seiferle
`-- Phone: 0049 7531 28 28 676, Fax: 0049 7531 20 05 22
Hi Martín,
There is currently no way to filter resources by their prefix. As has already been suggested, the best solution for now is indeed to use the path semantics (i.e., distribute your resources across sub-directories) and call db:open($db, $path).
Hope this helps, Christian
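One way to picture the sub-directory approach is the sketch below. It is an illustration, not BaseX code: it assumes resource names have the shape "<prefix><8-digit date>", as in the examples "RequestABC20150108" and "RequestABC20150201" from the thread. The helpers directory_path and open_query and the database name "requests" are hypothetical.

```python
import re

# Assumption: names look like "<prefix><8-digit date>", as in the
# thread's examples "RequestABC20150108" and "RequestABC20150201".
NAME_PATTERN = re.compile(r"^(?P<prefix>.+?)(?P<date>\d{8})$")


def directory_path(name: str) -> str:
    """Map a flat resource name to a 'prefix/date' database path."""
    match = NAME_PATTERN.match(name)
    if match is None:
        # Names that do not fit the assumed shape are left unchanged.
        return name
    return f"{match.group('prefix')}/{match.group('date')}"


def open_query(db: str, prefix: str) -> str:
    """Build a db:open call for all documents below the prefix directory."""
    # Assumes db and prefix contain no double quotes or ampersands.
    return f'db:open("{db}", "{prefix}")/Root'


if __name__ == "__main__":
    print(directory_path("RequestABC20150108"))
    print(open_query("requests", "RequestABC"))
```

If resources are stored under paths produced this way, the known prefix "RequestABC" becomes a directory, so the lookup is a path access of the kind Christian describes rather than a scan over all document names.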