[file:invalid-path] Invalid file path: 'Temp/pdfs/e5-轻奢版-80-global-en-us-dh9.fo
Any chance this limit can be overcome?
--
France Baril
Architecte documentaire / Documentation architect
france.baril(a)architextus.com
Hi BaseX team,
Is this a bug or an improper use?
Unexpected error: Improper use? Potential bug? Your feedback is welcome:
Contact: basex-talk(a)mailman.uni-konstanz.de
Version: BaseX 9.7.2
Java: Oracle Corporation, 1.8.0_281
OS: Windows 10, amd64
Stack Trace:
java.lang.ArrayIndexOutOfBoundsException: 559483324
  at org.basex.io.random.TableDiskAccess.read4(TableDiskAccess.java:174)
  at org.basex.data.Data.id(Data.java:302)
  at org.basex.data.Data.pre(Data.java:286)
  at org.basex.query.func.db.DbOpenId.pre(DbOpenId.java:47)
  at org.basex.query.func.db.DbOpenId.value(DbOpenId.java:28)
  at org.basex.query.func.StaticFuncCall.evalArgs(StaticFuncCall.java:147)
  at org.basex.query.func.FuncCall.value(FuncCall.java:53)
  at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:46)
  at org.basex.query.expr.Switch.iter(Switch.java:198)
  at org.basex.query.expr.IterMap$1.next(IterMap.java:72)
  at org.basex.query.QueryContext.next(QueryContext.java:359)
  at org.basex.query.expr.gflwor.GFLWOR$1.next(GFLWOR.java:66)
  at org.basex.query.QueryContext.next(QueryContext.java:359)
  at org.basex.query.expr.constr.Constr.add(Constr.java:73)
  at org.basex.query.expr.constr.CElem.item(CElem.java:149)
  at org.basex.query.expr.constr.CElem.item(CElem.java:1)
  at org.basex.query.expr.ParseExpr.value(ParseExpr.java:51)
  at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:46)
  at org.basex.query.expr.constr.Constr.add(Constr.java:72)
  at org.basex.query.expr.constr.CElem.item(CElem.java:149)
  at org.basex.query.expr.constr.CElem.item(CElem.java:1)
  at org.basex.query.expr.ItemMap.item(ItemMap.java:37)
  at org.basex.query.expr.ParseExpr.value(ParseExpr.java:51)
  at org.basex.query.expr.ParseExpr.iter(ParseExpr.java:46)
  at org.basex.query.expr.gflwor.GFLWOR$1.next(GFLWOR.java:72)
  at org.basex.query.scope.MainModule$1.next(MainModule.java:67)
  at org.basex.http.restxq.RestXqResponse.serialize(RestXqResponse.java:87)
  at org.basex.http.web.WebResponse.create(WebResponse.java:58)
  at org.basex.http.restxq.RestXqServlet.run(RestXqServlet.java:72)
  at org.basex.http.BaseXServlet.service(BaseXServlet.java:69)
  at javax.servlet.http.HttpServlet.service(HttpServlet.java:790)
  at org.eclipse.jetty.servlet.ServletHolder$NotAsync.service(ServletHolder.java:1450)
  at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:799)
  at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:550)
  at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)
  at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:600)
  at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
  at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:235)
  at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1624)
  at org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:233)
  at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1440)
  at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:188)
  at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:501)
  at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1594)
  at org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:186)
  at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1355)
  at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)
  at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:127)
  at org.eclipse.jetty.server.Server.handle(Server.java:516)
  at org.eclipse.jetty.server.HttpChannel.lambda$handle$1(HttpChannel.java:487)
  at org.eclipse.jetty.server.HttpChannel.dispatch(HttpChannel.java:732)
  at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:479)
  at org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:277)
  at org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:311)
  at org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:105)
  at org.eclipse.jetty.io.ChannelEndPoint$1.run(ChannelEndPoint.java:104)
  at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.runTask(EatWhatYouKill.java:338)
  at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.doProduce(EatWhatYouKill.java:315)
  at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.tryProduce(EatWhatYouKill.java:173)
  at org.eclipse.jetty.util.thread.strategy.EatWhatYouKill.run(EatWhatYouKill.java:131)
  at org.eclipse.jetty.util.thread.ReservedThreadExecutor$ReservedThread.run(ReservedThreadExecutor.java:409)
  at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:883)
  at org.eclipse.jetty.util.thread.QueuedThreadPool$Runner.run(QueuedThreadPool.java:1034)
  at java.lang.Thread.run(Unknown Source)
I'm getting this error with a call to a (more or less) complex function.
The more results it needs to handle, the more errors I get.
Any help is welcome!
Is it possible to do faceted browsing with BaseX?
I think the answer is no; however, I may have been fooled into trying at first because my first test case was far simpler than my other use cases.
The simple case was counting publishers in an EAD collection and sorting by count, highest first. Here’s my working code:
declare function local:normalize( $s ) {
  translate( replace( lower-case($s), '^(us-)+', '' ), '-', '' )
};

declare variable $orgs := doc('ead-inst/ead-inst.xml');
declare variable $orgcodes := collection('published')/ead/eadheader/eadid/@mainagencycode ! local:normalize(.) => distinct-values() ;

declare function local:countpubfacets( $c ) {
  for $x in (
    for $ead in $c
    let $ORG := local:normalize($ead/*:ead/*:eadheader/*:eadid/@mainagencycode)
    group by $ORG
    order by $ORG
    let $inst := if ($ORG != "") then ($orgs/list/inst[@prefix=$ORG],$orgs/list/inst[lower-case(@oclc) = $ORG] )
    return array{ count($ead), $ORG, $inst/@orgcode/string(), $inst/string() } )
  order by $x(1) descending return $x
};

local:countpubfacets( collection('published'))
In this case:
(1) there are fewer than 100 unique @mainagencycode values, and
(2) there is only one location for those codes,
and performance is acceptable (or at least it seems to be in my tests).
My other, unsuccessful attempts tried to rank //subject or //persname values.
Here there are many thousands of unique subjects and names, and the subject and persname elements can occur in multiple locations in the file.
Attempts similar to the method above (as well as a couple of other variations I’ve tried), even on a smaller subset of categories, take far too much time; often I have to kill the search before it completes.
I have looked at index:facets() (https://docs.basex.org/wiki/Index_Module#index:facets), which has only reinforced my notion that it’s not possible.
So for now, I’m resigned to deferring that functionality and exploring a specialized index alongside the BaseX indexes: either using Solr and querying the Solr index from BaseX, or building some other index-structure DB in BaseX alongside my document DB.
Eager to hear any tips or feedback on this problem or alternative solutions, as well as general info about the BaseX index structure and what useful info can be obtained through introspection with those Index Module functions.
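For the //subject and //persname ranking described above, one variation that may be worth timing is to group the elements themselves rather than the documents, so that each value is read once wherever it occurs in a file. This is only a minimal sketch, not a confirmed fix: it assumes the 'published' database from the code above, the <facet> output element and the cutoff of 20 are made up for illustration, and whether it is fast enough with many thousands of distinct values would have to be measured.

```xquery
(: Sketch: rank subject headings by frequency in a single pass. :)
(: Assumptions: 'published' is the database queried above; <facet> is a hypothetical output shape. :)
let $facets :=
  for $s in collection('published')//*:subject
  (: normalize whitespace so that line-wrapped headings fall into the same group :)
  group by $label := normalize-space($s)
  let $n := count($s)
  order by $n descending, $label
  return <facet count="{ $n }">{ $label }</facet>
(: keep only the 20 most frequent headings :)
return subsequence($facets, 1, 20)
```

Swapping *:subject for *:persname would give the corresponding name facet.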
Aside from the faceting, search by //subject (or other fields) performs quite acceptably,
even when chaining several filters together with =>
declare function eadsearch:findBySubj( $ctx, $subj as xs:string?, $opt ) {
  if ( $subj ) then $ctx/*[ft:contains( .//subject, ft:tokenize($subj), $opt )]
  else $ctx
};
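As a rough illustration of how such a filter is called with the arrow operator, here is a small sketch. The namespace URI, the search string, and the option map are assumptions for illustration only; the function body is repeated from above so that the query is self-contained.

```xquery
(: Assumption: hypothetical namespace URI; the real module's URI is not shown above. :)
declare namespace eadsearch = 'urn:example:eadsearch';

(: The filter from above, repeated so that the sketch is self-contained. :)
declare function eadsearch:findBySubj( $ctx, $subj as xs:string?, $opt ) {
  if ( $subj ) then $ctx/*[ft:contains( .//subject, ft:tokenize($subj), $opt )]
  else $ctx
};

(: The arrow operator passes the collection as the first argument ($ctx). :)
(: 'mode': 'all' asks for every token of the search string to match. :)
collection('published')
  => eadsearch:findBySubj('civil war', map { 'mode': 'all' })
```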
— Steve M.
Hi,
after reading https://docs.basex.org/wiki/Catalog_Resolver and digging in the list archives (https://mailman.uni-konstanz.de/pipermail/basex-talk/2019-March/014199.html ) I still have trouble understanding catalog files.
Is this supposed to work with xslt:transform() and BaseX GUI 9.7.2?
xslt:transform() appears to ignore the default option (DTD = false): the function definitely requests the external DTD.
This prevents transforming XML with DTD declarations that are not available (if I understand correctly, this is the problem the DTD option is meant to solve in general).
When I try to solve this via catalog files (I do not actually need the DTD), I have no success.
Here are my mini examples:
Saxon HE 10.3 resides in the lib folder
.basex setting:
# Local Options
SERIALIZER = indent=no
DTD = true
XML in local folder "C:/temp/catalog":
<!DOCTYPE dokument
  SYSTEM "http://www.blahblahblah.info/dtd/dokument.dtd">
<dokument>
  <doknr>01</doknr>
</dokument>
catalog.xml in local folder "C:/temp/catalog":
<catalog prefer="system" xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
  <rewriteSystem systemIdStartString="http://www.blahblahblah.info/dtd/" rewritePrefix="file:///C:/temp/catalog/dtd/"/>
</catalog>
dokument.dtd in local folder "C:/temp/catalog/dtd":
<!ELEMENT dokument (doknr)>
<!ELEMENT doknr (#PCDATA)>
XQuery query.xq in local folder "C:/temp/catalog":
(# db:catfile catalog.xml #) {
  xslt:transform('dokument.xml', 'transform.xsl')
}
With or without the pragma, this always results in a java.net.UnknownHostException (true, the system ID is not reachable), but I would expect it to resolve to "file:///C:/temp/catalog/dtd/dokument.dtd".
It works neither in the GUI nor via CCL.
What am I getting wrong?
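To spell out the expectation above, here is a minimal sketch of what a rewriteSystem entry is meant to do to a system identifier, written as a plain XQuery function. It does not show how BaseX or Saxon resolve catalogs internally; the function name local:rewrite-system is made up for illustration, and the catalog is inlined from the example above. Under the OASIS XML Catalogs rules, the entry with the longest matching systemIdStartString wins.

```xquery
declare namespace cat = 'urn:oasis:names:tc:entity:xmlns:xml:catalog';

(: Hypothetical helper: apply the rewriteSystem entries of a catalog to a system ID. :)
declare function local:rewrite-system(
  $catalog   as element(cat:catalog),
  $system-id as xs:string
) as xs:string? {
  (: among all entries whose start string matches, take the longest one :)
  let $rule := (
    for $r in $catalog/cat:rewriteSystem[starts-with($system-id, @systemIdStartString)]
    order by string-length($r/@systemIdStartString) descending
    return $r
  )[1]
  where exists($rule)
  (: replace the matched prefix and keep the rest of the identifier :)
  return $rule/@rewritePrefix
         || substring($system-id, string-length($rule/@systemIdStartString) + 1)
};

local:rewrite-system(
  <catalog prefer="system" xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog">
    <rewriteSystem systemIdStartString="http://www.blahblahblah.info/dtd/" rewritePrefix="file:///C:/temp/catalog/dtd/"/>
  </catalog>,
  'http://www.blahblahblah.info/dtd/dokument.dtd'
)
```

Run on its own, the query should return file:///C:/temp/catalog/dtd/dokument.dtd, which is the location the catalog is expected to hand to the parser.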
Thanks, Daniel