Hi,
I am accessing a RESTful API via the following command:
curl -F data=@example.txt -F tokenizer= -F tagger= -F parser= http://lindat.mff.cuni.cz/services/udpipe/api/process > example2.txt
I am wondering what the best way is to do that in BaseX. The service also has a URL syntax, as shown in the following example:
http://lindat.mff.cuni.cz/services/udpipe/api/process?tokenizer&tagger&parser&data=D%C4%9Bti%20pojedou%20k%20babi%C4%8Dce.%20U%C5%BE%20se%20t%C4%9B%C5%A1%C3%AD.
I have tried:
http:send-request(<http:request method='get' href='http://lindat.mff.cuni.cz/services/udpipe/api/process?tokenizer&amp;tagger&amp;parser&amp;data=D%C4%9Bti%20pojedou%20k%20babi%C4%8Dce.%20U%C5%BE%20se%20t%C4%9B%C5%A1%C3%AD'/>)
This works perfectly. I then tried to put the parameters of the request in http:body instead, but that does not work:
http:send-request(
  <http:request method='get' href='http://lindat.mff.cuni.cz/services/udpipe/api/process'>
    <http:body media-type="string">
      tokenizer&amp;tagger&amp;parser&amp;data=Děti pojedou k babičce. Už se těší.
    </http:body>
  </http:request>)
I could invoke the curl command in BaseX, but maybe there is a more elegant way to send the content of the file (than adding it to the URL). Thanks.
Ciao, Giuseppe
Universität Leipzig Institute of Computer Science, Digital Humanities Augustusplatz 10 04109 Leipzig Deutschland E-mail: celano@informatik.uni-leipzig.de E-mail: giuseppegacelano@gmail.com Web site 1: http://www.dh.uni-leipzig.de/wo/team/ Web site 2: https://sites.google.com/site/giuseppegacelano/
Hi Giuseppe,
You could use web:create-url [1] to build the URL, although it requires every parameter to have an explicit value. E.g.
let $target := "http://lindat.mff.cuni.cz/services/udpipe/api/process"
let $params := map {
  "tokenizer": 0,
  "tagger": 0,
  "parser": 0,
  "data": "Děti pojedou k babičce. Už se těší"
}
return http:send-request(<http:request method='get'/>, web:create-url($target, $params))
To read the data from a file instead, change the "data" entry to:
"data": file:read-text("example.txt"),
Regards /Andy
[1] http://docs.basex.org/wiki/Web_Module#web:create-url
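A minimal sketch of the web:create-url approach, assuming that empty strings are acceptable values for the flag-style parameters (mirroring curl's -F tokenizer=). The variable names are illustrative, and the length check is only there to show how quickly the URL grows with the text:

```xquery
(: Sketch only: build the GET URL from a parameter map and inspect it.
   Using empty strings for the flag parameters is an assumption. :)
let $target := 'http://lindat.mff.cuni.cz/services/udpipe/api/process'
let $params := map {
  'tokenizer': '',
  'tagger': '',
  'parser': '',
  'data': 'Děti pojedou k babičce. Už se těší.'
}
let $url := web:create-url($target, $params)
return (
  (: the generated URL, with the parameter values percent-encoded :)
  $url,
  (: its length grows with the size of 'data' :)
  string-length($url)
)
```

Passing $url as the second argument of http:send-request, as in the message above, sends the request.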
Hi Andy,
Thanks for the help. If I pass an entire text via the url, though, I get the error message "Request-URI Too Large".
Best, Giuseppe
Hi Giuseppe,
You were asking about a method other than adding the data to the URL. Using -F with curl sends the request with the POST method rather than GET. Also, I don't think "string" is a recognized media type.
Your example might work if you changed it to use POST and text/plain:
http:send-request(
  <http:request method='post' href='http://lindat.mff.cuni.cz/services/udpipe/api/process'>
    <http:body media-type="text/plain">
      tokenizer&amp;tagger&amp;parser&amp;data=Děti pojedou k babičce. Už se těší.
    </http:body>
  </http:request>)
Kendall
From: basex-talk-bounces@mailman.uni-konstanz.de on behalf of Giuseppe Celano celano@informatik.uni-leipzig.de
Date: Monday, August 14, 2017 at 8:05 AM
To: BaseX basex-talk@mailman.uni-konstanz.de
Subject: Re: [basex-talk] HTTP module
Hi Andy,
Thanks for the help. If I pass an entire text via the url, though, I get the error message "Request-URI Too Large".
Best, Giuseppe
Thanks, Kendall, I tried but it does not work :(
On Aug 14, 2017, at 6:53 PM, Kendall Shaw kendall.shaw@workday.com wrote:
http:send-request(
  <http:request method='post' href='http://lindat.mff.cuni.cz/services/udpipe/api/process'>
    <http:body media-type="text/plain">
      tokenizer&amp;tagger&amp;parser&amp;data=Děti pojedou k babičce. Už se těší.
    </http:body>
  </http:request>)
Can you elaborate on “does not work”? What response do you get?
For me it gives...
<http:response xmlns:http="http://expath.org/ns/http-client" status="400" message="Bad Request">
  <http:header name="Server" value="nginx/1.11.5"/>
  <http:header name="Access-Control-Allow-Origin" value="*"/>
  <http:header name="Connection" value="keep-alive"/>
  <http:header name="Content-Length" value="37"/>
  <http:header name="Date" value="Mon, 14 Aug 2017 18:45:48 GMT"/>
  <http:header name="Content-Type" value="text/plain"/>
  <http:body media-type="text/plain"/>
</http:response>
Required argument 'data' is missing.
I think the following should work, but it does not:
let $d := "Děti pojedou k babičce. Už se těší"
return http:send-request(
  <http:request method='POST' href='http://lindat.mff.cuni.cz/services/udpipe/api/process'>
    <http:body media-type="application/x-www-form-urlencoded">data={encode-for-uri($d)}&amp;tokenizer&amp;tagger&amp;parser</http:body>
  </http:request>)
returns:
<http:response xmlns:http="http://expath.org/ns/http-client" status="400" message="Bad Request">
  <http:header name="Server" value="nginx/1.11.5"/>
  <http:header name="Access-Control-Allow-Origin" value="*"/>
  <http:header name="Connection" value="keep-alive"/>
  <http:header name="Content-Length" value="113"/>
  <http:header name="Date" value="Mon, 14 Aug 2017 18:47:32 GMT"/>
  <http:header name="Content-Type" value="text/plain"/>
  <http:body media-type="text/plain"/>
</http:response>
Cannot read input data: The CoNLL-U line 'Děti pojedou k babičce. Už se těší' does not contain 10 columns!
Trying to directly emulate curl:
let $target := "http://lindat.mff.cuni.cz/services/udpipe/api/process"
let $data := "Děti pojedou k babičce. Už se těší"
let $req :=
  <http:request method="POST" xmlns="http://expath.org/ns/http-client">
    <multipart media-type="multipart/form-data">
      <header name="Content-Disposition" value='form-data; name="tokenizer"'/>
      <body media-type="text/plain"/>
      <header name="Content-Disposition" value='form-data; name="tagger"'/>
      <body media-type="text/plain"/>
      <header name="Content-Disposition" value='form-data; name="parser"'/>
      <body media-type="text/plain"/>
      <header name="Content-Disposition" value='form-data; name="data"'/>
      <header name="Content-Transfer-Encoding" value='binary'/>
      <body media-type="text/plain"/>
    </multipart>
  </http:request>
return http:send-request($req, $target, ("", "", "", ($data)))
This works, except when $data contains non-7-bit characters, as here. In that case, base64 data appears in place of the correct text.
It appears to be similar to this issue:
https://www.mail-archive.com/basex-talk@mailman.uni-konstanz.de/msg09206.htm...
/Andy
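One variation on the application/x-www-form-urlencoded attempt above that may be worth trying. This is an untested assumption, not something confirmed in this thread. The second error message suggests that data was received but tokenizer was not recognized, so the sketch sends every field as an explicit name=value pair, as curl's -F tokenizer= does. Because the body is fully percent-encoded, it contains only ASCII characters. The method='text' attribute and the variable names are illustrative choices:

```xquery
(: Sketch only: POST all fields as explicit name=value pairs.
   Whether the service accepts this form is an assumption. :)
let $target := 'http://lindat.mff.cuni.cz/services/udpipe/api/process'
let $fields := map {
  'tokenizer': '',
  'tagger': '',
  'parser': '',
  'data': 'Děti pojedou k babičce. Už se těší.'
}
(: percent-encode each name and value, then join the pairs with '&' :)
let $body := string-join(
  map:for-each($fields, function($name, $value) {
    encode-for-uri($name) || '=' || encode-for-uri($value)
  }),
  '&amp;'
)
return http:send-request(
  <http:request method='post' href='{ $target }'>
    <http:body media-type='application/x-www-form-urlencoded' method='text'>{ $body }</http:body>
  </http:request>
)
```

To send a whole file, the 'data' entry could be replaced with file:read-text("example.txt"), as suggested earlier in the thread.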
Thanks, Andy. I have also tried to invoke curl via proc:execute():
proc:execute("curl",("-F", "data=@example.txt", "-F", "tagger=", "-F", "parser=", "http://lindat.mff.cuni.cz/services/udpipe/api/process" ))
The function works, but unfortunately the text inside the file is not recognized as UTF-8, so I get a lot of gibberish in the result. At first I thought it was due to my macOS configuration, but I experimented a lot, and the problem seems to depend on BaseX.
I run the basexgui (and basex) commands from the bin folder in my Terminal window, so they should inherit the environment variables (and indeed proc:execute("locale") also shows the right UTF-8 values).
I will open a GitHub issue, unless I am missing something here.
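A hedged sketch of the same proc:execute call with the output encoding named explicitly. It assumes a BaseX version in which the third argument of proc:execute is an options map with an encoding entry (older releases are assumed to take the encoding as a plain string instead). It also adds the tokenizer field from the original curl command and checks the exit code before using the output. The error name local:curl-failed is hypothetical:

```xquery
(: Sketch only: run curl and decode its output as UTF-8.
   The options-map form of the third argument is version-dependent. :)
let $result := proc:execute(
  'curl',
  ('-F', 'data=@example.txt',
   '-F', 'tokenizer=',
   '-F', 'tagger=',
   '-F', 'parser=',
   'http://lindat.mff.cuni.cz/services/udpipe/api/process'),
  map { 'encoding': 'UTF-8' }
)
return
  (: proc:execute returns a result element with output, error and code children :)
  if ($result/code = '0')
  then $result/output/string()
  else error(xs:QName('local:curl-failed'), $result/error/string())
```

If the output is still garbled with the encoding set explicitly, that would suggest the problem lies elsewhere.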
Does it work to use Andy’s multipart version with encoding="UTF-8" added to the body elements?
I have submitted an issue: https://github.com/BaseXdb/basex/issues/1487
/Andy
Hi Giuseppe,
Thanks to Andy’s bug report [1], I believe the encoding issue with non-ASCII characters has been fixed. I’m looking forward to your feedback on the latest snapshot [2].
Best, Christian
[1] https://github.com/BaseXdb/basex/issues/1487
[2] http://files.basex.org/releases/latest/
On Mon, Aug 14, 2017 at 8:55 PM, Andy Bunce bunce.andy@gmail.com wrote:
This works, except when $data contains non-7-bit characters, as here. In that case, base64 data appears in place of the correct text.