Hi,
I have an item that I would like to bring to your attention. We have developed a web controller to let users manage translation processes for BaseX content. Our process is something like this:
- Users select content to translate (1 to 500 small files) + languages to translate to (1 to 32 selections).
- For each language, for each file:
  - The system transforms the content to XLIFF and sets the segments to translate="yes" if they have changed since the last translation for this language. Content is saved because we'll need to query it; we redirect to the next task.
  - For files without new segments to translate, content is processed and saved to the target languages (because attributes might have changed and/or segments might have been deleted). The XLIFF file is deleted; we redirect to the next task because we'll need the new information for the next query.
  - We query the server to offer the users stats on items to translate.
This is a simplification of the process, but it shows the logic. Redirects occur after each step for each language. We have grouped operations to limit commits/redirects to a minimum. We apply them:
- After each language to avoid running out of memory.
- Before each operation that needs to query files based on the changes from the previous steps.
We have also split groups of tasks into smaller groups where too many tasks have led us to run out of memory in the past.
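The grouping described above can be sketched roughly as follows. This is only an illustration in Python, not the actual XQuery code behind the controller: the function name `commit_groups`, the `max_group_size` parameter, and the sample language and file names are all hypothetical. It shows one way to make sure a group of tasks never spans two languages and never grows past a fixed size, so that each group ends at a commit point.

```python
from typing import Iterator, List, Sequence, Tuple

Task = Tuple[str, str]  # (language, file name); hypothetical task shape


def commit_groups(
    languages: Sequence[str],
    files: Sequence[str],
    max_group_size: int,
) -> Iterator[List[Task]]:
    """Yield groups of tasks; the caller commits after each group.

    A group never mixes languages (commit after each language) and never
    holds more than max_group_size tasks (split groups that got too big).
    """
    if max_group_size < 1:
        raise ValueError("max_group_size must be at least 1")
    for language in languages:
        # Walk the file list in slices of at most max_group_size items.
        for start in range(0, len(files), max_group_size):
            chunk = files[start:start + max_group_size]
            yield [(language, name) for name in chunk]


# Example with made-up values: 2 languages, 3 files, at most 2 tasks per group.
groups = list(commit_groups(["fr", "de"], ["a.xml", "b.xml", "c.xml"], 2))
```

With these sample values the three files of each language are split into a group of two and a group of one, so there are four commit points in total.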
Our request would be for a way to force changes to commit without having to redirect. Refreshing the browser has a big impact on performance. Or maybe you have suggestions to improve batch processing when using a web interface for process management.
Thank you in advance for your input!
Hi France,
I guess there is no simple answer to your question; the best solution and the next steps mostly depend on the architecture of your approach. I'm also not quite sure what the major challenge is: is it performance, technical restrictions, or the overall concept?
As you mentioned that you are working with a web interface, our approach would be to provide a RESTXQ function that triggers all the transformations whenever a user requests it. Have you thought about that? What language is your web controller built on?
Best, Christian
On Mon, Feb 9, 2015 at 8:12 PM, France Baril france.baril@architextus.com wrote:
-- France Baril Architecte documentaire / Documentation architect france.baril@architextus.com
Hi,
- Our issue is with performance.
- Performing all the transformations leads to 2 issues:
  - We run out of memory.
  - We sometimes need to query the content that has been transformed after it has gone through some of the transformations.
- The web controller is jQuery, but we redirect from XQuery. jQuery just says: translate these files into these languages. The RESTXQ function handles the steps, and calls itself back with an incremented step number when a group of transformations that could be handled without committing content (without a need to query the saved files or without running out of memory) is completed.
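The step-by-step callback can be pictured with the small simulation below. It is a sketch in Python for illustration only, not the RESTXQ function itself: `handle_step`, `follow_redirects`, the `/translate?step=N` URL shape, and the in-memory `store` dictionary standing in for the database are all assumptions. Each step runs one group of operations, and the next step only starts after the previous one has finished, so it can read what was saved.

```python
from typing import Callable, Dict, List, Optional

Store = Dict[str, str]  # stand-in for saved database content (assumption)
Step = Callable[[Store], None]


def handle_step(step: int, steps: List[Step], store: Store) -> Optional[str]:
    """Run one group of operations and return the URL of the next step.

    Returns None after the last step, which means there is nothing left
    to redirect to.
    """
    if not 0 <= step < len(steps):
        raise IndexError(f"unknown step {step}")
    steps[step](store)  # changes are visible to later steps once this returns
    next_step = step + 1
    return f"/translate?step={next_step}" if next_step < len(steps) else None


def follow_redirects(steps: List[Step], store: Store) -> List[str]:
    """Play the role of the browser: keep following the returned URLs."""
    visited = []
    url: Optional[str] = "/translate?step=0"
    while url is not None:
        visited.append(url)
        step = int(url.rsplit("=", 1)[1])
        url = handle_step(step, steps, store)
    return visited


def to_xliff(store: Store) -> None:
    # Hypothetical first step: save an XLIFF file.
    store["topic.xlf"] = 'translate="yes"'


def collect_stats(store: Store) -> None:
    # Hypothetical second step: needs the file saved by the previous step.
    store["stats"] = str(sum(1 for name in store if name.endswith(".xlf")))


store: Store = {}
visited = follow_redirects([to_xliff, collect_stats], store)
```

In this simulation every step costs one round trip, which is the overhead the request above is about: fewer, larger steps mean fewer redirects, but each step then has to fit in memory and cannot query its own changes.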
On Wed, Feb 11, 2015 at 1:47 AM, Christian Grün christian.gruen@gmail.com wrote:
We run out of memory.
Who or what is responsible for the OOM? Could you give us some more information on the exact step in the process that causes the bottleneck?
Hi, here is an example: A process that aggregates a few hundred topics and transforms the aggregated content to a large HTML file for reviewers to see all content together works fine. Try to do it for 32 languages, and you run out of memory.
I'm trying to build a small sample. Our real processes also resolve GUI values from the developers' library of strings and filter content based on audiences and product numbers. It may take a while before I can get this to work.
On Fri, Feb 13, 2015 at 5:29 AM, Christian Grün christian.gruen@gmail.com wrote:
basex-talk@mailman.uni-konstanz.de