Hi Eric,

As you say, "XML and JSON databases are needed when the data consists of documents already occurring in such formats." BaseX databases are intended to be representations of physical files, so what operations are you performing on the databases that are not easily reproducible from the XML? You say if you move your project to a new system, you'll "lose all data from the database"--is your goal to ship out the databases as files for read-only access, without including the source documents? I'm not a BaseX dev, but I don't think that's its intended use.

Re: "In your model, all copies of my project would access the same database, on the same system." One folder of databases isn't the same as one database. Your project copies can each have their own database within BaseX; they just need to have distinct names like "db-prod" for documents stored in your production folder, "db-dev" for documents stored in your development folder, etc. You can construct your queries to access the database you want at the time by passing an external variable like:

declare variable $d as xs:string external;
for $result in db:open($d)/etc.

In that way you can access data representing files from anywhere on the machine, so it doesn't matter where the encrypted BaseX databases are stored. You can also copy databases into new ones for manipulation or preservation of a "state" with COPY original_db copied_db. If your changes in original_db get messed up and you want to restore, DROP DB original_db; COPY copied_db original_db.
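As a rough sketch of the external-variable pattern above, a query can fall back to a default database and fail with a clear message when the named database does not exist. The default name 'db-dev' and the error name local:no-db are illustrative assumptions, not anything from your project. db:open is the function name in BaseX 9; BaseX 10 and later call it db:get.

```xquery
(: Name of the database to query. Bind it externally, or fall back to
   the default. 'db-dev' is an assumed name used only for illustration. :)
declare variable $d as xs:string external := 'db-dev';

if (db:exists($d)) then
  (: Return the root element of every document in the selected database. :)
  db:open($d)/*
else
  (: The error name is hypothetical; any QName would do. :)
  error(xs:QName('local:no-db'), 'Database not found: ' || $d)
```

With the standalone client, the variable can be bound on the command line through the -b option, for example -bd=db-prod followed by the name of the query file.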

Assuming every distribution of your project will contain the original files, or work from user-supplied files, there is no drawback to putting all databases under one directory. Your installation instructions only need to cover two steps: specifying that directory in the .basex file (or in a custom file of definitions if you want to set it programmatically), and running a script that constructs the BaseX database on that computer from the documents. Whenever I moved our project to a new machine--which I did frequently during development--our scripts rebuilt our databases of ~42K documents, including full-text and custom indexes, in about five minutes.
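For illustration only, a rebuild step of that kind could look roughly like the XQuery below, assuming the source documents sit in a single directory. This is a sketch, not the scripts mentioned above: the variable names are hypothetical, and only the full-text index is enabled here. The database directory itself is set separately, through the DBPATH option in the .basex file.

```xquery
(: Both values are supplied at install time; the names are hypothetical. :)
declare variable $name as xs:string external;  (: e.g. 'db-prod' :)
declare variable $dir  as xs:string external;  (: directory holding the source XML :)

(: Create the database from the documents under $dir and build a
   full-text index while doing so. The empty sequence means no custom
   target paths are assigned inside the database. :)
db:create($name, $dir, (), map { 'ftindex': true() })
```

Running this once per copy of the project, with a distinct $name each time, gives each copy its own database under the shared directory.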

-Tamara

On Mon, Feb 21, 2022 at 2:08 PM Eric Levy <contact@ericlevy.name> wrote:
> > > Maybe it helps to compare BaseX with other Open-Source SQL databases,
> > > e.g. PostgreSQL: While it’s possible to use it embedded, specifying a
> > > simple path as target won’t suffice.
> >
> > SQL databases are most useful for data that may be represented
> > naturally using a relational model.
>
> Sure. I simply wanted to illustrate that you cannot embed PostgreSQL
> that easily in your project as you can embed SQLite.

I may have misunderstood the direction of your comment.

Broadly, as you suggest, the databases mentioned may be classified
according to two orthogonal traits, whether supporting server versus
embedded modes, and by the data or document model.

I had mentioned SQLite, because, even though it has a different data
model from that of BaseX, I thought it might be useful to compare in terms
of how it resolves the location of the database. Among the use cases
for which SQLite has become successful is one in which a database
occurs in a portable project directory. Such flexibility makes it a
credible choice for managing relational data in contexts for which
PostgreSQL and other server-only database systems would be
inappropriate.

Presently, BaseX offers limited support for embedded use. It seems, at
least in principle, a feasible path is available to strengthen the
support for the embedded case by supporting a mode of opening a
database from a file path independent of data external to the path.
Such support, of course, may be added without imposing any constraints
on existing support for either embedded or server modes.


> I suppose BaseX is not the right tool for you, as you seem to look
> for something that's more light-weight.

It may not be the right tool, at least at present. I believe the
limitation is confined to the issue I have raised, about opening a
database at a user-supplied location. In terms of essential
operational characteristics, I think the application is suitably
lightweight, as it seems to handle invocation-per-query use cases with
adequate efficiency on the target systems.

I think the request is sound, as it would leverage much of the already
available functionality for embedded invocation, but for a much more
general set of use cases.

Of course, how well the request would align with broader project
objectives is outside the scope of my information.




--

Tamara Marnell
Program Manager, Systems
Orbis Cascade Alliance (orbiscascade.org)
Pronouns: she/her/hers