Hi, I am loading a 5 GB XML file into a database, which takes around 10-12 minutes. I need to load about 20-30 documents of the same size (5 GB), so I created threads to load these documents in parallel. However, this takes far too long: I waited 4 hours and then stopped the process. Please tell me the best way to load large documents. My program is as follows:
package org.basex.examples.query;

import java.io.*;
import javax.xml.namespace.*;
import org.basex.build.*;
import org.basex.core.*;
import org.basex.core.cmd.*;
import org.xml.sax.Parser;

/**
 * This class demonstrates collection relevant queries.
 * It shows how to find and query specific documents.
 *
 * @author BaseX Team 2005-12, BSD License
 */
public final class QueryCollection implements Runnable {
  Context context = new Context();
  String dbName = null;

  public QueryCollection(String string) {
    dbName = string;
  }

  /**
   * Runs the example code.
   * @param args (ignored) command-line arguments
   * @throws BaseXException if a database command fails
   * @throws XQException
   */
  public static void main(final String[] args) throws BaseXException {
    /** Database context. */
    Context context = new Context();

    System.out.println("=== QueryCollection ===");

    // ------------------------------------------------------------------------
    // Create a collection from all XML documents in the specified directory
    System.out.println("\n* Create a collection.");
    System.out.println(System.currentTimeMillis()/1000);
    Thread thread = new Thread(new QueryCollection("Collection1"));
    Thread thread1 = new Thread(new QueryCollection("Collection2"));
    Thread thread2 = new Thread(new QueryCollection("Collection3"));
    Thread thread3 = new Thread(new QueryCollection("Collection4"));
    Thread thread4 = new Thread(new QueryCollection("Collection5"));
    thread.start();
    thread1.start();
    thread2.start();
    thread3.start();
    thread4.start();

    OutputStream outputStream = null;
    try {
      outputStream = new FileOutputStream(new File("result.xml"));
    } catch(FileNotFoundException e) {
      // TODO Auto-generated catch block
      e.printStackTrace();
    }

    // ------------------------------------------------------------------------
    // Evaluate a query on a single document
    System.out.println(System.currentTimeMillis()/1000);

    // new DropDB("Collection").execute(context);

    // ------------------------------------------------------------------------
    // Close the database context
    context.close();
  }

  public void createDB(String dbName) {
    // TODO Auto-generated constructor stub
    new CreateDB(dbName, "E:/downloads/temp/test.xml").run(context);
  }

  @Override
  public void run() {
    // TODO Auto-generated method stub
    System.out.println(dbName+" start time : "+System.currentTimeMillis()/1000);
    createDB(dbName);
    System.out.println(dbName+" end time : "+System.currentTimeMillis()/1000);
  }
}
Hi Pushpendra,
as database creation is mainly I/O-bound, threading will not give you better results. Even with SSDs, performance is more likely to decrease: when a single database is created, the access patterns are mostly sequential. If multiple databases are created at the same time, the patterns degenerate into quasi-random ones.
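As a rough illustration of the point above, here is a minimal sketch that creates the databases one after another in a single thread and records how long each one takes. The class `SequentialLoad` and the `DbCreator` hook are hypothetical names introduced only for this example; the sketch does not depend on BaseX itself. With BaseX, the hook would presumably wrap a call such as `new CreateDB(dbName, xmlPath).execute(context)` on one shared `Context`, but that wiring is an assumption and is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch: build several databases sequentially instead of in parallel threads. */
final class SequentialLoad {

  /** Hypothetical hook for the actual database creation step. */
  interface DbCreator {
    void create(String dbName, String xmlPath) throws Exception;
  }

  /**
   * Creates each database in turn, in the iteration order of {@code inputs},
   * and returns the elapsed milliseconds per database name.
   */
  static Map<String, Long> loadAll(Map<String, String> inputs, DbCreator creator)
      throws Exception {
    Map<String, Long> timings = new LinkedHashMap<>();
    for (Map.Entry<String, String> entry : inputs.entrySet()) {
      long start = System.nanoTime();
      // Only one creation runs at a time, so disk access stays mostly sequential.
      creator.create(entry.getKey(), entry.getValue());
      timings.put(entry.getKey(), (System.nanoTime() - start) / 1_000_000);
    }
    return timings;
  }

  public static void main(String[] args) throws Exception {
    // Same database names and input path as in the original program.
    Map<String, String> inputs = new LinkedHashMap<>();
    for (int i = 1; i <= 5; i++) {
      inputs.put("Collection" + i, "E:/downloads/temp/test.xml");
    }

    // Placeholder creator (assumption): replace the body with the real
    // database creation call of your BaseX version.
    DbCreator creator = (dbName, xmlPath) ->
        System.out.println("creating " + dbName + " from " + xmlPath);

    loadAll(inputs, creator)
        .forEach((name, ms) -> System.out.println(name + ": " + ms + " ms"));
  }
}
```

If a single 5 GB document takes 10-12 minutes, a sequential run over 20-30 documents should land somewhere around 3.5 to 6 hours in total, but each database is finished and usable as soon as its turn completes.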
Hope this helps,
Christian
___________________________
2013/6/12 Pushpendra Singh Sengar pushpendra1412@gmail.com:
-- Thanks & Regards , Pushpendra Singh
BaseX-Talk mailing list BaseX-Talk@mailman.uni-konstanz.de https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk