Sunday, 15 June 2014

mongodb - mongoexport to multiple CSV files


I have a large MongoDB collection. I want to export this collection to CSV so that I can import it into a data analysis package.

The collection holds about 15 GB of documents, and I would like to split the export into roughly 100 equal-sized CSV files. Is there a way to do this with mongoexport? I could also query the whole collection with pymongo, split it, and write the CSV files manually, but I suspect that would be slower and require more coding.

Thanks for the input.

You can do it with the --skip and --limit options.

For example, if your collection holds 1000 documents, you can do it with a script loop (pseudocode):

  loops = 100
  count = db.collection.count()
  batch_size = count / loops
  for (i = 0; i < loops; i++) {
      mongoexport --skip (batch_size * i) --limit batch_size ...
  }

This assumes that your documents are roughly equal in size, so that equal document counts give roughly equal file sizes.
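The loop can be sketched in Python as a helper that only builds the mongoexport command lines, so the batch arithmetic can be checked before anything is run. The database, collection, field and output file names below are hypothetical placeholders. The sketch also assumes a mongoexport release that accepts --type=csv (older releases used --csv instead), and it spreads any remainder over the first few files so that every document is covered.

```python
def build_export_commands(total_docs, n_files, db, collection, fields):
    """Return one mongoexport argument list per output file.

    The first (total_docs % n_files) files receive one extra document,
    so the batches cover the whole collection without overlap.
    """
    base, extra = divmod(total_docs, n_files)
    commands = []
    skip = 0
    for i in range(n_files):
        limit = base + (1 if i < extra else 0)
        if limit == 0:
            break  # fewer documents than files: nothing left to export
        commands.append([
            "mongoexport",
            "--db", db,
            "--collection", collection,
            "--type=csv",                      # assumption: older releases use --csv
            "--fields", ",".join(fields),      # CSV export needs an explicit field list
            "--skip", str(skip),
            "--limit", str(limit),
            "--out", f"export_{i:03d}.csv",    # hypothetical output file name
        ])
        skip += limit
    return commands


if __name__ == "__main__":
    # Hypothetical database, collection and field names.
    for cmd in build_export_commands(1000, 100, "mydb", "mycollection", ["_id", "name"]):
        print(" ".join(cmd))
```

To actually export, each argument list could be passed to subprocess.run(cmd, check=True) instead of being printed. The document count would come from the collection itself, for example via pymongo.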

Note, however, that large skips are slow.

Early iterations (small skip values) will be faster than later iterations (large skip values).
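A rough way to see the cost: assuming the server has to walk past every skipped document before it returns a batch, the total number of documents touched across all iterations can be estimated with a few lines of Python. This is an illustrative model under that assumption, not a measurement, and the function name is made up for this sketch.

```python
def documents_walked(total_docs, n_batches):
    """Estimate documents touched when each batch first skips all earlier ones."""
    batch_size = total_docs // n_batches
    # Batch i skips i * batch_size documents, then reads batch_size more.
    return sum(i * batch_size + batch_size for i in range(n_batches))


if __name__ == "__main__":
    total = 1000
    walked = documents_walked(total, 100)
    print(walked, walked / total)  # prints 50500 50.5
```

Under this model, 100 batches touch about 50 times as many documents as a single pass over the collection, and the last batch alone walks almost the entire collection.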

