Improving memory usage when importing large datasets

      Currently, importing a large set of data requires a lot of memory because the entire collection being imported is loaded into memory before it is flushed to the database. To improve memory efficiency, we need to perform the import in "chunks": read a chunk from the input, then write that chunk out to the database. The writing side already supports flushing in "chunks" via JDBC batch processing. The input side will need to read in chunks in different ways, depending on the source (rough sketches follow the list below):

      • XML import: use event-based XML processing
      • CSV import: read data as a stream
      • JDBC import: read pages of data
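
      A minimal sketch of the XML case, combining event-based reading (StAX) with the existing JDBC batch writing. The "people.xml" file, the "person" element with "name"/"email" children, the JDBC URL, and the "person" table are illustrative assumptions, not part of the current code base:

{code:java}
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

public class ChunkedXmlImport {
    // rows held in memory before a flush; assumed tuning value
    private static final int BATCH_SIZE = 1000;

    public static void main(String[] args) throws Exception {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);

        try (InputStream in = Files.newInputStream(Path.of("people.xml"));
             Connection conn = DriverManager.getConnection("jdbc:h2:mem:demo")) {
            conn.setAutoCommit(false);
            XMLStreamReader xml = factory.createXMLStreamReader(in);
            try (PreparedStatement ps = conn.prepareStatement(
                    "INSERT INTO person (name, email) VALUES (?, ?)")) {
                int pending = 0;
                String current = null, name = null, email = null;
                while (xml.hasNext()) {
                    switch (xml.next()) {
                        case XMLStreamConstants.START_ELEMENT:
                            current = xml.getLocalName();
                            break;
                        case XMLStreamConstants.CHARACTERS:
                            if ("name".equals(current)) name = xml.getText();
                            if ("email".equals(current)) email = xml.getText();
                            break;
                        case XMLStreamConstants.END_ELEMENT:
                            if ("person".equals(xml.getLocalName())) {
                                ps.setString(1, name);
                                ps.setString(2, email);
                                ps.addBatch();
                                if (++pending == BATCH_SIZE) { // flush one chunk
                                    ps.executeBatch();
                                    conn.commit();
                                    pending = 0;
                                }
                                name = email = null;
                            }
                            current = null;
                            break;
                        default:
                            break;
                    }
                }
                if (pending > 0) { // flush the final partial chunk
                    ps.executeBatch();
                    conn.commit();
                }
            }
        }
    }
}
{code}

      Only one record plus the pending batch ever sits in memory, instead of the whole document tree.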
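
      For the CSV case, the input can be read line by line with a buffered reader so only the current chunk is held in memory. The file name, the naive comma split (no quoting support), and the writeChunk hand-off are illustrative placeholders:

{code:java}
import java.io.BufferedReader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

public class ChunkedCsvRead {
    private static final int CHUNK_SIZE = 1000;

    public static void main(String[] args) throws Exception {
        try (BufferedReader reader = Files.newBufferedReader(Path.of("people.csv"))) {
            List<String[]> chunk = new ArrayList<>(CHUNK_SIZE);
            String line;
            while ((line = reader.readLine()) != null) { // one line read at a time
                chunk.add(line.split(","));              // naive split; a real parser must handle quoting
                if (chunk.size() == CHUNK_SIZE) {
                    writeChunk(chunk);                   // hand the chunk to the batched JDBC writer
                    chunk.clear();
                }
            }
            if (!chunk.isEmpty()) {
                writeChunk(chunk);                       // final partial chunk
            }
        }
    }

    private static void writeChunk(List<String[]> rows) {
        // placeholder: the real import would do addBatch()/executeBatch() as in the XML sketch
        System.out.println("flushing " + rows.size() + " rows");
    }
}
{code}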
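
      For the JDBC case, one way to read pages is keyset pagination on an indexed id column, so each query touches only one page of rows. The connection URL, table, and column names are placeholders, and the exact SQL (LIMIT vs. FETCH FIRST) depends on the source database:

{code:java}
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class PagedJdbcRead {
    private static final int PAGE_SIZE = 1000;

    public static void main(String[] args) throws Exception {
        try (Connection src = DriverManager.getConnection("jdbc:postgresql://source/db", "user", "pass");
             PreparedStatement page = src.prepareStatement(
                     "SELECT id, name, email FROM person WHERE id > ? ORDER BY id LIMIT " + PAGE_SIZE)) {
            long lastId = 0;
            int rowsInPage;
            do {
                page.setLong(1, lastId);
                rowsInPage = 0;
                try (ResultSet rs = page.executeQuery()) {
                    while (rs.next()) {
                        lastId = rs.getLong("id"); // remember where this page ended
                        rowsInPage++;
                        // hand each row to the batched writer on the target connection
                    }
                }
            } while (rowsInPage == PAGE_SIZE);     // a short page means the end was reached
        }
    }
}
{code}

      Alternatively, a single streaming query with Statement.setFetchSize() gives the same effect where the driver supports cursor-based fetching.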

            Assignee: Unassigned
            Reporter: shihab
            Votes: 2
            Watchers: 5
